Paul is the Richard Feynman of the audiophile world due to his ability to effortlessly explain complex electrical engineering concepts in simple (yet accurate) terms that laypeople can understand. This video exemplifies this ability.
What a great explanation....I understood it, at least till I forgot it a couple of minutes later. But I can watch it again and I'm really grateful for your off the cuff explanation. Thanks Paul
This channel is pure gold! I'm an engineering student and also interested in building my own active speakers as a hobbyist. I really love to get into a deep understanding of all the audio stuff and you are helping me a lot! Thank you!
@Harald_Reindl He said "is our chief engineer bob standards", hahaha, and the snapshots go chakp chakp chakp chakp 😂 And I love to use a water filter on the DAC.. 😅
In a nutshell: Sampling Rate = how many snapshots of the music are taken each second. Bit Depth = how much information is present in each snapshot. Although Paul is correct, in that upsampling does not mean more information, this needs to be put into perspective. Because if there was no more information, then upsampling would be meaningless. More information will not be present in the source file (no modification to the source data is made). But upsampling (or oversampling) calculates what would be present between two existing samples, creates the samples on the fly, and feeds the new, additional samples to the DAC. So there is more information. Although a source file contains, for example, 44,100 samples per second, upsampling it to 88,200 samples per second means that 88,200 samples per second ultimately reach the DAC (the digital to analog converter) and are converted into sound. Upsampling is a mixed bag, and can result in a euphonic sound, which could be pleasant sounding, but will not be as accurate as a native file at the higher sample rate. And although upsampling has benefits, it could make songs sound unnatural. If you had two files of the same song, #1 produced at 44,100 samples per second (CD quality) and #2 produced at 176,400 samples per second (quadruple #1), then upsampling the 44,100 file by a factor of 4 will not yield the same sound quality as the native 176,400 sampled file.
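To make "calculates what would be present between two existing samples" concrete, here is a minimal Python sketch. It assumes ideal band-limited (sinc) interpolation over a finite block of samples; real upsamplers use shorter, windowed filters, and the test tone and block length are made-up values for illustration. The in-between value is computed only from samples that are already stored.

```python
import math

def sinc(x):
    """Normalized sinc: sin(pi*x) / (pi*x), with sinc(0) defined as 1."""
    if x == 0.0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)

def value_between_samples(samples, position):
    """Estimate the signal at a fractional sample position using
    Whittaker-Shannon (sinc) interpolation over the given samples."""
    return sum(s * sinc(position - n) for n, s in enumerate(samples))

SAMPLE_RATE = 44_100  # Hz, CD rate
TONE = 10_000         # Hz, hypothetical test tone (assumption)
COUNT = 4_001         # number of stored samples in this sketch (assumption)

stored = [math.sin(2 * math.pi * TONE * n / SAMPLE_RATE) for n in range(COUNT)]

# A 2x upsampler needs the value halfway between two stored samples.
midpoint = COUNT // 2 + 0.5
estimated = value_between_samples(stored, midpoint)
actual = math.sin(2 * math.pi * TONE * midpoint / SAMPLE_RATE)
print(f"estimated {estimated:.4f}, actual {actual:.4f}")
```

The estimate lands very close to the true value of the tone at that instant, which is why the extra samples help the DAC and its filter without adding anything that was not already implied by the original 44,100 samples per second.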
Do you know what I liked about your post? I read it in like 2 minutes and basically understood everything Paul said in the video and some more. So thank you for your short and meaningful answer. Can I subscribe to you?
This master series is like working with an open minded dude who was involved in common sense audio for 40 years. No nonsense, but not dismissive of newbies. Audiophiles can be snobby and know it all. Also, engineers are great for making audio products, but they can't explain how to install a light bulb. Paul is the middle man of audio information.
Needs Part 2: The Reconstruction Filter. The main reason for up-sampling a 44.1KHz-sampled signal, is to perform most of the playback reconstruction filtering digitally, and greatly simplify the analog filter which follows the DAC. This is economically important, because every playback device needs one, while the anti-aliasing filter used in front of the ADC for recording only needs to exist in the recording studio, so its cost is much less significant to the audio community.
Yes. Nyquist Shannon theorem assumes ideal, impossible engineering, especially in the integration and anti-aliasing filter. Oversampling, interpolation, and especially using higher rates in the first place, allows more tolerance in variations in the hardware implementation. Totally separate of the aliasing issues, it's also useful not to have much extreme ultrasonic content on even hi res recordings, though, because analog gear performs worse downstream with it.
*TWITCH* *TWITCH* *grimace* Sorry, I'm a retired digital engineer and several times I wanted to correct you (as you know, engineers are like that), BUT OVERALL you gave a very cogent, understandable, simple explanation of a very complex subject, just as you intended. Bravo! Well done!
EE/Control Systems Engineer myself and yeah, it's not perfect or exhaustive, but I seem to remember a similar print article from Julian Hirst being my introduction to DSP when I was in high school.
16 bits = maximum positive integer value of 65535. Since you're doing a sine wave, you generally want positive and negative, so a signed 16 bits yields a range of (-32768 to 32767). That describes the amount of accuracy you have in the amplitude (height) of the wave. The sample rate of 44.1 kHz means each sample represents .0000226s. To determine the maximum value of a certain number of bits, the formula is: 2ⁿ-1. For 16 bits, that's 65535. For 24 bits, it's 16 Million. For 32 bits it's 4.2 billion. The only advantage I see of upsampling is synthetically aiding the higher frequency harmonics in potentially getting a more accurate representation. A 20 kHz wave only gets 2.205 samples per cycle, which is terrible, though that's at the upper limit of human hearing, so it doesn't really matter. In the range of music, we get a lot more samples. The top key on a standard 88 key piano keyboard has a frequency of 4186.009 Hz. A 44.1 kHz sampling frequency will capture that top note with 10 samples per wave. You can get a pretty decent sine curve with 10 samples.
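The arithmetic in the comment above can be checked with a few lines of Python. This is just a sketch; the function names are made up for illustration.

```python
SAMPLE_RATE = 44_100  # Hz, CD rate

def unsigned_max(bits):
    """Largest unsigned integer that fits in the given number of bits (2^n - 1)."""
    return 2 ** bits - 1

def signed_range(bits):
    """(min, max) of a two's-complement signed integer with the given bits."""
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1

def samples_per_cycle(frequency_hz, sample_rate=SAMPLE_RATE):
    """How many samples fall within one cycle of a tone at frequency_hz."""
    return sample_rate / frequency_hz

for bits in (16, 24, 32):
    print(bits, "bits:", unsigned_max(bits), signed_range(bits))

print("seconds per sample:", 1 / SAMPLE_RATE)
print("samples per cycle at 20 kHz:", samples_per_cycle(20_000))
print("samples per cycle at top piano key:", samples_per_cycle(4186.009))
```

Run as written, the top piano key comes out at a little over 10 samples per cycle, in line with the rounded figure quoted above.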
In his window metaphor, the bit depth would be the size of the window you peer through. 16 bit you can see quite a bit, but 24 bit you can see a little more in your periphery. Kind of how 24 bit sounds just a little more intimate and detailed. We get to see a little bit more through our window, even if we are still only glancing 44,100 times a second. Feel free to correct my scenario if I'm incorrect audiophiles! 😊
If you are confused about the order of the bits where he flipped the place values, it is because he is doing Least Significant Bit first (LSB) which is how digital audio is usually represented.
Paul, I can clearly see that you are a salesman for audio gear^ man, to keep it simple and true: the bit depth just gives you the possible dynamic range, which is 96 db with 16 bit and 144 db with 24 bit. the sample rate gives you the frequency, which is ca. 22 kHz with 44.1 kHz and 48 kHz with 96 kHz.
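The 96 dB and 144 dB figures follow from the usual rule of thumb of roughly 6 dB per bit, and the frequency limits are simply half the sample rate. A small Python sketch, assuming the simple 20·log10(2^bits) estimate (which ignores dither and the extra 1.76 dB term sometimes quoted for a full-scale sine):

```python
import math

def dynamic_range_db(bits):
    """Approximate dynamic range of linear PCM: 20 * log10(2^bits)."""
    return 20 * math.log10(2 ** bits)

def nyquist_hz(sample_rate_hz):
    """Highest frequency representable at a given sample rate."""
    return sample_rate_hz / 2

for bits in (16, 24):
    print(f"{bits} bit: {dynamic_range_db(bits):.1f} dB")

for rate in (44_100, 96_000):
    print(f"{rate} Hz sampling: up to {nyquist_hz(rate):.0f} Hz")
```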
Yep. This video is not ideal, is it? If anyone wants to learn about digital audio properly they should probably first go and watch “digital audio show and tell with Monty Montgomery”. It’s available here on YouTube and explains how digital audio works.
Paul, you might someday want to explain how interpolation plays a function in error correction such as when data on a CD becomes missing where there is a physical scratch on the disc.
Great explanation Paul. I even like the conclusion but there is one nit I must pick. Your conclusion is correct that these interpolating DACs reduce the problem with the brick-wall filter and that is the real reason for their benefit since no new information is created.. However, it is only on the DAC side so it is the reconstruction filter that is aided and it does not have to do with anti-aliasing, which is on the acquisition side. It is similar though. Just a small error in terminology near the end.
A short answer to this question for the technically inclined would be “up-sampling a) reduces noise by noise-spreading and noise-shaping, b) it reduces harmonic distortion because the DAC is more linear and c) it provides a more neutral (with respect to frequency and phase) handling of the audio signal, compared to a traditional DAC”. Now, the long technical answer: a) By upsampling, the DAC noise is spread over a much larger frequency range while its total power remains the same. So the effective noise power within the audible region is decreased by the amount of oversampling. Furthermore, by employing carefully designed filters, the noise spectrum is shaped so that more noise energy is directed towards the high end of the upsampled bandwidth, well beyond the audible range, resulting in even larger noise reduction in the audible range. b) An oversampling DAC has inherently more linear behavior and therefore exhibits lower harmonic distortion. The explanation is that with oversampling we can use fewer bits for the signal reconstruction (typically 4 bits in most high-end DACs today). PSAudio’s Direct Stream goes even further by using only 1 bit for the final signal reconstruction. The idea here is that, in effect, we prefer to reconstruct the signal voltage not directly in the voltage domain, but in the time domain instead. With today’s technology we cannot directly reconstruct an arbitrary voltage with as much accuracy as we would like (say with an R2R ladder). However, in the time domain, we can obtain extremely accurate intervals, on the order of a few parts in a trillion. So in the final stage of an oversampling DAC, the voltage levels are basically encoded in the time domain by a high rate PWM or PDM signal. c) Of course oversampling, as Paul explained, helps us design audio reconstruction filters that have a flatter frequency response and a more linear phase response within the audible range. Hope that helps. What a great series of videos Paul!
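Point a) can be put into numbers with a short Python sketch. It assumes the quantization noise is white and spread evenly from 0 Hz to half the (oversampled) rate, with no noise shaping at all; under that assumption the in-band noise drops by 10·log10 of the oversampling ratio. The function names are made up for illustration.

```python
import math

DB_PER_BIT = 20 * math.log10(2)  # about 6.02 dB of dynamic range per bit

def in_band_noise_reduction_db(oversampling_ratio):
    """Noise reduction inside the original band when white quantization
    noise is spread over a band that is oversampling_ratio times wider."""
    return 10 * math.log10(oversampling_ratio)

def equivalent_extra_bits(oversampling_ratio):
    """The same reduction expressed as additional bits of resolution."""
    return in_band_noise_reduction_db(oversampling_ratio) / DB_PER_BIT

for ratio in (2, 4, 8, 64):
    print(f"{ratio}x: {in_band_noise_reduction_db(ratio):.2f} dB, "
          f"about {equivalent_extra_bits(ratio):.2f} extra bits")
```

Without noise shaping, every quadrupling of the rate buys only about one bit, which is why practical oversampling DACs lean on noise shaping as well.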
Don't DACs nowadays have 18 to 24 bits? Some even have 32-bit floating point (which is not a real 32 bit). It's either multibit (16/18/20/24) or bitstream (1 bit / delta sigma). I might be wrong, though.
S. Kojina a 16/24/32 bit DAC means how many bits it can handle at its input. Here the issue is how the DAC eventually reconstructs the analogue signal. Almost all modern high-end DACs do this with high rate sigma-delta based 1/2/4 bits converters
Aah, I didn't notice that one. Thank you for letting me know! So, I use an AD1853 right now. The datasheet only states "multi-bit sigma delta modulator" without specifying how many bits it converts. The internal interpolator is 8xFs. Do you have a formula to calculate the conversion bits?
No, you cannot get higher resolution by upsampling. It is impossible to add resolution to a recording that didn't have it to begin with. Think of a favorite photo scanned at low res. When you magnify enough, you'll see pixels. Yes, you might be able to "smear" the edges of pixels as to not see the lines (which would be the equivalent of Aliasing) , but it will still be blurry.
Ok...maybe I’m missing something but this discussion was about recording. Wasn’t the question about playback? Why take a 44k signal from a CD and upsample it before the DAC?
This is a great video. Thanks. I would love to see this kind of breakdown about PCM to DSD or about exactly how DSD noise shaping works with only 1-bit samples
OGmolton1 DSD, instead of quantizing the amount of voltage in a signal at a given sample rate, quantizes whether the voltage is rising or falling at any point in time, at a given sample rate. It turns out that if you sample the signal 2.8 million or 5.6 million times per second (as opposed to 44,100 or 385,000 times per second for MQA PCM) the signal can faithfully be reproduced with just that one bit of info: up or down, 1 or 0. My understanding is that DSD requires highly precise internal clocks because of the sample rate, but the 1-bit info stream is less error prone in processing.
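For anyone who wants to see how a stream of single bits can carry a level, here is a minimal Python sketch of a first-order sigma-delta modulator. Note the assumptions: the "up or down" description above is a simplification, real DSD encoders use higher-order modulators with noise shaping, and the function name and the constant test input are made up for illustration.

```python
def sigma_delta_1bit(samples):
    """First-order sigma-delta modulator (illustrative sketch only).

    Takes input values in the range -1.0..1.0 and returns a list of
    +1.0 / -1.0 outputs whose running average follows the input."""
    integrator = 0.0
    previous_output = 0.0
    bits = []
    for x in samples:
        # Accumulate the error between the input and the last 1-bit output.
        integrator += x - previous_output
        output = 1.0 if integrator >= 0.0 else -1.0
        bits.append(output)
        previous_output = output
    return bits

# Hypothetical test: a constant level of 0.25 encoded as 1000 one-bit samples.
level = 0.25
stream = sigma_delta_1bit([level] * 1000)
recovered = sum(stream) / len(stream)  # crude low-pass filter: a plain average
print(f"input level {level}, average of the 1-bit stream {recovered:.3f}")
```

Each individual output is only +1 or -1, yet averaging the stream recovers the level. That averaging is the job of the low-pass filter that follows a 1-bit converter.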
That was a great explanation! If I understood right, I can upsample 44.1 kHz Red Book to 88.2 kHz and choose a filter on the DAC named Super Slow Roll-Off, and then I don't need Hi-Res?
Nice explanation, Paul, but please stop drawing stair steps - it perpetuates the myth that the output from DACs are stair-stepped, which is not at all the case. It’s more fair to draw the samples as discrete unconnected points.
@Dan B Yes, it would be a much better representation for most of his discussion, if he used discrete points - because that's how the information exists in the digital domain. It's also much easier to draw. However, where on earth did you get the idea that the analog DAC output isn't stair-stepped? What do you imagine that it's doing between the time points where it's set? It's an analog signal, and exists continuously, so it has to have a value. The only deviation from stair-step, is the buffer amp's slew rate and settling response.
@Dan B Monty did a very nice job on that discussion. However, he completely neglected to mention one important point: Nowhere did he show the actual DAC output. Instead, he only showed the output downstream of the reconstruction filter, which is built into his A/D/A converter box. If he added an internal test point to the converter box at the actual DAC output, and connected it to his 'scope, you would have seen a stair-step - just like the one displayed by the app running on his ThinkPad.
I see what you’re saying, thanks for pointing that out, Marianne. I was only concerning myself with the signal which is sent out to the amplifier/preamplifier, which after filtering is not stair-stepped. In that context, I consider the filter to be part of the DAC.
I think in this context, considering the filter at the end of the DAC would totally obfuscate the lesson that's being imparted here. getting those "stair steps" as small as possible can present the filter with a more accurate representation of the original sine wave even before smoothing takes place.
The section about oversampling begins at 12:20 . Impressive how the noise floor perceptibly decreases on an 8-bit signal as the sample rate is doubled. One can create a test tone of a given low level in a set of 24/48/96/192 kHz files: it is barely discernible above the dither in the 24 kHz version, but can be heard quite clearly in the 192 kHz case, without any noise shaping. It makes an inefficient storage format though, as adding a bit is cheaper than quadrupling the sampling rate. The motion of the signal can even be "seen", given long enough time slice, as the noise moves up and down roughly following the shape of the wave. I recall there were 14-bit DACs that made use of the information that already existed in the stored 16-bit stream, without creating it, and achieved near 16 bit signal to noise ratio.
Yes, because when red book came out, 14/44.1 was all that existing hardware could do, and a few companies had promised 16bit DACs soon. These companies were impatient for profits and didn't want to wait until 20bit/56khz DACs. Nowadays, average DACs can do it fine at 48khz, though even 44.1khz sounds better than it used to.
Great, thank you. If I got that right for a change, it is smoothing the analogue output in the digital domain, so to speak; by removing the audible impact of the filter, that "digital glare" is at least reduced.
M Scaler by Chord definitely makes the sound better. I understand that it's an upsampler? So it does increase resolution. Im not experienced in these gadgets, but for sure it improves the already excellent Dave and TT2.
Does all of this mean that the major sonic benefit of upsampling is on the higher frequency part of the sound spectrum, which is presumably most impacted by the low pass digital filter?
I watched a video from Hans Beekhuyzen Channel and he said the amplitude is not really encoded and decoded in ‘steps’ form, but a proper sine wave. Confused now lol..
I watched that video. If I remember properly, his contention was that the electronics cannot change voltages instantly, and the momentum has the effect of smoothing the sine wave.
@n4jg6 Actually, it's not a proper sine wave, either. It's two sine waves. If F(in) is the input signal sine wave frequency, then the digital representation will include both F(in) and F(sample) - F(in). For example, recording an 18KHz sine wave using 44.1KHz sampling, produces samples which are the superposition of 18KHz and 26.1KHz sine waves. During playback, a reconstruction filter is required to remove the unwanted 26.1KHz component. If up-sampling is used for the playback chain, much of this filtering can be done digitally, upstream of the DAC.
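The 18 kHz / 26.1 kHz example can be checked numerically. The Python sketch below uses cosines so that the two sets of samples match exactly (with sines the 26.1 kHz samples come out sign-inverted); the number of samples compared is an arbitrary choice for illustration.

```python
import math

SAMPLE_RATE = 44_100  # Hz

def sampled_cosine(frequency_hz, count, sample_rate=SAMPLE_RATE):
    """Return `count` samples of a unit-amplitude cosine at frequency_hz."""
    return [math.cos(2 * math.pi * frequency_hz * n / sample_rate)
            for n in range(count)]

tone = sampled_cosine(18_000, 64)                 # the recorded signal
image = sampled_cosine(SAMPLE_RATE - 18_000, 64)  # 26.1 kHz image

largest_difference = max(abs(a - b) for a, b in zip(tone, image))
print(f"largest difference between the two sample sets: {largest_difference:.2e}")
```

The two sets of samples are indistinguishable to within rounding error, so the samples alone cannot say which tone was meant. The reconstruction filter settles it by keeping only what lies below half the sample rate.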
The whole stair-step thing is a result of a DA converter doing something called “sample and hold”. It only exists as the output of the DA converter before filtering. The samples on a CD are indeed snapshots (really like that he used that term) and they are enough to capture and replicate all the audible frequencies. Why? Because we can only hear pure sine waves at 20 kHz (well, a young person can). A 20 kHz signal that is anything else but a sine wave (like a stair-stepped signal) will have higher frequency components. Filter them out and you are back to the sine wave.
The amplitude *is* really encoded in steps. But the sharp edges and straight lines of those steps require very high frequencies to produce, so the first stage of the DAC effectively produces very high frequency noise. The second part of the DAC is a low pass filter (which is the filter Paul talks about at the end) that removes all of the quantization noise. Without those high frequencies, the resulting signal must be returned back into the original smooth curve. So, by the output, the waveform can once again be a proper sine wave.
Sort of explains why vinyl is still around. Smooth analog sine waves but mechanical limitations in theoretical frequency response limits and dynamic range. However the cost of a really good turntable, cartridge and preamp can be out of reach for most people vs. a decent CD player.
George Bedorf Hey, a lot of turntables are over-engineered. I bought an ATLP120 on sale for 230 dollars, because I like direct drive. In spite of what they tell you, spinning a platter at 33 1/3 RPM is not that hard; the major difference between TTs is the cartridge.
Very difficult subject. Lots to get across even in basics. But I think you missed the context of the question. I think the questioner was asking about playback, not record. An understanding of the record process and its problems does correlate with the complementary problems on the playback side, but the needs and process are reversed. We would start with a CD that already has the audio stored digitally on it at 44.1K/16 bit. If the CD player did a 44.1k sample, it would get one sample of the data for that word every 1/44,100 of a second. Output filtering would have to be very sharp to remove anything above 20K. If you double the clock, double the number of pulses from the laser to the CD, you can sample that physical data twice. Thus if there were errors in the first sample, the error rate could be reduced with the second sample. 4 times oversampling, 4 chances to get the sample correct. Or at least average the 4 samples to produce one output voltage more accurately. Now, with a 4 times sampling rate, if your DAC (Digital to Analog Converter) in the CD player will reproduce a 176.4K data rate/frequency, then the OUTPUT filtering does not have to be as dramatic for the eventual 22K analog output. Technology advanced, allowing lower cost circuitry with higher sampling rates and bit depth. But the delivery method that is CD was locked in. So they designed around the delivery method.
You get more information, but it is interpolated, hence calculated extra information between the existing data points. You might get information that was not there. It might not be correct, but they are predicted to be there. So you get more information to work with after an upsampling. It might be easier to run this signal through a DSP after, because the data is less jagged.
A fun watch. The good news is with the right playback solution Redbook CD's or equiv. digital 44.1/16 files can sound really good. IMHO, it's all about the DAC and how it recreates the analogue signal.
A good way to explain the filter problem is to consider a square ledge in a tank of water: a wave will reflect off the sharp edge. In music you get the artifacts as ring harmonics. So if you have a soft filter the wave does not reflect and you don't get the wave "bouncing" off the band pass filter. Maybe this is too simplistic, but I could describe this with Fourier analysis and that would be too far the other way. As always, I appreciate your knowledge of music and audio.
Just watched the video. I like the tank of water analogy. I presume this is referring to the wave-fronts leaving the speakers (chopped at 22 kHz) acting in this way. A CD (44.1 kHz) can capture a signal up to a frequency of 22 kHz, so if you sample at, say, 176 kHz you can see a signal at 88 kHz. I assume by interpolating the signal between the existing signal points we can sample at 176 kHz and construct a reading at 88 kHz, which allows the filter to be softened. So we are generating a wave at 88 kHz, not 44.1 kHz, and the chop-off distortions are at 88 kHz, not 22 kHz, so they have much less effect.
I'm sure it's a bit different but it seems very similar to upscaling video. Some devices do it more elegantly than others. Because of the 'prediction' it gives the illusion of more data.
Good intention and overall a solid channel, but, I think the point of the Nyquist theorem is that sampling at 2x the highest frequency does in fact produce the equivalent output ( not stair-stepped). The issue is more around the accuracy of the electronics and the impact of filters which he does cover and that is accurate and helpful in understanding oversampling on playback
Not quite. Specifically for the Nyquist-Shannon theorem to work, the channel must be bandwidth limited. This is where the low pass filter comes in. Creating the stair step signal injects very high frequencies on the signal in order to make the straight lines and sharp corners of the stair steps. It is only when all of those high frequencies are removed that the smoothing takes place. Without the high frequencies to generate the sharp edges of the quantization noise, the signal is smoothed back into its original rounded shape. What the Nyquist-Shannon theorem really does is determine the cutoff frequency between the frequencies needed to accurately reproduce the signal and the frequencies that only generate quantization noise.
@@timharig that is why I called out the accuracy of the filters. There is no stair step in the final output, that is the whole point of the Nyquist theorem- if you sample at twice the rate of the Top frequency of a band limited signal, the output IS equivalent to the input, i.e. it is not stair stepped
@@steveg219 I'm seeing a lot of confusion in the comments from people who are failing to understand the purpose of the filter. Paul didn't properly explain the need for the lowpass filter. It is important for this explanation for them to understand that Nyquist-Shannon only produces an "equivalent output" AFTER the quantized "stair step" signal from the DAC resistor ladder (or PWM or whatever) has gone through the lowpass filter to return it to a smooth signal. Until it goes through that lowpass filter it is still a "stair step" PCM or a square wave width pulsed signal with sharp corners. Paul's entire point is about what happens BEFORE the "final" output. If they assume that the signal already came out smoothed then they see no benefit of oversampling because they are only seeing the finished signal after the lowpass filtering has already been accomplished. Without understanding that the quantization noise ever existed in the first place, how can they understand the benefits of moving that noise up the spectrum by increasing the sampling frequency? They assume that any filtering is done on the signal itself rather than on the artifacts created by the DAC modulation. Finally, the Nyquist-Shannon theorem is not simply about sampling at double frequency. It is much more general than that. It determines the lowest frequency of the quantization noise. Sampling at double frequency is simply the special case at which the sampling frequency can no longer be reduced before the quantization noise occurs in the frequencies of the signal itself. Paul's whole point is that by increasing the sampling rate above the signal rate, you move the lowest noise frequency up the spectrum. This creates a gap between the highest signal frequency and the lowest frequency of the quantization noise.
If you increase the sample rate enough then you can create a gap large enough for the lowpass filter to have a long roll-off before it has to attenuate the high frequency quantization noise. You cannot fully understand that if you only think Nyquist-Shannon is as simple as doubling the sampling rate. The fact that the sampling frequency must be double the signal frequency is a consequence of the theorem. Not the theorem itself.
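The size of that gap is easy to put into numbers. The Python sketch below assumes a 20 kHz audio band edge and treats the lowest unwanted component as the first image of that band, which sits at the sample rate minus 20 kHz; the function names are made up for illustration.

```python
import math

AUDIO_BAND_EDGE = 20_000  # Hz, assumed top of the audible band

def first_image_hz(sample_rate_hz, band_edge_hz=AUDIO_BAND_EDGE):
    """Lowest frequency at which an image of the audio band can appear."""
    return sample_rate_hz - band_edge_hz

def transition_octaves(sample_rate_hz, band_edge_hz=AUDIO_BAND_EDGE):
    """Room the lowpass filter has to roll off, measured in octaves."""
    return math.log2(first_image_hz(sample_rate_hz, band_edge_hz) / band_edge_hz)

for rate in (44_100, 88_200, 176_400):
    print(f"{rate} Hz: filter must be done by {first_image_hz(rate)} Hz, "
          f"{transition_octaves(rate):.2f} octaves above 20 kHz")
```

At 44.1 kHz the filter has only about a quarter of an octave to work with, while at 176.4 kHz it has nearly three octaves, which is what allows a much gentler slope.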
@@timharig good additional information, I think we agree here that his point was good but was lacking some relevant info. Thanks for the additional clarification. Btw, I do all my recordings at 96k!
Rick Yarussi Yes, and I also cringed when he thought 16 bits would be around a million levels instead of 65536, but his preface meant that it was not for us. Also, when he counted the levels per number of bits he went 2,4,6,8 instead of 2,4,8,16. I am glad he was trying to make a simple explanation.
I once owned an Arcam 73 CD player, a mid-priced award winner. I spent a fair bit of money to get it upgraded to the next model up, which had a 192 upsampling DAC. My nice sounding CD player was transformed into a noisy mess.
A smooth curve has a lot of information, the smoother you make the curve through interpolation the closer it approaches the analogue in terms of presentation, not information. I would use this analogy, it is an improvement, in that it becomes more bio available to the analogue brain system.
There’s a lot more to tell about this subject in general, but suffice it to say the 44.1kHz is all you need to capture every bit of information in the audible frequency spectrum. Nyquist/Shannon tells us that a band-limited signal only has to be sampled at twice the upper frequency of the band. The stair-steps are not present in the analog output of a digital source because these stair-steps are made of frequencies beyond the band limit, and are filtered out. The only thing I kind of missed in this video is the fact that 44.1 is enough, as the question implied it wasn’t because “you only have so many samples”.
Just watched the video. I like the tank of water analogy in the comments below: it's like a square ledge in a tank of water, where a wave will reflect off the sharp edge, creating ring harmonics. I presume this is referring to the wave-fronts leaving the speakers (chopped at 22 kHz) acting in this way. A CD (44.1 kHz) can capture a signal up to a frequency of 22 kHz, so if you sample at, say, 176 kHz you can see a signal at 88 kHz. I assume by interpolating the signal between the existing signal points we can sample at 176 kHz and construct a reading at 88 kHz, which allows the filter to be softened. So we are generating a wave at 88 kHz, not 44.1 kHz, and the chop-off distortions are at 88 kHz, not 22 kHz, so they have much less effect.
Never thought about it, but does 44khz mean a full wavelength of a 22k sine wave is approximated by two samples? Sounds pretty drastic. Makes me wonder why it sounds anything like the original wave
How many DACs will actually alter the brick wall slope according to the incoming bit rate? My DAC offers 3 brick wall slope variations and the only effect seems to be a small drop of sibilance or brightness in the upper midrange. Over sampling does smooth the sound but I figured this was from the occasional rogue bit getting through, not a lower order brick wall filter automatically appearing. Digital audio has been around long enough that it shouldn’t cost over 5 or 10 thousand to make it better than just decent! Just decent isn’t as good as a vinyl record. It may be more convenient but not better! To be honest my turntable and cartridge is probably worth over a thousand bucks, but digital is supposed to be way better! I’d really appreciate answers or tips. Why does digital cost as much or more to sound as good as vinyl?
Hi Paul! I always wondered what is about with the lower part of the frequency response of an audio CD. Is the lower limit 1HZ or is there some filter also in the low end, say between 1 and 20HZ?
Péter Sági No. The Nyquist theorem puts only an upper bound. EVERYTHING below the half Nyquist frequency can be reproduced correctly, i.e. 1 Hz as well. There the limit is actually the physical capability of your reproduction system (speaker, headphone).
Thanks for your response Andras, but it's not 100% correct. According to the Nyquist theorem you could put frequencies up to 22KHZ on a CD. The reason it only goes up to 20KHZ is that there's a 2KHZ region to put gradual rolloff filters into - as Paul describes in the video. My question was that is there a need for such a filter in the low end region.
@@petersagi275 No. DACs do not need high pass filters. I think that you are confusing the purpose of the low pass filter in this instance. This might be because of the simplified explanation or because you are associating it with the purposes low pass filters serve in other places -- such as in a speaker crossover. In other cases, you are trying to remove unwanted parts of a signal. In this case, you are effectively doing something more akin to a mathematical function. You are removing something that was produced by the DAC and never part of the signal itself. If you look at the stair step wave at 12:00, you will notice that the stair step structure produces sharp angles and straight lines. These sharp angles and straight lines mathematically require very high frequencies to produce, so the part of the DAC that produces the stepped voltages is effectively inserting a bunch of very high frequencies onto the signal that were never actually present in the digital stream. This can be referred to as quantization noise. The low pass filter acts as a mathematical operation to convert that stair step pattern with all of the sampling noise back into the original smooth waveform that was recorded. It does this by removing all of those frequencies that were injected by the DAC. Without those frequencies, it is impossible to recreate those lines and sharp angles and so the waveform is smoothed back into its original rounded form. So that is why you absolutely need a low pass filter out of the DAC. Without it, you would end up with the quantization noise in the output. But you would not want to pass the output through a high pass filter. The quantization noise is all above the Nyquist-Shannon sampling frequency. If you were to use a high pass filter, you would be removing parts of the original signal that were part of the recording.
Daniel Denholm Yeah, I agree. I've found 44 and 48 kHz upsampled to 176 on my DAC sounds more crisp and detailed, but again it's all volume relative. Turn up the volume and EQ the mix and you will get more separation. The key is to achieve that at low volume..
When you mix, you add information so this is common sense, but the point is, the original recording will not be better. What you should do in the first place, is to set your DAW environment to 32-bits at 96kHz in the first place, or 192kHz. Why are you going from something worse to artificial?
In short: every digital signal is a snapshot where something has to be prophetically predicted on the assumption that what happens next is coherent with what happened before. More samples make it possible to smooth things out even if there is not any actual data.... why does this look a lot like what an mp3 codec does....
Thanks for this. I would argue that the interpolation does *create* more information but this is neither here nor there. I now understand why upsampling could be good. Thanks for this.
There are so many factors in getting good audio quality. It’s not only bit depth & sample rate, but also the amount of kHz, the amount of kbps and the signal-to-noise ratio. It drives me insane: one 16-bit audio sample is not the other 16-bit audio sample. For instance, a 16-bit audio sample at 11 kHz at 64 kbps doesn’t sound better than an 8-bit sample at 22 kHz at 128 kbps. As a result we consumers can easily get fooled & confused by those audio equipment companies, aaarrrggg.
Guess you lose a lot of phase shift with a lower order filter. I listened to your talk about regulating power supplies. Although the output stage doesn’t use voltage regulation it can provide a huge buffer against output power rippling the supply voltage. I believe using a voltage regulated power supply could trigger harmonics within the regulation, introducing some low level noise. A lot of film bypassed capacitance and a high current transformer may allow the output voltage to wander a bit with ac variations but should create pretty smooth and stable road for the signal to ride on, not introducing any power supply artifacts. A cathode follower tubed power supply is awesome but the price does add up.
Nice video as always Paul! You have two kinds of people, those who can count binary and those who don’t :-) Maybe it’s more intuitive to show it as 1 and 0 instead of the stair steps
Exactly, Paul doesn't know how to properly set up the bits on that board, the most significant bits to the least significant bits. Also, he doesn't understand nibble, byte and word terms -> 4-bits, 8-bits and 16-bits.
This is one of the best from Paul! I have always wondered whether upsampling has the same effect as converting from flac to wav and then to analogue. Does oversampling draw more power than usual sampling and in that way compromise the power supply, like what happens with flac? Because interpolation needs some resources, or did I get that wrong?
I've never heard down sampling decimation make something better unless it's with content with a bunch of ultrasonic garbage you're trying to get rid of, but I have heard good oversampling DACs and especially interpolative upsampling, both from Burmester Audiosysteme and on the Denon DJ DN-X1700 inputting 44.1khz and letting the mixer run at 96khz. The 44.1khz straight mode was closer to what other DACs produced with 44.1khz, but I found the interpolation to be more pleasing and seductive sounding. I've never heard any of the plug-ins for Winamp or Foobar for upsampling that did that type of pleasing interpolation, though. Maybe someone knows a good one? I realize you have to use a USB DAC that's capable of exploiting it. I also am a big fan of HDCD, though that's apparently a little different.
Indeed, you cannot get more info out of 16 bit, but you can use linear prediction to add more samples between existing samples, based on the next and previous samples, to get smoother audio. 100hz smooth motion works similarly: if in the first image the hand was lowered while in the next frame the hand is raised all the way up, then the system can presume the hand was in the middle in between, and so on, resulting in smoother motion. In fact you could multiply new samples and pictures as much as you want by creating new predictions based on previous predictions. So even though the information wasn’t there, by using prediction we can assume what’s supposed to be there, albeit not 100%, but still, it’s incredible what you can do nowadays with such advanced AI technology👍
Without getting too complicated and explaining the fast Fourier transform here, let me try to state it as simply as I know how: if you apply a digital FIR (finite impulse response) filter in the process of interpolating data between the samples, you are effectively controlling the main-lobe shape and the side-lobe shapes, or stopband, of the frequency response. That is, you are controlling the shape of the main passband response: how rounded it is, how quickly it falls off after 22,050 Hz, and how low the side lobes or stopband are at slightly higher frequencies, say between 23,000 and 44,000 Hz. If you assume there are no signals above 22,050 Hz that you are interested in, because the human ear cannot hear them, then you have lost no information and gained no information by upsampling. What you have really done is control some of the unintended distortion and aliasing effects that can happen if you were not filtering perfectly (in your output or reconstruction filter after the D-to-A stage) right above the 22.05 kHz limit (44.1 kHz sample rate). Or you have made it easier to design the output filter if the sample rate was doubled, that is, you do not have unintended, noticeable noise between 22,051 Hz and 44,100 Hz (in the case of an 88.2 kHz sample rate). Or you may find it easier to design a linear-phase filter when the sample rate has been moved up higher. To boil this down as simply as possible: digital FIR filtering and creating interpolated samples can only give you a more faithful representation of the original smooth or jaggy time-varying signal; it cannot create any more information than the band-limited signal (say 20 Hz up to 22,050 Hz) you started with (there is always inherently band-limited filtering, or an actual filter, before the sample-and-hold A-to-D stage).
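A rough sketch of the FIR interpolation described above, assuming a 2x ratio, a 63-tap Hamming-windowed sinc filter and zero-stuffing (these specific choices are illustration assumptions, not what any particular DAC does). With this half-band design the original samples pass through unchanged and only the in-between values are computed, which is one way to see that nothing new is being added.

```python
import math

def lowpass_taps(num_taps=63, cutoff=0.25):
    """Hamming-windowed sinc low-pass. cutoff is a fraction of the NEW rate,
    so 0.25 of 88.2 kHz keeps everything up to about 22.05 kHz."""
    mid = (num_taps - 1) / 2
    taps = []
    for i in range(num_taps):
        t = i - mid
        ideal = 2 * cutoff if t == 0 else math.sin(2 * math.pi * cutoff * t) / (math.pi * t)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * i / (num_taps - 1))
        taps.append(ideal * window)
    return taps

def upsample_2x(samples, taps):
    """Insert a zero after every sample, then low-pass filter (centered FIR)."""
    stuffed = []
    for s in samples:
        stuffed.extend((2.0 * s, 0.0))   # gain of 2 makes up for the inserted zeros
    half = len(taps) // 2
    out = []
    for n in range(len(stuffed)):
        acc = 0.0
        for k, h in enumerate(taps):
            j = n + half - k
            if 0 <= j < len(stuffed):
                acc += h * stuffed[j]
        out.append(acc)
    return out

# Usage: a 1 kHz tone at 44.1 kHz becomes the same tone at 88.2 kHz.
source = [math.sin(2 * math.pi * 1_000 * n / 44_100) for n in range(200)]
doubled = upsample_2x(source, lowpass_taps())
print(len(source), "samples in,", len(doubled), "samples out")
```

Away from the ends of the buffer, the new in-between samples land very close to where the original 1 kHz sine would have been, because that tone is well inside the band the filter keeps.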
If people want to get into debates about how you CAN hear the harmonics of percussion and brass and other instruments higher than 20,000 Hz in the human ear that is totally another discussion, folks.
yeah, I know: I am known to be verbose. I get passionate about engineering, DSP, audio and similar principles. I just hate it when people claim that there are more higher frequencies that the human being and its ear are supposed to “hear”, which is simply just not true :)
@@chucknorrismeta3171 Yeah, I recently found out that steep filtering causes phase issues. And frequencies above 20kHz can affect audible frequencies because of intermodulation.
I hate upsampling, there are so many algorithms and almost all of them make the sound unpleasant, maybe cleaner, but there is always something wrong. Worst of all is dual upsampling, like 44.1 to 96khz on a PC with an unknown algorithm and then 96khz to 192khz in the DAC... I think this is the reason so many people hate the computer as a device to play music... sometimes there is also a digital volume control on the PC which can degrade the sound even more... I also have a NOS DAC, it sounds nice, but there is much less detail in the highs... New 96khz recordings sound best for me on modern DACs (like ESS DACs) without upsampling, but 99% of my music is old... so what to do?
I’m afraid all values between two measured ones are created by the output low-pass filter. We never hear the steps. We hear harmonics of sine waves. What low bit depth does is introduce noise and reduce dynamic range. A really bad thing is that this noise is periodic and sounds like ringing. But if we simply increase the bit depth, we will not increase the dynamic range, and so will not add more information. We add zeroes during upsampling (yep). We can do nothing with the already measured values because we suppose they are valid, since they came from the original file. But this is not true: they contain a quantization error, and so the noise remains. To actually gain dynamic range, we have to push the noise out of the audible range with a noise-shaping technique in conjunction with upscaling and upsampling. We can even downscale (but not downsample) the values, get back to the original bit depth, and get a bigger dynamic range anyway. Just “simple” math. Not electronics.
Paul is the Richard Feynman of audiophile world due to his ability to effortlessly explain complex electrical engineering concepts in simple (yet accurate) terms that laypeople can understand. This video exemplifies this ability.
"Don't get all pissey on me" Lol, nearly fell off my chair! Love it when Paul talks straight!
What a great explanation....I understood it, at least till I forgot it a couple of minutes later. But I can watch it again and I'm really grateful for your off the cuff explanation. Thanks Paul
This channel is pure gold! I'm an engineering student and also interested in building my own active speakers as a hobbyist. I really love to get into a deep understanding of all the audio stuff and you are helping me a lot! Thank you!
When you are a student, don't listen to those poor minded audiophiles
@@Harald_Reindlhe said -is our chief engineer bob standards hahaha and snap shots goes chakp chakp chakp chakp 😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂😂 and i love to use water filter on dac..😅😅😅😅
@@pracheerdeka6737 are you drunk?
@@Harald_Reindl no but it's funny
@@pracheerdeka6737 if you find such useless comments funny you are as dumb as the typical audiofool
In a nutshell:
Sampling Rate = how many snapshots of the music are taken each second.
Bit Depth = how much information is present in each snapshot.
Although Paul is correct, in that upsampling does not mean more information, this needs to be put into perspective. Because if there was no more information, then upsampling would be meaningless.
More information will not be present in the source file (no modification to the source data is made). But upsampling (or oversampling) calculates what would be present between two existing samples, creates the samples on the fly, and feeds the new, additional samples to the DAC. So there is more information.
Although a source file contains, for example, 44,100 samples per second, upsampling it to 88,200 samples per second will result in what ultimately reaches the part of the processing that converts those samples into an analog sound (88,200 samples per second will be seen by the DAC and converted into sound by the DAC (the digital to analog converter)).
Upsampling is a mixed bag, and can result in a euphoric sound, which could be pleasant sounding, but will not be as accurate as a native file at the higher sampled rate.
And although upsampling has benefits, it could make songs sound unnatural.
If you had two files of the same song:
#1 produced at 44,100 samples per second (CD quality), and
#2 produced at 176,400 samples per second (quadruple #1), then...
... upsampling the 44,100 file by a factor of 4 will not yield the same sound quality as the native 176,400 sampled file.
Do you know what I liked about your post? I read it in like 2 minutes and basically understood everything Paul said in the video and some more. So thank you for your short and meaningful answer. Can I subscribe to you?
see my comment above
Excellent breakdown, thank you.
To be precise: bit depth is how accurate each snapshot is, but yes. Good explanation.
moron
Thanks for sharing!
Keep doing these great videos with explanations for newbies, even if it is 95% correct
it is a free lesson!
I have watched more of your videos than I could count, and have benefited from many, but THIS ONE helped me immeasurably.
This master series is like working with an open minded dude who was involved in common sense audio for 40 years. No nonsense, but not dismissive of newbies. Audiophiles can be snobby and know it all. Also, engineers are great for making audio products, but they can't explain how to install a light bulb. Paul is the middle man of audio information.
Master snake salesman
@ pls explain what you take issue with in this video.
Natural teacher.
Anthony Mak he's probably pissed cuz he can't hear the difference between his 128bitrate Spotify stream and an uncompressed bitperfect ripped CD
A true expert will be able to explain it in layman’s terms. But one has to know the layman’s terms to begin with...
Needs Part 2: The Reconstruction Filter.
The main reason for up-sampling a 44.1KHz-sampled signal, is to perform most of the playback reconstruction filtering digitally, and greatly simplify the analog filter which follows the DAC. This is economically important, because every playback device needs one, while the anti-aliasing filter used in front of the ADC for recording only needs to exist in the recording studio, so its cost is much less significant to the audio community.
Yes. Nyquist Shannon theorem assumes ideal, impossible engineering, especially in the integration and anti-aliasing filter. Oversampling, interpolation, and especially using higher rates in the first place, allows more tolerance in variations in the hardware implementation. Totally separate of the aliasing issues, it's also useful not to have much extreme ultrasonic content on even hi res recordings, though, because analog gear performs worse downstream with it.
Paul, you made me understand so much in a few minutes. I love your videos. Thank you so much!
*TWITCH* *TWITCH* *grimace*
Sorry, I'm a retired digital engineer and several times I wanted to correct you (as you know, engineers are like that), BUT OVERALL you gave a very cogent, understandable, simple explanation of a very complex subject, just as you intended.
Bravo! Well done!
EE/Control Systems Engineer myself and yeah, it's not perfect or exhaustive, but I seem to remember a similar print article from Julian Hirsch being my introduction to DSP when I was in high school.
i'm an electrical engineer. i watch these videos just because they make me angry.
Thank you Paul. Your explanation was easy for me to understand. Your down to earth and honest videos are refreshing.
First time I understood this...
Thank you sir!
16 bits = maximum positive integer value of 65535. Since you're doing a sine wave, you generally want positive and negative, so a signed 16 bits yields a range of (-32768 to 32767). That describes the amount of accuracy you have in the amplitude (height) of the wave. The sample rate of 44.1 kHz means each sample represents .0000226s. To determine the maximum value of a certain number of bits, the formula is: 2ⁿ-1. For 16 bits, that's 65535. For 24 bits, it's 16 Million. For 32 bits it's 4.2 billion.
The only advantage I see of upsampling is synthetically aiding the higher frequency harmonics in potentially getting a more accurate representation. A 20 kHz wave only gets 2.205 samples per cycle, which is terrible, though that's at the upper limit of human hearing, so it doesn't really matter. In the range of music, we get a lot more samples. The top key on a standard 88 key piano keyboard has a frequency of 4186.009 Hz. A 44.1 kHz sampling frequency will capture that top note with 10 samples per wave. You can get a pretty decent sine curve with 10 samples.
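The arithmetic in the two paragraphs above can be checked with a few lines. This is only a sketch; the function names are made up for the illustration.

```python
def unsigned_max(bits):
    """Largest unsigned value for a given bit depth: 2^n - 1."""
    return 2 ** bits - 1

def signed_range(bits):
    """Smallest and largest signed (two's complement) values."""
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1

def sample_period_seconds(rate_hz):
    """Time represented by one sample."""
    return 1.0 / rate_hz

def samples_per_cycle(rate_hz, tone_hz):
    """How many samples land on one cycle of a tone."""
    return rate_hz / tone_hz

for bits in (16, 24, 32):
    print(bits, "bits:", unsigned_max(bits), signed_range(bits))
print("sample period at 44.1 kHz:", sample_period_seconds(44_100), "s")
print("samples per cycle at 20 kHz:", samples_per_cycle(44_100, 20_000))
print("samples per cycle at top piano key:", samples_per_cycle(44_100, 4186.009))
```

Note that, by the sampling theorem, the low samples-per-cycle count near 20 kHz is still enough to reconstruct a sine at that frequency once the output has been low pass filtered.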
don't beat yourself up for getting old sir, we highly enjoy these videos.
In his window metaphor, the bit depth would be the size of the window you peer through. 16 bit you can see quite a bit, but 24 bit you can see a little more in your periphery. Kind of how 24 bit sounds just a little more intimate and detailed. We get to see a little bit more through our window, even if we are still only glancing 44,100 times a second.
Feel free to correct my scenario if I'm incorrect audiophiles! 😊
Thank you! I finally get it!!!! You're the man!
If you are confused about the order of the bits where he flipped the place values, it is because he is doing Least Significant Bit first (LSB) which is how digital audio is usually represented.
Paul, I can clearly see that you are a salesman for audio gear^
man, to keep it simple and true: the bit depth just gives you the possible dynamic range, which is 96 db with 16 bit and 144 db with 24 bit.
the sample rate gives you the frequency, which is ca. 22 kHz with 44.1 kHz and 48 kHz with 96 kHz.
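The two rules of thumb above follow from short formulas: roughly 6.02 dB of dynamic range per bit (20·log10(2) per bit, ignoring dither and the small extra term for a full-scale sine), and a highest representable frequency of half the sample rate. A quick sketch, with function names invented for the example:

```python
import math

def dynamic_range_db(bits):
    """Ratio of full scale to one quantization step, in dB."""
    return 20 * math.log10(2 ** bits)

def nyquist_hz(sample_rate_hz):
    """Highest frequency a given sample rate can represent."""
    return sample_rate_hz / 2

for bits in (16, 24):
    print(bits, "bit:", round(dynamic_range_db(bits), 1), "dB")
for rate in (44_100, 96_000):
    print(rate, "Hz sampling:", nyquist_hz(rate), "Hz bandwidth")
```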
Yep. This video is not ideal is it.
If anyone wants to learn about digital audio properly they should probably first go and watch “digital audio show and tell with Monty Montgomery”. It’s available here on YouTube and explains how digital audio works
Paul, you might someday want to explain how interpolation plays a function in error correction such as when data on a CD becomes missing where there is a physical scratch on the disc.
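As a rough illustration of that idea: when a sample cannot be recovered, a player can estimate it from its neighbours. The sketch below uses plain linear interpolation and marks unreadable samples with None; both are assumptions for the example, and real players may conceal errors differently.

```python
def conceal(samples):
    """Fill runs of missing samples (None) by drawing a straight line
    between the nearest good samples on either side."""
    out = list(samples)
    i = 0
    while i < len(out):
        if out[i] is None:
            start = i
            while i < len(out) and out[i] is None:
                i += 1
            left = out[start - 1] if start > 0 else None
            right = out[i] if i < len(out) else None
            if left is None and right is None:
                left = right = 0.0          # nothing readable at all
            elif left is None:
                left = right                # gap at the very start
            elif right is None:
                right = left                # gap at the very end
            gap = i - start + 1
            for k in range(start, i):
                out[k] = left + (right - left) * (k - start + 1) / gap
        else:
            i += 1
    return out

print(conceal([0, 10, None, 30]))
```

The filled-in value is a guess that keeps the waveform continuous; it is not the data that was originally on the disc.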
That sure made a lot of sense to me Paul, great job of explaining that, you made that much easier for me to understand.
Thx Paul that was a good one. Got a better understanding of the 24 bit 192khz DAC in my cd player.
Great explanation Paul. I even like the conclusion, but there is one nit I must pick. Your conclusion is correct that these interpolating DACs reduce the problem with the brick-wall filter, and that is the real reason for their benefit, since no new information is created. However, it is only on the DAC side, so it is the reconstruction filter that is aided, and it does not have to do with anti-aliasing, which is on the acquisition side. It is similar though. Just a small error in terminology near the end.
A short answer to this question for the technically inclined would be “up-sampling a) reduces noise by noise-spreading and noise-shaping, b) it reduces harmonic distortion because the DAC is more linear and c) it provides a more neutral (wrt to frequency and phase) handling of the audio signal, compared to a traditional DAC”. Now, the long technical answer:
a) By upsampling, the DAC noise is spread over a much larger frequency range while its total power remains the same. So the effective noise power within the audible region is decreased by the amount of oversampling. Furthermore, by employing carefully designed filters, the noise spectrum is shaped so that more noise energy is directed towards the high end of the upsampled bandwidth, well beyond the audible range, resulting in even larger noise reduction in the audible range.
b) an oversampling DAC has inherently more linear behavior, therefore exhibits lower harmonic distortion. The explanation is that with oversampling we can use less bits for the signal reconstruction (typically 4 bits in most high-end DACs today). PSAudio’s Direct Stream goes even further by using only 1 bit for the final signal reconstruction. The idea here is that in effect, we prefer to reconstruct the signal voltage not directly in the voltage domain, but in the time domain instead. With today’s technology we cannot directly reconstruct an arbitrary voltage with as much an accuracy as we would like to (say with an R2R ladder). However, in the time domain, we can obtain extremely highly accurate intervals, in the order of a few parts in a trillion. So in the final stage of an oversampling DAC, the voltage levels are basically encoded in the time domain by a high rate PWM or PDM signal.
c) Of course oversampling, as Paul explained, helps us design audio reconstruction filters that have a more flat frequency and linear phase response within the audible range.
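Point a) can be put into numbers under a simplifying assumption: the noise is white (flat across the whole band) and no noise shaping is applied. In that case the share of noise left in the audible band falls in proportion to the oversampling ratio. The function names below are invented for this sketch.

```python
import math

def inband_noise_reduction_db(oversampling_ratio):
    """Noise reduction in the original band when white noise of fixed total
    power is spread over oversampling_ratio times the bandwidth."""
    return 10 * math.log10(oversampling_ratio)

def equivalent_extra_bits(oversampling_ratio):
    """The same reduction expressed in bits (about 6.02 dB per bit)."""
    return inband_noise_reduction_db(oversampling_ratio) / (20 * math.log10(2))

for ratio in (2, 4, 8, 64):
    print(f"{ratio}x: {inband_noise_reduction_db(ratio):.2f} dB, "
          f"{equivalent_extra_bits(ratio):.2f} bits")
```

Noise shaping, as described in a), improves on these figures by moving noise out of the audible band rather than just spreading it evenly.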
Hope that helps.
What a great series of videos Paul!
Don't DACs nowadays have 18 to 24 bits? Some even have 32-bit floating point (which is not a real 32 bit). It's either multibit (16/18/20/24) or bitstream (1 bit / delta sigma). I might be wrong, though
S. Kojina a 16/24/32 bit DAC means how many bits it can handle at its input. Here the issue is how the DAC eventually reconstructs the analogue signal. Almost all modern high-end DACs do this with high rate sigma-delta based 1/2/4 bits converters
aah, i didn't notice that one. thank you to let me know!
so, I use the AD1853 right now. The datasheet only states "multi-bit sigma delta modulator" without specifying how many bits it converts. The internal interpolator is 8xFs. Do you have a formula to calculate the conversion bits?
S. Kojina I guess they’re using 4bits with dithering
Apostolos Georgiadis thanks! nice to know that
No, you cannot get higher resolution by upsampling. It is impossible to add resolution to a recording that didn't have it to begin with. Think of a favorite photo scanned at low res. When you magnify enough, you'll see pixels. Yes, you might be able to "smear" the edges of pixels as to not see the lines (which would be the equivalent of Aliasing) , but it will still be blurry.
It depends what is worse to you, seeing (hearing) those edges, or blending them away, looking soft (sounding less harsh)
@@paulstubbs7678 It will sound artificial.
I loved the tangents!! Great video! Thanks for sharing!
Ok...maybe I’m missing something but this discussion was about recording. Wasn’t the question about playback? Why take a 44k signal from a CD and upsample it before the DAC?
moron
Because audiophiles don't have brains
This is a great video. Thanks. I would love to see this kind of breakdown about PCM to DSD or about exactly how DSD noise shaping works with only 1-bit samples
OGmolton1 DSD, instead of quantizing the amount of voltage in a signal at a given sample rate, quantizes whether the voltage is rising or falling at any point in time, at a given sample rate. It turns out that if you sample the signal 2.8 million or 5.6 million times per second (as opposed to 44,100 or 384,000 times per second for mqa pcm) the signal can faithfully be reproduced with just that one bit of info: up or down, 1 or 0. My understanding is that DSD requires highly precise internal clocks because of the sample rate, but the 1bit info stream is less error prone in processing.
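A toy sketch of how a 1-bit stream can carry a signal level. Note the swap: this is a first-order sigma-delta modulator, where the density of 1s follows the signal level, rather than the literal "rising or falling" encoding described above; it is a textbook simplification, and real DSD encoders use higher-order modulators.

```python
def sigma_delta_1bit(samples):
    """Turn samples in the range -1..+1 into a stream of single bits.
    The running error is fed back so the average of the bits tracks the input."""
    integrator = 0.0
    feedback = 0.0
    bits = []
    for x in samples:
        integrator += x - feedback           # accumulate the error so far
        bit = 1 if integrator >= 0 else 0    # 1-bit quantizer
        feedback = 1.0 if bit else -1.0      # what the 1-bit DAC would output
        bits.append(bit)
    return bits

# Usage: a steady level of 0.5 comes out as a stream that is 75% ones.
bits = sigma_delta_1bit([0.5] * 2000)
average = sum(2 * b - 1 for b in bits) / len(bits)
print("average of the bitstream:", average)
```

Recovering the audio from such a stream is again a low pass filtering job: averaging the bits over a short window gives back the level.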
That was a great explanation!
If I understood right, I can upsample 44.1 kHz Red Book to 88.2 kHz and choose on a DAC a filter named Super Slow Roll-Off, then I don't need Hi-Res?
Nice explanation, Paul, but please stop drawing stair steps - it perpetuates the myth that the output from DACs are stair-stepped, which is not at all the case. It’s more fair to draw the samples as discrete unconnected points.
@Dan B
Yes, it would be a much better representation for most of his discussion, if he used discrete points - because that's how the information exists in the digital domain. It's also much easier to draw.
However, where on earth did you get the idea that the analog DAC output isn't stair-stepped? What do you imagine that it's doing between the time points where it's set? It's an analog signal, and exists continuously, so it has to have a value. The only deviation from stair-step, is the buffer amp's slew rate and settling response.
This video has a good explanation: ua-cam.com/video/cIQ9IXSUzuM/v-deo.html
@Dan B
Monty did a very nice job on that discussion. However, he completely neglected to mention one important point: Nowhere did he show the actual DAC output. Instead, he only showed the output downstream of the reconstruction filter, which is built into his A/D/A converter box.
If he added an internal test point to the converter box at the actual DAC output, and connected it to his 'scope, you would have seen a stair-step - just like the one displayed by the app running on his ThinkPad.
I see what you’re saying, thanks for pointing that out, Marianne. I was only concerning myself with the signal which is sent out to the amplifier/preamplifier, which after filtering is not stair-stepped. In that context, I consider the filter to be part of the DAC.
I think in this context, considering the filter at the end of the DAC would totally obfuscate the lesson that's being imparted here. getting those "stair steps" as small as possible can present the filter with a more accurate representation of the original sine wave even before smoothing takes place.
You are the best teacher....I hope ,in your next life, you will become a professor......
The section about oversampling begins at 12:20 . Impressive how the noise floor perceptibly decreases on an 8-bit signal as the sample rate is doubled. One can create a test tone of a given low level in a set of 24/48/96/192 kHz files: it is barely discernible above the dither in the 24 kHz version, but can be heard quite clearly in the 192 kHz case, without any noise shaping. It makes an inefficient storage format though, as adding a bit is cheaper than quadrupling the sampling rate.
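The effect described above can be reproduced in a small simulation. This is a sketch under stated assumptions: an 8-bit quantizer with TPDF dither, a slow sine as the test signal, and a plain block average standing in for the low pass filter that keeps only the original band. None of these choices come from the video.

```python
import math
import random

def inband_noise_power(oversample, bits=8, base_count=20_000, seed=1):
    """Quantize a dithered slow sine at `oversample` times the base rate,
    then average each block of `oversample` errors (a crude low-pass) and
    return the remaining noise power."""
    rng = random.Random(seed)
    step = 2.0 / (2 ** bits)                 # size of one quantization step
    errors = []
    for n in range(base_count * oversample):
        x = 0.5 * math.sin(2 * math.pi * n / (1000 * oversample))
        dither = (rng.random() - rng.random()) * step   # TPDF, +/- one step
        quantized = round((x + dither) / step) * step
        errors.append(quantized - x)
    blocks = [sum(errors[i:i + oversample]) / oversample
              for i in range(0, len(errors), oversample)]
    return sum(e * e for e in blocks) / len(blocks)

base = inband_noise_power(1)
for ratio in (1, 2, 4):
    power = inband_noise_power(ratio)
    print(f"{ratio}x oversampling: noise power ratio {base / power:.2f}")
```

Each doubling of the rate roughly halves the in-band noise power (about 3 dB), so 4x oversampling buys about 6 dB, or one bit, which is why storing one more bit is cheaper than storing four times as many samples.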
The motion of the signal can even be "seen", given long enough time slice, as the noise moves up and down roughly following the shape of the wave.
I recall there were 14-bit DACs that made use of the information that already existed in the stored 16-bit stream, without creating it, and achieved near 16 bit signal to noise ratio.
Yes, because when red book came out, 14/44.1 was all that existing hardware could do, and a few companies had promised 16bit DACs soon. These companies were impatient for profits and didn't want to wait until 20bit/56khz DACs. Nowadays, average DACs can do it fine at 48khz, though even 44.1khz sounds better than it used to.
great thank you, if I got that right for a change, it is smoothing the analogue output in the digital domain, so to speak; by removing the audible impact of the filter, that "digital glare" is at least reduced
Adding filter, artificial lifeless and cold sounding. This upsampling doesn't give you the true sound.
M Scaler by Chord definitely makes the sound better. I understand that it's an upsampler? So it does increase resolution. Im not experienced in these gadgets, but for sure it improves the already excellent Dave and TT2.
That was an excellent explanation, thank you!!
Does all of this mean that the major sonic benefit of upsampling is on the higher frequency part of the sound spectrum, which is presumably most impacted by the low pass digital filter?
Wow great job I really understand your explanation,learned a lot today. That is cool. Thanks
I am so glad I quit all that shit, I went back to the ladder DAC from Metrum and I don’t feel I am missing out on detail or dynamics☺️
My life is richer after hearing your. Thanks🙏
I watched a video from Hans Beekhuyzen Channel and he said the amplitude is not really encoded and decoded in ‘steps’ form, but a proper sine wave. Confused now lol..
I watched that video. If I remember properly, his contention was that the electronics cannot change voltages instantly, and the momentum has the effect of smoothing the sine wave.
@n4jg6
Actually, it's not a proper sine wave, either. It's two sine waves. If F(in) is the input signal sine wave frequency, then the digital representation will include both F(in) and F(sample) - F(in). For example, recording an 18KHz sine wave using 44.1KHz sampling, produces samples which are the superposition of 18KHz and 26.1KHz sine waves.
During playback, a reconstruction filter is required to remove the unwanted 26.1KHz component. If up-sampling is used for the playback chain, much of this filtering can be done digitally, upstream of the DAC.
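The 18 kHz / 26.1 kHz example above is easy to check numerically. A small sketch, using cosine waves so the two tones line up without a sign flip: sampled at 44.1 kHz, both tones produce the same list of numbers, which is why the filter has to decide which one comes out.

```python
import math

FS = 44_100  # sample rate in Hz

def sampled_cos(freq_hz, count=32):
    """Sample a cosine of the given frequency at FS."""
    return [math.cos(2 * math.pi * freq_hz * n / FS) for n in range(count)]

tone = sampled_cos(18_000)
image = sampled_cos(FS - 18_000)   # 26,100 Hz

largest_difference = max(abs(a - b) for a, b in zip(tone, image))
print("largest difference between the two sample lists:", largest_difference)
```

The difference is at the level of floating-point rounding, so the stored samples alone cannot tell the two frequencies apart.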
The whole stair-step thing is a result of a DA converter doing something called “sample and hold”. It only exists as the output of the DA converter before filtering. The samples on a CD are indeed snapshots (really like that he used that term) and they are enough to capture and replicate all the audible frequencies. Why? Because we can only hear pure sine waves at 20 kHz (well, a young person can). A 20 kHz signal that is anything else but a sine wave (like a stair-stepped signal) will have higher frequency components. Filter them out and you are back to the sine wave.
The amplitude *is* really encoded in steps. But the sharp edges and straight lines of those steps require very high frequencies to produce, so the first stage of the DAC effectively produces very high frequency noise. The second part of the DAC is a low pass filter (which is the filter Paul talks about at the end) that removes all of the quantization noise. Without those high frequencies, the resulting signal must be returned back into the original smooth curve. So, by the output, the waveform can once again be a proper sine wave.
Sort of explains why vinyl is still around. Smooth analog sine waves but mechanical limitations in theoretical frequency response limits and dynamic range. However the cost of a really good turntable, cartridge and preamp can be out of reach for most people vs. a decent CD player.
George Bedorf Hey, a lot of turntables are over engineered. I bought an ATLP120 on sale for 230 dollars, because I like direct drive. In spite of what they tell you, spinning a platter at 33 1/3 RPM is not that hard; the major difference between TTs is the cartridge
Exactly. This is why vinyl records sounds so much better.
I’m waiting for dedicated 4K audio
Thanks. Basic explanations help beginners like me.
Learnt a lot there Paul. Thanks 😊🙏
Very difficult subject. Lots to get across even in basics. But I think you missed the context of the question. I think the questioner was asking about playback, not record. An understanding of the record process and its problems would have a correlation to the complementary problems on the playback side, but the needs and process are reversed.
We would start with a CD that already has the audio stored digitally on it at 44.1K/16 bit. If the CD player did a 44.1k sample, it would get one sample of the data for that word every 1/44,100 of a second. Output filtering would have to be very sharp to remove anything above 20K. If you double the clock, double the number of pulses from the laser to the CD, you can sample that physical data twice. Thus if there were errors in the first sample, the error rate could be reduced with the second sample. 4 times oversampling, 4 chances to get the sample correct. Or at least average the 4 samples to produce one output voltage more accurately.
Now, with a 4 times sampling rate, if your DAC (digital to analog converter) in the CD player will reproduce a 176.4K data rate/frequency, then the OUTPUT filtering does not have to be as dramatic for the eventual 22K analog output.
Technology advanced allowing lower cost circuitry with higher sampling rates and bit depth. But the delivery method that is CD was locked in. So they designed around the delivery method.
You get more information, but it is interpolated, hence calculated extra information between the existing data points. You might get information that was not there. It might not be correct, but they are predicted to be there. So you get more information to work with after an upsampling. It might be easier to run this signal through a DSP after, because the data is less jagged.
ANY serious DSP does that already
Thanks, I finally understand it...
A fun watch. The good news is with the right playback solution Redbook CD's or equiv. digital 44.1/16 files can sound really good. IMHO, it's all about the DAC and how it recreates the analogue signal.
A good way to explain the filter problem is to consider a square ledge in a tank of water - a wave will reflect off the sharp edge. In music you get the artifacts as ring harmonics. So if you have a soft filter the wave does not reflect and you don't get the wave "bouncing" off the band pass filter. Maybe this is too simplistic, but I could describe this with Fourier analysis, though that would be too far the other way.
As always - I appreciate your knowledge of music and audio.
Just watched the video - I like the tank of water analogy - I presume this is referring to the wave-fronts leaving the speakers (chopped at 22khz) acting in this way. A CD (44.1khz) can capture a signal up to a frequency of 22khz, so if you sample at say 176khz you can see a signal at 88khz. I assume by interpolating the signal between the existing signal points we can sample at 176khz and construct a reading at 88khz. Which allows the filter to be softened.
So we are generating a wave at 88khz not 44.1khz and the chop off distortions are at 88khz not 22khz so they have much less effect.
I'm sure it's a bit different but it seems very similar to upscaling video. Some devices do it more elegantly than others. Because of the 'prediction' it gives the illusion of more data.
You get an A for effort Paul!
Good intention and overall a solid channel, but I think the point of the Nyquist theorem is that sampling at 2x the highest frequency does in fact produce the equivalent output (not stair-stepped). The issue is more around the accuracy of the electronics and the impact of filters, which he does cover, and that is accurate and helpful in understanding oversampling on playback.
Not quite. Specifically, for the Nyquist-Shannon theorem to work, the channel must be bandwidth limited. This is where the low pass filter comes in. Creating the stair step signal injects very high frequencies on the signal in order to make the straight lines and sharp corners of the stair steps. It is only when all of those high frequencies are removed that the smoothing takes place. Without the high frequencies to generate the sharp edges of the quantization noise, the signal is smoothed back into its original rounded shape.
What the Nyquist-Shannon theorem really does is determine the cutoff frequency between the frequencies needed to accurately reproduce the signal and the frequencies that only generate quantization noise.
@@timharig that is why I called out the accuracy of the filters. There is no stair step in the final output, that is the whole point of the Nyquist theorem - if you sample at twice the rate of the top frequency of a band limited signal, the output IS equivalent to the input, i.e. it is not stair stepped.
@@steveg219
I'm seeing a lot of confusion in the comments from people who are failing to understand the purpose of the filter. Paul didn't properly explain the need for the lowpass filter.
It is important for this explanation for them to understand that Nyquist-Shannon only produces an "equivalent output" AFTER the quantized "stair step" signal from the DAC resistor ladder (or PWM or whatever) has gone through the lowpass filter to return it to a smooth signal. Until it goes through that lowpass filter it is still a "stair step" PCM signal or a pulse-width-modulated square wave with sharp corners.
Paul's entire point is about what happens BEFORE the "final" output.
If they assume that the signal already came out smoothed then they see no benefit of oversampling because they are only seeing the finished signal after the lowpass filtering has already been accomplished. Without understanding that the quantization noise ever existed in the first place, how can they understand the benefits of moving that noise up the spectrum by increasing the sampling frequency? They assume that any filtering is done on the signal itself rather than on the artifacts created by the DAC modulation.
Finally, the Nyquist-Shannon theorem is not simply about sampling at double frequency. It is much more general than that. It determines the lowest frequency of the quantization noise. Sampling at double frequency is simply the special case at which the sampling frequency can no longer be reduced before the quantization noise occurs in the frequencies of the signal itself.
Paul's whole point is that by increasing the sampling rate above the signal rate, you move the lowest noise frequency up the spectrum. This creates a gap between the highest signal frequency and the lowest frequency of the quantization noise. If you increase the sample rate enough then you can create a gap large enough for the lowpass filter to have a long rolloff before it has to attenuate the high frequency quantization noise.
You cannot fully understand that if you only think Nyquist-Shannon is as simple as doubling the sampling rate. The fact that the sampling frequency must be double the signal frequency is a consequence of the theorem. Not the theorem itself.
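To put rough numbers on the gap described above, here is a minimal Python sketch. It assumes the music tops out at 20 kHz and that the lowest unwanted component is the mirror image of that top frequency, sitting at the sample rate minus 20 kHz. The function name and the three sample rates are only illustrative choices, not anything from the video.

```python
import math

def filter_room(sample_rate_hz, top_signal_hz=20_000):
    """Return (first_image_hz, gap_hz, gap_octaves) for a given sample rate.

    Assumption: the lowest unwanted component is the mirror image of the
    highest signal frequency, located at sample_rate - top_signal.
    """
    first_image_hz = sample_rate_hz - top_signal_hz
    gap_hz = first_image_hz - top_signal_hz
    gap_octaves = math.log2(first_image_hz / top_signal_hz)
    return first_image_hz, gap_hz, gap_octaves

# Compare plain CD rate with 2x and 4x oversampling.
for rate in (44_100, 88_200, 176_400):
    image, gap, octaves = filter_room(rate)
    print(f"{rate} Hz: first image at {image} Hz, gap {gap} Hz ({octaves:.2f} octaves)")
```

At 44.1 kHz the filter has roughly a quarter of an octave to do all of its work; at 176.4 kHz it has close to three octaves, which is why a much gentler slope becomes possible.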
@@timharig good additional information, I think we agree here that his point was good but was lacking some relevant info. Thanks for the additional clarification. Btw, I do all my recordings at 96k!
Holy s.. this is amazing. Thank you so much for all the videos Paul.
I think the math around 9:50 is wrong? With 4 bits you would have 16 steps, not 4
Rick Yarussi Yes, and I also cringed when he thought 16 bits would be around a million levels instead of 65536, but his preface meant that it was not for us. Also, when he counted the levels per number of bits he went 2, 4, 6, 8 instead of 2, 4, 8, 16. I am glad he was trying to make a simple explanation.
I once owned an Arcam 73 CD player, a mid-priced award winner.
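For anyone who wants to check the counting, the number of levels is just 2 raised to the number of bits. A tiny Python sketch (the function name is made up for illustration):

```python
def levels(bits):
    """Number of distinct values a word of `bits` bits can represent."""
    return 2 ** bits

for bits in (1, 2, 3, 4, 8, 16, 24):
    print(f"{bits:>2} bits -> {levels(bits):,} levels")
```

So 4 bits give 16 levels and 16 bits give 65,536, as pointed out above.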
I spent a fair bit of money to get it upgraded to the next model up which had a 192 upsampling dac. My nice sounding CD player was transformed into a noisy mess.
An Arcam CD192 once made me bolt out of a demo room. Much preferred the less resolved CD 73.
Thank you !
Really good one Paul.
YEAH! I really enjoyed that video too!
😂 Really?
Was all of that explained in one take? Cause that is impressive.
I learned so much thanks 👍🏾
Great explanation. Worth the tangent :-)
A smooth curve has a lot of information, the smoother you make the curve through interpolation the closer it approaches the analogue in terms of presentation, not information. I would use this analogy, it is an improvement, in that it becomes more bio available to the analogue brain system.
There’s a lot more to tell about this subject in general, but suffice it to say the 44.1kHz is all you need to capture every bit of information in the audible frequency spectrum. Nyquist/Shannon tells us that a band-limited signal only has to be sampled at twice the upper frequency of the band. The stair-steps are not present in the analog output of a digital source because these stair-steps are made of frequencies beyond the band limit, and are filtered out. The only thing I kind of missed in this video is the fact that 44.1 is enough, as the question implied it wasn’t because “you only have so many samples”.
Excellent sir so well said
Aaaaargh that hurt! :) Let's speak again about cables please, it's so comforting :D
Unnecessary cables you mean?
Are you talking about/advocating upsampling at the source (e.g. in software) or in the DAC? Most DACs do that already.
Just watched the video - I like the tank of water analogy in the comments below - it's like a square ledge in a tank of water - a wave will reflect off the sharp edge - creating ring harmonics. I presume this is referring to the wave-fronts leaving the speakers (chopped at 22 kHz) acting in this way. A CD (44.1 kHz) can capture a signal up to a frequency of 22 kHz, so if you sample at, say, 176 kHz you can capture a signal up to 88 kHz. I assume that by interpolating between the existing signal points we can sample at 176 kHz and construct a reading at 88 kHz, which allows the filter to be softened.
So we are generating a wave at 88 kHz, not 44.1 kHz, and the chop-off distortions are at 88 kHz, not 22 kHz, so they have much less effect.
Never thought about it, but does 44khz mean a full wavelength of a 22k sine wave is approximated by two samples? Sounds pretty drastic. Makes me wonder why it sounds anything like the original wave
You are correct.
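On why so few samples per cycle still give back the original wave: the sampling theorem says a band-limited signal can be rebuilt by adding up one sinc pulse per sample. Below is a rough Python sketch of that idea, assuming a 10 kHz sine sampled at 44.1 kHz (about four samples per cycle) and a finite run of 4001 samples, so the result is only approximate. The function names are illustrative.

```python
import math

def sinc(x):
    """Normalized sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def reconstruct(samples, sample_rate_hz, t_seconds):
    """Estimate the signal at time t by summing one sinc pulse per sample."""
    position = t_seconds * sample_rate_hz
    return sum(s * sinc(position - n) for n, s in enumerate(samples))

fs = 44_100
tone_hz = 10_000
samples = [math.sin(2 * math.pi * tone_hz * n / fs) for n in range(4001)]

# Pick a moment exactly halfway between two samples, in the middle of the run.
t_mid = 2000.5 / fs
estimate = reconstruct(samples, fs, t_mid)
truth = math.sin(2 * math.pi * tone_hz * t_mid)
error = abs(estimate - truth)
print(f"estimate {estimate:.5f}, true value {truth:.5f}, error {error:.5f}")
```

The value between the samples is recovered closely even though nothing was recorded there. The small remaining error comes from using a finite number of samples rather than from the sample rate.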
You could also compare bits as pixels in a camera ?
And he could cut out 12 minutes of the video
Ladies and gentlemen, I present to you Mr. Paul McGowan, the great mediator between audio engineers and regular novice audiophiles. 👍..
thanks Paul
I actually understood that. Whoa.
You are badass! My highest compliment.
How many DACs will actually alter the brick wall slope according to the incoming bit rate? My DAC offers 3 brick wall slope variations and the only effect seems to be a small drop of sibilance or brightness in the upper midrange. Over sampling does smooth the sound but I figured this was from the occasional rogue bit getting through, not a lower order brick wall filter automatically appearing. Digital audio has been around long enough that it shouldn’t cost over 5 or 10 thousand to make it better than just decent! Just decent isn’t as good as a vinyl record. It may be more convenient but not better! To be honest my turntable and cartridge is probably worth over a thousand bucks, but digital is supposed to be way better! I’d really appreciate answers or tips. Why does digital cost as much or more to sound as good as vinyl?
Thank you Paul.
This is very good. Thank you.
Hi Paul! I always wondered what is going on with the lower part of the frequency response of an audio CD. Is the lower limit 1 Hz, or is there some filter at the low end too, say between 1 and 20 Hz?
Péter Sági No. The Nyquist theorem puts only an upper bound. EVERYTHING below the Nyquist frequency (half the sample rate) can be reproduced correctly, i.e. 1 Hz as well. There the limit is actually the physical capability of your reproduction system (speaker, headphone).
Thanks for your response Andras, but it's not 100% correct. According to the Nyquist theorem you could put frequencies up to 22 kHz on a CD. The reason it only goes up to 20 kHz is that there's a 2 kHz region to put gradual rolloff filters into - as Paul describes in the video. My question was whether there is a need for such a filter in the low end region.
I don't think you need a high pass filter. It will work fine at low frequencies since the sample rate only limits the highest frequency possible.
@@petersagi275
No. DACs do not need high pass filters.
I think that you are confusing the purpose of the low pass filter in this instance. This might be because of the simplified explanation or because you are associating with the purposes used for low pass filters in other places -- such as in a speaker crossover. In other cases, you are trying to remove unwanted parts of a signal. In this case, you are effectively doing something more akin to a mathematical function. You are removing something that was produced by the DAC and never part of the signal itself.
If you look at the stair step wave at 12:00, you will notice that the stair step structure produces sharp angles and straight lines. These sharp angles and straight lines mathematically require very high frequencies to produce, so the part of the DAC that produces the stepped voltages is effectively inserting a bunch of very high frequencies onto the signal that were never actually present in the digital stream. This can be referred to as quantization noise.
The low pass filter acts as a mathematical operation to convert that stair step pattern with all of the sampling noise back into the original smooth waveform that was recorded. It does this by removing all of those frequencies that were injected by the DAC. Without those frequencies, it is impossible to recreate those lines and sharp angles and so the waveform is smoothed back into its original rounded form.
So that is why you absolutely need a low pass filter out of the DAC. Without it, you would end up with the quantization noise in the output. But you would not want to pass the output through a high pass filter. The quantization noise is all above the Nyquist-Shannon sampling frequency. If you were to use a high pass filter, you would be removing parts of the original signal that were part of the recording.
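A small Python sketch of where those injected frequencies sit, assuming an ideal stair-step (zero-order hold) output stage. Under that assumption a tone at f shows up again as "images" at multiples of the sample rate plus and minus f, each scaled by the hold's sinc-shaped response. These are more commonly called images than quantization noise, but they are the stair-step artifacts described above. The function name and the 1 kHz test tone are illustrative.

```python
import math

def stair_step_images(tone_hz, sample_rate_hz, count=4):
    """List (frequency_hz, hold_gain) for the first few images of a tone.

    Assumption: an ideal zero-order hold, whose gain at frequency f is
    |sin(pi f / fs) / (pi f / fs)|.
    """
    def hold_gain(f):
        x = f / sample_rate_hz
        return 1.0 if x == 0 else abs(math.sin(math.pi * x) / (math.pi * x))

    images = []
    k = 1
    while len(images) < count:
        for f in (k * sample_rate_hz - tone_hz, k * sample_rate_hz + tone_hz):
            images.append((f, hold_gain(f)))
        k += 1
    return images[:count]

for freq, gain in stair_step_images(1_000, 44_100):
    print(f"image at {freq} Hz, hold gain {gain:.4f}")
```

Every image lands above half the sample rate, so a low pass filter can remove them without touching the recorded signal, while a high pass filter would do the opposite.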
Paul, are you talking here about increasing the bit rate of the ADC i.e the recording apparatus or the playback apparatus?
Thanks, makes sense
As a recording engineer, the best way is to mix a 44.1k or 48k multichannel mix upsampled to 96k or 192k stereo.
Daniel Denholm yeah I agree. I've found 44 and 48 kHz upsampled to 176 on my DAC sounds more crisp and detailed, but again it's all volume relative. Turn up the volume and EQ the mix and you will get more separation. The key is to achieve that at low volume..
When you mix, you add information so this is common sense, but the point is, the original recording will not be better.
What you should do in the first place is set your DAW environment to 32 bits at 96 kHz, or 192 kHz.
Why are you going from something worse to artificial?
In short: every digital signal is a snapshot where something has to be prophetically predicted on the assumption that what happens next is coherent with what happened before. More samples make it possible to smooth things out even if there is not any actual data.... why does this look a lot like what an mp3 codec does....
Thanks for this. I would argue that the interpolation does *create* more information but this is neither here nor there. I now understand why upsampling could be good. Thanks for this.
It's artificial information, not real information.
@@380stroker artificial cold (read wrong) information. Correct!
yeah it's somewhat difficult to remember that number, especially if you don't use it every day.
I have a cary cdp1cd player that has a choice of upsampling rates you can choose. Is there an ideal upsampling rate?
My advice. Do not use any. It just creates an artificial sound.
Finally I think I got it.. Thank you
And then there's delta sigma and multi bit dacs to complicate this even more. Maybe you can do a video on that?
You are a great teacher.. Thank you..
There are so many factors in getting good audio quality. It’s not only bit depth and sample rate, it’s also the amount of kHz, the amount of kbps and the signal-to-noise ratio. It drives me insane: one 16-bit audio sample is not the same as another 16-bit audio sample. For instance, a 16-bit audio sample at 11 kHz at 64 kbps doesn’t sound better than an 8-bit sample at 22 kHz at 128 kbps. As a result we consumers can easily get fooled and confused by those audio equipment companies, aaarrrggg.
Guess you lose a lot of phase shift with a lower order filter. I listened to your talk about regulating power supplies. Although the output stage doesn’t use voltage regulation it can provide a huge buffer against output power rippling the supply voltage. I believe using a voltage regulated power supply could trigger harmonics within the regulation, introducing some low level noise. A lot of film bypassed capacitance and a high current transformer may allow the output voltage to wander a bit with ac variations but should create pretty smooth and stable road for the signal to ride on, not introducing any power supply artifacts. A cathode follower tubed power supply is awesome but the price does add up.
Fascinating!
Nice video as always Paul!
You have two kinds of people, those who can count binary and those who don’t :-)
Maybe it’s more intuitive to show it as 1 and 0 instead of the stair steps
Exactly, Paul doesn't know how to properly set up the bits on that board, the most significant bits to the least significant bits. Also, he doesn't understand nibble, byte and word terms -> 4-bits, 8-bits and 16-bits.
This is one of the best from Paul! I have always wondered whether upsampling has the same effect as converting from FLAC to WAV and then to analogue. Does oversampling draw more power than usual sampling and in that way compromise the power supply, like what happens with FLAC? Because interpolation needs some resources, or have I got that wrong?
I've never heard down sampling decimation make something better unless it's with content with a bunch of ultrasonic garbage you're trying to get rid of, but I have heard good oversampling DACs and especially interpolative upsampling, both from Burmester Audiosysteme and on the Denon DJ DN-X1700 inputting 44.1 kHz and letting the mixer run at 96 kHz. The 44.1 kHz straight mode was closer to what other DACs produced with 44.1 kHz, but I found the interpolation to be more pleasing and seductive sounding. I've never heard any of the plug-ins for Winamp or Foobar for upsampling that did that type of pleasing interpolation, though. Maybe someone knows a good one? I realize you have to use a USB DAC that's capable of exploiting it. I also am a big fan of HDCD, though that's apparently a little different.
Was attacked on a facebook audio group today for saying upsampling is wonderful and I hear it.
Indeed, you cannot get more info out of 16 bits, but you can use linear prediction to add more samples between existing samples, based on the next and previous samples, to get smoother audio. 100 Hz smooth motion works similarly: if in the first image the hand was lowered while in the next frame the hand is raised all the way up, then the system can presume that the hand was in the middle in between, and so on, resulting in smoother motion. In fact you could multiply new samples and pictures as much as you want by creating new predictions based on previous predictions, and so on.
So even though the information wasn’t there, by using prediction we can assume what’s supposed to be there, albeit not 100%, but still, it’s incredible what you can do nowadays with such advanced AI technology👍
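In its simplest form, the "guess the in-between value from the previous and next sample" idea is linear interpolation. A minimal Python sketch (the function name is made up, and real upsamplers generally use a proper low pass filter rather than straight lines):

```python
def upsample_2x_linear(samples):
    """Double the sample count by inserting the midpoint between neighbours."""
    out = []
    for current, following in zip(samples, samples[1:]):
        out.append(current)
        out.append((current + following) / 2)  # the guessed in-between value
    if samples:
        out.append(samples[-1])  # the last sample has no neighbour to its right
    return out

print(upsample_2x_linear([0, 2, 4]))
```

The new points are computed purely from the existing ones, so the output is smoother to look at but carries nothing that was not already implied by the original samples.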
Prediction is not the way to go to reproduce the truth in sound. This will always sound artificial. This is why I stay away from upsampling.
Without getting too complicated and explaining the fast Fourier transform here, let me state it as simply as I know how. If you apply a digital FIR (finite impulse response) filter while interpolating data between the samples, you are effectively controlling the main-lobe shape and the sidelobe (stopband) shape of the frequency response. That is, you are controlling the shape of the main passband response: how rounded it is, how quickly it falls off after 22,050 Hz, and how low the sidelobes or stopband are at slightly higher frequencies, say between 23,000 and 44,000 Hz.
If you assume that there are no signals above 22,050 Hz that you are interested in, because the human ear cannot hear them, then you have lost no information and you have not gained any information by upsampling. What you have really done is control some of the unintended distortion and aliasing effects that can happen if you are not filtering perfectly (in your output or reconstruction filter after the D-to-A stage) right after the 22.05 kHz limit (44.1 kHz sample rate). Or you have made it easier on yourself to design the output filter if the sample rate is doubled, so that you do not get unintended, noticeable noise between 22,051 Hz and 44,100 Hz (in the case of an 88.2 kHz sample rate). Or you may find it easier to design a linear-phase filter when the sample rate has been moved higher.
To boil this down as simply as possible: digital FIR filtering and creating interpolated samples can only give you a more faithful representation of the original smooth or jaggy time-varying signal. It cannot create any more information than the band-limited signal (say 20 Hz up to 22,050 Hz) you started with (there is always an inherent band limit, or an actual filter, before the sample-and-hold A-to-D stage).
If people want to get into debates about how you CAN hear the harmonics of percussion and brass and other instruments higher than 20,000 Hz in the human ear that is totally another discussion, folks.
yeah, I know: I am known to be verbose. I get passionate about engineering, DSP, audio and similar principles. I just hate it when people claim that there are higher frequencies that the human ear is supposed to “hear”, which is simply just not true :)
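For the curious, here is roughly what FIR interpolation looks like in code: put a zero between every pair of samples, then run a low pass FIR filter over the result. This is only a sketch under my own assumptions (a 63-tap Hann-windowed sinc filter with its cutoff at the old Nyquist frequency, and a 1 kHz test tone); the function names and tap count are illustrative and not taken from any particular DAC.

```python
import math

def lowpass_taps(num_taps, cutoff):
    """Hann-windowed sinc low pass. `cutoff` is a fraction of the sample rate (0 to 0.5)."""
    mid = (num_taps - 1) / 2
    taps = []
    for k in range(num_taps):
        x = k - mid
        ideal = 2 * cutoff if x == 0 else math.sin(2 * math.pi * cutoff * x) / (math.pi * x)
        window = 0.5 - 0.5 * math.cos(2 * math.pi * k / (num_taps - 1))
        taps.append(ideal * window)
    return taps

def upsample_2x(samples, num_taps=63):
    """Zero-stuff by 2, then low pass at the original Nyquist frequency."""
    stuffed = []
    for s in samples:
        stuffed.extend([s, 0.0])
    taps = lowpass_taps(num_taps, 0.25)  # 0.25 of the new rate = old Nyquist
    scale = 2 / sum(taps)                # gain of 2 makes up for the inserted zeros
    taps = [t * scale for t in taps]
    delay = (num_taps - 1) // 2          # centre the filter so the output is not shifted
    out = []
    for i in range(len(stuffed)):
        acc = 0.0
        for k, tap in enumerate(taps):
            j = i + delay - k
            if 0 <= j < len(stuffed):
                acc += tap * stuffed[j]
        out.append(acc)
    return out

fs = 44_100
tone = [math.sin(2 * math.pi * 1_000 * n / fs) for n in range(400)]
doubled = upsample_2x(tone)
worst = max(abs(doubled[i] - math.sin(2 * math.pi * 1_000 * i / (2 * fs)))
            for i in range(100, 700))
print(f"{len(doubled)} samples, worst deviation from the true sine: {worst:.6f}")
```

Away from the ends of the clip, the interpolated samples land very close to the true 1 kHz sine at the doubled rate. The filter shape (tap count, window, cutoff) is exactly the passband and stopband trade-off described above.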
Upsampling doesn't increase the quantity of information, but it reduces the error.
I still don't really get what's wrong with a steep roll-off filter near the 20kHz.
it affects the quality of the audible portion
@@chucknorrismeta3171 Yeah, I recently found out that steep filtering causes phase issues.
And frequencies above 20kHz can affect audible frequencies because of intermodulation.
I hate upsampling. There are so many algorithms and almost all of them make the sound unpleasant, maybe cleaner, but there is always something wrong. Worst of all is dual upsampling, like 44.1 to 96 kHz on a PC with an unknown algorithm and then 96 kHz to 192 kHz in the DAC... I think this is the reason so many people hate the computer as a device to play music... Sometimes there is also digital volume control on the PC, which can degrade the sound even more... I also have a NOS DAC; it sounds nice, but there is much less detail in the highs...
New 96 kHz recordings sound best to me on modern DACs (like ESS DACs) without upsampling, but 99% of my music is old... so what to do?
Paul, a 4 bit word is a nibble, an 8 bit word is a byte.
and neither a 4 bit word nor an 8 bit word is a word
you got the other things right, though
I’m afraid all the values between two measured ones are effectively created by the output low pass filter. We never hear the steps. We hear harmonics of sine waves. What low bit depth does is introduce noise and reduce the dynamic range. A really bad thing is that this noise is periodic and sounds like ringing. But if we simply increase the bit depth, we will not increase the dynamic range, and so will not add more information. We add zeroes during upsampling (yep). We can do nothing with the already measured values because we suppose they are valid, since they came from the original file. But this is not true: they contain a quantization error, and so the noise remains. To deal with it we have to push the noise out of the audible range with a noise shaping technique in conjunction with upscaling and upsampling. We can even downscale (but not downsample) the values, get back to the original bit depth, and get a bigger dynamic range anyway. Just “simple” math. Not electronics.
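The noise shaping mentioned above can be sketched in a few lines of Python. This is the simplest (first-order error feedback) version and only an illustration: each rounding error is subtracted from the next sample before it is rounded, so the error averages out over time instead of piling up. The function name, the step size of 1 and the constant 0.3 test input are my own assumptions.

```python
def requantize(samples, step):
    """Round to multiples of `step`, feeding each rounding error into the next sample."""
    out = []
    previous_error = 0.0
    for x in samples:
        wanted = x - previous_error          # compensate for the last rounding error
        quantized = round(wanted / step) * step
        previous_error = quantized - wanted  # error made on this sample
        out.append(quantized)
    return out

# A constant level of 0.3 squeezed into a format that can only hold whole numbers.
coarse = requantize([0.3] * 1000, step=1.0)
average = sum(coarse) / len(coarse)
print(f"first values: {coarse[:10]}, long-run average: {average:.4f}")
```

The output contains only 0s and 1s, yet its long-run average stays very close to 0.3. The error has been moved into the rapid flipping between values, which is high frequency content that a low pass filter (helped by a high sample rate) can then remove.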
Lessons on binary digital are either good or bad. There is no in between.
What the hell is he talking about after 14:00? I am amazed that people understood that.