Open Letter To Curtis Judd on 32-Bit-Float Audio

Maxotics

Published Oct 3, 2024
Originally, I figured I'd send this to him as a Google video, but I figured others might be interested. It's not like it's easy to make videos for this channel.
1. All waveforms should first be shown without truncating any data; zoom in afterward.
2. Fundamentally, all microphone data is a set of simple numbers representing voltages at points in time. There's no noise floor, 0db, clipping, frequencies in the data. Those are subjectively INFERRED later. If one has never worked in electronics, or with data, it's easy to infer information in audio software that simply is NOT in the data. I too had this view of audio data, long ago!
3. In short, I look at audio data as the voltages coming out of the microphone (then pre-amp) and the numbers that represent them. THERE IS NOTHING ELSE. If the pre-amp is set correctly and one has at least 16 bits of data, I have found no evidence that the voltages (numbers) exceed the range of numbers set aside to represent them.
When I took the Tascam Portacapture's 32-bit-float files and converted them to 24-bit fixed, they matched PERFECTLY, bit for bit. There is no information in those 32-bit float files that cannot be represented in 24-bit fixed. INDEED, behind the scenes all microphone 32-bit float data is CREATED FIRST from 24-bit internally.
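A minimal sketch of why a float file built from 24-bit integer samples can convert back bit for bit: single-precision float carries a 24-bit significand, so every 24-bit integer scaled by a power of two is stored exactly. The -1.0..1.0 scaling convention and the helper names are assumptions for illustration; this says nothing about what the Portacapture actually does internally.

```python
import struct

FULL_SCALE = 2 ** 23  # 24-bit signed samples span -2**23 .. 2**23 - 1


def to_float32(value: float) -> float:
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


def int24_to_float32(sample: int) -> float:
    """Scale a 24-bit integer sample to the -1.0..1.0 float convention (assumed)."""
    return to_float32(sample / FULL_SCALE)


def float32_to_int24(value: float) -> int:
    """Scale a float sample back to a 24-bit integer."""
    return round(value * FULL_SCALE)


def round_trips_exactly(sample: int) -> bool:
    """True when int -> float32 -> int returns the original sample."""
    return float32_to_int24(int24_to_float32(sample)) == sample


# Spot-check the extremes, values around zero, and a stride through the range.
samples = [-FULL_SCALE, -1, 0, 1, 5699, FULL_SCALE - 1]
samples += list(range(-FULL_SCALE, FULL_SCALE, 65_537))
all_exact = all(round_trips_exactly(s) for s in samples)
print(all_exact)
```

The reverse is not guaranteed in general: a float sample above 1.0, or one with a very small exponent, has no exact 24-bit fixed counterpart at this scaling.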

COMMENTS • 29

@SpudUna 12 days ago ⁺¹
The 32-bit float recording has made a massive difference to me. I'm slightly deaf and have an audio processing disorder.
@MaxoticsTV 12 days ago
I don't doubt something you're doing improves audio for you. But sorry, it has nothing to do with 32-bit float. Or you'd have to explain it to me scientifically. Thanks!
@RandumbTech 19 days ago ⁺²
Curtis Judd is probably the last guy I’d pick a fight with 😂. Personally, I couldn't care less about the math. All I want to know is: does 32-bit float work? The answer is a resounding YES! As a content creator I don’t have time to properly set my gain, let alone adjust it in a dynamic situation. 32-bit float just works - and that’s what most people are concerned with.
@MaxoticsTV 19 days ago ⁺¹
I haven't picked a fight with Curtis. I never get comments like this. "I don't understand X...but I know X does this." 32-bit float does not improve anything in the audio acquisition process (mic voltages to preamp then digitization). If it does, please explain how. I explain why it CANNOT work in my essay. Tear it apart! But you have to bring evidence. We are all learning. Curtis, me, you! We all make mistakes. IMO, some manufacturers broke the trust Curtis had in them. We'll see!
@mch2359 18 days ago ⁺¹
Measuring an analog signal with twice the hardware results in more resolution. The transducer will fail to provide an analog signal before a lack of resolution limits our measurement. "Floating point" is just a way of noting the measurement.
@MaxoticsTV 17 days ago
I don't understand what you're saying, sorry. Maybe you can go into greater detail. Floating point is a computational/mathematical way of creating more scale. It does not create more resolution. The resolution remains at 24 bits.
@mch2359 17 days ago
@MaxoticsTV Floating point is merely a way to express large numbers without having to write the whole number out. This method is good enough because the lower positions are not significant. The resolution of a measurement depends upon the resolution of the scale used to make the measurement. It is obvious you can't get a higher resolution by multiplying two measurements together. The witchcraft in "32 bit float" is combining two measuring devices into one.
@appads 22 days ago ⁺²
To my (non-expert) knowledge, 32-bit float works in a similar way to dual gain output (DGO) camera sensors. Cameras can have multiple native ISOs by using analog amplifiers. This is different from regular ISO, which just digitally amplifies your signal. It gives you a relatively cleaner signal in low light, for example, than just turning up the ISO. Now if you combine two native ISOs (DGO), you get a larger dynamic range, better signal-to-noise in your low signal, and a higher clipping point in your high signal.
32 bit float gives you larger dynamic range. In your data, that means a larger highest number. But the increased dynamic range isn't just because of the higher bit depth. The way you collect that data is important too. Not all amplifiers are equal. Similar to DGO, they use multiple analog amplifiers optimized to different ranges and then combine them. Meaning that for all practical purposes, setting the gain is not as important, because the dynamic range of the recording is composed of multiple gain stages combined, and that range is larger than the dynamic range of the microphone.
When Curtis shows the data with the clipping first and then brings it down, he's not being dumb. He knows that data is already there. He's not trying to trick anyone, or push industry hype. He's illustrating that this is where the clipping would be in a lower-bit recording if the gain wasn't set properly. Also, that's the way many audio programs bring in the data, normalized to a 16-bit audio signal, and in that format, numbers that high are clipped. So he scales it down into line with the 16-bit range of numbers (data).
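The dual-gain idea described in this comment can be sketched as a toy model: two idealized 24-bit converters see the same voltage at different analog gains, and the merge keeps the high-gain reading unless it hits the rails. The gain value, the function names, and the hard switch between paths are all assumptions for illustration, not any manufacturer's documented design, and the model ignores analog noise entirely.

```python
# Hypothetical dual-path merge; names and gain values are illustrative only.
HIGH_GAIN = 64.0        # assumed extra analog gain on the sensitive path
ADC_MAX = 2 ** 23 - 1   # 24-bit converter limits
ADC_MIN = -(2 ** 23)


def adc(voltage: float, gain: float) -> int:
    """Idealized 24-bit converter: amplify, round, and clip at the rails."""
    code = round(voltage * gain * ADC_MAX)
    return max(ADC_MIN, min(ADC_MAX, code))


def merge(voltage: float) -> float:
    """Prefer the high-gain path; fall back to the low-gain path on clipping."""
    high = adc(voltage, HIGH_GAIN)
    if ADC_MIN < high < ADC_MAX:
        return high / (HIGH_GAIN * ADC_MAX)
    low = adc(voltage, 1.0)
    return low / ADC_MAX


quiet = merge(0.0001)   # resolved by the high-gain path
loud = merge(0.9)       # high-gain path clips, so the low-gain path is used
print(quiet, loud)
```

In this model the merged value spans a wider range than either converter alone, which is why a format with an exponent is a convenient container for it.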
@MaxoticsTV 21 days ago
32-bit float works this way. Say you acquire an amplitude value of 5699 at 1/48,000th of a second. In 32-bit float you move the decimal point (float it) to just after the first digit, so you get 5.699. Then you store an exponent for the digits you moved past: 10^3. What I did is math. I rescaled the original number. Did I acquire additional "audio" information in that process? You can query AI yourself for how 32-bit float works. Sorry, it DOES NOT GET HIGHER DYNAMIC RANGE for the microphone data. It only increases the RANGE (scale) of the values. Once you need to amplify the data again (listen to it) you must bring it back to 24-bit, or another fixed space.
If microphones don't resolve effectively past 12-bits what are you accomplishing?
There is no such thing, in my book, as dual-ISO sensors. It's a perversion of the ISO concept. ISO was developed to test the sensitivity of film and standardize the scores so you could mix and match films. You could expect the same exposure results whether you used Kodak, Ilford or Fuji film.
Today ISO is being used to indicate amplification gain. True, DGO can theoretically provide different values depending on the amplification and circuitry. But it only works by switching the amplification circuit. So for 30 fps it would internally shoot 60 fps, each alternating sensor read going to a different amplification.
The improvement you gain is at the amplification level ONLY. It does not change sensor level sensitivity. The sensor doesn't change, can't change.
I can't hear the difference in audio quality in the past 30 years. Of course, I'm getting old. So it might just be that ;) Anyway, sure, I believe it COULD work. I just can't see it in the data or hear or see it with my eyes. Lighting, or setting up your audio recording space and microphones, that's 99.99999% of it. In trying to improve that 0.00001% many lose quality in the double-digits :)
My message is to experiment. To think these concepts through. To focus on what's important and not get distracted by the scientifically unproven claims of unscrupulous manufacturers.
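The 5699 example earlier in this reply uses a base-10 analogy. As a point of reference, IEEE 754 single precision does the same thing in base 2, with a sign bit, an 8-bit exponent, and a 23-bit fraction behind an implicit leading 1. The helper name below is hypothetical; the bit layout is the standard one.

```python
import struct


def decompose_float32(value: float) -> tuple[int, int, int]:
    """Return the sign bit, unbiased exponent, and 23-bit fraction field."""
    bits = struct.unpack("<I", struct.pack("<f", value))[0]
    sign = bits >> 31
    exponent = ((bits >> 23) & 0xFF) - 127   # remove the exponent bias
    fraction = bits & 0x7FFFFF               # stored bits after the implicit 1
    return sign, exponent, fraction


sign, exponent, fraction = decompose_float32(5699.0)
significand = 1 + fraction / 2 ** 23         # 24 significant bits in total
rebuilt = (-1) ** sign * significand * 2 ** exponent
print(sign, exponent, significand, rebuilt)  # 5699 = significand * 2**exponent
```

The value is rescaled, not measured again: the significand holds the same 24 bits of precision wherever the exponent places them.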
@appads 21 days ago ⁺²
"the improvement you gain is amplification level only." YES! That is exactly the point. It doesn't affect the sensitivity of the sensor, but it does affect the signal you are able to store. The microphone is analogous to the sensor. 32bF doesn't change the sensitivity of the microphone. The pre-processing makes the applied "gain" result in a cleaner image. For audio, it's not about making the audio "sound better" per se. If your gain is set correctly, a 16-bit and a 32bF recording are going to sound pretty much the same. Again, it's all about the dynamic range of what you ultimately store. And that comes down to the multiple analog amplifiers set to different gains. I'm not sure why they "need" 32bF to capture the (let's call it dual gain) audio. You CAN squeeze it back down to 24 or 16 bit in post and it will probably "sound" just as good. It's all about the dual "gain" or "amplification." With a single-gain 24-bit recording, you can very easily clip your signal if the gain is set too high. Let's say you're recording a couple of people talking very quietly, and one suddenly shouts. Believe me, I've done it, and you can't fix it. Yes, you can turn on limiters, but that works differently. If you turn down your gain, you might not clip your highs, but your lows will not be as clean. The signal-to-noise ratio will be lower, and when you digitally amplify it in post, you get more noise than you would have with analog amplification of the raw signal.
@MaxoticsTV 21 days ago
@appads Most audio I record is pathetic, horrible. So I'd love a fix to the problem of pre-amp gain! But I can't see it. But first, as you wonder, what would 32bF have to do with dual gain? Nothing. The same principle should work for 16-bit or any-bit. I've written to all the manufacturers who claim 32bF doesn't need gain, but they all stop responding once we get into the technical part. Anyway, I don't believe they are doing dual gain in audio because it's one thing to switch gain 30 times a second and quite another every 48,000th of a second (my guess). If one had done it they would give audio proof and have it patented. So, color me skeptical ;)
99% of microphone efficiency comes from the distance of the mic to the source and the strength of the source. I mean, if all this dual ISO stuff worked we'd hear: no need to set ISO! Put the mic anywhere! Shoot at any f-stop you want! We'll gain it so it's perfect.
Oooookaaaayy ;)
@tevryan 21 days ago
@MaxoticsTV What 32bF has to do with dual gain is that you get larger dynamic range without losing accuracy. It's like RAW recording vs compressed recording. For most circumstances, raw recording is overkill, but if you need to significantly adjust your image in post, that additional information becomes very useful. I said above that if you compress a 32bF signal down to 16b, they will sound the same. But if you have a lot of dynamics that you're trying to resolve (let's say you have to bring it back up to recover some very low signal), you will be losing information, and the amplified part will no longer be as clean. So the dual gain gives you increased dynamic range, and the 32bF gives you more flexibility in post. Most importantly, the dynamic range avoids clipping, which is not recoverable, but you also get higher accuracy from the 32bF to recover dynamics.
You can do a test, if you want to "test" this. Use a 32bF recorder, record a signal in 16b or 24b, and really push the recorded dynamics to their limit. Lower the gain. Record a very loud signal and a very soft signal with some sort of reproducible pre-recorded source. Do the same thing in 32bF mode. Then look at the signal-to-noise ratio at the low end of the signal, and see which one is cleaner. I think I recall Curtis doing something similar in one of his videos, and it was a noticeable difference.
As for dual ISO, yes it works. You see a clear difference in SNR when you go from one native ISO to the next. For DGO, you still need to set the ISO because the dynamic range is not infinite, it's just larger. And it does increase the rolling shutter. Maybe someday they will have TGO or QGO. Or maybe they'll pack 3 times the sensors, and each set can have its own sensitivity range, maybe with some high sensitivity photomultiplier tubes on the low end. Maybe they can do hyperspectral imaging for perfect color accuracy and not just RGB, and they can stitch those together with a global shutter, and we won't need to set the ISO or even use NDs, but that's not available... YET.
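The comparison step of the test proposed in this comment can be sketched numerically. This is a simulation under loud assumptions: a synthetic 1 kHz tone at an assumed level of 80 dB below full scale, ideal rounding, and no preamp or converter noise at all, so it isolates the storage format only. Real recordings would need the measured files in place of the synthetic tone.

```python
import math
import struct


def to_float32(value: float) -> float:
    """Round to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


def quantize_fixed(value: float, bits: int) -> float:
    """Round to the nearest step of a signed fixed-point format."""
    scale = 2 ** (bits - 1)
    return round(value * scale) / scale


def snr_db(reference: list[float], stored: list[float]) -> float:
    """Signal power over error power, in decibels."""
    signal = sum(r * r for r in reference)
    error = sum((r - s) ** 2 for r, s in zip(reference, stored))
    return 10 * math.log10(signal / error)


# A 1 kHz tone at 48 kHz, 80 dB below full scale (an assumed test level).
amplitude = 10 ** (-80 / 20)
tone = [amplitude * math.sin(2 * math.pi * 1000 * n / 48000) for n in range(4800)]

snr_16 = snr_db(tone, [quantize_fixed(x, 16) for x in tone])
snr_24 = snr_db(tone, [quantize_fixed(x, 24) for x in tone])
snr_f32 = snr_db(tone, [to_float32(x) for x in tone])
print(round(snr_16, 1), round(snr_24, 1), round(snr_f32, 1))
```

The format ordering this produces does not settle the argument in the thread, which is about whether the analog front end delivers a signal clean enough for the extra precision to matter.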
@MaxoticsTV 21 days ago
@tevryan Both audio data and image data are LINEAR streams. When you set the pre-amp gain, in either case, you're setting your maximum voltage, AFAIK. There is no such thing as compressing 32bF down to 16 bit; or rather, okay, there is, but not for 24-bit, and I doubt you'd hear a difference for microphones in 16-bit. 32bF is SCALED data; the extra dynamic range is mathematical, not a property of the original LINEAR data. I'd like to do full tests as you suggest one day, but they require a bit of equipment, preparation, etc. I have done some tests. But you're right, I would need to do some very specific things to prove my thesis. I have half a mind to get the GH7 and its XLR adapter to prove that Panasonic is lying--or I don't get it! :)
Dual sensor cameras would definitely be interesting!!! Or curved sensors and simpler lenses. So many things they could do instead of letting their marketing departments mangle what the engineers tell them ;)
@aseomg 18 days ago ⁺¹
@29:18 How am I going to record and "set the gain properly" if a company like Zoom Corp doesn't have gain knobs on their digital audio recorders...because they're using two analog-to-digital converters and a marketing term, 32bit Float, to let the consumer know: Just record audio, and adjust levels in your software.
C.Judd made an unnecessary video explaining a marketing term by using visuals which included a waveform, which you seemingly wanted to make a video about because, for you, it's data.
32bit Float has eliminated the need for someone to hire a sound guy to record audio in the same way that a photographer doesn't need to hire a person with a camcorder to record video because the photographer's digital camera also records video.
And this is in essence what technology does. It democratizes, it creates efficiencies, it pushes our boundaries into creative possibilities.
Can something be said about technology being disruptive and harmful to us? Absolutely! But really, no one cares.
We adapt to the here & now, just like people are adapting to not having gain knobs to adjust because of 32bit Float.
@MaxoticsTV 17 days ago
At the end of the day, whether the device is 16-bits and is using a limiter, or 32-bit-float and is using some "magic" inside, it doesn't make a difference--you'll get usable audio. I agree with you!
But if we no longer question what is science and what is not, we let our wishful thinking lead us further and further into tyranny, the tyrant using the fears of the population to excuse their actions. You will probably find this quite a stretch, but there it is.
@martin-4193 6 days ago
2:35 - waiting for you to cut to the chase.
@MaxoticsTV 6 days ago
I feel your pain, really. But this is NOT a simple subject. You can watch some of my other videos on the subject; some have links to an essay in the description. Thanks for the comment. In my next video I'm trying to cut to the chase in the first few seconds!
@martin-4193 6 days ago
@MaxoticsTV 11:32 - I am still watching. The number space is 32 bit, but its largest value isn't 2^32 in the way it would be 2^16 or 2^24 in 16- or 24-bit audio; rather it is 3.4028235 × 10^38. So you have to translate it into dB, because it's so huge at 1528dB. You can compare that huge audio space to Rec.2020 picture data, which has to be downconverted into an SDR Rec.709 data space. The same happens when converting 32-bit float to 16- or 24-bit audio.
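The decibel figure in this comment can be checked with a short calculation. This sketch takes the ratio of the largest float32 value to the smallest normal float32 value, and uses the common 6.02 dB per bit convention for fixed point; both choices are assumptions about how the quoted figure was derived, and other conventions shift the result by a few dB.

```python
import math

FLOAT32_MAX = (2 - 2 ** -23) * 2.0 ** 127   # about 3.4028235e38
FLOAT32_MIN_NORMAL = 2.0 ** -126            # smallest normal value


def ratio_db(largest: float, smallest: float) -> float:
    """Express an amplitude ratio in decibels."""
    return 20 * math.log10(largest / smallest)


float32_range_db = ratio_db(FLOAT32_MAX, FLOAT32_MIN_NORMAL)
fixed24_range_db = ratio_db(2 ** 24, 1)     # 24 bits at about 6.02 dB per bit
fixed16_range_db = ratio_db(2 ** 16, 1)     # 16 bits at about 6.02 dB per bit
print(round(float32_range_db), round(fixed24_range_db), round(fixed16_range_db))
```

The float32 result lands close to the figure quoted above, far beyond the range of any microphone or loudspeaker, which is the point both sides of the thread agree on.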
@MaxoticsTV 6 days ago
@martin-4193 Sorry, you don't have to translate it into dB or anything else. We do it in audio to align the values with our perception of loudness, right? Microphones produce linear voltages. The only question here is how do we quantize those voltages with the greatest fidelity. I wouldn't compare it to Rec 2020, or any color space, because they are simply mappings of color values to numbers. Yes, if you have 10-bit numbers in Rec 2020 you'd need to convert to 8-bit for Rec 709. I don't see how that has anything to do with analog voltage capture. Converting 32-bit float to 16 or 24-bit fixed must round down to the 24-bit precision of the mantissa. Maybe I don't get your point.
@martin-4193 6 days ago ⁺¹
@MaxoticsTV The signal path is correct. You have a microphone that spits out some kind of voltage. That gets transmitted via an audio cable and is then quantized, which we call A/D conversion. So prior to 32-bit float, it was a linear representation, in either 8, 16 or 24 bit. That means the peak voltage of your mic should put out 2^24-1 as a value. So far so good. The issue isn't really the full scale; it is more on the quiet end of things. When signals get weak, you only have e.g. 2^4 as a value. The resolution is bad, as you have lost your 24-bit precision and you are down to more like 3-bit audio - and that doesn't sound right. So the switch to 32 bit isn't 32 bits in a linear fashion. Rather, they use this format: en.wikipedia.org/wiki/Single-precision_floating-point_format. So weak signals still have high resolution when represented as a digital number. (Hold on, I'll continue in a second post.)
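The resolution argument in this comment can be put into numbers. This sketch compares how many bits a 24-bit fixed format spends on a peak at a given level with the relative step size of float32 at the same level. It describes the number formats only; the function names are hypothetical, and whether a converter delivers meaningful bits at those levels is a separate question.

```python
import math


def fixed_bits_used(amplitude: float, bits: int = 24) -> float:
    """Bits a signed fixed-point format spends on a peak at `amplitude` (0..1]."""
    steps = amplitude * 2 ** (bits - 1)
    return math.log2(steps) + 1 if steps >= 1 else 0.0


def float32_step(amplitude: float) -> float:
    """Spacing between adjacent float32 values near `amplitude` (normal range)."""
    _, exponent = math.frexp(amplitude)     # amplitude = m * 2**exponent, 0.5 <= m < 1
    return 2.0 ** (exponent - 24)


fixed_step = 2.0 ** -23                     # constant step size for 24-bit fixed
for level_db in (0, -60, -120):
    amplitude = 10 ** (level_db / 20)
    print(level_db,
          round(fixed_bits_used(amplitude), 1),
          fixed_step / amplitude,           # relative step, fixed point
          float32_step(amplitude) / amplitude)  # relative step, float32
```

At 120 dB below full scale the fixed format is down to about four bits, in line with the comment's estimate, while the float step stays proportional to the signal.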
@martin-4193 6 days ago ⁺¹
@MaxoticsTV Part 2: so a 32-bit float audio space is huge, as I said, more than 1500dB of dynamic range. The issue here is that we don't have loudspeakers and amplifiers that have a dynamic range of 1500dB. This is what you would need to play that kind of audio. And now comes the number space transformation. (This is why I chose the color space transform, because there are no displays that can display Rec.2020 at full scale.) We set a reference value and then downconvert to a linear 16- or 24-bit audio space, because this is what our equipment can send to loudspeakers and headphones.
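The downconversion described here, choosing a reference and then mapping to a fixed range, can be sketched as follows. The function name, the -12 dB gain, and the sample value of 3.2 are illustrative assumptions, not values from the video or from any editing software.

```python
def float_to_int24(sample: float, gain_db: float = 0.0) -> int:
    """Apply a gain, then round and clip to the signed 24-bit range."""
    scaled = sample * 10 ** (gain_db / 20) * 2 ** 23
    return max(-(2 ** 23), min(2 ** 23 - 1, round(scaled)))


hot_peak = 3.2                                      # float sample above nominal full scale
clipped = float_to_int24(hot_peak)                  # converted as-is: pinned at the rail
rescaled = float_to_int24(hot_peak, gain_db=-12.0)  # reference lowered first
print(clipped, rescaled)
```

The order matters: scaling before the conversion keeps the peak, while converting first discards it for good.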
@Vahamedus 22 days ago
As my physics teacher used to say: "you need to be able to differentiate first to integrate later". Sir, you are great at differential understanding, now please add some integral characteristics.
@Vahamedus 22 days ago
Ok I understand now that my comment was too early.
@MaxoticsTV 21 days ago ⁺¹
Hopefully this makes sense. Oswald Spengler argued that the Greeks could have developed calculus but they chose not to because they didn't see the point of math that didn't represent the "real" physical world. He argued that calculus is more than just math, it's a cultural value system of "growth." That has been borne out by our capitalist politics in the 21st century. Of course, without calculus we wouldn't have many of the technologies we have today. Nonetheless, the problem remains. There are many aspects of the physical world that calculus doesn't apply to.
So to answer your question, more differentiation will not improve our integration because we simply won't hear the difference. The physical characteristics of microphones are fully matched with the physical characteristics of our electronics.
Indeed, they are so past the accuracy we need that we compress the heck out of it and still no one can tell the difference ;)
@edwardnixon1782 7 days ago
I think it would have been clearer and more efficient if you had left Curtis Judd out of it and just explained your analysis method. This is confused by your contention. Also, who are you that I should consider your approach authoritative?
@MaxoticsTV 7 days ago
I've done many videos on the subject, which you can find. This video is part of an on-going thing with me (and Curtis). I'm the kind of guy who questions authority and would get locked up or worse in other times and places ;) Anyway, "authority" me? Funny!
@VoltLover00 22 days ago
"There's no noise floor, 0db, clipping, frequencies in the data" - this is utterly false
@MaxoticsTV 22 days ago
Let's say you have 3 data points: 354, 322 and 366. Which one is noise, which one a clipped number, which one a frequency? Let's focus on frequency. Helmholtz and others figured out frequencies LONG before electronics, let alone digital electronics. Mind-blowing that they figured it out! Frequencies DESCRIBE a collection of sounds (data points in the digital world). They are a mathematical model: if enough data points fit it, one would say it's X frequency. The frequency does NOT exist in any vibrations; it is a man-made mathematical description of a collection of vibrations.
So I don't see how it's false, let alone utterly false. You'd need to explain it more.
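The idea that a frequency is computed from a collection of samples, and is not stored in any single one, can be illustrated with a plain discrete Fourier transform. This is a minimal sketch using a synthetic 1 kHz tone at an assumed 48 kHz sample rate; the function name is hypothetical and the brute-force loop is chosen for readability over speed.

```python
import cmath
import math


def dominant_frequency(samples: list[float], sample_rate: float) -> float:
    """Return the frequency of the strongest DFT bin (excluding DC)."""
    n = len(samples)
    best_bin, best_magnitude = 1, 0.0
    for k in range(1, n // 2 + 1):
        # Correlate the whole sample set against one candidate frequency.
        total = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(samples))
        if abs(total) > best_magnitude:
            best_bin, best_magnitude = k, abs(total)
    return best_bin * sample_rate / n


rate = 48_000
tone = [math.sin(2 * math.pi * 1000 * i / rate) for i in range(480)]
estimate = dominant_frequency(tone, rate)
print(estimate)
```

With only three samples, as in the example above, the same function has a single non-DC bin to choose from, so the estimate says very little; the answer sharpens only as more samples are included.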
