Had an assignment last year to implement a Bayer filter on an FPGA which I failed because my professor couldn't explain what the filter was doing in terms I could understand - he insisted on trying to explain everything in terms of matrices: You've just explained it with perfect clarity in about 5 minutes. [Tears out hair] University sucks, Computerphile & YouTube FTW.
I have a similar project at my internship basically i have to implement a module that should convert bayer images to RGB. This comment would be nostalgic to u.
One thing that's great about this approach to learning this....is that, after you understand it on this "higher" level it's much easier to get into the algorithm implementation part and the mathematical part. You get into it with a purpose, with a visual idea of your goal....and spend much more time in the abstract world without losing focus (pun intended).
Almost lost my sleep over how a sensor captures images. I couldn't find any information on how the sensor compensates for the extra green or what happens in the in-betweens. This video helped me better understand all of this and even more. Thank you so much!
So many videos just to picture (eheh) the sensors and the filter... I got to Computerphile and understood it in the first minute... I've seen you for my Informatic Security exam, now for my optics exam.... You got it all. Thx a lot
I knew about the bayer filter but it is very interesting to hear about it more in depth. It's almost completely useless for camera operators like me to know, but my god it's so well explained and more knowledge never hurts.
You know, when I was doing research into how they made color movies before color film was developed I found out that they would split the light beam into its component colors and they would have 2 or 3 rolls of film all going through the camera at once, taking black and white film. They would then tint the film with its respective color and then put them together. Sometimes gluing them, sometimes using a process to make a kind of stamp. It's pretty interesting. I'm sure if you wanted to get a more accurate picture you could do something similar and have a beam splitter with 3 separate sensors.
I'm just getting into photography and astrophotography so this is of extra interest to me now (on top of the generally interesting topics that you tend to cover). Thanks!
You should include more simplified diagrams & visuals for better learning. It's a huge topic to cover in a 6 minute video but I think it could've been clearer. Thanks for such informative videos & keep up the good work. (y)
And the Lowered Color Resolution is why Space Probes use a Black and White (Luminance) Camera and then just Physically put a Big Filter in front of the whole camera one Color at a time, so you get the full resolution for each color.
2:13 I never got this explanation, here or elsewhere. What does our eyes' capacity to capture green light have to do with how the sensor has to do it? The final compounded output of a 4x4 pixel array will be an algorithmic calculation (demosaicing) anyway. I would assume I can adjust this calculation/demosaicing in a way so that the final picture looks natural to our eyes anyway. I would understand an explanation like green being the most prevalent color in the environment, thus it would make sense to focus on capturing that. Eyes capture -> Brain translates. Sensor captures -> Chip translates -> Monitor Output. 2 separate systems of capturing and translation, therefore no need to mimic one part of the "other chain". What am I missing?
By capturing more green pixels the cameras/chips can more accurately figure out the green values than they can red and blue values. The reason we see more green than other colors is because we have more green cones (biological color sensors) than other colors.
Just to nitpick #2:16, we don't distinguish luminance with much more intensity in the green channel, as lumen refers to how bright the human eye will perceive light; 100 lumen green, blue or red would look identically intense as is the purpose.
Is there any software that analyzes an image and uses different demosaicing methods for different parts of the image? Also, is noise one pixel misreading its signal? Can noise be reduced in the demosaicing phase? By excluding information from individual pixels that don't fit their surroundings?
Watching this made me try to think of how to do that process (well, in a way) on each captured pixel. Granted it would be more expensive, but could there be a mini prism that "stretches" the light like a prism and then use that data? I'm guessing no because of the size of the photon packet, but it sounded good in my head.
There are color measurement devices that do exactly this! They're called spectrophotometers, and measure color extremely well. However, I don't think they scale well.
I think you should have shown examples of the demosaicing process. This is a purely technical theoretical explanation. I knew and understood that explanation for years before I first saw an actual visual demonstration of it. That visual completely changed my understanding of it, because just the technical story doesn't really explain it all that well. Wikipedia has a very nice demonstrative visual.
The problem is that there is not one way of demosaicing to rule them all. I agree that an example can be nice, but saying (like in the video) that it's more or less a smart averaging process is true in general, whereas a specific example could lead people to believe that is *the* way to do things.
How about RGBC sensors? (C = Clear, not sure if it's the same as White as in RGBW) They're rare, but I know some things use them, like for example some Motorola phones. Great video btw. Very interesting topic.
In film making the debate between film v digital is still very much alive and well. My Super 8 camera can give professional cinematic images. - The way the film is made and processed has improved dramatically over the years, however, to gain that high widescreen quality is expensive. The company Pro-8 is famous worldwide for its Super 8 processing (and all other film formats). Film is still largely used by Hollywood. - Most directors are anti digital. TV networks often still use film. - PBS has a strong relationship with Pro-8.
Some higher end manufacturers used to be a bit more honest about the actual interpolation breakdown in their technical spec documents like hasselblad and phase one but for some reason I can't find that data anymore - I've spent a lot of time googling to put some more data into my comment but have to go from memory so please excuse that. Most 35mm format digital cameras are in fact as many people ask/suggest offering about 1/3 of the reported resolution due to interpolation and then also make it worse by using AA filters to get rid of interpolation artifacts. So a 30 MP camera is closer to 10MP as far as the physical sensors capability is concerned regarding the colour with the least amount of pixels dedicated to it and you still have the AA edge issues which are a lot more noticeable than the marketing would make you believe - it's not just fine patterns that are affected, those are just the extreme examples - all edge quality is visibly affected if you don't downsample. If you downsample a 36MP nikon d800 shot 2-3x then you can correct the edge detail and counter the quality lost due to interpolation, especially with the excellent downsampling that you can do nowadays (choose correct technique depending on image content etc.) and arrive at a resolution of around 16MP or what was the standard for MF digital backs many years ago. The problem is that even the most expensive MF digital backs do not report interpolation figures as honestly as they used to and when judging progress of digital sensor technology we often look at the MF market, at the expensive solutions! 
I have no idea how many physical "pixels" are on the sensor of say an PhaseOne IQ260 which talks of 60MP resolution, by old high quality standards for interpolated MF backs 60MP reported would suggest there are more than 60MP "pixels" in total on the sensor, Maybe 30MP for the colour with the least amount of pixels but definitely not 1/3 but lack of data might suggest that it is in fact the same breakdown as in small format full frame stuff from Canon and Nikon. It is the standard that small frame digital bodies are interpolated and the resolution is over reported and almost all have AA filters but that is not what we expect of medium format backs. Most of these campaigns against film photography are based on figures and opinions stemming from use of the medium format stuff not small frame. So not to start a debate between medium format and small frame - what I am trying to say is that we are all getting shafted nowadays with this ridiculous marketing, especially against film and the data to tell how much shafted we actually are is no longer made available. These interpolation techniques which are not an issue with say 3k dollars for a nikon body are an issue when we are talking about a 50k dollar medium format back, which is at the end of the day just a big sensor you add to a system you already own and paid for. For those of you who never used MF, my feelings translate like this - it's as if you would have to buy a 3k dollar sensor that you would put in your analog Nikon F and it would offer maybe 2x time the resolution of an iphone camera interpolated up 3 times - you would be very unhappy, especially given that film is being destroyed by this kind of marketing. You can take amazing digital photos with just a pinhole attached to an old digital camera - the point is that the marketing used to be a bit more honest about the resolution. 
The old expensive MF backs offered maybe 4-12 MP but that was without interpolation or AA and what is left nowadays is still thankfully lack of AA but the hidden interpolation details and lack of reporting in regards to how the sensor is physically divided between the colours is all about hiding the true progress of technology. Unlike many seemingly commenting on the web about too many MP, I do actually feel there is not enough MP - I want more. I want about 120MP on a full frame MF sensor if it has to be interpolated (that is at least 56mm x 56mm to cover the film area in my system) and I don't want less than 80MP per colour. This kind of digital back is maybe 15 years in the future. An interpolated file without the AA filter where you get say 16MP at least for the colour chosen to be least important resolution wise is of course going to surpass a non interpolated file where you had 16MP for each colour because the interpolation techniques and filtering are very good nowadays - it's just that we are not in the high MP numbers as we think we are because through the years the reporting and measuring changes. This might be irrelevant to most but those of us who are attacked all the time about our film being outdated have to deal with all kinds of crazy arguments. One person told me my slow speed MF film is at most 50MP, well I disagree but due to interpolation even if my film was 50MP he won't be able to find a MF back that actually approaches that save for maybe the newest 80MP offering from Phase One. The same problem is with scanners for film and general reproduction setups - they also use interpolation and are even more ridiculous in their marketing. 
Arguing that interpolating something to large sizes is as good as natively reproducing it at those sizes requires tinfoil hats - it's like saying on the other side of that argument, that a slide enlarged to cover a whole building magically increased in resolution - we might perceive more resolution and fantastic detail, the colours might be vibrant with that super expensive projector and so on and so forth but it's still not a ground to argue against a competing technology.
***** Buy a second hand high quality zeiss lens for an older analog only mount and then buy a converter to go with it - for instance from fotodiox. For instance the hasselblad lenses I use have the main shutter inside them instead of inside the body itself so sometimes you can get ones with a broken shutter or one with inaccurate times for a pittance. You could get one with mint glass for say maybe 200 quid since fixing it is another 200-300 at least. Only you never fix it, you buy a 30 quid converter and mount it on your sony. It's not a solution for wide angle but for everything besides that it's excellent - you would have to spend at least 2000 quid to get a high quality zeiss lens in native sony mount and a high quality zeiss lens from yesteryear is the same exact thing, sometimes even better. The only thing that changed with most zeiss designs over the years was ergonomics, it's the same glass configuration. MF lenses work great on small frame bodies because you end up shooting through just the centre most part of the lens and that with most lens designs is the sharpest so not only are you using top of the range glass, the aberrations at the edges start where your sensor ends!
The problem of figuring out what goes in the gap of the color sensors seems like something that a neural network would work well to solve. Train a neural network to recognize patterns like edges, curves, and grain, by using incomplete color information. And maybe try to shape its network structure so that it corroborates its guess for one color of sensor with its given data from the other available sensor colors, rather than just a single type. I bet you could get an astounding quality image using this processing technique. (Come to think of it, I think that "RAW" image data may actually be the data from the individual color sensors, before the camera tries to squash them into normal pixels. So the neural network could be used well after taking the picture, if your camera saves the raw data.)
@@MrConminer Pretty neat. I haven't kept up on camera tech or AI lately. Been focused on VR and creative stuff. Thanks for letting me know, though. ^_^
Sounds like we need a camera that detects the exact color for every pixel instead of creating the color by mixing nearby rgb values. I wonder if that's possible. A sensor that detects the exact color frequency hitting its sensor. Not relying on using rgb filters to determine the rgb values.
The whole point is that we don't need the camera because with closely enough placed RGB components, our eye doesn't have enough resolution to get affected by the distortion anyways.
In addition to the beam splitting approach mentioned by Sean, there is also an approach called "sensor stack". You can actually buy such cameras whose sensor is produced by Foveon. However, while this technology is more elegant in principle, eliminating color interpolation and not wasting any photons to color filters, in practice it has noise issues and introduces other weirdness by not having such a clear color separation in hardware.
It would be remotely possible but one must remember the math involved; in order to break out the constituent colours you need to record the 'mixed' waveform of the 3 constituent colours...then do a fourier analysis to find what waves caused the waveform. Unfortunately silicon sensors relying on the photoelectric effect don't record the incoming photons as waveforms, rather just as charges built up on the individual pixels. By just 'counting charge' you can't tell whether it was 3 red and 2 green or 1 blue and 4 red to make the pixel that particular level of charge.
frollard What you are talking about is doing a spectral analysis for every pixel -- only to then reduce it back to RGB. There is a patent by Nikon that does the normal beam separation stuff on chip. Which is basically what you are talking about but simplified to match the task and not do more than required. Not that it wouldn't be cool to have a full spectrum for every pixel ... but the amount of subpixels you'd need to get something one would call a spectrum ... well, some day perhaps.
Interesting. But what happens if for example a yellow color photon hits one of these filters? Will it get through or get discarded? Is there a range of frequencies these R, G & B filters let through and if so do these ranges of frequencies cover the whole spectrum of visible light? So many questions, so little time. :)
Wouldn't the colour filtering cause 2/3 or so of the photons to go uncounted? I know they could keep detecting longer but can't that be more efficient? Couldn't they also use tiny optics like prisms to split the spectrum and get a larger amount of the source red, green, and blue light to the intended detectors? I saw that some cameras use a beam splitter to split the total light but even they filter out 2/3 or so of the photons before detection instead of sending most red, green, and blue to the appropriate detectors. Chromatic dispersion seems to be more efficient if they could just make the millions of tiny prisms over the detector.
Why are raw images so huge in file size? It seems like there is less data than in a demosaiced image because each and every pixel has three values, not just the one in the raw.
I think demosaicing usually happens in the hardware before the raw image is saved so you end up with 10 million pixels times 8 bits per pixel times 4 (or 3 if we drop the alpha channel) which comes out to about 40 (or 30) Megabytes plus header space. That's also assuming it's using 8 bits per color, some formats use more and others less. The actual raw image would still be 10 MB plus header space. Image compression formats tend to shrink file sizes by a ton using a variety of methods including some that outright erase raw data under the presumption that we can't tell the difference. If it's not already been done image compression would make an excellent video if not series of videos.
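To sanity-check the numbers in the reply above, here's a rough Python sketch of the same arithmetic. It only covers uncompressed payload size and ignores headers and metadata. The 14-bit case is just an assumption for illustration (the reply only says some formats use more than 8 bits), and `image_size_bytes` is a made-up helper name, not part of any library.

```python
def image_size_bytes(pixels, bits_per_sample, samples_per_pixel):
    """Uncompressed payload size in bytes, ignoring headers and metadata."""
    return pixels * bits_per_sample * samples_per_pixel // 8


PIXELS = 10_000_000  # the 10-megapixel example from the reply above

sizes = {
    "RGBA, 8 bits per channel": image_size_bytes(PIXELS, 8, 4),
    "RGB, 8 bits per channel": image_size_bytes(PIXELS, 8, 3),
    "one value per sensel, 8 bits": image_size_bytes(PIXELS, 8, 1),
    # Assumed bit depth, only to show how the size scales with it.
    "one value per sensel, 14 bits": image_size_bytes(PIXELS, 14, 1),
}

for label, size in sizes.items():
    print(f"{label}: {size / 1_000_000:.1f} MB")
```

So with one value per sensel the data is smaller than three 8-bit channels per pixel at the same bit depth; how big a real file ends up depends on bit depth, compression and whatever else the format stores.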
***** Nice to know how all of this works, thanks for the post. I have a question about these filters: how possible would it be to have a sensor with a specific pattern of R, G, and B filters that shift 3 times during the taking of a picture and each time capturing the specific amount of R, G, B for each separate pixel in the sensor. I understand this would have to be a very precise movement of the filters relative to the sensor, and that this would either work for still images or for very high shutter speeds; but would it work at all? I thought about this because I was thinking about HDR pictures and how, when using this setting, cameras take multiple pictures in different brightness settings and make a composite one based on what the software thinks is the best HDR picture. So thinking along this same line, with the shifting filters you record separate R, G, and B values based on three different pictures and make a composite RGB value.
What you described is essentially a color wheel. Many projectors actually use this method but it requires moving parts which is something you generally want to minimize on a camera.
Huh. Always thought cameras captured the whole of the image in front of them, but it seems it's a game of knowing how human eyes work and how that can be applied to make image capturing easier. Does that mean that other species might see pictures far different from how we do?
In short:yes. It's assumed for example that some bats SEE sounds. Or We know that some crab can see vastly more colors than we do. Some animals don't see some colors as well (Dogs for example)
oBLACKIECHANoo "don't see some colors as well" ≠ "don't see colours". dogs and cats struggle to perceive, i believe, the difference between red and green. instead of red green and blue cones, they have blue and yellow ones.
Kit Vitae Which is why it's evil to toss a red ball for a dog to find in the grass. In fact, dogs will see a tennis ball far easier (due to difference in green) than a red ball with the same luminosity as the green grass.
Quintar Farenor "We know that some crab can see vastly more colors than we do" - You probably mean the mantis shrimp. But that has been debunked. While they can see a wider part of the spectrum, including ultraviolet, their color vision is actually quite bad.
I always thought each of these r/g/b sensors were grouped to 1 pixel which is independent from the other pixels and the brightness gets averaged out from these grouped sensors...
Well, that's *essentially* what happens, after it gets processed. It makes little sense to build a more complicated sensor, when you can just achieve the exact same result by processing the data after it was captured. Hence it's done that way.
Mythricia What I understood from this video: each r, g or b sensor makes up 1 pixel in the final image. I.e. for a pixel where only a g sensor is available the r and b values are just estimated from the surrounding r and b sensors, and the g sensor itself still corresponds to a pixel, no grouping as I thought and explained in my op.
@@CelmorSmith Similar to how routers are marketed, they represent numbers in a way that makes them seem better. In the camera example a 10 MP camera really only captures about 3 MP for each color and in the router example they tend to represent the total bandwidth of each frequency/channel they can use instead of what the average consumer will actually see (so they add the bandwidth for 2.4GHz, 5 GHz, and more recently 6GHz when in reality each device will only connect with one of those at a time).
Well, this says nothing about the sensor size in relation to the lens, which is also important to actually see anything, or even to know the optimum amount of sensors.
Aren't there 3 subpixels for each pixel? This guy says there are actually 3 megapixels in a 10 megapixels sensor, but it seems more plausible to me that there are actually 30 megasubpixels and therefore 10 megapixels, can someone clarify?
Rough idea: Build a camera with a semi-transparent mirror which transmits 67% of the light into a plain B&W sensor. The other 33% gets reflected into another sensor with a checkerboard-like filter of just blue and red pixels. A chip could guess the green channel and you get the best of two worlds: great looking edges without texture aberrations and just two cheap sensors (the blue/red filter and the B&W sensor are simpler than the bayer R/G/B filter), not three sensors. Because it would be a YCbCr camera, the YCbCr 4:2:2 mp4 compression process would be easier as well.
explodingpenguin3 No, that's correct. Our eyes are most sensitive to light in the middle of our visible spectrum, i.e. green. A common formula for converting a color given by RGB values to a grayscale value of *equal brightness* is this: 29.9% R + 58.7% G + 11.4% B. As you can see there, the green channel contributes most to the result, and the blue channel the least.
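The weighting quoted above is easy to try out. Here's a minimal Python sketch using exactly those percentages (29.9% R, 58.7% G, 11.4% B); the function name `luma` and the 0-255 value range are assumptions for illustration.

```python
def luma(r, g, b):
    """Grayscale value of equal perceived brightness, using the weights quoted above."""
    return 0.299 * r + 0.587 * g + 0.114 * b


# Pure red, green and blue at full intensity (0-255 range assumed).
pure_red = luma(255, 0, 0)
pure_green = luma(0, 255, 0)
pure_blue = luma(0, 0, 255)

print(f"red:   {pure_red:.1f}")
print(f"green: {pure_green:.1f}")
print(f"blue:  {pure_blue:.1f}")
```

Full-intensity green comes out brightest and blue darkest, which matches the ordering described in the comment.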
@@THBfriend I knew about green but didn't realize we saw red better than blue, I suppose we probably just have more green cones (color sensor of the eye) than red and more red than blue.
8640000 bits per color in a 1 mega pixel image, tell me if i am wrong but 8 bits per pixel one mega pixel is 1200 x 900, so 8x(1200x900) should be 8640000
no there are not. rgb is an additive colour space - light mixes together and gets brighter. cmyk is a subtractive colour space - inks mix together and get darker. r+g+b=white, c+m+y=black. (k is also black, since it is better to use black ink for greyscale than mixing 3 different inks together, as 3 inks on top of each other get the printing medium wetter than just one). this is why it is impossible to get an accurate representation of a printed image on a screen, as screens are backlit and print is not. it affects your perception of the colours. there are software tools to help you make adjustments to get the perceived colours _closer_ (such as when designing for print media on a computer), but it will always be a fundamentally different way of producing an image from light.
I don't think a CMYK camera would make sense. The color mixing principle is very different between RGB and CMYK. CMYK is a subtractive color model used for printing and describes the color of mixtures of color pigments (e.g. ink or paint). RGB is an additive color model. It describes the color of mixtures of different wavelengths of light. Since a camera captures lightwaves, it has to work with RGB (or similar). Later, the image can be converted to CMYK, but the raw sensor data will always use an additive color model.
Yea... in theory you could make a CMY camera (but not a CMYK, it's not possible to make a K sensor)... in that case it would be a CMYW sensor. Actually this would be quite interesting because a camera of that type would be twice as sensitive to light. The problem would be that it would be more complicated to calculate the value of every single pixel.
Bitchslapper12 i wrote it out of my own head! (and when i wrote it none of the other comments were on my screen…) i suppose we all just remember the info well. though ofc i can't speak for anyone else. i think we each gave different info though, beyond the "no because additive vs subtractive" vocabulary. the extra padding, if you will.
1. Why not CMYK? Would fit the 4 pixels perfectly, and CMYK is used for colour kind work no? 2. Insensitive to colour? Human eye will still see if you used rec.2020, if not, why is 2020 needed to begin with?
So our eyes don't see color that well... Does this mean that pictures taken with our cameras would look quite strange to animals that see color better?
It would look as strange to them as pictures simulating color blindness look to us. The problem is that we manufacture picture recording and displaying devices that suit OUR needs. Even if you take a picture with a camera that takes a wider part of the spectrum (UV and IR for example) there would still be a problem of displaying the information to your “animals” since there are no monitors that display the additional information on screen nor are there printers using inks whose molecular structure absorbs UV or IR light. But I think technically it would be possible … only for the sake of showing pictures to “animals” with a more sophisticated perception. :)
So are we interpolating pixels just for the sake of cramming more of them in? Couldn't we just use fewer pixels and store only the ones whose values we truly know?
We're interpolating the values for the sake of correct colour representation. If this was not done, you could zoom into an image and see exactly as illustrated in the video: one pixel would have only a green component and another would only have a red or blue component. The demosaic process is employed so that every pixel can have a value for each colour filter.
zap813 No, the interpolation is necessary because you cannot have sensels (=sensor elements) of all three channels/colors *in the exact same spot*. At least not with a Bayer type sensor. So you have to get the missing information from the neighboring sensels. Foveon sensors on the other hand stack the sensels for red, green, and blue vertically on top of each other. So, they don't need to interpolate (but have other problems instead).
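For anyone who wants to see the "get the missing information from the neighboring sensels" step concretely, here's a minimal Python sketch of plain neighbour averaging (bilinear-style). This is only one of many demosaicing methods and not necessarily what any real camera does. The RGGB layout, the function names `bayer_color` and `demosaic`, and the test values are all assumptions for illustration.

```python
def bayer_color(y, x):
    """Filter colour over sensel (y, x), assuming an RGGB layout."""
    if y % 2 == 0:
        return "R" if x % 2 == 0 else "G"
    return "G" if x % 2 == 0 else "B"


def demosaic(mosaic):
    """Fill in the two missing channels at each sensel by averaging the
    same-coloured sensels in its 3x3 neighbourhood. Needs at least 2x2 input."""
    h, w = len(mosaic), len(mosaic[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            sums = {"R": 0.0, "G": 0.0, "B": 0.0}
            counts = {"R": 0, "G": 0, "B": 0}
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w:
                        c = bayer_color(ny, nx)
                        sums[c] += mosaic[ny][nx]
                        counts[c] += 1
            pixel = {c: sums[c] / counts[c] for c in "RGB"}
            # The channel this sensel actually measured is kept untouched.
            pixel[bayer_color(y, x)] = float(mosaic[y][x])
            row.append((pixel["R"], pixel["G"], pixel["B"]))
        out.append(row)
    return out


# A flat scene: every red sensel reads 100, green 50, blue 20.
flat = {"R": 100, "G": 50, "B": 20}
mosaic = [[flat[bayer_color(y, x)] for x in range(4)] for y in range(4)]
result = demosaic(mosaic)
print(result[1][2])  # (100.0, 50.0, 20.0)
```

On a flat scene the averaging recovers the colour exactly; the artifacts people mention in this thread show up at edges and fine patterns, where the neighbours don't share the same true colour.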
Step 1: Go into a store that sells GoPros. Step 2: Ask for a free GoPro. If unsuccessful, go to a different store. Repeat until you have obtained a free GoPro.
No, a camera with x megapixels isn't x/3 for red, x/3 for blue, x/3 for green... Every pixel is composed of an amount of g/r/b... So he's kind of contradicting himself since when the interviewer asks him he says yes, but then explains it differently.
You're confusing the pixels on the screen used to display the image and the sensors used to capture the image. If you have an 8 megapixel camera, it means there are 8 million individual sensors that capture the image. 4 million of those capture green light, 2 million capture red light and the remaining 2 million capture blue light. Then the image gets processed and all the colors get blended and whatnot to produce the image that gets displayed on whatever you're looking at.
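The half green, quarter red, quarter blue split described above can be counted directly. Here's a small Python sketch, assuming an RGGB layout and a 3264 x 2448 sensor (roughly 8 megapixels); both the resolution and the function name `bayer_counts` are assumptions for illustration.

```python
def bayer_counts(width, height):
    """Number of red, green and blue sensels in an assumed RGGB mosaic."""
    # Red sits on even rows and even columns, blue on odd rows and odd columns.
    red = ((height + 1) // 2) * ((width + 1) // 2)
    blue = (height // 2) * (width // 2)
    green = width * height - red - blue
    return {"R": red, "G": green, "B": blue}


counts = bayer_counts(3264, 2448)
total = sum(counts.values())
for color in "RGB":
    print(f"{color}: {counts[color]:,} ({counts[color] / total:.0%})")
```

The advertised megapixel figure corresponds to `total`, i.e. every sensel counted once regardless of which filter sits on top of it.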
FreaknShrooms Actually not really, it depends on the marketing department, really: how many pixels they like to say the camera has. And they simply make the software so it will present that setting as the maximum. I've seen everything from them printing 3 times the actual number of subpixels to printing just a fraction of the number. Using the number of green pixels as a base is common. Also some border pixels might be excluded too.
FreaknShrooms Yea, but what he said was not really true, it was only a broad generalization that I believe isn't even true in the majority of cases. At least for system cameras, for mobile cameras it probably is.
matsv201 No, advertising the camera as 3 times the number of "sub-pixels" is stupid and just isn't done. Each and every one of those pixels undergo the demosaic process to obtain a full RGB value and hence are valid pixels. This is the number that is advertised (as the video and the person you were arguing with mentioned). Also, if this method was employed solely as a naming convention for mobile devices then it will probably be the majority still. Why? Practically everyone has a phone, not everyone is a photo buff owning various types of devices with camera modules.
Great explanation. Most explanations online provide only a surface level. This answered every question I was grasping for. Thanks!
Excellent video!! Please bring Mike back for more, He is a very articulate presenter and I would love to see more of him.
I've been looking for detailed information like this everywhere. Bryce Bayer bless this video. You guys are the best.
this is great. as a photographer and an technician I am loving this series
Very nice graphics in this video. The content of the discussion is very interesting too!
I'm studying now electrical and computer engineering and all that technics are part of the courses that I have taken. Very nice presentation Congrats!
More videos about graphics, 3D and CGI please!!!! :D :D Loving it so far!!!
It might be a good idea to explain the antialiasing filter that tries to prevent demosaicing artifacts by blurring the lines between the color pixels.
I want to know more about this. Thank you
More in the pipeline! :) >Sean
***** Do an episode on Y2K!
***** Yep, I'm with TheJoshinils on this. This one had me gripped.
I love the whiteboard marker you used, it's so smooth and soothing...I wish all my teachers used it :)
Most intriguing. More, please.
I'm just getting into photography and astrophotography so this is of extra interest to me now (on top of the generally interesting topics that you tend to cover). Thanks!
Great video, also love the fan-fold paper :)
😉
You should include more simplified diagrams & visuals for better learning. It's a huge topic to cover in a 6 minute video but I think it could've been clearer. Thanks for such informative videos & keep up the good work. (y)
And the Lowered Color Resolution is why Space Probes use a Black and White (Luminance) Camera and then just Physically put a Big Filter in front of the whole camera one Color at a time, so you get the full resolution for each color.
U saved me 3 hours before my exam! Thanks
The Foveon X3 image sensors are different. I'd love to see a video on their tech.
2:13 I never got this explanation, here or elsewhere.
What does our eyes' capacity to capture green light have to do with how the sensor has to do it? The final compounded output of a 4x4 pixel array will be an algorithmic calculation (demosaicing) anyway. I would assume I can adjust this calculation/demosaicing so that the final picture looks natural to our eyes anyway.
I would understand an explanation like, green being the most prevalent color in the environment, thus, it would make sense to focus on capturing that.
Eyes capture -> Brain translates.
Sensor captures -> Chip translates -> Monitor Output.
2 separate systems of capturing and translation, therefore no need to mimic one part of the "other chain". What am I missing?
By capturing more green pixels the cameras/chips can more accurately figure out the green values than they can red and blue values. The reason we see more green than other colors is because we have more green cones (biological color sensors) than other colors.
Just to nitpick #2:16, we don't distinguish luminance with much more intensity in the green channel, as lumen refers to how bright the human eye will perceive light; 100 lumen green, blue or red would look identically intense as is the purpose.
Is there any software that analyzes an image and uses different demosaicing methods for different parts of the image? Also, is noise one pixel misreading its signal? Can noise be reduced in the demosaicing phase, by excluding information from individual pixels that don't fit their surroundings?
I'd love to hear answers on this as well.
will you guys please explain why some video cameras record moving bars of light and dark when filming computer screens?
Watching this made me try to think of how to do that process (well, in a way) on each captured pixel. Granted it would be more expensive, but could there be a mini prism that "stretches" the light like a prism and then use that data? I'm guessing no because of the size of the photon packet, but it sounded good in my head.
There are color measurement devices that do exactly this! They're called spectrophotometers, and measure color extremely well. However, I don't think they scale well.
is that the reason why sometimes when taking a picture of a fine mesh or looking at it on a screen can cause weird colour artifacts ?
Masterfully said.
I think you should have shown examples of the demosaicing process. This is a purely technical theoretical explanation. I knew and understood that explanation for years before I first saw an actual visual demonstration of it. That visual completely changed my understanding of it, because just the technical story doesn't really explain it all that well.
Wikipedia has a very nice demonstrative visual.
The problem is that there is not one way of demosaicing to rule them all. I agree that an example can be nice, but saying (like in the video) that it's more or less a smart averaging process is true in general, whereas a specific example could lead people to believe that is *the* way to do things.
this answered so many things for me. great explanations.
I love this guy man! So much genius
How about RGBC sensors? (C = Clear, not sure if it's the same as White as in RGBW)
They're rare, but I know some things use them, like for example some Motorola phones.
Great video btw. Very interesting topic.
you guys should do a video on light field cameras
Does it mean that if I buy a camera that only shoots black and white I will have an exact pixel, since it doesn't have to filter blue and green?
In film making the debate between film v digital is still very much alive and well. My Super 8 camera can give professional cinematic images. - The way the film is made and processed has improved dramatically over the years; however, getting that high widescreen quality is expensive. The company Pro-8 is famous worldwide for its Super 8 processing (and all other film formats). Film is still largely used by Hollywood. - Most directors are anti digital. TV networks often still use film. - PBS has a strong relationship with Pro-8.
Great explanation, thank you!
That was really interesting!
Nice overview, thanks for that.
Some higher end manufacturers used to be a bit more honest about the actual interpolation breakdown in their technical spec documents like hasselblad and phase one but for some reason I can't find that data anymore - I've spent a lot of time googling to put some more data into my comment but have to go from memory so please excuse that.
Most 35mm format digital cameras are in fact as many people ask/suggest offering about 1/3 of the reported resolution due to interpolation and then also make it worse by using AA filters to get rid of interpolation artifacts. So a 30 MP camera is closer to 10MP as far as the physical sensors capability is concerned regarding the colour with the least amount of pixels dedicated to it and you still have the AA edge issues which are a lot more noticeable than the marketing would make you believe - it's not just fine patterns that are affected, those are just the extreme examples - all edge quality is visibly affected if you don't downsample.
If you downsample a 36MP nikon d800 shot 2-3x then you can correct the edge detail and counter the quality lost due to interpolation, especially with the excellent downsampling that you can do nowadays (choose correct technique depending on image content etc.) and arrive at a resolution of around 16MP or what was the standard for MF digital backs many years ago.
The problem is that even the most expensive MF digital backs do not report interpolation figures as honestly as they used to and when judging progress of digital sensor technology we often look at the MF market, at the expensive solutions!
I have no idea how many physical "pixels" are on the sensor of say an PhaseOne IQ260 which talks of 60MP resolution, by old high quality standards for interpolated MF backs 60MP reported would suggest there are more than 60MP "pixels" in total on the sensor, Maybe 30MP for the colour with the least amount of pixels but definitely not 1/3 but lack of data might suggest that it is in fact the same breakdown as in small format full frame stuff from Canon and Nikon.
It is the standard that small frame digital bodies are interpolated and the resolution is over reported and almost all have AA filters but that is not what we expect of medium format backs. Most of these campaigns against film photography are based on figures and opinions stemming from use of the medium format stuff not small frame.
So not to start a debate between medium format and small frame - what I am trying to say is that we are all getting shafted nowadays with this ridiculous marketing, especially against film and the data to tell how much shafted we actually are is no longer made available.
These interpolation techniques which are not an issue with say 3k dollars for a nikon body are an issue when we are talking about a 50k dollar medium format back, which is at the end of the day just a big sensor you add to a system you already own and paid for.
For those of you who never used MF, my feelings translate like this - it's as if you would have to buy a 3k dollar sensor that you would put in your analog Nikon F and it would offer maybe 2x time the resolution of an iphone camera interpolated up 3 times - you would be very unhappy, especially given that film is being destroyed by this kind of marketing.
You can take amazing digital photos with just a pinhole attached to an old digital camera - the point is that the marketing used to be a bit more honest about the resolution. The old expensive MF backs offered maybe 4-12 MP but that was without interpolation or AA and what is left nowadays is still thankfully lack of AA but the hidden interpolation details and lack of reporting in regards to how the sensor is physically divided between the colours is all about hiding the true progress of technology.
Unlike many seemingly commenting on the web about too many MP, I do actually feel there is not enough MP - I want more. I want about 120MP on a full frame MF sensor if it has to be interpolated (that is at least 56mm x 56mm to cover the film area in my system) and I don't want less than 80MP per colour.
This kind of digital back is maybe 15 years in the future.
An interpolated file without the AA filter where you get say 16MP at least for the colour chosen to be least important resolution wise is of-course going to surpass a non interpolated file where you had 16MP for each colour because the interpolation techniques and filtering is very good nowadays - it's just that we are not in the high MP numbers as we think we are because through the years the reporting and measuring changes.
This might be irrelevant to most but to those of us who are attacked all the time about our film being outdated have to deal with all kinds of crazy arguments. One person told me my slow speed MF film is at most 50MP, well I disagree but due to interpolation even if my film was 50MP he won't be able to find a MF back that actually approaches that save for maybe the newest 80MP offering from Phase One.
The same problem is with scanners for film and general reproduction setups - they also use interpolation and are even more ridiculous in their marketing.
Arguing that interpolating something to large sizes is as good as natively reproducing it at those sizes requires tinfoil hats - it's like saying on the other side of that argument, that a slide enlarged to cover a whole building magically increased in resolution - we might perceive more resolution and fantastic detail, the colours might be vibrant with that super expensive projector and so on and so forth but it's still not a ground to argue against a competing technology.
***** Buy a second hand high quality zeiss lens for an older analog only mount and then buy a converter to go with it - for instance from fotodiox.
For instance the hasselblad lenses I use have the main shutter inside them instead of inside the body itself so sometimes you can get ones with a broken shutter or one with inaccurate times for a pittance.
You could get one with mint glass for say maybe 200 quid since fixing it is another 200-300 at least. Only you never fix it, you buy a 30 quid converter and mount it on your sony.
It's not a solution for wide angle but for everything besides that it's excellent - you would have to spend at least 2000 quid to get a high quality zeiss lens in native sony mount and a high quality zeiss lens from yesteryear is the same exact thing, sometimes even better. The only thing that changed with most zeiss designs over the years was the ergonomics, it's the same glass configuration.
MF lenses work great on small frame bodies because you end up shooting through just the centremost part of the lens, and with most lens designs that is the sharpest part, so not only are you using top of the range glass, the aberrations at the edges start where your sensor ends!
Very interesting insight.
I'd like to hear more about this subject.
Wow, those density plots instantly triggered a migraine with me. My brain can't handle it better than a camera...
Sorry - I did wonder about putting a warning on it :o/ >Sean
@Dennis Vance Tell me where to find one, I might be in need too.
How filters are placed over each pixel in sensor
The problem of figuring out what goes in the gap of the color sensors seems like something that a neural network would work well to solve. Train a neural network to recognize patterns like edges, curves, and grain, by using incomplete color information. And maybe try to shape its network structure so that it corroborates its guess for one color of sensor with its given data from the other available sensor colors, rather than just a single type. I bet you could get an astounding quality image using this processing technique. (Come to think of it, I think that "RAW" image data may actually be the data from the individual color sensors, before the camera tries to squash them into normal pixels. So the neural network could be used well after taking the picture, if your camera saves the raw data.)
5 years later our smartphone cameras are using ai for that purpose
@@MrConminer Pretty neat. I haven't kept up on camera tech or AI lately. Been focused on VR and creative stuff. Thanks for letting me know, though. ^_^
Fantastic! I won't look at my camera the same way as before...
Sounds like we need a camera that detects the exact color for every pixel instead of creating the color by mixing nearby rgb values.
I wonder if that's possible. A sensor that detects the exact color frequency hitting it, not relying on rgb filters to determine the rgb values.
Traditionally, broadcast cameras carry three sensors, though that's expensive.... >Sean
The whole point is that we don't need the camera because with closely enough placed RGB components, our eye doesn't have enough resolution to get affected by the distortion anyways.
In addition to the beam splitting approach mentioned by Sean, there is also an approach called "sensor stack". You can actually buy such cameras whose sensor is produced by Foveon. However, while this technology is more elegant in principle, eliminating color interpolation and not wasting any photons to color filters, in practice it has noise issues and introduces other weirdness by not having such a clear color separation in hardware.
It would be remotely possible but one must remember the math involved; in order to break out the constituent colours you need to record the 'mixed' waveform of the 3 constituent colours...then do a fourier analysis to find what waves caused the waveform. Unfortunately silicon sensors relying on the photoelectric effect don't record the incoming photons as waveforms, rather just as charges built up on the individual pixels. By just 'counting charge' you can't tell whether it was 3 red and 2 green or 1 blue and 4 red to make the pixel that particular level of charge.
frollard What you are talking about is doing a spectral analysis for every pixel -- only to then reduce it back to RGB. There is a patent by Nikon that does the normal beam separation stuff on chip. Which is basically what you are talking about but simplified to match the task and not do more than required. Not that it wouldn't be cool to have a full spectrum for every pixel ... but the amount of subpixels you'd need to get something one would call a spectrum ... well, some day perhaps.
great video. Thanks!
whats up with fuji x-trans sensors?
Interesting.
But what happens if for example a yellow color photon hits one of these filters? Will it get through or get discarded?
Is there a range of frequencies these R, G & B filters let through and if so do these ranges of frequencies cover the whole spectrum of visible light?
So many questions, so little time. :)
Wouldn't the colour filtering cause 2/3 or so of the photons to go uncounted? I know they could keep detecting longer but can't that be more efficient? Couldn't they also use tiny optics like prisms to split the spectrum and get a larger amount of the source red, green, and blue light to the intended detectors?
I saw that some cameras use a beam splitter to split the total light but even they filter out 2/3 or so of the photons before detection instead of sending most red, green, and blue to the appropriate detectors. Chromatic dispersion seems to be more efficient if they could just make the millions of tiny prisms over the detector.
Why does my Canon PowerShot SX400 IS give me green and purple fringing between bright and dark areas?
Telescope ccd?
Awesome!
Love it.
Why are raw images so huge in file size? It seems like there is less data than in a demosaiced image because each and every pixel has three values, not just the one in the raw.
I think demosaicing usually happens in the hardware before the raw image is saved so you end up with 10 million pixels times 8 bits per pixel times 4 (or 3 if we drop the alpha channel) which comes out to about 40 (or 30) Megabytes plus header space. That's also assuming it's using 8 bits per color, some formats use more and others less. The actual raw image would still be 10 MB plus header space. Image compression formats tend to shrink file sizes by a ton using a variety of methods including some that outright erase raw data under the presumption that we can't tell the difference. If it's not already been done image compression would make an excellent video if not series of videos.
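The arithmetic in the comment above can be checked with a rough sketch. It takes the comment's own figures (10 million pixels, 8 bits per sample) at face value; the function name `image_bytes` is made up for illustration, and header space and compression are ignored.

```python
def image_bytes(pixels, bits_per_sample, samples_per_pixel):
    """Uncompressed payload size in bytes, ignoring headers and compression."""
    return pixels * bits_per_sample * samples_per_pixel // 8


PIXELS = 10_000_000  # the "10 million pixels" from the comment

one_sample_per_site = image_bytes(PIXELS, 8, 1)  # one value per sensor site
rgb = image_bytes(PIXELS, 8, 3)                  # three values per pixel
rgba = image_bytes(PIXELS, 8, 4)                 # with an alpha channel

print(one_sample_per_site, rgb, rgba)  # 10000000 30000000 40000000
```

With more bits per sample the single-value case grows in proportion, e.g. `image_bytes(PIXELS, 12, 1)` gives 15000000 bytes, which is one reason uncompressed files with only one value per site can still be large.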
Cool. This is a lot like digital audio interpolation.
What's the difference between CCD and CMOS?
***** Nice to know how all of this works, thanks for the post. I have a question about these filters: how possible would it be to have a sensor with a specific pattern of R, G, and B filters that shift 3 times during the taking of a picture and each time capturing the specific amount of R, G, B for each separate pixel in the sensor. I understand this would have to be a very precise movement of the filters relative to the sensor, and that this would either work for still images or for very high shutter speeds; but would it work at all? I thought about this because I was thinking about HDR pictures and how, when using this setting, cameras take multiple pictures in different brightness settings and make a composite one based on what the software thinks is the best HDR picture. So thinking along this same line, with the shifting filters you record separate R, G, and B values based on three different pictures and make a composite RGB value.
What you described is essentially a color wheel. Many projectors actually use this method but it requires moving parts which is something you generally want to minimize on a camera.
Huh. Always thought cameras captured the whole of the image in front of them, but it seems it's a game of knowing how human eyes work and how that can be applied to make image capturing easier.
Does that mean that other species might see pictures far different from how we do?
In short:yes. It's assumed for example that some bats SEE sounds. Or We know that some crab can see vastly more colors than we do. Some animals don't see some colors as well (Dogs for example)
Please read my comment again. I never said they only see in black and white. I said they see SOME colors not as well.
oBLACKIECHANoo "don't see some colors as well" ≠ "don't see colours". dogs and cats struggle to perceive, i believe, the difference between red and green. instead of red green and blue cones, they have blue and yellow ones.
Kit Vitae Which is why it's evil to toss a red ball for a dog to find in the grass. In fact, dogs will see a tennis ball far easier (due to difference in green) than a red ball with the same luminosity as the green grass.
Quintar Farenor "We know that some crab can see vastly more colors than we do" - You probably mean the mantis shrimp. But that has been debunked. While they can see a wider part of the spectrum, including ultraviolet, their color vision is actually quite bad.
Really helpful since i'm currently implementing demosaicking in cuda c ;)
kudos to the animator!!
I always thought each of these r/g/b sensors were grouped to 1 pixel which is independent from the other pixels and the brightness gets averaged out from these grouped sensors...
Well, that's *essentially* what happens, after it gets processed. It makes little sense to build a more complicated sensor, when you can just achieve the exact same result by processing the data after it was captured. Hence it's done that way.
Mythricia What I understood from this video: each r, g or b sensor makes up 1 pixel in the final image.
I.e. for a pixel where only a g sensor is available, the r and b values are just estimated from the surrounding r and b sensors and the g sensor itself still corresponds to a pixel, no grouping as I thought and explained in my op.
@@CelmorSmith Similar to how routers are marketed, they represent numbers in a way that makes them seem better. In the camera example a 10 MP camera really only captures about 3 MP for each color and in the router example they tend to represent the total bandwidth of each frequency/channel they can use instead of what the average consumer will actually see (so they add the bandwidth for 2.4GHz, 5 GHz, and more recently 6GHz when in reality each device will only connect with one of those at a time).
Well, this says nothing about the sensor size in relation to the lens, which is also important to actually see anything, or even to know the optimum amount of sensors.
Aren't there 3 subpixels for each pixel? This guy says there are actually 3 megapixels in a 10 megapixels sensor, but it seems more plausible to me that there are actually 30 megasubpixels and therefore 10 megapixels, can someone clarify?
Simply put: you are wrong, the video is correct. There are no subpixels.
I even forgot I asked this question, but thanks for the reply!
I wonder if anyone has made a mosaic pattern that's not a grid of squares. If you did triangles you could get equal proportions of red green and blue.
Can you make a video about error detection?
ua-cam.com/video/5sskbSvha9M/v-deo.html
Rough idea: Build a camera with a semi-transparent mirror which transmits 67% of the light into a plain B&W sensor. The other 33% gets reflected into another sensor with a checkerboard-like filter of just blue and red pixels. A chip could guess the green channel and you get the best of two worlds: great looking edges without texture aberrations and just two cheap sensors (the blue/red filter and the B&W sensor are simpler than the bayer R/G/B filter), not three sensors. Because it would be a YCbCr camera, the YCbCr 4:2:2 mp4 compression process would be easier as well.
I came here to enjoy the experience
2:11 I thought it was the other way round? He says our eyes are MORE sensitive to green than blue & red, isn't it less?
explodingpenguin3 No, that's correct. Our eyes are most sensitive to light in the middle of our visible spectrum, i.e. green. A common formula for converting a color given by RGB values to a grayscale value of *equal brightness* is this: 29.9% R + 58.7% G + 11.4% B. As you can see there, the green channel contributes most to the result, and the blue channel the least.
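The weighting quoted above is easy to try out. This is only a sketch of that one formula with the percentages written as fractions; the function name `gray_value` is made up for illustration, and the inputs are assumed to be plain 0-255 channel values.

```python
def gray_value(r, g, b):
    """Weighted sum from the comment: 29.9% R + 58.7% G + 11.4% B."""
    return 0.299 * r + 0.587 * g + 0.114 * b


# A pure primary at full strength contributes very different brightness:
pure_red = gray_value(255, 0, 0)
pure_green = gray_value(0, 255, 0)
pure_blue = gray_value(0, 0, 255)

print(pure_green > pure_red > pure_blue)  # True
```

The three weights add up to 100%, so a neutral gray input keeps its value (up to floating-point rounding).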
Huh, well TIL. Thanks!
@@THBfriend I knew about green but didn't realize we saw red better than blue, I suppose we probably just have more green cones (color sensor of the eye) than red and more red than blue.
8640000 bits per color in a 1 mega pixel image, tell me if i am wrong but 8 bits per pixel one mega pixel is 1200 x 900, so 8x(1200x900) should be 8640000
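A quick check of that multiplication, using the 1200 x 900 size and 8 bits per value from the comment above (the variable names are only for illustration):

```python
width, height = 1200, 900
bits_per_sample = 8

pixels = width * height                    # 1,080,000, slightly over one megapixel
bits_per_channel = pixels * bits_per_sample
bytes_per_channel = bits_per_channel // 8

print(pixels, bits_per_channel, bytes_per_channel)  # 1080000 8640000 1080000
```

So the 8640000 figure holds for one full-resolution color channel at that size.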
anyone know who the interviewee is?
I'm wondering if he has a UA-cam channel or not
Yes he has.
But I don't remember the name. Its name starts with maths ..
Why can't we put RGB filters in front of every CMOS sensor so we don't have to do any demosaicing?
Hmm.. i'd like to hear also about how IR cameras works.
Are they only RGB? Are there cameras with CMYK?
no there are not. rgb is an additive colour space - light mixes together and gets brighter. cmyk is a subtractive colour space - inks mix together and get darker. r+g+b=white, c+m+y=black.
(k is also black, since it is better to use black ink for greyscale than mixing 3 different inks together, as 3 inks on top of each other get the printing medium wetter than just one).
this is why it is impossible to get an accurate representation of a printed image on a screen, as screens are backlit and print is not. it affects your perception of the colours.
there are software tools to help you make adjustments to get the perceived colours _closer_ (such as when designing for print media on a computer), but it will always be a fundamentally different way of producing an image from light.
I don't think a CMYK camera would make sense.
The color mixing principle is very different between RGB and CMYK.
CMYK is a subtractive color model used for printing and describes the color of mixtures of color pigments (e.g. ink or paint).
RGB is an additive color model. It describes the color of mixtures of different wavelengths of light.
Since a camera captures lightwaves, it has to work with RGB (or similar).
Later, the image can be converted to CMYK, but the raw sensor data will always use an additive color model.
Yea... in theory you could make a CMY camera (but not a CMYK, it's not possible to make a K sensor).. in that case it would be a CMYW sensor.
Actually this would be quite interesting because a camera of that type would be twice as sensitive to light. The problem would be that it would be more complicated to calculate the value of every single pixel.
Did everyone just copy/paste Wikipedia on this poor dude's question?
Bitchslapper12 i wrote it out of my own head! (and when i wrote it none of the other comments were on my screen…)
i suppose we all just remember the info well. though ofc i can't speak for anyone else.
i think we each gave different info though, beyond the "no because additive vs subtractive" vocabulary. the extra padding, if you will.
they answer what exactly i was looking for.
But can the Bayer Filter capture why kids love Cinnamon Toast Crunch?
It's the cinnamon-sugar swirls in every byte!
Now I understand what the `Demosaic[]` function in Mathematica does 😁👍
1. Why not CMYK? Would fit the 4 pixels perfectly, and CMYK is used for colour kind work no?
2. Insensitive to colour? Human eye will still see if you used rec.2020, if not, why is 2020 needed to begin with?
Subtitles, please!!
The Foveon sensor bypasses all this.
Hmm, I always thought the actual sensor's raw data was the frequency and intensity of light that hit the sensor. Seems I was wrong.
Wow
thanx
Please do an episode on Y2K!!!!!!!!!!!
So our eyes don't see color that well... Does this mean that pictures taken with our cameras would look quite strange to animals that see color better?
It would look as strange to them as pictures simulating color blindness look to us. The problem is that we manufacture picture recording and displaying devices that suit OUR needs. Even if you take a picture with a camera that takes a wider part of the spectrum (UV and IR for example) there would still be a problem of displaying the information to your “animals” since there are no monitors that display the additional information on screen nor are there printers using inks whose molecular structure absorbs UV or IR light. But I think technically it would be possible … only for the sake of showing pictures to “animals” with a more sophisticated perception. :)
Lightfields!
So are we interpolating pixels just for the sake of cramming more of them in? Couldn't we just use fewer pixels and store only the ones whose values we truly know?
We're interpolating the values for the sake of correct colour representation. If this was not done, you could zoom into an image and see exactly as illustrated in the video: one pixel would have only a green component and another would only have a red or blue component. The demosaic process is employed so that every pixel can have a value for each colour filter.
zap813 No, the interpolation is necessary because you cannot have sensels (=sensor elements) of all three channels/colors *in the exact same spot*. At least not with a Bayer type sensor. So you have to get the missing information from the neighboring sensels. Foveon sensors on the other hand stack the sensels for red, green, and blue vertically on top of each other. So, they don't need to interpolate (but have other problems instead).
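To make "get the missing information from the neighboring sensels" concrete, here is a minimal sketch of the simplest approach, plain neighbour averaging (often called bilinear demosaicing). It is not what any particular camera does: the RGGB layout, the function names, and the plain list-of-lists raw frame are all assumptions for illustration, and the frame is assumed to be at least 2x2.

```python
def bayer_color(row, col):
    """Which filter covers a sensel, assuming an RGGB layout."""
    if row % 2 == 0:
        return "R" if col % 2 == 0 else "G"
    return "G" if col % 2 == 0 else "B"


def demosaic_bilinear(raw):
    """Keep the measured channel; average nearby sensels for the other two."""
    height, width = len(raw), len(raw[0])
    image = []
    for r in range(height):
        out_row = []
        for c in range(width):
            pixel = {}
            for channel in "RGB":
                if bayer_color(r, c) == channel:
                    pixel[channel] = raw[r][c]  # value actually measured here
                    continue
                # Same-colour sensels in the 3x3 neighbourhood, clipped at edges.
                samples = [
                    raw[rr][cc]
                    for rr in range(max(r - 1, 0), min(r + 2, height))
                    for cc in range(max(c - 1, 0), min(c + 2, width))
                    if bayer_color(rr, cc) == channel
                ]
                pixel[channel] = sum(samples) / len(samples)
            out_row.append((pixel["R"], pixel["G"], pixel["B"]))
        image.append(out_row)
    return image


# A flat-coloured scene: red sensels read 200, green 100, blue 50.
levels = {"R": 200, "G": 100, "B": 50}
raw = [[levels[bayer_color(r, c)] for c in range(4)] for r in range(4)]
print(demosaic_bilinear(raw)[1][2])
```

On a flat scene like this every output pixel comes back as the same RGB triple; the artifacts discussed elsewhere in the comments appear when neighbouring sensels straddle an edge or fine pattern, which is where smarter methods differ from plain averaging.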
interesting, but it doesn't explain how i can get a gopro for free
Step 1: Go into a store that sells GoPros.
Step 2: Ask for a free GoPro.
If unsuccessful, go to a different store. Repeat until you have obtained a free GoPro.
+Noel Goetowski
Step 1: Ask for a GoPro for your birthday.
Step 2: Wait for your birthday.
Step 1: Go into a store that sells GoPro cameras.
Step 2: Steal one
No, a camera with x megapixels isn't x/3 for red, x/3 for blue, x/3 for green... Every pixel is composed of an amount of g/r/b...
So he's kind of contradicting himself since when the interviewer asks him he says yes, but then explains it differently.
You're confusing the pixels on the screen used to display the image and the sensors used to capture the image. If you have an 8 megapixel camera, it means there are 8 million individual sensors that capture the image. 4 million of those capture green light, 2 million capture red light and the remaining 2 million capture blue light. Then the image gets processed and all the colors get blended and whatnot to produce the image that gets displayed on whatever you're looking at.
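The 4/2/2 split above can be counted directly. This sketch assumes an RGGB Bayer layout and a hypothetical 4000 x 2000 sensor (8 million sites); real sensors have other dimensions, and the function name is made up for illustration.

```python
def count_bayer_sites(width, height):
    """Count red, green and blue sites on an RGGB Bayer sensor."""
    counts = {"R": 0, "G": 0, "B": 0}
    even_cols = (width + 1) // 2  # sites at columns 0, 2, 4, ...
    odd_cols = width // 2         # sites at columns 1, 3, 5, ...
    for row in range(height):
        if row % 2 == 0:          # row pattern: R G R G ...
            counts["R"] += even_cols
            counts["G"] += odd_cols
        else:                     # row pattern: G B G B ...
            counts["G"] += even_cols
            counts["B"] += odd_cols
    return counts


print(count_bayer_sites(4000, 2000))
# {'R': 2000000, 'G': 4000000, 'B': 2000000}
```

Half of the sites are green and a quarter each are red and blue, matching the numbers in the comment.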
FreaknShrooms Actually not really, it depends on the marketing department: how many pixels they would like to say the camera has. They simply make the software present that setting as the maximum.
I've seen everything from them printing 3 times the actual number of subpixels to printing just a fraction of the number.
Using the number of green pixels as a base is common. Also, some border pixels might be excluded too.
matsv201 The more you know. I was just clarifying what was said in the video.
FreaknShrooms Yea, but what he said was not really true; it was only a broad generalization that I believe is not even true in the majority of cases. At least for system cameras; for mobile cameras it probably is.
matsv201
No, advertising the camera as 3 times the number of "sub-pixels" is stupid and just isn't done. Each and every one of those pixels undergoes the demosaic process to obtain a full RGB value and hence is a valid pixel. This is the number that is advertised (as the video and the person you were arguing with mentioned).
Also, if this method was employed solely as a naming convention for mobile devices then it will probably be the majority still. Why? Practically everyone has a phone, not everyone is a photo buff owning various types of devices with camera modules.
A bit short...
the sun is green
Fuck this! I am going Sigma Merrill ! :D
2020 anyone?
2021
2022
Dd
What are you talking about I don't understand anything