A major focus of any smartphone release is the camera. For a while, all eyes were on the camera’s hardware — megapixels, sensors, lenses, and so on. But since Google’s Pixel was introduced, there’s been a lot more interest in the camera’s software and how it takes advantage of the computer it’s attached to. Marc Levoy, former distinguished engineer at Google, led the team that developed computational photography technologies for the Pixel phones, including HDR+, Portrait Mode, and Night Sight, and he’s responsible for a lot of that newfound focus on camera processing.
Levoy recently joined Verge editor-in-chief Nilay Patel for a conversation on The Vergecast after leaving Google and joining Adobe to work on a “universal camera app” for the company.
In the interview, Levoy talks about his move from Google to Adobe, the state of the smartphone camera, and the future of computational photography.
One part of the conversation specifically focuses on the balance of hardware and software for a camera in a smartphone and the artistic decisions made within the software. Below is a lightly edited excerpt from that conversation.
Nilay Patel: When you look across the sweep of smartphone hardware, is there a particular device or style of device that you’re most interested in expanding these techniques to? Is it the 96-megapixel sensors we see in some Chinese phones? Is it whatever Apple has in the next iPhone? Is there a place where you think there’s yet more to be gotten?
Marc Levoy: Because of the diminishing returns due to the laws of physics, I don’t know that the basic sensors are that much of a draw. I don’t know that going to 96 megapixels is a good idea. The signal-to-noise ratio will depend on the size of the sensor. It is more or less a question of how big a sensor you can stuff into the form factor of a mobile camera. Before the iPhone, smartphones were thicker. If we could go back to that, if that would be acceptable, then we could put larger sensors in there. Nokia experimented with that; it wasn’t commercially successful.
Other than that, I think it’s going to be hard to innovate a lot in that space. I think it will depend more on the accelerators, how much computation you can do during video or right after photographic capture. I think that’s going to be a battleground.
When you say 96 is a bad idea — much like we had megahertz wars for a while, we did have a megapixel war for a minute. Then there was, I think, much more excitingly, an ISO war, where low-light photography and DSLRs got way better, and then soon, that came to smartphones. But we appear to be in some sort of megapixel count war again, especially on the Android side. When you say it’s not a good idea, what makes it specifically not a good idea?
As I said, the signal-to-noise ratio is basically a matter of the total sensor size. If you want to put 96 megapixels and you can’t squeeze a larger sensor physically into the form factor of the phone, then you have to make the pixels smaller, and you end up close to the diffraction limit and those pixels end up worse. They are noisier. It’s just not clear how much advantage you get.
There might be a little bit more headroom there. Maybe you can do a better job of de-mosaicing — meaning computing the red, green, blue in each pixel — if you have more pixels, but there isn’t going to be that much headroom there. Maybe the spec on the box attracts some consumers. But I think, eventually, like the megapixel war on SLRs, it will tone down, and people will realize that’s not really an advantage.
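For readers who want to see the arithmetic behind that answer, the following is a rough sketch rather than anything Levoy described in the interview. It assumes a simple noise model (photon shot noise plus a fixed read noise per pixel) and a fixed sensor area split into either 12 or 96 megapixels. The sensor dimensions, light level, read noise, wavelength, and f-number are all hypothetical values chosen only to make the trend visible.

```python
import math

# All numbers below are hypothetical, chosen only to make the arithmetic visible.
SENSOR_W_MM = 6.4        # assumed sensor width
SENSOR_H_MM = 4.8        # assumed sensor height
PHOTONS_PER_UM2 = 50.0   # assumed photons detected per square micron per exposure
READ_NOISE_E = 2.0       # assumed read noise per pixel, in electrons


def pixel_pitch_um(megapixels):
    """Side length of one square pixel when the fixed sensor area is split evenly."""
    area_um2 = (SENSOR_W_MM * 1000.0) * (SENSOR_H_MM * 1000.0)
    return math.sqrt(area_um2 / (megapixels * 1e6))


def snr(photons, pixels_summed=1):
    """Signal over noise. Shot-noise variance equals the photon count, and every
    pixel that is read out contributes its own read-noise variance."""
    noise = math.sqrt(photons + pixels_summed * READ_NOISE_E ** 2)
    return photons / noise


def airy_disk_diameter_um(wavelength_nm=550.0, f_number=1.8):
    """Diameter of the diffraction blur spot, measured to the first dark ring."""
    return 2.44 * (wavelength_nm / 1000.0) * f_number


pitch_12 = pixel_pitch_um(12)
pitch_96 = pixel_pitch_um(96)
photons_12 = PHOTONS_PER_UM2 * pitch_12 ** 2
photons_96 = PHOTONS_PER_UM2 * pitch_96 ** 2

snr_12 = snr(photons_12)
snr_96 = snr(photons_96)
# Summing eight 96 MP pixels gathers the light of one 12 MP pixel,
# but pays the read noise eight times.
snr_96_binned = snr(photons_96 * 8, pixels_summed=8)

print(f"12 MP pixel pitch: {pitch_12:.2f} um, per-pixel SNR: {snr_12:.1f}")
print(f"96 MP pixel pitch: {pitch_96:.2f} um, per-pixel SNR: {snr_96:.1f}")
print(f"96 MP summed back to 12 MP, SNR: {snr_96_binned:.1f}")
print(f"Diffraction blur spot: {airy_disk_diameter_um():.2f} um")
```

Under these assumptions, each small pixel is noisier than a large one, summing the small pixels back together does not quite recover the large-pixel result, and the 96-megapixel pixel pitch is well below the diffraction blur spot. Different assumed values would change the numbers but not the direction of the comparison.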
Do you think any of the pixel binning or Quad Bayer techniques — because the 48-megapixel cameras, they still spit out a 12-megapixel photo by default — do you think those help?
That remains to be seen. If you have four reds, four greens, and four blues, that makes de-mosaicing — interpreting the reds, greens, and blues that you don’t see — harder. Those Quad Bayer sensors have been subject to spatial aliasing artifacts of one kind or another, zippering along rows or columns. Whether that can really be adequately solved remains to be seen.
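As an illustration of the layout being discussed, here is a minimal sketch of a Quad Bayer color filter pattern and the 2x2 binning that turns it into an ordinary Bayer mosaic at a quarter of the pixel count (the 48-to-12-megapixel default mentioned in the question). The tile layout is the commonly described one, with red in the top-left corner; the function names and the synthetic pixel values are hypothetical, and real camera pipelines are considerably more involved.

```python
# One repeating 4x4 tile of a Quad Bayer filter: each color covers a 2x2 group.
QUAD_BAYER_TILE = [
    "RRGG",
    "RRGG",
    "GGBB",
    "GGBB",
]
# One repeating 2x2 tile of a conventional Bayer filter.
BAYER_TILE = [
    "RG",
    "GB",
]


def quad_bayer_color(row, col):
    """Color of the filter over the photosite at (row, col) on a Quad Bayer sensor."""
    return QUAD_BAYER_TILE[row % 4][col % 4]


def bayer_color(row, col):
    """Color of the filter over the photosite at (row, col) on a Bayer sensor."""
    return BAYER_TILE[row % 2][col % 2]


def bin_2x2(raw):
    """Average each 2x2 block of a raw frame, halving width and height."""
    height, width = len(raw), len(raw[0])
    if height % 2 or width % 2:
        raise ValueError("raw frame dimensions must be even")
    return [
        [
            (raw[r][c] + raw[r][c + 1] + raw[r + 1][c] + raw[r + 1][c + 1]) / 4.0
            for c in range(0, width, 2)
        ]
        for r in range(0, height, 2)
    ]


# Synthetic flat scene: every red site reads 100, green 60, blue 20 (made-up values).
LEVELS = {"R": 100.0, "G": 60.0, "B": 20.0}
raw = [[LEVELS[quad_bayer_color(r, c)] for c in range(8)] for r in range(8)]
binned = bin_2x2(raw)

for r, row in enumerate(binned):
    print(" ".join(f"{bayer_color(r, c)}={value:.0f}" for c, value in enumerate(row)))
```

Because every 2x2 block shares one color, the binned frame lines up with a standard Bayer pattern and can be de-mosaiced the usual way. Reading the sensor at full resolution is the harder case Levoy describes: same-colored photosites are clustered together, so the nearest sample of a missing color can be farther away than on a conventional Bayer sensor.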
One of the things, as we review the phones on our very consumer side, is we have noticed a particular HDR look has emerged on each of the phones. Samsung has a very particular, very saturated look. Apple started in one place, they went to another place, and they’re going to yet a third place, I think. The Pixel has been relatively constant, but it’s moved a little closer to where the other folks are, in my opinion.
That is a big artistic decision that’s connected to a lot of engineering, but at some point, you have to make a qualitative determination. How are these photos going to look? You obviously had a huge hand in that. How did you make that determination?
You’re right, it’s an artistic decision. My team was instrumental in that. I looked at a lot of paintings and looked at how painters over the centuries have handled dynamic range. One of my favorite painters was Caravaggio. Caravaggio had dark shadows. I liked that.
That really explains a lot about the Pixel 2.
Right. Last year, we moved a little bit more toward Titian. Titian has lighter shadows. It’s a constant debate, and it’s a constant emerging taste. You’re right that the phones are different. It’s also true that there is probably some ultimate limit on high dynamic range imaging — not necessarily on how high a dynamic range you could capture, but on how high a dynamic range you can effectively render without the image looking cartoony.
One of my favorite photographic artists is Trey Ratcliff, and his look is deliberately pushed and cartoony. I think that’s his style. But I’m not sure I would want the Trey Ratcliff look with every picture that I took every day with a smartphone. I think that’s an important limit. It’s not clear how we get beyond that limit or whether we ever can.
Our friend Marques Brownlee does these challenges every so often where he asks people to vote in “blind Pepsi challenges” of smartphone photos. I think every time he’s done it, it doesn’t matter how good the photo is, the brightest photo always wins. That’s the easiest cheat that any camera maker has, is just to overexpose it a little bit and then you’ll win on Twitter. How do you solve for that in a moment like this?
That was a debate that at Google we had all the time. At Adobe, I’m hoping to put it more in the hands of the consumer or the creative professional. Let them decide what the look will be.
But of course, that was a constant debate because you’re right, brighter would often win in a one-to-one comparison. One factor that you haven’t mentioned that I should add in here is the tuning of the displays on these smartphones. Most smartphones are a little bit cranked relative to a calibrated so-called sRGB display. They’re more saturated. They’re more contrasty. You could argue that that’s probably the right thing to do on the small screen. It would be a terrible thing to do on a large screen. It would look very cartoony, but that kind of contributes to what people want to see and to taste, especially since most photographs are looked at only on the small screen.
Yeah, it’s a constant debate, a constant emerging trend. It will probably change again. We can look at photographs from the ’50s and ’60s — partially because of Kodak’s choices, but also because of their technology — and we could identify a Kodachrome or an Ektachrome picture, and we’ll be able to recognize pictures from various decades of digital photography as well.
Is your vision for a universal app — and I recognize you’re a team of one building a team, many steps to come — that anybody will download it on any phone, and the image that comes out will look the same, no matter the phone?
That remains to be seen. One of the interesting questions sort of hidden under your question is personalization but also regionalization. Some phone vendors do regionalize their phones, and some do not. At Adobe, I think the preference is to leave the creative decisions in the hands of the photographers, more so than the phone vendors have been doing in their software. It’s unknown how that will shake out.
Well, the reason I ask — assuming you get a sort of standardized access to sensor data — would your instinct be to, “Here’s a relatively neutral thing for the creative to work from that looks the same across every phone”?
That’s one possible path. So to dive down a little bit into the way raw images are processed, you can take an approach of — just among SLRs, let’s say — of putting out the kind of image that that SLR would have put out. And so Nik Imaging, when the company still existed, did that. Adobe tries to do its own white balancing and processing and give a fairly uniform look across any SLR. That’s a different decision, and maybe that would be continued for the different smartphone vendors. Maybe not. It remains to be seen.