
With AI involved, can we rely on images ever again?


Phil Rhodes' article this week about how inappropriate compression can damage the credibility of video evidence is just the tip of an iceberg. The reality is that modern AI-based video technology is pushing the limits of our ability to tell the truth from fiction.

Compression gets a bad press. Sometimes it's deserved. Anyone who's watched digital terrestrial TV (or satellite TV when it's raining) will be familiar with the multi-coloured chessboards that scurry crazily around the screen when the decoding breaks down ungracefully. It's not a good look. But neither is no television signal at all.

This might sound surprising, but compression can actually improve quality. Yes, that sounds like a complete contradiction. But it's all about context.

Many of you still have a 70-year-old copper cable to your house. When it was first installed, its only job was to get a raspy phone signal from a carbon microphone to the other person on the call. It didn't have a "bandwidth" because no one was talking about data at the time. It certainly had a frequency response, and you'd probably find that it tailed off severely above audio frequencies.

I'm old enough to remember the introduction of the "Dial-a-Disc" service, where you could phone a number and - I can't believe I'm saying this in the era of Spotify, Tidal, etc. - you'd hear a record. You couldn't choose: it was just pre-ordained for the day. And it sounded terrible: every bit as bad as you'd expect. And yet, for a while, it was extremely popular.

I'm relieved to say that it didn't last. But it did imprint on my mind the notion that it's a terrible idea to play music over a phone line.

This makes it more of a surprise that today you can listen to CD-quality music over that same phone line. There's no trickery here at all. Tidal subscribers get bit-for-bit accurate CD-quality audio straight to their speakers, courtesy of clever technology that sends data on hundreds of carriers through a phone line. That's why your distance from the phone exchange matters. Those high frequencies prefer to escape into space rather than tunnel through a phone line.

When broadband was slower, absolutely the only way to get near-CD quality over a copper line was to compress it. With audio, that means using psychoacoustics to avoid wasting bandwidth on frequencies "masked" by others. That's how MP3 works, and it is surprisingly effective.

And it is a clear example of how compression can improve quality. Obviously, the compressed version is not as good as the original (although most people wouldn't notice the difference on a typical consumer audio system), but it is miles better than not having any version at all.
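To put rough numbers on that trade-off, here is a minimal sketch of the arithmetic. The CD parameters are the standard ones (44.1 kHz, 16-bit, stereo); the 128 kbit/s MP3 rate is an assumption chosen for illustration, not a figure from this article.

```python
# Standard CD audio parameters.
SAMPLE_RATE_HZ = 44_100
BITS_PER_SAMPLE = 16
CHANNELS = 2

# Assumed MP3 bitrate for illustration; many other rates are in use.
MP3_KBPS = 128


def pcm_bitrate_kbps(sample_rate_hz, bits_per_sample, channels):
    """Uncompressed PCM bitrate in kilobits per second."""
    return sample_rate_hz * bits_per_sample * channels / 1000


cd_kbps = pcm_bitrate_kbps(SAMPLE_RATE_HZ, BITS_PER_SAMPLE, CHANNELS)
ratio = cd_kbps / MP3_KBPS

print(f"Uncompressed CD audio: {cd_kbps:.1f} kbit/s")
print(f"MP3 at {MP3_KBPS} kbit/s needs roughly 1/{ratio:.0f} of that")
```

On a line that can only carry a few hundred kilobits per second, that difference decides whether the music arrives at all.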

Compression, video, and legality

The same is true - but more so - with video. Today, I'm theoretically able to watch three streams of 8K video over my phone line. I haven't tried that, but it should be possible given my bandwidth of 450 megabits/sec (that's very high for a phone line, but I am very close to the exchange!). I could probably watch more than ten streams of 4K video. The maximum I've tried is three, and they played perfectly. Again, just to remind you, this is down a perfectly unglamorous copper phone line. And, again, it's compression that allows me to see these fantastic pictures.
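The stream counts above can be sanity-checked with a back-of-envelope sketch. Only the 450 megabits/sec figure comes from the text; the per-stream bitrates and the 20% headroom are assumptions for illustration, and real services vary widely.

```python
LINK_MBPS = 450  # line speed quoted in the text

# Assumed per-stream bitrates in Mbit/s (hypothetical round numbers).
ASSUMED_STREAM_MBPS = {"4K": 25, "8K": 100}


def max_streams(link_mbps, stream_mbps, headroom=0.8):
    """Whole number of streams that fit if only `headroom` of the link is used."""
    return int(link_mbps * headroom // stream_mbps)


for label, mbps in ASSUMED_STREAM_MBPS.items():
    print(f"{label}: up to {max_streams(LINK_MBPS, mbps)} streams")
```

Under these assumptions the sums come out at three 8K streams and comfortably more than ten 4K streams, which is in line with the estimate above.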

So that's the upside of compression. It's good because it lets us see and store video when the capabilities of the infrastructure would otherwise not allow us to.

But there definitely is a downside, although it's not all compression's fault, as we'll see in a minute.

Around the turn of the century, I started to see legal objections to using compressed video as evidence. The central pillar of the opposition to compressed video was not so much that the compression was lossy as how it was lossy. The issue was that video compressed using a Group Of Pictures ("GOP") could not be relied on because it was "making up" many of the frames.
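To see what that objection is pointing at, here is a minimal sketch of a GOP layout. The GOP length of 12 and the two B-frames between anchor frames are assumptions picked as a textbook-style example; real encoders choose their own structures. Only the I-frame is coded as a complete picture. P-frames are predicted from earlier frames and B-frames from frames on either side, with the encoder storing the differences.

```python
def gop_pattern(gop_length=12, b_frames=2):
    """Frame types for one GOP in display order: an I-frame, then B...B P groups."""
    pattern = []
    for i in range(gop_length):
        if i == 0:
            pattern.append("I")  # coded as a complete picture
        elif i % (b_frames + 1) == 0:
            pattern.append("P")  # predicted from an earlier frame
        else:
            pattern.append("B")  # predicted from frames on both sides
    return pattern


frames = gop_pattern()
predicted = sum(1 for frame in frames if frame != "I")

print("".join(frames))
print(f"{predicted} of {len(frames)} frames are predicted rather than coded whole")
```

In this hypothetical layout, eleven frames out of every twelve depend on their neighbours, which is the sense in which they were said to be "made up".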

If you're going to be really picky, you can't argue with that. But if you're going to do that, you also have to accept that no video can be relied on: not even analogue video. Remember that flying spot on a CRT screen? You could argue that only the part of the scene illuminated by the spot is an accurate record. So the rest of the image might be something else altogether.

This is more of a philosophical category mistake than a real problem. As is the case with all types of sampling - and analogue video is at least analogous to sampling, except that the samples are frames and not quantised slices of an audio or video signal - as long as the recording system is capable of capturing and reproducing natural actions, then it should be adequate for evidence. Of course, if you want to analyse the motion of a bullet, you'll have to shoot at a (much) faster frame rate. But most of the time, "normal" video is perfectly OK for evidence, regardless of "missing" frames as a result of Long-GOP compression. So, in a way, I think these arguments are like those that say you can't sit on a chair because of all the space between the atoms.

Phil mentions in his article that it gets really dicey when you start to zoom in. Let's say you're picking out a face from a crowd. With conventional video, you can zoom in until you start seeing individual pixels and severe artefacts like aliasing - the stepping effect that you see on diagonal lines.
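As a rough illustration of that conventional behaviour, here is a minimal sketch of the simplest kind of digital zoom, nearest-neighbour enlargement. The function name and the tiny test image are hypothetical, and real viewers usually interpolate more smoothly, but the principle is the same: every output pixel is derived from a captured pixel and nothing new is invented.

```python
def zoom_nearest(image, factor):
    """Enlarge a 2D grid of pixel values by repeating each pixel factor x factor times."""
    zoomed = []
    for row in image:
        wide_row = [pixel for pixel in row for _ in range(factor)]
        zoomed.extend(list(wide_row) for _ in range(factor))
    return zoomed


# A hypothetical 2x2 greyscale image.
tiny = [[0, 255],
        [255, 0]]

for row in zoom_nearest(tiny, 2):
    print(row)
```

Each original pixel becomes a visible square block, and the set of distinct values in the picture does not change. Those blocks are exactly the warning sign that you have run out of real detail.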

I'm going to say that these limiting factors are good. They warn that what you see will be unreliable if you zoom in too much. But today, we're starting to lose some of those limiting factors that we can use as a reference, and I think that that will get tricky in the context of evidence.

AI

The culprit is AI. And it's all over the place, including in our pockets.

Modern smartphones need AI. Without it, there would be severe limits to their photographic abilities. It's tough to get concrete details about how Apple, for example, processes the images captured by its phones. But you can see the effects. While some might object to "over processing", I'll take that any day over an image taken with poor glass (or plastic) through what looks little bigger than a pinhole and captured with a sensor the size of a pinhead.

Despite the lack of room for optimally-sized optical elements, modern phones have reached a jaw-dropping level of competence, producing brilliant, if not necessarily accurate, photos.

Leaving aside the processing that makes camera phones "look" good, I've noticed that iPhones also process their images to allow you to zoom in to an almost unbelievable degree. And when you do that, what do you see? It's not pixels. You may see some aliasing, but not as much as you'd expect. At extreme digital zoom levels, you see what looks more like a post-impressionist painting than a matrix of dots.


A zoomed-in crop from an iPhone image.

Have a look at this shot, which was taken on an iPhone 13 Pro Max from my kitchen window. It's already digitally zoomed in quite a bit. You can see houses and trees, but look even closer at the houses: they look stylised. They should either look hopelessly blurred or massively pixellated at this zoom level. But they're neither. I suspect Apple's much-vaunted AI processing is at work here.

None of this is real at this zoom level. So, in that sense, it's fake. It's certainly not suitable for evidence. That's not to say it's bad: I think it's a genuine step forward in digital photography, but that depends on how it will be used.

I feel that video for evidence needs to be "clean" and not processed by AI. Of course, there will always be a level beyond which AI will make stuff up unless it's specifically told not to.

It's very early days, and the law (and politics) always lags behind technology. So brace yourselves for some "interesting" legal cases.

And meanwhile, let's hope that experts like Phil find themselves very busy.
