The idea of a cognitive immune system - similar to our "standard" immune system, but to protect us against "bad" information - is not new. But with AI, it has new relevance, and it might give us a real-time shield against deception and misinformation in the media.
Unless you're severely immunocompromised, you have a biological immune system to protect you against hostile bugs. It's an incredibly sophisticated protective shield, but like all armour, it's not perfect.
But do we have an immune system that protects us against bad ideas and deception (a "cognitive immune system")? It depends. In some ways, we do - we wouldn't survive long if we didn't have one. Think about how a child learns about the world. If you've never experienced a flame, and nobody has warned you to avoid putting your finger in one, then it's a lesson you'll learn quickly and probably only once. Similarly, most higher animals, from mice and cats to primates and humans, recognise the edge of a cliff or an open trapdoor and stay away from it.
When you move towards higher-order information, it's not as clear-cut because different people see the same things differently.
Atheists take issue with the fact that there are multiple religions in the world, because there can't logically be several all-powerful gods. People of one religion or another have no problem, however, claiming that other religions are false. Ever-widening political divisions suggest that we have few intrinsic internal checks: our worldview adapts to whatever we believe or feel comfortable with and, essentially, to whatever our peers and broader community accept as the "truth".
So, is the truth entirely relative? Can there be multiple truths about the same "fact"? The answer, of course, depends on how you define "fact".
All of this is a preamble to the central question of this discussion: given that our cognitive immune system (if, indeed, that's what we can call it) is so subjective, so contingent on our belief systems, can we reasonably expect AI to help us determine what is true and what is false?
Let's look at the issues. First of all, what would this look like? It could take several forms.
Reducing hallucinations
You can already use AI, in the form of Large Language Models (like ChatGPT), to verify information, but this method is not yet dependable. An LLM's greatest skill is not being correct but being plausible. Not only do these models make mistakes, but they mask them by wrapping them in convincing language.
An LLM is a prediction model: it predicts the future in a very specific and limited way, by predicting the next word in a sentence. If you have any experience whatsoever of using an LLM, you'll know that's an oversimplification, but it is at the core of how LLMs work. To be more specific, they don't just try to predict the next word but the next plausible word. So if the sentence so far is "I've spent the day fishing by the river…", it knows the next word is likely to be "bank", but also that it's a different bank from the one in "I've just been to the high street… to pay my salary into my account". So it's working at a deeper level than mere words: it knows about context.
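To make the "river bank" example concrete, here's a deliberately tiny sketch in Python. It is not how a real LLM works internally - the lookup table and the probabilities are invented purely for illustration - but it shows the core idea: the plausibility of the next word depends on the words that came before it.

```python
# A toy illustration of context-dependent next-word prediction.
# Real LLMs learn probabilities like these over a huge vocabulary from
# vast amounts of text; the numbers below are invented for illustration.

toy_model = {
    "I've spent the day fishing by the river": {
        "bank": 0.85,   # the riverside sense of "bank"
        "bed": 0.10,
        "mouth": 0.05,
    },
    "I've just been to the high street": {
        "bank": 0.70,   # the financial sense of "bank"
        "shops": 0.20,
        "market": 0.10,
    },
}

def predict_next_word(context: str) -> str:
    """Return the most plausible next word for a known context."""
    candidates = toy_model[context]
    return max(candidates, key=candidates.get)

for sentence in toy_model:
    print(f'{sentence}... -> "{predict_next_word(sentence)}"')
```

Both contexts predict "bank", but it's the surrounding words that determine which sense of "bank" is meant - and a real model does this with a learned network over an enormous vocabulary, not a lookup table.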
LLMs are trained on real sentences, paragraphs and entire literary works, which means they are trained in plausibility. An LLM doesn't know how not to be plausible. That's a problem when it makes mistakes, because the mistakes sound plausible too. Ask an LLM to give you a list of ten things, and the chances are that nine of them will be entirely correct - and that one of them will be plausible, convincing and completely wrong. This failure mode is called "hallucinating", and hallucinations are very convincing.
But self-checking and cross-referencing with other LLMs will reduce the errors. At some point, these models will be more reliable than almost anything else - but when they do make mistakes, those mistakes will, unfortunately, be even more convincing and hence more misleading.
If the best AI models designed to check and verify another AI's answers can't spot an error, we're even less likely to. This shortcoming might be something we have to get used to. You can never build a perfect aircraft, but knowing and accepting that means you can devise all sorts of redundancies, cross-checks, regulations and procedures to make it so unlikely to fail that we barely think about the possibility when we board a plane. And it's true: this works very well in practice.
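As a rough sketch of what "redundancies and cross-checks" could mean for LLMs, the Python below asks several independent models the same question and only accepts an answer when most of them agree. The `ModelFn` callables and the 75% threshold are assumptions of mine, not a description of any real verification system.

```python
from collections import Counter
from typing import Callable

# Each ModelFn stands in for a call to a different, independent model's API.
ModelFn = Callable[[str], str]

def cross_check(question: str, models: list[ModelFn],
                threshold: float = 0.75) -> str | None:
    """Ask several independent models the same question and accept an
    answer only when a clear majority agree. Returns None - meaning
    'flag for human review' - when there is no consensus."""
    answers = [model(question) for model in models]
    answer, votes = Counter(answers).most_common(1)[0]
    if votes / len(answers) >= threshold:
        return answer
    return None
```

Disagreement doesn't tell you which model is wrong, only that the answer shouldn't be accepted automatically - which is exactly the kind of cross-check the aircraft analogy points towards.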
The tsunami of information and content that we're deluged with every day is becoming increasingly challenging to filter. Ironically, much of this is down to AI, either curating and modifying our feeds to show us material designed to make us think one way or another - because that increases engagement and hence advertising revenue - or actually generating the content itself. What chance do we have when AI can generate photorealistic images of things that have never existed, from nothing more than a short text prompt - one that another AI may itself have generated?
The best chance for us might be to use AI itself in a fact-checking role. As we saw earlier in this piece, this idea is not without issues when many of us can't agree on what a fact is. But facts are objective, or they're not facts. To be more precise, a fact is something that is independently verifiable.
What would an AI fact-checking system look like?
We're already part way there. It's clunky, but if you made a real-time transcription of, say, a news bulletin and fed it into an LLM, it could do some fact-checking. But you can immediately see some issues - especially if it's breaking news. How does an AI fact-checker deal with new information arriving in real time? That AI model will only have been trained on information that existed before its training cut-off. But that's not a reason to think it could never work, because we are intelligent, and we deal with new information, mostly successfully. How is that possible?
It's possible because of context. Context is how we and AI "understand" things in relation to other things. We do it differently to AI because we have an internal "world model", which is our understanding of the world and how we interact with it. In their initial state, LLMs don't have a world model.
Nevertheless, LLMs do have virtually instant access to a vast body of information. An LLM can often tell from context when a statement clashes with what it already knows, and flag it as an apparent falsity. Whether it can do this in real time is a big question, but through continuous improvement and optimisation, it is not unreasonable to think it could happen near-instantaneously.
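As a very rough sketch of the shape such a pipeline might take, the Python below takes a chunk of live transcript, splits it into individual claims and passes each one to a checking step. Every function and name here is a placeholder of my own - the point is the structure (transcribe, split into claims, check, repeat), not a working fact-checker.

```python
from dataclasses import dataclass

@dataclass
class ClaimCheck:
    claim: str
    verdict: str       # e.g. "supported", "contradicted", "unverified"
    explanation: str

def split_into_claims(transcript_chunk: str) -> list[str]:
    """Placeholder: break a chunk of live transcript into individual,
    checkable claims. In practice this step is itself a job for an LLM."""
    return [s.strip() for s in transcript_chunk.split(".") if s.strip()]

def check_claim(claim: str) -> ClaimCheck:
    """Placeholder: a real system would ask an LLM, with whatever
    reference material is available, whether the claim holds up.
    This sketch simply marks everything as unverified."""
    return ClaimCheck(claim, "unverified", "no model attached in this sketch")

def fact_check_live(transcript_chunk: str) -> list[ClaimCheck]:
    """Process one chunk of transcript as it arrives, claim by claim."""
    return [check_claim(c) for c in split_into_claims(transcript_chunk)]
```

The hard parts - keeping up in real time and dealing with genuinely new information - all live inside `check_claim`, which is exactly where the context problem described above bites.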
But who trains the models? How do we know that they're objective? We don't. To achieve true objectivity, you must have an agreed set of rules and independent oversight of the material used to train the AIs. That won't be easy, but again, it's possible.
What would it look like? Perhaps a traffic light in the corner of the screen: green would mean "This sounds plausible and is likely to be true", amber "We can't verify this", and red "We know this to be false or misleading".
I'm sure everyone reading this has the same thought: isn't this open to manipulation? Yes, it absolutely is. But there are ways to safeguard it. Trusted sources could act as "verification servers", supplying previously verified information as a reference for the AI. An AI safety belt doesn't have to be perfect: it only has to be better than not having one, and it must not actively mislead us.
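Pulling the traffic light and the verification servers together, here's a minimal sketch of what the rating step might look like. The two reference sets and the exact-string matching are my own stand-ins for whatever trusted sources would actually supply; a real system would need semantic matching against far larger, continuously updated stores.

```python
# Statements previously checked by trusted "verification servers".
VERIFIED_TRUE = {
    "water boils at 100 degrees celsius at sea level",
}
VERIFIED_FALSE = {
    "the earth is flat",
}

def traffic_light(claim: str) -> str:
    """Rate a claim as 'green', 'amber' or 'red'."""
    normalised = claim.strip().lower().rstrip(".")
    if normalised in VERIFIED_FALSE:
        return "red"     # we know this to be false or misleading
    if normalised in VERIFIED_TRUE:
        return "green"   # independently verified as true
    return "amber"       # we can't verify this

for claim in ["The Earth is flat.",
              "Water boils at 100 degrees Celsius at sea level.",
              "It will rain in London tomorrow."]:
    print(f"{traffic_light(claim):>5}  {claim}")
```

The important design choice is that amber is the default: anything the system can't trace back to a verified source is merely unverified, not declared false.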
Some might say that this is too big a step, that it's another arm of the "Nanny State". But I don't think so. You'll be able to turn it on and off.
Imagine you're in a library, surrounded by every book that has ever existed. You read something that sounds a bit off, but you have the means to verify or disprove what you're being told. The internet, augmented by AI, is like that.
The real question is: is the AI acting as a helpful librarian, or is it writing the books now?