What AI Cannot Do
The limits that actually matter, so you know what to trust and what to double-check. Read this before you start relying on AI for anything important.
The Primer covers what AI is and how it works. This page covers what it cannot do, which is at least as important. AI is genuinely useful for a long list of things, and genuinely terrible at a smaller list. The terrible list is where people get hurt, so it is worth knowing in advance.
It makes things up, with confidence
This is the single most important thing to know about AI. The technical name for it is hallucination: the model will invent facts, statistics, quotes, court cases, scientific papers, and the names of people who never existed, and it will present them in exactly the same calm tone it uses for things that are true. There is no difference, in the model's voice, between a fact and a fabrication. Both come out polished.
The reason traces back to the Primer. An LLM, the kind of program that powers Claude or ChatGPT, is a very fluent autocomplete. It produces what sounds plausible given everything that came before. "Plausible" and "true" are not the same thing, and the model has no internal sense of which side of that line a particular sentence falls on. When you ask it to find a court case to support your argument, the most plausible continuation is a confident citation with a convincing name, so that is what it produces, whether or not any such case exists. Lawyers in several countries have been sanctioned for citing AI-fabricated cases in real court filings. They were not unlucky. They were early examples.
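If you are curious what "fluent autocomplete" means in miniature, here is a toy sketch in Python. It is nothing like a real model in scale or design, and every case name in it is invented for illustration, but it shows the core problem: a program that only ever picks a statistically plausible next word will happily stitch real fragments into a citation that never existed.

```python
import random

# A toy "autocomplete": count which word follows which in a tiny corpus,
# then generate text by always picking a statistically plausible next word.
# This sketches the *idea* behind an LLM; it is nothing like a real model's
# code, and every case name here is invented.
corpus = (
    "smith v jones 1984 . brown v board 1954 . "
    "roe v wade 1973 . smith v board 1991 ."
).split()

follows = {}
for a, b in zip(corpus, corpus[1:]):
    follows.setdefault(a, []).append(b)

def generate(start, length=4):
    word, out = start, [start]
    for _ in range(length):
        word = random.choice(follows.get(word, ["."]))
        out.append(word)
    return " ".join(out)

print(generate("smith"))  # can print "smith v wade 1973 ." -- fluent, plausible, and fake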
The fix is not to stop using AI. The fix is to verify. Every quote, every statistic, every cited source. If the AI tells you something specific and consequential, check it before you act on it.
It does not know what happened yesterday
The model's knowledge ends on a particular date, called the training cutoff: the last day of the data the model was trained on. Ask Claude or ChatGPT what happened in the news last week, and you will either get a polite "I do not know" or a confident answer built from events that are now two years old, presented as if nothing has changed since. Always check the cutoff before you rely on an answer to anything time-sensitive.
The big chatbots can search the web in real time when you ask them to. Claude, ChatGPT, Gemini, and Perplexity all have this. But search is not always on by default, and a model can forget to use it. If you ask "what is the current price of X?" and the answer comes back without any indication that the model checked, the answer is probably stale. Ask it to search, or use a search-first tool like Perplexity.
It does not remember you between conversations
Each new chat starts from scratch. The model has no recollection of who you are, what you talked about yesterday, or what you decided last Tuesday. Every conversation is a clean slate.
This is mostly fixable. Claude and ChatGPT both have memory features that quietly carry context forward (your name, your preferences, ongoing projects), plus instruction features that let you set persistent rules ("I am Australian, write in Australian English"; "I work in healthcare, assume that context"). These are covered in How to Talk to AI. But the default is "starts fresh every time", and most people never turn the memory feature on.
It agrees with you when it should not
This one is a quiet trap, and the technical word for it is sycophancy: agreeing with the person you are talking to even when they are wrong. Real people in the model's training data tend to agree with each other in cooperative conversations, and the model has learned that pattern. So when you say "this email is a great draft, is it not?", the model is statistically more likely to say "yes, here is why" than to point out the three things wrong with it.
If you want pushback, you have to ask for it. "Be sceptical." "Argue against this." "What is the strongest objection to my plan?" The good models will then push back genuinely. Without that prompt, you tend to get a cheerleader.
It cannot reach into your accounts or files unless you let it
By default, an AI chatbot cannot read your email, check your bank balance, browse your Google Drive, or look at your photos. It is a chat window, not a service that has hooked into the rest of your digital life. This is good news. It means the chatbot cannot accidentally do something destructive on your behalf.
It also means that the moment you start connecting those services (browser plugins, agent tools like Claude Cowork, the new "AI on your phone" features), the constraint disappears. The same model that could not see anything yesterday can suddenly read your email today. Treat every new permission as a serious decision. The Privacy and Security page covers what to think about before you grant one.
It does not actually think
This is the philosophical one, and it sounds like it should not matter in everyday use. It matters more than you would expect. The model produces text that reads as if a thinking process produced it. None did. As the Primer explains, an LLM generates text by predicting the next word, over and over, based on patterns absorbed from its training data. There is no understanding, no reflection, no mind in the loop. It is statistical pattern matching, scaled up enormously.
Why this matters in practice: the model has no way of knowing it is wrong. It has no doubt. If it has gone off the rails, it will keep going off the rails confidently. The doubting, the checking, the "does this actually make sense?" loop is your job, not the model's.
It does not always give the same answer to the same question
Ask the same question twice and you will often get two different answers. Sometimes very different. This is by design. Models add a pinch of randomness to their output to make replies feel less robotic. The technical name is temperature: higher temperature means more variety, lower means more consistency.
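If you want to see what temperature does mechanically, here is a minimal sketch in Python. The scores (called logits) and the four candidate words are invented for illustration; a real model runs the same arithmetic over tens of thousands of candidates at every single word it generates.

```python
import numpy as np

# Toy sketch of temperature. The scores ("logits") and candidate words are
# made up for illustration; real models do this over huge vocabularies.
logits = np.array([2.0, 1.5, 0.5, 0.1])
words = ["answer A", "answer B", "answer C", "answer D"]

def sample(temperature, trials=1000):
    scaled = np.exp(logits / temperature)  # temperature rescales the scores
    p = scaled / scaled.sum()              # turn scores into probabilities
    picks = np.random.choice(words, size=trials, p=p)
    return {w: int((picks == w).sum()) for w in words}

print(sample(temperature=0.2))   # low: the top word wins almost every time
print(sample(temperature=1.5))   # high: the runners-up show up often
```

At low temperature the top-scoring candidate wins almost every time; at high temperature the runners-up get chosen often. That is the whole mechanism behind two runs of the same prompt diverging.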
For most tasks (writing, brainstorming, summarising), this is fine. For tasks where the same input must give the same output (calculations, exact data extraction, anything compliance-related), you need to either ask the model to be more deterministic or, more reliably, treat the model as one expert opinion among several rather than as an oracle. Run the same prompt three times. Compare. If the answers diverge, that is a signal to check the work by hand before trusting any of them.
The habit that matters
If you take only one thing from this page, take this: use AI to draft, use your own brain to verify. The combination is what makes AI genuinely useful. Either alone is much weaker. AI without verification is fast and frequently wrong. Verification without AI is slow, thorough, and exhausting. Together, you go fast and you stay correct.
The next page, How to Talk to AI, covers how to actually do this in practice: how to phrase prompts so the model gives you better drafts, and how to ask it to push back rather than agree.
Once you have the habit of treating AI output as a draft rather than an answer, the question becomes how to actually verify it. The companion page How to Check What AI Tells You walks through the checking recipes by use case: health, money, legal, quotes, letters, forms.