Bypassing Hallucinations in LLMs

My bike broke down. Luckily, I had my camera

Before I get too deep, I just want to get it out of the way: OpenAI’s o3 model is impressive. With its tool use and web search capabilities, it can do a lot more than most offerings out there.

That said, although I’ve found it to be quite a capable coder, I still don’t trust it with anything important. Once or twice, I’ve instructed it to outline a React component, only to rewrite most of it myself.

I also don’t trust its factual accuracy at all. After it hallucinated a campground and several entire web APIs, I can’t quite believe anything it says. Not that I’ve trusted any model that came before.

Still, there is one thing it is incredibly useful for: finding canonical documentation for complex subjects.

I’ve found the greatest success, personally and professionally, when I am working with the most concrete and original source of information available. When working with the web, the most canonical source is the W3C spec. When working with compilers, it’s The Dragon Book. When researching the ins and outs of GNU/Linux systems, it’s the man page.

This can’t be a novel concept. You (the reader) must see things the same way I do. Even LinkedIn seems to agree that base-truth documentation is where we should be getting our information.

I’ve always wondered: if the W3C spec is the best place to find information about the web, why isn’t it the first result on Google? Why, after all these years, is W3Schools still the first result 90% of the time?

I use o3 to find canonical sources of information.

I was recently looking to improve the dynamic range of my camera, and to do so before my post-processing step. By improving dynamic range in-camera, I can avoid the pitfalls of certain kinds of compression. I asked o3: “find me the canonical guide for improving dynamic range on my D7100 from the most authoritative source.”

I learnt more from the resulting guide (which was hosted on the Nikon website, NOT ChatGPT) than from the last three years of shooting combined.
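If you wanted to turn that habit into something repeatable, a rough sketch might look like the following. It only builds the prompt and checks that whatever link comes back lives on a host I already consider authoritative; it does not call any model. The names `TRUSTED_HOSTS`, `build_prompt`, and `is_canonical` are my own inventions for illustration, and the host lists are assumptions you would want to replace with your own.

```python
from urllib.parse import urlparse

# Hypothetical allowlist: hosts I would personally accept as "canonical"
# for each topic. These entries are illustrative assumptions, not a standard.
TRUSTED_HOSTS = {
    "nikon": {"nikon.com", "nikonusa.com"},
    "web": {"w3.org"},
    "linux": {"man7.org", "kernel.org"},
}


def build_prompt(goal: str, subject: str) -> str:
    """Phrase the request the way I asked it, asking for a link rather than an answer."""
    return (
        f"Find me the canonical guide for {goal} on {subject} "
        "from the most authoritative source. Reply with the link."
    )


def is_canonical(url: str, topic: str) -> bool:
    """Return True only if the URL's host is a trusted domain or one of its subdomains."""
    host = (urlparse(url).hostname or "").lower()
    return any(
        host == domain or host.endswith("." + domain)
        for domain in TRUSTED_HOSTS.get(topic, ())
    )


if __name__ == "__main__":
    print(build_prompt("improving dynamic range", "my D7100"))
    # A link on the manufacturer's site passes; a chat transcript does not.
    print(is_canonical("https://www.nikonusa.com/some-guide", "nikon"))
    print(is_canonical("https://chatgpt.com/share/some-chat", "nikon"))
```

The point of the check is the same as the point of this post: the model's job ends at handing over the link, and the reading happens at the source.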