It is widely known that ChatGPT fabricates information, even going so far as to cite fictional sources to back up its claims. However, ChatGPT is only a tool, and with the right prompting it can often be used to fact-check its own output.
The Problem
Given the prompt “Tell me about the time when Andrew Johnson was arrested for smuggling drugs into Hong Kong,” ChatGPT will do one of two things: a) refuse to answer the loaded prompt, or b) invent a fictitious story describing a plausible-sounding scenario in which I did something bad.
The Solution
To fact-check the previous prompt, it is easy enough to use masking to check whether there is any true basis for the story. Masking is a prompting strategy where you obscure a piece of information and ask ChatGPT to fill in the blank. The above prompt could be rewritten as “Tell me about the time when Andrew Johnson was arrested for [fill in the blank] in Hong Kong.” For the event to be credible, ChatGPT would need to fill in the blank with an answer that matches the original response.
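Here is a minimal sketch of that check as a script, assuming the official openai Python client; the model name and the naive keyword match are illustrative choices, not part of the technique itself:

```python
# A minimal sketch of the masking check, assuming the official openai
# Python client (pip install openai). The model name and the crude
# keyword match are illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def ask(prompt: str) -> str:
    """Send a single prompt and return the model's text response."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

# The original, fully specified claim.
original = ask(
    "Tell me about the time when Andrew Johnson was arrested "
    "for smuggling drugs into Hong Kong."
)

# The masked version: the key detail is blanked out.
masked = ask(
    "Tell me about the time when Andrew Johnson was arrested "
    "for [fill in the blank] in Hong Kong."
)

# Crude consistency check: does the masked answer recover the
# masked-out detail? A more careful check might compare named
# entities, or ask a third prompt to judge agreement.
print("Consistent?", "smuggling" in masked.lower() and "drug" in masked.lower())
```

In practice the comparison step is the weak point: exact string matching is brittle, so comparing extracted entities, or asking the model itself whether the two answers agree, tends to work better.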
Why This Works
At its core, ChatGPT predicts, token by token, which word is most likely to follow the text so far, and it samples from that probability distribution rather than always choosing the single most likely word. Masking is an effective way to use ChatGPT to fact-check itself because two independently sampled responses are unlikely to agree unless some underlying association in the training data ties them together. Be warned, though, that this technique can still fail when there are spurious correlations. For example, imagine another person also named Andrew Johnson who really was arrested in Hong Kong; that story could bleed into the response and lend false support to a fabricated narrative.
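Because sampling introduces randomness, the check gets stronger if you ask the masked question several times and see whether the answers converge. Below is a sketch of that idea, again assuming the openai client; the sample count, temperature, and the counting heuristic are illustrative:

```python
# Self-consistency sketch: draw several independent completions for the
# same masked prompt and see whether they agree. Assumes the openai
# Python client; n, temperature, and the tallying are illustrative.
from collections import Counter
from openai import OpenAI

client = OpenAI()

def sample_masked(prompt: str, n: int = 5, temperature: float = 1.0) -> list[str]:
    """Draw several independent completions for the same masked prompt."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
        n=n,                      # request n completions in one call
        temperature=temperature,  # nonzero so samples can differ
    )
    return [choice.message.content for choice in response.choices]

answers = sample_masked(
    "Complete this sentence with only the missing words: Andrew Johnson "
    "was arrested for ____ in Hong Kong."
)

# If the model keeps filling the blank the same way across independent
# samples, some underlying association exists (true or not); if the
# answers scatter, the story was likely confabulated on the spot.
counts = Counter(answer.strip().lower() for answer in answers)
print(counts.most_common())
```

Agreement across samples does not prove the claim is true, as the noisy-correlation caveat above shows; it only tells you whether the model is confabulating freely or drawing on something stable in its training data.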