OpenAI creates CriticGPT to spot errors in its AI chatbot

28 Jun 2024

OpenAI plans to use CriticGPT to help human trainers spot mistakes and improve ChatGPT, but this new tool has some limitations.

OpenAI has come full circle by using AI models to fix AI models – the company has launched a tool to spot errors in ChatGPT’s code output.

This new tool is called CriticGPT and is designed to help human trainers improve OpenAI’s AI models – namely the GPT-4 series. These trainers are part of a process called Reinforcement Learning from Human Feedback (RLHF), which is a fancy way of saying they rate different ChatGPT responses against each other to find the best results.
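
To make the comparison idea concrete, here is a minimal Python sketch of how pairwise ratings could be tallied into a ranking. It is an illustration only, not OpenAI’s actual pipeline: the function name `rank_by_wins`, the response labels and the simple win-count rule are all assumptions made for this example.

```python
from collections import Counter


def rank_by_wins(comparisons):
    """Rank responses by how many pairwise comparisons each one won.

    `comparisons` is a list of (winner, loser) labels, one per trainer
    judgement. This is a hypothetical simplification for illustration.
    """
    wins = Counter()
    for winner, loser in comparisons:
        wins[winner] += 1
        wins[loser] += 0  # make sure every response appears in the tally
    # Most wins first; ties are broken alphabetically for a stable order.
    return sorted(wins, key=lambda label: (-wins[label], label))


# Three hypothetical responses, compared in pairs by a trainer.
comparisons = [("A", "B"), ("A", "C"), ("B", "C")]
print(rank_by_wins(comparisons))
```

In practice, RLHF systems generally use such preference data to train a reward model rather than counting wins directly, but the input is the same kind of judgement: which of two responses is better.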

OpenAI said it will integrate CriticGPT into this process to give these trainers an AI assistant when spotting errors. The company claims this model helps human trainers to write more comprehensive critiques.

“As we make advances in reasoning and model behaviour, ChatGPT becomes more accurate and its mistakes become more subtle,” OpenAI said in a blogpost. “This can make it hard for AI trainers to spot inaccuracies when they do occur, making the comparison task that powers RLHF much harder.”

Based on early testing, OpenAI claims its CriticGPT tool will be able to boost the work of human trainers. But it’s important to take a company’s claims about its own models with a pinch of salt. Earlier this year, the AI Index claimed that robust evaluations for large language models are “seriously lacking” and that there is a lack of standardisation in responsible AI reporting.

OpenAI noted some limitations of CriticGPT, one being that it was trained on ChatGPT answers that are “quite short”. The company also said this generative AI tool can hallucinate, and that trainers sometimes make mistakes after seeing these hallucinations.

“CriticGPT can only help so much: if a task or response is extremely complex even an expert with model help may not be able to correctly evaluate it,” OpenAI said.

OpenAI recently delayed an ‘advanced voice mode’ upgrade for ChatGPT and says it needs more time to prepare the feature for launch and ensure it can “detect and refuse certain content”.
