Leading AI models are lacking in transparency, report claims

19 Oct 2023


The Foundation Model Transparency Index evaluated companies such as Meta, Google and OpenAI and found low transparency scores across the board.

A new index created by US researchers suggests companies in the foundation model space are becoming less transparent about their creations.

A team of researchers from Stanford, MIT and Princeton University developed a scoring system called the Foundation Model Transparency Index, which evaluates 100 different aspects of transparency. These aspects include a company disclosing how it builds its foundation model, how the model works and how it is used downstream.

The researchers used this index to score 10 major foundation model companies, including Meta, Google and OpenAI. The result was very low transparency scores across the board.
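For illustration only, the sketch below shows how a checklist of yes/no indicators can be turned into a percentage score of the kind the index reports. The indicator names and values are hypothetical and this is not the index's actual methodology or data.

```python
# A minimal sketch: treat each transparency indicator as a yes/no check
# and report the percentage satisfied. Indicator names below are
# illustrative, not the Foundation Model Transparency Index's own.

def transparency_score(indicators: dict[str, bool]) -> float:
    """Return the percentage of transparency indicators a model satisfies."""
    if not indicators:
        return 0.0
    return 100 * sum(indicators.values()) / len(indicators)

# Hypothetical example covering how a model is built, how it works
# and how it is used downstream.
example = {
    "discloses training data sources": True,
    "reports compute used for training": False,
    "publishes downstream usage policy": True,
}

print(f"{transparency_score(example):.0f}pc")  # -> 67pc
```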

The top scorer was Meta’s AI model Llama 2, which received a transparency score of 54pc. OpenAI came third with a score of 48pc for GPT-4, while Google received a score of 40pc for its PaLM 2. Amazon scored the lowest of the 10 companies, with a score of only 12pc for its Titan Text model.

Chart: overall transparency scores of the 10 models evaluated. Image: Foundation Model Transparency Index 2023

The researchers said none of the 10 companies disclose how many users depend on their models or provide statistics on the geographies or market sectors that use their models.

“This is a pretty clear indication of how these companies compare to their competitors, and we hope will motivate them to improve their transparency,” said PhD candidate Rishi Bommasani, one of the study authors.

Stanford said less transparency makes it harder for businesses to know if they can safely build applications that rely on these foundation models. It also affects academics who use these models for research, policymakers who need to understand the technology in order to regulate it, and consumers who need to understand the models’ limitations.

“As AI technologies rapidly evolve and are rapidly adopted across industries, it is particularly important for journalists and scientists to understand their designs, and in particular the raw ingredients, or data, that powers them,” said Shayne Longpre, a PhD candidate at MIT and co-author of the study.

The researchers hope that their index will be used to inform policymakers in governments by showing them where companies are falling short.

“I think this will give them a lot of clarity about the lay of the land, what is good and bad about the status quo, and what they could potentially change with legislation and regulation,” Bommasani said.

The researchers noted that transparency is not the same as good corporate practice. For example, a company would still score transparency points in the index for disclosing that its AI model uses a lot of energy, or that it does not pay its workers a living wage.

In March, OpenAI faced criticism for keeping various details of GPT-4 private, such as the model’s architecture, hardware and training methods. The company said these details were hidden due to “both the competitive landscape and the safety implications of large-scale models like GPT-4”.


Leigh Mc Gowran is a journalist with Silicon Republic

editorial@siliconrepublic.com