Google's Gemini continues the dangerous obfuscation of AI technology

The company's lack of disclosure, while not surprising, is made more striking by one very large omission: model cards. Here's why.

Until this year, it was possible to learn a lot about artificial intelligence technology simply by reading research documentation published by Google and other AI leaders with each new program they released. Open disclosure was the norm for the AI world.

All that changed in March of this year, when OpenAI elected to announce its latest program, GPT-4, with almost no technical detail. The research paper provided by the company obscured just about every important detail of GPT-4 that would allow researchers to understand its structure and to attempt to replicate its effects.

Last week, Google continued that new approach of obfuscation, announcing the formal release of its newest generative AI program, Gemini, which was first unveiled in May and was developed in conjunction with its DeepMind unit. The Google and DeepMind researchers offered a blog post devoid of technical specifications and an accompanying technical report almost completely devoid of relevant technical details.

Much of the blog post and the technical report is devoted to a raft of benchmark scores, with Google boasting of beating OpenAI's GPT-4 on most measures and of beating its own former top neural network, PaLM.

Neither the blog post nor the technical paper includes key details that have been customary in years past, such as how many neural net "parameters," or "weights," the program has, a key aspect of its design and function. Instead, Google refers to three versions of Gemini in three different sizes: "Ultra," "Pro," and "Nano." The paper does disclose that Nano is trained with two different weight counts, 1.8 billion and 3.25 billion, but it fails to disclose the weight counts of the other two sizes.

Numerous other technical details are absent, just as with the GPT-4 technical paper from OpenAI. In the absence of technical details, online debate has focused on whether the boasting of benchmarks means anything.

OpenAI researcher Rowan Zellers wrote on X (formerly Twitter) that Gemini is "super impressive," and added, "I also don't have a good sense on how much to trust the dozen or so text benchmarks that all the LLM papers report on these days."

😂 in a more serious note though --

the gemini model is super impressive (can't wait to play with the new multimodality aspects!)
I also don't have a good sense on how much to trust the dozen or so text benchmarks that all the LLM papers report on these days 😀

— Rowan Zellers (@rown) December 7, 2023

Kyle Wiggers of tech news site TechCrunch reports anecdotes of poor performance by Google's Bard chatbot, which is now enhanced by Gemini. He cites posts on X by people who asked Bard questions such as movie trivia or vocabulary suggestions and reported its failures.

Mud-slinging is a common phenomenon in the introduction of a new technology or product. In the past, however, technical details allowed outsiders to make a better-informed assessment of capabilities by examining the technical differences between the latest program and its immediate predecessors, such as PaLM.

For lack of such information, assessments are being made in hit-or-miss fashion, by people randomly typing things into Bard.

The sudden swing to secrecy by Google and OpenAI is becoming a major ethical issue for the tech industry because no one outside OpenAI and its partner Microsoft knows what is going on in the black box in their computing cloud.

In October, Emanuele La Malfa of the University of Oxford, together with collaborators at The Alan Turing Institute and the University of Leeds, warned that the obscurity of GPT-4 and other models "causes a significant problem" for AI and society, namely that "the most potent and risky models are also the most difficult to analyze."

Google's lack of disclosure, while not surprising given its commercial battle for market share with OpenAI and OpenAI's partner Microsoft, is made more striking by one very large omission: model cards.

Model cards are a form of standard disclosure used in AI to report on the details of neural networks, including potential harms of the program (hate speech, etc.). While the GPT-4 report from OpenAI omitted most details, it at least made a nod to model cards with a "GPT-4 System Card" section in the paper, which it said was inspired by model cards.

Google doesn't even go that far, omitting anything resembling model cards. The omission is particularly strange given that model cards were invented at Google, by a team that included Margaret Mitchell, formerly co-lead of Ethical AI at Google, and former co-lead Timnit Gebru.

Instead of model cards, the report offers a brief, rather bizarre passage about the deployment of the program with vague language about having model cards at some point:

Following the completion of reviews, model cards ?? [emphasis Google's] for each approved Gemini model are created for structured and consistent internal documentation of critical performance and responsibility metrics as well as to inform appropriate external communication of these metrics over time.

If Google puts question marks next to model cards in its own technical disclosure, one has to wonder what the future of oversight and safety is for neural networks.

