Weeks after the launch of Gemini 2.5 Pro—Google’s most advanced AI model to date—the company released a technical report on its internal safety evaluations. But according to experts, the report offers limited detail, leaving key questions about potential risks unanswered.
Technical reports are typically seen as vital tools for transparency, offering insights—both positive and critical—about AI model performance and safety. While many in the AI community view these reports as good-faith contributions to independent oversight and research, Google’s latest publication falls short of expectations.
Unlike some of its competitors, Google only releases technical reports once a model has moved beyond the “experimental” phase. Even then, the company does not include the full findings from its evaluations of potentially “dangerous capabilities,” reserving those details for a separate internal audit.
The sparse content of the Gemini 2.5 Pro report has sparked concern among researchers. Critics pointed out the report’s lack of information about Google’s Frontier Safety Framework (FSF)—a set of guidelines introduced last year to identify AI capabilities that could pose serious harm.
“This report is extremely limited, offers minimal substance, and was released weeks after the model went public,” said Peter Wildeford, co-founder of the Institute for AI Policy and Strategy. “There’s no way to verify whether Google is following through on its public safety commitments.”
Thomas Woodside, co-founder of the Secure AI Project, voiced similar concerns. While he acknowledged Google’s decision to publish the Gemini 2.5 Pro report, he questioned the company’s consistency in sharing timely and comprehensive safety updates. He noted that the last time Google released results from dangerous capability testing was in June 2024—for a model launched four months prior.
Adding to the uncertainty, Google has yet to release a report for Gemini 2.5 Flash, a smaller and more efficient model introduced just last week. A company spokesperson told TechCrunch that a report is “coming soon.”
“I hope this signals a shift toward more frequent and transparent reporting,” said Woodside. “Ideally, these updates should also include evaluations of models not yet released to the public, since they may also pose significant risks.”
Although Google was one of the early advocates for standardized AI model documentation, it’s not the only major player drawing criticism for lack of transparency. Meta’s recent safety analysis for its Llama 4 models was also light on detail, and OpenAI has yet to publish any report on its GPT-4.1 series.
This lack of consistency raises questions about industry standards—especially as companies face increased regulatory scrutiny. Two years ago, Google pledged to the U.S. government that it would publish safety evaluations for all “significant” public AI models. Similar commitments were made internationally, promising transparency around AI product development.
Kevin Bankston, senior adviser on AI governance at the Center for Democracy and Technology, described the trend of vague and irregular reporting as a troubling sign.
“With reports that companies like OpenAI are shortening their safety review timelines from months to mere days, Google’s limited documentation for its flagship AI model is part of a broader ‘race to the bottom’ on safety and transparency,” Bankston said.
Google, for its part, maintains that it conducts thorough safety testing and “adversarial red teaming” before releasing new models, even if those efforts are not fully captured in its technical reports.