OpenAI announced a significant update to its internal safety policies, signaling that it may adjust its safeguard requirements if a rival AI lab releases a “high-risk” model without comparable safety measures in place. The change, disclosed in an update to the company’s Preparedness Framework, underscores the intensifying competitive pressure among leading AI labs.
The revised guidelines detail how OpenAI evaluates AI models for risk and determines the safeguards required during their development and deployment. In a statement, the company said it would not make any such policy change without first rigorously assessing whether the overall risk landscape had actually shifted, and it pledged to acknowledge any adjustment publicly and to keep its safeguards at a level that remains appropriately protective.
OpenAI’s latest update also signals a greater reliance on automated evaluations to support a faster product cadence. While the company says it has not abandoned human-led safety testing, it is building a growing suite of automated evaluations that it says allow it to release models more quickly.
However, recent reports suggest that OpenAI has sharply accelerated its release schedule and, according to critics, may be cutting corners on safety testing. The Financial Times, citing sources familiar with the matter, reported that OpenAI’s safety testers were given less than a week to evaluate a major forthcoming model, a much shorter window than for past releases. The same reports claimed that many safety evaluations were run on earlier versions of models rather than on the versions ultimately deployed to the public. OpenAI has disputed these claims, denying that it has compromised the quality or rigor of its safety procedures.
Other important changes in the Preparedness Framework concern how models are categorized by risk. OpenAI now defines two thresholds: “high capability,” for models that could amplify existing pathways to severe harm, and “critical capability,” for systems that could introduce entirely new pathways to severe harm. According to the company, models that reach high capability must have safeguards that sufficiently minimize the associated risk before they are deployed, while models that reach critical capability also require such safeguards during development.
OpenAI’s Preparedness Framework was last revised in 2023, when the company laid out clearer guidelines for identifying and managing catastrophic risks from its models. The current update marks another significant step, explicitly acknowledging competitive dynamics as a factor that could shape OpenAI’s safeguard policies going forward.