Anthropic Proposes Regulatory Framework for Frontier AI Models

CoinMarketCap reports:

Anthropic has released a policy framework for advanced AI models. The company states that existing transparency rules are struggling to keep pace with the rapid advancement of model capabilities, and that governments need clearer authority to intervene before high-risk systems reach the public market.

Regulated entities are classified by compute and revenue

The proposal is divided into two parts: one covers the technical and regulatory requirements for the most advanced models, and the other addresses questions of economic distribution under automation. Based on the disclosed information, the first part is clearly the more comprehensive of the two.

Anthropic limits the regulatory scope to a small number of leading developers rather than the entire industry. The company proposes that the framework cover models whose training requires more than 10²⁵ floating-point operations, as well as companies with annual AI-related revenue exceeding $500 million or AI research and development expenditures exceeding $1 billion.

This design aims to focus on the most resource-rich and capable models, avoiding subjecting small and medium-sized developers and research institutions to the same level of regulation.
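
The scoping rule described above can be illustrated with a minimal sketch. It assumes the three reported thresholds are alternatives, so that exceeding any one of them brings a developer into scope; the proposal itself may combine them differently. The `Developer` record, its field names, and the `in_scope` function are hypothetical and do not come from Anthropic's document.

```python
from dataclasses import dataclass

# Thresholds as reported in the article. How they combine is an assumption.
TRAINING_FLOP_THRESHOLD = 1e25            # floating-point operations used in training
AI_REVENUE_THRESHOLD_USD = 500_000_000    # annual AI-related revenue
AI_RND_THRESHOLD_USD = 1_000_000_000      # annual AI research and development spending


@dataclass
class Developer:
    """Hypothetical record for illustration; not defined in the proposal."""
    name: str
    largest_training_run_flop: float
    annual_ai_revenue_usd: float
    annual_ai_rnd_spend_usd: float


def in_scope(dev: Developer) -> bool:
    """Return True if the developer strictly exceeds any one reported threshold."""
    return (
        dev.largest_training_run_flop > TRAINING_FLOP_THRESHOLD
        or dev.annual_ai_revenue_usd > AI_REVENUE_THRESHOLD_USD
        or dev.annual_ai_rnd_spend_usd > AI_RND_THRESHOLD_USD
    )


if __name__ == "__main__":
    # Made-up figures, chosen only to show both outcomes.
    frontier_lab = Developer("frontier-lab", 3e25, 0.0, 0.0)
    university_group = Developer("university-group", 1e23, 2_000_000, 5_000_000)
    for dev in (frontier_lab, university_group):
        print(dev.name, "in scope" if in_scope(dev) else "out of scope")
```

Under this reading, a small lab or research group that stays below all three figures falls outside the framework, which matches the stated aim of sparing small and medium-sized developers.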

Advocates government authority to block high-risk releases

The company stated that the most critical change is granting the government legal authority to block or restrict the deployment of high-risk models. Currently, the United States lacks a comprehensive mechanism for stopping models before they are publicly released.

Specifically, frontier model developers must complete testing before release and publicly disclose a test summary, safety framework, and system card detailing the model’s behavioral performance and risk mitigation approaches. Companies must also submit periodic risk reports disclosing their overall risk posture and progress on safety initiatives.

Anthropic also advocates for the introduction of independent evaluation agencies to review tests conducted by companies themselves and to issue separate model risk assessments. This would ensure that regulators and the public no longer rely solely on companies’ self-reported information.

Penalties and security requirements strengthened in parallel

In terms of enforcement design, Anthropic recommends linking civil penalties to a company’s global annual revenue rather than imposing fixed fines. The company believes that only this approach will create meaningful deterrence for large AI enterprises. Penalties should be further increased for repeat offenders.

In addition to testing and disclosure, the framework would require companies to establish stronger security systems to protect model weights and training systems from external attacks and internal misuse. Companies may publicly outline the general structure of their security plan, with more detailed information provided upon request by government agencies.

Anthropic also proposed that governments and the industry jointly establish standards for independent evaluators and ensure these evaluators receive adequate funding and necessary access. Since frontier models are typically among a company’s most sensitive assets, determining who conducts the evaluations and how access is granted will be one of the key challenges in implementation.

Four main types of risk identified

Anthropic identifies four key risk categories in its document: biological risks, cybersecurity risks, loss-of-control risks, and the risk of AI autonomously accelerating its own development. The company believes these risks are not isolated and may amplify one another.

For example, models capable of discovering software vulnerabilities at scale could directly affect critical infrastructure such as hospitals and energy networks; without sufficient constraints, such capabilities may also compound biological risks.

Regarding supporting measures, Anthropic recommends strengthening internet and critical infrastructure protection, promoting the replacement of legacy systems in essential services, and establishing dedicated government functions to continuously monitor changes in the cyber capabilities of advanced AI. Regarding risks of loss of control and automated development, the company acknowledges that related governance tools are still immature and that further improvements are needed in detecting, isolating, and shutting down unsafe systems.

Anthropic also stated in the document that existing transparency regulations in California, New York, and elsewhere have had some effect, but that public disclosure alone is insufficient to address the risks posed by the rapid iteration of advanced models.