AI-generated code may pass syntax checks, testing, and PR reviews, yet still harbor hidden business-semantic risks that bypass all security checks and ultimately cause real financial losses. Peking University’s Narwhal-Lab has open-sourced the Narwhal AI Code Risks project, systematically compiling real-world risk cases involving AI-generated code, including a prominent incident where all 28 checks passed but a pricing semantic error led to a $1.78 million loss. The project categorizes risks into three types (real incidents, early signals, and typical scenarios) and covers seven domains including supply chain, code vulnerabilities, cloud configurations, and Agent risks.
Article author and source: AI New Era
In 2026, code is being generated faster and faster, yet deployed with less and less scrutiny.
More and more often, a user types a request into the chat box; the AI reads the context, completes the function, pulls in dependencies, adjusts configuration, and even generates tests along the way.
Before you know it, a piece of code is already sitting in the repository, waiting to be merged.
Users have already developed new habits: they let AI generate the code first and run it, then check where adjustments are needed.
But in the software world, the most dangerous code is often the most ordinary-looking: syntactically correct, with valid interfaces, passing tests, and thorough comments.
Such code may still introduce non-existent package names, grant overly broad permissions, or expose databases. It may even allow an Agent with direct access to system tools, under prompt injection, to exfiltrate sensitive data from internal systems.
What’s truly dangerous isn’t the red warning light flashing. It’s when all risk indicators show normal.
The risks of AI-generated code were previously scattered everywhere: one case hidden in a security blog, a clue recorded in an issue. When the next team encounters the same problem, they must once again piece together the source of the risk and spend significant time and effort conducting large-scale empirical measurements of the code.
The Narwhal-Lab at Peking University has just open-sourced Narwhal AI Code Risks, which has organized the information fragments into three categories—real-world events, early signals, and typical risk pathways—for researchers to review.

Project link:
https://github.com/Narwhal-Lab/Narwhal-aicode-risks
When all 28 checks pass, the system is still off course
The first clue was a merged pull request, with the signature line prominently listing Claude Opus 4.6 and Copilot, along with four human developers. All 28 checks passed: no one noticed anything amiss.
Then, over several minutes, the liquidation bot removed collateral worth $1,778,044.83.
In the configuration, the price of cbETH was set to its conversion ratio against ETH, approximately 1.12, so the system treated it as about $1.12 rather than its actual price of around $2,200.
A price semantics error slipped through the development, review, and merge process, ultimately resulting in real financial losses within the system. This is the most glaring aspect of the Moonwell cbETH oracle configuration incident.
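The gap between a ratio and a dollar price can be made concrete with a small sketch. It is not Moonwell's code or any real oracle API: `PriceFeed`, `to_usd`, `is_plausible`, and the `max_factor` threshold are hypothetical names chosen for illustration, and the ETH price is simply the value implied by the article's two round figures, not a market quote. The idea is that a feed carries its unit, and a value is compared with an independent reference before it is accepted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PriceFeed:
    """Hypothetical feed description: a number plus the unit it is quoted in."""
    value: float
    quote_unit: str  # "USD" or "ETH" in this sketch


def to_usd(feed: PriceFeed, eth_usd: float) -> float:
    """Convert a feed value to USD, refusing units this sketch does not know."""
    if feed.quote_unit == "USD":
        return feed.value
    if feed.quote_unit == "ETH":
        return feed.value * eth_usd
    raise ValueError(f"unknown quote unit: {feed.quote_unit}")


def is_plausible(candidate_usd: float, reference_usd: float,
                 max_factor: float = 2.0) -> bool:
    """True when candidate is within max_factor of reference, in either direction."""
    if candidate_usd <= 0 or reference_usd <= 0:
        return False
    ratio = candidate_usd / reference_usd
    return 1 / max_factor <= ratio <= max_factor


# Round figures taken from the article; ETH_USD is only what they imply.
REFERENCE_USD = 2200.0
RATIO = 1.12
ETH_USD = REFERENCE_USD / RATIO

mislabelled = PriceFeed(RATIO, "USD")  # ratio treated as a dollar price
composed = PriceFeed(RATIO, "ETH")     # ratio treated as an amount of ETH

print(is_plausible(to_usd(mislabelled, ETH_USD), REFERENCE_USD))  # False
print(is_plausible(to_usd(composed, ETH_USD), REFERENCE_USD))     # True
```

A check of this kind is semantic rather than syntactic: both feeds are valid numbers and both would pass a type check, and only the comparison against a reference value separates them.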
The trouble is that the code had no syntax errors, and the human developers did not step in to halt the process. On the contrary, everything appeared complete and smooth: a normal engineering delivery.
But it is precisely this surface normalcy that makes it a classic example of a security incident.

The risk of AI coding is that it doesn't always manifest as errors.
Often, it slips quietly into the engineering process dressed as the correct solution—code runs, checks pass, PRs get merged—but the business semantics have drifted away from reality.
In low-risk projects, this semantic deviation may only result in one round of rework; however, in sensitive scenarios such as finance or enterprise data systems, it can directly lead to data leaks, exposure of permissions, and asset loss.
When AI is involved in writing code, modifying configurations, conducting reviews, or even co-authoring pull requests, do we have sufficient confidence to understand how every deviation occurs?
Green lights cannot reach every corner
Early AI helped with code by offering only local completions. If the syntax was incorrect, the compiler would throw an error, unit tests would fail, and the CI pipeline would reject it.
Today, AI coding has gone further, but oversight has yet to catch up.
It can read files, modify configurations, install dependencies, generate infrastructure scripts, and autonomously plan across multiple tasks via an Agent.
AI no longer just sits beside the developer handing over tools; it is beginning to enter longer workflows in software engineering.
The originally clear boundaries in software engineering have been reconnected by AI agents into a longer, harder-to-trace path.

Scattered records need a public logbook
Security incidents rarely have complete conclusions from the start. Some incidents have sufficient evidence to be included in the catalog as verified cases; others remain at the stage of community screenshots, researcher discussions, or preliminary disclosures and are better suited for continued monitoring; still others are not tied to any single real event but have already formed a clear pattern, which makes them suitable for working through risks in advance.
Narwhal AI Code Risks divides the materials into three layers: `cases/`, `inferred/`, and `scenarios/`.
`cases/`: documented real-world events supported by publicly available sources and evidence chains.
`inferred/`: early signals that are not yet fully confirmed but warrant ongoing monitoring.
`scenarios/`: typical scenarios not tied to a single event but with sufficiently clear risk pathways.
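One way to read this three-layer rule is as a small decision function. The sketch below is only an interpretation of the description above: the `Entry` fields and the `assign_layer` function are hypothetical, and the repository's actual triage criteria may be richer than two yes/no questions.

```python
from dataclasses import dataclass
from enum import Enum


class Layer(Enum):
    """The three directories named in the article."""
    CASES = "cases/"
    INFERRED = "inferred/"
    SCENARIOS = "scenarios/"


@dataclass(frozen=True)
class Entry:
    """Hypothetical record; the field names are assumptions for illustration."""
    title: str
    tied_to_single_event: bool
    has_public_evidence_chain: bool


def assign_layer(entry: Entry) -> Layer:
    """Map an entry to a layer following the article's description."""
    if not entry.tied_to_single_event:
        # A recurring pattern rather than one incident.
        return Layer.SCENARIOS
    if entry.has_public_evidence_chain:
        # A real event backed by public sources.
        return Layer.CASES
    # A real-looking signal that is not yet confirmed.
    return Layer.INFERRED


entry = Entry("oracle price configured as a ratio", True, True)
print(assign_layer(entry).value)  # cases/
```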

Without such a public record, the risks of AI coding can easily become short-term memory on the internet.
Today everyone remembers a certain package name, tomorrow they discuss a certain data breach, and months later they’re swept up by the next wave of tooling trends. When similar issues arise again, the team still stumbles blindly into uncharted risky waters.
What Narwhal AI Code Risks does is pin down these scattered risk fragments so that everyone can work from the same page.
Follow the seven-category index to see where risk comes from
The problems brought by AI-generated code extend beyond the code itself—they exist in dependencies, permissions, tool calls by agents, and even in how humans trust AI outputs.
Narwhal AI Code Risks currently categorizes risks into seven types: supply chain, code-level vulnerabilities, cloud and infrastructure configuration, Agent risks, vertical domain risks, intellectual property and compliance risks, and human factors.

In supply chain risks, AI may recommend non-existent dependencies. In code-level vulnerabilities, AI may reintroduce path traversal, missing input validation, and authentication issues into business code. In cloud and infrastructure configurations, AI may grant overly broad permissions, make storage buckets public, or expose ports just to get the code running. Agent risks are even more complex, as AI goes beyond generating text and begins executing actions. AI-generated outputs are planting hidden dangers in real systems.
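For the supply chain category, a minimal sketch of a pre-merge check might compare the package names in a requirements file against a list the team has already vetted. Everything here is an assumption for illustration: the function names, the vetted set, and the misspelled package `fast-json-utilz` are invented, and the parser handles only simple `name` plus version-specifier lines, not URLs or other advanced requirement syntax. The name normalization follows the rule in PEP 503.

```python
import re

# Leading run of characters that can form a distribution name.
_NAME = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)")


def normalize(name: str) -> str:
    """PEP 503 normalization: lowercase, runs of '-', '_', '.' become '-'."""
    return re.sub(r"[-_.]+", "-", name).lower()


def unknown_dependencies(requirements_text: str, vetted: set[str]) -> list[str]:
    """Return requirement names that are not in the vetted set."""
    vetted_norm = {normalize(v) for v in vetted}
    unknown = []
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        match = _NAME.match(line)
        if match and normalize(match.group(1)) not in vetted_norm:
            unknown.append(match.group(1))
    return unknown


requirements = """\
requests==2.31.0
Flask_Login>=0.6  # auth
fast-json-utilz

-r base.txt
"""
print(unknown_dependencies(requirements, {"requests", "flask-login"}))
# ['fast-json-utilz']
```

A name that fails this check is not necessarily malicious. It is simply one that nobody on the team has looked at yet, which is the moment a non-existent or look-alike package is cheapest to catch.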
The AI engine is firing up, and the logbook has only just been opened
As AI gradually integrates into the real world, risk prevention and mitigation should not be limited to post-event reviews or scattered discussions.
The crucial part of Narwhal AI Code Risks is turning risk cases into reusable knowledge.
Developers can use it to identify similar issues; security researchers can treat it as a sample library; tool vendors can extract detection rules and evaluation benchmarks from it; and the open-source community can continue to add new cases, new evidence, and new risk types.
The AI engine is roaring, and every deviation should leave a coordinate. Risk never disappears because it’s ignored, but experience can be recorded and shared. What’s truly valuable isn’t discovering a single vulnerability, but ensuring others don’t have to step into the same trap.
Narwhal AI Code Risks is creating an open-source logbook for the software world in the year of AI applications.
