Foreign media commentary suggests that the debate surrounding AI programming agents is shifting from “whether they can improve efficiency” to “whether they will compromise software quality.” Recently, hacker George Hotz, who cracked the original iPhone jailbreak and PlayStation 3, wrote that the widespread adoption of such tools by the software industry could prove to be an extremely costly misstep for the field.
Issued a negative assessment after six months of real-world testing
Hotz stated that he is not criticizing from the sidelines. Over the past six months, he has consistently used AI agents in real projects, including partial development of his open-source deep learning framework, tinygrad, and a complete reverse engineering of a USB-PCIe chip firmware.
His conclusion is that such tools often show rapid progress early on, but become increasingly difficult to finalize as time goes on. Although the model’s outputs appear more polished on the surface, the real issues are harder to detect in a timely manner. According to him, developers still end up frequently manually fixing the results.
The disagreement is not about efficiency, but about who bears the cost.
The article argues that the real risk is not whether a single output is incorrect, but whether organizational quality control may fail. Hotz’s key insight is that more skilled engineers typically can still read generated code, identify vulnerabilities, and decide when to trust the tool; however, less skilled engineers may not possess the same ability to verify it.
If the latter uses proxies to scale output to several times what it was in the past, the team’s apparent efficiency may rise, but average code quality will decline more rapidly—and this decline will be masked by higher submission volumes. Hotz warns that, as a result, the industry may soon be flooded with code that “appears to work but is riddled with issues.”
In stark contrast to Karpathy
Shortly before this article was published, AI researcher Andrej Karpathy joined Anthropic’s pretraining team. The report notes that Karpathy’s attitude toward AI agents has shifted this year, as he now believes that next-generation models have significantly transformed software development.
Anthropic’s CEO Dario Amodei previously stated that some of the company’s engineers have reduced the amount of code they write manually, instead relying on models to generate code that is then reviewed by humans. Hotz, however, had the opposite experience: he tried a similar process but ended up having to fix nearly every output himself.
As "vibe coding" has rapidly gained popularity over the past year, major AI companies have made agent-based programming a key focus. Microsoft has already advanced GitHub Copilot toward a more comprehensive agent-based system, describing this shift as a platform-level transformation.
Hotz argues that the issue isn't whether programmers are worried about being replaced, but whether companies will roll out tools too rapidly under competitive pressure. He specifically notes that if large companies uniformly implement AI coding tools across their entire engineering teams, software quality over the next two years may not improve as a result.
