Microsoft study: AI still struggles with debugging code

AI Code Assistants Are Growing — But Still Far from Replacing Developers

Artificial intelligence models are increasingly embedded in software development pipelines, assisting with everything from code generation to documentation. Tech leaders such as Google CEO Sundar Pichai and Meta CEO Mark Zuckerberg have praised the growing role of these tools, and Google now attributes 25% of its new code to AI.

But a new Microsoft Research study presents a more grounded perspective: today’s top AI models still fall short when it comes to debugging, especially in scenarios that experienced human developers would navigate with ease.

Microsoft Tests AI Agents on SWE-bench Lite

In one of the most comprehensive studies of its kind, researchers at Microsoft Research evaluated nine AI models, including Anthropic’s Claude 3.7 Sonnet and OpenAI’s o3-mini, using a coding benchmark called SWE-bench Lite, a curated set of 300 software debugging tasks.

Each model was used to power a prompt-based AI agent, which had access to typical debugging tools, including a Python debugger. Despite this, performance lagged far behind expectations.

  • Anthropic’s Claude 3.7 Sonnet achieved the highest success rate, at 48.4%

  • OpenAI’s o1 followed at 30.2%

  • OpenAI’s o3-mini came in at 22.1%

Even the best-performing model, Claude 3.7 Sonnet, left more than half of the tasks (51.6%) unresolved, revealing critical limitations in how current AI models reason through code and resolve software issues.
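
To make "access to a Python debugger" more concrete, here is a minimal sketch of a program driving Python's built-in pdb with scripted commands instead of a keyboard. It is an illustration only, not the setup used in the Microsoft study: the function buggy_mean, the helper run_with_debugger, and the command sequence are all hypothetical.

```python
import io
import pdb


def buggy_mean(values):
    """Hypothetical buggy function: divides by the wrong count."""
    total = sum(values)
    count = len(values) - 1  # bug: should be len(values)
    return total / count


def run_with_debugger(commands, func, *args):
    """Run func under pdb, feeding it scripted commands (hypothetical helper)."""
    stdin = io.StringIO("\n".join(commands) + "\n")
    stdout = io.StringIO()
    # Passing stdin/stdout makes pdb read commands from the script and
    # write its output to a buffer instead of the terminal.
    debugger = pdb.Pdb(stdin=stdin, stdout=stdout, nosigint=True, readrc=False)
    result = debugger.runcall(func, *args)
    return result, stdout.getvalue()


# An agent would choose these commands one at a time; here they are fixed.
result, transcript = run_with_debugger(
    ["p values", "return", "p total, count", "continue"],
    buggy_mean,
    [2, 4, 6],
)
print(transcript)
print("returned:", result)
```

The transcript should show the input list and the values of total and count just before the function returns, which is the kind of evidence a developer would use to spot the off-by-one divisor. Choosing which command to issue next, and when, is the tool-usage skill the study examines.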


Why Models Struggle to Debug

The research team identified two core issues limiting model performance:

  1. Tool usage inefficiency: AI agents often failed to apply available debugging tools effectively or understand which tool was most appropriate for a given problem.

  2. Lack of training data: Specifically, the models lacked “sequential decision-making” training examples — data that captures the step-by-step thought process human developers follow when diagnosing and fixing bugs.

“We strongly believe that training or fine-tuning [models] can make them better interactive debuggers,” the study's co-authors noted, calling for specialized datasets that record developer–debugger interactions.
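
As a rough illustration of what a record of developer–debugger interaction could look like, the sketch below stores a debugging session as an ordered list of actions and observations. The class names, field names, and sample values are assumptions made for this example; the study does not prescribe a format here.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class DebugStep:
    action: str       # command issued to the debugger or another tool
    observation: str  # what the tool printed back


@dataclass
class DebugTrajectory:
    task_id: str                               # hypothetical task identifier
    steps: list = field(default_factory=list)  # ordered DebugStep records
    resolved: bool = False                     # whether the bug was fixed

    def record(self, action, observation):
        """Append one action/observation pair, preserving order."""
        self.steps.append(DebugStep(action, observation))

    def to_json(self):
        """Serialize the whole session for use as a training example."""
        return json.dumps(asdict(self))


# Sample session with made-up values.
trajectory = DebugTrajectory(task_id="example-task-1")
trajectory.record("p values", "[2, 4, 6]")
trajectory.record("p total, count", "(12, 2)")
trajectory.resolved = True
print(trajectory.to_json())
```

Keeping the steps in order is the point: the sequence of decisions, not only the final patch, is what this kind of data would capture.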

Broader Concerns Around AI Coding Tools

The findings reinforce long-standing concerns about AI-generated code quality. Previous research has shown that code written by AI models often contains security flaws, logical errors, or incorrect implementations. A separate study evaluating Devin, a popular AI coding assistant, found it passed only 3 out of 20 programming tests.

Despite these limitations, demand for AI-powered coding tools remains high. The market is driven by potential productivity gains and growing investment, even as developers and CTOs wrestle with the trade-off between speed and reliability.

Developers Still Matter — And Always Will, Say Tech Leaders

While AI continues to evolve, industry veterans emphasize that it will augment — not replace — software engineers.

Bill Gates, Microsoft’s co-founder, has publicly stated that programming is here to stay. The sentiment is echoed by Replit CEO Amjad Masad, Okta CEO Todd McKinnon, and IBM CEO Arvind Krishna, who each argue that AI tools will empower developers rather than eliminate their roles.

As AI coding tools become more capable, the Microsoft study serves as a timely reminder: real-world software development involves more than generating code snippets. It requires context, judgment, and sequential reasoning — qualities AI has yet to master.

Stay with The Horizons Times for in-depth analysis of AI advancements, developer tools, and the future of software engineering in the age of automation.
