Analysis of LLMs in Debugging and Software Modification

This website serves as the official presentation of my thesis, which conducts an analysis of the capabilities of Large Language Models (LLMs) in real-world software bugfixing.

Objective

The core question of this project is: How do the bug-fixing capabilities of modern LLMs compare to those of human developers when applied to established, open-source software projects?

To answer this, I developed a semi-automated Python framework that systematically benchmarks LLM performance against human-written solutions. The tool builds a curated corpus of real-world bug fixes, provides the LLM with the same context a human developer would have (the bug report and the relevant code), and then uses the project's own test suite to objectively validate the correctness of the generated fix.
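The validation step described above can be sketched in a few lines of Python. This is a minimal illustration, not the thesis framework itself: the names `BugCase` and `validate_fix` are hypothetical, and a trivially passing command stands in for a real project's test suite.

```python
import subprocess
from dataclasses import dataclass


@dataclass
class BugCase:
    """One corpus entry: the bug report plus the context given to the LLM."""
    report: str            # bug report text, as a human developer would see it
    buggy_code: str        # the relevant source code containing the bug
    test_command: list     # the project's own test-suite invocation


def validate_fix(case: BugCase) -> bool:
    """Run the project's test suite after applying a candidate fix.

    The fix counts as correct only if the suite exits with status 0,
    mirroring the objective, test-based validation described above.
    """
    result = subprocess.run(case.test_command, capture_output=True, text=True)
    return result.returncode == 0


# Hypothetical usage: a one-liner stands in for a real test suite.
case = BugCase(
    report="mean() raises ZeroDivisionError on an empty list",
    buggy_code="def mean(xs): return sum(xs) / len(xs)",
    test_command=["python", "-c", "assert True"],
)
print(validate_fix(case))
```

In the real framework the test command would point at the checked-out project with the LLM's patch applied, so both human and model fixes are judged by exactly the same criterion.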

Project Resources

  • The Live Analysis Notebook: You can explore the full data analysis in the Analysis section of this site. For an interactive, executable version, please use the “Open in Colab” badge at the top of the analysis page.

  • The Source Code: The complete source code for the analysis tool itself is hosted in a shared academic repository. You can view it here: