Machine learning is a champ at fixing buggy code



It can fix 10 times as many errors as other systems can, MIT researchers say

Here's yet another new application of machine learning: MIT has developed a system for fixing errors in bug-riddled code.

The new machine-learning system developed by researchers at MIT can fix roughly 10 times as many errors as its predecessors could, the researchers say. They presented a paper describing the new system, dubbed "Prophet," at the Principles of Programming Languages symposium last month.

Essentially, the system works by studying patches already made to open-source computer programs in order to learn the general properties of those patches. Prophet was given 777 errors and their fixes from eight common open-source applications hosted on GitHub.

The system then applies that knowledge to produce new repairs for new bugs in a different set of programs.

Fan Long, a graduate student in electrical engineering and computer science and a co-author of the paper, had already developed an algorithm that attempts to repair program bugs by systematically modifying program code. The only problem was that it could take a prohibitively long time.

The new machine-learning system works in conjunction with that earlier algorithm but ranks possible patches according to the probability that they are correct before subjecting them to time-consuming tests.
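To make the rank-then-test idea concrete, here is a minimal Python sketch. It is not Prophet's actual code: the `Candidate` class, its `score` field, and the `passes_tests` callback are hypothetical stand-ins for the learned probability and the time-consuming test run the article describes.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional, Tuple


@dataclass(frozen=True)
class Candidate:
    """Hypothetical candidate patch (illustrative, not Prophet's data model)."""
    description: str  # human-readable label for the patch
    score: float      # assumed: a learned estimate that the patch is correct


def first_validated_patch(
    candidates: Iterable[Candidate],
    passes_tests: Callable[[Candidate], bool],
) -> Tuple[Optional[Candidate], int]:
    """Test candidates from most to least likely to be correct.

    Returns the first candidate that passes, plus the number of
    (expensive) test runs that were needed to find it.
    """
    ranked = sorted(candidates, key=lambda c: c.score, reverse=True)
    for tests_run, candidate in enumerate(ranked, start=1):
        if passes_tests(candidate):
            return candidate, tests_run
    return None, len(ranked)


if __name__ == "__main__":
    # Made-up candidates and scores, purely for illustration.
    pool = [
        Candidate("delete the failing branch", 0.10),
        Candidate("add null check", 0.70),
        Candidate("swap operands", 0.20),
    ]
    # Stand-in for running the program's test suite against a patch.
    patch, tests_run = first_validated_patch(
        pool, lambda c: c.description == "add null check"
    )
    print(patch.description, tests_run)
```

In this toy run the correct patch carries the highest score, so it is tested first. If the candidates were tried in their original order, one failing patch would be tested before it.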

The researchers tested the system on a set of 69 real-world errors that had cropped up in eight popular open-source programs. Where earlier bug-repair systems were able to repair one or two of the bugs, the new system repaired between 15 and 18, depending on whether it settled on the first solution it found or was allowed to run longer.

That's certainly useful, but the implications could be even bigger, according to Martin Rinard, a professor of electrical engineering and computer science who is also a co-author on the paper.

“One of the most intriguing aspects of this research is that we’ve found that there are indeed universal properties of correct code that you can learn from one set of applications and apply to another set of applications,” Rinard explained. “If you can recognize correct code, that has enormous implications across all software engineering. This is just the first application of what we hope will be a brand-new, fabulous technique.”
