Fujitsu Laboratories Introduces AI Based Automatic Patch Generation Technology
Enhances efficiency of business application software development by learning from a corpus of all archived bug reports and bug patches
Fujitsu Laboratories of America, Inc., Fujitsu Laboratories Ltd.
Currently, debugging and patching of software bugs is performed mainly by hand and consumes a disproportionately large fraction of software development resources. This is a significant drag on the growing influence and penetration of software applications, and it calls for automated solutions. However, conventional automatic patch generation techniques have limited scope because they cannot handle bugs that have a large number of candidate patches.
The announced technology uses machine learning to accurately rank the space of candidate patches, prioritizing the most relevant candidates, and thereby enabling the generation of the correct patches. Therefore, bugs outside the scope of conventional auto patching techniques can now be patched in minutes, rather than hours or days of manual patching.
Fujitsu Laboratories of America, Inc. (FLA) and Fujitsu Laboratories Ltd. (FLL) plan to further refine the use cases and target markets for this technology, conduct field trials, and make further improvements, with the aim of releasing it as a services-based product from a Fujitsu business unit in FY2018.
Currently, software maintenance is performed manually and is therefore time-consuming. Research studies have reported that software developers spend as much as 50% of their programming time identifying and fixing bugs. The Defects4J benchmark dataset is a collection of bugs from popular, object-oriented, open-source software (OSS) programs, which are typically used in business application development. Of the 49 single-fault-location bugs in Defects4J, we investigated the 20 method-invocation related bugs (discussed below) that have trackable bug repositories, and found that the development teams took an average of 17.1 days to fix each bug. Bug fixing is not only time-consuming but ultimately a matter of development cost. For instance, market research studies put the global cost of fixing software at $312 billion in 2013.
Therefore, automatic patch generation technologies, which aim to make bug fixing more efficient, are expected to gain wide acceptance. While the ultimate goal is the complete automatic repair of bugs of all complexities, the current focus of the research community is primarily on automatic generation of single-line patches, i.e., automatically patching single-fault-location bugs.
Business applications are typically implemented using object-oriented languages, such as Java. Our investigation of a set of large-scale OSS Java projects, which are often used as benchmarks by the academic community, showed that as many as 30%-40% of the bugs are related to method invocations. In fact, 29 of the 49 single-fault-location bugs (59.2%) in the aforementioned Defects4J dataset are method-invocation related. Such bugs therefore make up a significant fraction of single-fault-location bugs and merit a solution. However, patches for them typically have a large search space, often with several hundred candidate patches. Searching this space effectively requires new technology that ranks the candidates and prioritizes the ones most likely to be the correct patch.
Conventional techniques, such as the heuristic-search-based automated repair tool ACS, fix few method-invocation related bugs: they correctly fix only 6 of the 29 (20.7%) single-fault-location method-invocation bugs in Defects4J, and only 14 of the 49 bugs overall. By contrast, our technique, presented below, generated correct patches for 15 of the 29 method-invocation bugs (51.7%, 2.5 times as many as the above-mentioned conventional techniques), and correctly fixed 26 of the 49 bugs overall.
Our technology targets single-fault-location bugs in object-oriented programs, which are typically used in business application development. We have developed an automated patch generation technology for bug repair that applies machine learning to a corpus of all archived bug reports and bug patches. In particular, the system supports the generation of patches involving method invocations. Of the 29 method-invocation related, single-fault-location bugs in the Defects4J benchmark dataset, our system generated correct patches for 15 (51.7%, 2.5 times as many as the aforementioned conventional techniques).
The features of the developed technology are as follows:
1. Automated patch generation technology
The automated patch generation AI engine, shown in Figure 1, consists of four components: a bug localizer, a patch candidate generator, a candidate ranker, and a test validator. Given source code containing a bug, the engine first diagnoses the bug location and then generates patch candidates for that location.
Since the number of patch candidates can be large, often several hundred, the candidates are ranked in decreasing order of their likelihood of being the correct patch. The highest-ranked candidates are then validated against the test cases, and the first candidate that passes all of them is recommended as a potential patch for the bug.
Figure 1: Automatic Patch Generation AI Engine
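The four stages described above can be illustrated with a minimal Python sketch. This is not the announced engine: the function `repair`, its parameters, and the toy localizer, generator, scorer and test below are all hypothetical stand-ins chosen only to show the control flow of localize, generate, rank and validate.

```python
from typing import Callable, Iterable, Optional, Sequence


def repair(
    source: str,
    localize: Callable[[str], int],
    generate: Callable[[str, int], Iterable[str]],
    score: Callable[[str], float],
    tests: Sequence[Callable[[str], bool]],
    max_validated: int = 10,
) -> Optional[str]:
    """Return the highest-ranked candidate that passes every test, or None."""
    location = localize(source)                            # 1. bug localizer
    candidates = list(generate(source, location))          # 2. patch candidate generator
    ranked = sorted(candidates, key=score, reverse=True)   # 3. candidate ranking
    for candidate in ranked[:max_validated]:               # 4. test validation
        if all(test(candidate) for test in tests):
            return candidate
    return None


# Toy bug (hypothetical): the wrong function is invoked on the return line.
buggy = "def largest(a, b):\n    return min(a, b)\n"


def toy_localize(src: str) -> int:
    # Pretend the localizer blames the first line containing "return".
    return next(i for i, line in enumerate(src.splitlines()) if "return" in line)


def toy_generate(src: str, line_no: int) -> Iterable[str]:
    # Replace the invoked function with each alternative in a small fixed pool.
    lines = src.splitlines()
    for name in ("abs", "sum", "max", "pow"):
        patched = list(lines)
        patched[line_no] = lines[line_no].replace("min", name)
        yield "\n".join(patched) + "\n"


def toy_score(candidate: str) -> float:
    # Stand-in for a learned ranking model.
    return 1.0 if "max(" in candidate else 0.0


def passes(candidate: str) -> bool:
    # Run the candidate and check it against two expected results.
    env: dict = {}
    try:
        exec(candidate, env)
        return env["largest"](2, 5) == 5 and env["largest"](7, 3) == 7
    except Exception:
        return False


patch = repair(buggy, toy_localize, toy_generate, toy_score, [passes])
print(patch)
```

The `max_validated` cutoff reflects the point made above: when there are hundreds of candidates, only the top of the ranked list is run against the tests, so the quality of the ranking determines whether the correct patch is reached at all.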
2. Ranking technology based on machine learning on corpus of archived bug reports and bug patches
To identify the most promising candidates among a large number of potential patch candidates, we use machine learning on previous bug reports and bug patches.
More specifically, a logistic regression model is trained on the relationship between existing buggy source code, the corresponding bug reports, and the final patches for those bugs. Given the buggy source code of a target program, a bug report, and the population of patch candidates, the model produces a ranked list of the candidates.
By drawing on new sources of information, such as bug reports and the buggy code surrounding the patch, the technology effectively supports method-invocation related bugs and significantly improves patch generation precision over conventional techniques, as noted above.
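As an illustration of ranking with logistic regression, the sketch below trains a tiny model in plain Python and uses it to order patch candidates. It rests on several assumptions not taken from the announcement: the two features (similarity of a candidate to the bug report, and similarity to the surrounding code), the training history, and the candidate patches are all made-up toy values, and the actual feature set of the announced system is not described here.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(rows, labels, epochs=500, lr=0.5):
    """Fit weights and bias by plain stochastic gradient descent on log loss."""
    weights, bias = [0.0] * len(rows[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            error = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x))) - y
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias


def rank(candidates, weights, bias):
    """Sort (patch, features) pairs by predicted probability, highest first."""
    scored = [
        (patch, sigmoid(bias + sum(w * xi for w, xi in zip(weights, feats))))
        for patch, feats in candidates
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Made-up history of past candidates: [report_similarity, context_similarity],
# labelled 1 if that candidate turned out to be the developers' final patch.
history = [
    ([0.9, 0.8], 1), ([0.1, 0.2], 0),
    ([0.7, 0.9], 1), ([0.2, 0.1], 0),
    ([0.8, 0.6], 1), ([0.3, 0.4], 0),
]
weights, bias = train_logistic([x for x, _ in history], [y for _, y in history])

# Hypothetical candidates for a new bug, with the same two toy features.
candidates = [
    ("list.add(item)", [0.2, 0.3]),
    ("list.remove(item)", [0.9, 0.7]),
    ("list.clear()", [0.4, 0.1]),
]
ranked = rank(candidates, weights, bias)
for patch, probability in ranked:
    print(f"{probability:.3f}  {patch}")
```

The ranked list would then be handed to test validation, which tries candidates from the top down.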
Business applications are typically implemented using object-oriented languages. The announced automated patch generation technology specifically supports automatic patching of method-invocation related bugs, which frequently occur in object-oriented programs. Out of 29 method-invocation-related bugs among 49 single-fault-location bugs in the Defects4J benchmark dataset, our system generates 15 correct patches (51.7%), 2.5 times the 6 correct patches generated by the aforementioned conventional technology. Further, overall our technology correctly patches 26 out of the 49 (i.e., 53.1%) single-fault-location bugs, 1.9 times the 14 correct patches generated by the conventional technology.
We also investigated the bug-fixing time, defined as the time from the creation of a bug report to its closing, for the 20 of the 29 method-invocation related bugs that have trackable bug repositories, and found it to be 17.1 days on average. The aforementioned conventional technology, which generated 6 correct patches for the 20 bugs, reduced the average bug-fixing time only marginally, to 17.0 days, while our proposed technology, which generated 11 correct patches for the 20 bugs, reduced it substantially, to 12.1 days. Thus, our proposed technology could reduce the average bug-fixing time by 28.8% compared to manual effort or the conventional technology. The difference between the conventional and proposed technologies arises because the proposed technology generates more correct patches and because it can fix bugs that take a long time to fix manually. Reductions in bug-fixing time translate directly into savings in development costs.
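The headline ratios can be re-derived from the counts and averages quoted in this release. The short Python check below uses only those published figures; the variable names are ours, and because the day counts are already rounded, the recomputed percentages are approximate.

```python
# Counts and averages as quoted in this release.
ours_mi, conventional_mi, total_mi = 15, 6, 29        # method-invocation bugs
ours_all, conventional_all, total_all = 26, 14, 49    # all single-fault-location bugs
manual_days, conventional_days, ours_days = 17.1, 17.0, 12.1

mi_rate = round(100 * ours_mi / total_mi, 1)          # share of method-invocation bugs fixed
mi_ratio = round(ours_mi / conventional_mi, 1)        # versus conventional technology
all_rate = round(100 * ours_all / total_all, 1)       # share of all 49 bugs fixed
all_ratio = round(ours_all / conventional_all, 1)     # versus conventional technology

# Reduction in average bug-fixing time against each baseline.
cut_vs_conventional = round(100 * (conventional_days - ours_days) / conventional_days, 1)
cut_vs_manual = round(100 * (manual_days - ours_days) / manual_days, 1)

print(mi_rate, mi_ratio, all_rate, all_ratio)
print(cut_vs_conventional, cut_vs_manual)
```

With the rounded day counts, the reduction comes out at 28.8% against the 17.0-day conventional baseline and about 29.2% against the 17.1-day manual baseline, so the two baselines give nearly the same figure.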
FLA and FLL will discuss this technology with business units, targeting both business applications and OSS, refine and prioritize the targets and use cases, conduct trials and improvements internally, with an aim to release it as a services-based product in FY2018.
About Fujitsu
Fujitsu is the leading Japanese information and communication technology (ICT) company offering a full range of technology products, solutions and services. Approximately 155,000 Fujitsu people support customers in more than 100 countries. We use our experience and the power of ICT to shape the future of society with our customers. Fujitsu Limited (TSE: 6702) reported consolidated revenues of 4.5 trillion yen (US$40 billion) for the fiscal year ended March 31, 2017. For more information, please see http://www.fujitsu.com.
About Fujitsu Laboratories
Founded in 1968 as a wholly owned subsidiary of Fujitsu Limited, Fujitsu Laboratories Limited is one of the premier research centers in the world. With a global network of laboratories in Japan, China, the United States and Europe, the organization conducts a wide range of basic and applied research in the areas of Next-generation Services, Computer Servers, Networks, Electronic Devices and Advanced Materials. For more information, please see: Fujitsu Laboratories
About Fujitsu Laboratories of America, Inc.
Fujitsu Laboratories of America, Inc. is a wholly owned subsidiary of Fujitsu Laboratories Ltd. (Japan), focusing on research in novel computing, networking technologies, software development and solutions for several industry verticals. Conducting research in an open environment, it contributes to the global research community and the IT industry. It is headquartered in Sunnyvale, CA.
For more information, please see: www.fla.fujitsu.com
Fujitsu, the Fujitsu logo and “shaping tomorrow with you” are trademarks or registered trademarks of Fujitsu Limited in the United States and other countries. All other company or product names mentioned herein are trademarks or registered trademarks of their respective owners. Information provided in this press release is accurate at time of publication and is subject to change without advance notice.
Date: 11 October, 2017