Word count as a traditional programming benchmark problem for genetic programming
Thomas Helmuth, L. Spector
Proceedings of the 2014 Annual Conference on Genetic and Evolutionary Computation, 2014-07-12
DOI: https://doi.org/10.1145/2576768.2598230
Citations: 15
Abstract
The Unix utility program wc, which stands for "word count," takes any number of files and prints the number of newlines, words, and characters in each of the files. We show that genetic programming can find programs that replicate the core functionality of the wc utility, and propose this problem as a "traditional programming" benchmark for genetic programming systems. This "wc problem" features key elements of programming tasks that often confront human programmers, including requirements for multiple data types, a large instruction set, control flow, and multiple outputs. Furthermore, it mimics the behavior of a real-world utility program, showing that genetic programming can automatically synthesize programs with general utility. We suggest statistical procedures that should be used to compare performances of different systems on traditional programming problems such as the wc problem, and present the results of a short experiment using the problem. Finally, we give a short analysis of evolved solution programs, showing how they make use of traditional programming concepts.
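As a rough illustration of the target behavior described above, the following is a minimal hand-written sketch in Python of the three counts that wc reports. It is not an evolved program and is not taken from the paper. The function name `wc_counts` is hypothetical, and treating a word as a maximal run of non-whitespace characters is an assumption about the specification.

```python
def wc_counts(text: str) -> tuple[int, int, int]:
    """Return (newlines, words, characters) for a string.

    Assumption: a "word" is a maximal run of non-whitespace characters,
    and "characters" means the length of the string, not its byte count.
    """
    newlines = text.count("\n")   # number of newline characters
    words = len(text.split())     # split() with no argument splits on any whitespace
    characters = len(text)        # every character, including whitespace
    return newlines, words, characters


if __name__ == "__main__":
    print(wc_counts("hello world\nfoo\n"))  # (2, 3, 16)
```

Even this small reference shows the features the abstract highlights: it handles strings and integers together, and it returns multiple outputs from a single input.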