Programmed differently? Testing for gender differences in Python programming style and quality on GitHub
Siân Brooke
Journal of Computer-Mediated Communication, published 2024-02-02
DOI: 10.1093/jcmc/zmad049 (https://doi.org/10.1093/jcmc/zmad049)
Abstract
The underrepresentation of women in open-source software is frequently attributed to women’s lack of innate aptitude compared to men: natural gender differences in technical ability (Trinkenreich et al., 2021). Approaching code as a form of communication, I conduct a novel empirical study of gender differences in Python programming on GitHub. Based on 1,728 open-source projects, I ask whether there is a gender difference in the quality and style of Python code, as measured by adherence to PEP-8 guidelines. I find significant gender differences in the structure and organization of Python files. While there is gendered variation in programming style, there is no evidence of a gender difference in code quality. Using a Random Forest model, I show that the gender of a programmer can be predicted from the style of their Python code. The study concludes that gender differences in Python code are a matter of style, not quality.
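To make "style measured in adherence to PEP-8 guidelines" concrete, here is a minimal sketch of how a few per-file style features might be counted. It is not the study's measurement pipeline: the `StyleFeatures` class, the `extract_style_features` function, and the choice of three features are assumptions made for illustration. The checks themselves follow PEP 8 recommendations (a 79-character line limit, no trailing whitespace, spaces rather than tabs for indentation).

```python
from dataclasses import dataclass

MAX_LINE_LENGTH = 79  # PEP 8's recommended maximum line length


@dataclass
class StyleFeatures:
    """Hypothetical per-file style counts (illustrative, not the paper's feature set)."""
    total_lines: int
    long_lines: int           # lines longer than MAX_LINE_LENGTH
    trailing_whitespace: int  # lines ending in spaces/tabs, including whitespace-only lines
    tab_indented: int         # lines whose leading indentation contains a tab


def extract_style_features(source: str) -> StyleFeatures:
    """Count a handful of PEP 8 style deviations in a Python source string."""
    lines = source.splitlines()
    long_lines = sum(len(line) > MAX_LINE_LENGTH for line in lines)
    trailing = sum(line != line.rstrip() for line in lines)
    # The leading whitespace is whatever lstrip() would remove from the line.
    tabs = sum(
        "\t" in line[: len(line) - len(line.lstrip())] for line in lines
    )
    return StyleFeatures(len(lines), long_lines, trailing, tabs)


# Small example: one clean line, one tab-indented line with trailing
# spaces, and one line that exceeds the length limit.
sample = "x = 1\n\ty = 2  \nz = '" + "a" * 80 + "'\n"
features = extract_style_features(sample)
print(features)
```

Counts like these, collected per file or per contributor, are the kind of style feature vector that could in principle be passed to a classifier such as a Random Forest. The features and model inputs actually used in the study are not specified in the abstract.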
About the journal:
The Journal of Computer-Mediated Communication (JCMC) has been a longstanding contributor to the field of computer-mediated communication research. Since its inception in 1995, it has been a pioneer in web-based, peer-reviewed scholarly publications. JCMC encourages interdisciplinary research, welcoming contributions from various disciplines, such as communication, business, education, political science, sociology, psychology, media studies, and information science. The journal's commitment to open access and high-quality standards has solidified its status as a reputable source for scholars exploring the dynamics of communication in the digital age.