Toward Automating Shredding Nonprofit XML Files: The Case of IRS Form 990 Data
Husam A. Abu Khadra, D. Olsen
European Journal of Information Systems, 41 (1), pp. 169-188. Published 2022-08-03.
DOI: https://doi.org/10.2308/isys-2022-031
Abstract
This paper presents and describes a dataset of nonprofit IRS filings in the United States. The dataset contains 831 attributes and 1,102,884 records for the years 2016-2021. Among other items, it includes nonprofits' comparative financial data, governance disclosures, hired contractors, management compensation, a detailed statement of revenue, a statement of functional expenses, external audit, federal audit election, and reconciliation of net assets. The dataset is generated with self-developed Structured Query Language (SQL) code that converts the IRS Form 990 Extensible Markup Language (XML) tax filing files into a dataset in Excel. This paper is the first to convert these XML files and provide much-needed open access to nonprofit data in a long format that is useful for researchers conducting cross-sectional analysis. The 2,174 lines of source code that we developed and a step-by-step guide are included in this paper.
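To make "shredding" XML into a long format concrete, here is a minimal sketch in Python. It is not the authors' SQL code, which this listing does not reproduce. The sample filing is a made-up fragment, and its element names and the `shred` and `local_name` helpers are illustrative assumptions rather than the paper's actual attribute list. Each leaf element becomes one (attribute path, value) row, which is the long layout the abstract describes.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified filing; real Form 990 XML has far more elements.
SAMPLE_XML = """<Return xmlns="http://www.irs.gov/efile">
  <ReturnHeader>
    <TaxYr>2020</TaxYr>
    <Filer><EIN>000000000</EIN></Filer>
  </ReturnHeader>
  <ReturnData>
    <IRS990>
      <CYTotalRevenueAmt>125000</CYTotalRevenueAmt>
      <CYTotalExpensesAmt>98000</CYTotalExpensesAmt>
    </IRS990>
  </ReturnData>
</Return>"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def shred(xml_text):
    """Return one (attribute_path, value) row per non-empty leaf element."""
    root = ET.fromstring(xml_text)
    rows = []

    def walk(elem, path):
        path = path + [local_name(elem.tag)]
        children = list(elem)
        if not children:
            text = (elem.text or "").strip()
            if text:
                rows.append(("/".join(path), text))
        for child in children:
            walk(child, path)

    walk(root, [])
    return rows


if __name__ == "__main__":
    for attribute, value in shred(SAMPLE_XML):
        print(attribute, value, sep="\t")
```

Stacking the rows from many filings, each tagged with its filer identifier and tax year, would give a long table that can be pivoted for cross-sectional analysis.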
About the journal:
The European Journal of Information Systems offers a unique European perspective on the theory and practice of information systems for a global readership. We actively seek first-rate articles that offer a critical examination of information technology, covering its effects, development, implementation, strategy, management, and policy.