DATATRAN: a data transformation facility
Ken Jacobs, L. Cooprider, R. F. Teitel
ACM Sigsoc Bulletin, 1969-09-01
DOI: 10.1145/1198277.1198278 (https://doi.org/10.1145/1198277.1198278)

DATATRAN was designed to meet the need for a general, easy-to-use data transformation facility. This need exists because of the requirements most statistical processors impose on the nature of their input data. Although several of these processors (such as the BMD package) include limited data transformation abilities, and almost all permit user-supplied FORMAT descriptions of the data, several serious drawbacks remain. First, these capabilities, when present, differ widely from program to program. Second, users often find such facilities difficult to use, and even harder to debug. Lastly, no matter how general such a facility is, it is likely that some user will have data in a form which the program is unequipped to handle, or will desire a transformation which the program cannot perform.

The intended use of DATATRAN is as a preliminary job step to prepare data for processing by a statistical processor of some sort. However, DATATRAN does include some arithmetic and functional abilities to allow limited data processing. The philosophy of the design of DATATRAN called for maximum transformational capabilities, completely under the user's control, in a syntax that is easy to read and write. Hopefully, this philosophy has been fulfilled.

DATATRAN operates on data groups, or observations. If, for example, a survey produces 2 cards of data per respondent, the transformations the user desires are performed on the data in 2-card blocks. A block of data is read, each of the DATATRAN statements is executed, and the transformed block of data is placed on the output file the user designates. Each record in the block is designated by a letter of the alphabet. Thus a block which contains four records contains records A, B, C, and D. The length and number of the records in both the input and output blocks are specified by the user. Within a record, positions are referred to by column number. Thus, A3 refers to the third column on the first record, while B33-40 refers to columns 33 through 40 of the second record of a block.

DATATRAN recognizes three data types: punch, value, and literal. Although it is expected that DATATRAN input will most often be from tape or disk, it is especially easy to picture these data types with respect to a punched card. The following descriptions are applicable to data placed on any type of device, though the references will be to card data. A punch variable consists …
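
The block and column addressing described above (records lettered A, B, C, ... within a block; references such as A3 and B33-40) can be illustrated with a short Python sketch. This is not DATATRAN's implementation or syntax: the function names `parse_ref`, `field`, and `blocks` are hypothetical, and the sketch assumes that records are plain strings, that columns are numbered from 1, and that an incomplete trailing block is simply skipped.

```python
import re
from typing import Iterator, List, Sequence, Tuple

# Only the "A3" / "B33-40" notation comes from the text; everything else
# here (names, error handling) is an assumption made for illustration.
_REF = re.compile(r"([A-Z])(\d+)(?:-(\d+))?")


def parse_ref(ref: str) -> Tuple[int, int, int]:
    """Turn 'A3' or 'B33-40' into (record_index, first_col, last_col).

    record_index is 0-based (A -> 0, B -> 1, ...); columns are 1-based
    and inclusive, matching the column numbering used in the text.
    """
    m = _REF.fullmatch(ref)
    if m is None:
        raise ValueError(f"not a column reference: {ref!r}")
    first = int(m.group(2))
    last = int(m.group(3)) if m.group(3) is not None else first
    if first < 1 or last < first:
        raise ValueError(f"bad column range in {ref!r}")
    return ord(m.group(1)) - ord("A"), first, last


def field(block: Sequence[str], ref: str) -> str:
    """Return the characters a reference selects from one block of records."""
    rec, first, last = parse_ref(ref)
    return block[rec][first - 1:last]


def blocks(records: Sequence[str], per_block: int) -> Iterator[List[str]]:
    """Group consecutive records into blocks of per_block records each.

    Assumption: a trailing group with fewer than per_block records is skipped.
    """
    for i in range(0, len(records) - per_block + 1, per_block):
        yield list(records[i:i + per_block])


if __name__ == "__main__":
    # Two 80-column "cards" per respondent, as in the survey example.
    card_a = "XY7".ljust(80)
    card_b = (" " * 32 + "12345678").ljust(80)
    for block in blocks([card_a, card_b], per_block=2):
        print(repr(field(block, "A3")), repr(field(block, "B33-40")))
```

Representing a reference as a (record, first column, last column) triple keeps the lookup to a single string slice, which mirrors the way the text treats every position as a column on a lettered record.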