Editor's notes: FAIR BOT. As metadata is data is metadata is data ...

IASSIST Quarterly. Pub Date: 2023-03-30. DOI: 10.29173/iq1086

K. Rasmussen



Abstract

Welcome to the first issue of IASSIST Quarterly for the year 2023 - IQ vol. 47(1).

The last article in this issue has in its title the FAIR acronym, which stands for Findable, Accessible, Interoperable, and Reusable. These are the concepts most often in focus in IQ articles, and FAIR has extra emphasis in this issue. The first article introduces and demonstrates a shared vocabulary for data points, a need that arose from confusion about data and metadata. Basically, I find that the most valuable virtue of well-structured data (I deliberately use a fuzzy term to save you from long excursions here in the editor's notes) is that other well-structured data can benefit from use of the same software. Similarly, well-structured metadata can benefit from the same software. I also see this as the driver for the second article, on time series data and description. Sometimes the software is the same in both instances, as metadata is treated as data or vice versa. This allows for new levels of data-driven machine actions. These days universities are busy investigating and discussing the latest chatbots. I find many of the approaches restrictive and prefer to support the inclusive ones. Likewise, I expect and look forward to bots having great relevance for the future implementation of FAIR principles.

The first article, on data and metadata, is by George Alter, Flavio Rizzolo, and Kathi Schleidt and has the title ‘View points on data points: A shared vocabulary for cross-domain conversations on data and metadata’. The authors have observed that sharing data across scientific domains is often impeded by differences in the language used to describe data and metadata. To avoid confusion, the authors develop a terminology. Part of the confusion concerns disagreement about the boundaries between data and metadata, and the fact that what is metadata in one domain can be data in another.
The shift between data and metadata is what they call ‘semantic transposition’. I find that such shifts are a virtue and a strength and, as the authors say, there is no fixed boundary between data and metadata, and both can be acted upon by people and machines. The article draws on and refers to many other standards and developments; the most cited are the data model of Observations and Measurements (ISO 19156) and tools of the Data Documentation Initiative’s Cross Domain Integration (DDI-CDI). The article is thorough and explanatory, with many examples and diagrams for learning, including examples of transformations between the wide, long, and multidimensional formats. The long format of entity-attribute-value has the value domain restricted by the attribute, and in the examples time and source are added, which demonstrates how further metadata enter the format. Transposing to the wide format gives a more familiar data matrix where the same value domain applies to the complete column. The multidimensional format with facets is, for most readers, the familiar aggregations published by statistical agencies. The authors argue that their domain-independent vocabulary enables the cross-domain conversation. George Alter is Research Professor Emeritus in the Institute for Social Research at the University of Michigan. Flavio Rizzolo is Senior Data Science Architect for Statistics Canada. Kathi Schleidt is a data scientist and the founder of DataCove.

The format discussion in the first article is also the point of the second paper, ‘Modernizing data management at the US Bureau of Labor Statistics’. The US Bureau of Labor Statistics (BLS) has a focus on time series, and Daniel W. Gillman and Clayton Waring (both from the BLS) view time series data as a combination of three components: a measure element; an element for people, places, and things (PPT); and a time element.
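
As an illustration only, the long and wide formats and the three time series components can be sketched in a few lines of Python. The records, field names, and the two helper functions `long_to_wide` and `wide_to_long` are invented for this sketch and are not taken from either paper or from any BLS system. The point is to show how an attribute name that is a data value in the long format becomes a column name (that is, metadata) in the wide format, and back again.

```python
from collections import defaultdict

# Hypothetical observations in long (entity-attribute-value) format.
# "entity" plays the role of a place/thing, "attribute" names the measure,
# and "time" is the time element. All values are invented.
long_rows = [
    {"entity": "hospital_A", "attribute": "nurses_employed", "time": "2022", "value": 120},
    {"entity": "hospital_A", "attribute": "mean_hourly_wage", "time": "2022", "value": 38.5},
    {"entity": "hospital_B", "attribute": "nurses_employed", "time": "2022", "value": 95},
    {"entity": "hospital_B", "attribute": "mean_hourly_wage", "time": "2022", "value": 36.0},
]


def long_to_wide(rows):
    """Pivot long rows into one wide record per (entity, time) pair."""
    wide = defaultdict(dict)
    for row in rows:
        key = (row["entity"], row["time"])
        # The attribute name is a data value here; in the wide table it
        # becomes a column name, so one value domain applies per column.
        wide[key][row["attribute"]] = row["value"]
    return dict(wide)


def wide_to_long(wide):
    """Unpivot the wide table: column names turn back into data values."""
    rows = []
    for (entity, time), columns in wide.items():
        for attribute, value in columns.items():
            rows.append(
                {"entity": entity, "attribute": attribute, "time": time, "value": value}
            )
    return rows


wide_table = long_to_wide(long_rows)
round_trip = wide_to_long(wide_table)

for key, columns in wide_table.items():
    print(key, columns)
```

In this toy case the round trip returns the original long rows, which is one way of seeing that nothing but the placement of the attribute names changes between the two formats.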
In the paper Gillman and Waring also describe the conceptual model (UML) and the design and features of the system. First, they go back in history to the 1970s and the Codd relational model and to the standards developed and refined after 2000. You will not be surprised to find the Data Documentation Initiative’s Cross Domain Integration (DDI-CDI) among the references here as well. The mission is: ‘to find a simple and intuitive way to store and organize statistical data with the goal of making it easy to find and use the data’. A semantic approach is adopted, i.e. the focus is on the meaning of the data based upon the ‘Measures / People-Places-Things / Time’ model. Detailed examples show how PPT are categories of dimensions; for instance, ‘nurse’ is in the Standard Occupational Classification and ‘hospital’ in the North American Industry Classification System. Like the first paper, this paper also refers to multidimensional structures. The modernization described at BLS is expected to be released in early 2023.

The third paper is by João Aguiar Castro, Joana Rodrigues, Paula Mena Matos, Célia Sales, and Cristina Ribeiro, all affiliated with the University of Porto. Like the earlier articles, it references the Data Documentation Initiative (DDI), here with a focus on the concepts behind the FAIR acronym: Findable, Accessible, Interoperable, and Reusable. The title is: ‘Getting in touch with metadata: a DDI subset for FAIR metadata production in clinical psychology’. Clinical psychology is not an area that appears frequently in IASSIST Quarterly, but it turns out that the project described started with interviews and data description sessions with research groups in the social sciences to identify a manageable DDI subset. The project also draws on other projects such as TAIL, TOGETHER, and Dendro. The TAIL project concerned the integration of metadata tools in the research workflow and assessed the requirements of researchers from different domains.
TOGETHER was a project in the domain of psycho-oncology and family-centered care for hereditary cancer. As most researchers proved to be inexperienced with metadata, the authors concentrated on a DDI subset, which meant that FAIR metadata would be available for deposit. Support for researchers is essential, as they have the domain expertise and can create highly detailed descriptions. On the other hand, data curators can ensure that the metadata follow the rules of FAIR. This was achieved by embedding the Dendro platform in the research workflow, where metadata are created through an incremental description of the data. The article includes screenshots of the user interface showing the choice of vocabularies. The approach and the adoption of a DDI subset produced more comprehensive metadata than is usually available.

Submissions of papers for the IASSIST Quarterly are always very welcome. We welcome input from IASSIST conferences or other conferences and workshops, from local presentations or papers especially written for the IQ. When you are preparing such a presentation, give a thought to turning your one-time presentation into a lasting contribution. Doing that after the event also gives you the opportunity of improving your work after feedback. We encourage you to log in or create an author profile at https://www.iassistquarterly.com (our Open Journal System application). We permit authors 'deep links' into the IQ as well as deposition of the paper in a local repository. Chairing a conference session or workshop with the purpose of aggregating and integrating papers for a special issue of the IQ is also much appreciated, as the information reaches many more people than the limited number of session participants and will be readily available on the IASSIST Quarterly website at https://www.iassistquarterly.com.
Authors are very welcome to take a look at the instructions and layout:
https://www.iassistquarterly.com/index.php/iassist/about/submissions
Authors can also contact me directly via e-mail: kbr@sam.sdu.dk. Should you be interested in compiling a special issue for the IQ as guest editor(s), I will also be delighted to hear from you.

Karsten Boye Rasmussen - March 2023


