FAIR but not Necessarily Open: Sensitive data in the domain of biodiversity
Authors: Patricia Mergen, S. Meeus, F. Leliaert
Journal: Biodiversity Information Science and Standards
Published: 2023-09-08
DOI: https://doi.org/10.3897/biss.7.112296
Abstract
In the implementation of the European Open Science Cloud (EOSC), the concept of data FAIRness (Findable, Accessible, Interoperable and Re-usable; Wilkinson et al. 2016) is still often confused with the idea of open and freely accessible data, although the two are not necessarily the same. Data can comply with the FAIR requirements even if access to them is moderated or behind a paywall. The EOSC motto is accordingly “As open as possible, as closed as necessary”. This confusion has raised concerns among potential data providers, who fear being obliged to make sensitive data openly and freely available even when there are valid reasons for restrictions, or to forgo fees and lose revenue where the data generate income. As a result, there has been some reluctance to engage fully in activities related to FAIR data and the EOSC.
Sensitive data usually bring to mind personal data governed by the General Data Protection Regulation (GDPR), as well as clinical, security, military, or commercially valuable data protected by patents. In the domain of biodiversity and natural history collections, such concerns are often reported to have less impact, especially when contributors are properly cited and embargo periods are respected. However, there are cases in this domain where data sensitivity must be considered for legal or ethical reasons. Examples include protected or endangered species, whose exact geographic coordinates may be withheld to avoid poaching; cases of Access and Benefit Sharing (ABS), depending on the country of origin of the species; respect for traditional knowledge; and a desire to limit commercial exploitation of the data. The requirements of the Nagoya Protocol on Access to Genetic Resources and the Fair and Equitable Sharing of Benefits Arising from their Utilization to the Convention on Biological Diversity, as well as the upcoming Digital Sequence Information (DSI) regulations, play an important role here. The Digital Services Act (DSA), recently adopted to protect the digital space against the spread of illegal content, sets interoperability requirements for operators of data spaces. This raises questions about how data spaces are actually defined and how they would be affected by this new European legislation, which also has a worldwide impact on widely used social media and content platforms such as Google or YouTube.
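The coordinate case above can be illustrated with a minimal sketch. It assumes occurrence records are plain dictionaries keyed by Darwin Core term names (`decimalLatitude`, `decimalLongitude`, `coordinateUncertaintyInMeters`, `dataGeneralizations`, `informationWithheld`). The function name, the rounding policy, the metres-per-degree constant and the sample record are hypothetical choices made for illustration, not a procedure prescribed by the authors or by GBIF.

```python
import math

# Approximate length of one degree of latitude in metres (assumption: spherical Earth).
METRES_PER_DEGREE = 111_320


def generalize_occurrence(record: dict, decimals: int = 1) -> dict:
    """Return a copy of an occurrence record with coarsened coordinates.

    Keys follow Darwin Core term names; the rounding policy is illustrative only.
    """
    public = dict(record)  # shallow copy, so the precise original stays untouched
    public["decimalLatitude"] = round(float(record["decimalLatitude"]), decimals)
    public["decimalLongitude"] = round(float(record["decimalLongitude"]), decimals)

    # Rounding moves a point by at most half a grid cell on each axis, so the
    # half-diagonal of a cell (measured at the equator) is a conservative bound.
    half_cell_m = (10 ** -decimals) / 2 * METRES_PER_DEGREE
    added_uncertainty = math.ceil(half_cell_m * math.sqrt(2))
    original = float(record.get("coordinateUncertaintyInMeters", 0) or 0)
    public["coordinateUncertaintyInMeters"] = math.ceil(original) + added_uncertainty

    # Tell re-users what was done and that more precise data exist on request.
    public["dataGeneralizations"] = f"Coordinates rounded to {decimals} decimal place(s)"
    public["informationWithheld"] = "Precise coordinates available on request"
    return public


# Hypothetical record for a protected orchid; the coordinates are made up.
precise = {
    "scientificName": "Cypripedium calceolus",
    "decimalLatitude": 50.92834,
    "decimalLongitude": 4.32671,
}
public = generalize_occurrence(precise)
```

The public copy remains findable and interoperable because it keeps standard term names and states how it was altered, while the precise record can stay in a restricted store.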
During implementation and updating activities in projects and initiatives such as the Biodiversity Community Integrated Knowledge Library (BiCIKL), it became clear that a secure data repository and management system is needed that can handle both open and non-open data, so that all potential data providers can be included and their content mobilised while adhering to FAIR requirements.
In this talk, after a general introduction to sensitive data, we will provide several examples from the biodiversity and natural sciences domains of how to handle and manage sensitive data, such as the approaches recommended by GBIF. Last but not least, we will highlight the importance of using internationally accepted standards, such as those from Biodiversity Information Standards (TDWG), to achieve such developments in the context of the Biodiversity Knowledge Hub (BKH) implemented by BiCIKL. Notably, clear metadata about terms of use, citation requirements and licensing make actual re-use of the data possible both legally and efficiently.
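As a rough illustration of the metadata point, the sketch below checks whether a dataset description states its re-use conditions. It assumes the description is a dictionary keyed by Dublin Core term names (`license`, `rightsHolder`, `accessRights`, `bibliographicCitation`). Which terms to require, the function name and the sample dataset are hypothetical and are not taken from the BKH or TDWG specifications.

```python
# Dublin Core terms that together describe re-use conditions (selection is an assumption).
REUSE_TERMS = ("license", "rightsHolder", "accessRights", "bibliographicCitation")


def missing_reuse_terms(metadata: dict) -> list[str]:
    """List the re-use terms that are absent or blank in a metadata record."""
    return [t for t in REUSE_TERMS if not str(metadata.get(t) or "").strip()]


# Hypothetical dataset description: restricted access, yet explicit about its terms.
dataset = {
    "title": "Hypothetical herbarium occurrence dataset",
    "license": "https://creativecommons.org/licenses/by-nc/4.0/",
    "accessRights": "Precise localities of protected species are available on request",
    "bibliographicCitation": "",
}
gaps = missing_reuse_terms(dataset)  # terms a curator would still need to fill in
```

A record can pass such a check while its data stay restricted, which is the sense in which a dataset can be FAIR without being open.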