Big Data: Latest Articles

ODQN-Net: Optimized Deep Q Neural Networks for Disease Prediction Through Tongue Image Analysis Using Remora Optimization Algorithm.
IF 4.6; Zone 4 (Computer Science); Q2 (COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS); Pub Date: 2023-12-01; Epub Date: 2023-09-13; Pages: 452-465; DOI: 10.1089/big.2023.0014
S V N Sreenivasu, P Santosh Kumar Patra, Vasujadevi Midasala, G S N Murthy, Krishna Chaitanya Janapati, J N V R Swarup Kumar, Pala Mahesh Kumar

Tongue analysis plays a major role in disease prediction and classification in Indian ayurvedic medicine. Traditionally, an expert ayurvedic doctor manually inspects the tongue image to identify or predict the disease. However, this is time-consuming and can be imprecise. With recent advances in machine learning models, several researchers have addressed disease prediction from tongue image analysis, but without achieving sufficient accuracy. In addition, multiclass disease classification with enhanced accuracy remains a challenging task. Therefore, this article focuses on the development of an optimized deep Q-neural network (DQNN) for disease identification and classification from tongue images, hereafter referred to as ODQN-Net. Initially, the multiscale retinex approach is introduced to enhance the quality of tongue images; it also acts as a noise removal technique. In addition, a local ternary pattern is used to extract disease-specific and disease-dependent features based on color analysis. Then, the best features are selected from the available feature set using the nature-inspired Remora optimization algorithm with reduced computational time. Finally, the DQNN model is used to classify the type of disease from these pretrained features. The simulation results on a tongue imaging data set showed that the proposed ODQN-Net outperformed state-of-the-art approaches, with an accuracy of 99.17% and an F1-score and Matthews correlation coefficient of 99.75% and 99.84%, respectively.
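
The local ternary pattern step mentioned in the abstract can be illustrated with a minimal sketch. This follows the standard LTP definition (each neighbour is coded +1, 0, or -1 against the centre pixel with a tolerance t, then split into an "upper" and a "lower" binary code) and is not taken from the paper itself. The function names, the clockwise neighbour order, the single-channel input, and the threshold value are all assumptions made for illustration; the paper's color-based variant may differ.

```python
# Hypothetical sketch of a local ternary pattern (LTP) on one image channel.
# Names, neighbour order, and threshold are illustrative assumptions.

# Clockwise neighbour order starting at the top-left (a convention chosen here).
NEIGHBOURS = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]


def ltp_codes(patch, t):
    """Return the (upper, lower) 8-bit LTP codes for the centre of a 3x3 patch."""
    centre = patch[1][1]
    upper = lower = 0
    for bit, (r, c) in enumerate(NEIGHBOURS):
        value = patch[r][c]
        if value >= centre + t:      # ternary value +1 -> set bit in upper code
            upper |= 1 << bit
        elif value <= centre - t:    # ternary value -1 -> set bit in lower code
            lower |= 1 << bit
        # otherwise the ternary value is 0 and neither code gets the bit
    return upper, lower


def ltp_histogram(image, t):
    """Concatenate 256-bin histograms of upper and lower codes over interior pixels."""
    upper_hist = [0] * 256
    lower_hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            patch = [row[c - 1:c + 2] for row in image[r - 1:r + 2]]
            upper, lower = ltp_codes(patch, t)
            upper_hist[upper] += 1
            lower_hist[lower] += 1
    return upper_hist + lower_hist


example = [
    [78, 99, 50],
    [54, 54, 49],
    [57, 12, 13],
]
features = ltp_histogram(example, t=5)
```

The resulting 512-element histogram is the kind of feature vector that a later selection step could prune before classification.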

Citations: 0
Sharing Medical Big Data While Preserving Patient Confidentiality in Innovative Medicines Initiative: A Summary and Case Report from BigData@Heart.
IF 4.6; Zone 4 (Computer Science); Q2 (COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS); Pub Date: 2023-12-01; Epub Date: 2023-10-27; Pages: 399-407; DOI: 10.1089/big.2022.0178
Megan Schröder, Sam H A Muller, Eleni Vradi, Johanna Mielke, Yvonne M F Lim, Fabrice Couvelard, Menno Mostert, Stefan Koudstaal, Marinus J C Eijkemans, Christoph Gerlinger

Sharing individual patient data (IPD) is a simple concept but complex to achieve due to data privacy and data security concerns, underdeveloped guidelines, and legal barriers. Sharing IPD is additionally difficult in big data-driven collaborations such as BigData@Heart in the Innovative Medicines Initiative, due to competing interests between diverse consortium members. One project within BigData@Heart, case study 1, needed to pool data from seven heterogeneous data sets: five randomized controlled trials from three different industry partners, and two disease registries. Sharing IPD was not considered feasible due to legal requirements and the sensitive medical nature of these data. In addition, harmonizing the data sets for a federated data analysis was difficult due to capacity constraints and the heterogeneity of the data sets. An alternative option was to share summary statistics through contingency tables. Here it is demonstrated that this method, combined with anonymization methods to ensure patient anonymity, resulted in minimal loss of information. Although sharing IPD should continue to be encouraged and pursued, our approach achieved a good balance between data transparency and patient privacy. It also allowed a successful collaboration between industry and academia.
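
As a rough illustration of sharing summary statistics through contingency tables rather than IPD, the sketch below counts records per cell and withholds cells with few patients. Small-cell suppression is a common disclosure-control technique, but the abstract does not say which anonymization methods were used, so this is a swapped-in stand-in. The field names, the sample records, and the minimum cell count are hypothetical.

```python
# Hypothetical sketch: build a contingency table from patient-level records
# and suppress small cells before sharing. Field names, data, and the
# threshold are illustrative assumptions, not the paper's method.
from collections import Counter


def contingency_table(records, row_key, col_key):
    """Count records for each (row value, column value) combination."""
    return Counter((record[row_key], record[col_key]) for record in records)


def suppress_small_cells(table, min_count):
    """Replace counts below min_count with None so they are not disclosed."""
    return {cell: (n if n >= min_count else None) for cell, n in table.items()}


records = [
    {"arm": "treatment", "event": "yes"},
    {"arm": "treatment", "event": "no"},
    {"arm": "treatment", "event": "no"},
    {"arm": "control", "event": "yes"},
    {"arm": "control", "event": "yes"},
    {"arm": "control", "event": "no"},
]

table = contingency_table(records, "arm", "event")
shareable = suppress_small_cells(table, min_count=2)
```

Only `shareable` would leave the data holder's environment; the patient-level `records` stay local.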

Citations: 0
The incidence and prevalence of coeliac disease in the United Kingdom
IF 4.6; Zone 4 (Computer Science); Q2 (COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS); Pub Date: 2023-11-01; DOI: 10.1370/afm.22.s1.5051
Yvonne Nartey, C. Crooks, Joe West, Timothy R. Card, Laila J. Tata
Citations: 0
Machine Learning Analysis of Serious Illness Conversations Predicts Patient Reports of Feeling Heard & Understood
IF 4.6; Zone 4 (Computer Science); Q2 (COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS); Pub Date: 2023-11-01; DOI: 10.1370/afm.22.s1.5279
Bob Gramling, Donna Rizzo, Margaret Eppstein, Bradford Demarest
Citations: 0
Changes in Reasons for Visits to Primary Care as a Result of the COVID-19 Pandemic: by INTRePID
IF 4.6; Zone 4 (Computer Science); Q2 (COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS); Pub Date: 2023-11-01; DOI: 10.1370/afm.22.s1.5425
Karen Tu, M. Lapadula
Citations: 0
Breast cancer screening during the COVID-19 Pandemic in the United States: Results from real-world health records data
IF 4.6; Zone 4 (Computer Science); Q2 (COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS); Pub Date: 2023-11-01; DOI: 10.1370/afm.22.s1.4885
William Curry, Wen-Jan Tuan, Qiushi Chen, Andrew Chung
Citations: 0
A Novel Method for Utilizing Electronic Health Record Data in Condition-specific Research
IF 4.6; Zone 4 (Computer Science); Q2 (COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS); Pub Date: 2023-11-01; DOI: 10.1370/afm.22.s1.4955
Tarin Clay, Melissa Filippi, Elise Robertson, Cory B. Lutgen, Elisabeth F. Callen
Citations: 0
Harmonized Healthcare Database across Family Medicine Institutions
IF 4.6; Zone 4 (Computer Science); Q2 (COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS); Pub Date: 2023-11-01; DOI: 10.1370/afm.22.s1.5404
Chance R. Strenth, David Schneider, U. Sambamoorthi, Sravan Mattevada, Kimberly Fulda, Bhaskar Thakur, Anna Espinoza
Citations: 0
Identifying the Factors Associated with the Accumulation of Diabetes Complications to Inform a Prediction Tool
IF 4.6; Zone 4 (Computer Science); Q2 (COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS); Pub Date: 2023-11-01; DOI: 10.1370/afm.22.s1.5071
Winston R. Liaw, Ben King, Omolola E. Adepoju, Jiangtao Luo, Ioannis Kakadiaris, Todd Prewitt, Jessica Dobbins, Pete Womack
Citations: 0
Big Data Confidentiality: An Approach Toward Corporate Compliance Using a Rule-Based System.
IF 4.6; Zone 4 (Computer Science); Q2 (COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS); Pub Date: 2023-10-31; DOI: 10.1089/big.2022.0201
Georgios Vranopoulos, Nathan Clarke, Shirley Atkinson

Organizations have been investing in analytics relying on internal and external data to gain a competitive advantage. However, the legal and regulatory acts imposed nationally and internationally have become a challenge, especially for highly regulated sectors such as health or finance/banking. Data handlers such as Facebook and Amazon have already sustained considerable fines or are under investigation due to violations of data governance. The era of big data has further intensified the challenges of minimizing the risk of data loss by introducing the dimensions of Volume, Velocity, and Variety into confidentiality. Although Volume and Velocity have been extensively researched, Variety, "the ugly duckling" of big data, is often neglected and difficult to solve, thus increasing the risk of data exposure and data loss. To mitigate the risk of data exposure and data loss, this article proposes a framework that uses algorithmic classification and workflow capabilities to provide a consistent approach toward data evaluations across organizations. A rule-based system, implementing the corporate data classification policy, will minimize the risk of exposure by helping users identify the approved guidelines and enforce them quickly. The framework includes an exception handling process with appropriate approval for extenuating circumstances. The system was implemented as a proof-of-concept working prototype to showcase the capabilities and provide hands-on experience. The information system was evaluated and accredited by a diverse audience of academics and senior business executives in the fields of security and data management. The audience had an average experience of ∼25 years and amassed a total experience of almost three centuries (294 years). The results confirmed that the 3Vs are of concern and that Variety, named by a majority of 90% of the commentators, is the most troubling. In addition, with an approximate average of 60%, it was confirmed that appropriate policies, procedures, and prerequisites for classification are in place while implementation tools are lagging.
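
A rule-based classifier with an approved-exception path, as described in the abstract, might look roughly like the sketch below. The abstract gives no implementation details, so everything here is hypothetical: the classification levels, the two regular-expression rules, and the approval check are assumptions chosen only to show the shape of such a system.

```python
# Hypothetical sketch of a rule-based data classifier with an exception path.
# Levels, rules, and the approval mechanism are illustrative assumptions.
import re
from dataclasses import dataclass

# Ordered from least to most restrictive (assumed policy levels).
LEVELS = ["public", "internal", "confidential"]


@dataclass
class Rule:
    name: str
    pattern: str
    level: str


RULES = [
    Rule("email address", r"[\w.+-]+@[\w-]+\.[\w.]+", "internal"),
    Rule("card-like number", r"\b(?:\d[ -]?){13,16}\b", "confidential"),
]


def classify(text, rules=RULES):
    """Return the most restrictive level among matching rules, plus rule names."""
    level = "public"
    matched = []
    for rule in rules:
        if re.search(rule.pattern, text):
            matched.append(rule.name)
            if LEVELS.index(rule.level) > LEVELS.index(level):
                level = rule.level
    return level, matched


def effective_level(text, exception_level=None, approved_by=None):
    """Apply a requested exception only when an approver is recorded."""
    level, _ = classify(text)
    if exception_level is not None and approved_by:
        return exception_level
    return level


sample = "card 4111 1111 1111 1111"
```

Calling `effective_level(sample, "internal")` without an approver keeps the rule-derived level, which mirrors the idea that exceptions require appropriate approval.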

Citations: 0