Discover data: Latest Articles

The measurement errors of google trends data
Pub Date : 2024-06-13 DOI: 10.1007/s44248-024-00013-3
Kerry Liu
Citations: 0
Benchmarking of Secure Group Communication schemes with focus on IoT
Pub Date : 2024-05-23 DOI: 10.1007/s44248-024-00010-6
Thomas Prantl, André Bauer, Simon Engel, Lukas Horn, Christian Krupitzer, Lukas Iffländer, Samuel Kounev
Citations: 0
TFPsocialmedia: a public dataset for studying Turkish foreign policy
Pub Date : 2024-04-02 DOI: 10.1007/s44248-024-00009-z
Hakan Mehmetcik, M. Ganiz, Melih Koluk, Galip Yüksel, Muslim Yılmaz, Muhammed Mustafa İnce, Emre Tortumlu
Citations: 0
Data sharing and exchanging with incentive and optimization: a survey
Pub Date : 2024-03-18 DOI: 10.1007/s44248-024-00006-2
Liyuan Liu, Meng Han
Citations: 0
Canadian agriculture technology adoption
Pub Date : 2024-03-12 DOI: 10.1007/s44248-024-00008-0
Tahmid Huq Easher, Rickard Enstroem, Terry Griffin, Tomas Nilsson
Citations: 0
An evaluation of NERC learning-based approaches to discover personal data in Brazilian Portuguese documents
Pub Date : 2023-11-30 DOI: 10.1007/s44248-023-00005-9
Luciano Ignaczak, Márcio Garcia Martins, C. A. da Costa, Bruna Donida, Maria Cristina Peres da Silva
Citations: 0
Challenges and approaches when realizing online surface inspection systems with deep learning algorithms
Pub Date : 2023-03-30 DOI: 10.1007/s44248-023-00004-w
Henrike Stephani, Thomas Weibel, Ronald Rösch, A. Moghiseh
Citations: 0
A systematic literature review of cyber-security data repositories and performance assessment metrics for semi-supervised learning
Pub Date : 2023-01-01 DOI: 10.1007/s44248-023-00003-x
Paul K Mvula, Paula Branco, Guy-Vincent Jourdan, Herna L Viktor

In Machine Learning, the datasets used to build models are one of the main factors limiting what these models can achieve and how good their predictive performance is. Machine Learning applications for cyber-security or computer security are numerous, including cyber threat mitigation and security infrastructure enhancement through pattern recognition, real-time attack detection, and in-depth penetration testing. For these applications in particular, the datasets used to build the models must therefore be carefully considered to ensure they are representative of real-world data. However, because labelled data are scarce and manually labelling positive examples is costly, a growing body of literature uses Semi-Supervised Learning with cyber-security data repositories. In this work, we provide a comprehensive overview of publicly available data repositories and datasets used for building computer security or cyber-security systems based on Semi-Supervised Learning, where only a few labels are necessary or available for building strong models. We highlight the strengths and limitations of the data repositories and datasets and analyse the performance assessment metrics used to evaluate the built models. Finally, we discuss open challenges and provide future research directions for using cyber-security datasets and evaluating models built upon them.

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10079755/pdf/
Citations: 1
Evaluating Word Embedding Feature Extraction Techniques for Host-Based Intrusion Detection Systems
Pub Date : 2023-01-01 DOI: 10.1007/s44248-023-00002-y
Paul K Mvula, Paula Branco, Guy-Vincent Jourdan, Herna L Viktor

Research into Intrusion and Anomaly Detectors at the Host level typically pays much attention to extracting attributes from system call traces. These include window-based, Hidden Markov Models, and sequence-model-based attributes. Recently, several works have been focusing on sequence-model-based feature extractors, specifically Word2Vec and GloVe, to extract embeddings from the system call traces due to their ability to capture semantic relationships among system calls. However, due to the nature of the data, these extractors introduce inconsistencies in the extracted features, causing the Machine Learning models built on them to yield inaccurate and potentially misleading results. In this paper, we first highlight the research challenges posed by these extractors. Then, we conduct experiments with new feature sets assessing their suitability to address the detected issues. Our experiments show that Word2Vec is prone to introducing more duplicated samples than GloVe. Regarding the solutions proposed, we found that concatenating the embedding vectors generated by Word2Vec and GloVe yields the overall best balanced accuracy. In addition to resolving the challenge of data leakage, this approach enables an improvement in performance relative to other alternatives.
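
The duplicated-sample issue and the concatenation remedy mentioned in the abstract can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the two tiny embedding tables `EMB_A` and `EMB_B` are hand-written stand-ins for Word2Vec and GloVe output, and each trace is turned into a feature vector by averaging its per-call vectors, an aggregation the abstract does not specify. The sketch is not the authors' pipeline; it only shows how distinct traces can collapse to the same feature vector under one extractor and be separated again once the two vectors are concatenated.

```python
from collections import Counter

# Hypothetical 2-d embedding tables (stand-ins for Word2Vec and GloVe).
# In EMB_A, "read" and "pread" share a vector; in EMB_B, "open" and "openat" do.
EMB_A = {"open": (1.0, 0.0), "openat": (0.5, 0.5), "read": (0.0, 1.0), "pread": (0.0, 1.0)}
EMB_B = {"open": (1.0, 1.0), "openat": (1.0, 1.0), "read": (1.0, 0.0), "pread": (0.0, 2.0)}


def trace_vector(trace, table):
    """Average the per-call vectors of one system-call trace (assumed aggregation)."""
    dims = len(next(iter(table.values())))
    sums = [0.0] * dims
    for call in trace:
        for i, x in enumerate(table[call]):
            sums[i] += x
    # Round so that equal averages compare equal as tuples.
    return tuple(round(s / len(trace), 6) for s in sums)


def concat_vector(trace, tables):
    """Concatenate the averaged vectors produced by several embedding tables."""
    out = ()
    for table in tables:
        out += trace_vector(trace, table)
    return out


def count_duplicates(vectors):
    """Count samples whose feature vector repeats an earlier sample's vector."""
    return sum(c - 1 for c in Counter(vectors).values())


# Three distinct toy traces.
TRACES = [
    ["open", "read"],
    ["open", "pread"],
    ["openat", "read"],
]

dups_a = count_duplicates([trace_vector(t, EMB_A) for t in TRACES])
dups_b = count_duplicates([trace_vector(t, EMB_B) for t in TRACES])
dups_ab = count_duplicates([concat_vector(t, (EMB_A, EMB_B)) for t in TRACES])

print(dups_a, dups_b, dups_ab)
```

With these toy tables, each extractor on its own maps two different traces to the same vector, while the concatenated features keep all three apart. If such duplicates end up on both sides of a train/test split, the test set is no longer independent of the training set, which is one way the data leakage described above can arise.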

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10077957/pdf/
Citations: 2
Analysing and visualising bike-sharing demand with outliers
Pub Date : 2022-04-12 DOI: 10.1007/s44248-023-00001-z
Nicola Rennie, Catherine Cleophas, A. Sykulski, Florian Dost
Citations: 2