Anais do III Dataset Showcase Workshop (DSW 2021): Latest Articles

COVID19.BR: A Dataset of Misinformation about COVID-19 in Brazilian Portuguese WhatsApp Messages
Pub Date : 1900-01-01 DOI: 10.5753/dsw.2021.17422
Antônio Diogo Forte Martins, Lucas Cabral, Pedro Jorge Chaves Mourão, Ivandro Claudino de Sá, José Maria S. Monteiro, Javam C. Machado
Our society currently faces a major problem that is growing worse, once again through social networks: misinformation. The primary source of misinformation in Brazil is the messaging application WhatsApp. However, due to WhatsApp's private messaging nature, there are still few misinformation data sets built specifically from this platform. In this context, building a data set of WhatsApp messages about COVID-19 in Brazilian Portuguese and labeling the misinformation messages within it becomes a crucial challenge. In this work, we present COVID-19.BR, a data set of WhatsApp messages about coronavirus in Brazilian Portuguese, collected from Brazilian public groups and manually labeled.
Citations: 3
PPORTAL: Public Domain Portuguese-language Literature Dataset
Pub Date : 1900-01-01 DOI: 10.5753/dsw.2021.17416
Mariana O. Silva, Clarisse Scofield, Mirella M. Moro
Combining human expertise with book-consumer data may provide what is needed to keep up with the constant changes in the book publishing market. Building and making available datasets that fully cover the essential elements of the book industry ecosystem is therefore essential. However, little has been done in this context for non-English languages, such as Portuguese. Hence, we introduce PPORTAL, a public domain Portuguese-language literature dataset composed of book-related metadata. After an overview of its building process and content, we present a brief exploratory data analysis to summarize its main characteristics. We also highlight potential applications, showing how PPORTAL is useful as a resource in different research domains.
Citations: 6
MUHSIC: An Open Dataset with Temporal Musical Success Information
Pub Date : 1900-01-01 DOI: 10.5753/dsw.2021.17415
Gabriel P. Oliveira, Gabriel R. G. Barbosa, Bruna C. Melo, Mariana O. Silva, Danilo B. Seufitelli, Mirella M. Moro
Music is a living industry with an increasing volume of complex data that creates new challenges and opportunities for extracting knowledge, benefiting not only the different music segments but also the Music Information Retrieval (MIR) community. In this paper, we present MUHSIC, a novel dataset with enhanced information on musical success. We focus on artists and genres by combining chart-related data with acoustic metadata to describe the temporal evolution of musical careers. The enriched and curated data allow building success-based time series to investigate high-impact periods (hot streaks) in such careers, transforming complex data into knowledge. Overall, MUHSIC is a relevant tool in music-related tasks due to its ease of use and replicability.
Citations: 5
Extracting and Composing a Dataset of Competitive Counter-Strike Global Offensive Matches
Pub Date : 1900-01-01 DOI: 10.5753/dsw.2021.17412
E. Rocha, Henrique Maio, D. Menasché, Claudio Miceli
There is a growing need for insightful and meaningful analytics within eSports: be it to entertain spectators as they watch their favorite teams compete, to automatically identify and catch cheaters, or even to gain a competitive edge over an opponent, there is a plethora of potential applications for analytics within the scene. It follows, then, that there is also a need for well-structured and organized datasets that enable efficient data exploration and serve as the foundation for the visualization and analytics layers. Because of this, the entire process, from data collection at the source to the means of accessing the desired information, needs to be planned out to address those needs. Our work provides the means to construct such a dataset for the Counter-Strike Global Offensive (CS:GO) game, thus opening up a range of possible applications on top of the data.
Citations: 0
FeatSet: A Compilation of Visual Features Extracted from Public Image Datasets
Pub Date : 1900-01-01 DOI: 10.5753/dsw.2021.17417
M. Cazzolato, L. C. Scabora, Guilherme F. Zabot, M. A. Gutierrez, Caetano Traina Jr., A. Traina
In this paper, we present FeatSet, a compilation of visual features extracted from open image datasets reported in the literature. FeatSet has a collection of 11 visual features, consisting of color, texture, and shape representations of the images acquired from 13 datasets. We organized the features in a standard collection, including metadata and labels when available. We also provide a description of the domain of each dataset included in our collection, with visual analysis using Multidimensional Scaling (MDS) and Principal Component Analysis (PCA) methods. FeatSet is recommended for supervised and unsupervised learning, and it also broadly supports Content-Based Image Retrieval (CBIR) applications and complex data indexing using Metric Access Methods (MAMs).
Citations: 1