Pub Date: 2023-07-28 DOI: 10.1140/epjds/s13688-023-00405-6
Neeti Pokhriyal, B. Valentino, Soroush Vosoughi
{"title":"Quantifying participation biases on social media","authors":"Neeti Pokhriyal, B. Valentino, Soroush Vosoughi","doi":"10.1140/epjds/s13688-023-00405-6","DOIUrl":"https://doi.org/10.1140/epjds/s13688-023-00405-6","url":null,"abstract":"","PeriodicalId":11887,"journal":{"name":"EPJ Data Science","volume":"12 1","pages":"1-20"},"PeriodicalIF":3.6,"publicationDate":"2023-07-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47482929","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-07-10 DOI: 10.1140/epjds/s13688-023-00402-9
Golshid Ranjbaran, Diego Reforgiato Recupero, Gianfranco Lombardo, S. Consoli
{"title":"Leveraging augmentation techniques for tasks with unbalancedness within the financial domain: a two-level ensemble approach","authors":"Golshid Ranjbaran, Diego Reforgiato Recupero, Gianfranco Lombardo, S. Consoli","doi":"10.1140/epjds/s13688-023-00402-9","DOIUrl":"https://doi.org/10.1140/epjds/s13688-023-00402-9","url":null,"abstract":"","PeriodicalId":11887,"journal":{"name":"EPJ Data Science","volume":"12 1","pages":"1-31"},"PeriodicalIF":3.6,"publicationDate":"2023-07-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42786250","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-04-13 DOI: 10.1140/epjds/s13688-023-00409-2
Salvatore Citraro, S. Deyne, Massimo Stella, Giulio Rossetti
{"title":"Towards hypergraph cognitive networks as feature-rich models of knowledge","authors":"Salvatore Citraro, S. Deyne, Massimo Stella, Giulio Rossetti","doi":"10.1140/epjds/s13688-023-00409-2","DOIUrl":"https://doi.org/10.1140/epjds/s13688-023-00409-2","url":null,"abstract":"","PeriodicalId":11887,"journal":{"name":"EPJ Data Science","volume":" ","pages":"1-22"},"PeriodicalIF":3.6,"publicationDate":"2023-04-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47177173","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-03-21 DOI: 10.1140/epjds/s13688-023-00388-4
Yu-Ru Lin, Shaomei Wu, Winter A. Mason
{"title":"Mapping language literacy at scale: a case study on Facebook","authors":"Yu-Ru Lin, Shaomei Wu, Winter A. Mason","doi":"10.1140/epjds/s13688-023-00388-4","DOIUrl":"https://doi.org/10.1140/epjds/s13688-023-00388-4","url":null,"abstract":"","PeriodicalId":11887,"journal":{"name":"EPJ Data Science","volume":"12 1","pages":"1-21"},"PeriodicalIF":3.6,"publicationDate":"2023-03-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47224636","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-03-02 DOI: 10.1140/epjds/s13688-023-00380-y
Han Zhuang, Tzu-Yang Huang, Daniel Ernesto Acuna
{"title":"A computational analysis of accessibility, readability, and explainability of figures in open access publications","authors":"Han Zhuang, Tzu-Yang Huang, Daniel Ernesto Acuna","doi":"10.1140/epjds/s13688-023-00380-y","DOIUrl":"https://doi.org/10.1140/epjds/s13688-023-00380-y","url":null,"abstract":"","PeriodicalId":11887,"journal":{"name":"EPJ Data Science","volume":"12 1","pages":"1-16"},"PeriodicalIF":3.6,"publicationDate":"2023-03-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41825467","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-02-03 DOI: 10.1140/epjds/s13688-022-00376-0
Lluc Font-Pomarol, Angelo Piga, R. M. Garcia-Teruel, Sergio Nasarre-Aznar, M. Sales-Pardo, R. Guimerà
{"title":"Socially disruptive periods and topics from information-theoretical analysis of judicial decisions","authors":"Lluc Font-Pomarol, Angelo Piga, R. M. Garcia-Teruel, Sergio Nasarre-Aznar, M. Sales-Pardo, R. Guimerà","doi":"10.1140/epjds/s13688-022-00376-0","DOIUrl":"https://doi.org/10.1140/epjds/s13688-022-00376-0","url":null,"abstract":"","PeriodicalId":11887,"journal":{"name":"EPJ Data Science","volume":"12 1","pages":"1-15"},"PeriodicalIF":3.6,"publicationDate":"2023-02-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45542400","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-01-02 DOI: 10.1140/epjds/s13688-023-00401-w
Alejandro Vigna-Gómez, Javier Murillo, Manelik Ramirez, A. Borbolla, Ian Márquez, Prasun K. Ray
{"title":"Design and analysis of tweet-based election models for the 2021 Mexican legislative election","authors":"Alejandro Vigna-Gómez, Javier Murillo, Manelik Ramirez, A. Borbolla, Ian Márquez, Prasun K. Ray","doi":"10.1140/epjds/s13688-023-00401-w","DOIUrl":"https://doi.org/10.1140/epjds/s13688-023-00401-w","url":null,"abstract":"","PeriodicalId":11887,"journal":{"name":"EPJ Data Science","volume":"12 1","pages":"1-17"},"PeriodicalIF":3.6,"publicationDate":"2023-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41502708","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-01-01 Epub Date: 2023-06-06 DOI: 10.1140/epjds/s13688-023-00395-5
Nicolò Gozzi, Niccolò Comini, Nicola Perra
Adherence to the non-pharmaceutical interventions (NPIs) put in place to mitigate the spreading of infectious diseases is a multifaceted problem. Several factors, including socio-demographic and socio-economic attributes, can influence the perceived susceptibility and risk which are known to affect behavior. Furthermore, the adoption of NPIs is dependent upon the barriers, real or perceived, associated with their implementation. Here, we study the determinants of NPIs adherence during the first wave of the COVID-19 Pandemic in Colombia, Ecuador, and El Salvador. Analyses are performed at the level of municipalities and include socio-economic, socio-demographic, and epidemiological indicators. Furthermore, by leveraging a unique dataset comprising tens of millions of internet Speedtest® measurements from Ookla®, we investigate the quality of the digital infrastructure as a possible barrier to adoption. We use mobility changes provided by Meta as a proxy of adherence to NPIs and find a significant correlation between mobility drops and digital infrastructure quality. The relationship remains significant after controlling for several factors. This finding suggests that municipalities with better internet connectivity were able to afford higher mobility reductions. We also find that mobility reductions were more pronounced in larger, denser, and wealthier municipalities.
Supplementary information: The online version contains supplementary material available at 10.1140/epjds/s13688-023-00395-5.
{"title":"The adoption of non-pharmaceutical interventions and the role of digital infrastructure during the COVID-19 pandemic in Colombia, Ecuador, and El Salvador.","authors":"Nicolò Gozzi, Niccolò Comini, Nicola Perra","doi":"10.1140/epjds/s13688-023-00395-5","DOIUrl":"10.1140/epjds/s13688-023-00395-5","url":null,"abstract":"<p><p>Adherence to the non-pharmaceutical interventions (NPIs) put in place to mitigate the spreading of infectious diseases is a multifaceted problem. Several factors, including socio-demographic and socio-economic attributes, can influence the perceived susceptibility and risk which are known to affect behavior. Furthermore, the adoption of NPIs is dependent upon the barriers, real or perceived, associated with their implementation. Here, we study the determinants of NPIs adherence during the first wave of the COVID-19 Pandemic in Colombia, Ecuador, and El Salvador. Analyses are performed at the level of municipalities and include socio-economic, socio-demographic, and epidemiological indicators. Furthermore, by leveraging a unique dataset comprising tens of millions of internet Speedtest® measurements from Ookla®, we investigate the quality of the digital infrastructure as a possible barrier to adoption. We use mobility changes provided by Meta as a proxy of adherence to NPIs and find a significant correlation between mobility drops and digital infrastructure quality. The relationship remains significant after controlling for several factors. This finding suggests that municipalities with better internet connectivity were able to afford higher mobility reductions. We also find that mobility reductions were more pronounced in larger, denser, and wealthier municipalities.</p><p><strong>Supplementary information: </strong>The online version contains supplementary material available at 10.1140/epjds/s13688-023-00395-5.</p>","PeriodicalId":11887,"journal":{"name":"EPJ Data Science","volume":"12 1","pages":"18"},"PeriodicalIF":3.6,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10243255/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9612333","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-01-01 Epub Date: 2023-11-16 DOI: 10.1140/epjds/s13688-023-00427-0
Segun Taofeek Aroyehun, Lukas Malik, Hannah Metzler, Nikolas Haimerl, Anna Di Natale, David Garcia
The wealth of text data generated by social media has enabled new kinds of analysis of emotions with language models. These models are often trained on small and costly datasets of text annotations produced by readers who guess the emotions expressed by others in social media posts. This affects the quality of emotion identification methods due to training data size limitations and noise in the production of labels used in model development. We present LEIA, a model for emotion identification in text that has been trained on a dataset of more than 6 million posts with self-annotated emotion labels for happiness, affection, sadness, anger, and fear. LEIA is based on a word masking method that enhances the learning of emotion words during model pre-training. LEIA achieves macro-F1 values of approximately 73 on three in-domain test datasets, outperforming other supervised and unsupervised methods in a strong benchmark that shows that LEIA generalizes across posts, users, and time periods. We further perform an out-of-domain evaluation on five different datasets of social media and other sources, showing LEIA's robust performance across media, data collection methods, and annotation schemes. Our results show that LEIA generalizes its classification of anger, happiness, and sadness beyond the domain it was trained on. LEIA can be applied in future research to provide better identification of emotions in text from the perspective of the writer.
{"title":"LEIA: Linguistic Embeddings for the Identification of Affect.","authors":"Segun Taofeek Aroyehun, Lukas Malik, Hannah Metzler, Nikolas Haimerl, Anna Di Natale, David Garcia","doi":"10.1140/epjds/s13688-023-00427-0","DOIUrl":"10.1140/epjds/s13688-023-00427-0","url":null,"abstract":"<p><p>The wealth of text data generated by social media has enabled new kinds of analysis of emotions with language models. These models are often trained on small and costly datasets of text annotations produced by readers who guess the emotions expressed by others in social media posts. This affects the quality of emotion identification methods due to training data size limitations and noise in the production of labels used in model development. We present LEIA, a model for emotion identification in text that has been trained on a dataset of more than 6 million posts with self-annotated emotion labels for happiness, affection, sadness, anger, and fear. LEIA is based on a word masking method that enhances the learning of emotion words during model pre-training. LEIA achieves macro-F1 values of approximately 73 on three in-domain test datasets, outperforming other supervised and unsupervised methods in a strong benchmark that shows that LEIA generalizes across posts, users, and time periods. We further perform an out-of-domain evaluation on five different datasets of social media and other sources, showing LEIA's robust performance across media, data collection methods, and annotation schemes. Our results show that LEIA generalizes its classification of anger, happiness, and sadness beyond the domain it was trained on. LEIA can be applied in future research to provide better identification of emotions in text from the perspective of the writer.</p>","PeriodicalId":11887,"journal":{"name":"EPJ Data Science","volume":"12 1","pages":"52"},"PeriodicalIF":3.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10654159/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138458730","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-01-01 DOI: 10.1140/epjds/s13688-023-00391-9
Xiao Fan Liu, Zhen-Zhen Wang, Xiao-Ke Xu, Ye Wu, Zhidan Zhao, Huarong Deng, Ping Wang, Naipeng Chao, Yi-Hui C Huang
Human mobility restriction policies have been widely used to contain the coronavirus disease-19 (COVID-19). However, a critical question is how these policies affect individuals' behavioral and psychological well-being during and after confinement periods. Here, we analyze China's five most stringent city-level lockdowns in 2021, treating them as natural experiments that allow for examining behavioral changes in millions of people through smartphone application use. We made three fundamental observations. First, the use of physical and economic activity-related apps experienced a steep decline, yet apps that provide daily necessities maintained normal usage. Second, apps that fulfilled lower-level human needs, such as working, socializing, information seeking, and entertainment, saw an immediate and substantial increase in screen time. Those that satisfied higher-level needs, such as education, only attracted delayed attention. Third, human behaviors demonstrated resilience as most routines resumed after the lockdowns were lifted. Nonetheless, long-term lifestyle changes were observed, as significant numbers of people chose to continue working and learning online, becoming "digital residents." This study also demonstrates the capability of smartphone screen time analytics in the study of human behaviors.
Supplementary information: The online version contains supplementary material available at 10.1140/epjds/s13688-023-00391-9.
{"title":"The shock, the coping, the resilience: smartphone application use reveals Covid-19 lockdown effects on human behaviors.","authors":"Xiao Fan Liu, Zhen-Zhen Wang, Xiao-Ke Xu, Ye Wu, Zhidan Zhao, Huarong Deng, Ping Wang, Naipeng Chao, Yi-Hui C Huang","doi":"10.1140/epjds/s13688-023-00391-9","DOIUrl":"https://doi.org/10.1140/epjds/s13688-023-00391-9","url":null,"abstract":"<p><p>Human mobility restriction policies have been widely used to contain the coronavirus disease-19 (COVID-19). However, a critical question is how these policies affect individuals' behavioral and psychological well-being during and after confinement periods. Here, we analyze China's five most stringent city-level lockdowns in 2021, treating them as natural experiments that allow for examining behavioral changes in millions of people through smartphone application use. We made three fundamental observations. First, the use of physical and economic activity-related apps experienced a steep decline, yet apps that provide daily necessities maintained normal usage. Second, apps that fulfilled lower-level human needs, such as working, socializing, information seeking, and entertainment, saw an immediate and substantial increase in screen time. Those that satisfied higher-level needs, such as education, only attracted delayed attention. Third, human behaviors demonstrated resilience as most routines resumed after the lockdowns were lifted. Nonetheless, long-term lifestyle changes were observed, as significant numbers of people chose to continue working and learning online, becoming \"digital residents.\" This study also demonstrates the capability of smartphone screen time analytics in the study of human behaviors.</p><p><strong>Supplementary information: </strong>The online version contains supplementary material available at 10.1140/epjds/s13688-023-00391-9.</p>","PeriodicalId":11887,"journal":{"name":"EPJ Data Science","volume":"12 1","pages":"17"},"PeriodicalIF":3.6,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10240109/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9947205","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}