Pub Date: 2023-06-08 DOI: 10.1007/s42001-023-00211-0
Christelle Khalaf, G. Michaud, G. J. Jolley
Title: Predicting declining and growing occupations using supervised machine learning. Journal of Computational Social Science.
Pub Date: 2023-05-15 DOI: 10.1007/s42001-023-00206-x
Louis Magowan
The COVID-19 pandemic meant that, in 2020, students in England were unable to sit their examinations and instead received predicted grades, or "centre assessment grades" (CAGs), from their teachers to allow them to progress. Using the Grading and Admissions Data for England (GRADE) dataset for students from 2018 to 2020, this study treats the use of CAGs as a natural experiment for causally understanding how teacher judgements of academic ability may be biased according to the demographic and socio-economic characteristics of their students. A variety of machine learning models were trained on the 2018-19 data and then used to generate predictions for what the 2020 students were likely to have received had their examinations taken place as usual. The differences between these predictions and the CAGs that students received were calculated and then averaged across students' different characteristics, revealing what the treatment effects of the use of CAGs were likely to have been for different types of students. No evidence of absolute negative bias against students of any demographic or socio-economic characteristic was found, with all groups of students having received higher CAGs than the grades they were likely to have received had they sat their examinations. Some evidence for relative bias was found, with consistent, but insubstantial differences being observed in the treatment effects of certain groups. However, when higher-order interactions of student characteristics were considered, these differences became more substantial. Intersectional perspectives which emphasise interactions and sub-group differences should be used more widely within quantitative educational equalities research.
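The averaging step described in this abstract (the difference between the received CAG and the predicted exam grade, averaged within student groups) can be illustrated with a minimal sketch. The records, the field names `gender` and `fsm`, and all grade values are invented for illustration and are not drawn from the GRADE dataset. The `predicted` values are assumed to come from some model trained on the 2018-19 data, which is not reproduced here.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: teacher-assigned CAG, a model's counterfactual
# exam-grade prediction, and two illustrative student characteristics.
students = [
    {"gender": "F", "fsm": True,  "cag": 6.0, "predicted": 5.5},
    {"gender": "F", "fsm": False, "cag": 7.0, "predicted": 6.0},
    {"gender": "M", "fsm": True,  "cag": 5.0, "predicted": 4.75},
    {"gender": "M", "fsm": False, "cag": 6.0, "predicted": 5.5},
]

def group_effects(records, keys):
    """Average (CAG - predicted grade) within each combination of `keys`."""
    diffs = defaultdict(list)
    for r in records:
        group = tuple(r[k] for k in keys)
        diffs[group].append(r["cag"] - r["predicted"])
    return {group: mean(vals) for group, vals in diffs.items()}

# Single-characteristic view, then a two-way interaction view.
by_gender = group_effects(students, ["gender"])
by_gender_fsm = group_effects(students, ["gender", "fsm"])
```

Passing more than one key to `group_effects` gives the kind of sub-group breakdown the abstract refers to as higher-order interactions. In this toy data every group mean is positive, which mirrors the "no absolute negative bias" pattern only by construction.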
Title: Centre assessment grades in 2020: a natural experiment for investigating bias in teacher judgements. Journal of Computational Social Science, pp. 1-45. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10184100/pdf/
Pub Date: 2023-05-05 DOI: 10.1007/s42001-023-00199-7
Manfred Stede, Yannic Bracke, Luka Borec, Neele Charlotte Kinkel, Maria Skeppstedt
Title: Framing climate change in Nature and Science editorials: applications of supervised and unsupervised text categorization. Journal of Computational Social Science.
Pub Date: 2023-05-03 DOI: 10.1007/s42001-023-00208-9
Aleksandra Urman, Mykola Makhortykh
In this article, we conduct a comparative analysis of web search behaviors in Switzerland and Germany. For this aim, we rely on a combination of web tracking data and survey data collected over a period of 2 months from users in Germany (n = 558) and Switzerland (n = 563). We find that web search accounts for 13% of all desktop browsing, with the share being higher in Switzerland than in Germany. In over 50% of cases users clicked on the first search result, with over 97% of all clicks being made on the first page of search outputs. Most users rely on Google when conducting searches, with some differences observed in users' preferences for other engines across demographic groups. Further, we observe differences in the temporal patterns of web search use between women and men, marking the necessity of disaggregating data by gender in observational studies regarding online information seeking behaviors. Our findings highlight the contextual differences in web search behavior across countries and demographic groups that should be taken into account when examining search behavior and the potential effects of web search result quality on societies and individuals.
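The click-share figures quoted in this abstract are simple proportions over a click log. The sketch below shows the arithmetic on an invented list of clicked result ranks; the data and the assumption of ten results per page are hypothetical and do not come from the study.

```python
from collections import Counter

# Hypothetical click log: rank of the clicked result (1 = top result).
clicked_ranks = [1, 1, 3, 1, 2, 12, 1, 5, 1, 2]

# Assumption for illustration: ten results per page of search output.
RESULTS_PER_PAGE = 10

rank_counts = Counter(clicked_ranks)

# Share of clicks that landed on the first result.
first_result_share = rank_counts[1] / len(clicked_ranks)

# Share of clicks that landed anywhere on the first page.
first_page_share = (
    sum(rank <= RESULTS_PER_PAGE for rank in clicked_ranks) / len(clicked_ranks)
)
```

Computing the same two shares separately per country or per demographic group would give the kind of disaggregated comparison the abstract argues for.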
Title: You are how (and where) you search? Comparative analysis of web search behavior using web tracking data. Journal of Computational Social Science, pp. 1-16. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10155157/pdf/
Pub Date: 2023-04-11 DOI: 10.1007/s42001-023-00204-z
Hyunsun Kim-Hahm
Title: Computational approach to studying media coverage of organizations. Journal of Computational Social Science.
Pub Date: 2023-04-04 DOI: 10.1007/s42001-023-00205-y
Alexey Bessudnov, Denis Tarasov, Viacheslav Panasovets, V. Kostenko, I. Smirnov, V. Uspenskiy
Title: Predicting perceived ethnicity with data on personal names in Russia. Journal of Computational Social Science.
Pub Date: 2023-04-04 DOI: 10.1007/s42001-023-00200-3
Johannes Langguth, Daniel Thilo Schroeder, Petra Filkuková, Stefan Brenner, Jesper Phillips, Konstantin Pogorelov
The COVID-19 pandemic has been accompanied by a surge of misinformation on social media, which covered a wide range of topics and contained many competing narratives, including conspiracy theories. To study such conspiracy theories, we created a dataset of 3495 tweets, manually labeling the stance of each tweet with respect to 12 different conspiracy topics. The dataset thus contains almost 42,000 labels, each of which was determined by majority among three expert annotators. The tweets were selected from COVID-19-related Twitter data spanning January 2020 to June 2021 using a list of 54 keywords. The dataset can be used to train machine-learning-based classifiers for both stance and topic detection, either individually or simultaneously. BERT was used successfully for the combined task. The dataset can also be used to further study the prevalence of different conspiracy narratives. To this end, we qualitatively analyze the tweets, discussing the structure of conspiracy narratives that are frequently found in the dataset. Furthermore, we illustrate the interconnection between the conspiracy categories as well as the keywords.
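The label count and the majority rule mentioned in this abstract can be checked with a short sketch. The tweet and topic counts come from the abstract; the stance names and the handling of three-way disagreement (returning `None`) are assumptions made for illustration, since the abstract does not say how such cases were resolved.

```python
from collections import Counter

N_TWEETS = 3495  # from the abstract
N_TOPICS = 12    # from the abstract

# One stance label per (tweet, topic) pair: 41,940, i.e. "almost 42,000".
total_labels = N_TWEETS * N_TOPICS

def majority_label(votes):
    """Return the label chosen by more than half of the annotators, else None."""
    label, count = Counter(votes).most_common(1)[0]
    return label if count > len(votes) / 2 else None

# Hypothetical stance names; the dataset's actual label scheme may differ.
example = majority_label(["promotes", "discusses", "promotes"])
no_majority = majority_label(["promotes", "discusses", "unrelated"])
```

With three annotators, any label that at least two of them agree on wins; only a three-way split leaves the label undecided under this rule.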
Title: COCO: an annotated Twitter dataset of COVID-19 conspiracy theories. Journal of Computational Social Science, pp. 1-42. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10071453/pdf/
Pub Date: 2023-03-28 DOI: 10.1007/s42001-023-00203-0
Prateeksha Dawn Davidson, Thanujah Muniandy, Dhivya Karmegam
Vaccination has been a hot topic in the present COVID-19 context. The government, public health stakeholders and media are all concerned about how to get people vaccinated. The study was intended to explore the perceptions and emotions of Indian citizens toward the COVID-19 vaccine from Twitter messages. The tweets were collected over a period of 6 months, from mid-January to June 2021, using hashtags and keywords specific to India. Topics and emotions were extracted from the tweets using the Latent Dirichlet Allocation (LDA) method and the National Research Council (NRC) Lexicon, respectively. Engagement and reachability metrics were assessed by theme, sentiment and emotion. The hashtag frequency of COVID-19 vaccine brands was also identified and evaluated. Information regarding 'Co-WIN app and availability of vaccine' was widely discussed and also received the highest engagement and reachability among Twitter users. Among the various emotions, trust was expressed the most, which highlights the acceptance of vaccines among Indian citizens. The hashtag frequency of vaccine brands shows that Covishield was popular in March 2021, and Covaxin in April 2021. The results of the study will help stakeholders to use social media efficiently to disseminate COVID-19 vaccine information on popular concerns. This in turn will encourage citizens to be vaccinated and achieve herd immunity. A similar methodology can be adopted in future to understand the perceptions and concerns of people in emergency situations.
Supplementary information: The online version contains supplementary material available at 10.1007/s42001-023-00203-0.
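Lexicon-based emotion extraction of the kind this abstract describes amounts to looking up each word of a tweet in a word-to-emotion table and tallying the hits. The sketch below is a simplified illustration only: the toy lexicon entries and the example tweets are invented and are not taken from the NRC lexicon or from the study's data, and the study's actual preprocessing is not specified in the abstract.

```python
import re
from collections import Counter

# Toy stand-in for a word-emotion lexicon; these entries are illustrative
# and are NOT taken from the NRC lexicon itself.
TOY_LEXICON = {
    "safe": {"trust"},
    "protect": {"trust"},
    "afraid": {"fear"},
    "shortage": {"fear", "sadness"},
}

def emotion_counts(tweets, lexicon):
    """Count how often each emotion is triggered by lexicon words in the tweets."""
    counts = Counter()
    for text in tweets:
        # Lowercase and keep alphabetic tokens only (a deliberately crude tokenizer).
        for token in re.findall(r"[a-z]+", text.lower()):
            counts.update(lexicon.get(token, ()))
    return counts

# Invented example tweets.
tweets = [
    "Got my first dose today, feeling safe",
    "Vaccines protect us, stay safe",
    "Afraid of a vaccine shortage in my city",
]
counts = emotion_counts(tweets, TOY_LEXICON)
dominant_emotion = counts.most_common(1)[0][0]
```

A word may map to several emotions, which is why each lexicon value is a set; the dominant emotion is simply the one with the highest tally.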
Title: Perception of COVID-19 vaccination among Indian Twitter users: computational approach. Journal of Computational Social Science, pp. 1-20. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10047476/pdf/
To design policies and implement measures that address the problems people face during a pandemic, it is critical to have a clear view of the problems people are freely talking about. One way is to analyze social media feeds, e.g., tweets, which have become one of the primary ways people express their views on socioeconomic issues and on the on-ground effectiveness of measures adopted to address them. In this work, we attempt to uncover the socioeconomic issues that give rise to negative and positive sentiments, and their trends across geographies over the course of one year of the pandemic. We also try to identify similarities and differences in opinions across gender as the crisis progresses. Many previous works have analyzed sentiments in the context of vaccines, fatalities, and lockdowns; however, socioeconomic issues have not received full attention. We found that sentiments with respect to the economy were negative across geographies at the start of the pandemic. Thereafter, sentiments gradually shifted in a positive direction, reflecting a sense of improvement in the situation. Females appeared to have slightly different concerns and hopes than males, and people across the globe expressed positive sentiments around the New Year. Finally, this work, together with many similar works on social media analysis, gives grounds for wide-scale adoption of geo-temporal sentiment trend analysis of social media as a tool for uncovering key concerns and the effectiveness of measures.
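A geo-temporal sentiment trend of the kind this abstract describes can be sketched as a mean sentiment score per region and month. The regions, months, and scores below are invented for illustration, and the abstract does not say which sentiment scorer or aggregation the authors used, so the `[-1, 1]` score range and the monthly averaging are assumptions.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical scored tweets: (region, "YYYY-MM", sentiment in [-1, 1]).
scored = [
    ("IN", "2020-04", -0.5), ("IN", "2020-04", -0.25),
    ("IN", "2021-01", 0.5),
    ("US", "2020-04", -0.75),
    ("US", "2021-01", 0.25), ("US", "2021-01", 0.75),
]

def sentiment_trend(rows):
    """Mean sentiment per region, as a month-sorted list of (month, mean)."""
    buckets = defaultdict(lambda: defaultdict(list))
    for region, month, score in rows:
        buckets[region][month].append(score)
    return {
        region: [(m, mean(scores)) for m, scores in sorted(months.items())]
        for region, months in buckets.items()
    }

trend = sentiment_trend(scored)
```

Adding a further grouping key, such as inferred gender, would extend the same bucketing to the gender comparison mentioned in the abstract.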
Title: Geo-sentiment trends analysis of tweets in context of economy and employment during COVID-19. Authors: Narendranath Sukhavasi, Janardan Misra, Vikrant Kaulgud, Sanjay Podder. Pub Date: 2023-03-23 DOI: 10.1007/s42001-023-00201-2. Journal of Computational Social Science, pp. 1-31. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10035975/pdf/
Pub Date: 2023-03-22 DOI: 10.1007/s42001-023-00202-1
M. Lokanan
Title: Incorporating machine learning in dispute resolution and settlement process for financial fraud. Journal of Computational Social Science.