Pub Date: 2023-10-17 DOI: 10.1007/s42001-023-00224-9
Marian-Andrei Rizoiu, Tianyu Wang, Gabriela Ferraro, Hanna Suominen
Abstract: Today, the internet is an integral part of our daily lives, enabling people to be more connected than ever before. However, this greater connectivity and access to information increase exposure to harmful content, such as cyber-bullying and cyber-hatred. Models based on machine learning and natural language offer a way to make online platforms safer by identifying hate speech in web text autonomously. However, the main difficulty is annotating a sufficiently large number of examples to train these models. This paper uses a transfer learning technique to leverage two independent datasets jointly and builds a single representation of hate speech. We build an interpretable two-dimensional visualization tool of the constructed hate speech representation (dubbed the Map of Hate) in which multiple datasets can be projected and comparatively analyzed. The hateful content is annotated differently across the two datasets (racist and sexist in one dataset, hateful and offensive in another). However, the common representation successfully projects the harmless class of both datasets into the same space and can be used to uncover labeling errors (false positives). We also show that the joint representation boosts prediction performance when only a limited amount of supervision is available. These methods and insights hold the potential for safer social media and reduce the need to expose human moderators and annotators to distressing online messaging.
Title: Transfer learning for hate speech detection in social media
Journal: Journal of Computational Social Science
Pub Date: 2023-10-09 DOI: 10.1007/s42001-023-00226-7
Eva Dziadula, John O’Hare, Carl Colglazier, Marie C. Clay, Paul Brenner
Title: Modeling economic migration on a global scale
Journal: Journal of Computational Social Science
Pub Date: 2023-09-26 DOI: 10.1007/s42001-023-00225-8
Nicole Schwitter
Abstract: Wikipedia is one of the most visited websites worldwide. Thousands of volunteers contribute to it daily, making it an example of how productive non-market collaboration on a very wide scale is not only viable but also sustainable. Wikipedia’s freely available data on the online actions conducted make it a popular source of data, particularly for computer scientists and computational social scientists. This data brief presents the dewiki meetup dataset, which covers the offline component of the German-language version of the online encyclopaedia Wikipedia: informal offline gatherings between Wikipedia contributors. These gatherings are organised online, and information about who attends them, where they take place and what happened at them is shared publicly. The dewiki meetup dataset covers almost 20 years of offline activity of the German-language Wikipedia, containing 4418 organised meetups with information on attendees, apologies, date and place of meeting, and recorded minutes. It is a valuable source of data for social science research: it captures the development over time of the offline network of one of the largest and most sustainable online public goods and communities. The data can easily be merged with online activity data on Wikipedia, which allows us to bridge the gap between offline and online behaviour.
Title: Bridging the offline and online: 20 years of offline meeting data of the German-language Wikipedia
Journal: Journal of Computational Social Science
Pub Date: 2023-09-07 DOI: 10.1007/s42001-023-00221-y
Shinya Obayashi, Misato Inaba, Tetsushi Ohdaira, T. Kiyonari
Title: It’s my turn: empirical evidence of upstream indirect reciprocity in society through a quasi-experimental approach
Journal: Journal of Computational Social Science
Pub Date: 2023-08-27 DOI: 10.1007/s42001-023-00222-x
Thilagavathi Ramamoorthy, Bagavandas Mappillairaju
Title: Tweet topics on cancer among Indian Twitter users—computational approach using latent Dirichlet allocation topic modelling
Journal: Journal of Computational Social Science
Pub Date: 2023-08-18 DOI: 10.1007/s42001-023-00220-z
J. Valinejad, Zhen Guo, Jin-Hee Cho, I. Chen
Title: Social media-based social–psychological community resilience analysis of five countries on COVID-19
Journal: Journal of Computational Social Science
Pub Date: 2023-07-24 DOI: 10.1007/s42001-023-00218-7
Pablo M. Flores, Martin Hilbert
Title: Temporal communication dynamics in the aftermath of large-scale upheavals: do digital footprints reveal a stage model?
Journal: Journal of Computational Social Science
Pub Date: 2023-07-13 DOI: 10.1007/s42001-023-00219-6
Günther Jikeli, Katharina Soemer
Title: The value of manual annotation in assessing trends of hate speech on social media: was antisemitism on the rise during the tumultuous weeks of Elon Musk’s Twitter takeover?
Journal: Journal of Computational Social Science
Pub Date: 2023-07-07 DOI: 10.1007/s42001-023-00217-8
Daryl R. DeFord, Elliot Kimsey, R. Zerr
Title: Multi-balanced redistricting
Journal: Journal of Computational Social Science
Pub Date: 2023-06-30 DOI: 10.1007/s42001-023-00216-9
Thomas Feliciani, J. Tolsma, A. Flache
Title: Ethnic segregation and spatial patterns of attitudes: studying the link using register data and social simulation
Journal: Journal of Computational Social Science