Journal of Computational Social Science: Latest Articles

Estimating time-series changes in social sentiment @Twitter in U.S. metropolises during the COVID-19 pandemic.
IF 3.2 Q2 SOCIAL SCIENCES, MATHEMATICAL METHODS Pub Date : 2023-01-01 DOI: 10.1007/s42001-022-00186-4
Ryuichi Saito, Shinichiro Haruyama

Since early 2020, the global coronavirus pandemic has strained economic activities and traditional lifestyles. For such emergencies, our paper proposes a social sentiment estimation model that changes in response to infection conditions and state government orders. By designing mediation keywords that do not directly evoke coronavirus, it is possible to observe sentiment waveforms that vary as confirmed cases increase or decrease and as behavioral restrictions are ordered or lifted over a long period. The model demonstrates guaranteed performance with transformer-based neural network models and has been validated in New York City, Los Angeles, and Chicago, given that coronavirus infections explode in overcrowded cities. The time-series of the extracted social sentiment reflected the infection conditions of each city during the 2-year period from pre-pandemic to the new normal and shows a concurrency of waveforms common to the three cities. The methods of this paper could be applied not only to analysis of the COVID-19 pandemic but also to analyses of a wide range of emergencies and they could be a policy support tool that complements traditional surveys in the future.
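As a rough illustration of what a "sentiment waveform" can look like computationally, the sketch below averages per-tweet sentiment scores by day and smooths them with a trailing moving average. It is not the authors' method: the abstract only says transformer-based models were used, so the scores here are assumed to come from some classifier with outputs in [-1, 1], and the function names, window size, and toy data are all hypothetical.

```python
from collections import defaultdict
from datetime import date


def daily_mean_sentiment(records):
    """Average sentiment per day.

    records: iterable of (date, score) pairs; scores are assumed to lie
    in [-1, 1] and to come from some upstream classifier (hypothetical).
    Returns a list of (date, mean_score) sorted by date.
    """
    buckets = defaultdict(list)
    for day, score in records:
        buckets[day].append(score)
    return [(day, sum(v) / len(v)) for day, v in sorted(buckets.items())]


def rolling_mean(series, window):
    """Trailing moving average over a sorted (date, value) series.

    Days with no tweets are simply absent from the series, so the window
    counts observations rather than calendar days.
    """
    out = []
    for i in range(len(series)):
        chunk = [v for _, v in series[max(0, i - window + 1): i + 1]]
        out.append((series[i][0], sum(chunk) / len(chunk)))
    return out


# Toy, made-up scores for one city.
records = [
    (date(2020, 3, 1), 0.5), (date(2020, 3, 1), 0.1),
    (date(2020, 3, 2), -0.2),
    (date(2020, 3, 3), -0.6), (date(2020, 3, 3), -0.4),
]
daily = daily_mean_sentiment(records)
smooth = rolling_mean(daily, window=2)
for day, value in smooth:
    print(day.isoformat(), round(value, 3))
```

Running the same aggregation per city and plotting the smoothed series side by side would be one simple way to look for the kind of cross-city concurrency the abstract mentions.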

Volume (as listed): 6 1; pages: 359-388; PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9660099/pdf/
Citations: 1
A comparison of approaches for imbalanced classification problems in the context of retrieving relevant documents for an analysis.
IF 3.2 Q2 SOCIAL SCIENCES, MATHEMATICAL METHODS Pub Date : 2023-01-01 DOI: 10.1007/s42001-022-00191-7
Sandra Wankmüller

One of the first steps in many text-based social science studies is to retrieve documents that are relevant for an analysis from large corpora of otherwise irrelevant documents. The conventional approach in social science to address this retrieval task is to apply a set of keywords and to consider those documents to be relevant that contain at least one of the keywords. But the application of incomplete keyword lists has a high risk of drawing biased inferences. More complex and costly methods such as query expansion techniques, topic model-based classification rules, and active as well as passive supervised learning could have the potential to more accurately separate relevant from irrelevant documents and thereby reduce the potential size of bias. Yet, whether applying these more expensive approaches increases retrieval performance compared to keyword lists at all, and if so, by how much, is unclear as a comparison of these approaches is lacking. This study closes this gap by comparing these methods across three retrieval tasks associated with a data set of German tweets (Linder in SSRN, 2017. 10.2139/ssrn.3026393), the Social Bias Inference Corpus (SBIC) (Sap et al. in Social bias frames: reasoning about social and power implications of language. In: Jurafsky et al. (eds) Proceedings of the 58th annual meeting of the association for computational linguistics. Association for Computational Linguistics, p 5477-5490, 2020. 10.18653/v1/2020.acl-main.486), and the Reuters-21578 corpus (Lewis in Reuters-21578 (Distribution 1.0). [Data set], 1997. http://www.daviddlewis.com/resources/testcollections/reuters21578/). Results show that query expansion techniques and topic model-based classification rules in most studied settings tend to decrease rather than increase retrieval performance. Active supervised learning, however, if applied on a not too small set of labeled training instances (e.g. 1000 documents), reaches a substantially higher retrieval performance than keyword lists.
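The keyword-list baseline described in the abstract (a document counts as relevant if it contains at least one keyword) can be sketched in a few lines. This is only an illustration under stated assumptions, not the study's code: the function names, the whole-word, case-insensitive matching rule, and the toy documents are all hypothetical.

```python
import re


def keyword_retrieve(documents, keywords):
    """Return indices of documents that contain at least one keyword.

    Matching is case-insensitive and on whole words; both choices are
    assumptions made for this sketch.
    """
    patterns = [
        re.compile(r"\b" + re.escape(k) + r"\b", re.IGNORECASE)
        for k in keywords
    ]
    return [
        i for i, doc in enumerate(documents)
        if any(p.search(doc) for p in patterns)
    ]


def precision_recall(retrieved, relevant):
    """Precision and recall of a retrieved set against known relevant items."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Toy, made-up corpus: documents 0 and 1 are the relevant ones.
docs = [
    "Refugees arrive at the border",       # relevant, matched
    "Asylum seekers wait for a decision",  # relevant, missed by the list
    "Football results from the weekend",   # irrelevant
    "Border collie wins dog show",         # irrelevant, but matched
]
relevant = [0, 1]
hits = keyword_retrieve(docs, ["refugees", "border"])
print(hits, precision_recall(hits, relevant))
```

The toy list both misses a relevant document and pulls in an irrelevant one, which is the kind of incompleteness the abstract warns can bias downstream inferences.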

Volume (as listed): 6 1; pages: 91-163; PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9762672/pdf/
Citations: 1
A scoping review on the use of natural language processing in research on political polarization: trends and research prospects.
IF 2 Q2 SOCIAL SCIENCES, MATHEMATICAL METHODS Pub Date : 2023-01-01 Epub Date: 2022-12-19 DOI: 10.1007/s42001-022-00196-2
Renáta Németh

As part of the "text-as-data" movement, Natural Language Processing (NLP) provides a computational way to examine political polarization. We conducted a methodological scoping review of studies published since 2010 (n = 154) to clarify how NLP research has conceptualized and measured political polarization, and to characterize the degree of integration of the two different research paradigms that meet in this research area. We identified biases toward US context (59%), Twitter data (43%) and machine learning approach (33%). Research covers different layers of the political public sphere (politicians, experts, media, or the lay public), however, very few studies involved more than one layer. Results indicate that only a few studies made use of domain knowledge and a high proportion of the studies were not interdisciplinary. Those studies that made efforts to interpret the results demonstrated that the characteristics of political texts depend not only on the political position of their authors, but also on other often-overlooked factors. Ignoring these factors may lead to overly optimistic performance measures. Also, spurious results may be obtained when causal relations are inferred from textual data. Our paper provides arguments for the integration of explanatory and predictive modeling paradigms, and for a more interdisciplinary approach to polarization research.

Supplementary information: The online version contains supplementary material available at 10.1007/s42001-022-00196-2.

Volume (as listed): 6 1; pages: 289-313; PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9762668/pdf/
Citations: 0
School dropout prediction and feature importance exploration in Malawi using household panel data: machine learning approach
IF 3.2 Q2 SOCIAL SCIENCES, MATHEMATICAL METHODS Pub Date : 2022-12-13 DOI: 10.1007/s42001-022-00195-3
Hazal Colak Oz, Çiçek Güven, Gonzalo Nápoles
Volume (as listed): 22 1; pages: 245-287
Citations: 2
Evaluating algorithmic homeless service allocation
IF 3.2 Q2 SOCIAL SCIENCES, MATHEMATICAL METHODS Pub Date : 2022-12-10 DOI: 10.1007/s42001-022-00190-8
Wenting Qi, C. Chelmis
Volume (as listed): 24 1; pages: 59-89
Citations: 0
User behaviors in consumer-generated media under monetary reward schemes
IF 3.2 Q2 SOCIAL SCIENCES, MATHEMATICAL METHODS Pub Date : 2022-11-12 DOI: 10.1007/s42001-022-00187-3
Yutaro Usui, F. Toriumi, T. Sugawara
Volume (as listed): 22 1; pages: 389-409
Citations: 2
Group-specific behavior change following terror attacks
IF 3.2 Q2 SOCIAL SCIENCES, MATHEMATICAL METHODS Pub Date : 2022-11-12 DOI: 10.1007/s42001-022-00188-2
Jonas L. Juul, Laura Alessandretti, J. Dammeyer, Ingo Zettler, Sune Lehmann, J. Mathiesen
Volume (as listed): 11 1; pages: 1-18
Citations: 0
Identification of intimate partner violence from free text descriptions in social media.
IF 2 Q2 SOCIAL SCIENCES, MATHEMATICAL METHODS Pub Date : 2022-11-01 Epub Date: 2022-05-07 DOI: 10.1007/s42001-022-00166-8
Phan Trinh Ha, Rhea D'Silva, Ethan Chen, Mehmet Koyutürk, Günnur Karakurt

Intimate partner violence (IPV) is a significant public health problem that adversely affects the well-being of victims. IPV is often under-reported and non-physical forms of violence may not be recognized as IPV, even by victims. With the increasing popularity of social media and due to the anonymity provided by some of these platforms, people feel comfortable sharing descriptions of their relationship problems in social media. The content generated in these platforms can be useful in identifying IPV and characterizing the prevalence, causes, consequences, and correlates of IPV in broad populations. However, these descriptions are in the form of free text and no corpus of labeled data is available to perform large-scale computational and statistical analyses. Here, we use data from established questionnaires that are used to collect self-report data on IPV to train machine learning models to predict IPV from free text. Using Universal Sentence Encoder (USE) along with multiple machine learning algorithms (random forest, SVM, logistic regression, Naïve Bayes), we develop DetectIPV, a tool for detecting IPV in free text. Using DetectIPV, we comprehensively characterize the predictability of different types of violence (physical abuse, emotional abuse, sexual abuse) from free text. Our results show that a general model that is trained using examples of all violence types can identify IPV from free text with area under the ROC curve (AUROC) 89%. We also train type-specific models and observe that physical abuse can be identified with greatest accuracy (AUROC 98%), while sexual abuse can be identified with high precision but relatively low recall. While our results indicate that the prediction of emotional abuse is the most challenging, DetectIPV can identify emotional abuse with AUROC above 80%. These results establish DetectIPV as a tool that can be used to reliably detect IPV in the context of various applications, ranging from flagging social media posts to detecting IPV in large text corpuses for research purposes. DetectIPV is available as a web service at https://www.ipvlab.case.edu/ipvdetect/.
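For readers less familiar with the metrics quoted in the abstract, the sketch below computes AUROC, precision, and recall from scratch on a handful of made-up classifier scores. It does not use DetectIPV or its data; the function names, labels, scores, and the 0.8 threshold are hypothetical and chosen only to show how a classifier can have high precision and low recall at the same time.

```python
def auroc(labels, scores):
    """Area under the ROC curve via its rank interpretation.

    Equals the probability that a randomly chosen positive example gets a
    higher score than a randomly chosen negative one (ties count as 0.5).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


def precision_recall(labels, scores, threshold):
    """Precision and recall when predicting positive for score >= threshold."""
    predicted = [s >= threshold for s in scores]
    tp = sum(1 for y, p in zip(labels, predicted) if y == 1 and p)
    fp = sum(1 for y, p in zip(labels, predicted) if y == 0 and p)
    fn = sum(1 for y, p in zip(labels, predicted) if y == 1 and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy, made-up data: 1 = text labeled as describing abuse, 0 = not.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.4, 0.35, 0.5, 0.2, 0.1]

print("AUROC:", round(auroc(labels, scores), 3))
print("precision, recall at 0.8:", precision_recall(labels, scores, 0.8))
```

With a strict threshold the only flagged example is a true positive, so precision is perfect while two of the three positives are missed. AUROC, by contrast, does not depend on any single threshold.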

Volume (as listed): 32 1; pages: 1207-1233; PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12040337/pdf/
Citations: 0
The gameability of redistricting criteria
IF 3.2 Q2 SOCIAL SCIENCES, MATHEMATICAL METHODS Pub Date : 2022-10-26 DOI: 10.1007/s42001-022-00180-w
Amariah Becker, Dara Gold
Volume (as listed): 40 1; pages: 1735-1777
Citations: 5
Lingual markers for automating personality profiling: background and road ahead
IF 3.2 Q2 SOCIAL SCIENCES, MATHEMATICAL METHODS Pub Date : 2022-09-22 DOI: 10.1007/s42001-022-00184-6
Mohmad Azhar Teli, M. Chachoo
Volume (as listed): 3 1; pages: 1663-1707
Citations: 1