
2019 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM): Latest Papers

CAB-NC: The Correspondence Analysis Based Network Clustering Method
M. Kimura
Finding clusters in a network is practically important in many applications and has been studied by many researchers. The most commonly used methods are spectral clustering and Newman's modularity maximization. However, there has been no unified view of them. In this study, we introduce a new guiding principle based on correspondence analysis to obtain nodes' coordinates, and we discuss its equivalence to spectral clustering and its relationship to Newman's modularity.
DOI: 10.1145/3341161.3342944 (https://doi.org/10.1145/3341161.3342944). Published: 2019-08-01.
Citations: 2
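The link that the CAB-NC abstract draws between correspondence analysis, spectral clustering, and modularity can be checked numerically on a toy graph. The sketch below uses textbook correspondence analysis applied to a symmetric adjacency matrix; it is an assumption that this matches the paper's formulation, and all function names are hypothetical. For such a matrix, the standardized residual matrix equals the degree-normalized modularity matrix, and its leading eigenvalue equals the second-largest eigenvalue of the normalized adjacency matrix used in spectral clustering.

```python
import numpy as np

def standardized_residuals(A):
    """Standardized residual matrix of correspondence analysis for adjacency A."""
    A = np.asarray(A, dtype=float)
    P = A / A.sum()                      # correspondence matrix
    r = P.sum(axis=1)                    # node masses: degree / (2 * edges)
    return (P - np.outer(r, r)) / np.sqrt(np.outer(r, r))

def ca_node_coordinates(A, k=1):
    """k leading eigenvalues of the residual matrix and node coordinates.

    Assumes A is symmetric and has no isolated nodes."""
    A = np.asarray(A, dtype=float)
    r = A.sum(axis=1) / A.sum()
    vals, vecs = np.linalg.eigh(standardized_residuals(A))
    top = np.argsort(vals)[::-1][:k]     # largest algebraic eigenvalues first
    return vals[top], vecs[:, top] / np.sqrt(r)[:, None]

def scaled_modularity_matrix(A):
    """D^{-1/2} (A - d d^T / 2m) D^{-1/2}, with d the degree vector."""
    A = np.asarray(A, dtype=float)
    d = A.sum(axis=1)
    B = A - np.outer(d, d) / d.sum()     # Newman's modularity matrix
    return B / np.sqrt(np.outer(d, d))

def normalized_adjacency_eigenvalues(A):
    """Eigenvalues of D^{-1/2} A D^{-1/2}, in descending order."""
    A = np.asarray(A, dtype=float)
    d = A.sum(axis=1)
    return np.sort(np.linalg.eigvalsh(A / np.sqrt(np.outer(d, d))))[::-1]

# Toy graph: two triangles (0,1,2) and (3,4,5) joined by the edge 2-3.
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1.0

vals, coords = ca_node_coordinates(A, k=1)
print("leading CA eigenvalue:", vals[0])
print("first CA coordinate per node:", coords[:, 0])
```

On this graph the sign of the first coordinate splits the nodes into the two triangles, which is the same split a two-way normalized spectral cut would give.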
Artificial Intelligence for ETF Market Prediction and Portfolio Optimization
Min-Yuh Day, Jian-Ting Lin
In asset allocation and time-series forecasting studies, few have examined how different machine learning and deep learning models change investment returns and the optimal asset allocation. To fill this research gap, we develop a robo-advisor with different machine learning and deep learning forecasting methodologies and use the forecasting results in the portfolio optimization model to support investors in making decisions. This research integrates several technologies, including machine learning, data analytics, and portfolio optimization. We focus on developing a robo-advisor framework that integrates machine learning and deep learning approaches with the portfolio optimization algorithm, using our predicted trends and results to replace the historical data and investor views. We eliminate extreme fluctuations to keep our trading within the acceptable risk coefficient. Accordingly, we can minimize the investment risk and reach a relatively stable return. We compared different algorithms and found that the F1 score of the model prediction significantly affects the result of the optimized portfolio. We used our deep learning model with the highest winning rate and combined its predictions with the portfolio optimization algorithm to reach a 12% annual return, which outperforms both our benchmark index, 0050.TW, and the optimized portfolio built from historical data.
DOI: 10.1145/3341161.3344822 (https://doi.org/10.1145/3341161.3344822). Published: 2019-08-01.
Citations: 12
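The ETF abstract does not say which portfolio optimizer the robo-advisor uses. As a hypothetical stand-in, the sketch below feeds model-predicted returns (in place of historical mean returns) into an unconstrained mean-variance allocation, the tangency portfolio with a zero risk-free rate. The function name and the numbers are illustrative assumptions, not values from the paper.

```python
import numpy as np

def tangency_weights(expected_returns, covariance):
    """Weights proportional to Sigma^{-1} mu, normalized to sum to 1.

    Assumption: zero risk-free rate, no long-only or leverage constraints."""
    mu = np.asarray(expected_returns, dtype=float)
    sigma = np.asarray(covariance, dtype=float)
    raw = np.linalg.solve(sigma, mu)     # solves Sigma * raw = mu
    total = raw.sum()
    if np.isclose(total, 0.0):
        raise ValueError("weights cannot be normalized: they sum to zero")
    return raw / total

# Hypothetical forecasts for two ETFs, standing in for a model's predictions.
predicted_returns = [0.08, 0.03]
covariance = [[0.04, 0.00],
              [0.00, 0.01]]

weights = tangency_weights(predicted_returns, covariance)
print(weights)
```

Swapping `predicted_returns` for historical means reproduces the classical baseline, which is the comparison the abstract describes.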
Surveying public opinion using label prediction on social media data
Marija Stanojevic, Jumanah Alshehri, Z. Obradovic
In this study, a procedure is proposed for surveying public opinion from large, domain-specific social media text data to minimize the difficulties associated with modeling public behavior. Strategies for labeling posts relevant to a topic are discussed. A two-part framework is proposed in which semi-automatic labeling is applied to a small subset of posts, referred to below as the “seed”. This seed is used as the basis for semi-supervised labeling of the rest of the data. The hypothesis is that the proposed method will achieve better labeling performance than existing classification models when applied to small amounts of labeled data. The seed is labeled using posts of users with a known and consistent view on the topic. A semi-supervised multi-class prediction model labels the remaining data iteratively. In each iteration, it adds context-label pairs to the training set if their softmax-based label probabilities are above the threshold. The proposed method is characterized on four datasets by comparison with three popular text modeling algorithms (n-grams + tf-idf, fastText, VDCNN) for different sizes of labeled seeds (5,000 and 50,000 posts) and for several label-prediction significance thresholds. Our proposed semi-supervised method outperformed the alternative algorithms by capturing additional contexts from the unlabeled data. The accuracy of the algorithm increased by 3-10% when a larger fraction of the data was used as the seed. For the smaller seed, a lower label probability threshold was clearly a better choice, while for the larger seed no predominant threshold was observed. The proposed framework, using the fastText library for efficient text classification and representation learning, achieved the best results for the smaller seed, while VDCNN wrapped in the proposed framework achieved the best results for the bigger seed. Performance was negatively influenced by the number of classes. Finally, the model was applied to characterize a biased dataset of opinions related to gun control/rights advocacy. The proposed semi-automatic seed labeling was used to label 8,448 Twitter posts of 171 advocates for gun control/rights. In this application, our approach performed better than existing models, achieving 96.5% accuracy and a 0.68 F1 score.
DOI: 10.1145/3341161.3342861 (https://doi.org/10.1145/3341161.3342861). Published: 2019-08-01.
Citations: 7
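The iterative step in the label-prediction abstract (add an item to the training set only when its softmax label probability is above a threshold) can be sketched as a generic self-training loop. This is an illustration under stated assumptions, not the authors' implementation: the paper wraps fastText or VDCNN, whereas the classifier here is a nearest-centroid model over numeric feature vectors, chosen only to keep the example self-contained. All names are hypothetical.

```python
import numpy as np

def softmax(scores):
    """Row-wise softmax with the usual max-shift for numerical stability."""
    shifted = scores - scores.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def centroid_probs(X_train, y_train, X, n_classes):
    """Stand-in classifier: softmax over negative squared distances to class centroids.

    Assumes every class has at least one training example."""
    centroids = np.stack([X_train[y_train == c].mean(axis=0) for c in range(n_classes)])
    dist2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return softmax(-dist2)

def self_train(X_seed, y_seed, X_unlabeled, n_classes, threshold=0.9, max_iter=10):
    """Grow the labeled set with predictions whose probability exceeds the threshold."""
    X_train, y_train = X_seed.copy(), y_seed.copy()
    remaining = X_unlabeled.copy()
    for _ in range(max_iter):
        if len(remaining) == 0:
            break
        probs = centroid_probs(X_train, y_train, remaining, n_classes)
        confident = probs.max(axis=1) > threshold
        if not confident.any():
            break                        # nothing new to add, stop early
        X_train = np.vstack([X_train, remaining[confident]])
        y_train = np.concatenate([y_train, probs[confident].argmax(axis=1)])
        remaining = remaining[~confident]
    return X_train, y_train, remaining

# Toy data: one seed point per class, plus four clear points and one ambiguous point.
X_seed = np.array([[0.0, 0.0], [10.0, 10.0]])
y_seed = np.array([0, 1])
X_unlabeled = np.array([[0.0, 1.0], [1.0, 0.0], [10.0, 9.0], [9.0, 10.0], [5.0, 5.0]])

X_train, y_train, remaining = self_train(X_seed, y_seed, X_unlabeled, n_classes=2)
print(y_train, remaining)
```

Lowering `threshold` admits more pseudo-labels per iteration at the cost of noisier labels, which is the trade-off behind the abstract's observation that a lower threshold worked better for the smaller seed.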
Computing Node Clustering Coefficients Securely
K. Areekijseree, Y. Tang, S. Soundarajan
When performing any analysis task, some information may be leaked or scattered among individuals who may not be willing to share their information (e.g., the number of an individual's friends and who they are). Secure multi-party computation (MPC) allows individuals to jointly perform any computation without revealing each individual's input. Here, we present two novel secure frameworks that allow a node to securely compute its clustering coefficient, and we evaluate the trade-off between efficiency and security of several proposed instantiations. Our results show that the cost of secure computation depends heavily on network structure.
DOI: 10.1145/3341161.3342946 (https://doi.org/10.1145/3341161.3342946). Published: 2019-08-01.
Citations: 1
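For reference, the quantity that the secure frameworks in the clustering-coefficient abstract protect is the standard local clustering coefficient: the fraction of pairs of a node's neighbors that are themselves connected. The sketch below computes it in the clear, with no privacy protection at all; it only shows which inputs (each node's neighbor list) an MPC protocol would have to keep hidden. The function name and toy graph are hypothetical.

```python
from itertools import combinations

def local_clustering(adj, v):
    """Local clustering coefficient of node v.

    adj maps each node to the set of its neighbors (undirected graph).
    Returns 0.0 for nodes with fewer than two neighbors."""
    neighbors = adj[v]
    k = len(neighbors)
    if k < 2:
        return 0.0
    # Count edges among v's neighbors, i.e. triangles through v.
    links = sum(1 for a, b in combinations(neighbors, 2) if b in adj[a])
    return 2.0 * links / (k * (k - 1))

# Toy graph: triangle 0-1-2 plus a pendant node 3 attached to node 0.
adj = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}

for node in adj:
    print(node, local_clustering(adj, node))
```

Computing this for node `v` requires knowing which of `v`'s neighbors are linked to each other, which is exactly the information the abstract says individuals may be unwilling to share.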
Show me your friends, and I will tell you whom you vote for: Predicting voting behavior in social networks
Lihi Idan, J. Feigenbaum
Increasing use of social media in campaigns raises the question of whether one can predict the voting behavior of social-network users who do not disclose their political preferences in their online profiles. Prior work on this task only considered users who generate politically oriented content or voluntarily disclose their political preferences online. We avoid this bias by using a novel Bayesian-network model that combines demographic, behavioral, and social features; we apply this novel approach to the 2016 U.S. Presidential election. Our model is highly extensible and facilitates the use of incomplete datasets. Furthermore, our work is the first to apply a semi-supervised approach for this task: Using the EM algorithm, we combine labeled survey data with unlabeled Facebook data, thus obtaining larger datasets as well as addressing self-selection bias.
DOI: 10.1145/3341161.3343676 (https://doi.org/10.1145/3341161.3343676). Published: 2019-08-01.
Citations: 13
Predicting Public Opinion on Drug Legalization: Social Media Analysis and Consumption Trends
F. Motlagh, Saeedeh Shekarpour, A. Sheth, K. Thirunarayan, M. Raymer
In this paper, we focus on the collection and analysis of relevant Twitter data on a state-by-state basis for (i) measuring public opinion on marijuana legalization by mining sentiment in Twitter data and (ii) determining the usage trends for six distinct types of marijuana. We overcome the challenges posed by the informal and ungrammatical nature of tweets to analyze a corpus of 306,835 relevant tweets collected over the four months preceding the November 2015 Ohio Marijuana Legalization ballot and the four months after the election, for all states in the US. Our analysis revealed two key insights: (i) people in states that have legalized recreational marijuana express greater positive sentiment about marijuana than people in states that have either legalized medicinal marijuana or have not legalized marijuana at all; (ii) states that have a high percentage of positive sentiment about marijuana are more inclined to authorize it (e.g., by allowing medical marijuana) or broaden its legal usage (e.g., by allowing recreational marijuana in addition to medical marijuana). Our analysis shows that social media can provide reliable information and can serve as an alternative to traditional polling of public opinion on drug use and epidemiology research.
DOI: 10.1145/3341161.3344380 (https://doi.org/10.1145/3341161.3344380). Published: 2019-08-01.
Citations: 1
Detecting Depressed Users in Online Forums
Anu Shrestha, Francesca Spezzano
Depression is the most common mental illness in the U.S., with 6.7% of all adults having experienced a major depressive episode. Unfortunately, depression extends to teens and young users as well, and researchers have observed an increasing rate in recent years (from 8.7% in 2005 to 11.3% in 2014 in adolescents and from 8.8% to 9.6% in young adults), especially among girls and women. People themselves are a barrier to fighting this disease, as they tend to hide their symptoms and do not receive treatment. However, protected by anonymity, they share their sentiments on the Web, looking for help. In this paper, we address the problem of detecting depressed users in online forums. We analyze user behavior in the ReachOut.com online forum, a platform providing a supportive environment for young people to discuss their everyday issues, including depression. We examine the linguistic style of user posts in combination with network-based features modeling how users connect in the forum. Our results show that network features are strong predictors of depressed users and that, by combining them with linguistic features of user posts, we can achieve an average precision of 0.78 (vs. 0.47 for a random classifier and 0.71 for linguistic features only) and perform better than related work (F1-measure of 0.63 vs. 0.50).
DOI: 10.1145/3341161.3343511 (https://doi.org/10.1145/3341161.3343511). Published: 2019-08-01.
Citations: 14
Integrating Neural and Syntactic Features on the Helpfulness Analysis of the Online Customer Reviews
Shih-Hung Wu, Jun-Wei Wang
Before purchasing a product online, customers often read the reviews posted by people who also bought the product. Customer reviews provide opinions and relevant information, such as comparisons among similar products or usage experiences with the product. Previous studies addressed predicting the helpfulness of customer reviews, that is, predicting the helpfulness voting results. However, the voting result of an online review is not constant over time, so predicting the voting result based on text analysis alone is not practical. Therefore, we collect the voting results of the same online customer review over time and observe whether the number of votes increases. We construct a dataset of 10,195 online reviews in six different product categories (Computer Hardware, Drink, Makeup, Pen, Shoes, and Toys) from Amazon.cn, together with the helpfulness voting results of the reviews, and monitor the helpfulness voting over six weeks. Experiments are conducted on the dataset to predict whether the helpfulness voting result of each review will increase. We propose a classification system that identifies the online reviews whose helpfulness votes will increase, based on a set of syntactic features and neural features trained via a CNN. The results show that integrating the syntactic features with the neural features yields better results.
DOI: 10.1145/3341161.3344825 (https://doi.org/10.1145/3341161.3344825). Published: 2019-08-01.
Citations: 5
Opioid Relapse Prediction with GAN
Zhou Yang, L. Nguyen, Fang Jin
Opioid addiction is a severe public health threat in the U.S., causing many deaths and social problems. Accurate relapse prediction is of practical importance for recovering patients, since it enables timely relapse prevention that helps patients stay clean. In this paper, we introduce a generative adversarial network (GAN) model to predict addiction relapses based on sentiment images and social influences. Experimental results on real social media data from Reddit.com demonstrate that the GAN model delivers better performance than comparable alternative techniques. The sentiment images generated by the model show that relapse is closely connected with two emotions, ‘joy’ and ‘negative’. This work is one of the first attempts to predict relapses using massive social media (Reddit.com) data and generative adversarial nets. The proposed method, combined with knowledge of social media mining, has the potential to revolutionize the practice of opioid addiction prevention and treatment.
DOI: https://doi.org/10.1145/3341161.3342951 | Published: 2019-08-01
Citations: 7
Fraudulent User Detection on Rating Networks Based on Expanded Balance Theory and GCNs
Wataru Kudo, Mao Nishiguchi, F. Toriumi
Rating platforms provide users with useful information on products or other users. However, fake ratings are sometimes generated by fraudulent users. In this paper, we tackle the task of fraudulent user detection on rating platforms. We propose an end-to-end framework based on Graph Convolutional Networks (GCNs) and expanded balance theory, which properly incorporates both the signs and directions of edges. Experimental results on four real-world datasets show that the proposed framework performs better than, or even best among, the compared methods in most settings. In particular, the framework shows remarkable stability in inductive settings, which correspond to detecting new fraudulent users on rating platforms. Furthermore, using expanded balance theory, we provide a new insight into user behavior in rating networks: fraudulent users form a faction to deal with negative ratings from other users. By using the proposed framework, the owner of a rating platform can detect fraudulent users earlier and constantly provide users with more credible information.
DOI: https://doi.org/10.1145/3341161.3342929 | Published: 2019-08-01
Citations: 2