Using Explainable AI to Understand Team Formation and Team Impact

Huimin Xu, Maytal Saar‐Tsechansky, Min Song, Ying Ding
DOI: 10.1002/pra2.804 (https://doi.org/10.1002/pra2.804)
Published: 2023-10-01 (Journal Article)

Abstract

Citation counts are widely regarded as a simple and direct indicator of a scientific paper's impact. This paper predicts papers' citations from two groups of team-related variables: team composition and team structure. Team composition includes team size, male/female dominance, academia/industry collaboration, the number of unique races, and the number of unique countries. Team structure comprises team power level and team power hierarchy. Team members' previous citation counts, H-index, number of previous collaborators, career age, and number of previous papers serve as proxies for team power. We calculated the mean value and the Gini coefficient to represent team power level (the collective team capability) and team power hierarchy (the vertical difference of power distribution within a team), respectively. Using 1,675,035 CS teams from the DBLP dataset, we trained an XGBoost model to predict high/low citation. The model reached an AUC of 0.71 and an accuracy of 70.45%. Using the Explainable AI method SHAP to evaluate the features' relative importance in predicting team citation categories, we found that team structure plays a more critical role than team composition. A high team power level, a flat team power structure, diverse racial backgrounds, large team size, collaboration with industry, and male-dominated teams are associated with higher team citations. Our project offers insights into how to form the best scientific teams and maximize team impact through team composition and team structure.
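The mean and Gini coefficient used above for team power level and team power hierarchy can be illustrated with a minimal sketch. The abstract does not say which Gini formula the authors used, so this assumes the standard rank-based form for non-negative values. The function names and the H-index values are hypothetical and chosen only for illustration.

```python
from statistics import mean


def gini(values):
    """Gini coefficient of non-negative values (0 = perfectly equal).

    Assumption: standard rank-based formula over sorted values,
    G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n, with i starting at 1.
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0  # convention chosen here: empty or all-zero team is "flat"
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


def team_power(proxy_values):
    """Summarize one power proxy (e.g. members' H-index) for a team."""
    return {
        "power_level": mean(proxy_values),      # collective capability
        "power_hierarchy": gini(proxy_values),  # inequality within the team
    }


# Hypothetical H-index values: same average power, different hierarchy.
flat_team = team_power([10, 10, 10, 10])
steep_team = team_power([0, 0, 0, 40])
print(flat_team)
print(steep_team)
```

Both example teams have the same power level (mean of 10), but the second concentrates all of it in one member, so its Gini coefficient is much higher. In the abstract's terms, the first team has a flat power structure and the second a steep one.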
Source journal: Proceedings of the Association for Information Science and Technology (Social Sciences: Library and Information Sciences)
CiteScore: 1.30
Self-citation rate: 0.00%
Articles published: 164