Influence of User Profile Attributes on e-Cigarette-Related Searches on YouTube: Machine Learning Clustering and Classification.

JMIR Infodemiology (IF 3.5, Q1, Health Care Sciences & Services). Pub Date: 2023-04-12. eCollection Date: 2023-01-01. DOI: 10.2196/42218
Dhiraj Murthy, Juhan Lee, Hassan Dashtian, Grace Kong
Journal: JMIR Infodemiology, volume 3, article e42218. Publication type: Journal Article (eCollection). Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10139687/pdf/
Citation count: 0

Abstract

Background: The proliferation of e-cigarette content on YouTube is concerning because of its possible effect on youth use behaviors. YouTube has a personalized search and recommendation algorithm that derives attributes from a user's profile, such as age and sex. However, little is known about whether e-cigarette content is shown differently based on user characteristics.

Objective: The aim of this study was to understand the influence of age and sex attributes of user profiles on e-cigarette-related YouTube search results.

Methods: We created 16 fictitious YouTube profiles that varied in age (16 and 24 years), sex (female and male), and ethnicity/race, and used them to search for 18 e-cigarette-related terms. We used unsupervised (k-means clustering and classification) and supervised (graph convolutional network) machine learning and network analysis to characterize the variation in the search results of each profile. We further examined whether user attributes may play a role in e-cigarette-related content exposure by using networks and degree centrality.
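
To make the network step concrete, the following is a minimal sketch of degree centrality on a profile-video network. It is not the authors' code: the profile names, video ids, and the `degree_centrality` helper are hypothetical, and it assumes each (profile, video) search hit is one undirected edge, with centrality defined as degree divided by (n - 1).

```python
from collections import defaultdict

# Hypothetical search results: profile id -> set of video ids returned.
# These names and ids are invented for illustration, not taken from the study.
search_results = {
    "female_16": {"v1", "v2", "v3"},
    "male_16": {"v2", "v3"},
    "female_24": {"v3", "v4"},
    "male_24": {"v4"},
}


def degree_centrality(results):
    """Return degree centrality for every node of the profile-video network.

    Each (profile, video) pair is treated as one undirected edge. Centrality
    is the node's degree divided by (n - 1), where n is the number of nodes.
    """
    degree = defaultdict(int)
    for profile, videos in results.items():
        degree[profile] += 0  # keep profiles that returned no videos
        for video in videos:
            degree[profile] += 1
            degree[video] += 1
    n = len(degree)
    if n < 2:
        return {node: 0.0 for node in degree}
    return {node: d / (n - 1) for node, d in degree.items()}


centrality = degree_centrality(search_results)

# List nodes from most to least central.
for node, score in sorted(centrality.items(), key=lambda item: -item[1]):
    print(f"{node}: {score:.3f}")
```

In this toy network, a video returned to many profiles scores high, while a video shown to a single profile scores low; comparing such scores across profile attributes is one way to see whether exposure differs by age or sex.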

Results: We analyzed 4201 nonduplicate videos. Our k-means clustering suggested that the videos could be clustered into 3 categories. The graph convolutional network achieved high accuracy (0.72). Videos were classified based on content into 4 categories: product review (49.3%), health information (15.1%), instruction (26.9%), and other (8.5%). Underage users were exposed mostly to instructional videos (37.5%), with some indication that more female 16-year-old profiles were exposed to this content, while young adult age groups (24 years) were exposed mostly to product review videos (39.2%).
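
As an illustration of the clustering step, here is a minimal k-means sketch with k = 3. It is a sketch under stated assumptions rather than the study's pipeline: the abstract does not say which video features were used, so the 2D points below are hypothetical, and the farthest-point initialization is a deterministic choice made here for reproducibility, not something the source describes.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def init_centroids(points, k):
    """Farthest-point initialization (an assumption of this sketch)."""
    centroids = [points[0]]
    while len(centroids) < k:
        farthest = max(
            points,
            key=lambda p: min(squared_distance(p, c) for c in centroids),
        )
        centroids.append(farthest)
    return centroids


def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda i: squared_distance(p, centroids[i]))
        for p in points
    ]


def kmeans(points, k, max_iterations=100):
    """Lloyd's algorithm: alternate centroid updates and reassignment."""
    centroids = init_centroids(points, k)
    labels = assign(points, centroids)
    for _ in range(max_iterations):
        new_centroids = []
        for i in range(k):
            members = [p for p, label in zip(points, labels) if label == i]
            if members:
                mean = tuple(sum(axis) / len(members) for axis in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(centroids[i])  # keep an empty cluster's centroid
        new_labels = assign(points, new_centroids)
        centroids = new_centroids
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
    return centroids, labels


# Hypothetical 2D feature vectors for nine videos, forming three loose groups.
video_features = [
    (0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
    (5.0, 5.0), (5.1, 4.9), (4.9, 5.2),
    (10.0, 0.0), (10.2, 0.1), (9.9, -0.1),
]

centroids, labels = kmeans(video_features, k=3)
print(labels)
```

In practice the number of clusters would be chosen with a diagnostic such as an elbow plot or silhouette score; the value 3 here simply mirrors the number of clusters reported above.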

Conclusions: Our results indicate that demographic attributes factor into YouTube's algorithmic systems for e-cigarette-related queries. Specifically, differences in the age and sex attributes of user profiles result in variance both in the videos presented in YouTube search results and in the types of these videos. We find that underage profiles were exposed to e-cigarette content despite YouTube's age-restriction policy, which ostensibly prohibits certain e-cigarette content. Greater enforcement of policies to restrict youth access to e-cigarette content is needed.
