A Passage-Level Reading Behavior Model for Mobile Search
Zhijing Wu, Jiaxin Mao, Kedi Xu, Dandan Song, Heyan Huang
Proceedings of the ACM Web Conference 2023. doi:10.1145/3543507.3583343

Reading is a vital and complex cognitive activity during users’ information-seeking process. Several studies have focused on understanding users’ reading behavior in desktop search, and their findings greatly contribute to the design of information retrieval models. However, little is known about how users read a result in mobile search, even though search now happens more frequently in mobile scenarios. In this paper, we conduct a lab-based user study to investigate users’ fine-grained reading behavior patterns in mobile search. We find that users’ reading attention allocation is strongly affected by several behavior biases, such as position and selection biases. Inspired by these findings, we propose a probabilistic generative model, the Passage-level Reading behavior Model (PRM), to model users’ reading behavior in mobile search. PRM utilizes observable passage-level exposure and viewport duration events to infer users’ unobserved skimming events, reading events, and satisfaction perception during the reading process. Besides fitting passage-level reading behavior, we utilize the fitted parameters of PRM to estimate passage-level and document-level relevance. Experimental results show that PRM outperforms existing unsupervised relevance estimation models. PRM has strong interpretability and provides valuable insights into how users seek and perceive useful information in mobile search.
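The abstract does not spell out PRM's equations, but the core idea it describes (inferring a latent skim/read state from observable viewport durations) can be illustrated with a toy two-state mixture. All names, distributions, and parameter values below are hypothetical assumptions for illustration, not taken from the paper:

```python
import math

# Toy sketch: a two-state mixture over passage viewport durations.
# Latent state z ∈ {skim, read}; dwell times modeled as exponentials
# with different rates. All parameters here are made up.

PRIOR_READ = 0.4          # prior probability a passage is read, not skimmed
RATE_SKIM = 1.0 / 0.8     # mean skimming dwell ~0.8 s
RATE_READ = 1.0 / 4.0     # mean reading dwell ~4.0 s

def exp_pdf(x, rate):
    return rate * math.exp(-rate * x)

def posterior_read(duration_s):
    """Posterior P(read | viewport duration) under the toy mixture."""
    p_read = PRIOR_READ * exp_pdf(duration_s, RATE_READ)
    p_skim = (1 - PRIOR_READ) * exp_pdf(duration_s, RATE_SKIM)
    return p_read / (p_read + p_skim)

def passage_relevance(durations):
    """Toy passage-level relevance: average read-posterior over exposures."""
    return sum(posterior_read(d) for d in durations) / len(durations)
```

Longer dwells push the posterior toward "read"; averaging the posterior over a passage's exposures gives a crude passage-level relevance signal in the spirit of the unsupervised estimation the abstract describes.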
DANCE: Learning A Domain Adaptive Framework for Deep Hashing
Haixin Wang, Jinan Sun, Xiang Wei, Shikun Zhang, C. Chen, Xiansheng Hua, Xiao Luo
Proceedings of the ACM Web Conference 2023. doi:10.1145/3543507.3583445

This paper studies unsupervised domain adaptive hashing, which aims to transfer a hashing model from a label-rich source domain to a label-scarce target domain. Current state-of-the-art approaches generally resolve the problem by integrating pseudo-labeling and domain adaptation techniques into deep hashing paradigms. Nevertheless, they usually suffer from serious class imbalance in pseudo-labels and suboptimal domain alignment caused by neglect of the intrinsic structures of the two domains. To address these issues, we propose a novel method named unbiaseD duAl hashiNg Contrastive lEarning (DANCE) for domain adaptive image retrieval. The core of DANCE is to perform contrastive learning on hash codes at both the instance level and the prototype level. To begin, DANCE utilizes label information to guide instance-level hashing contrastive learning in the source domain. To generate unbiased and reliable pseudo-labels for semantic learning in the target domain, we uniformly select samples around each label embedding in the Hamming space. A momentum-update scheme is also utilized to smooth the optimization process. Additionally, we measure the semantic prototype representations in both source and target domains and incorporate them into a domain-aware prototype-level contrastive learning paradigm, which enhances domain alignment in the Hamming space while maximizing the model capacity. Experimental results on a number of well-known domain adaptive retrieval benchmarks validate the effectiveness of our proposed DANCE compared to a variety of competing baselines in different settings.
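The class-balanced selection step described above (uniformly picking samples around each label embedding in Hamming space) can be sketched as follows. The helper names and the fixed per-class quota are illustrative assumptions, not the paper's exact procedure:

```python
import numpy as np

# Sketch of class-balanced pseudo-labeling in Hamming space: for each
# (hypothetical) label embedding, pick the same number of nearest
# target-domain hash codes, keeping pseudo-labels balanced across classes.

def hamming(a, b):
    """Hamming distances between binary code a (n_bits,) and codes b (m, n_bits)."""
    return (a != b).sum(axis=1)

def balanced_pseudo_labels(codes, label_codes, per_class):
    """codes: (m, n_bits) target hash codes; label_codes: (c, n_bits).
    Returns (sample_index, class_index) picks, per_class of each class."""
    picks = []
    for cls, lc in enumerate(label_codes):
        nearest = np.argsort(hamming(lc, codes))[:per_class]
        picks.extend((int(i), cls) for i in nearest)
    return picks

codes = np.array([[0, 0, 0, 0], [0, 0, 0, 1], [1, 1, 1, 1], [1, 1, 1, 0]])
labels = np.array([[0, 0, 0, 0], [1, 1, 1, 1]])
```

Because every class receives the same quota, the resulting pseudo-label set cannot collapse onto a few dominant classes, which is the imbalance failure mode the abstract calls out in prior work.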
Disentangling Degree-related Biases and Interest for Out-of-Distribution Generalized Directed Network Embedding
Hyunsik Yoo, Yeon-Chang Lee, Kijung Shin, Sang-Wook Kim
Proceedings of the ACM Web Conference 2023. doi:10.1145/3543507.3583271

The goal of directed network embedding is to represent the nodes in a given directed network as embeddings that preserve the asymmetric relationships between nodes. While a number of directed network embedding methods have been proposed, we empirically show that the existing methods lack out-of-distribution generalization abilities against degree-related distributional shifts. To mitigate this problem, we propose ODIN (Out-of-Distribution Generalized Directed Network Embedding), a new directed NE method in which we model multiple factors in the formation of directed edges. Then, for each node, ODIN learns multiple embeddings, each of which preserves its corresponding factor, by disentangling interest factors and biases related to the in- and out-degrees of nodes. Our experiments on four real-world directed networks demonstrate that disentangling multiple factors enables ODIN to yield out-of-distribution generalized embeddings that are consistently effective under various degrees of shift in degree distributions. Specifically, ODIN universally outperforms 9 state-of-the-art competitors in 2 link prediction tasks on 4 real-world datasets under both identical-distribution (ID) and non-ID settings. The code is available at https://github.com/hsyoo32/odin.
EDNet: Attention-Based Multimodal Representation for Classification of Twitter Users Related to Eating Disorders
Mohammad Abuhassan, Tarique Anwar, Chengfei Liu, H. Jarman, M. Fuller‐Tyszkiewicz
Proceedings of the ACM Web Conference 2023. doi:10.1145/3543507.3583863

Social media platforms provide rich data sources in several domains. In mental health, individuals experiencing an Eating Disorder (ED) are often hesitant to seek help through conventional healthcare services. However, many people seek help with diet and body image issues on social media. To better distinguish at-risk users who may need help for an ED from those who are simply commenting on ED in social environments, highly sophisticated approaches are required. Assessment of ED risks in such a situation can be done in various ways, each with its own strengths and weaknesses; hence, there is a need for, and potential benefit of, a more complex multimodal approach. To this end, we collect historical tweets, user biographies, and online behaviours of relevant users from Twitter, and generate a reasonably large labelled benchmark dataset. Thereafter, we develop an advanced multimodal deep learning model called EDNet using these data to identify the different types of users with ED engagement (e.g., potential ED sufferers, healthcare professionals, or communicators) and distinguish them from those not experiencing EDs on Twitter. EDNet consists of five deep neural network layers. With the help of its embedding, representation, and behaviour modeling layers, it effectively learns the multimodalities of social media. In our experiments, EDNet consistently outperforms all the baseline techniques by significant margins. It achieves an accuracy of up to 94.32% and an F1 score of up to 93.91%. To the best of our knowledge, this is the first study to propose a multimodal approach for classifying users according to their engagement with ED content on social media.
A Reference-Dependent Model for Web Search Evaluation: Understanding and Measuring the Experience of Boundedly Rational Users
Nuo Chen, Jiqun Liu, Tetsuya Sakai
Proceedings of the ACM Web Conference 2023. doi:10.1145/3543507.3583551

Previous research demonstrates that users’ actions in search interaction are associated with relative gains and losses with respect to reference points, known as the reference dependence effect. However, this widely confirmed effect is not represented in most user models underpinning existing search evaluation metrics. In this study, we propose a new evaluation metric framework, the Reference Dependent Metric (ReDeM), for assessing query-level search by incorporating the effect of reference dependence into the modelling of user search behavior. To test the overall effectiveness of the proposed framework, (1) we evaluate the performance, in terms of correlation with user satisfaction, of ReDeMs built upon different reference points against that of widely-used metrics on three search datasets; (2) we examine the performance of ReDeMs under different task states, such as task difficulty and task urgency; and (3) we analyze the statistical reliability of ReDeMs in terms of discriminative power. Experimental results indicate that: (1) ReDeMs integrated with a proper reference point achieve better correlations with user satisfaction than most existing metrics, such as Discounted Cumulative Gain (DCG) and Rank-Biased Precision (RBP), even when their parameters have been well tuned; (2) ReDeMs perform better than existing metrics when the task imposes a high cognitive load; (3) the discriminative power of ReDeMs is far stronger than Expected Reciprocal Rank (ERR), slightly stronger than Precision, and similar to DCG, RBP, and INST. To our knowledge, this study is the first to explicitly incorporate the reference dependence effect into the user browsing model and offline evaluation metrics. Our work illustrates a promising approach to leveraging insights about user biases from cognitive psychology to better evaluate user search experience and enhance user models.
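The baseline metrics named above have standard textbook forms. For reference, minimal implementations of DCG and RBP (graded relevance rel_i at 1-indexed rank i):

```python
import math

# DCG@k = sum_i rel_i / log2(i + 1); RBP = (1 - p) * sum_i p^(i-1) * rel_i,
# where p is the user's persistence (probability of moving to the next rank).

def dcg(rels, k=None):
    """Discounted Cumulative Gain over a ranked list of relevance grades."""
    rels = rels[:k] if k else rels
    return sum(r / math.log2(i + 1) for i, r in enumerate(rels, start=1))

def rbp(rels, p=0.8):
    """Rank-Biased Precision with persistence parameter p."""
    return (1 - p) * sum(r * p ** i for i, r in enumerate(rels))
```

ReDeM's own gain function is reference-dependent and is not reproduced here; these are only the conventional baselines it is compared against.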
Automated WebAssembly Function Purpose Identification With Semantics-Aware Analysis
Alan Romano, Weihang Wang
Proceedings of the ACM Web Conference 2023. doi:10.1145/3543507.3583235

WebAssembly is a recent web standard built for better performance in web applications. The standard defines a binary code format to use as a compilation target for a variety of languages, such as C, C++, and Rust. The standard also defines a text representation for readability; however, WebAssembly modules are difficult for human readers to interpret, regardless of their experience level. This makes it difficult to understand and maintain existing WebAssembly code. As a result, third-party WebAssembly modules need to be implicitly trusted by developers, as verifying the functionality themselves may not be feasible. To this end, we construct WASPur, a tool to automatically identify the purposes of WebAssembly functions. To build this tool, we first construct an extensive collection of WebAssembly samples that represent the state of WebAssembly. Second, we analyze the dataset and identify the diverse use cases of the collected WebAssembly modules. We then leverage the dataset to construct semantics-aware intermediate representations (IR) of the functions in the modules. We encode the function IR for use in a machine learning classifier, and we find that this classifier can predict the similarity of a given function against known named functions with an accuracy of 88.07%. We hope our tool will enable inspection of optimized and minified WebAssembly modules that remove function names and most other semantic identifiers.
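WASPur's actual IR and classifier are not specified in the abstract. As a rough illustration of the general task (matching an unnamed function against known named functions by the similarity of its instruction patterns), here is a bag-of-instruction-bigrams sketch; the feature scheme, the example instruction sequence, and the name "memcpy_like" are all hypothetical:

```python
from collections import Counter
import math

# Represent a function as a bag of instruction bigrams and score it
# against known named functions with cosine similarity.

def bigram_features(instrs):
    return Counter(zip(instrs, instrs[1:]))

def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

known = {
    "memcpy_like": bigram_features(
        ["local.get", "local.get", "i32.load", "i32.store"]),
}

query = bigram_features(["local.get", "local.get", "i32.load", "i32.store"])
best = max(known, key=lambda name: cosine(query, known[name]))
```

A real pipeline would use a learned classifier over a semantics-aware IR rather than raw opcode n-grams, but the sketch conveys why stripped function names can still be recovered from instruction-level structure.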
Learning to Simulate Crowd Trajectories with Graph Networks
Hongzhi Shi, Quanming Yao, Yong Li
Proceedings of the ACM Web Conference 2023. doi:10.1145/3543507.3583858

Crowd stampede disasters occur frequently, such as the recent ones in Indonesia and South Korea, and crowd simulation is particularly important for preventing and avoiding such disasters. Most traditional models for crowd simulation, such as the social force model, are hand-designed formulas that use Newtonian forces to model the interactions between pedestrians. However, such formula-based methods may not be flexible enough to capture the complex interaction patterns in diverse crowd scenarios. Recently, owing to the development of the Internet, a large amount of pedestrian movement data has been collected, allowing us to study crowd simulation in a data-driven way. Inspired by the recent success of graph network-based simulation (GNS), we propose a novel method under the GNS framework that simulates crowds in a data-driven way. Specifically, we model the interactions among people and the environment using a heterogeneous graph. We then design a heterogeneous gated message-passing network to learn the interaction pattern that depends on the visual field. Finally, randomness is introduced by modeling the context’s different influences on pedestrians with a probabilistic emission function. Extensive experiments on synthetic data, controlled-environment data, and real-world data show that our model can generally capture the three main factors that contribute to crowd trajectories while adapting to the data characteristics, going beyond the strong assumptions of formula-based methods. As a result, the proposed method outperforms existing methods by a large margin.
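The visibility-gated message-passing idea can be caricatured in a few lines: each pedestrian aggregates "messages" only from neighbors inside its field of view, then updates its position. The update rule below is an illustrative hand-written stand-in, not the paper's learned network:

```python
import numpy as np

# One toy message-passing step: a neighbor contributes a repulsive message
# only if it is close AND within the pedestrian's field of view (the gate).

def step(pos, vel, fov_cos=0.0, radius=2.0, repulsion=0.05):
    """pos, vel: (n, 2) arrays of positions and velocities. Returns new positions."""
    new_pos = pos.copy()
    for i in range(len(pos)):
        heading = vel[i] / (np.linalg.norm(vel[i]) + 1e-9)
        push = np.zeros(2)
        for j in range(len(pos)):
            if i == j:
                continue
            d = pos[j] - pos[i]
            dist = np.linalg.norm(d)
            # gate: neighbor must be close and inside the field of view
            if dist < radius and d @ heading / (dist + 1e-9) > fov_cos:
                push -= repulsion * d / (dist + 1e-9)
        new_pos[i] = pos[i] + vel[i] + push
    return new_pos
```

In the learned version, the gate and the message function are neural networks trained from trajectory data, and different node types (pedestrians, obstacles) get different message functions on the heterogeneous graph.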
The More Things Change, the More They Stay the Same: Integrity of Modern JavaScript
J. So, M. Ferdman, Nick Nikiforakis
Proceedings of the ACM Web Conference 2023. doi:10.1145/3543507.3583395

The modern web is a collection of remote resources that are identified by their location and composed of interleaving networks of trust. Supply chain attacks compromise the users of a target domain by leveraging its often large set of trusted third parties who provide resources such as JavaScript. The ubiquity of JavaScript, paired with its ability to execute arbitrary code on client machines, makes this particular web resource an ideal vector for supply chain attacks. Currently, there exists no robust method for users browsing the web to verify that the script content they receive from a third party is the expected content. In this paper, we present key insights to inform the design of robust integrity mechanisms, derived from our large-scale analyses of the 6M scripts we collected while crawling 44K domains every day for 77 days. We find that scripts that frequently change should be considered first-class citizens in the modern web ecosystem, and that the ways in which scripts change remain constant over time. Furthermore, we present analyses on the use of strict integrity verification (e.g., Subresource Integrity) at the granularity of the script providers themselves, offering a more complete perspective and demonstrating that the use of strict integrity alone cannot provide satisfactory security guarantees. We conclude that it is infeasible for a client to distinguish benign changes from malicious ones without additional, external knowledge, motivating the need for a new protocol to provide clients the necessary context to assess the potential ramifications of script changes.
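Subresource Integrity, mentioned above, pins a resource to a cryptographic digest: the browser hashes the fetched script and refuses to execute it on a mismatch. A minimal sketch of producing an SRI value (the CDN URL below is a placeholder):

```python
import base64
import hashlib

# An SRI integrity attribute is the hash algorithm name, a dash, and the
# base64-encoded digest of the exact bytes of the resource.

def sri_sha384(content: bytes) -> str:
    digest = hashlib.sha384(content).digest()
    return "sha384-" + base64.b64encode(digest).decode("ascii")

script = b"console.log('hello');"
tag = (f'<script src="https://cdn.example.com/app.js" '
       f'integrity="{sri_sha384(script)}" crossorigin="anonymous"></script>')
```

Note that any change to the script, benign update or malicious injection, alters the digest, which is precisely why strict integrity alone cannot distinguish the two without the external context the paper argues for.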
{"title":"Coherent Topic Modeling for Creative Multimodal Data on Social Media","authors":"Junaid Rashid, Jungeun Kim, Usman Naseem","doi":"10.1145/3543507.3587433","DOIUrl":"https://doi.org/10.1145/3543507.3587433","url":null,"abstract":"The creative web is all about combining different types of media to create a unique and engaging online experience. Multimodal data, such as text and images, is a key component in the creative web. Social media posts that incorporate both text descriptions and images offer a wealth of information and context. Text in social media posts typically relates to one topic, while images often convey information about multiple topics due to the richness of visual content. Despite this potential, many existing multimodal topic models do not take this asymmetry into account, resulting in poor-quality topics being generated. Therefore, we propose Coherent Topic Modeling for Multimodal Data (CTM-MM), which takes into account that text in social media posts typically relates to one topic, while images can contain information about multiple topics. Our experimental results show that CTM-MM outperforms traditional multimodal topic models in terms of classification and topic coherence.","PeriodicalId":296351,"journal":{"name":"Proceedings of the ACM Web Conference 2023","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-04-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127731375","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CaML: Carbon Footprinting of Household Products with Zero-Shot Semantic Text Similarity","authors":"Bharathan Balaji, Venkata Sai Gargeya Vunnava, G. Guest, Jared Kramer","doi":"10.1145/3543507.3583882","DOIUrl":"https://doi.org/10.1145/3543507.3583882","url":null,"abstract":"Products contribute to carbon emissions in each phase of their life cycle, from manufacturing to disposal. Estimating the embodied carbon in products is a key step towards understanding their impact, and undertaking mitigation actions. Precise carbon attribution is challenging at scale, requiring both domain expertise and granular supply chain data. As a first-order approximation, standard reports use Economic Input-Output based Life Cycle Assessment (EIO-LCA), which estimates carbon emissions per dollar at an industry sector level using transactions between different parts of the economy. EIO-LCA models map products to an industry sector, and use the corresponding carbon-per-dollar estimates to calculate the embodied carbon footprint of a product. An LCA expert needs to map each product to one of upwards of 1000 potential industry sectors. To reduce the annotation burden, the standard practice is to group products by categories, and map categories to their corresponding industry sector. We present CaML, an algorithm to automate EIO-LCA using semantic text similarity matching between the text descriptions of the product and the industry sector. CaML uses a pre-trained sentence transformer model to rank the top-5 matches, and asks a human to check if any of them are a good match. We annotated 40K products with non-experts. Our results reveal that pre-defined product categories are heterogeneous with respect to EIO-LCA industry sectors, and lead to a large mean absolute percentage error (MAPE) of 51% in kgCO2e/$. CaML outperforms the previous manually intensive method, yielding a MAPE of 22% with no domain labels (zero-shot). 
We compared annotations of a small sample of 210 products with LCA experts, and found that CaML's accuracy is comparable to that of annotations by non-experts.","PeriodicalId":296351,"journal":{"name":"Proceedings of the ACM Web Conference 2023","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-04-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133692676","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
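CaML's core step, ranking candidate industry sectors by semantic similarity to a product description, can be sketched as follows. The real system encodes both texts with a pre-trained sentence transformer; here a toy bag-of-words embedding stands in so the sketch stays self-contained. The sector strings, helper names, and the EIO-LCA/MAPE helpers are illustrative assumptions, not code from the paper:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; CaML uses a pre-trained
    # sentence-transformer encoder instead.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def top_k_sectors(product: str, sectors: list[str], k: int = 5) -> list[str]:
    # Rank industry-sector descriptions by similarity to the product
    # text; CaML shows the top 5 to a human for a yes/no check.
    q = embed(product)
    return sorted(sectors, key=lambda s: cosine(q, embed(s)), reverse=True)[:k]

def eio_lca_estimate(spend_usd: float, kgco2e_per_usd: float) -> float:
    # First-order EIO-LCA footprint: dollars spent times the matched
    # sector's emission factor (kgCO2e/$).
    return spend_usd * kgco2e_per_usd

def mape(true_vals: list[float], pred_vals: list[float]) -> float:
    # Mean absolute percentage error, the metric behind the 51% vs 22%
    # comparison in the abstract.
    return sum(100.0 * abs(t - p) / abs(t)
               for t, p in zip(true_vals, pred_vals)) / len(true_vals)
```

For example, a "dish soap" description should rank a soap-manufacturing sector above unrelated sectors, after which its spend is multiplied by that sector's kgCO2e/$ factor.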