Peer-to-Peer (P2P) Lending Risk Management: Assessing Credit Risk on Social Lending Platforms Using Textual Factors
Michael Siering. ACM Transactions on Management Information Systems (2023-03-28). https://doi.org/10.1145/3589003

Peer-to-peer (P2P) lending platforms offer Internet users the possibility to borrow money from peers without the intervention of traditional financial institutions. Due to the anonymity on such social lending platforms, determining the creditworthiness of borrowers is of high importance. Beyond the disclosure of traditional financial variables that enable risk assessment, peer-to-peer lending platforms offer borrowers the opportunity to reveal additional information on the loan purpose. We investigate whether this self-disclosed information is used to signal reliability and to underline the creditworthiness of platform participants. We analyze more than 70,000 loans funded on a leading social lending platform. We show that linguistic and content-based factors help to explain a loan's probability of default and that content-based factors are more important than linguistic variables. Surprisingly, not every piece of information provided by borrowers underlines creditworthiness; certain aspects instead indicate a higher probability of default. Our study provides important insights into information disclosure in the context of peer-to-peer lending, shows how to increase performance in credit scoring, and is highly relevant for the stakeholders of social lending platforms.
Introduction to the Special Issue on Design and Data Science Research in Healthcare
G. Leroy, B. Tulu, Xiao Liu. ACM Transactions on Management Information Systems (2023-03-13). https://doi.org/10.1145/3579646

For many decades, ‘design science’ was used to describe the process surrounding the systematic creation of artifacts. In information systems, the term is used more broadly to describe systematic approaches to creating an expansive set of diverse artifacts, ranging from knowledge frameworks to full-fledged information systems. Design science in information systems denotes research that focuses on the creation of new technology, knowledge about technology, and the process of creation. ‘Data science’ refers to an interdisciplinary field that focuses on data and its collection, preparation, and integration. Although different from ‘design science,’ ‘data science’ has also seen increasing use in the information systems (IS) literature. The growing availability of high-quality software libraries and technology for reusing existing code has most likely contributed to this increase. Regardless, data science research plays an essential role in the growth of design science research. Hevner et al. [2004] portray design science in a framework comprising the environment, information systems research, and an application domain. They suggest that design science research addresses important unsolved problems in unique or innovative ways or solves problems in more effective or efficient ways. Similarly, the Design Science Research knowledge contribution framework later developed by Gregor and Hevner [2013] proposes three types of research contributions: developing new solutions for known problems, extending known solutions to new problems, and inventing new solutions for new problems. In contrast to other computing fields, the IS field has historically emphasized using kernel theories to invent, adjust, and improve artifacts. However, notable contributions can also be made without reliance on kernel theories at the intersection of data science and design science. For example, no comprehensive theories explain why artificial neural networks (ANNs) work as well as they do. And yet, ANNs serve as a cornerstone technology in most classification projects, ranging from tumor identification in medicine to handwritten check recognition and e-commerce recommendations. Even when theories exist, they may be irrelevant to the artifact design. For example,
Situational Factor Determinants of the Allocation of Decision Rights to Edge Computers
C. Chua, F. Niederman. ACM Transactions on Management Information Systems (2023-02-17). https://doi.org/10.1145/3582081

Internet of Things (IoT) designers frequently must determine whether action-oriented decisions should be made by edge computers or only by central servers combining input from all edge computers. An important example of this design problem occurs in fire protection IoT, where individual edge computers attached to sensors might be empowered to make decisions (i.e., have decision rights) about how to manage the fire. Alternatively, decision rights could be held exclusively by a central server isolated from the fire, because the designer is concerned that damage to edge computers could cause them to act unreliably. This research models this allocation of decision rights to identify the relative influence of various decision factors. To explore the factors impacting decision rights conferral, we first model the allocation of decision rights under the following assumptions: (1) the central server cannot make an error the edge computer cannot make; (2) the central server cannot update the edge computer with its information in a timely manner; and (3) the central server cannot reverse an action initiated by the edge computer. We then relax each of these three assumptions and show how relaxing each one radically changes the factors impacting decision rights conferral. We also show that allowing the central server to update information on the edge computer or to reverse the edge computer's decisions can result in lower overall system performance. We then perform a series of numerical experiments to understand how changing various parameters affects the problem. We show that, in the general real-world scenario, the key factor influencing the decision is the ability of the edge computer to detect false alarms. We also show that the magnitude of loss and the ratio of real to false incidents have linear and logarithmic relationships, respectively, with the reliability of the edge computer.
How Suboptimal is Work-From-Home Security in IT/ICS Enterprises? A Strategic Organizational Theory for Managers
R. Pal, Rohan Xavier Sequeira, Y. Zhu, Angelica Marotta, Michael Siegel, Edward Y. Hua. ACM Transactions on Management Information Systems (2023-02-15). https://doi.org/10.1145/3579645

The COVID-19 pandemic, especially its first and second waves, forced firms (organizations) to radically shift a considerable (if not all) proportion of their employees to a work-from-home (WFH) mode. Industry statistics show that, despite ushering in significant work-flexibility and other benefits, the WFH mode has also expanded an organization's cyber-vulnerability space and increased the number of cyber-breaches in IT and IT-OT systems (e.g., ICSs). This leads us to an important fundamental question: is the WFH paradigm detrimental to IT and IoT-driven ICS security in general? While vulnerability reasoning and empirical statistics might qualitatively support an affirmative answer to this question, a rigorous, practically motivated, and strategic cost-benefit analysis has yet to be conducted to establish in principle whether, and to what degree, WFH-induced cyber-security in an IT/ICS system is sub-optimal compared to that in the non-WFH work mode. We propose a novel and rigorous strategic method to dynamically quantify the degree of sub-optimal cyber-security in an IT/ICS organization whose employees all work in heterogeneous WFH “siloes”. We first derive, as a benchmark for a WFH setting, the centrally planned, socially optimal aggregate employee effort in cyber-security best practices at any given time instant. We then derive and compute, using Breton's Nash equilibrium computation algorithm for stochastic dynamic games, the distributed, time-varying strategic Nash equilibrium amount of aggregate employee effort in cyber-security for the same setting. The time-varying ratios of these centralized and distributed estimates quantify the free-riding dynamics, a proxy for security sub-optimality, within an IT/ICS organization in the WFH setting. We finally compare the free-riding ratio between the WFH and non-WFH work modes to gauge the possible extent of the increase (a lower bound) in security sub-optimality when the organization operates in a WFH mode. Counter-intuitively, we observe through extensive real-world-trace-driven Monte Carlo simulations that the maximum of the time-dependent median increase in security sub-optimality is around 25% but decreases quickly with time to near 0% (implying that security sub-optimality in the WFH mode equals that in the non-WFH mode) if the impact of employee security effort is time-accumulative (sustainable), even over short time intervals.
Social Determinants of Health and ER Utilization: Role of Information Integration during COVID-19
Tian-Ze Guo, I. Bardhan, Anjum Khurshid. ACM Transactions on Management Information Systems (2023-02-07). https://doi.org/10.1145/3583077

Emergency room (ER) admissions are the front door for the utilization of a community's health resources and serve as a valuable proxy for a community health system's capacity. While recent research suggests that social determinants of health (SDOH) are important predictors of patient health outcomes, their impact on ER utilization during the COVID-19 pandemic is not well understood. Further, the role of hospital information integration in moderating the impact of SDOH on ER utilization has not received adequate attention. Utilizing longitudinal claims data from a regional health information exchange spanning six years including the COVID-19 period, we study how SDOH affects ER utilization and whether effective integration of patient health information across hospitals can moderate its impact. Our results suggest that a patient's economic well-being significantly reduces future ER utilization. The magnitude of this relationship is significant when patients are treated at hospitals with high information integration but is weaker when patients receive care at hospitals with lower levels of information integration. Instead, patients' family and social support can reduce ER utilization when they are treated at hospitals with low information integration. In other words, different dimensions of SDOH are important in low versus high information integration conditions. Furthermore, predictive modeling shows that patient visit type and prior visit history can significantly improve the predictive accuracy of ER utilization. Our research implications support efforts to develop national standards for the collection and sharing of SDOH data, and their use and interpretation for clinical decision making by healthcare providers and policy makers.
Enabling Efficient Deduplication and Secure Decentralized Public Auditing for Cloud Storage: A Redactable Blockchain Approach
Rahul Mishra, D. Ramesh, S. Kanhere, D. Edla. ACM Transactions on Management Information Systems (2023-01-30). https://doi.org/10.1145/3578555

Public auditing and data deduplication are integral considerations in providing efficient and secure cloud storage services. Nevertheless, traditional data deduplication models that support public auditing can incur enormous waste of storage and computation resources, induced by data redundancy and by repeated audit work performed for multiple tenants by a trusted third-party auditor (TPA). In this work, we introduce blockchain-based secure decentralized public auditing for decentralized cloud storage with an efficient deduplication model. We employ blockchain to take on the task of the centralized TPA, and we mitigate the implications of malicious blockchain miners by using the concept of a decentralized autonomous organization (DAO). Specifically, we employ the idea of redactability for blockchain to handle often-neglected security issues that would adversely affect the integrity of auditing records stored on the blockchain in decentralized auditing models. The proposed model also employs an efficient deduplication scheme to attain adequate storage savings while protecting users from data loss due to duplicate-faking attacks. Moreover, a detailed concrete security analysis demonstrates the computational infeasibility of attacks on the proposed model, covering proof-of-ownership, duplicate-faking attacks (DFA), collusion attacks, storage free-riding attacks, data privacy, and forgery attacks, while maintaining high efficiency. Finally, a comprehensive performance analysis shows the scalability and feasibility of the proposed model.
Incorporating Multiple Knowledge Sources for Targeted Aspect-based Financial Sentiment Analysis
Kelvin Du, Frank Xing, E. Cambria. ACM Transactions on Management Information Systems (2023-01-30). https://doi.org/10.1145/3580480

Combining symbolic and subsymbolic methods has become a promising strategy as research tasks in AI grow increasingly complicated and require higher levels of understanding. Targeted Aspect-based Financial Sentiment Analysis (TABFSA) is an example of such complicated tasks, as it involves processes such as information extraction, information specification, and domain adaptation. However, little is known about the design principles of hybrid models that leverage external lexical knowledge. To fill this gap, we define anterior, parallel, and posterior knowledge integration and propose incorporating multiple lexical knowledge sources strategically into the fine-tuning process of pre-trained transformer models for TABFSA. Experiments on the Financial Opinion mining and Question Answering challenge (FiQA) Task 1 and SemEval 2017 Task 5 datasets show that the knowledge-enabled models systematically improve upon their plain deep learning counterparts, and some outperform previously reported state-of-the-art results in terms of aspect sentiment analysis error. According to our ablation analysis, parallel knowledge integration is the most effective strategy, and domain-specific lexical knowledge is the more important source.
Don’t Need All Eggs in One Basket: Reconstructing Composite Embeddings of Customers from Individual-Domain Embeddings
Moshe Unger, Pan Li, Sahana (Shahana) Sen, A. Tuzhilin. ACM Transactions on Management Information Systems (2023-01-18). https://doi.org/10.1145/3578710

Although building a 360-degree comprehensive view of a customer has been a long-standing goal in marketing, this challenge has not been successfully addressed in many marketing applications because fractured customer data stored across different “silos” are hard to integrate under “one roof” for several reasons. Instead of integrating customer data, in this article we propose to integrate several domain-specific partial customer views into one consolidated or composite customer profile using a Deep Learning-based method that is theoretically grounded in Kolmogorov’s Mapping Neural Network Existence Theorem. Furthermore, our method needs to securely access domain-specific or siloed customer data only once for building the initial customer embeddings. We conduct extensive studies on two industrial applications to demonstrate that our method effectively reconstructs stable composite customer embeddings that constitute strong approximations of the ground-truth composite embeddings obtained from integrating the siloed raw customer data. Moreover, we show that these data-security-preserving reconstructed composite embeddings not only perform as well as the original ground-truth embeddings but also significantly outperform partial embeddings and state-of-the-art baselines in recommendation and consumer preference prediction tasks.
Using Toulmin's Argumentation Model to Enhance Trust in Analytics-Based Advice Giving Systems
E. Rubin, I. Benbasat. ACM Transactions on Management Information Systems (2023-01-18). https://doi.org/10.1145/3580479

Ecommerce websites increasingly provide predictive analytics-based advice (PAA), such as advice about potential future price reductions. Establishing consumer trust in these advice-giving systems imposes unique and novel challenges. First, PAA about future alternatives that can benefit the consumer appears to inherently contradict the business goal of selling a product quickly and at high profit margins. Second, PAA is based on mathematical models that are non-transparent to the user. Third, PAA advice is inherently uncertain and can be perceived as subjectively imposed by the algorithms. Utilizing Toulmin's argumentation model, we investigate the influence of advice-justification statements in overcoming these difficulties. Based on three experimental studies, in which respondents are provided with the advice of PAA systems, we show evidence for the different roles Toulmin's statement types play in enhancing various trusting beliefs in PAA systems. Provision of warrants is mostly associated with enhanced competence beliefs; rebuttals with integrity beliefs; backings with both competence and benevolence beliefs; and data statements with competence, integrity, and benevolence beliefs. Implications of the findings for research and practice are provided.
Introduction to the Special Issue on Smart Systems for Industry 4.0 and IoT
Mu-Yen Chen, B. Thuraisingham, E. Eğrioğlu, J. J. Rubio. ACM Transactions on Management Information Systems (2022-12-31). https://doi.org/10.1145/3583985

The development of big data applications is driving the dramatic growth of hybrid data, often in the form of complex sets of cross-media content including text, images, videos, audio, and time series. Tremendous volumes of these heterogeneous data are derived from multiple IoT sources and present new challenges for the design, development, and implementation of effective information systems and decision support frameworks to meet heterogeneous computing requirements. Emerging technologies allow for the near real-time extraction and analysis of heterogeneous data to find meaningful information. Machine-learning algorithms allow computers to learn automatically, analyzing existing data to establish rules to predict outcomes for unknown data. However, traditional machine learning approaches do not meet the needs of Internet of Things (IoT) applications, calling for new technologies. Deep learning is a good example of emerging technologies that tackle the limitations of traditional machine learning in feature engineering, providing superior performance in highly complex applications. However, these technologies also raise new security and privacy concerns, and technology adoption and trust issues are of timely importance as well. Industrial operations are in the midst of rapid transformations, sometimes referred to as Industry 4.0, the Industrial Internet of Things (IIoT), or smart manufacturing. These transformations are bringing fundamental changes to factories and workplaces, making them safer and more efficient, flexible, and environmentally friendly. Machines are evolving to have increased autonomy, and new human-machine interfaces such as smart tools, augmented reality, and touchless interfaces are making interaction more natural. Machines are also becoming increasingly interconnected within individual factories as well as to the outside world through cloud computing, enabling many opportunities for operational efficiency and flexibility in manufacturing and maintenance. An increasing number of countries have put forth national advanced manufacturing development strategies, such as Germany's Industry 4.0, the United States' Industrial Internet and manufacturing systems based on CPS (Cyber-Physical Systems), and China's Internet Plus Manufacturing and Made in China 2025 initiatives. Smart manufacturing aims to maximize the transparency and accessibility of all manufacturing process information across entire manufacturing supply chains and product lifecycles, with the Internet of Things (IoT) as a centerpiece to increase productivity and output value. This manufacturing revolution depends on technology connectivity and the contextualization of data, thus putting intelligent systems support and data science at the center of these developments.