Consumer segmentation is an electronic marketing practice that divides consumers into groups with similar features to discover their preferences. In the business-to-customer (B2C) retailing industry, marketers explore big data to segment consumers along various dimensions. Among these dimensions, however, the location and timing of shopping have received relatively little attention. In this study, we use the recency, frequency, monetary, and tenure (RFMT) method to segment consumers into 10 groups based on their temporal and geographical features. To explore location, we investigate market, revenue, and consumer distributions, estimating geographical coordinates and peculiarities from consumer density. For the time dimension, we evaluate the accuracy of product delivery and the timing of promotions. To pinpoint target consumers, we display the main hotspots on a distribution heatmap, and we identify the optimal purchase times and the locations where profitable consumers are most densely concentrated. In addition, we evaluate product distribution to determine the most popular product categories. Based on the RFMT segmentation and product popularity, we develop a product recommender system to help marketers attract and engage potential consumers. Through a case study of data from a massive B2C retailing marketplace, we conclude that the proposed segmentation provides superior insight into consumer behavior and improves product recommendation performance.
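The RFMT idea above can be sketched as a simple threshold-based scoring routine. This is an illustrative reading, not the authors' pipeline: the field names, the 1-5 scoring, and every threshold below are made-up examples.

```python
from dataclasses import dataclass

@dataclass
class Customer:
    recency_days: int    # days since last purchase (lower is better)
    frequency: int       # number of purchases in the window
    monetary: float      # total spend
    tenure_days: int     # days since first purchase

def score(value, thresholds, reverse=False):
    """Map a raw value to a 1-5 score using ascending thresholds."""
    s = 1
    for t in thresholds:
        if value > t:
            s += 1
    return 6 - s if reverse else s

def rfmt_segment(c: Customer) -> tuple:
    """Illustrative RFMT scoring; all thresholds are invented examples."""
    r = score(c.recency_days, [7, 30, 90, 180], reverse=True)  # recent = high
    f = score(c.frequency, [1, 3, 10, 25])
    m = score(c.monetary, [50, 200, 1000, 5000])
    t = score(c.tenure_days, [30, 180, 365, 730])
    return (r, f, m, t)
```

In a real pipeline the thresholds would come from the data (e.g. quintiles), and the four scores would feed a clustering step that yields the 10 consumer groups.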
Title: Consumer Segmentation Based on Location and Timing Dimensions Using Big Data from Business-to-Customer Retailing Marketplaces
Authors: Fatemeh Ehsani, Monireh Hosseini
DOI: 10.1089/big.2022.0307 | Journal: Big Data | Published: 2023-10-30 (Journal Article)
Pub Date: 2023-10-01 | Epub Date: 2022-01-24 | DOI: 10.1089/big.2021.0274
Afzal Badshah, Ateeqa Jalal, Umar Farooq, Ghani-Ur Rehman, Shahab S Band, Celestine Iwendi
The cloud network is growing rapidly due to a massive increase in interconnected devices and the emergence of technologies such as the Internet of Things, fog computing, and artificial intelligence. In response, cloud computing needs reliable dealings among service providers, brokers, and consumers. Existing cloud monitoring frameworks such as Amazon CloudWatch, Paraleap AzureWatch, and Rackspace Cloudkick operate under the control of service providers. They work well; however, this arrangement can create dissatisfaction among customers over Service Level Agreement (SLA) violations, and customer dissatisfaction can drastically reduce a provider's business. To address this issue, and in line with the cloud philosophy, a completely independent Monitoring as a Service (MaaS) is needed to observe and regulate cloud businesses. However, existing MaaS frameworks do not address a comprehensive SLA for customer satisfaction and penalty management. This article proposes a reliable framework for monitoring a provider's services by adopting third-party monitoring with clear-cut SLA and penalty management. Since the framework monitors the SLA as a cloud monitoring service, it is named SLA-MaaS. On violations, it penalizes the party found in breach of the terms and conditions listed in the SLA. Simulation results confirm that the proposed framework adequately satisfies customers as well as service providers. This helps develop a trustworthy relationship among cloud partners and increases customer attention and retention.
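The penalty-management part of such an SLA monitor can be sketched as a check of observed metrics against agreed floors. A minimal sketch only: the metric names, penalty rates, and the linear penalty rule are assumptions, not the article's SLA-MaaS formulas.

```python
def check_sla(metrics: dict, sla_terms: dict, monthly_fee: float) -> dict:
    """Compare observed metrics against SLA floors and compute penalties.

    Each SLA term maps a metric name to (agreed floor, penalty fraction of
    the monthly fee per unit of shortfall). Names and rates are illustrative.
    """
    report = {}
    for name, (floor, rate) in sla_terms.items():
        observed = metrics.get(name, 0.0)
        shortfall = max(0.0, floor - observed)   # how far below the floor
        report[name] = {
            "violated": shortfall > 0,
            "penalty": shortfall * rate * monthly_fee,
        }
    return report
```

An independent monitor would run such a check on its own measurements rather than the provider's, which is what makes the penalties credible to both sides.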
Title: Service Level Agreement Monitoring as a Service: An Independent Monitoring Service for Service Level Agreements in Clouds
Journal: Big Data | Pages: 339-354
Pub Date: 2023-10-01 | Epub Date: 2023-01-19 | DOI: 10.1089/big.2021.0333
Guowei Zhang, Weilan Wang, Ce Zhang, Penghai Zhao, Mingkai Zhang
Recognition of handwritten Uchen Tibetan character input is considered an efficient way of acquiring mass data in the digital era. However, it still faces considerable challenges due to severely touching letters and the varied morphological features of identical characters. Deeper neural networks are therefore required to achieve decent recognition accuracy, making an efficient, lightweight model design important to balance the inevitable trade-off between accuracy and latency. To reduce the learnable parameters of the network as much as possible while maintaining acceptable accuracy, we introduce an efficient model named HUTNet, based on the relationship between floating-point operations (FLOPs) and memory access cost. The proposed network achieves ResNet-18-level accuracy of 96.86% with only a tenth of the parameters. Pruning and knowledge distillation were then applied to further reduce the model's inference latency. Experiments on the test set of the Handwritten Uchen Tibetan Data set by Wang (HUTDW), which contains 562 classes and 42,068 samples, show that the compressed model achieves 96.83% accuracy while requiring fewer FLOPs and parameters. To verify the effectiveness of HUTNet, we also tested it on the Chinese Handwriting Database 1.1 (HWDB1.1), where it achieved an accuracy of 97.24%, higher than that of ResNet-18 and ResNet-34. In general, we conduct extensive experiments on the resource-accuracy trade-off and show stronger performance than other well-known models on HUTDW and HWDB1.1. This work also removes a critical bottleneck for handwritten Uchen Tibetan recognition on low-power computing devices.
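The pruning-and-distillation step mentioned above commonly relies on Hinton-style knowledge distillation, where a compact student mimics a larger teacher's softened outputs. Below is a generic sketch under that assumption; the temperature, weighting, and loss form are standard defaults, not HUTNet's reported configuration.

```python
import math

def softmax(logits, T=1.0):
    """Temperature-scaled softmax over a list of logits."""
    exps = [math.exp(z / T) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def distillation_loss(student_logits, teacher_logits, true_idx, T=4.0, alpha=0.7):
    """alpha * T^2 * KL(teacher || student) at temperature T
    plus (1 - alpha) * hard cross-entropy on the true label."""
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    soft = sum(pt * math.log(pt / ps) for pt, ps in zip(p_t, p_s)) * T * T
    hard = -math.log(softmax(student_logits)[true_idx])
    return alpha * soft + (1 - alpha) * hard
```

The soft term vanishes when the student reproduces the teacher exactly, so minimizing this loss pulls the compressed model toward the larger network's behavior while still fitting the labels.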
Title: HUTNet: An Efficient Convolutional Neural Network for Handwritten Uchen Tibetan Character Recognition
Journal: Big Data | Pages: 387-398
Pub Date: 2023-10-01 | Epub Date: 2023-01-19 | DOI: 10.1089/big.2021.0365
Jiabing Xu, Jiarui Liu, Tianen Yao, Yang Li
This study aims to transform existing telecom operators from traditional Internet operators into digital-driven service providers and to improve the overall competitiveness of telecom enterprises. Data mining is applied to telecom user classification, processing existing telecom user data through integration, cleaning, standardization, and transformation. Although existing algorithms ensure accuracy on a big data telecom user analysis platform, they do not overcome the limitations of single-machine computing and cannot effectively improve model training efficiency. To solve this problem, this article builds a telecom customer churn prediction model using a backpropagation neural network (BPNN) and deploys the MapReduce programming framework on the Hadoop platform. Using data from a telecom company, the article analyzes telecom customer churn in a big data environment. The research shows that the BPNN churn prediction model reaches an accuracy of 82.12%. After deployment on large data sets, the model's training time is greatly shortened; when the number of nodes is 8, training time levels off at 60 seconds. Under big data, the telecom user analysis platform not only preserves the accuracy of the algorithm but also overcomes the limitations of single-machine computing and effectively improves training efficiency. Compared with existing research, accuracy improves by 25.36% and running time is reduced by a factor of about two. This BPNN-based business model has clear advantages when processing larger data sets and offers a useful reference for the digital-driven business model transformation of the telecommunications industry.
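The MapReduce deployment described above can be sketched as mappers computing per-shard gradients and a reducer averaging them. For brevity the sketch uses a single logistic neuron as a stand-in for the full BPNN; the sharding and averaging scheme is an assumption, not the article's exact job design.

```python
import math

def map_partial_gradient(shard, weights):
    """Mapper: compute the squared-error gradient over one data shard
    for a single logistic neuron (a stand-in for the full BPNN)."""
    grad = [0.0] * len(weights)
    for features, label in shard:
        z = sum(w * x for w, x in zip(weights, features))
        pred = 1.0 / (1.0 + math.exp(-z))
        err = pred - label
        for i, x in enumerate(features):
            grad[i] += err * pred * (1.0 - pred) * x  # d(err^2/2)/dw_i
    return grad, len(shard)

def reduce_gradients(partials):
    """Reducer: merge per-shard gradient sums into one global average."""
    total = sum(n for _, n in partials)
    dim = len(partials[0][0])
    return [sum(g[i] for g, _ in partials) / total for i in range(dim)]
```

Because each mapper touches only its own shard, adding nodes shrinks per-node work, which is the mechanism behind the training-time reduction the abstract reports.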
Title: Prediction and Big Data Impact Analysis of Telecom Churn by Backpropagation Neural Network Algorithm from the Perspective of Business Model
Journal: Big Data | Pages: 355-368
Pub Date: 2023-10-01 | Epub Date: 2022-01-06 | DOI: 10.1089/big.2021.0279
Aisha Batool, Muhammad Wasif Nisar, Jamal Hussain Shah, Muhammad Attique Khan, Ahmed A Abd El-Latif
Traffic sign detection (TSD) in real-time environments is of great importance for applications such as automated driving. The large variety of traffic signs, their differing appearances, and their spatial representations cause huge intraclass variation. In this article, a model based on an extreme learning machine (ELM), a convolutional neural network (CNN), and scale transformation (ST), called the improved extreme learning machine network, is proposed to detect traffic signs in real time. The proposed model comprises a custom DenseNet-based CNN architecture, an improved version of region proposal networks called the accurate anchor prediction model (A2PM), ST, and an ELM module. The CNN architecture uses handcrafted features such as the scale-invariant feature transform and Gabor filters to sharpen the edges of traffic signs. The A2PM minimizes redundancy among extracted features to make the model efficient, and ST enables the model to detect traffic signs of different sizes. The ELM module enhances efficiency by reshaping the features. The proposed model is tested on three publicly available data sets, Challenging Unreal and Real Environments for Traffic Sign Recognition, Tsinghua-Tencent 100K, and the German Traffic Sign Detection Benchmark, achieving average precisions of 93.31%, 95.22%, and 99.45%, respectively. These results show that the proposed model outperforms state-of-the-art sign detection techniques.
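The ELM module's appeal is that its output weights have a closed-form least-squares solution rather than being trained by backpropagation. The sketch below shows that generic ELM recipe, not the iELMNet architecture itself; the hidden-layer size and sigmoid activation are arbitrary choices.

```python
import numpy as np

def elm_train(X, T, n_hidden=32, seed=0):
    """Basic ELM: random, fixed hidden layer; output weights solved
    in closed form with the Moore-Penrose pseudoinverse."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))   # random input weights
    b = rng.normal(size=n_hidden)                 # random biases
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))        # hidden activations
    beta = np.linalg.pinv(H) @ T                  # least-squares output weights
    return W, b, beta

def elm_predict(X, W, b, beta):
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return H @ beta
```

Since only `beta` is solved for and the hidden layer never changes, training is a single linear-algebra step, which is where the efficiency claim for ELM-style modules comes from.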
Title: iELMNet: Integrating Novel Improved Extreme Learning Machine and Convolutional Neural Network Model for Traffic Sign Detection
Journal: Big Data | Pages: 323-338
Pub Date: 2023-10-01 | Epub Date: 2023-01-27 | DOI: 10.1089/big.2021.0343
Chen Tao
Anomaly detection is crucial in a variety of domains, such as fraud detection, disease diagnosis, and equipment defect detection. With the development of deep learning, anomaly detection with Bayesian neural networks (BNNs) has become a novel research topic in recent years. This article proposes a widely applicable method of outlier detection (a category of anomaly detection) using BNNs based on uncertainty measurement. Three kinds of uncertainty arise in BNN prediction: epistemic uncertainty, aleatoric uncertainty, and (model) misspecification uncertainty. While approaches from previous studies are adopted to measure epistemic and aleatoric uncertainty, a new method of using loss functions to quantify misspecification uncertainty is proposed here. These three uncertainty sources are then merged by specific combination models to construct total prediction uncertainty. The key idea is that observations with high total prediction uncertainty should correspond to outliers in the data. The method is applied to experiments on the Modified National Institute of Standards and Technology (MNIST) dataset and the Taxi dataset. The results show that, if the network is appropriately constructed and well trained and the model parameters are carefully tuned, most anomalous images in the MNIST dataset and all abnormal traffic periods in the Taxi dataset can be reliably detected. In addition, the method's performance is compared with previously proposed BNN anomaly detection methods and with the classical Local Outlier Factor and Density-Based Spatial Clustering of Applications with Noise methods. This study links the classification of uncertainties directly to anomaly detection and is among the first to combine different uncertainty sources to refine detection outcomes, rather than using a single uncertainty source each time.
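The combination of the three uncertainty sources can be sketched as follows. The epistemic and aleatoric estimators are the standard Monte Carlo ones; the additive weighting and the quantile cutoff are illustrative assumptions, since the article's specific combination models and loss-based misspecification score are not detailed in the abstract.

```python
import statistics

def total_uncertainty(mc_means, mc_variances, misspec_score,
                      weights=(1.0, 1.0, 1.0)):
    """Combine three uncertainty sources for one observation.
    Epistemic = variance of the per-sample predictive means;
    aleatoric = average predicted variance; the misspecification
    score is an opaque input (the article derives it from losses)."""
    epistemic = statistics.pvariance(mc_means)
    aleatoric = sum(mc_variances) / len(mc_variances)
    w_e, w_a, w_m = weights
    return w_e * epistemic + w_a * aleatoric + w_m * misspec_score

def flag_outliers(scores, quantile=0.95):
    """Mark observations whose total uncertainty exceeds the given quantile."""
    cut = sorted(scores)[int(quantile * (len(scores) - 1))]
    return [s > cut for s in scores]
```

Observations whose combined score lands in the upper tail are the outlier candidates, which is the key idea the abstract states.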
Title: Applications of Bayesian Neural Networks in Outlier Detection
Journal: Big Data | Pages: 369-386
Pub Date: 2023-10-01 | DOI: 10.1089/big.2023.29062.editorial
Chinmay Chakraborty, Muhammad Khurram Khan
Title: Big Data-Driven Futuristic Fabric System in Societal Digital Transformation (editorial)
Journal: Big Data, 11(5) | Pages: 321-322
Xinjun Lai, Guitao Huang, Ziyue Zhao, Shenhe Lin, Sheng Zhang, Huiyu Zhang, Qingxin Chen, Ning Mao
This study investigates customers' product design requirements through online comments from social media and quickly translates these needs into product design specifications. First, an exponential discriminative snowball sampling method is proposed to generate a product-related subnetwork. Second, natural language processing (NLP) is used to mine user-generated comments, and the Graph SAmple and aggreGatE method is employed to embed each user's node-neighborhood information in the network to jointly define a user persona. Clustering is used for market and product model segmentation. Finally, a deep-learning bidirectional long short-term memory with conditional random fields framework is introduced for opinion mining. A comment frequency-inverse group frequency indicator is proposed to quantify every user group's positive and negative opinions of various specifications across different product functions. A case study of smartphone design analysis is presented with data from Baidu Tieba, a large Chinese online community. Eleven layers of social relationships were snowball sampled, covering 14,018 users and 30,803 comments. The proposed method produced a more reasonable user group clustering result than the conventional method. With this approach, each user group's dominant likes and dislikes for specifications can be identified immediately, and the similar and differing preferences for product features across user groups are instantly revealed. Managerial and engineering insights are also discussed.
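Reading "comment frequency-inverse group frequency" by analogy with TF-IDF gives the sketch below: cf is the share of a group's comments mentioning a feature, and igf down-weights features every group talks about. The exact formula in the article may differ; this is an assumed TF-IDF-style reading with invented example data.

```python
import math

def cf_igf(term_counts_by_group):
    """Score (group, feature) pairs: cf = within-group mention share,
    igf = log(number of groups / groups mentioning the feature)."""
    groups = list(term_counts_by_group)
    n_groups = len(groups)
    scores = {}
    for g in groups:
        counts = term_counts_by_group[g]
        total = sum(counts.values()) or 1
        for term, c in counts.items():
            mentioned = sum(1 for h in groups if term in term_counts_by_group[h])
            scores[(g, term)] = (c / total) * math.log(n_groups / mentioned)
    return scores
```

Features mentioned by every group score zero, so the indicator surfaces the specification opinions that distinguish one user group from the others.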
Title: Social Listening for Product Design Requirement Analysis and Segmentation: A Graph Analysis Approach with User Comments Mining
DOI: 10.1089/big.2022.0021 | Journal: Big Data | Published: 2023-09-04 (Journal Article)
Pub Date: 2023-08-21 | DOI: 10.3390/engproc2023038091
Manying Shi, Fang Luo, Hanping Ke, Shiliang Zhang
Title: Design and Analysis of Education Personalized Recommendation System under Vision of System Science Communication
Journal: Big Data