Pub Date: 2023-01-24 DOI: 10.1108/dta-09-2021-0236
Hui Xu, Junjie Zhang, Hui Sun, Miao Qi, Jun Kong
Purpose: Attention is one of the most important factors affecting students' academic performance. Effectively analyzing students' attention in class can promote precise teaching by teachers and personalized learning by students. To intelligently analyze students' attention in the classroom from the first-person perspective, this paper proposes a fusion model based on gaze tracking and object detection. In particular, the proposed attention analysis model does not depend on any smart equipment.
Design/methodology/approach: Given a first-person view video of students' learning, the authors first estimate the gazing point using a deep space–time neural network. Second, a single shot multi-box detector and a fast segmentation convolutional neural network are comparatively adopted to detect the objects in the video. Third, they predict the gazing objects by combining the results of gazing point estimation and object detection. Finally, the personalized attention of students is analyzed based on the predicted gazing objects and measurable eye movement criteria.
Findings: A large number of experiments are carried out on a public database and on a new dataset built in a real classroom. The experimental results show that the proposed model not only accurately tracks students' gazing trajectories and effectively analyzes the fluctuation of attention of individual students and of all students, but also provides a valuable reference for evaluating students' learning process.
Originality/value: The contributions of this paper can be summarized as follows. The analysis of students' attention plays an important role in improving teaching quality and student achievement. However, there is little research on how to automatically and intelligently analyze students' attention. To alleviate this problem, this paper focuses on analyzing students' attention by gaze tracking and object detection in classroom teaching, which is significant for practical application in the field of education. The authors propose an intelligent fusion model based on a deep neural network, which mainly includes a gazing point module and an object detection module, to analyze students' attention in classroom teaching without relying on any smart wearable device. They introduce the attention mechanism into the gazing point module to improve the performance of gazing point detection, and they perform comparison experiments on the public dataset to show that the gazing point module achieves better performance. They associate the eye movement criteria with visual gaze to obtain quantifiable, objective data for students' attention analysis, which can provide a valuable basis for evaluating students' learning process, give parents and teachers useful information about students' learning and support the development of individualized teaching. They built a new database that contains the first-person view videos of 11 subjects in a real classroom an
Title: Analyzing students' attention by gaze tracking and object detection in classroom teaching (Data Technologies and Applications)
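The fusion step described in this abstract (combining an estimated gazing point with detected objects to predict the gazed object) can be sketched as follows. This is a minimal illustration, not the authors' model: the `Detection` type, the smallest-enclosing-box tie-break and the `attention_ratio` measure are all assumptions made for the example, and the gaze point and boxes are taken as already produced by upstream networks.

```python
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple


@dataclass
class Detection:
    """One detected object (hypothetical structure for this sketch)."""
    label: str
    box: Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max) in pixels


def gazed_object(gaze: Tuple[float, float], detections: List[Detection]) -> Optional[str]:
    """Return the label of the smallest box containing the gaze point, or None."""
    gx, gy = gaze
    hits = [
        d for d in detections
        if d.box[0] <= gx <= d.box[2] and d.box[1] <= gy <= d.box[3]
    ]
    if not hits:
        return None

    def area(d: Detection) -> float:
        return (d.box[2] - d.box[0]) * (d.box[3] - d.box[1])

    # Assumption: when boxes overlap, the smallest one is the most specific target.
    return min(hits, key=area).label


def attention_ratio(gazed_labels: List[Optional[str]], on_task: Set[str]) -> float:
    """Fraction of frames whose gazed object belongs to the on-task set."""
    if not gazed_labels:
        return 0.0
    return sum(1 for label in gazed_labels if label in on_task) / len(gazed_labels)
```

Applied per frame, `gazed_object` yields a label sequence that `attention_ratio` can summarize for one student; which objects count as on-task is a choice the analyst has to make.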
Pub Date: 2023-01-23 DOI: 10.1108/dta-10-2022-0385
Zhongbao Liu, Wen-juan Zhao
Purpose: In recent years, Chinese sentiment analysis has made great progress, but the characteristics of the language itself and the requirements of downstream tasks have not been explored thoroughly. It is not practical to directly transfer achievements in English sentiment analysis to Chinese because of the large differences between the two languages.
Design/methodology/approach: In view of the particularity of Chinese text and the requirements of sentiment analysis, this paper proposes a Chinese sentiment analysis model that integrates multi-granularity semantic features. The model adds radical and part-of-speech features to the character and word features, and applies bidirectional long short-term memory, an attention mechanism and a recurrent convolutional neural network.
Findings: The comparative experiments showed that the F1 values of this model reach 88.28 and 84.80 per cent on the man-made dataset and the NLPECC dataset, respectively. An ablation experiment was also conducted to verify the effectiveness of the attention mechanism and of the part-of-speech, radical, character and word features in Chinese sentiment analysis. The performance of the proposed model exceeds that of existing models to some extent.
Originality/value: The academic contribution of this paper is as follows. First, in view of the particularity of Chinese texts and the requirements of sentiment analysis, this paper focuses on addressing the deficiencies of Chinese sentiment analysis in the big data context. Second, this paper borrows ideas from multiple frontier theories and methods across disciplines such as information science, linguistics and artificial intelligence, which makes it innovative and comprehensive. Finally, this paper deeply integrates multi-granularity semantic features such as character, word, radical and part of speech, which further complements the theoretical framework and method system of Chinese sentiment analysis.
Title: Chinese sentiment analysis model by integrating multi-granularity semantic features (Data Technologies and Applications)
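The findings are reported as F1 values, so a short worked definition may help. The function below is a generic sketch of the binary F1 score (the harmonic mean of precision and recall); the name `f1_score` and the `positive` parameter are choices made for this example, and the abstract does not state whether the reported figures are binary, macro or micro averages.

```python
from typing import Hashable, Sequence


def f1_score(y_true: Sequence[Hashable], y_pred: Sequence[Hashable], positive: Hashable = 1) -> float:
    """Binary F1 for the class `positive`: 2 * P * R / (P + R)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

For example, with labels `[1, 1, 1, 0, 0]` and predictions `[1, 1, 0, 1, 0]` there are two true positives, one false positive and one false negative, so precision and recall are both 2/3 and F1 is 2/3.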
Pub Date: 2023-01-10 DOI: 10.1108/dta-08-2021-0229
Yue Ming
Purpose: Social media platforms such as Reddit can serve as a place where people with shared health problems exchange knowledge and support. Previous studies have focused on the overall picture of how much social support people who live with HIV/AIDS (PLWHA) receive from online interactions. Yet only a few studies have examined the impact of social support from social media platforms on antiretroviral therapy (ART), which is a necessary lifelong therapy for PLWHA. This study used social support theory to examine related Reddit posts.
Design/methodology/approach: This study used content analysis to analyze ART-related Reddit posts. Each Reddit post was manually coded by two coders for social support type. A computational text analysis tool, Linguistic Inquiry and Word Count, was used to generate linguistic features. ANOVA analyses were conducted to compare differences in user engagement and well-being across the types of social support.
Findings: Results suggest that most of the posts were informational support posts, followed by emotional support posts and instrumental support posts. Results indicate that there are no significant differences in user engagement variables across social support types, but there are significant differences in several well-being variables, including analytic score, clout score, health word usage and negative emotional word usage.
Originality/value: This study contributes to further understanding of social support theory in an online context used predominantly by a younger generation. Practical advice for public health researchers and practitioners is discussed.
Title: Social support on Reddit for antiretroviral therapy (Data Technologies and Applications, pp. 279-292)
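The comparison across social support types relies on ANOVA. As a hedged illustration of what a one-way ANOVA computes, the sketch below derives the F statistic from between-group and within-group sums of squares. The function name and the toy data are assumptions for this example; the study's actual variables and software are not described in the abstract, and a p-value would additionally require the F distribution (not implemented here).

```python
from typing import List, Sequence, Tuple


def one_way_anova_f(groups: List[Sequence[float]]) -> Tuple[float, int, int]:
    """Return (F, df_between, df_within) for a one-way ANOVA.

    Raises ZeroDivisionError if every group has zero within-group variance.
    """
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]

    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_between = k - 1
    df_within = n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within
```

With three toy groups `[1, 2, 3]`, `[2, 3, 4]` and `[3, 4, 5]`, both sums of squares equal 6, the degrees of freedom are 2 and 6, and the F statistic is 3.0.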
Pub Date: 2022-12-25 DOI: 10.1108/dta-04-2022-0151
Ashutosh Kumar, Aakanksha Sharaff
Purpose: The purpose of this study was to design a multitask learning model that extracts biomedical entities from biomedical texts without ambiguity.
Design/methodology/approach: The proposed automated bio entity extraction (ABEE) model is a multitask learning model built by combining single-task learning models. Bidirectional Encoder Representations from Transformers is used to train each single-task learning model. The models' outputs are then combined to identify a variety of entities in biomedical text.
Findings: The proposed ABEE model targets unique gene/protein, chemical and disease entities in biomedical text. This is important for biomedical research such as drug discovery and clinical trials. The research helps reduce not only the effort required of researchers but also the cost of new drug discoveries and new treatments.
Research limitations/implications: The authors report no limitations of the model as such, but the research team plans to test the model on gigabytes of data and to build a knowledge graph so that researchers can easily estimate the entities of similar groups.
Practical implications: The ABEE model will be helpful in various natural language processing tasks. In information extraction (IE), it plays an important role in biomedical named entity recognition and biomedical relation extraction; in information retrieval, it supports tasks such as literature-based knowledge discovery.
Social implications: During the COVID-19 pandemic, demand for this type of work increased because of the increase in clinical trials at that time. Had this type of research been introduced earlier, it would have reduced the time and effort needed for new drug discoveries in this area.
Originality/value: This work proposes a novel multitask learning model that is capable of extracting biomedical entities from biomedical text without ambiguity. The proposed model achieved state-of-the-art performance in terms of precision, recall and F1 score.
Title: ABEE: automated bio entity extraction from biomedical text documents (Data Technologies and Applications, pp. 222-244)
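The abstract says the outputs of the single-task models are combined, but does not say how. One plausible way to do this, shown purely as an assumption, is to pool the entity spans predicted by each task-specific model and resolve overlaps by keeping the higher-scoring span. The `Span` type, the score field and the greedy rule below are hypothetical and are not taken from the paper.

```python
from typing import List, NamedTuple


class Span(NamedTuple):
    """A predicted entity mention (hypothetical structure for this sketch)."""
    start: int   # character offset, inclusive
    end: int     # character offset, exclusive
    label: str   # e.g. "GENE", "CHEMICAL", "DISEASE"
    score: float # model confidence


def merge_spans(per_task: List[List[Span]]) -> List[Span]:
    """Pool spans from several single-task models, dropping lower-scoring overlaps."""
    candidates = sorted(
        (span for spans in per_task for span in spans),
        key=lambda s: -s.score,
    )
    kept: List[Span] = []
    for span in candidates:
        # Keep a span only if it does not overlap any already accepted span.
        if all(span.end <= k.start or span.start >= k.end for k in kept):
            kept.append(span)
    return sorted(kept, key=lambda s: s.start)
```

Resolving overlaps this way guarantees that each stretch of text receives at most one entity type, which is one reading of extraction "without ambiguity".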
Pub Date: 2022-12-23 DOI: 10.1108/dta-06-2022-0252
Konstantinos Chytas, A. Tsolakidis, Evangelia Triperina, C. Skourlas
Purpose: The purpose of this paper is to introduce an interactive system that relies on the educational data generated by a University's online services to assess, correct and ameliorate the learning process for both students and faculty.
Design/methodology/approach: In the presented research, data from the online services provided by a Greek University prior to, during and after the COVID-19 outbreak are analyzed and utilized in order to ameliorate the offered learning process and provide better quality services to the students. Moreover, insights into students' performance can be derived from their learning paths, their online presence and their participation in the University's services, so as to better support and assist them.
Findings: The system can deduce the future learning progression of each student from past and current performance. As a direct consequence, the exploitation of the data can provide a road map for the strategic planning of universities, can indicate how the learning process can be updated and amended, both online and in person, and can make the learning experience more essential, effective and efficient for students while helping professors provide a more meaningful and to-the-point learning experience.
Originality/value: Nowadays, educational activities in academia are strongly supported by online services, information systems and online educational materials. Learning in the academic setting still takes place primarily on University premises. However, the exploitation of contemporary technologies and supporting materials that are available online can enrich and transform the educational process and its results.
Title: Educational data mining in the academic setting: employing the data produced by blended learning to ameliorate the learning process (Data Technologies and Applications, pp. 366-384)
Pub Date: 2022-12-09 DOI: 10.1108/dta-11-2021-0332
Xuwei Pan, Xuemei Zeng, Ling Ding
Purpose: With the continuous increase of users, resources and tags, social tagging systems gradually present the characteristics of “big data”, such as large volume, fast growth, complexity and unreliable quality, which greatly increases the complexity of recommendation. The tension between the efficiency and the effectiveness of recommendation services in social tagging is becoming increasingly prominent. The purpose of this study is to incorporate topic optimization into collaborative filtering to enhance both the effectiveness and the efficiency of personalized recommendations for social tagging.
Design/methodology/approach: Following the idea of optimization before service, this paper presents an approach that incorporates topic optimization into collaborative recommendations for social tagging. In the proposed approach, the recommendation process is divided into two phases, offline topic optimization and online recommendation service, to achieve high-quality and efficient personalized recommendation services. In the offline phase, the tags' topic model is constructed and then used to optimize the latent preference of users and the latent affiliation of resources on topics.
Findings: Experimental evaluation shows that, compared with the three baseline approaches, the proposed approach improves both the precision and the recall of recommendations and enhances the efficiency of online recommendations. The proposed topic optimization–incorporated collaborative recommendation approach can improve both the effectiveness and the efficiency of recommendation in social tagging.
Originality/value: With the support of the proposed approach, personalized recommendation in social tagging with high quality and efficiency can be achieved.
Title: Topic optimization–incorporated collaborative recommendation for social tagging (Data Technologies and Applications)
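The two-phase split described above (offline topic modelling, online recommendation) can be illustrated with a small sketch. Everything here is an assumption made for illustration: the tag-to-topic weights are taken as given (for instance from a topic model fitted offline), the profiles are simple normalized sums, and the online score is a dot product. The paper's actual optimization procedure is not described in the abstract.

```python
from typing import Dict, List


def topic_profile(tag_counts: Dict[str, int], tag_topics: Dict[str, List[float]], n_topics: int) -> List[float]:
    """Offline phase: aggregate tag usage into a normalized topic vector."""
    vec = [0.0] * n_topics
    for tag, count in tag_counts.items():
        # Tags missing from the topic model contribute nothing.
        for topic, weight in enumerate(tag_topics.get(tag, [0.0] * n_topics)):
            vec[topic] += count * weight
    total = sum(vec)
    return [v / total for v in vec] if total else vec


def recommend(user_vec: List[float], resource_vecs: Dict[str, List[float]], top_n: int = 2) -> List[str]:
    """Online phase: rank resources by dot product with the user's topic vector."""
    scores = {
        rid: sum(u * r for u, r in zip(user_vec, vec))
        for rid, vec in resource_vecs.items()
    }
    # Sort by descending score, breaking ties by resource id for determinism.
    return sorted(scores, key=lambda rid: (-scores[rid], rid))[:top_n]
```

Because the topic vectors are computed ahead of time, the online step only needs one short dot product per candidate resource, which is the kind of efficiency gain the two-phase design aims for.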
Pub Date: 2022-11-29 DOI: 10.1108/dta-06-2022-0243
Vahide Bulut
Purpose: Surface curvature is needed to analyze the range data of real objects and is widely applied in object recognition and segmentation, robotics and computer vision. However, it is not easy to estimate the curvature of scanned data. In recent years, machine learning classification methods have gained importance in various fields such as finance, health and engineering. The purpose of this study is to classify surface points based on principal curvatures in order to find the best method for determining surface point types.
Design/methodology/approach: A feature selection method is presented to find the feature vector that achieves the highest accuracy. To this end, ten different feature selections are used, and six sample datasets of different sizes are classified using these feature vectors.
Findings: The author examined the surface examples based on the feature vector using machine learning classification methods and compared the results for each experiment.
Originality/value: To the best of the author's knowledge, this is the first study to examine surface points according to principal curvatures using machine learning classification methods.
Title: Identifying surface points based on machine learning algorithms: a comprehensive analysis. Author: Vahide Bulut. Journal: Data Technologies and Applications. Published: 2022-11-29. DOI: https://doi.org/10.1108/dta-06-2022-0243
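The idea of typing surface points from their principal curvatures can be illustrated with a minimal sketch. The four categories below are the textbook differential-geometry ones (planar, parabolic, elliptic, hyperbolic); the abstract does not list the point types actually used in the study, so the labels, the function name `classify_surface_point` and the tolerance `tol` are assumptions made for illustration, and this rule-based version is not the machine learning pipeline the author evaluates.

```python
def classify_surface_point(k1: float, k2: float, tol: float = 1e-6) -> str:
    """Label a surface point from its principal curvatures k1 and k2.

    Assumed textbook categories, decided by which curvatures vanish and by
    the sign of the Gaussian curvature K = k1 * k2.
    """
    k1_zero = abs(k1) <= tol
    k2_zero = abs(k2) <= tol
    if k1_zero and k2_zero:
        return "planar"       # both curvatures vanish
    if k1_zero or k2_zero:
        return "parabolic"    # exactly one curvature vanishes, so K = 0
    if k1 * k2 > 0:
        return "elliptic"     # same sign, so K > 0
    return "hyperbolic"       # opposite signs, so K < 0 (saddle)


if __name__ == "__main__":
    for k1, k2 in [(1.0, 0.5), (1.0, -0.5), (1.0, 0.0), (0.0, 0.0)]:
        print(k1, k2, classify_surface_point(k1, k2))
```

Labels produced this way could serve as ground truth for a learned classifier, while the curvature estimates themselves remain the hard part on noisy scanned data.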
Pub Date: 2022-11-09 DOI: 10.1108/dta-06-2021-0167
Anuraj Mohan, Karthika P.V., P. Sankar, Maya Manohar K., Amala Peter
Purpose: Money laundering is the process of concealing unlawfully obtained funds by presenting them as coming from a legitimate source. Criminals use crypto money laundering to hide the illicit origin of funds using a variety of methods. The simplest form of bitcoin money laundering relies heavily on the fact that transactions made in cryptocurrencies are pseudonymous, but open data gives more power to investigators and enables the crowdsourcing of forensic analysis. To curb these illegal activities, there exist various rules, policies and technologies collectively known as anti-money laundering (AML) tools. When properly implemented, AML restrictions reduce the negative effects of illegal economic activity while also promoting financial market integrity and stability, but they impose high costs on institutions. The purpose of this work is to motivate the opportunity to reconcile the cause of safety with that of financial inclusion, bearing in mind the limitations of the available data. The authors use the Elliptic dataset; to the best of the authors' knowledge, this is the largest labelled transaction dataset publicly available in any cryptocurrency. Design/methodology/approach: AML in bitcoin can be modelled as a node classification task in dynamic networks. This work introduces the graph convolutional decision forest, which combines the strengths of the evolving graph convolutional network and the deep neural decision forest (DNDF). This model is used to classify the unknown transactions in the Elliptic dataset. Additionally, applying knowledge distillation (KD) to the proposed approach gives the best results among all the techniques tested. Findings: This work demonstrates the importance of combining dynamic graph learning with ensemble feature learning. The results show the superiority of the proposed model in classifying the illicit transactions in the Elliptic dataset. Experiments also show that the results can be further improved when the system is fine-tuned using a KD framework. Originality/value: Existing works used either ensemble learning or dynamic graph learning to tackle the problem of AML in bitcoin. The proposed model offers a novel way to combine the power of random forests with dynamic graph learning methods. Furthermore, the work demonstrates the advantage of KD in improving the performance of the whole system.
Title: Improving anti-money laundering in bitcoin using evolving graph convolutions and deep neural decision forest. Journal: Data Technologies and Applications, pp. 313-329. DOI: https://doi.org/10.1108/dta-06-2021-0167
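For readers unfamiliar with knowledge distillation, the sketch below shows the commonly used formulation in which a student model is trained on a blend of the usual cross-entropy loss and a temperature-softened KL divergence from a teacher's predictions. The abstract does not specify the KD objective the authors used, so the function names, the `temperature` and `alpha` parameters and the example logits are all assumptions for illustration only.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with temperature scaling."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, true_class,
                      temperature=2.0, alpha=0.5):
    """Blend of hard-label cross-entropy and softened teacher-student KL.

    alpha weights the hard-label term; (1 - alpha) weights the KD term,
    which is scaled by temperature**2 as in the usual formulation.
    """
    # Hard-label term: cross-entropy of the student against the true class.
    hard = -math.log(softmax(student_logits)[true_class])
    # Soft term: KL(teacher || student) on temperature-softened distributions.
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    soft = sum(p * math.log(p / q)
               for p, q in zip(p_teacher, p_student) if p > 0)
    return alpha * hard + (1 - alpha) * temperature ** 2 * soft


if __name__ == "__main__":
    # Hypothetical two-class logits (licit vs. illicit) for one transaction.
    print(distillation_loss([0.2, 1.1], [0.1, 2.3], true_class=1))
```

When the student's logits match the teacher's, the soft term vanishes and only the hard-label term remains, which is a quick sanity check for any implementation.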
Pub Date: 2022-10-17 DOI: 10.1108/dta-05-2022-0210
Hasnae Zerouaoui, A. Idri, Omar El Alaoui
Purpose: Breast cancer (BC) causes hundreds of thousands of deaths worldwide each year. An early-stage diagnosis of this disease can reduce the morbidity and mortality rate by helping to select the most appropriate treatment options, especially when histological BC images are used for the diagnosis. Design/methodology/approach: The present study proposes and evaluates a novel approach consisting of 24 deep hybrid heterogenous ensembles that combine the strength of seven deep learning techniques (DenseNet 201, Inception V3, VGG16, VGG19, Inception-ResNet-V3, MobileNet V2 and ResNet 50) for feature extraction and four well-known classifiers (multi-layer perceptron, support vector machines, K-nearest neighbors and decision tree) by means of hard and weighted voting combination methods for the histological classification of BC medical images. Furthermore, the best deep hybrid heterogenous ensembles were compared to the deep stacked ensembles to determine the best strategy for designing deep ensemble methods. The empirical evaluations used four classification performance criteria (accuracy, sensitivity (recall), precision and F1-score), fivefold cross-validation, the Scott–Knott (SK) statistical test and the Borda count voting method, and were carried out over the histological BreakHis public dataset with four magnification factors (40×, 100×, 200× and 400×). The SK statistical test and Borda count were used to cluster the designed techniques and to rank the techniques belonging to the best SK cluster, respectively. Findings: Results showed that the deep hybrid heterogenous ensembles outperformed both their individual members and the deep stacked ensembles and reached accuracy values of 96.3, 95.6, 96.3 and 94 per cent across the four magnification factors 40×, 100×, 200× and 400×, respectively. Originality/value: The proposed deep hybrid heterogenous ensembles can be applied to BC diagnosis to assist pathologists in reducing missed diagnoses and proposing adequate treatments for patients.
Title: A new approach for histological classification of breast cancer using deep hybrid heterogenous ensemble. Journal: Data Technologies and Applications, pp. 245-278. DOI: https://doi.org/10.1108/dta-05-2022-0210
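The combination and ranking rules named in this abstract (hard voting, weighted voting and Borda count) can be sketched in a few lines. This is a generic illustration, not the authors' implementation: the function names, the example labels, the classifier weights and the technique names "A", "B" and "C" are hypothetical, and the abstract does not say how the weights were chosen.

```python
from collections import Counter, defaultdict


def hard_vote(labels):
    """Majority vote over the class labels predicted by each classifier."""
    return Counter(labels).most_common(1)[0][0]


def weighted_vote(labels, weights):
    """Each classifier's vote counts in proportion to its weight."""
    totals = defaultdict(float)
    for label, weight in zip(labels, weights):
        totals[label] += weight
    return max(totals, key=totals.get)


def borda_count(rankings):
    """Aggregate several best-to-worst rankings into one overall ranking.

    In a ranking of n candidates, the candidate at position i (0-based)
    earns n - 1 - i points; ties in total score are broken alphabetically.
    """
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, candidate in enumerate(ranking):
            scores[candidate] += n - 1 - position
    return sorted(scores, key=lambda c: (-scores[c], c))


if __name__ == "__main__":
    # Hypothetical predictions of three classifiers for one image.
    labels = ["malignant", "benign", "benign"]
    print(hard_vote(labels))                         # majority label
    print(weighted_vote(labels, [0.7, 0.1, 0.1]))    # weight-dominated label
    # Hypothetical rankings of techniques A, B, C under three criteria.
    print(borda_count([["A", "B", "C"], ["B", "A", "C"], ["A", "C", "B"]]))
```

The example shows why the two voting rules can disagree: a single heavily weighted classifier can override a simple majority.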
Pub Date: 2022-09-29 DOI: 10.1108/dta-03-2022-0124
Zheng Wang, Ying Ji, Tao Zhang, Yuanming Li, Lun Wang, Shaojian Qu
Purpose: With the continuous development of online shopping, analyzing the competitiveness of products in a fiercely competitive market is becoming increasingly crucial for businesses positioning their own product development. However, the information overload brought by network development makes it increasingly difficult to obtain accurate competitiveness information. Therefore, research that combines competitiveness analysis with the study of perceived helpfulness is urgently needed. Furthermore, deviations exist in the three common methods of perceived helpfulness research. Finally, traditional information fusion analysis only analyzes the advantages and disadvantages of products in competitiveness analysis without taking account of the competitive environment. Design/methodology/approach: This study puts forward a novel prediction model of perceived helpfulness that combines unsupervised learning and sentiment analysis techniques to compare the pros and cons of congeneric products. Findings: This paper uses the Wilcoxon test to demonstrate that the proposed competitiveness analysis significantly corrects the traditional methods. It is noted that the positive reviews of the products in this study have a greater impact on product word of mouth and competitiveness than negative ones. Originality/value: To sum up, the results of this study help businesses locate their dynamic market position relative to competitors in practice and explore new methods for long-term strategic development planning.
Title: Product competitiveness analysis from the perspective of customer perceived helpfulness: a novel method of information fusion research. Journal: Data Technologies and Applications. DOI: https://doi.org/10.1108/dta-03-2022-0124