Pub Date : 2021-11-10 DOI: 10.46610/jodmm.2021.v06i03.003
N. Ranjan, R. Prasad, D. Mane
About 80% of organizational data exist in unstructured (text) format. E-mails, social media posts, notes, and a wide variety of other text documents are available, but these data are rarely given importance or analyzed in meaningful ways. It has been observed that information workers spend a significant share of their time (up to one third) locating this information and trying to make sense of it. Text analytics (TA) is the process that analyzes this unstructured text and converts it into useful information that significantly helps organizations in their business processes. In this paper, we discuss the business value, the methods, and the business applications of text analytics.
Title : A Brief Survey on Text Analytics Methods and Applications
Journal : International Journal of Data Mining Modelling and Management
Pub Date : 2021-11-07 DOI: 10.46610/jodmm.2021.v06i03.001
N. Ranjan, R. Prasad
About 80% of organizational data exist in unstructured (text) format. E-mails, social media posts, notes, and a wide variety of other text documents are available, but these data are rarely given importance or analyzed in meaningful ways. It has been observed that information workers spend a significant share of their time (up to one third) locating this information and trying to make sense of it. Text analytics is the process that analyzes this unstructured text and converts it into useful information that significantly helps organizations in their business processes. In this paper, we highlight the business value, some of the methods, and the business applications of text analytics.
Title : Text Analytics: An Application of Text Mining
Journal : International Journal of Data Mining Modelling and Management
Pub Date : 2020-12-16 DOI: 10.46610/jodmm.2020.v05i03.005
S. AgilanK, S. BaraneetharanP, N. Surya, M. Sujithra, P. Velvadivu
Cricket is one of the most loved sports, especially in India. The Indian Premier League (IPL) is a T20 cricket league tournament held in India every year, in which top players from all over the world take part. It is the most celebrated and attended cricket league in the world and ranks sixth among all sports leagues. Hadoop is an open-source framework used to store and process big data. Here, we analyse and interpret various insights from an Indian Premier League dataset using Pig and Hive.
Title : IPL Data Analysis using Hadoop – Pig and Hive
Journal : International Journal of Data Mining Modelling and Management
Pub Date : 2019-01-01 DOI: 10.1504/IJDMMM.2019.10016838
H. Sabahno, S. Mousavi, A. Amiri
Adaptive control charts have been shown to outperform classical control charts because some or all of their parameters adapt to previous process information. Fuzzy classical control charts have been considered by many researchers over the last two decades; however, fuzzy adaptive control charts have not been investigated. In this paper, we introduce a new adaptive X̄-R fuzzy control chart that allows all of the chart's parameters to adapt based on the process state in the previous sample. The warning limits are also redefined for the fuzzy environment. We use the fuzzy mode defuzzification technique to design the decision procedure of the proposed fuzzy adaptive control chart. Finally, an illustrative example presents the application of the proposed control chart.
Title : A new development of an adaptive X̄-R control chart under a fuzzy environment
Journal : International Journal of Data Mining Modelling and Management
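The idea of letting chart parameters depend on the state of the previous sample can be illustrated with a minimal crisp (non-fuzzy) sketch of the mean chart only. This is not the authors' fuzzy procedure: the parameter values, the names `ChartParams`, `RELAXED`, `TIGHT`, `classify` and `next_params`, and the two-level switching rule are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChartParams:
    n: int      # sample size
    h: float    # sampling interval until the next sample (hours)
    k: float    # control-limit coefficient
    w: float    # warning-limit coefficient (w < k)


# Hypothetical parameter sets; a real design would tune these values.
RELAXED = ChartParams(n=3, h=2.0, k=3.2, w=1.5)
TIGHT = ChartParams(n=8, h=0.5, k=2.8, w=1.2)


def classify(sample_mean, mu0, sigma0, params):
    """Return 'central', 'warning' or 'out' for a sample mean.

    The standardized distance from the target mu0 is compared with the
    warning and control coefficients of the parameter set in use.
    """
    z = abs(sample_mean - mu0) / (sigma0 / params.n ** 0.5)
    if z > params.k:
        return "out"
    if z > params.w:
        return "warning"
    return "central"


def next_params(state):
    """Choose the parameters for the next sample from the previous state.

    A point in the warning region tightens the scheme (larger sample,
    shorter interval, narrower limits). Any other state uses the relaxed
    set; after an 'out' signal the process is assumed to be investigated
    and restarted.
    """
    return TIGHT if state == "warning" else RELAXED
```

In this sketch a sample mean that falls between the warning and control limits triggers the tighter parameter set for the following sample; the range (R) chart and the fuzzy decision rule described in the abstract are not modelled.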
Pub Date : 2019-01-01 DOI: 10.1504/ijdmmm.2019.10015450
Chitrakala S, S. T
Title : Human Activity Recognition based on Interaction Modelling
Journal : International Journal of Data Mining Modelling and Management
Pub Date : 2017-09-13 DOI: 10.1504/IJDMMM.2017.086566
Supaporn Tantanasiriwong, S. Guha, P. Janecek, C. Haruechaiyasak, L. Azzopardi
Cross-domain recommendations are of growing importance in the research community. An application of particular interest is to recommend a set of relevant research papers as citations for a given patent. This paper proposes an approach for cross-domain citation recommendation based on a hybrid topic model and co-citation selection. Using the topic model, relevant terms from documents can be clustered into the same topics. In addition, the co-citation selection technique helps select citations based on a set of highly similar patents. To evaluate performance, we compared our proposed approach with traditional baseline approaches using a corpus of patents collected from four technological fields: biotechnology, environmental technology, medical technology and nanotechnology. Experimental results show that our cross-domain citation recommendation yields higher performance in predicting relevant publication citations than all baseline approaches.
Title : Cross-domain citation recommendation based on hybrid topic model and co-citation selection
Journal : International Journal of Data Mining Modelling and Management, pp. 220-236
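The co-citation selection step, choosing citations from a set of highly similar patents, can be sketched roughly as follows. This is only an illustration under stated assumptions: each patent is represented by a topic-weight vector (however the topic model produced it), similarity is plain cosine similarity, and candidates are ranked by how many of the nearest patents cite them. The function names `cosine` and `recommend` and the parameters `k` and `top_n` are hypothetical and do not come from the paper.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two equal-length topic vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def recommend(query_topics, patents, k=2, top_n=3):
    """Recommend paper ids for a query patent.

    patents: list of (topic_vector, cited_paper_ids) pairs.
    The k patents most similar to the query are selected, and the papers
    they cite are ranked by the number of those patents citing them.
    """
    ranked = sorted(patents, key=lambda p: cosine(query_topics, p[0]), reverse=True)
    votes = Counter()
    for _topics, cited in ranked[:k]:
        votes.update(sorted(set(cited)))  # count each paper once per patent
    return [paper for paper, _count in votes.most_common(top_n)]
```

With a corpus such as `[([1.0, 0.0], ["A", "B"]), ([0.9, 0.1], ["B", "C"]), ([0.0, 1.0], ["D"])]` and the query vector `[1.0, 0.0]`, the two nearest patents both cite paper `B`, so it is ranked first, while `D` is never considered because its patent is not among the nearest neighbours.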
Pub Date : 2015-05-12 DOI: 10.1504/IJDMMM.2015.069247
Erik Štrumbelj, Marko Robnik-Šikonja
We analyse data from 5,000 competitors who participated in an online soccer managerial game based on the English Premier League (EPL). We show that competitors incorporate relevant information about the outcome of a soccer match into their decisions. Furthermore, forecasts based on managerial game data are significantly better than random forecasts, forecasts based on relative frequency, and forecasts based on teams' attendance, but worse than bookmaker odds. Our work provides evidence that crowds possess a significant amount of information for match outcome prediction.
Title : Predictive power of fantasy sports data for soccer forecasting
Journal : International Journal of Data Mining Modelling and Management, pp. 154-163
Pub Date : 2013-08-14 DOI: 10.1504/IJDMMM.2013.055861
Maciej Piasecki, Michal Marcinczuk, Radoslaw Ramocki, M. Maziarz
The paper presents WordNetLoom, an application for WordNet development used in the construction of a Polish WordNet called plWordNet. WordNetLoom provides two means of interaction: a form-based one, implemented initially, and a graph-based one, introduced recently. The graphical, active presentation of the WordNet structure enables direct work on the structure of synsets and lexico-semantic relations. Both means of interaction are compared, and the results of a usability evaluation performed on a group of experienced WordNetLoom users are presented. Directions for further development of the application are identified. A new version of WordNetWeaver, a tool supporting semi-automated WordNet expansion, is also presented. The new version is based on a user interface similar to that of WordNetLoom, utilises all types of WordNet relations and is embedded in WordNetLoom. The paper also discusses the role of the application in WordNet development and the extent to which the application can be used for other WordNets. A set of WWW-based tools supporting teamwork coordination and verification is presented as well.
Title : WordnetLoom: a Wordnet Development System Integrating Form-based and Graph-based Perspectives
Journal : International Journal of Data Mining Modelling and Management, pp. 210-232
Pub Date : 1900-01-01 DOI: 10.1504/ijdmmm.2024.10058088
Marina Bagi Babac, Ivona Lipovac
Title : Developing a Data Pipeline Solution for Big Data Processing
Journal : International Journal of Data Mining Modelling and Management
Pub Date : 1900-01-01 DOI: 10.1504/ijdmmm.2024.10055782
Dipti P Rana, Mathe John Kenny Kumar
Title : HARUIM: High Average Recent Utility Itemset Mining
Journal : International Journal of Data Mining Modelling and Management