Data continuously gathered monitoring the spreading of the COVID-19 pandemic form an unbounded flow of data. Accurately forecasting if the infections will increase or decrease has a high impact, but it is challenging because the pandemic spreads and contracts periodically. Technically, the flow of data is said to be imbalanced and subject to concept drifts because signs of decrements are the minority class during the spreading periods, while they become the majority class in the contraction periods and the other way round. In this paper, we propose a case study applying the Continuous Synthetic Minority Oversampling Technique (C-SMOTE), a novel meta-strategy to pipeline with Streaming Machine Learning (SML) classification algorithms, to forecast the COVID-19 pandemic trend. Benchmarking SML pipelines that use C-SMOTE against state-of-the-art methods on a COVID-19 dataset, we bring statistical evidence that models learned using C-SMOTE are better.
{"title":"Predict COVID-19 Spreading With C-SMOTE","authors":"Alessio Bernardo, Emanuele Della Valle","doi":"10.52825/bis.v1i.45","DOIUrl":"https://doi.org/10.52825/bis.v1i.45","url":null,"abstract":"Data continuously gathered monitoring the spreading of the COVID-19 pandemic form an unbounded flow of data. Accurately forecasting if the infections will increase or decrease has a high impact, but it is challenging because the pandemic spreads and contracts periodically. Technically, the flow of data is said to be imbalanced and subject to concept drifts because signs of decrements are the minority class during the spreading periods, while they become the majority class in the contraction periods and the other way round. In this paper, we propose a case study applying the Continuous Synthetic Minority Oversampling Technique (C-SMOTE), a novel meta-strategy to pipeline with Streaming Machine Learning (SML) classification algorithms, to forecast the COVID-19 pandemic trend. Benchmarking SML pipelines that use C-SMOTE against state-of-the-art methods on a COVID-19 dataset, we bring statistical evidence that models learned using C-SMOTE are better.","PeriodicalId":56020,"journal":{"name":"Business & Information Systems Engineering","volume":"12 1","pages":"27-38"},"PeriodicalIF":7.9,"publicationDate":"2021-07-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82762125","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
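The C-SMOTE abstract above names the technique without describing its mechanics. As a rough illustration only, the following sketch applies the classic SMOTE interpolation step to a sliding window of recent samples. It is not the authors' C-SMOTE: the window size, the binary labels, the toy stream, and the function names `smote_sample` and `rebalance` are all assumptions made for this example.

```python
import random
from collections import deque


def smote_sample(minority, k=3, rng=random):
    """Create one synthetic point by interpolating between a random
    minority sample and one of its k nearest minority neighbours."""
    base = rng.choice(minority)
    # Rank the other minority points by squared Euclidean distance to `base`.
    others = [p for p in minority if p is not base]
    others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
    neighbour = rng.choice(others[:k])
    gap = rng.random()  # interpolation factor in [0, 1)
    return tuple(a + gap * (b - a) for a, b in zip(base, neighbour))


def rebalance(window, minority_label, rng=random):
    """Return synthetic minority samples so that both classes in the
    window end up with the same count (binary labels assumed)."""
    minority = [x for x, y in window if y == minority_label]
    majority_count = sum(1 for _, y in window if y != minority_label)
    if len(minority) < 2:
        return []  # not enough points to interpolate between
    needed = max(0, majority_count - len(minority))
    return [(smote_sample(minority, rng=rng), minority_label)
            for _ in range(needed)]


# Hypothetical sliding window over a labelled stream of (features, label).
window = deque(maxlen=100)
stream = [((0.0, 0.0), 1), ((1.0, 1.0), 1), ((5.0, 5.0), 0), ((6.0, 5.0), 0),
          ((5.0, 6.0), 0), ((6.0, 6.0), 0), ((0.0, 1.0), 1)]
for item in stream:
    window.append(item)

# Label 1 is the minority here (3 samples against 4), so one point is added.
synthetic = rebalance(window, minority_label=1, rng=random.Random(0))
```

In a streaming pipeline, synthetic points of this kind would be passed to the online learner together with the real samples, and the minority label would have to be re-evaluated as the class balance drifts.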
J. Witte, Johann Gerberding, Christian Melching, J. Gómez
In this paper, the deep learning instance segmentation architectures DetectoRS, SOLOv2, DETR and Mask R-CNN were applied to data from the field of Pig Precision Livestock Farming to investigate whether these models can address the specific challenges of this domain. For this purpose, we created a custom dataset consisting of 731 images with high heterogeneity and high-quality segmentation masks. For evaluation, the standard metric for benchmarking instance segmentation models in computer vision, the mean average precision, was used. The results show that all tested models can be applied to the considered domain in terms of prediction accuracy. With a mAP of 0.848, DetectoRS achieves the best results on the test set, but is also the largest model with the greatest hardware requirements. It turns out that increasing model complexity and size does not have a large impact on prediction accuracy for instance segmentation of pigs. DETR, SOLOv2, and Mask R-CNN achieve similar results to DetectoRS with a parameter count almost three times smaller. Visual evaluation of predictions shows quality differences in terms of accuracy of segmentation masks. DetectoRS generates the best masks overall, while DETR has advantages in correctly segmenting the tail region. However, it can be observed that each of the tested models has problems in assigning segmentation masks correctly once a pig is overlapped. The results demonstrate the potential of deep learning instance segmentation models in Pig Precision Livestock Farming and lay the foundation for future research in this area.
{"title":"Evaluation of Deep Learning Instance Segmentation Models for Pig Precision Livestock Farming","authors":"J. Witte, Johann Gerberding, Christian Melching, J. Gómez","doi":"10.52825/bis.v1i.59","DOIUrl":"https://doi.org/10.52825/bis.v1i.59","url":null,"abstract":"In this paper, the deep learning instance segmentation architectures DetectoRS, SOLOv2, DETR and Mask R-CNN were applied to data from the field of Pig Precision Livestock Farming to investigate whether these models can address the specific challenges of this domain. For this purpose, we created a custom dataset consisting of 731 images with high heterogeneity and high-quality segmentation masks. For evaluation, the standard metric for benchmarking instance segmentation models in computer vision, the mean average precision, was used. The results show that all tested models can be applied to the considered domain in terms of prediction accuracy. With a mAP of 0.848, DetectoRS achieves the best results on the test set, but is also the largest model with the greatest hardware requirements. It turns out that increasing model complexity and size does not have a large impact on prediction accuracy for instance segmentation of pigs. DETR, SOLOv2, and Mask R-CNN achieve similar results to DetectoRS with a parameter count almost three times smaller. Visual evaluation of predictions shows quality differences in terms of accuracy of segmentation masks. DetectoRS generates the best masks overall, while DETR has advantages in correctly segmenting the tail region. However, it can be observed that each of the tested models has problems in assigning segmentation masks correctly once a pig is overlapped. The results demonstrate the potential of deep learning instance segmentation models in Pig Precision Livestock Farming and lay the foundation for future research in this area.","PeriodicalId":56020,"journal":{"name":"Business & Information Systems Engineering","volume":"17 1","pages":"209-220"},"PeriodicalIF":7.9,"publicationDate":"2021-07-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75204751","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
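The mean average precision used in the pig segmentation abstract above builds on the intersection over union (IoU) between a predicted mask and a ground-truth mask. The sketch below computes only that building block for two small binary masks. The function name `mask_iou` and the toy masks are assumptions for illustration, and the full mAP computation (matching, confidence ranking, averaging over thresholds) is not shown.

```python
def mask_iou(pred, truth):
    """Intersection over union of two binary masks given as equally
    sized nested lists of 0/1 values."""
    pairs = [(p, t) for p_row, t_row in zip(pred, truth)
             for p, t in zip(p_row, t_row)]
    intersection = sum(1 for p, t in pairs if p and t)
    union = sum(1 for p, t in pairs if p or t)
    return intersection / union if union else 0.0


# Toy 2x3 masks: 2 pixels overlap, 4 pixels are covered by either mask.
predicted = [[1, 1, 0],
             [0, 1, 0]]
ground_truth = [[1, 0, 0],
                [0, 1, 1]]
iou = mask_iou(predicted, ground_truth)
```

A prediction typically counts as a true positive when its IoU with a ground-truth mask exceeds a chosen threshold, which is where overlapping animals make the assignment harder.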
The management of business information processes needs effective decision-making models. That means to involve different methods, techniques, and principles to improve competitiveness and to achieve the planned business results. In this context, the article deals with the problem of group decision-making under uncertain conditions. To cope with such problems some well-known optimization strategies of Wald, Laplace, Hurwitz, and Savage are modified to take into account the experts’ opinions with different importance when forming the final group decision. Numerical testing is based on a case study for CRM software selection. The results are discussed based on the proposed models under two different cases derived from the case study. The conducted numerical testing of the proposed models demonstrates their applicability to cope simultaneously with multiple experts’ evaluations and uncertainty conditions.
{"title":"An Integrated Group Decision-Making Approach Considering Uncertainty Conditions","authors":"D. Borissova, Z. Dimitrova","doi":"10.52825/bis.v1i.52","DOIUrl":"https://doi.org/10.52825/bis.v1i.52","url":null,"abstract":"The management of business information processes needs effective decision-making models. That means to involve different methods, techniques, and principles to improve competitiveness and to achieve the planned business results. In this context, the article deals with the problem of group decision-making under uncertain conditions. To cope with such problems some well-known optimization strategies of Wald, Laplace, Hurwitz, and Savage are modified to take into account the experts’ opinions with different importance when forming the final group decision. Numerical testing is based on a case study for CRM software selection. The results are discussed based on the proposed models under two different cases derived from the case study. The conducted numerical testing of the proposed models demonstrates their applicability to cope simultaneously with multiple experts’ evaluations and uncertainty conditions.","PeriodicalId":56020,"journal":{"name":"Business & Information Systems Engineering","volume":"15 1","pages":"307-316"},"PeriodicalIF":7.9,"publicationDate":"2021-07-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81307127","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
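The group decision-making abstract above refers to the criteria of Wald, Laplace, Hurwicz (written Hurwitz above), and Savage. A minimal sketch of the four textbook criteria on a payoff matrix follows. The abstract does not say how expert opinions are combined, so the weighted sum in `aggregate` is an assumption, and the payoff values, the expert weights, and the optimism coefficient `alpha` are hypothetical.

```python
def aggregate(expert_matrices, weights):
    """Weighted sum of the experts' payoff matrices (assumed combination
    rule; weights are expected to sum to 1)."""
    rows, cols = len(expert_matrices[0]), len(expert_matrices[0][0])
    return [[sum(w * m[i][j] for m, w in zip(expert_matrices, weights))
             for j in range(cols)] for i in range(rows)]


def wald(payoff):
    """Maximin: pick the alternative with the best worst-case payoff."""
    return max(range(len(payoff)), key=lambda i: min(payoff[i]))


def laplace(payoff):
    """Equal likelihood: pick the alternative with the best mean payoff."""
    return max(range(len(payoff)), key=lambda i: sum(payoff[i]) / len(payoff[i]))


def hurwicz(payoff, alpha=0.5):
    """Blend of best and worst case, weighted by optimism `alpha`."""
    return max(range(len(payoff)),
               key=lambda i: alpha * max(payoff[i]) + (1 - alpha) * min(payoff[i]))


def savage(payoff):
    """Minimax regret: pick the alternative with the smallest maximum regret."""
    col_best = [max(col) for col in zip(*payoff)]
    regret = [[best - v for v, best in zip(row, col_best)] for row in payoff]
    return min(range(len(payoff)), key=lambda i: max(regret[i]))


# Hypothetical payoffs: rows are alternatives, columns are states of nature.
expert_a = [[7, 5, 3], [9, 4, 1], [5, 5, 5]]
expert_b = [[6, 6, 4], [8, 3, 2], [4, 6, 5]]
group = aggregate([expert_a, expert_b], weights=[0.6, 0.4])

choices = {
    "wald": wald(group),
    "laplace": laplace(group),
    "hurwicz": hurwicz(group, alpha=0.7),
    "savage": savage(group),
}
```

Each function returns the row index of the preferred alternative. On this toy data the criteria disagree, which is the situation in which a choice of criterion and of expert weights matters.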
Knowledge graphs are used as a source of prior knowledge in numerous computer vision tasks. However, such an approach requires a mapping between ground truth data labels and the target knowledge graph. We linked the ILSVRC 2012 dataset (often simply referred to as ImageNet) labels to Wikidata entities. This enables using rich knowledge graph structure and contextual information for several computer vision tasks, traditionally benchmarked with ImageNet and its variations. For instance, in few-shot learning classification scenarios with neural networks, this mapping can be leveraged for weight initialisation, which can improve the final performance metrics value. We mapped all 1000 ImageNet labels – 461 were already directly linked with the exact match property (P2888), 467 have exact match candidates, and 72 cannot be matched directly. For these 72 labels, we discuss different problem categories stemming from the inability of finding an exact match. Semantically close non-exact match candidates are presented as well. The mapping is publicly available at https://github.com/DominikFilipiak/imagenet-to-wikidata-mapping.
{"title":"Mapping of ImageNet and Wikidata for Knowledge Graphs Enabled Computer Vision","authors":"D. Filipiak, A. Fensel, A. Filipowska","doi":"10.52825/bis.v1i.65","DOIUrl":"https://doi.org/10.52825/bis.v1i.65","url":null,"abstract":"Knowledge graphs are used as a source of prior knowledge in numerous computer vision tasks. However, such an approach requires a mapping between ground truth data labels and the target knowledge graph. We linked the ILSVRC 2012 dataset (often simply referred to as ImageNet) labels to Wikidata entities. This enables using rich knowledge graph structure and contextual information for several computer vision tasks, traditionally benchmarked with ImageNet and its variations. For instance, in few-shot learning classification scenarios with neural networks, this mapping can be leveraged for weight initialisation, which can improve the final performance metrics value. We mapped all 1000 ImageNet labels – 461 were already directly linked with the exact match property (P2888), 467 have exact match candidates, and 72 cannot be matched directly. For these 72 labels, we discuss different problem categories stemming from the inability of finding an exact match. Semantically close non-exact match candidates are presented as well. The mapping is publicly available at https://github.com/DominikFilipiak/imagenet-to-wikidata-mapping.","PeriodicalId":56020,"journal":{"name":"Business & Information Systems Engineering","volume":"28 1","pages":"151-161"},"PeriodicalIF":7.9,"publicationDate":"2021-07-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81874003","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We develop the Google matrix analysis of the multiproduct world trade network obtained from the UN COMTRADE database in recent years. The comparison is done between this new approach and the usual Import-Export description of this world trade network. The Google matrix analysis takes into account the multiplicity of trade transactions thus highlighting in a better way the world influence of specific countries and products. It shows that after Brexit, the European Union of 27 countries has the leading position in the world trade network ranking, being ahead of USA and China. Our approach determines also a sensitivity of trade country balance to specific products showing the dominant role of machinery and mineral fuels in multiproduct exchanges. It also underlines the growing influence of Asian countries.
{"title":"Post-Brexit power of European Union from the world trade network analysis","authors":"Justin Loye, K. Jaffrès-Runser, D. Shepelyansky","doi":"10.52825/bis.v1i.48","DOIUrl":"https://doi.org/10.52825/bis.v1i.48","url":null,"abstract":"We develop the Google matrix analysis of the multiproduct world trade network obtained from the UN COMTRADE database in recent years. The comparison is done between this new approach and the usual Import-Export description of this world trade network. The Google matrix analysis takes into account the multiplicity of trade transactions thus highlighting in a better way the world influence of specific countries and products. It shows that after Brexit, the European Union of 27 countries has the leading position in the world trade network ranking, being ahead of USA and China. Our approach determines also a sensitivity of trade country balance to specific products showing the dominant role of machinery and mineral fuels in multiproduct exchanges. It also underlines the growing influence of Asian countries.","PeriodicalId":56020,"journal":{"name":"Business & Information Systems Engineering","volume":"22 1","pages":"39-47"},"PeriodicalIF":7.9,"publicationDate":"2021-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82270963","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
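The world trade abstract above relies on the Google matrix without spelling out its construction. The sketch below uses the standard form G = αS + (1 − α)/N, where S is the column-normalised flow matrix, and obtains the PageRank vector by power iteration. The three-node `flow` matrix is a hypothetical toy rather than COMTRADE data, and the damping factor 0.85 is an assumed default that the abstract does not state.

```python
def google_matrix(flow, alpha=0.85):
    """Build G from a square matrix where flow[i][j] is the weight of the
    link from node j to node i. Columns without outgoing weight are
    replaced by a uniform column."""
    n = len(flow)
    col_sums = [sum(flow[i][j] for i in range(n)) for j in range(n)]
    g = [[0.0] * n for _ in range(n)]
    for j in range(n):
        for i in range(n):
            s = flow[i][j] / col_sums[j] if col_sums[j] > 0 else 1.0 / n
            g[i][j] = alpha * s + (1.0 - alpha) / n
    return g


def pagerank(g, tol=1e-12, max_iter=1000):
    """Power iteration for the stationary vector of the column-stochastic G."""
    n = len(g)
    p = [1.0 / n] * n
    for _ in range(max_iter):
        new = [sum(g[i][j] * p[j] for j in range(n)) for i in range(n)]
        if sum(abs(a - b) for a, b in zip(new, p)) < tol:
            return new
        p = new
    return p


# Hypothetical trade volumes between three nodes (from column j to row i).
flow = [[0, 2, 1],
        [1, 0, 3],
        [4, 1, 0]]
g = google_matrix(flow)
p = pagerank(g)
```

Sorting the nodes by their entries in `p` gives a ranking that depends on the whole network of flows, not only on each node's own import and export totals.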
Dorina Bano, Tom Lichtenstein, Finn Klessascheck, M. Weske
Process mining is widely adopted in organizations to gain deep insights about running business processes. This can be achieved by applying different process mining techniques like discovery, conformance checking, and performance analysis. These techniques are applied on event logs, which need to be extracted from the organization’s databases beforehand. This not only implies access to databases, but also detailed knowledge about the database schema, which is often not available. In many real-world scenarios, however, process execution data is available as redo logs. Such logs are used to bring a database into a consistent state in case of a system failure. This paper proposes a semi-automatic approach to extract an event log from redo logs alone. It does not require access to the database or knowledge of the database schema. The feasibility of the proposed approach is evaluated on two synthetic redo logs.
{"title":"Database-Less Extraction of Event Logs from Redo Logs","authors":"Dorina Bano, Tom Lichtenstein, Finn Klessascheck, M. Weske","doi":"10.52825/bis.v1i.66","DOIUrl":"https://doi.org/10.52825/bis.v1i.66","url":null,"abstract":"Process mining is widely adopted in organizations to gain deep insights about running business processes. This can be achieved by applying different process mining techniques like discovery, conformance checking, and performance analysis. These techniques are applied on event logs, which need to be extracted from the organization’s databases beforehand. This not only implies access to databases, but also detailed knowledge about the database schema, which is often not available. In many real-world scenarios, however, process execution data is available as redo logs. Such logs are used to bring a database into a consistent state in case of a system failure. This paper proposes a semi-automatic approach to extract an event log from redo logs alone. It does not require access to the database or knowledge of the database schema. The feasibility of the proposed approach is evaluated on two synthetic redo logs.","PeriodicalId":56020,"journal":{"name":"Business & Information Systems Engineering","volume":"163 1","pages":"73-82"},"PeriodicalIF":7.9,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80312091","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2021-01-01 DOI: 10.1007/978-3-031-04216-4_21
Hui Na Chua, Alvin Wei Qiang Liao, Y. Low, A. Lee, M. Ismail
{"title":"Challenges of Mining Twitter Data for Analyzing Service Performance: A Case Study of Transportation Service in Malaysia","authors":"Hui Na Chua, Alvin Wei Qiang Liao, Y. Low, A. Lee, M. Ismail","doi":"10.1007/978-3-031-04216-4_21","DOIUrl":"https://doi.org/10.1007/978-3-031-04216-4_21","url":null,"abstract":"","PeriodicalId":56020,"journal":{"name":"Business & Information Systems Engineering","volume":"284 1","pages":"227-239"},"PeriodicalIF":7.9,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76843317","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Finn Klessascheck, Tom Lichtenstein, Martin Meier, Simon Remy, Jan-Philipp Sachs, Luise Pufahl, Riccardo Miotto, E. Böttinger, M. Weske
Process mining aims at deriving process knowledge from event logs, which contain data recorded during process executions. Typically, event logs need to be generated from process execution data, stored in different kinds of information systems. In complex domains like healthcare, data is available only at different levels of granularity. Event abstraction techniques allow the transformation of events to a common level of granularity, which enables effective process mining. Existing event abstraction techniques do not sufficiently take into account domain knowledge and, as a result, fail to deliver suitable event logs in complex application domains. This paper presents an event abstraction method based on domain ontologies. We show that the method introduced generates semantically meaningful high-level events, suitable for process mining; it is evaluated on real-world patient treatment data of a large U.S. health system.
{"title":"Domain-Specific Event Abstraction","authors":"Finn Klessascheck, Tom Lichtenstein, Martin Meier, Simon Remy, Jan-Philipp Sachs, Luise Pufahl, Riccardo Miotto, E. Böttinger, M. Weske","doi":"10.52825/bis.v1i.39","DOIUrl":"https://doi.org/10.52825/bis.v1i.39","url":null,"abstract":"Process mining aims at deriving process knowledge from event logs, which contain data recorded during process executions. Typically, event logs need to be generated from process execution data, stored in different kinds of information systems. In complex domains like healthcare, data is available only at different levels of granularity. Event abstraction techniques allow the transformation of events to a common level of granularity, which enables effective process mining. Existing event abstraction techniques do not sufficiently take into account domain knowledge and, as a result, fail to deliver suitable event logs in complex application domains. This paper presents an event abstraction method based on domain ontologies. We show that the method introduced generates semantically meaningful high-level events, suitable for process mining; it is evaluated on real-world patient treatment data of a large U.S. health system.","PeriodicalId":56020,"journal":{"name":"Business & Information Systems Engineering","volume":"22 1","pages":"117-126"},"PeriodicalIF":7.9,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87117922","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2021-01-01 DOI: 10.1007/978-3-031-04216-4_2
D. Kriksciuniene, V. Sakalauskas, Ivana Ognjanovic, R. Šendelj
{"title":"Time-to-Event Modelling for Survival and Hazard Analysis of Stroke Clinical Case","authors":"D. Kriksciuniene, V. Sakalauskas, Ivana Ognjanovic, R. Šendelj","doi":"10.1007/978-3-031-04216-4_2","DOIUrl":"https://doi.org/10.1007/978-3-031-04216-4_2","url":null,"abstract":"","PeriodicalId":56020,"journal":{"name":"Business & Information Systems Engineering","volume":"4 1","pages":"14-26"},"PeriodicalIF":7.9,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81164102","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Maciej Pondel, Maciej Wuczynski, W. Gryncewicz, Lukasz Lysik, Marcin Hernes, Artur Rot, Agata Kozina
Churn prediction is a Big Data domain, one of the most demanding use cases of recent time. It is also one of the most critical indicators of a healthy and growing business, irrespective of the size or channel of sales. This paper aims to develop a deep learning model for customers’ churn prediction in e-commerce, which is the main contribution of the article. The experiment was performed over real e-commerce data where 75% of buyers are one-off customers. The prediction based on this business specificity (many one-off customers and very few regular ones) is extremely challenging and, in a natural way, must be inaccurate to a certain extent. Looking from another perspective, correct prediction and subsequent actions resulting in a higher customer retention are very attractive for overall business performance. In such a case, predictions with 74% accuracy, 78% precision, and 68% recall are very promising. Also, the paper fills a research gap and contributes to the existing literature in the area of developing a customer churn prediction method for the retail sector by using deep learning tools based on customer churn and the full history of each customer’s transactions.
{"title":"Deep Learning for Customer Churn Prediction in E-Commerce Decision Support","authors":"Maciej Pondel, Maciej Wuczynski, W. Gryncewicz, Lukasz Lysik, Marcin Hernes, Artur Rot, Agata Kozina","doi":"10.52825/bis.v1i.42","DOIUrl":"https://doi.org/10.52825/bis.v1i.42","url":null,"abstract":"Churn prediction is a Big Data domain, one of the most demanding use cases of recent time. It is also one of the most critical indicators of a healthy and growing business, irrespective of the size or channel of sales. This paper aims to develop a deep learning model for customers’ churn prediction in e-commerce, which is the main contribution of the article. The experiment was performed over real e-commerce data where 75% of buyers are one-off customers. The prediction based on this business specificity (many one-off customers and very few regular ones) is extremely challenging and, in a natural way, must be inaccurate to a certain extent. Looking from another perspective, correct prediction and subsequent actions resulting in a higher customer retention are very attractive for overall business performance. In such a case, predictions with 74% accuracy, 78% precision, and 68% recall are very promising. Also, the paper fills a research gap and contributes to the existing literature in the area of developing a customer churn prediction method for the retail sector by using deep learning tools based on customer churn and the full history of each customer’s transactions.","PeriodicalId":56020,"journal":{"name":"Business & Information Systems Engineering","volume":"10 1","pages":"3-12"},"PeriodicalIF":7.9,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84919126","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
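The churn abstract above reports accuracy, precision, and recall. As a reminder of how these three figures relate to a binary confusion matrix, here is a small sketch. The label vectors are made-up toy data, not the paper's results, and the function name `binary_metrics` is hypothetical.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, precision, recall) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    accuracy = (tp + tn) / len(pairs)
    # Guard against empty denominators when a class is never predicted or present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall


# Toy labels: 1 = churned, 0 = retained.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
accuracy, precision, recall = binary_metrics(y_true, y_pred)
```

With a heavily skewed customer base, accuracy alone can look good for a model that rarely predicts the minority class, which is why precision and recall are reported next to it.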