Mining User-Object Interaction Data for Student Modeling in Intelligent Learning Environments
Pub Date: 2024-01-24 | DOI: 10.1134/s036176882308008x
J. G. Hernández-Calderón, E. Benítez-Guerrero, J. R. Rojano-Cáceres, Carmen Mezura-Godoy
Abstract
This work seeks to contribute to the development of intelligent environments by presenting an approach for identifying On-Task and Off-Task behaviors in educational settings. This is accomplished by monitoring and analyzing the user-object interactions that users manifest while performing academic activities with a tangible-intangible hybrid system in a university intelligent-environment configuration. Using the proposed framework, the Orange Data Mining tool, and the Neural Network, Random Forest, Naive Bayes, and Tree classification models, training and testing were carried out on the user-object interaction records of 13 students (11 for training and two for testing) to identify representative behavior sequences. Despite the small amount of data, the two models with the best results were the Neural Network and Naive Bayes. Although considerably more data is needed for adequate classification, the exercise illustrates the process so that it can later be fully incorporated into an intelligent educational system.
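As an illustration of the classification step described above, a comparable train/test workflow can be sketched in scikit-learn rather than Orange. The interaction features, labels, and records below are synthetic stand-ins, not the study's data; only the 11/2 per-student split mirrors the described setup.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import classification_report

rng = np.random.default_rng(0)

# Synthetic stand-in for user-object interaction features per time window
# (e.g., touch counts, object-hold durations); 1 = On-Task, 0 = Off-Task.
X = rng.random((130, 4))
y = rng.integers(0, 2, size=130)
student = np.repeat(np.arange(13), 10)  # 10 interaction windows per student

# Split by student, as in the study: 11 students for training, 2 for testing.
train = student < 11
for model in (MLPClassifier(max_iter=2000, random_state=0), GaussianNB()):
    model.fit(X[train], y[train])
    print(type(model).__name__)
    print(classification_report(y[~train], model.predict(X[~train]), zero_division=0))
```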
{"title":"Mining User-Object Interaction Data for Student Modeling in Intelligent Learning Environments","authors":"J. G. Hernández-Calderón, E. Benítez-Guerrero, J. R. Rojano-Cáceres, Carmen Mezura-Godoy","doi":"10.1134/s036176882308008x","DOIUrl":"https://doi.org/10.1134/s036176882308008x","url":null,"abstract":"<h3 data-test=\"abstract-sub-heading\">Abstract</h3><p>This work seeks to contribute to the development of intelligent environments by presenting an approach oriented to the identification of On-Task and Off-Task behaviors in educational settings. This is accomplished by monitoring and analyzing the user-object interactions that users manifest while performing academic activities with a tangible-intangible hybrid system in a university intelligent environment configuration. With the proposal of a framework and the Orange Data Mining tool and the Neural Network, Random Forest, Naive Bayes, and Tree classification models, training and testing was carried out with the user-object interaction records of the 13 students (11 for training and two for testing) to identify representative sequences of behavior from user-object interaction records. The two models that had the best results, despite the small number of data, were the Neural Network and Naive Bayes. Although a more significant amount of data is necessary to perform a classification adequately, the process allowed exemplifying this process so that it can later be fully incorporated into an intelligent educational system.</p>","PeriodicalId":54555,"journal":{"name":"Programming and Computer Software","volume":"121 1","pages":""},"PeriodicalIF":0.7,"publicationDate":"2024-01-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139559548","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Determination of Relevant Risk Factors for Breast Cancer Using Feature Selection
Pub Date: 2024-01-24 | DOI: 10.1134/s0361768823080091
Zazil Ibarra-Cuevas, Jose Nunez-Varela, Alberto Nunez-Varela, Francisco E. Martinez-Perez, Sandra E. Nava-Muñoz, Cesar A. Ramirez-Gamez, Hector G. Perez-Gonzalez
Abstract
Breast cancer is a serious threat to women’s health worldwide. Although the exact causes of the disease are still unknown, its incidence is known to be associated with risk factors: any genetic, reproductive, hormonal, physical, biological, or lifestyle-related conditions that increase the likelihood of developing breast cancer. This research aims to identify the most relevant risk factors in a dataset of breast cancer patients by following the Knowledge Discovery in Databases process. To determine the relevance of risk factors, two feature selection methods are implemented, the Chi-Squared test and Mutual Information, and seven classifiers are used to validate the results. Our results show that the most relevant risk factors are the patient’s age, her menopausal status, whether she had undergone hormonal therapy, and her type of menopause.
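Both feature selection methods named above are available in scikit-learn. The sketch below scores a synthetic dataset with each of them; the column names are illustrative, not the study's actual variables.

```python
import numpy as np
from sklearn.feature_selection import chi2, mutual_info_classif

rng = np.random.default_rng(1)

# Synthetic stand-in: rows are patients, columns are candidate risk factors.
features = ["age_group", "menopausal_status", "hormonal_therapy", "menopause_type"]
X = rng.integers(0, 4, size=(200, len(features)))  # non-negative, as chi2 requires
y = rng.integers(0, 2, size=200)                   # 1 = breast cancer diagnosis

chi2_scores, p_values = chi2(X, y)
mi_scores = mutual_info_classif(X, y, discrete_features=True, random_state=1)

# Higher scores indicate a stronger association with the class label.
for name, c, p, mi in zip(features, chi2_scores, p_values, mi_scores):
    print(f"{name:18s} chi2={c:6.2f} (p={p:.3f})  MI={mi:.3f}")
```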
{"title":"Determination of Relevant Risk Factors for Breast Cancer Using Feature Selection","authors":"Zazil Ibarra-Cuevas, Jose Nunez-Varela, Alberto Nunez-Varela, Francisco E. Martinez-Perez, Sandra E. Nava-Muñoz, Cesar A. Ramirez-Gamez, Hector G. Perez-Gonzalez","doi":"10.1134/s0361768823080091","DOIUrl":"https://doi.org/10.1134/s0361768823080091","url":null,"abstract":"<h3 data-test=\"abstract-sub-heading\">Abstract</h3><p>Breast cancer is a serious threat to women’s health worldwide. Although the exact causes of this disease are still unknown, it is known that the incidence of breast cancer is associated with risk factors. Risk factors in cancer are any genetic, reproductive, hormonal, physical, biological, or lifestyle-related conditions that increase the likelihood of developing breast cancer. This research aims to identify the most relevant risk factors in patients with breast cancer in a dataset by following the <i>Knowledge Discovery in Databases</i> process. To determine the relevance of risk factors, this research implements two feature selection methods: the <i>Chi-Squared test</i> and <i>Mutual Information</i>; and seven classifiers are used to validate the results obtained. Our results show that the risk factors identified as the most relevant are related to the age of the patient, her menopausal status, whether she had undergone hormonal therapy, and her type of menopause.</p>","PeriodicalId":54555,"journal":{"name":"Programming and Computer Software","volume":"18 1","pages":""},"PeriodicalIF":0.7,"publicationDate":"2024-01-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139559570","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Building a Scale for Internet Fraud Detection Using Machine Learning
Pub Date: 2024-01-24 | DOI: 10.1134/s0361768823080261
L. V. Zhukova, I. M. Kovalchuk, A. A. Kochnev, V. R. Chugunov
Abstract
The widespread digitalization of modern society and the development of information technology have multiplied the ways in which financial institutions and potential consumers of financial services can interact. At the same time, the advent of new financial products inevitably brings new threats, and information technology enables the constant “improvement” of fraud schemes and unfair business practices, which harm both the financial market as a whole and its individual participants, such as financial institutions and their clients. Most financial transactions, including fraudulent ones, have moved to the Internet, and when services are provided remotely, it is more difficult to trace and prosecute the beneficiary. Fraudulent activity can still be stopped, but doing so requires monitoring and analyzing huge volumes of unstructured information (Big Data) available on the Internet, which is costly in time and effort. A solution to detecting illegal activity in financial markets is based on open data intelligence, machine learning, and systems analysis. This paper examines the types of online financial services among which fraudulent activities are most common. To identify illegal financial services, a set of criteria is developed and grouped by their contribution to the decision-making process. The main result of this study is a scale for a complex indicator that, together with the developed criteria and machine learning methods, underpins a mathematical model for determining the degree of illegality of online financial services.
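As a rough sketch of how grouped criteria can feed a single complex indicator on a discrete scale, the example below weights hypothetical criterion scores and maps the aggregate to a label. The criteria, weights, and thresholds are invented for illustration and are not taken from the paper.

```python
# Hypothetical criteria for an online financial service, each scored in [0, 1].
criteria = {
    "unregistered_entity": 1.0,
    "guaranteed_returns_claim": 0.8,
    "anonymous_payment_channels": 0.6,
    "missing_legal_disclosures": 0.4,
}
# Weights reflect each criterion's assumed contribution to the decision.
weights = {
    "unregistered_entity": 0.4,
    "guaranteed_returns_claim": 0.3,
    "anonymous_payment_channels": 0.2,
    "missing_legal_disclosures": 0.1,
}

indicator = sum(criteria[k] * weights[k] for k in criteria)

# Map the complex indicator onto a discrete degree-of-illegality scale.
if indicator >= 0.7:
    label = "likely illegal"
elif indicator >= 0.4:
    label = "suspicious"
else:
    label = "likely legitimate"

print(f"indicator = {indicator:.2f} -> {label}")
```

In the paper itself, the criteria weighting and the scale boundaries come from the developed model and machine learning methods rather than being fixed by hand as in this toy example.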
{"title":"Building a Scale for Internet Fraud Detection Using Machine Learning","authors":"L. V. Zhukova, I. M. Kovalchuk, A. A. Kochnev, V. R. Chugunov","doi":"10.1134/s0361768823080261","DOIUrl":"https://doi.org/10.1134/s0361768823080261","url":null,"abstract":"<h3 data-test=\"abstract-sub-heading\">Abstract</h3><p>The widespread digitalization of the modern society and the development of information technology have increased the number of ways in which financial institutions and potential consumers of financial services can interact. At the same time, the advent of new financial products inevitably leads to a rise in threats, and the use of information technology facilitates the constant “improvement” of fraud schemes and unfair business practices, which negatively affect both the financial market as a whole and its individual participants such as financial institutions and their clients. With the development of the modern society, most financial transactions, including the fraudulent ones, have moved to the Internet. When services are provided remotely, it is more difficult to trace and prosecute the beneficiary. However, there are still ways to stop fraudulent activity, but they are associated with high costs of monitoring and analysis of huge amounts of unstructured information (BigData) available on the Internet, which takes a great amount of time and effort. A solution to illegal activity detection in financial markets is based on open data intelligence, machine learning, and systems analysis. This paper examines certain types of financial services provided on the Internet among which fraudulent activities are most common. In order to identify illegal financial services, some criteria are developed and grouped based on their contribution to the decision-making process. The main result of this study is the construction of a scale for a complex indicator, which is used to build a mathematical model based on the developed criteria and machine learning methods for determining the degree of illegality of online financial services.</p>","PeriodicalId":54555,"journal":{"name":"Programming and Computer Software","volume":"61 1","pages":""},"PeriodicalIF":0.7,"publicationDate":"2024-01-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140881595","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Human Event Recognition in Smart Classrooms Using Computer Vision: A Systematic Literature Review
Pub Date: 2024-01-24 | DOI: 10.1134/s0361768823080066
M. L. Córdoba-Tlaxcalteco, E. Benítez-Guerrero
Abstract
The field of human event recognition using visual data in smart environments has emerged as a fruitful and successful area of study, with extensive research and development efforts driving significant advancements. These advancements have not only provided valuable insights but also led to practical applications in various domains. In this context, human actions, activities, interactions, and behaviors can all be considered events of interest in smart environments. However, when it comes to smart classrooms, there is no unified consensus on the definition of the term “human event.” This lack of agreement presents a significant challenge for educators, researchers, and developers, as it hampers their ability to precisely identify and classify the specific situations that are relevant in the educational context. This paper addresses the challenge through a systematic literature review of relevant events in smart classrooms, with a focus on their applications in assistive technology. The review comprises a comprehensive analysis of 227 documents published from 2012 to 2022, covering the key algorithms, methodologies, and applications of vision-based event recognition in smart environments. As its primary outcome, the review identifies the most significant events, classifying them as single-person behaviors, multiple-person interactions, or object-person interactions, and examines their practical applications in the educational context. The paper concludes with a discussion of the relevance and practicality of vision-based human event recognition in smart classrooms, particularly in the post-COVID era.
{"title":"Human Event Recognition in Smart Classrooms Using Computer Vision: A Systematic Literature Review","authors":"M. L. Córdoba-Tlaxcalteco, E. Benítez-Guerrero","doi":"10.1134/s0361768823080066","DOIUrl":"https://doi.org/10.1134/s0361768823080066","url":null,"abstract":"<h3 data-test=\"abstract-sub-heading\">Abstract</h3><p>The field of human event recognition using visual data in smart environments has emerged as a fruitful and successful area of study, with extensive research and development efforts driving significant advancements. These advancements have not only provided valuable insights, but also led to practical applications in various domains. In this context, human actions, activities, interactions, and behaviors can all be considered as events of interest in smart environments. However, when it comes to smart classrooms, there is a lack of unified consensus on the definition of the term “human event.” This lack of agreement presents a significant challenge for educators, researchers, and developers, as it hampers their ability to precisely identify and classify the specific situations that are relevant within the educational context. The aim of this paper is to address this challenge by conducting a systematic literature review of relevant events in smart classrooms, with a focus on their applications in assistive technology. The review encompasses a comprehensive analysis of 227 published documents spanning from 2012 to 2022. It delves into key algorithms, methodologies, and applications of vision-based event recognition in smart environments. As the primary outcome, the review identifies the most significant events, classifying them according to single person behavior, or multiple-person interactions, or object-person interactions. It also examines their practical applications within the educational context. The paper concludes with a discussion on the relevance and practicality of vision-based human event recognition in smart classrooms, particularly in the post-COVID era.</p>","PeriodicalId":54555,"journal":{"name":"Programming and Computer Software","volume":"7 1","pages":""},"PeriodicalIF":0.7,"publicationDate":"2024-01-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139559608","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Taxonomic View of the Fundamental Concepts of Quantum Computing–A Software Engineering Perspective
Pub Date: 2024-01-24 | DOI: 10.1134/s0361768823080108
R. Juárez-Ramírez, C. X. Navarro, Samantha Jiménez, Alan Ramírez, Verónica Tapia-Ibarra, César Guerra-García, Hector G. Perez-Gonzalez, Carlos Fernández-y-Fernández
Abstract
Quantum computing is based on the principles of quantum mechanics, such as superposition, entanglement, measurement, and decoherence. The basic units of computation are qubits: abstract objects whose mathematical description embodies these quantum-mechanical principles. Alongside quantum hardware, software is a principal element of quantum computing: logic gates and quantum circuits implement the algorithms that quantum programs execute. Because of these characteristics, quantum computing is a paradigm that is difficult for non-physicists to understand. Under this new scheme for developing software, it is important to integrate a conceptual framework of the fundamentals on which quantum computing is based. In this paper, we present a taxonomic view of the fundamental concepts of quantum computing and the derived concepts that make up the emerging discipline of quantum software engineering. We conducted the review as a quasi-systematic mapping, since its objective is only to identify the fundamental concepts of quantum computing and quantum software. The results can serve computer science students and professors as a starting point for studying this discipline.
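To make the "qubit as a mathematical object" idea concrete for non-physicists, a minimal NumPy sketch (independent of any quantum SDK) represents a basis state as a unit vector, builds a superposition with a Hadamard gate, and reads measurement probabilities off the squared amplitudes.

```python
import numpy as np

# Computational basis state |0> as a unit vector in C^2 (|1> is [0, 1]).
ket0 = np.array([1, 0], dtype=complex)

# The Hadamard gate puts |0> into an equal superposition of |0> and |1>.
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
psi = H @ ket0

# Measurement: the probability of each outcome is the squared amplitude.
probs = np.abs(psi) ** 2
print(f"P(0) = {probs[0]:.2f}, P(1) = {probs[1]:.2f}")  # 0.50 each
```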
{"title":"A Taxonomic View of the Fundamental Concepts of Quantum Computing–A Software Engineering Perspective","authors":"R. Juárez-Ramírez, C. X. Navarro, Samantha Jiménez, Alan Ramírez, Verónica Tapia-Ibarra, César Guerra-García, Hector G. Perez-Gonzalez, Carlos Fernández-y-Fernández","doi":"10.1134/s0361768823080108","DOIUrl":"https://doi.org/10.1134/s0361768823080108","url":null,"abstract":"<h3 data-test=\"abstract-sub-heading\">Abstract</h3><p>Quantum computing is based on the principles of quantum mechanics, such as superposition, entanglement, measurement, and decoherence. The basic units of computation are qubits, which are abstract objects with a mathematical expression to implement the quantum mechanics principles. Alongside quantum hardware, software is a principal element for conducting quantum computing. The software consists of logic gates and quantum circuits that implement algorithms for the execution of quantum programs. Due to those characteristics, quantum computing is a paradigm that non-physics experts cannot understand. Under this new scheme for developing software, it is important to integrate a conceptual framework of the fundamentals on which quantum computing is based. In this paper, we present a kind of taxonomical view of the fundamental concepts of quantum computing and the derived concepts that integrate the emerging discipline of quantum software engineering. We performed a quasi-systematic mapping for conducting the systematic review because the objective of the review only intends to detect the fundamental concepts of quantum computing and quantum software. The results can help computer science students and professors as a starting point to address the study of this discipline.</p>","PeriodicalId":54555,"journal":{"name":"Programming and Computer Software","volume":"10 1","pages":""},"PeriodicalIF":0.7,"publicationDate":"2024-01-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140881598","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Improving a Model for NFR Estimation Using Band Classification and Selection with KNN
Pub Date: 2024-01-24 | DOI: 10.1134/s0361768823080236
F. Valdés-Souto, J. Valeriano-Assem, D. Torres-Robledo
Abstract
Any software development project needs to estimate non-functional requirements (NFRs). Typically, software managers are forced to rely on expert judgment, because NFRs currently cannot be measured: there is no standardized unit of measurement for them. Consequently, most estimation models focus on functional user requirements (FURs) and leave NFRs out of the estimation process, since NFR terms are often subjective. The objective of this paper is to show how an NFR estimation model was built using fuzzy logic and the K-Nearest Neighbors (KNN) classification algorithm, aiming to capture the subjectivity embedded in NFR terms and to solve a specific problem in a Mexican private-sector company. The proposed model was developed using a database of real projects from that company. The results improved on the initial model under quality criteria such as the mean magnitude of relative error (MMRE), the standard deviation of the magnitude of relative error (SDMRE), and the prediction level Pred(25%). Additionally, the proposed approach lets managers identify quantitative elements related to NFRs that can be used to interpret the data and build additional models.
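The quality criteria mentioned above have standard definitions: the magnitude of relative error (MRE) is |actual - estimate| / actual, MMRE is its mean, SDMRE its standard deviation, and Pred(25%) is the fraction of estimates with MRE at most 0.25. A minimal sketch with made-up effort values:

```python
import numpy as np

actual = np.array([120.0, 80.0, 200.0, 150.0, 60.0])     # made-up actual efforts
estimated = np.array([100.0, 90.0, 240.0, 140.0, 75.0])  # made-up estimates

mre = np.abs(actual - estimated) / actual  # magnitude of relative error
mmre = mre.mean()                          # MMRE
sdmre = mre.std(ddof=1)                    # SDMRE (sample standard deviation)
pred_25 = (mre <= 0.25).mean()             # Pred(25%)

print(f"MMRE={mmre:.3f}  SDMRE={sdmre:.3f}  Pred(25%)={pred_25:.2f}")
```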
{"title":"Improving a Model for NFR Estimation Using Band Classification and Selection with KNN","authors":"F. Valdés-Souto, J. Valeriano-Assem, D. Torres-Robledo","doi":"10.1134/s0361768823080236","DOIUrl":"https://doi.org/10.1134/s0361768823080236","url":null,"abstract":"<h3 data-test=\"abstract-sub-heading\">Abstract</h3><p>Any software development project needs to estimate non-functional requirements (NFR). Typically, software managers are forced to use expert judgment to estimate the NFR. Today, NFRs cannot be measured, as there is no standardized unit of measurement for them. Consequently, most estimation models focus on the functional user requirements (FUR) and do not consider the NFR in the estimation process because these terms are often subjective. The objective of this paper was to show how an NFR estimation model was created using fuzzy logic, and K-Nearest Neighbors classifier algorithm, aiming to consider the subjectivity embedded in NFR terms to solve a specific problem in a Mexican company. The proposed model was developed using a database with real projects from a Mexican company in the private sector. The results were beneficial and better than the initial model considering quality criteria like mean magnitude of relative error (MMRE), standard deviation of magnitude of relative error (SDMRE) and prediction level (Pred 25%). Additionally, the proposed approach allows the managers to identify quantitative elements related to NFR that could be used to interpret the data and build additional models.</p>","PeriodicalId":54555,"journal":{"name":"Programming and Computer Software","volume":"10 1","pages":""},"PeriodicalIF":0.7,"publicationDate":"2024-01-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140881694","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Numerical Simulation of Particulate Matter Transport in the Atmospheric Urban Boundary Layer Using the Lagrangian Approach: Physical Problems and Parallel Implementation
Pub Date: 2024-01-24 | DOI: 10.1134/s0361768823080248
A. I. Varentsov, O. A. Imeev, A. V. Glazunov, E. V. Mortikov, V. M. Stepanenko
Abstract
This paper presents the results of developing a numerical model of Lagrangian particle transport and of applying parallel computing methods to improve the efficiency of its software implementation. The model is a software package that calculates the transport and deposition of aerosol particles, taking into account particle properties and input data describing atmospheric conditions and the geometry of the underlying surface. The dynamic core, physical parameterizations, numerical implementation, and algorithm of the model are described, and successful verification of the model against analytical solutions is presented. Initially, the model was used for less computationally intensive problems. In this paper, given the need to apply it to more demanding problems, we optimize the sequential software implementation and develop implementations based on parallel computing technologies (OpenMP, MPI, and CUDA). Testing of the different implementations shows that optimizing the most computationally complex blocks of the sequential version reduces the execution time by 27%, while the parallel computing technologies yield much larger speedups: all other conditions being equal, OpenMP accelerates the dynamic block of the model almost 4-fold, MPI almost 8-fold, and CUDA almost 16-fold. We also give recommendations on choosing a parallel computing technology depending on the properties of the computing system.
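The paper's parallel implementations use OpenMP, MPI, and CUDA. Purely as an illustration of the underlying idea (independent Lagrangian particles can be advanced concurrently), the sketch below splits a particle array across worker processes with Python's multiprocessing; the wind, time step, and particle count are placeholders, not the model's actual parameterizations.

```python
import numpy as np
from multiprocessing import Pool

DT = 0.1  # time step, s (placeholder)

def advance(positions: np.ndarray) -> np.ndarray:
    """Advance one chunk of particles: mean-wind drift plus a turbulent kick."""
    wind = np.array([1.0, 0.0, 0.0])  # placeholder mean wind, m/s
    kick = np.random.normal(0.0, 0.05, positions.shape)
    return positions + DT * wind + kick

if __name__ == "__main__":
    particles = np.random.rand(1_000_000, 3)  # x, y, z positions
    chunks = np.array_split(particles, 8)     # one chunk per worker process
    with Pool(processes=8) as pool:
        particles = np.concatenate(pool.map(advance, chunks))
    print(particles.mean(axis=0))
```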
{"title":"Numerical Simulation of Particulate Matter Transport in the Atmospheric Urban Boundary Layer Using the Lagrangian Approach: Physical Problems and Parallel Implementation","authors":"A. I. Varentsov, O. A. Imeev, A. V. Glazunov, E. V. Mortikov, V. M. Stepanenko","doi":"10.1134/s0361768823080248","DOIUrl":"https://doi.org/10.1134/s0361768823080248","url":null,"abstract":"<h3 data-test=\"abstract-sub-heading\">Abstract</h3><p>This paper presents results of development of a numerical model of Lagrangian particle transport, as well as results of application of parallel computation methods to improve the efficiency of the software implementation of this model. The model is a software package that allows the transport and deposition of aerosol particles to be calculated taking into account properties of particles and the input data that describe atmospheric conditions and underlying surface geometry. The dynamic core, physical parameterizations, numerical implementation, and algorithm of the model are described. Results of successful verification of the model on analytical solutions are presented. Initially, the model was used for less computationally intensive problems. In this paper, given the need to use the model in more computationally intensive problems, we optimize the sequential software implementation of the model, as well as develop its software implementations that use parallel computing technologies (OpenMP, MPI, and CUDA). The results of testing different implementations of the model show that the optimization of the most computationally complex blocks in its sequential version can reduce the execution time by 27%. At the same time, the use of parallel computing technologies allows us to achieve acceleration by several orders of magnitude. The use of OpenMP in the dynamic block of the model provides almost 4-fold acceleration of this block; the use of MPI, almost 8-fold acceleration; and the use of CUDA, almost 16-fold acceleration (all other conditions being equal). We also give some recommendations on the choice of a parallel computing technology depending on the properties of a computing system.</p>","PeriodicalId":54555,"journal":{"name":"Programming and Computer Software","volume":"18 1","pages":""},"PeriodicalIF":0.7,"publicationDate":"2024-01-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140881697","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Active Learning and Transfer Learning for Document Segmentation
Pub Date: 2023-12-07 | DOI: 10.1134/s0361768823070046
D. M. Kiranov, M. A. Ryndin, I. S. Kozlov
Abstract
In this paper, we investigate the effectiveness of classical approaches to active learning in the problem of document segmentation, with the aim of reducing the size of the training sample. A modified approach to selecting document images for labeling and subsequent model training is presented. The results of active learning are compared with those of transfer learning on fully labeled data. The paper also investigates how the problem domain of the dataset on which a model is initialized for transfer learning affects the subsequent further training of the model.
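One classical baseline of the kind evaluated here is uncertainty (least-confidence) sampling; the sketch below shows the generic acquisition loop on synthetic data. This is not the authors' modified selection approach, and the model, pool, and batch sizes are placeholders.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X_pool = rng.random((500, 8))
y_pool = (X_pool[:, 0] + X_pool[:, 1] > 1).astype(int)  # hidden "oracle" labels

# Seed set: a few labeled examples of each class.
labeled = list(np.where(y_pool == 0)[0][:5]) + list(np.where(y_pool == 1)[0][:5])
model = LogisticRegression()

for _ in range(5):                         # five acquisition rounds
    model.fit(X_pool[labeled], y_pool[labeled])
    proba = model.predict_proba(X_pool)
    uncertainty = 1.0 - proba.max(axis=1)  # least-confidence score
    uncertainty[labeled] = -1.0            # never re-select labeled items
    batch = np.argsort(uncertainty)[-10:]  # 10 most uncertain samples
    labeled.extend(batch.tolist())         # send them to the "annotator"

print(f"labeled {len(labeled)} of {len(X_pool)} samples")
```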
{"title":"Active Learning and Transfer Learning for Document Segmentation","authors":"D. M. Kiranov, M. A. Ryndin, I. S. Kozlov","doi":"10.1134/s0361768823070046","DOIUrl":"https://doi.org/10.1134/s0361768823070046","url":null,"abstract":"<h3 data-test=\"abstract-sub-heading\">Abstract</h3><p>In this paper, we investigate the effectiveness of classical approaches to active learning in the problem of document segmentation with the aim of reducing the size of the training sample. A modified approach to selection of document images for labeling and subsequent model training is presented. The results of active learning are compared to those of transfer learning on fully labeled data. The paper also investigates how the problem domain of a training set, on which a model is initialized for transfer learning, affects the subsequent uptraining of the model.</p>","PeriodicalId":54555,"journal":{"name":"Programming and Computer Software","volume":"67 1","pages":""},"PeriodicalIF":0.7,"publicationDate":"2023-12-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138553142","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Kotlin from the Point of View of Static Analysis Developer
Pub Date: 2023-12-07 | DOI: 10.1134/s0361768823070022
V. O. Afanasyev, S. A. Polyakov, A. E. Borodin, A. A. Belevantsev
Abstract
This paper describes a static analysis tool for finding defects and computing metrics and relations in programs written in the Kotlin language. The approach is implemented in the Svace static analyzer developed at the Ivannikov Institute for System Programming of the Russian Academy of Sciences. The paper focuses on the problems we faced during the implementation, the approaches we used to solve them, and the experimental results for the tool we built. The tool not only supports Kotlin but can also analyze mixed projects that use both Java and Kotlin. We hope that this paper will be useful to static analysis developers and language designers.
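Svace itself analyzes Kotlin and Java; as a toy analog of AST-based defect detection, the sketch below uses Python's ast module to flag one simple lint-level defect pattern. It illustrates only the general idea of walking a syntax tree in search of defect patterns, not Svace's actual analyses.

```python
import ast

SOURCE = '''
def f(x):
    if x == None:  # defect: should be 'x is None'
        return 0
    return x
'''

class NoneComparisonChecker(ast.NodeVisitor):
    """Report '== None' / '!= None' comparisons, a classic lint-level defect."""

    def visit_Compare(self, node: ast.Compare) -> None:
        for op, right in zip(node.ops, node.comparators):
            bad_op = isinstance(op, (ast.Eq, ast.NotEq))
            if bad_op and isinstance(right, ast.Constant) and right.value is None:
                print(f"line {node.lineno}: use 'is None' instead of '== None'")
        self.generic_visit(node)

NoneComparisonChecker().visit(ast.parse(SOURCE))
```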
{"title":"Kotlin from the Point of View of Static Analysis Developer","authors":"V. O. Afanasyev, S. A. Polyakov, A. E. Borodin, A. A. Belevantsev","doi":"10.1134/s0361768823070022","DOIUrl":"https://doi.org/10.1134/s0361768823070022","url":null,"abstract":"<h3 data-test=\"abstract-sub-heading\">Abstract</h3><p>This paper describes a static analysis tool for finding defects, analyzing metrics and relations for programs written in the Kotlin language. The approach is implemented in the Svace static analyzer developed at the Ivannikov Institute for System Programming of the Russian Academy of Sciences. The paper focuses on the problems we faced during the implementation, the approaches we used to solve them, and the experimental results for the tool we built. The tool not only supports Kotlin but is also capable of analyzing mixed projects that use both Java and Kotlin. We hope that this paper will be useful to static analysis developers and language designers.</p>","PeriodicalId":54555,"journal":{"name":"Programming and Computer Software","volume":"26 1","pages":""},"PeriodicalIF":0.7,"publicationDate":"2023-12-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138553057","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cross-Lingual Transfer Learning in Drug-Related Information Extraction from User-Generated Texts
Pub Date: 2023-12-07 | DOI: 10.1134/s036176882307006x
A. S. Sakhovskiy, E. V. Tutubalina
Abstract
Aggregating knowledge about drug, disease, and drug reaction entities across a broad range of domains and languages is critical for information extraction applications. In this work, we present a fine-grained evaluation intended to assess the efficiency of multilingual BERT-based models for biomedical named entity recognition (NER) and multi-label sentence classification. We investigate transfer learning strategies between two English corpora and a novel annotated corpus of Russian reviews of drug therapy. In these corpora, sentence labels indicate health-related issues or their absence, and sentences belonging to a given class are additionally labeled at the entity level with fine-grained subtypes such as drug names, drug indications, and drug reactions. The evaluation demonstrates that BERT training on raw Russian and English reviews (5M in total) provides the best transfer capabilities for the adverse drug reaction detection task on the Russian data. Our RuDR-BERT model achieved a macro F1 score of 74.85% in the NER task; for the classification task, our EnRuDR-BERT model achieved a macro F1 score of 70%, an 8.64% gain over a general-domain BERT model.
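The macro F1 score reported above averages per-class F1 values with equal weight, so rare classes (such as adverse-reaction mentions) count as much as frequent ones. A small sketch of the computation with scikit-learn, using an illustrative label set rather than the paper's annotation scheme:

```python
from sklearn.metrics import f1_score

# Made-up gold and predicted sentence labels for a three-class task.
y_true = ["ADR", "ADR", "Indication", "Other", "Other", "Indication"]
y_pred = ["ADR", "Other", "Indication", "Other", "Other", "ADR"]

# Macro F1 gives every class equal weight; micro F1 weights by frequency.
macro = f1_score(y_true, y_pred, average="macro")
micro = f1_score(y_true, y_pred, average="micro")
print(f"macro F1 = {macro:.3f}, micro F1 = {micro:.3f}")
```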