Title: LP SVM with A Novel Similarity function Outperforms Powerful LP-QP-Kernel-SVM Considering Efficient Classification
Authors: Rezaul Karim, Mahmudul Hasan, Amit Kumar Kundu, Ali Ahmed Ave
DOI: https://doi.org/10.31449/inf.v47i8.4767
Journal: Informatica. Published 2023-09-04.

Abstract: While the core strength of the SVM comes from its ability to reach the global optimum, its classification performance also depends on the kernel computation. However, while kernel complexity generates the power of the machine, it is also responsible for the computational load of evaluating that kernel. Moreover, insisting that a similarity function be a positive definite kernel imposes properties that can seem unproductive, raising the question of which similarity measures should be used in a classifier. We model Vapnik's LPSVM, proposing a new similarity function that replaces the kernel function. Following the strategy of "accuracy first, speed second", we design a similarity function that is mathematically well defined, grounded in both analysis and geometry, and complex enough to train the machine to a solid generalization ability. Consistent with the learning theory of Balcan and Blum [1], our similarity function does not need to be a valid kernel and demands less computational cost than counterparts such as the RBF or other kernels, while providing sufficient power to the classifier through its optimal complexity. Benchmarking shows that our similarity-function-based LPSVM attains a test error 0.86 times that of the most powerful RBF-based QP SVM while demanding only 0.40 times its computational cost.
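The LP-SVM idea the abstract builds on can be sketched as a linear program over similarity values: minimize the L1 norm of the expansion coefficients plus slack penalties, subject to margin constraints. Below is a minimal sketch of that standard Vapnik-style formulation; the `similarity` function here is an RBF-style placeholder, since the paper's actual similarity function is not given in the abstract.

```python
import numpy as np
from scipy.optimize import linprog

def similarity(X, Z, gamma=1.0):
    """Placeholder similarity: an RBF-style measure standing in for the
    paper's function, which the abstract does not specify."""
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def lp_svm_train(X, y, C=1.0, gamma=1.0):
    """Vapnik-style LP-SVM: minimise sum|alpha_j| + C*sum(xi_i) subject to
    y_i (S_i . alpha + b) >= 1 - xi_i, solved as a linear program."""
    n = len(y)
    S = similarity(X, X, gamma)                  # n x n similarity matrix
    # variables: [a+ (n), a- (n), b+, b-, xi (n)], all >= 0; alpha = a+ - a-
    c = np.concatenate([np.ones(2 * n), [0.0, 0.0], C * np.ones(n)])
    Y = y[:, None] * S                           # row i scaled by label y_i
    # margin constraints rewritten as A_ub @ v <= -1
    A_ub = np.hstack([-Y, Y, -y[:, None], y[:, None], -np.eye(n)])
    b_ub = -np.ones(n)
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * (3 * n + 2))
    v = res.x
    alpha = v[:n] - v[n:2 * n]
    b = v[2 * n] - v[2 * n + 1]
    return alpha, b

def lp_svm_predict(X_train, alpha, b, X_new, gamma=1.0):
    return np.sign(similarity(X_new, X_train, gamma) @ alpha + b)
```

Because the objective is linear (unlike the quadratic program of a standard kernel SVM), an off-the-shelf LP solver suffices, which is one source of the speed advantage the abstract reports.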
Title: Low-Resource Neural Machine Translation Improvement Using Data Augmentation Strategies
Authors: Thai Nguyen Quoc, Huong Le Thanh, Hanh Pham Van
DOI: https://doi.org/10.31449/inf.v47i3.4761
Journal: Informatica. Published 2023-08-29.

Abstract: The development of neural models has greatly improved the performance of machine translation, but these methods require large-scale parallel data, which can be difficult to obtain for low-resource language pairs. To address this issue, this research employs a pre-trained multilingual model and fine-tunes it on a small bilingual dataset. Additionally, two data-augmentation strategies are proposed to generate new training data: (i) back-translation with a dataset from the source language; (ii) data augmentation via the English pivot language. The proposed approach is applied to Khmer-Vietnamese machine translation. Experimental results show that it outperforms the Google Translator model by 5.3% in terms of BLEU score on a test set of 2,000 Khmer-Vietnamese sentence pairs.
Title: Motion Embedded Images: An Approach to Capture Spatial and Temporal Features for Action Recognition
Authors: Tri Le, Nham Huynh-Duc, Chung Thai Nguyen, Minh-Triet Tran
DOI: https://doi.org/10.31449/inf.v47i3.4755
Journal: Informatica. Published 2023-08-29.

Abstract: The demand for human activity recognition (HAR) from videos has surged in various real-life applications, including video surveillance, healthcare, and elderly care. The explosion of short-form videos on social media platforms has further intensified interest in this domain. This research focuses on the problem of HAR in general short videos. In contrast to still images, video clips offer both spatial and temporal information, making it challenging to extract complementary information on appearance from still frames and motion between frames. This research makes a two-fold contribution. First, we investigate the use of motion-embedded images in a variant of the two-stream Convolutional Neural Network architecture, in which one stream captures motion using combined batches of frames, while the other employs a standard image-classification ConvNet to classify static appearance. Second, we create a novel dataset of Southeast Asian sports short videos that encompasses videos both with and without effects, a modern factor lacking in all currently available benchmark datasets. The proposed model is trained and evaluated on two benchmarks, UCF-101 and SEAGS-V1, and yields competitive performance compared to prior attempts at the same problem.
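One simple way to embed motion from a batch of frames into a single image is to aggregate inter-frame differences. This is only an assumed construction for illustration; the abstract does not state how the paper's motion-embedded images are actually built.

```python
import numpy as np

def motion_embedded_image(frames):
    """Collapse a clip into one 'motion image' by averaging absolute
    inter-frame differences -- a hypothetical stand-in for the paper's
    motion-embedding step. frames: array-like of shape (T, H, W)."""
    frames = np.asarray(frames, dtype=np.float32)
    diffs = np.abs(np.diff(frames, axis=0))      # |frame[t+1] - frame[t]|
    motion = diffs.mean(axis=0)                  # average motion energy
    return motion / (motion.max() + 1e-8)        # normalise to [0, 1]
```

A static clip yields an all-zero motion image, while moving content lights up exactly the regions that change, giving the motion stream an input the appearance stream cannot see.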
Title: A Hybrid Deep Learning Approach to Keyword Spotting in Vietnamese Stele Images
Authors: Anna Scius-Bertrand, Marc Bui, Andreas Fischer
DOI: https://doi.org/10.31449/inf.v47i3.4785
Journal: Informatica. Published 2023-08-29.

Abstract: To make accessible the rich cultural heritage conveyed in Vietnamese steles, automatic reading of stone engravings would be a great support for historians, who are analyzing tens of thousands of stele images. Approaching this challenging problem with deep learning alone is difficult because data-driven models require large representative datasets with expert human annotations, which are not available for the steles and are costly to obtain. In this article, we present a hybrid approach to spotting keywords in stele images that combines data-driven deep learning with knowledge-based structural modeling and matching of Chu Nom characters. The main advantage of the proposed method is that it is annotation-free, i.e., no human data annotation is required. In an experimental evaluation, we demonstrate that keywords can be spotted with a mean average precision of more than 70% when a single engraving style is considered.
Title: Lightweight Multi-Objective and Many-Objective Problem Formulations for Evolutionary Neural Architecture Search with the Training-Free Performance Metric Synaptic Flow
Authors: An Vo, Tan Ngoc Pham, Van Bich Nguyen, Ngoc Hoang Luong
DOI: https://doi.org/10.31449/inf.v47i3.4736
Journal: Informatica. Published 2023-08-29.

Abstract: Neural architecture search (NAS) with naive problem formulations and conventional search algorithms often incurs prohibitive search costs due to the evaluation of many candidate architectures. For each architecture, accuracy can be properly evaluated only after hundreds (or thousands) of computationally expensive training epochs yield proper network weights. A so-called zero-cost metric, Synaptic Flow, computed from random network weight values at initialization, exhibits certain correlations with neural network test accuracy and can thus serve as an efficient proxy performance metric during the search. Moreover, NAS in practice often involves optimizing not only network accuracy but also network complexity, such as model size, number of floating-point operations, or latency. In this article, we study various NAS problem formulations in which multiple aspects of deep neural networks are treated as multiple optimization objectives. We employ a widely used multi-objective evolutionary algorithm, the non-dominated sorting genetic algorithm II (NSGA-II), to approximate the Pareto-optimal fronts for these NAS problem formulations. Experimental results on the NAS benchmark NATS-Bench show the advantages and disadvantages of each formulation.
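The core operation inside NSGA-II's non-dominated sorting is the Pareto-dominance filter: a candidate survives the first front if no other candidate is at least as good on every objective and strictly better on one. A minimal sketch, with all objectives to be minimized (e.g., negated Synaptic Flow score and FLOPs):

```python
def pareto_front(points):
    """Return the non-dominated subset of `points` (tuples of objective
    values, all minimised) -- the first front of NSGA-II's sorting."""
    front = []
    for i, p in enumerate(points):
        dominated = any(
            all(q[k] <= p[k] for k in range(len(p))) and
            any(q[k] < p[k] for k in range(len(p)))
            for j, q in enumerate(points) if j != i)
        if not dominated:
            front.append(p)
    return front
```

NSGA-II repeats this peeling to rank the whole population into successive fronts, then uses crowding distance to break ties within a front; this sketch covers only the dominance test itself.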
Title: Complaints with Target Scope Identification on Social Media
Authors: Kazuhiro Ito, Taichi Murayama, Shuntaro Yada, Shoko Wakamiya, Eiji Aramaki
DOI: https://doi.org/10.31449/inf.v47i3.4758
Journal: Informatica. Published 2023-08-29.

Abstract: A complaint is uttered when reality fails to meet one's expectations. Research on complaints, which contributes to our understanding of basic human behavior, has been conducted in psychology, linguistics, and marketing. Although several approaches have been applied to the study of complaints, studies have not yet focused on the target scope of complaints. Examining the target scope is crucial because the functions of complaints, such as the evocation of emotion, use of grammar, and intention, differ depending on the target scope. We first construct and release a complaint dataset of 6,418 tweets by annotating Japanese texts collected from Twitter with target-scope labels. Our dataset is available at https://github.com/sociocom/JaGUCHI. We then benchmark the annotated dataset with several machine learning baselines, obtaining a best F1-score of 90.4 in detecting whether a text is a complaint and a micro-F1 score of 72.2 in identifying the target scope label. Finally, we conduct case studies using our model to demonstrate that identifying the target scope of complaints is useful for sociological analysis.
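The micro-F1 score reported for target-scope identification pools true positives, false positives, and false negatives across all scope labels before computing precision and recall. A minimal sketch of that metric:

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1 over class labels: aggregate per-class TP/FP/FN
    counts first, then compute one global precision/recall pair."""
    tp = fp = fn = 0
    for c in set(y_true) | set(y_pred):
        tp += sum(1 for t, p in zip(y_true, y_pred) if t == p == c)
        fp += sum(1 for t, p in zip(y_true, y_pred) if p == c and t != c)
        fn += sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
    prec = tp / (tp + fp) if tp + fp else 0.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    return 2 * prec * rec / (prec + rec) if prec + rec else 0.0
```

For single-label multiclass tasks like this one, micro-F1 coincides with accuracy; it differs from macro-F1, which averages per-class F1 scores and so weights rare scope labels more heavily.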
Title: An Automatic Labeling Method for Subword-Phrase Recognition in Effective Text Classification
Authors: Yusuke Kimura, Takahiro Komamizu, Kenji Hatano
DOI: https://doi.org/10.31449/inf.v47i3.4742
Journal: Informatica. Published 2023-08-29.

Abstract: Text classification methods using deep learning, trained on tremendous amounts of text, have achieved performance superior to traditional methods. Building on this success, multi-task learning (MTL) has become a promising approach for text classification; for instance, one MTL approach employs named entity recognition as an auxiliary task and shows that the auxiliary task helps the text classification model achieve higher classification performance. Existing MTL-based text classification methods depend on auxiliary tasks with supervised labels, and obtaining such supervision signals requires human and financial costs beyond those of the main text classification task. To reduce these costs, this paper proposes an MTL-based text classification framework that lowers the cost of supervised label creation by automatically labeling phrases in texts for the auxiliary recognition task. The basic idea is to utilize phrasal expressions consisting of subwords (called subword-phrases), addressing the recent situation in which pre-trained neural language models such as BERT are designed upon subword-based tokenization to avoid missing out-of-vocabulary words. To the best of our knowledge, there has been no text classification approach built on subword-phrases, because subwords do not always express a coherent set of meanings. The proposed framework is novel in adding subword-phrase recognition as an auxiliary task and utilizing subword-phrases for text classification. It extracts subword-phrases in an unsupervised manner using a statistics-based approach. To construct labels for an effective subword-phrase recognition task, the extracted subword-phrases are classified per document class so that subword-phrases dedicated to particular classes become distinguishable. Experimental evaluation on five popular text classification datasets showcases the effectiveness of involving subword-phrase recognition as an auxiliary task. It also shows results comparable with the state-of-the-art method, and a comparison of various labeling schemes yields insights for labeling subword-phrases shared among several document classes.
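One common instance of unsupervised, statistics-based phrase extraction is pointwise mutual information (PMI) over adjacent tokens; the sketch below applies it to subword sequences. The abstract only says a statistics approach is used, so the PMI choice and thresholds here are illustrative assumptions, not the paper's exact method.

```python
import math
from collections import Counter

def extract_subword_phrases(token_seqs, min_count=2, pmi_threshold=1.0):
    """Mine adjacent subword pairs whose co-occurrence is unusually high,
    as measured by PMI -- one simple 'statistics approach' to unsupervised
    subword-phrase extraction (illustrative, not the paper's method)."""
    unigrams, bigrams, total = Counter(), Counter(), 0
    for seq in token_seqs:
        unigrams.update(seq)
        bigrams.update(zip(seq, seq[1:]))   # adjacent subword pairs
        total += len(seq)
    phrases = []
    for (a, b), n_ab in bigrams.items():
        if n_ab < min_count:
            continue                         # too rare to trust
        pmi = math.log((n_ab * total) / (unigrams[a] * unigrams[b]))
        if pmi >= pmi_threshold:
            phrases.append((a, b))
    return phrases
```

Pairs that clear both the frequency and PMI thresholds can then be tagged in the training texts, giving the auxiliary recognition task its labels without any human annotation.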
{"title":"A New Multimedia Web-Data Mining Approach based on Equivalence Class Evaluation Pipelined to Feature Maps onto Planar Projection","authors":"Ravindar Mogili, M. Naidu, G. Narsimha","doi":"10.31449/inf.v47i7.4583","DOIUrl":"https://doi.org/10.31449/inf.v47i7.4583","url":null,"abstract":"","PeriodicalId":56292,"journal":{"name":"Informatica","volume":" ","pages":""},"PeriodicalIF":2.9,"publicationDate":"2023-08-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48577968","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}