Pub Date: 2023-09-28 DOI: 10.5121/ijaia.2023.14502
Xiaohan Feng, Makoto Murakami
The aim of this paper is to explore different ways of using AI to subvert stereotypes more efficiently and effectively. It also enumerates the advantages and disadvantages of each approach, helping creators select the most appropriate method for their specific situations. AI opens up new possibilities, enabling anyone to effortlessly generate visually stunning images without the need for artistic skills. However, it also leads to the creation of more stereotypes when trained on large amounts of data. Consequently, stereotypes are becoming more prevalent and serious than ever before. Our belief is that this situation can be used in reverse: summarizing stereotypes with AI and then subverting them through elemental exchange. In this study, we attempted to develop a less time-consuming method of challenging character stereotypes built around the concept of "exchange." We selected two character archetypes, the "tyrant" and the "mad scientist," and summarized their stereotypes by generating AI images and by questioning ChatGPT. Additionally, we surveyed real historical tyrants to gain insights into their behavior and characteristics, which helped us understand why artwork depicting tyrants is stereotyped. Based on this understanding, we chose which stereotypes to retain, with the intention of still allowing the audience to identify the character. Finally, the two remaining character stereotypes were exchanged, and the design was completed. This paper documents the last and most time-consuming method: by examining a large number of sources and identifying which stereotypical influences were at work, we achieved a stronger subversion of stereotypes. The other methods are much less time-consuming but somewhat more random; whether one chooses by subjective experience or by the most frequent choices, there is no guarantee of the best outcome.
In other words, the documented method best guarantees that the audience can quickly identify the original character while moving the two characters furthest from their original stereotypical images. In conclusion, if the designer has sufficient time, AI portrait + research or ChatGPT + research can be chosen. If there is not enough time, the remaining methods can be chosen; they take less time, and the designer can try them all to get the desired result.
Title: Subverting Characters Stereotypes: Exploring the Role of AI in Stereotype Subversion (International Journal of Artificial Intelligence & Applications)
Pub Date: 2023-09-28 DOI: 10.5121/ijaia.2023.14501
Andrea Ruiz-Hernandez, Jennifer Lee, Nawal Rehman, Jayanthi Raghavan, Majid Ahmadi
Facial recognition (FR) is a pattern recognition problem in which images can be considered as matrices of pixels. Many challenges affect the performance of face recognition, including illumination variation, occlusion, and blurring. In this paper, a few preprocessing techniques are suggested to handle the illumination variation problem. Other phases of the face recognition problem, such as feature extraction and classification, are also discussed. Preprocessing techniques like Histogram Equalization (HE), Gamma Intensity Correction (GIC), and Regional Histogram Equalization (RHE) are tested on the AT&T database. For feature extraction, methods such as Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA), Independent Component Analysis (ICA), and Local Binary Pattern (LBP) are applied. Support Vector Machine (SVM) is used as the classifier. Both holistic and block-based methods are tested using the AT&T database. For twelve different combinations of preprocessing, feature extraction, and classification methods, experiments involving various block sizes are conducted to assess the computational performance and recognition accuracy on the AT&T dataset. Using the block-based method, 100% accuracy is achieved with the combination of GIC preprocessing, LDA feature extraction, and SVM classification using 2x2 block sizing, while the holistic method yields a maximum accuracy of 93.5%. The block-sized algorithm performs better than the holistic approach under poor lighting conditions. The SVM Radial Basis Function kernel performs extremely well on the AT&T dataset for both holistic and block-based approaches.
Title: Performance Evaluation of Block-Sized Algorithms for Majority Vote in Facial Recognition
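The GIC preprocessing and 2x2 block partitioning described above can be sketched in a few lines. This is a minimal illustration, not the authors' code; the gamma value and toy image are assumptions.

```python
import numpy as np

def gamma_intensity_correction(img, gamma=2.2):
    """Apply gamma intensity correction to a grayscale image in [0, 255]."""
    normalized = img.astype(np.float64) / 255.0
    corrected = np.power(normalized, 1.0 / gamma)
    return (corrected * 255.0).astype(np.uint8)

def split_into_blocks(img, rows=2, cols=2):
    """Split an image into a rows x cols grid of equally sized blocks,
    so each block can be fed to its own feature extractor and classifier."""
    h, w = img.shape
    bh, bw = h // rows, w // cols
    return [img[r * bh:(r + 1) * bh, c * bw:(c + 1) * bw]
            for r in range(rows) for c in range(cols)]

# Toy 4x4 "image": a 2x2 split yields four 2x2 blocks.
img = np.arange(16, dtype=np.uint8).reshape(4, 4)
blocks = split_into_blocks(gamma_intensity_correction(img), rows=2, cols=2)
print(len(blocks), blocks[0].shape)  # 4 (2, 2)
```

In a block-based pipeline, each block's prediction would then feed a majority vote across blocks.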
Pub Date: 2023-09-28 DOI: 10.5121/ijaia.2023.14503
Kazuhisa Fujita
This research aims to develop kernel GNG, a kernelized version of the growing neural gas (GNG) algorithm, and to investigate the features of the networks it generates. The GNG is an unsupervised artificial neural network that can transform a dataset into an undirected graph, thereby extracting the features of the dataset as a graph. The GNG is widely used in vector quantization, clustering, and 3D graphics. Kernel methods are often used to map a dataset to a feature space, with support vector machines being the most prominent application. This paper introduces the kernel GNG approach and explores the characteristics of the networks generated by kernel GNG. Five kernels are used in this study: Gaussian, Laplacian, Cauchy, inverse multiquadric (IMQ), and log. The results show that the average degree and the average clustering coefficient decrease as the kernel parameter increases for the Gaussian, Laplacian, Cauchy, and IMQ kernels. Thus, when fewer edges and a lower clustering coefficient (fewer triangles) are desired, a kernel GNG with a larger parameter value is more appropriate.
Title: Characteristics of Networks Generated by Kernel Growing Neural Gas
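The kernel trick underlying a kernelized GNG can be shown compactly: the distance between an input and a unit is computed in feature space without an explicit mapping, via k(x, x) - 2k(x, w) + k(w, w). This is a generic sketch, not the paper's implementation, and the sigma value is an assumption.

```python
import numpy as np

def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel k(x, y) = exp(-||x - y||^2 / (2 sigma^2))."""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return np.exp(-np.dot(diff, diff) / (2.0 * sigma ** 2))

def feature_space_distance_sq(x, w, kernel=gaussian_kernel, **kw):
    """Squared feature-space distance ||phi(x) - phi(w)||^2 via the kernel
    trick: k(x, x) - 2 k(x, w) + k(w, w). This is the quantity a kernel GNG
    would use to pick the winning unit for each input."""
    return kernel(x, x, **kw) - 2.0 * kernel(x, w, **kw) + kernel(w, w, **kw)

# Identical points have zero feature-space distance; it grows with ||x - w||.
near = feature_space_distance_sq([0, 0], [0.1, 0.0])
far = feature_space_distance_sq([0, 0], [2.0, 0.0])
print(near < far)  # True
```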
Pub Date: 2023-09-28 DOI: 10.5121/ijaia.2023.14504
Pranav Gunhal
This study explores the utility of sentiment classification in political decision-making through an analysis of Twitter sentiment surrounding the 2023 Karnataka elections. Utilizing transformer-based models for sentiment analysis in Indic languages, the research employs innovative data collection methodologies, including novel data augmentation techniques. The primary focus is on sentiment classification, discerning positive, negative, and neutral posts, particularly regarding the defeat of the Bharatiya Janata Party (BJP) or the victory of the Indian National Congress (INC). Leveraging high-performing transformer architectures like IndicBERT, coupled with precise hyperparameter tuning, the AI models used in this study exhibit exceptional predictive accuracy, notably predicting the INC's electoral success. These findings underscore the potential of state-of-the-art transformer-based models in capturing and understanding sentiment dynamics within Indian politics. The implications are far-reaching, providing invaluable insights for political stakeholders preparing for the 2024 Lok Sabha elections. This study stands as a testament to the potential of sentiment analysis as a pivotal tool in political decision-making, specifically in non-Western nations.
Title: Sentiment Analysis in Indian Elections: Unraveling Public Perception of the Karnataka Elections With Transformers
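One way post-level sentiment labels can be rolled up into a party-level signal is a simple net-sentiment score. The study's actual pipeline uses IndicBERT classifiers; the aggregation below and all data are hypothetical illustrations only.

```python
from collections import Counter

def net_sentiment(posts):
    """Aggregate per-post labels ('positive'/'negative'/'neutral') into a
    net score in [-1, 1]: (positives - negatives) / total posts."""
    counts = Counter(label for _, label in posts)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return (counts["positive"] - counts["negative"]) / total

# Hypothetical classifier outputs for posts mentioning each party.
inc_posts = [("post1", "positive"), ("post2", "positive"), ("post3", "neutral")]
bjp_posts = [("post4", "negative"), ("post5", "negative"), ("post6", "positive")]
print(net_sentiment(inc_posts) > net_sentiment(bjp_posts))  # True
```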
Pub Date: 2023-09-28 DOI: 10.5121/ijaia.2023.14505
Raghav Subramaniam
With the rising popularity of generative AI tools, the nature of apparent classification failures by AI content detection software, especially across different languages, must be further observed. This paper aims to do this by testing OpenAI's "AI Text Classifier" on a set of human- and AI-generated texts in English, German, Arabic, Hindi, Chinese, and Swahili. Given the unreliability of existing tools for detecting AI-generated text, it is notable that specific types of classification failures often persist in slightly different ways across languages: misclassification of human-written content as "AI-generated" and vice versa may occur more frequently in content in some languages than in others. Our findings indicate that false negative labels are more likely to occur in English, whereas false positives are more likely to occur in Hindi and Arabic. Other languages showed a tendency not to be confidently labeled at all.
Title: Identifying Text Classification Failures in Multilingual AI-Generated Content
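The per-language error analysis described above amounts to tallying false positives (human text flagged as AI) and false negatives (AI text passed as human) separately for each language. A generic sketch, with hypothetical records rather than the study's data:

```python
def error_rates(records):
    """Per-language false-positive and false-negative rates for an AI-text
    detector. Each record: (language, true_label, predicted_label), where
    labels are 'human' or 'ai'."""
    stats = {}
    for lang, truth, pred in records:
        s = stats.setdefault(lang, {"fp": 0, "fn": 0, "human": 0, "ai": 0})
        s[truth] += 1
        if truth == "human" and pred == "ai":
            s["fp"] += 1  # human text misflagged as AI-generated
        elif truth == "ai" and pred == "human":
            s["fn"] += 1  # AI text that slipped past the detector
    return {lang: {"fp_rate": s["fp"] / s["human"] if s["human"] else 0.0,
                   "fn_rate": s["fn"] / s["ai"] if s["ai"] else 0.0}
            for lang, s in stats.items()}

# Toy results mirroring the reported tendency: FNs in English, FPs in Hindi.
records = [("English", "ai", "human"), ("English", "ai", "ai"),
           ("English", "human", "human"),
           ("Hindi", "human", "ai"), ("Hindi", "human", "human"),
           ("Hindi", "ai", "ai")]
rates = error_rates(records)
print(rates["English"]["fn_rate"], rates["Hindi"]["fp_rate"])  # 0.5 0.5
```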
Pub Date: 2023-05-28 DOI: 10.5121/ijaia.2023.14302
John Xu, John Morris
Designing metroidvania games often poses a unique challenge to video game developers, namely the difficulty of consistently preventing soft-locking, which hinders or blocks the player's ability to traverse through levels effectively [1]. As a result, many turn to hand-making all levels to ensure traversability, but in the process they often forsake the ability to rely on procedural generation to lessen the time and burden on human game developers [2]. On the other hand, when developers rely on popular forms of procedural generation such as Perlin noise, they find themselves unable to control those procedural algorithms to guarantee certain characteristics of the outputs, such as traversability [3]. Our paper presents a procedural solution that can also effectively guarantee the traversability of the generated level. Our method uses Answer Set Programming (ASP) to verify generation against restrictions we place, guaranteeing the outcome to be what we want [4]. The generation of a level is divided into rooms, which are first mapped out in a graph to ensure traversability from a starting room to an ending boss area. The rooms' geometry is then generated accordingly to create the full level. Using Perlin noise, we also created a demonstration of how traversability works in another form of procedural generation and compared it with our methodology to identify strengths and weaknesses. To demonstrate our method, we applied our solution as well as the Perlin noise algorithm to a 2D metroidvania game made in the Unity game engine and conducted quantitative tests on the ASP method to assess how well it works as a level generator [5].
Title: Procedural Generation in 2D Metroidvania Game with Answer Set Programming and Perlin Noise
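The graph step described above, verifying that the boss room is reachable from the start room, can be sketched with a plain breadth-first search. The paper encodes such constraints in ASP; this Python check is only an illustrative stand-in with made-up room names.

```python
from collections import deque

def is_traversable(rooms, edges, start, goal):
    """Check that the goal (boss) room is reachable from the start room in a
    room graph via breadth-first search. `edges` are undirected pairs."""
    adj = {r: [] for r in rooms}
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)
    seen, queue = {start}, deque([start])
    while queue:
        room = queue.popleft()
        if room == goal:
            return True
        for nxt in adj[room]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False

rooms = ["start", "hall", "cave", "boss"]
print(is_traversable(rooms, [("start", "hall"), ("hall", "boss")],
                     "start", "boss"))  # True: no soft-lock on this layout
print(is_traversable(rooms, [("start", "hall"), ("cave", "boss")],
                     "start", "boss"))  # False: boss area is cut off
```

A generator can reject any candidate room graph for which this check fails, which is the guarantee the ASP constraints provide declaratively.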
Pub Date: 2023-03-30 DOI: 10.5121/ijaia.2023.14204
Nicholas Gahman, Vinayak Elangovan
Document similarity is an important part of Natural Language Processing and is most commonly used for plagiarism detection and text summarization. Thus, finding the overall most effective document similarity algorithm could have a major positive impact on the field of Natural Language Processing. This report sets out to examine the numerous document similarity algorithms and determine which ones are the most useful. It addresses the most effective document similarity algorithm by categorizing them into three types: statistical algorithms, neural networks, and corpus/knowledge-based algorithms. The most effective algorithms in each category are also compared in our work using a series of benchmark datasets and evaluations that test every possible area in which each algorithm could be used.
Title: A Comparison of Document Similarity Algorithms
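A minimal example of the statistical category is cosine similarity over raw term-frequency vectors. This sketch is generic and is not drawn from the report itself.

```python
import math
from collections import Counter

def cosine_similarity(doc_a, doc_b):
    """Cosine similarity between two documents using raw term-frequency
    vectors, a simple statistical document-similarity measure."""
    tf_a = Counter(doc_a.lower().split())
    tf_b = Counter(doc_b.lower().split())
    dot = sum(tf_a[t] * tf_b[t] for t in tf_a.keys() & tf_b.keys())
    norm_a = math.sqrt(sum(v * v for v in tf_a.values()))
    norm_b = math.sqrt(sum(v * v for v in tf_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

same = cosine_similarity("the cat sat", "the cat sat")
related = cosine_similarity("the cat sat", "the dog sat")
print(same > related > 0)  # True: identical > related > unrelated
```

Neural and knowledge-based methods replace the term-frequency vectors with learned embeddings or lexical-resource distances, but the comparison step is often the same cosine.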
Pub Date: 2023-03-30 DOI: 10.5121/ijaia.2023.14201
Khalid AlNujaidi, Ghadah AlHabib, Abdulaziz AlOdhieb
As the population grows and more land is used for urbanization, ecosystems are disrupted by our roads and cars. This expansion of infrastructure cuts through wildlife territories, leading to many instances of Wildlife-Vehicle Collision (WVC). These instances of WVC are a global issue with a global socio-economic impact, resulting in billions of dollars in property damage and, at times, fatalities for vehicle occupants. In Saudi Arabia, this issue is similar, with instances of Camel-Vehicle Collision (CVC) being particularly deadly due to the large size of camels, which results in a 25% fatality rate [1]. The focus of this work is to test different object detection models on the task of detecting camels on the road. The Deep Learning (DL) object detection models used in the experiments are CenterNet, EfficientDet, Faster R-CNN, SSD, and YOLOv8. Results of the experiments show that YOLOv8 performed the best in terms of accuracy and was the most efficient in training. In the future, the plan is to expand on this work by developing a system to make countryside roads safer.
Title: Spot-the-Camel: Computer Vision for Safer Roads
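Accuracy comparisons between detectors of the kind described above are typically scored with Intersection-over-Union (IoU) between predicted and ground-truth boxes. The following generic sketch is not the authors' evaluation code; box coordinates are made up.

```python
def iou(box_a, box_b):
    """Intersection-over-Union of two axis-aligned boxes (x1, y1, x2, y2),
    the standard overlap metric for scoring object-detector predictions."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union else 0.0

print(iou((0, 0, 10, 10), (0, 0, 10, 10)))  # 1.0 (perfect detection)
half = iou((0, 0, 10, 10), (5, 0, 15, 10))  # overlap 50 over union 150
print(half)
```

A prediction is usually counted as a true positive when its IoU with a ground-truth box exceeds a threshold such as 0.5.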
Pub Date: 2023-03-30 DOI: 10.5121/ijaia.2023.14202
Jingyi Gu, Fadi P. Deek, Guiling Wang
Predicting stock movement attracts much attention from both industry and academia. Despite significant efforts, the results remain unsatisfactory due to the inherently complicated nature of the stock market, driven by factors including supply and demand, the state of the economy, the political climate, and even irrational human behavior. Recently, Generative Adversarial Networks (GANs) have been extended to time series data; however, robust methods exist primarily for synthetic series generation, which falls short for stock prediction. This is because existing GANs for stock applications suffer from mode collapse and only consider one-step prediction, thus underutilizing the potential of GANs. Furthermore, news and market volatility are neglected in current GANs. To address these issues, we exploit expert domain knowledge in finance and, for the first time, attempt to formulate stock movement prediction in a Wasserstein GAN framework for multi-step prediction. We propose Index GAN, which includes deliberate designs for the inherent characteristics of the stock market, leverages news context learning to thoroughly investigate textual information, and develops an attentive seq2seq learning network that captures the temporal dependency among stock prices, news, and market sentiment. We also utilize the critic to approximate the Wasserstein distance between actual and predicted sequences and develop a rolling strategy for deployment that mitigates noise from the financial market.
Stock Broad-Index Trend Patterns Learning via Domain Knowledge Informed Generative Network
Jingyi Gu, Fadi P. Deek, Guiling Wang
Pub Date : 2023-03-30DOI: 10.5121/ijaia.2023.14202
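The Wasserstein-critic objective described in the abstract above, in which a critic scores real versus generated multi-step sequences and the score gap serves as an estimate of the Wasserstein distance, can be sketched in a few lines. This is an illustrative toy, not the authors' Index GAN: the linear critic, the clipping constant, and the random stand-in sequences are all assumptions made for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
SEQ_LEN = 5  # multi-step horizon: score whole 5-step sequences, not single points

def critic(seqs, w, b):
    """Linear critic: maps each sequence to one scalar score."""
    return seqs @ w + b

def wasserstein_estimate(real, fake, w, b):
    """E[f(real)] - E[f(fake)]: the critic's Wasserstein-distance estimate."""
    return critic(real, w, b).mean() - critic(fake, w, b).mean()

def clip_weights(w, c=0.01):
    """Weight clipping keeps the linear critic (approximately) 1-Lipschitz."""
    return np.clip(w, -c, c)

# Stand-in data: 64 "actual" and 64 "predicted" 5-step sequences.
real = rng.normal(0.0, 1.0, size=(64, SEQ_LEN))
fake = rng.normal(0.5, 1.0, size=(64, SEQ_LEN))

w = clip_weights(rng.normal(size=SEQ_LEN))
b = 0.0
d = wasserstein_estimate(real, fake, w, b)
```

In training, the critic's parameters would be updated to maximize `d` (with clipping after each step) while the generator is updated to minimize it; the sketch shows only a single evaluation of the objective.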
Pub Date : 2022-01-31DOI: 10.5121/ijaia.2022.13101
Shengchao Li, Lin Zhang, Xiumin Diao
Data augmentation has been broadly applied in training deep-learning models to increase the diversity of data. This study investigates the effectiveness of different data augmentation methods for deep-learning-based human intention prediction when only limited training data is available. In our experiment, a human participant pitches a ball to nine potential targets, and we aim to predict which target the participant pitches the ball to. First, the effectiveness of 10 data augmentation groups is evaluated on a single-participant data set using RGB images. Second, the best data augmentation method on the single-participant data set (random cropping) is further evaluated on a multi-participant data set to assess its generalization ability. Finally, the effectiveness of random cropping on fused RGB-image and optical-flow data is evaluated on both single- and multi-participant data sets. Experiment results show that: 1) data augmentation methods that crop or deform images can improve prediction performance; 2) random cropping generalizes to the multi-participant data set (prediction accuracy improves from 50% to 57.4%); and 3) random cropping with fused RGB-image and optical-flow data further improves prediction accuracy from 57.4% to 63.9% on the multi-participant data set.
Deep-Learning-based Human Intention Prediction with Data Augmentation
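The random-cropping augmentation that the study above found most effective can be sketched as follows: crop a random window from an RGB frame, then resize it back so the network's input shape stays fixed. This is not the authors' implementation; the frame size, crop size, and nearest-neighbour resize are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)

def random_crop(img, crop_h, crop_w):
    """Cut a crop_h x crop_w window at a uniformly random position."""
    h, w, _ = img.shape
    top = rng.integers(0, h - crop_h + 1)
    left = rng.integers(0, w - crop_w + 1)
    return img[top:top + crop_h, left:left + crop_w]

def resize_nearest(img, out_h, out_w):
    """Nearest-neighbour resize, enough to restore a fixed input size."""
    h, w, _ = img.shape
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return img[rows][:, cols]

# Toy 120x160 RGB frame standing in for a video frame of the pitch.
frame = rng.integers(0, 256, size=(120, 160, 3), dtype=np.uint8)
aug = resize_nearest(random_crop(frame, 96, 128), 120, 160)
```

Applying a fresh random crop on every training pass yields a different framing of the same pitch each time, which is what lets a small data set act like a larger one.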