Image classification plays a pivotal role in numerous applications with substantial implications for daily life, including diagnosing disease from medical images and managing images in autonomous vehicles. However, research in this field continually challenges scientists in choosing datasets, testing accuracy, and improving models. In this paper, we focus on the performance of two prominent models, GoogleNet and the Residual Attention Network. We construct both models in Python from available online resources. To assess their capabilities, we employ CIFAR-100, a widely used benchmark dataset. Although our implementations are simple, GoogleNet comprises approximately 75 convolutional layers and inception modules, and the Residual Attention Network incorporates multiple attention modules within its architecture. These characteristics suggest the models' potential for achieving strong classification results. Through comprehensive testing and visualization, we aim to provide insights into the efficacy of these models for image classification. Our study contributes to a broader and deeper understanding of their suitability for real-world applications. Based on our diagrams and analysis, we conclude that although the structure of attention56 is suited to image classification, the model is unstable and fails across a wide range of training image data on CIFAR-100, so it might not be usable in practice. GoogleNet, by contrast, becomes more robust and more resistant to noise as the amount of training increases. Therefore, GoogleNet is a suitable model for image classification.
{"title":"Comparative analysis of transformer and GoogleNet models in image classification based on the CIFAR dataset","authors":"Xinran Xie, Qinwen Yan, Haoye Li, Sujie Yan, Zirong Jiang","doi":"10.54254/2755-2721/79/20241537","DOIUrl":"https://doi.org/10.54254/2755-2721/79/20241537","PeriodicalId":502253,"journal":{"name":"Applied and Computational Engineering","volume":"30 8"},"publicationDate":"2024-07-25","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"141803546"}
Pub Date: 2024-07-25 DOI: 10.54254/2755-2721/55/20241475
Deming Liu, Hansheng Huang, Haimei Zhang, Xinyu Luo, Zhongyang Fan
The digital era has transformed the way businesses interact with their customers, with online platforms serving as crucial touchpoints for user engagement. Understanding customer behavior in this context is paramount for enhancing user experience, optimizing marketing strategies, and driving business growth. This study explores the likelihood of customers making purchases based on their clickstream data, employing both machine learning and deep learning techniques. It uses three machine learning models, Random Forest (RF), Gradient Boosting Decision Trees (GBDT), and Extreme Gradient Boosting (XGBoost), and two deep learning models, Recurrent Neural Networks (RNN) and Long Short-Term Memory (LSTM), to predict whether customers will purchase items, using 33,040,175 records in the clicks file and 1,177,769 records in the buys file from real e-commerce customers. The results show that both machine learning and deep learning can forecast customers' purchasing behavior with an accuracy of around 72 to 75 percent. The machine learning models attain their highest prediction accuracy with a sliding window of 6 days. Among the deep learning models, the LSTM model with 50 layers gives the best prediction of customers' willingness to purchase an item. Compared with previous studies, the three machine learning models narrow the range of days, give more accurate predictions, and improve the model. RNN and LSTM show similar accuracy in predicting customer behavior. The study finds that both machine learning and deep learning models give strong results on whether customers will purchase a product, and that there is no significant difference between the two approaches on this classification task.
{"title":"Enhancing customer behavior prediction in e-commerce: A comparative analysis of machine learning and deep learning models","authors":"Deming Liu, Hansheng Huang, Haimei Zhang, Xinyu Luo, Zhongyang Fan","doi":"10.54254/2755-2721/55/20241475","DOIUrl":"https://doi.org/10.54254/2755-2721/55/20241475","PeriodicalId":502253,"journal":{"name":"Applied and Computational Engineering","volume":"10 4"},"publicationDate":"2024-07-25","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"141803688"}
Pub Date: 2024-07-25 DOI: 10.54254/2755-2721/79/20241666
Qiuyang Yu
With the rapid development of technology, robotics is continuously updated and iterated. In this process, mobile robots are gradually entering different fields and playing important roles in people's daily lives. Research on path planning has therefore become a priority, so that mobile robots can handle the various tasks they face in different situations. This paper proposes a ROS-based autonomous navigation system for mobile robots, which uses the Gmapping algorithm to implement SLAM for building environment maps. It addresses the accuracy issues found in traditional map construction methods and improves localization accuracy by combining an IMU with laser odometry. The paper focuses on path planning for mobile robots and compares the advantages and disadvantages of the A* and Dijkstra algorithms, aiming to identify the more suitable one. This paper provides support for the development of the modern mobile robot field.
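The A* versus Dijkstra comparison mentioned in the abstract can be illustrated with a small sketch. The grid, the unit step cost, the 4-connected neighbourhood, and the Manhattan-distance heuristic below are all assumptions made for illustration; they are not taken from the paper, and the function name `grid_search` is hypothetical.

```python
import heapq


def grid_search(grid, start, goal, use_heuristic):
    """Shortest path on a 4-connected grid of 0 (free) and 1 (blocked) cells.

    use_heuristic=False behaves as Dijkstra; True behaves as A* with a
    Manhattan-distance heuristic. Returns (path_cost, nodes_expanded).
    """
    def h(cell):
        if not use_heuristic:
            return 0
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    rows, cols = len(grid), len(grid[0])
    best = {start: 0}                      # cheapest known cost to each cell
    frontier = [(h(start), 0, start)]      # entries are (f = g + h, g, cell)
    closed = set()
    while frontier:
        _, cost, cell = heapq.heappop(frontier)
        if cell in closed:
            continue                       # stale queue entry
        closed.add(cell)
        if cell == goal:
            return cost, len(closed)
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                new_cost = cost + 1        # assumed unit cost per move
                if new_cost < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = new_cost
                    heapq.heappush(
                        frontier, (new_cost + h((nr, nc)), new_cost, (nr, nc))
                    )
    return None, len(closed)               # goal unreachable


# Hypothetical 5x5 map used only for this example.
GRID = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 1, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 0, 1, 0],
]
dijkstra_cost, dijkstra_expanded = grid_search(GRID, (0, 0), (4, 4), False)
astar_cost, astar_expanded = grid_search(GRID, (0, 0), (4, 4), True)
print(dijkstra_cost, dijkstra_expanded, astar_cost, astar_expanded)
```

Both variants return the same path cost because the Manhattan heuristic never overestimates on a unit-cost grid; the difference shows up in how many cells each one expands before reaching the goal.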
{"title":"A review of research about automatic navigation system of mobile robots based on ROS","authors":"Qiuyang Yu","doi":"10.54254/2755-2721/79/20241666","DOIUrl":"https://doi.org/10.54254/2755-2721/79/20241666","PeriodicalId":502253,"journal":{"name":"Applied and Computational Engineering","volume":"48 21"},"publicationDate":"2024-07-25","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"141803736"}
Pub Date: 2024-07-25 DOI: 10.54254/2755-2721/69/20241498
Feng Yuan
Time series data is widely available in a variety of industries. By forecasting time series, decision-makers can better grasp future trends and make more effective decisions. Financial time series data exhibit non-stationarity and high volatility. High-frequency fluctuations in financial products such as exchange rates, bonds and equities may reflect external shocks and risks in global financial markets, which are potentially dangerous and may threaten national economic security or even trigger financial crises. For financial time series data, a deep recurrent neural network processes each data point in the series in turn through its recurrent units. Each recurrent unit can adjust its own weights to better predict or analyze future values. Over time, these recurrent units continuously update their internal state, building up a representation of the characteristics of the entire data sequence. In addition, we add a gating mechanism to improve the network's control over the flow of information, so that the model retains long-term dependencies more effectively, which improves prediction accuracy and model stability. Experimental results show that our recurrent neural network model achieves higher prediction accuracy and stability than the baseline models on financial time series datasets.
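The abstract does not say which gating mechanism is used. As a minimal sketch, assuming a GRU-style update gate and reset gate and a scalar hidden state, one recurrent step can be written as below. The parameter names (`wz`, `uz`, and so on), their values, and the input series are hypothetical and untrained; they only show how a gate controls how much of the previous state is kept.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gru_step(x, h_prev, p):
    """One step of a scalar GRU-style gated recurrent unit (assumed form)."""
    z = sigmoid(p["wz"] * x + p["uz"] * h_prev + p["bz"])   # update gate
    r = sigmoid(p["wr"] * x + p["ur"] * h_prev + p["br"])   # reset gate
    h_cand = math.tanh(p["wh"] * x + p["uh"] * (r * h_prev) + p["bh"])
    # The update gate blends the old state with the candidate state.
    return (1.0 - z) * h_prev + z * h_cand


def run_sequence(xs, p, h0=0.0):
    """Feed a series through the unit and return the state after each step."""
    h = h0
    states = []
    for x in xs:
        h = gru_step(x, h, p)
        states.append(h)
    return states


# Hypothetical parameters and a made-up return series, for illustration only.
PARAMS = {"wz": 0.5, "uz": 0.3, "bz": 0.0,
          "wr": 0.4, "ur": 0.2, "br": 0.0,
          "wh": 1.0, "uh": 0.8, "bh": 0.0}
RETURNS = [0.01, -0.02, 0.015, 0.03, -0.01]
states = run_sequence(RETURNS, PARAMS)

# With the update gate forced shut (very negative bias), the state is retained,
# which is the mechanism that lets a gated unit carry long-term information.
CLOSED = dict(PARAMS, bz=-50.0)
held = run_sequence(RETURNS, CLOSED, h0=0.5)
```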
{"title":"Research on time-series financial data prediction and analysis based on deep recurrent neural network","authors":"Feng Yuan","doi":"10.54254/2755-2721/69/20241498","DOIUrl":"https://doi.org/10.54254/2755-2721/69/20241498","PeriodicalId":502253,"journal":{"name":"Applied and Computational Engineering","volume":"9 9"},"publicationDate":"2024-07-25","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"141804373"}
Pub Date: 2024-07-25 DOI: 10.54254/2755-2721/69/20241464
Rui Bao
A pursuit-evasion game is a game between a pursuer and an evader, with numerous applications including artificial intelligence and robot motion planning. The cops and robbers game is a type of pursuit-evasion game played on graphs, and its variants are a fruitful research area in graph theory. The purpose of this paper is to find specific relations between applications of pursuit-evasion games and the different variants of the cops and robbers game. The research method is to collect variants with different game rules and winning strategies, classify them, and then compare these rules and strategies with the applications of pursuit-evasion games to find their relationships. The comparison shows that there are many differences between the variants and the applications, so the two cannot be directly related. Applications of pursuit-evasion games are mainly based on route planning and object distance and generally combine multiple variant types, which differs from the current direction of research on variants.
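For the classical one-cop version of the game on a finite undirected graph, a standard result (due to Nowakowski and Winkler, and independently Quilliot) says the cop wins exactly when the graph is dismantlable: vertices can be removed one at a time, each being a "corner" whose closed neighbourhood is contained in that of another vertex, until a single vertex remains. The sketch below checks that condition. It is background for the base game only, not a method from the paper, and the function name and example graphs are hypothetical.

```python
def is_cop_win(adj):
    """Return True if one cop can always catch the robber on this graph.

    adj maps each vertex to the set of its neighbours (undirected, no loops).
    Uses the dismantlability characterisation of cop-win graphs.
    """
    adj = {v: set(ns) for v, ns in adj.items()}   # work on a copy
    while len(adj) > 1:
        corner = None
        for u in adj:
            closed_u = adj[u] | {u}
            # u is a corner if some other vertex dominates its closed neighbourhood
            if any(closed_u <= (adj[v] | {v}) for v in adj if v != u):
                corner = u
                break
        if corner is None:
            return False                          # no corner left: robber escapes
        for w in adj.pop(corner):                 # delete the corner
            adj[w].discard(corner)
    return True


# Hypothetical example graphs.
PATH_3 = {0: {1}, 1: {0, 2}, 2: {1}}
CYCLE_4 = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
TRIANGLE = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}

print(is_cop_win(PATH_3), is_cop_win(CYCLE_4), is_cop_win(TRIANGLE))
```

On the 4-cycle no vertex is a corner, which matches the intuition that the robber can always stay opposite a single cop.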
{"title":"Cops and robbers game and applications of its variant","authors":"Rui Bao","doi":"10.54254/2755-2721/69/20241464","DOIUrl":"https://doi.org/10.54254/2755-2721/69/20241464","PeriodicalId":502253,"journal":{"name":"Applied and Computational Engineering","volume":"46 13"},"publicationDate":"2024-07-25","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"141805101"}
In this study, we explore the ability of ChatGPT to predict stock market trends based on stock news headlines and real stock market data. To evaluate the performance of the fine-tuned model, we first obtain the predictions of GPT-3.5 Turbo on the future trends of specific stocks as a baseline. We then fine-tune GPT-3.5 Turbo and carry out the related training, testing, and evaluation. Experiments on two datasets, Bigdata2022 and Cikm, show that fine-tuning helps the model produce the expected structured output according to user requirements, based on a more sophisticated understanding of the text and data in this field. However, although the model's performance improves significantly, GPT-3.5 Turbo does not outperform other traditional large language models at integrating time series data and news headline data for stock forecasting. With more in-depth research, the fine-tuned ChatGPT model is expected to achieve excellent results in stock market forecasting tasks and become one of the mainstream research models in this field.
{"title":"The limits of excellence: Assessing fine-tuned ChatGPT's efficacy in stock price forecasting","authors":"Yiyang Huang, Xiang Liu, Naichuan Zhang, Tianshu Zhao","doi":"10.54254/2755-2721/79/20241612","DOIUrl":"https://doi.org/10.54254/2755-2721/79/20241612","PeriodicalId":502253,"journal":{"name":"Applied and Computational Engineering","volume":"49 14"},"publicationDate":"2024-07-25","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"141805673"}
Pub Date: 2024-07-25 DOI: 10.54254/2755-2721/55/20241429
Yixiong Fang, Weixi Yang
To accelerate the computation of the k largest eigenvalues and their eigenvectors, this work presents methods and insights in three main areas. 1) Speeding up Arnoldi iteration: based on observations of existing algorithms, we propose an approach that leverages matrix decomposition to accelerate the computation. The implementation focuses on iteratively computing orthogonal projections and efficiently storing the computed vectors in the Krylov subspace. 2) Special case, the eigenvector for λ = 1: we examine a specific scenario concerning Markov matrices, where the largest eigenvalue is 1. The work provides a detailed proof and analysis of this property, contributing to a deeper understanding of eigenvalues and eigenvectors in the context of stochastic processes. 3) Gaussian approximation for Markov matrices: this section examines the Gaussian approximation for Markov matrices, denoted by P. The work covers theoretical insights, practical challenges, computational efficiency, and empirical validation. Together, these sections form a cohesive study aimed at improving the computational efficiency of important algorithms in dimensionality reduction and matrix analysis. The findings may apply to a range of domains, including image segmentation, speaker verification, and anomaly detection.
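The special case of a Markov matrix can be checked numerically with a short sketch. It uses plain power iteration rather than the Arnoldi-based method the abstract describes, and it assumes a row-stochastic convention (each row of P sums to 1); the 2x2 matrix and the function name are hypothetical. Because every row sums to 1, the all-ones vector satisfies P·1 = 1, so 1 is an eigenvalue, and the matching left eigenvector is the stationary distribution.

```python
def stationary_distribution(P, iters=500):
    """Approximate the left eigenvector of a row-stochastic matrix for eigenvalue 1.

    Plain power iteration on pi <- pi P, renormalised to sum to 1 at each step.
    """
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        total = sum(nxt)
        pi = [x / total for x in nxt]
    return pi


# Hypothetical 2-state Markov matrix (rows sum to 1).
P = [[0.9, 0.1],
     [0.5, 0.5]]

# Right eigenvector check: P applied to the all-ones vector returns all ones.
row_sums = [sum(row) for row in P]

# Left eigenvector for eigenvalue 1; for this P it is (5/6, 1/6).
pi = stationary_distribution(P)
print(row_sums, pi)
```

Power iteration converges here because the second eigenvalue of this matrix (0.4) is well below 1; Krylov methods such as Arnoldi iteration are the usual choice when several leading eigenpairs are needed at once.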
{"title":"Efficient computation of eigenvalues in diffusion maps: A multi-strategy approach","authors":"Yixiong Fang, Weixi Yang","doi":"10.54254/2755-2721/55/20241429","DOIUrl":"https://doi.org/10.54254/2755-2721/55/20241429","PeriodicalId":502253,"journal":{"name":"Applied and Computational Engineering","volume":"56 22"},"publicationDate":"2024-07-25","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"141805926"}
Pub Date: 2024-07-25 DOI: 10.54254/2755-2721/79/20241286
Jingjing Xu, Jiahao Du, Junyi Wang
The emergence and rapid development of neural networks have been pivotal in advancing text-to-image generative models, with particular emphasis on generative adversarial networks (GANs), variational autoencoders (VAEs), and augmented reality (AR). These models have greatly enriched the field, offering diverse avenues for image generation. Critical support has been provided by datasets such as MS COCO, Flickr30K, Visual Genome, and Conceptual Captions, along with essential evaluation metrics, including Inception Score (IS), Fréchet Inception Distance (FID), precision, and recall. In this review, we examine the mechanisms and significance of each model and technique. Both GANs and VAEs stand out as significant models within image generative frameworks, each excelling in distinct aspects, so we discuss both for their complementary strengths. We also include other noteworthy models such as augmented reality to give a well-rounded assessment of current advances in the field. In terms of datasets, MS COCO offers a diverse and extensive collection of images, serving as a cornerstone for model training. Other datasets such as Flickr30K, Visual Genome, and Conceptual Captions contribute valuable labeled examples, further enriching the learning process for these models. Widely recognized metrics and methodologies allow the models to be evaluated and compared effectively. In conclusion, the field's recent achievements owe much to the integration of its various components. VAEs and GANs, with their unique strengths, complement each other, while metrics and datasets play complementary roles in advancing the capabilities of generative models for text-to-image synthesis. This survey underscores the synergy between models, metrics, and datasets that is moving the field forward.
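For reference, the two image-quality metrics named in the abstract are usually defined as follows. These are the standard definitions, stated here as background rather than quoted from the survey; the symbols are assumptions of this note: p(y|x) is the Inception classifier's label distribution for a generated image x, p(y) is its average over generated images, and (μ_r, Σ_r), (μ_g, Σ_g) are the mean and covariance of Inception features for real and generated images.

```latex
\[
\mathrm{IS} = \exp\!\Big( \mathbb{E}_{x \sim p_g}\big[ D_{\mathrm{KL}}\big( p(y \mid x) \,\|\, p(y) \big) \big] \Big),
\qquad
p(y) = \mathbb{E}_{x \sim p_g}\big[ p(y \mid x) \big]
\]
\[
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\Big( \Sigma_r + \Sigma_g - 2\,\big( \Sigma_r \Sigma_g \big)^{1/2} \Big)
\]
```

A higher IS and a lower FID indicate better samples; FID compares generated images against real ones, whereas IS looks only at the generated set.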
{"title":"A survey of generative models used in text-to-image","authors":"Jingjing Xu, Jiahao Du, Junyi Wang","doi":"10.54254/2755-2721/79/20241286","DOIUrl":"https://doi.org/10.54254/2755-2721/79/20241286","PeriodicalId":502253,"journal":{"name":"Applied and Computational Engineering","volume":"15 8"},"publicationDate":"2024-07-25","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"141804415"}
Pub Date: 2024-07-25 DOI: 10.54254/2755-2721/67/2024ma0069
Ziqiao Yu, Zhengcheng Li
To optimize the modulation efficiency of the dual-bridge series-resonant converter (DBSRC), this paper proposes a deep reinforcement learning (DRL) aided EPS (DEPS) modulation scheme for minimum-current operation, based on the harmonic analysis method. Using the deep deterministic policy gradient (DDPG) algorithm as an advanced DRL algorithm, the scheme obtains the optimized DEPS modulation through offline training of the agent, which adopts the extended-phase-shift (EPS) modulation scheme and takes the zero-voltage-switching (ZVS) constraints into account. The trained DDPG agent, which acts like an implicit function, can then provide the optimal phase-shift angle for the DBSRC in real time, with minimum current and soft switching across the continuous operating range. Compared with existing EPS modulation schemes based on First Harmonic Approximation (FHA), the DEPS scheme gives similar converter efficiency and performance while also being able to obtain modulation angles in real time from environmental parameters. Finally, PSIM simulation verifies the effectiveness of the proposed optimization scheme.
{"title":"Deep reinforcement learning-aided minimum current control for the DBSRC based on harmonic analysis method","authors":"Ziqiao Yu, Zhengcheng Li","doi":"10.54254/2755-2721/67/2024ma0069","DOIUrl":"https://doi.org/10.54254/2755-2721/67/2024ma0069","url":null,"PeriodicalId":502253,"journal":{"name":"Applied and Computational Engineering","volume":"6 9","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141804576","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-07-25 DOI: 10.54254/2755-2721/55/20241491
Ke Ning
Flanker tasks are expected to elicit activation mainly in the prefrontal regions and the anterior cingulate cortex (ACC), because the task requires participants to respond immediately after seeing specific signs. SPM in MATLAB is used to process the brain activation data collected during the flanker task. The process has four steps: preprocessing of all the subjects, first-level analysis, second-level analysis, and Region of Interest (ROI) analysis. After all the steps, the final graph should show black spots representing activation in the prefrontal and anterior cingulate cortex regions. However, the spotted area differs from the areas in the expected graph, indicating errors in the processing of each subject in the flanker task in this assignment. After reviewing the incorrect result, there are four conjectures about which steps introduced the errors.
{"title":"Four possible error conjectures with SPM in MATLAB","authors":"Ke Ning","doi":"10.54254/2755-2721/55/20241491","DOIUrl":"https://doi.org/10.54254/2755-2721/55/20241491","url":null,"PeriodicalId":502253,"journal":{"name":"Applied and Computational Engineering","volume":"35 7","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141805350","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}