Sung-Jae Chang, Hyeon-Seok Jeong, Hyun-Wook Jung, Su-Min Choi, Il-Gyu Choi, Youn-Sub Noh, Seong-Il Kim, Sang-Heung Lee, Ho-Kyun Ahn, Dong Min Kang, Dae-Hyun Kim, Jong-Won Lim
The effects of the parasitic gate capacitance and gate resistance (Rg) on radiofrequency (RF) performance are investigated in LG = 0.15 μm GaN high-electron-mobility transistors with T-gate head sizes ranging from 0.83 to 1.08 μm. When the device characteristics are compared, the difference in DC characteristics is negligible. In contrast, the RF performance, in terms of the current-gain cut-off frequency (fT) and maximum oscillation frequency (fmax), depends substantially on the T-gate head size. To clarify this dependence, small-signal modeling is conducted to extract the parasitic gate capacitance and Rg. When the T-gate head size is reduced from 1.08 to 0.83 μm, Rg increases by 82%, while fT and fmax improve by 27% and 26%, respectively, because the parasitic gate–source and gate–drain capacitances decrease by 19% and 43%, respectively. Therefore, in our transistor design and fabrication, minimizing the parasitic gate capacitance is more effective than reducing Rg, so reducing the T-gate head size improves RF performance.
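The trade-off described in this abstract can be illustrated with the common first-order textbook approximations fT ≈ gm / (2π(Cgs + Cgd)) and fmax ≈ fT / (2·sqrt((Rg + Ri + Rs)·gds + 2π·fT·Rg·Cgd)). The sketch below is not the paper's small-signal model: every element value (gm, Ri, Rs, gds, the intrinsic and parasitic capacitances, and the baseline Rg) is a hypothetical placeholder, and only the percentage changes are taken from the abstract.

```python
import math

def f_t(gm, cgs, cgd):
    """First-order current-gain cut-off frequency in Hz."""
    return gm / (2.0 * math.pi * (cgs + cgd))

def f_max(gm, cgs, cgd, rg, ri, rs, gds):
    """First-order maximum oscillation frequency in Hz."""
    ft = f_t(gm, cgs, cgd)
    loss = (rg + ri + rs) * gds + 2.0 * math.pi * ft * rg * cgd
    return ft / (2.0 * math.sqrt(loss))

# Hypothetical element values, NOT taken from the paper.
GM, RI, RS, GDS = 0.1, 1.0, 1.0, 0.005   # S, ohm, ohm, S
CGS_INT, CGD_INT = 100e-15, 10e-15       # intrinsic capacitances in F

def device(cgs_par, cgd_par, rg):
    """Return (fT, fmax) for given parasitic capacitances and gate resistance."""
    cgs = CGS_INT + cgs_par
    cgd = CGD_INT + cgd_par
    return f_t(GM, cgs, cgd), f_max(GM, cgs, cgd, rg, RI, RS, GDS)

# Large T-gate head: more parasitic capacitance, lower Rg (hypothetical baseline).
ft_large, fmax_large = device(100e-15, 30e-15, 0.30)

# Small T-gate head: parasitic Cgs -19%, parasitic Cgd -43%, Rg +82%
# (only these ratios come from the abstract).
ft_small, fmax_small = device(100e-15 * 0.81, 30e-15 * 0.57, 0.30 * 1.82)

print(f"fT:   {ft_large / 1e9:.1f} GHz -> {ft_small / 1e9:.1f} GHz")
print(f"fmax: {fmax_large / 1e9:.1f} GHz -> {fmax_small / 1e9:.1f} GHz")
```

With these placeholder values both figures of merit rise for the smaller head, but the direction of the fmax change depends on how large the parasitic share of the capacitance is relative to the Rg term, so the numbers should not be read as a reproduction of the reported 27% and 26% gains.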
{"title":"Effects of parasitic gate capacitance and gate resistance on radiofrequency performance in LG = 0.15 μm GaN high-electron-mobility transistors for X-band applications","authors":"Sung-Jae Chang, Hyeon-Seok Jeong, Hyun-Wook Jung, Su-Min Choi, Il-Gyu Choi, Youn-Sub Noh, Seong-Il Kim, Sang-Heung Lee, Ho-Kyun Ahn, Dong Min Kang, Dae-Hyun Kim, Jong-Won Lim","doi":"10.4218/etrij.2023-0250","DOIUrl":"10.4218/etrij.2023-0250","url":null,"abstract":"<p>The effects of the parasitic gate capacitance and gate resistance (<i>R</i><sub>g</sub>) on the radiofrequency (RF) performance are investigated in <i>L</i><sub>G</sub> = 0.15 μm GaN high-electron-mobility transistors with T-gate head size ranging from 0.83 to 1.08 μm. When the device characteristics are compared, the difference in DC characteristics is negligible. The RF performance in terms of the current-gain cut-off frequency (<i>f</i><sub>T</sub>) and maximum oscillation frequency (<i>f</i><sub>max</sub>) substantially depend on the T-gate head size. For clarifying the T-gate head size dependence, small-signal modeling is conducted to extract the parasitic gate capacitance and <i>R</i><sub>g</sub>. When the T-gate head size is reduced from 1.08 to 0.83 μm, <i>R</i><sub>g</sub> increases by 82%, while <i>f</i><sub>T</sub> and <i>f</i><sub>max</sub> improve by 27% and 26%, respectively, because the parasitic gate–source and gate–drain capacitances reduce by 19% and 43%, respectively. 
Therefore, minimizing the parasitic gate capacitance is more effective than reducing <i>R</i><sub>g</sub> in our transistor design and fabrication, leading to improved RF performance when reducing the T-gate head size.</p>","PeriodicalId":11901,"journal":{"name":"ETRI Journal","volume":"46 6","pages":"1090-1102"},"PeriodicalIF":1.3,"publicationDate":"2024-04-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.4218/etrij.2023-0250","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140596732","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Causal analysis involves analysis and discovery. We consider causal discovery, which implies learning and discovering causal structures from available data, owing to the significance of interpreting causal relationships in various fields. Research on causal discovery has been primarily focused on constraint- and score-based interpretable methods rather than on methods based on complex deep learning models. However, identifying causal relationships in real-world datasets remains challenging. Numerous studies have been conducted using small datasets with established ground truths. Moreover, constraint-based methods are based on conditional independence tests. However, such tests have a lower statistical power when applied to small datasets. To solve the small sample size problem, we propose a model that generates a continuous function from available samples using radial basis function approximation. We address the problem by extracting data from the generated continuous function and evaluate the proposed method on both real and synthetic datasets generated by structural equation modeling. The proposed method outperforms constraint-based methods using only small datasets.
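The augmentation idea in this abstract (fit a continuous function to a few samples, then draw extra points from it) can be sketched as follows. The abstract does not give the exact RBF formulation or sampling scheme, so this sketch swaps in a normalized Gaussian RBF approximation, which avoids solving a linear system; the function names, the `width` parameter, and the toy X → Y data are all assumptions made for illustration.

```python
import math
import random

def rbf_approximator(xs, ys, width):
    """Return f(x) built from Gaussian RBFs centred on the observed samples.

    Normalised RBF weighting is used (an assumption, not the paper's method),
    so f(x) is a weighted average of the observed ys.
    """
    def f(x):
        weights = [math.exp(-((x - c) / width) ** 2) for c in xs]
        total = sum(weights)
        return sum(w * y for w, y in zip(weights, ys)) / total
    return f

def augment(xs, ys, n_new, width, rng):
    """Draw n_new extra (x, y) pairs from the fitted continuous function."""
    f = rbf_approximator(xs, ys, width)
    lo, hi = min(xs), max(xs)
    new_xs = [rng.uniform(lo, hi) for _ in range(n_new)]
    return new_xs, [f(x) for x in new_xs]

# Toy small dataset: Y depends linearly on X plus noise (hypothetical data).
rng = random.Random(0)
xs = [i / 9 for i in range(10)]
ys = [2.0 * x + rng.gauss(0.0, 0.1) for x in xs]

new_xs, new_ys = augment(xs, ys, n_new=200, width=0.2, rng=rng)
print(len(xs), "original samples ->", len(xs) + len(new_xs), "after augmentation")
```

The enlarged sample could then be passed to a conditional independence test; whether that improves the test's power in practice is the question the paper evaluates, and this sketch makes no claim about it.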
{"title":"Small dataset augmentation with radial basis function approximation for causal discovery using constraint-based method","authors":"Chan Young Jung, Yun Jang","doi":"10.4218/etrij.2023-0397","DOIUrl":"https://doi.org/10.4218/etrij.2023-0397","url":null,"abstract":"Causal analysis involves analysis and discovery. We consider causal discovery, which implies learning and discovering causal structures from available data, owing to the significance of interpreting causal relationships in various fields. Research on causal discovery has been primarily focused on constraint- and score-based interpretable methods rather than on methods based on complex deep learning models. However, identifying causal relationships in real-world datasets remains challenging. Numerous studies have been conducted using small datasets with established ground truths. Moreover, constraint-based methods are based on conditional independence tests. However, such tests have a lower statistical power when applied to small datasets. To solve the small sample size problem, we propose a model that generates a continuous function from available samples using radial basis function approximation. We address the problem by extracting data from the generated continuous function and evaluate the proposed method on both real and synthetic datasets generated by structural equation modeling. 
The proposed method outperforms constraint-based methods using only small datasets.","PeriodicalId":11901,"journal":{"name":"ETRI Journal","volume":"18 1","pages":""},"PeriodicalIF":1.4,"publicationDate":"2024-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140596642","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We aim to improve both the energy harvesting efficiency and communication reliability of overlay networks powered by harvested energy. To this end, multiple antennas are used to collect energy efficiently and to decode reliably by selecting the transmit antenna and applying maximum-ratio combining. To further improve communication reliability, nonorthogonal multiple access (NOMA)-based decoding is applied at the secondary receiver. For performance evaluation, exact closed-form expressions for the secondary and primary outage probabilities are derived. The evaluation results show that the proposed method substantially outperforms a baseline without NOMA-based decoding in all system settings. The performance of the proposed method depends on multiple system parameters and can be optimized by allocating the time between energy harvesting and information processing.
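A small Monte Carlo sketch can show how transmit antenna selection (TAS) and maximum-ratio combining (MRC) lower the outage probability. It assumes independent unit-power Rayleigh fading and a fixed average SNR, and it leaves out the energy harvesting phase, the overlay (primary/secondary) structure, and the NOMA decoding, so it illustrates only the antenna-diversity part of the abstract. The function name and all parameter values are hypothetical.

```python
import random

def tas_mrc_outage(n_tx, n_rx, mean_snr, snr_threshold, trials, rng):
    """Estimate the outage probability of TAS at the transmitter with MRC at the receiver."""
    outages = 0
    for _ in range(trials):
        best_gain = 0.0
        for _ in range(n_tx):
            # |h|^2 of a unit-power Rayleigh channel is exponential with mean 1;
            # MRC adds the gains of all receive branches.
            gain = sum(rng.expovariate(1.0) for _ in range(n_rx))
            # TAS keeps the transmit antenna with the largest combined gain.
            best_gain = max(best_gain, gain)
        if mean_snr * best_gain < snr_threshold:
            outages += 1
    return outages / trials

rng = random.Random(0)
# Hypothetical setting: average SNR of 10 (linear) and an SNR threshold of 3.
out_1x1 = tas_mrc_outage(1, 1, 10.0, 3.0, 20000, rng)
out_2x2 = tas_mrc_outage(2, 2, 10.0, 3.0, 20000, rng)
print(f"1x1 outage: {out_1x1:.4f}, 2x2 TAS/MRC outage: {out_2x2:.4f}")
```

For the single-antenna case the estimate can be checked against the exponential distribution, 1 − exp(−3/10) ≈ 0.26; the closed-form expressions derived in the paper cover the full system and are not reproduced here.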
{"title":"Performance analysis of transmit antenna selection and maximum-ratio combining in overlay networks powered by harvested energy","authors":"Khuong Ho-Van","doi":"10.4218/etrij.2023-0389","DOIUrl":"10.4218/etrij.2023-0389","url":null,"abstract":"<p>We aim to improve both the energy harvesting efficiency and communication reliability of overlay networks powered by harvested energy. To this end, multiple antennas are considered to collect energy efficiently and perform reliable decoding by selecting the transmitting antenna and applying maximum-ratio combining. To further improve communication reliability, nonorthogonal multiple access (NOMA)-relied decoding is applied to the secondary receiver. For performance evaluation, exact formulas for the secondary/primary outage probability are derived in a closed form. The evaluation results show that the proposed method substantially outperforms a baseline without the NOMA-relied decoding in all the system settings. The performance of the proposed method is determined by multiple specifications and optimized by allocating the times for energy harvesting and information processing.</p>","PeriodicalId":11901,"journal":{"name":"ETRI Journal","volume":"46 6","pages":"987-997"},"PeriodicalIF":1.3,"publicationDate":"2024-03-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.4218/etrij.2023-0389","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140197532","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Husnu Baris Baydargil, Jangsik Park, Ibrahim Furkan Ince
Deep neural networks trained on labeled medical data face major challenges owing to the economic costs of data acquisition through expensive medical imaging devices, expert labor for data annotation, and large datasets to achieve optimal model performance. The heterogeneity of diseases, such as Alzheimer's disease, further complicates deep learning because the test cases may substantially differ from the training data, possibly increasing the rate of false positives. We propose a reconstruction-based self-supervised anomaly detection model to overcome these challenges. It has a dual-subnetwork encoder that enhances feature encoding augmented by skip connections to the decoder for improving the gradient flow. The novel encoder captures local and global features to improve image reconstruction. In addition, we introduce an entropy-based image conversion method. Extensive evaluations show that the proposed model outperforms benchmark models in anomaly detection and classification using an encoder. The supervised and unsupervised models show improved performances when trained with data preprocessed using the proposed image conversion method.
{"title":"Anomaly-based Alzheimer's disease detection using entropy-based probability Positron Emission Tomography images","authors":"Husnu Baris Baydargil, Jangsik Park, Ibrahim Furkan Ince","doi":"10.4218/etrij.2023-0123","DOIUrl":"10.4218/etrij.2023-0123","url":null,"abstract":"<p>Deep neural networks trained on labeled medical data face major challenges owing to the economic costs of data acquisition through expensive medical imaging devices, expert labor for data annotation, and large datasets to achieve optimal model performance. The heterogeneity of diseases, such as Alzheimer's disease, further complicates deep learning because the test cases may substantially differ from the training data, possibly increasing the rate of false positives. We propose a reconstruction-based self-supervised anomaly detection model to overcome these challenges. It has a dual-subnetwork encoder that enhances feature encoding augmented by skip connections to the decoder for improving the gradient flow. The novel encoder captures local and global features to improve image reconstruction. In addition, we introduce an entropy-based image conversion method. Extensive evaluations show that the proposed model outperforms benchmark models in anomaly detection and classification using an encoder. 
The supervised and unsupervised models show improved performances when trained with data preprocessed using the proposed image conversion method.</p>","PeriodicalId":11901,"journal":{"name":"ETRI Journal","volume":"46 3","pages":"513-525"},"PeriodicalIF":1.4,"publicationDate":"2024-03-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.4218/etrij.2023-0123","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140197528","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The filtered-orthogonal frequency division multiplexing (F-OFDM) scheme has gained attention as a promising solution for visible light communication (VLC) systems. One crucial aspect of VLC is the conversion of the complex F-OFDM signal into a real signal that is compatible with intensity modulation and direct detection. Traditionally, a real F-OFDM signal has been obtained by imposing Hermitian symmetry (HS) on the input samples of the inverse fast Fourier transform (IFFT), which requires a 2N-point IFFT and obtains an N-point FFT, thus adding complexity. In this study, a novel approach is presented and implemented, aiming to enhance spectral efficiency and reduce system complexity by generating a real F-OFDM signal without relying on HS. This approach is then compared with HS-free (HSF)-OFDM, direct current biased optical OFDM, and asymmetrically clipped optical OFDM. The suggested method offers an improvement of ~50% in the required IFFT/FFT volume. Consequently, it reduces hardware complexity and power usage compared with the traditional F-OFDM method. Moreover, regarding error rates, the proposed method demonstrates better spectral efficiency than HSF-OFDM.
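The traditional Hermitian-symmetry step mentioned in this abstract can be demonstrated in a few lines. The sketch below shows only that conventional baseline (not the proposed HS-free scheme and not the F-OFDM filtering): the data symbols are mirrored as complex conjugates so that the inverse DFT output is real. A direct O(M²) inverse DFT is used to keep the example dependency-free, and the function names are illustrative.

```python
import cmath

def hermitian_extend(symbols):
    """Build a Hermitian-symmetric spectrum of length 2 * (K + 1) from K symbols.

    Bins 0 and K + 1 are left empty; bin M - i carries the conjugate of bin i.
    """
    k = len(symbols)
    m = 2 * (k + 1)
    spectrum = [0j] * m
    for i, s in enumerate(symbols, start=1):
        spectrum[i] = s
        spectrum[m - i] = s.conjugate()
    return spectrum

def idft(spectrum):
    """Direct inverse discrete Fourier transform (O(M^2), for illustration only)."""
    m = len(spectrum)
    return [
        sum(x * cmath.exp(2j * cmath.pi * k * n / m) for k, x in enumerate(spectrum)) / m
        for n in range(m)
    ]

qam = [1 + 1j, -1 + 1j, 1 - 1j]          # three 4-QAM symbols
time_signal = idft(hermitian_extend(qam))
max_imag = max(abs(v.imag) for v in time_signal)
print("IDFT length:", len(time_signal), "largest imaginary part:", max_imag)
```

Three data symbols occupy an 8-point transform here, which is the roughly twofold overhead that the abstract attributes to HS.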
{"title":"Enhancing spectral efficiency with low complexity filtered-orthogonal frequency division multiplexing in visible light communication system","authors":"Hayder S. R. Hujijo, Muhammad Ilyas","doi":"10.4218/etrij.2023-0300","DOIUrl":"10.4218/etrij.2023-0300","url":null,"abstract":"<p>The filtered-orthogonal frequency division multiplexing (F-OFDM) scheme has gained attention as a promising solution in the field of visible light communication (VLC) systems. One crucial aspect in VLC is the conversion of the complex F-OFDM signal into a real signal that corresponds with direct detection and intensity modulation. Traditionally, achieving a real F-OFDM signal has involved imposing Hermitian symmetry (HS) on the samples of the Inverse Fast Fourier transform (IFFT), which requires 2N-point IFFT and obtains an N-point FFT, thus adding complexity. In this study, a novel approach is presented and implemented, aiming to enhance spectral efficiency and reduce system complexity by generating a real F-OFDM signal without relying on HS. This approach is then compared with HS-free (HSF)-OFDM, direct current biased optical OFDM, and asymmetrically clipped optical OFDM. The suggested method offers a remarkable improvement of ~50% in the required IFFT/FFT volume. Consequently, this method reduces hardware complexity and power usage compared with the traditional F-OFDM method. 
Moreover, regarding error rates, the proposed method demonstrates better spectral efficiency than HSF-OFDM.</p>","PeriodicalId":11901,"journal":{"name":"ETRI Journal","volume":"46 6","pages":"1007-1019"},"PeriodicalIF":1.3,"publicationDate":"2024-03-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.4218/etrij.2023-0300","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140197524","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Wireless sensor networks (WSNs) are composed of numerous nodes distributed over geographical regions. Security and energy efficiency are challenging to achieve owing to the open environment and the restricted battery capacity of the nodes. The multiobjective trust-aware artificial hummingbird algorithm (M-TAAHA) is proposed to achieve secure and reliable transmission over a WSN with a mobile sink (MS). The M-TAAHA selects secure cluster head (SCH) nodes based on trust, energy, interspace between sensors, interspace between SCH and MS, and the CH balancing factor. It then finds a secure route using trust, energy, and the interspace between SCH and MS. The M-TAAHA avoids malicious nodes to improve data delivery and prevent unnecessary energy consumption. The M-TAAHA is evaluated in terms of energy consumption, alive nodes, life expectancy, delay, data packets received at the MS, throughput, packet delivery ratio, and packet loss ratio. Existing techniques (LEACH-TM, EATMR, FAL, Taylor-spotted hyena optimization [Taylor-SHO], TBEBR, and TEDG) are used for comparison with the M-TAAHA. Findings show that the energy consumption of the proposed M-TAAHA for 1000 rounds is 0.56 J (1.78× lower than that of the Taylor-SHO).
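To make the cluster head criteria concrete, the sketch below scores each node with a weighted sum of trust, residual energy, mean distance to the other sensors, and distance to the mobile sink, then picks the top-scoring nodes. This is a plain greedy ranking, not the artificial hummingbird search itself, and it omits the CH balancing factor; the weights, the `Node` fields, and the normalization by field size are hypothetical choices rather than values from the paper.

```python
import math
from dataclasses import dataclass

@dataclass
class Node:
    name: str
    x: float
    y: float
    trust: float    # 0..1, hypothetical trust score
    energy: float   # residual energy as a fraction of the initial energy

def distance(a, b):
    """Euclidean distance between two (x, y) points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def fitness(node, nodes, sink, field_size, weights=(0.4, 0.3, 0.15, 0.15)):
    """Higher is better: reward trust and energy, penalise long distances."""
    w_trust, w_energy, w_intra, w_sink = weights
    here = (node.x, node.y)
    others = [n for n in nodes if n is not node]
    mean_intra = sum(distance(here, (n.x, n.y)) for n in others) / len(others)
    to_sink = distance(here, sink)
    return (w_trust * node.trust + w_energy * node.energy
            - w_intra * mean_intra / field_size
            - w_sink * to_sink / field_size)

def select_cluster_heads(nodes, sink, field_size, count):
    """Rank nodes by fitness and return the names of the best `count` nodes."""
    ranked = sorted(nodes, key=lambda n: fitness(n, nodes, sink, field_size), reverse=True)
    return [n.name for n in ranked[:count]]

nodes = [
    Node("A", 10, 10, trust=0.90, energy=0.8),
    Node("B", 50, 50, trust=0.95, energy=0.9),
    Node("C", 55, 45, trust=0.20, energy=1.0),   # low trust: treated as suspicious
    Node("D", 90, 90, trust=0.85, energy=0.4),
]
heads = select_cluster_heads(nodes, sink=(50, 50), field_size=100.0, count=2)
print("Selected cluster heads:", heads)
```

Node C has the most energy and sits close to the sink, yet its low trust score keeps it out of the selection, which is the behaviour that a trust-aware objective is meant to produce.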
{"title":"Multiobjective, trust-aware, artificial hummingbird algorithm-based secure clustering and routing with mobile sink for wireless sensor networks","authors":"Anil Kumar Jemla Naik, Manjunatha Parameswarappa, Mohan Naik Ramachandra","doi":"10.4218/etrij.2023-0330","DOIUrl":"10.4218/etrij.2023-0330","url":null,"abstract":"<p>Wireless sensor networks (WSNs) are composed of numerous nodes distributed in geographical regions. Security and energy efficiency are challenging tasks due to an open environment and a restricted battery source. The multiobjective trust-aware artificial hummingbird algorithm (M-TAAHA) is proposed to achieve secure and reliable transmission over a WSN with a mobile sink (MS). The M-TAAHA selects secure cluster head (SCH) nodes based on trust, energy, interspace between sensors, interspace between SCH and MS, and the CH balancing factor. A secure route is found by M-TAAHA with trust, energy, and interspace between SCH and MS. The M-TAAHA avoids the malicious nodes to improve data delivery and avoid unwanted energy consumption. The M-TAAHA is analyzed using energy consumption, alive nodes, life expectancy, delay, data packets received in MS, throughput, packet delivery ratio, and packet loss ratio. Existing techniques (LEACH-TM, EATMR, FAL, Taylor-spotted hyena optimization [Taylor-SHO], TBEBR, and TEDG) are used for comparison with the M-TAAHA. 
Findings show that the energy consumption of the proposed M-TAAHA for 1000 rounds is 0.56 J (1.78 × smaller than that of the Taylor-SHO).</p>","PeriodicalId":11901,"journal":{"name":"ETRI Journal","volume":"46 6","pages":"950-964"},"PeriodicalIF":1.3,"publicationDate":"2024-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.4218/etrij.2023-0330","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140197530","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In e-commerce platforms, sentiment analysis on an enormous number of user reviews efficiently enhances user satisfaction. In this article, an automated product recommendation system is developed based on machine and deep-learning models. In the initial step, the text data are acquired from the Amazon Product Reviews dataset, which includes 60 000 customer reviews with 14 806 neutral reviews, 19 567 negative reviews, and 25 627 positive reviews. Further, the text data denoising is carried out using techniques such as stop word removal, stemming, segregation, lemmatization, and tokenization. Removing stop-words (duplicate and inconsistent text) and other denoising techniques improves the classification performance and decreases the training time of the model. Next, vectorization is accomplished utilizing the term frequency–inverse document frequency technique, which converts denoised text to numerical vectors for faster code execution. The obtained feature vectors are given to the modified convolutional neural network model for sentiment analysis on e-commerce platforms. The empirical result shows that the proposed model obtained a mean accuracy of 97.40% on the APR dataset.
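The term frequency–inverse document frequency step can be sketched without any external library. The version below uses the plain definition tf × log(N / df) on whitespace-tokenized text; library implementations often apply smoothing or normalization, and the abstract does not say which variant the authors used, so this is an illustration of the technique only. The three review strings are made-up examples.

```python
import math
from collections import Counter

def tfidf(documents):
    """Return (vocabulary, vectors) using tf * log(N / df) on whitespace tokens."""
    tokenised = [doc.lower().split() for doc in documents]
    vocab = sorted({t for doc in tokenised for t in doc})
    n_docs = len(tokenised)
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(t for doc in tokenised for t in set(doc))
    idf = {t: math.log(n_docs / doc_freq[t]) for t in vocab}
    vectors = []
    for doc in tokenised:
        counts = Counter(doc)
        vectors.append([counts[t] / len(doc) * idf[t] for t in vocab])
    return vocab, vectors

# Made-up reviews, assumed to be already denoised.
reviews = ["great phone great battery", "bad battery", "great screen"]
vocab, vectors = tfidf(reviews)
for review, vec in zip(reviews, vectors):
    print(review, "->", [round(v, 3) for v in vec])
```

Each review becomes a fixed-length numeric vector over the shared vocabulary, which is the form a convolutional classifier can take as input.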
{"title":"Amazon product recommendation system based on a modified convolutional neural network","authors":"Yarasu Madhavi Latha, B. Srinivasa Rao","doi":"10.4218/etrij.2023-0162","DOIUrl":"10.4218/etrij.2023-0162","url":null,"abstract":"<p>In e-commerce platforms, sentiment analysis on an enormous number of user reviews efficiently enhances user satisfaction. In this article, an automated product recommendation system is developed based on machine and deep-learning models. In the initial step, the text data are acquired from the Amazon Product Reviews dataset, which includes 60 000 customer reviews with 14 806 neutral reviews, 19 567 negative reviews, and 25 627 positive reviews. Further, the text data denoising is carried out using techniques such as stop word removal, stemming, segregation, lemmatization, and tokenization. Removing stop-words (duplicate and inconsistent text) and other denoising techniques improves the classification performance and decreases the training time of the model. Next, vectorization is accomplished utilizing the term frequency–inverse document frequency technique, which converts denoised text to numerical vectors for faster code execution. The obtained feature vectors are given to the modified convolutional neural network model for sentiment analysis on e-commerce platforms. 
The empirical result shows that the proposed model obtained a mean accuracy of 97.40% on the APR dataset.</p>","PeriodicalId":11901,"journal":{"name":"ETRI Journal","volume":"46 4","pages":"633-647"},"PeriodicalIF":1.3,"publicationDate":"2024-03-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.4218/etrij.2023-0162","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140197526","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Junuk Jung, Sungbin Son, Joochan Park, Yongjun Park, Seonhoon Lee, Heung-Seon Oh
The performance of face recognition (FR) has reached a plateau for public benchmark datasets, such as labeled faces in the wild (LFW), celebrities in frontal-profile in the wild (CFP-FP), and the first manually collected, in-the-wild age database (AgeDB), owing to the rapid advances in convolutional neural networks (CNNs). However, the effects of faces under various fine-grained conditions on FR models have not been investigated, owing to the absence of relevant datasets. This paper analyzes their effects under different conditions and loss functions using K-FACE, a recently introduced FR dataset with fine-grained conditions. We propose a novel loss function called MixFace, which combines classification and metric losses. The superiority of MixFace in terms of effectiveness and robustness was experimentally demonstrated using various benchmark datasets.
{"title":"MixFace: Improving face verification with a focus on fine-grained conditions","authors":"Junuk Jung, Sungbin Son, Joochan Park, Yongjun Park, Seonhoon Lee, Heung-Seon Oh","doi":"10.4218/etrij.2023-0167","DOIUrl":"10.4218/etrij.2023-0167","url":null,"abstract":"<p>The performance of face recognition (FR) has reached a plateau for public benchmark datasets, such as labeled faces in the wild (LFW), celebrities in frontal-profile in the wild (CFP-FP), and the first manually collected, in-the-wild age database (AgeDB), owing to the rapid advances in convolutional neural networks (CNNs). However, the effects of faces under various fine-grained conditions on FR models have not been investigated, owing to the absence of relevant datasets. This paper analyzes their effects under different conditions and loss functions using K-FACE, a recently introduced FR dataset with fine-grained conditions. We propose a novel loss function called MixFace, which combines classification and metric losses. The superiority of MixFace in terms of effectiveness and robustness was experimentally demonstrated using various benchmark datasets.</p>","PeriodicalId":11901,"journal":{"name":"ETRI Journal","volume":"46 4","pages":"660-670"},"PeriodicalIF":1.3,"publicationDate":"2024-03-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.4218/etrij.2023-0167","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140147966","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Violence can be committed anywhere, even in crowded places. It is hence necessary to monitor human activities for public safety. Surveillance cameras can monitor surrounding activities but require human operators to watch every incident continuously. Automatic violence detection is needed for early warning and fast response. However, such automation is still challenging because of low video resolution and blind spots. This paper uses ResNet50V2 and the gated recurrent unit (GRU) algorithm to detect violence in the Movies, Hockey, and Crowd video datasets. Spatial features were extracted from each frame sequence of the video using a pretrained ResNet50V2 model, and the resulting feature sequences were then classified using the best trained GRU model. The experimental results were compared with those of wavelet feature extraction methods and of other classification models, such as the convolutional neural network and long short-term memory. The results show that the proposed combination of ResNet50V2 and GRU is robust and delivers the best performance in terms of accuracy, recall, precision, and F1-score. The use of ResNet50V2 for feature extraction can improve model performance.
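For reference, the standard GRU update that consumes the per-frame features can be written as below. Here x_t stands for the ResNet50V2 feature vector of frame t and h_t for the hidden state; the weight names are generic notation rather than symbols from the paper, and some references swap the roles of z_t and 1 − z_t in the last line.

```latex
\begin{aligned}
z_t &= \sigma\left(W_z x_t + U_z h_{t-1} + b_z\right) && \text{(update gate)} \\
r_t &= \sigma\left(W_r x_t + U_r h_{t-1} + b_r\right) && \text{(reset gate)} \\
\tilde{h}_t &= \tanh\left(W_h x_t + U_h \left(r_t \odot h_{t-1}\right) + b_h\right) && \text{(candidate state)} \\
h_t &= \left(1 - z_t\right) \odot h_{t-1} + z_t \odot \tilde{h}_t && \text{(new hidden state)}
\end{aligned}
```

The final hidden state is typically passed to a dense layer with a sigmoid or softmax output to produce the violent or nonviolent label; the abstract does not specify the classification head, so that detail is an assumption.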
{"title":"Violent crowd flow detection from surveillance cameras using deep transfer learning–gated recurrent unit","authors":"Elly Matul Imah, Riskyana Dewi Intan Puspitasari","doi":"10.4218/etrij.2023-0222","DOIUrl":"10.4218/etrij.2023-0222","url":null,"abstract":"<p>Violence can be committed anywhere, even in crowded places. It is hence necessary to monitor human activities for public safety. Surveillance cameras can monitor surrounding activities but require human assistance to continuously monitor every incident. Automatic violence detection is needed for early warning and fast response. However, such automation is still challenging because of low video resolution and blind spots. This paper uses ResNet50v2 and the gated recurrent unit (GRU) algorithm to detect violence in the Movies, Hockey, and Crowd video datasets. Spatial features were extracted from each frame sequence of the video using a pretrained model from ResNet50V2, which was then classified using the optimal trained model on the GRU architecture. The experimental results were then compared with wavelet feature extraction methods and classification models, such as the convolutional neural network and long short-term memory. The results show that the proposed combination of ResNet50V2 and GRU is robust and delivers the best performance in terms of accuracy, recall, precision, and F1-score. 
The use of ResNet50V2 for feature extraction can improve model performance.</p>","PeriodicalId":11901,"journal":{"name":"ETRI Journal","volume":"46 4","pages":"671-682"},"PeriodicalIF":1.3,"publicationDate":"2024-03-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.4218/etrij.2023-0222","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140147864","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This study introduces CR-M-SpanBERT, a coreference resolution (CR) model that utilizes multiple embedding-based span bidirectional encoder representations from transformers (SpanBERT) for antecedent recognition in natural language (NL) text. Information extraction studies aim to extract knowledge from NL text autonomously and cost-effectively. However, the extracted information may not represent knowledge accurately owing to the presence of ambiguous entities. Therefore, we propose a CR model that identifies mentions referring to the same entity in NL text. CR requires understanding both the syntax and semantics of the NL text simultaneously. Accordingly, multiple embeddings that can include syntactic and semantic information for each word are generated for CR. We evaluate the effectiveness of CR-M-SpanBERT by comparing it with a model that uses SpanBERT as the language model in CR studies. The results demonstrate that our proposed deep neural network model achieves high recognition accuracy for extracting antecedents from NL text. It also requires fewer epochs than the conventional SpanBERT approach to reach an average F1 score greater than 75%.
{"title":"CR-M-SpanBERT: Multiple embedding-based DNN coreference resolution using self-attention SpanBERT","authors":"Joon-young Jung","doi":"10.4218/etrij.2023-0308","DOIUrl":"https://doi.org/10.4218/etrij.2023-0308","url":null,"abstract":"<p>This study introduces CR-M-SpanBERT, a coreference resolution (CR) model that utilizes multiple embedding-based span bidirectional encoder representations from transformers, for antecedent recognition in natural language (NL) text. Information extraction studies aimed to extract knowledge from NL text autonomously and cost-effectively. However, the extracted information may not represent knowledge accurately owing to the presence of ambiguous entities. Therefore, we propose a CR model that identifies mentions referring to the same entity in NL text. In the case of CR, it is necessary to understand both the syntax and semantics of the NL text simultaneously. Therefore, multiple embeddings are generated for CR, which can include syntactic and semantic information for each word. We evaluate the effectiveness of CR-M-SpanBERT by comparing it to a model that uses SpanBERT as the language model in CR studies. The results demonstrate that our proposed deep neural network model achieves high-recognition accuracy for extracting antecedents from NL text. 
Additionally, it requires fewer epochs to achieve an average F1 accuracy greater than 75% compared with the conventional SpanBERT approach.</p>","PeriodicalId":11901,"journal":{"name":"ETRI Journal","volume":"46 1","pages":"35-47"},"PeriodicalIF":1.4,"publicationDate":"2024-02-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.4218/etrij.2023-0308","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139987396","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}