Search engines, the principal information retrieval tools, are nowadays the main source for satisfying users’ information needs. For every query, a search engine explores its repository and/or index to find the relevant documents/Uniform Resource Locators (URLs), and page-ranking algorithms order those URLs by their relevancy to the user’s query. Analysis shows that many of the queries fired at search engines are duplicates, so there is scope to reduce the effort a search engine spends on them. In this paper, a proxy server is created that stores the search results of user queries in a web log. The proposed proxy server uses this web log to answer duplicate queries faster when they are fired again. The proposed scheme has been tested and found promising. Tested on ten duplicate user queries, the proposed architecture returns all relevant web pages for a duplicate user query (if the query is found in the web log at the proxy server) from a particular domain instead of searching the entire database. It reduces the perceived latency for duplicate queries and improves precision and accuracy up to 81.8% and 99%, respectively, for all duplicate user queries.
{"title":"A Novel Architecture for Search Engine using Domain Based Web Log Data","authors":"P. Sharma, Divakar Yadav","doi":"10.34028/iajit/20/1/10","DOIUrl":"https://doi.org/10.34028/iajit/20/1/10","url":null,"abstract":"Search engines, an information retrieval tool are the main source of information for users’ information need now a day. For every query, the search engine explores its repository and/or indexer to find the relevant documents/URLs for that query. Page ranking algorithms rank the Uniform Resource Locator in abstract section (URLs) according to its relevancy with respect to users’ query. It is analyzed that many of the queries fired by users on search engines are duplicate. There is a scope to improve the performance of search engine to reduce its efforts for duplicate queries. In this paper a proxy server is created that keep store the search results of user queries in web log. The proposed proxy server uses this web log to find results faster for duplicate queries fired next time. The proposed scheme has been tested and found prominent. The proposed architecture tested for ten duplicate user queries. it return all relevant web pages for duplicate user query (if query is found in web log at proxy server) from a particular domain instead of entire database. It reduces the perceived latency for duplicate query and also improves the value of precession and accuracy up to 81.8% and 99% respectively for all duplicate user queries.","PeriodicalId":13624,"journal":{"name":"Int. Arab J. Inf. 
Technol.","volume":"8 1","pages":"92-101"},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91047162","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
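The web-log caching idea described in the abstract above can be sketched as a small proxy layer. The class and names below are illustrative, not the paper's implementation; the eviction policy (LRU with a size cap) is an added assumption:

```python
from collections import OrderedDict

class QueryCacheProxy:
    """Sketch: search results are stored per query in a web log, and a
    duplicate query is answered from the log instead of hitting the
    search engine's full index again."""

    def __init__(self, backend, max_entries=1000):
        self.backend = backend            # callable: query -> result list
        self.max_entries = max_entries
        self.web_log = OrderedDict()      # query -> cached results

    def search(self, query):
        key = " ".join(query.lower().split())   # normalise so duplicates match
        if key in self.web_log:
            self.web_log.move_to_end(key)       # mark as recently used
            return self.web_log[key]            # served from the web log
        results = self.backend(key)             # first occurrence: full search
        self.web_log[key] = results
        if len(self.web_log) > self.max_entries:
            self.web_log.popitem(last=False)    # evict least recently used
        return results
```

A duplicate query (even with different casing or spacing) never reaches the backend a second time, which is the source of the latency reduction the paper reports.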
The Internet of Things (IoT) is widely known as a revolutionary paradigm that offers communication among different types of devices. The primary goal of this paradigm is to implement efficient and high-quality smart services. It requires a protocol stack that offers different service requirements for inter-communication between different devices. Transmission Control Protocol (TCP) and User Datagram Protocol (UDP) are used as transport-layer protocols in IoT to provide the quality of service needed by various IoT devices. IoT inherits many shortcomings of wireless networks, while also posing new challenges due to its uniqueness. When TCP is used in an IoT system, a variety of challenging issues have to be dealt with. This paper provides a comprehensive survey of the issues that arise due to the heterogeneous characteristics of IoT. We identify the main issues, such as the Retransmission Timeout (RTO) algorithm, congestion and packet loss, header overhead, high latency, and link-layer interaction. Moreover, we discuss the most probable solutions to these issues in IoT scenarios. The RTO algorithm issue has been addressed by algorithms such as CoCoA, CoCoA+, and CoCoA++. The high latency issue has been mitigated with the help of long-lived connections and TCP Fast Open. Congestion and packet loss have been tackled using several TCP variants, such as TCP New Reno, Tahoe, Reno, Vegas, and Westwood.
{"title":"Analysis of TCP issues and their possible solutions in the internet of things","authors":"S. Z. Hussain, Sultana Parween","doi":"10.34028/iajit/20/2/7","DOIUrl":"https://doi.org/10.34028/iajit/20/2/7","url":null,"abstract":"The Internet of Things (IoT) is widely known as a revolutionary paradigm that offers communication among different types of devices. The primary goal of this paradigm is to implement efficient and high-quality smart services. It requires a protocol stack that offers different service requirements for inter-communication between different devices. Transmission Control Protocol (TCP) and User Datagram Protocol (UDP) are used as transport layer protocols in IoT to provide the quality of service needed in various IoT devices. IoT encounters many shortcomings of wireless networks, while also posing new challenges due to its uniqueness. When TCP is used in an IoT system, a variety of challenging issues have to be dealt with. This paper provides a comprehensive survey of various issues which arises due to the heterogeneous characteristics of IoT. We identify main issues such as Retransmission Timeout (RTO) algorithm issue, congestion and packet loss issue, header overhead, high latency issue, link layer interaction issue, etc. Moreover, we provide several most probable solutions to the above-mentioned issues in the case of IoT scenarios. RTO algorithm issue has been resolved by using algorithms such as CoCoA, CoCoA+, and CoCoA++. Apart from these, the high latency issue has been solved with the help of a long lived connection and TCP Fast open. Congestion and packet loss issue has been resolved by using several TCP variants such as TCP New Reno, Tahoe, Reno, Vegas, and Westwood.","PeriodicalId":13624,"journal":{"name":"Int. Arab J. Inf. 
Technol.","volume":"16 1","pages":"206-214"},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81109562","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
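As a rough illustration of the RTO adaptation the survey discusses, the sketch below blends an RFC 6298-style RTT estimate into a running RTO, in the spirit of CoCoA's adaptive estimators; the blending weights and backoff cap are illustrative, not the published CoCoA constants:

```python
class AdaptiveRto:
    """CoCoA-style adaptive RTO sketch: an RFC 6298 estimator
    (SRTT/RTTVAR) feeds a blended overall RTO, and each timeout
    applies a bounded binary backoff."""
    ALPHA, BETA, K = 0.125, 0.25, 4   # RFC 6298 gains

    def __init__(self, initial_rto=2.0):
        self.srtt = None
        self.rttvar = None
        self.rto = initial_rto

    def on_ack(self, rtt):
        """Update the estimator with a measured round-trip time."""
        if self.srtt is None:
            self.srtt, self.rttvar = rtt, rtt / 2
        else:
            self.rttvar = ((1 - self.BETA) * self.rttvar
                           + self.BETA * abs(self.srtt - rtt))
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * rtt
        estimate = self.srtt + self.K * self.rttvar
        self.rto = 0.5 * estimate + 0.5 * self.rto   # blend new into old
        return self.rto

    def on_timeout(self):
        """Back off on a retransmission timeout, capped at 60 s."""
        self.rto = min(self.rto * 2, 60.0)
        return self.rto
```

The blending step is what keeps the RTO from reacting too sharply to a single RTT sample, which matters on lossy, high-variance IoT links.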
In Wireless Sensor Networks (WSNs), clustering is an effective method to distribute the load equally among all nodes, compared to a flat network architecture. Due to the dynamic nature of the network, the clustering process can be viewed as a dynamic optimization problem, and conventional computational intelligence techniques are not sufficient to solve it. The Dynamic Genetic Algorithm (DGA) addresses such problems by searching for optimal solutions in new environments. Therefore, the dynamic load-balanced clustering process is modeled using the basic components of a standard genetic algorithm, and the model is then enhanced with immigrant- and memory-based schemes to elect suitable cluster heads. The nodes’ residual energy level, node centrality, and mobility speed are considered to elect load-balanced cluster heads, and the optimal number of cluster members is assigned to each cluster head using the proposed DGA schemes: Random Immigrants Genetic Approach (RIGA), Memory Immigrants Genetic Approach (MIGA), and Memory and Random Immigrants Genetic Approach (MRIGA). The simulation results show that the proposed MRIGA scheme outperforms RIGA and MIGA in terms of various performance metrics such as the number of nodes alive, residual energy level, packet delivery ratio, end-to-end delay, and cluster-formation overhead.
{"title":"Genetic algorithm with random and memory immigrant strategies for solving dynamic load balanced clustering problem in wireless sensor networks","authors":"Mohaideen Pitchai","doi":"10.34028/iajit/20/4/3","DOIUrl":"https://doi.org/10.34028/iajit/20/4/3","url":null,"abstract":"In Wireless Sensor Networks (WSNs), clustering is an effective method to distribute the load equally among all the nodes as compared to flat network architecture. Due to the dynamic nature of the network, the clustering process can be viewed as a dynamic optimization problem and the conventional computational intelligence techniques are not enough to solve these problems. The Dynamic Genetic Algorithm (DGA) addresses these problems with the help of searching optimal solutions in new environments. Therefore the dynamic load-balanced clustering process is modeled using the basic components of standard genetic algorithm and then the model is enhanced is using immigrants and memory-based schemes to elect suitable cluster heads. The metrics nodes’ residual energy level, node centrality, and mobility speed of the nodes are considered to elect the load-balanced cluster heads and the optimal number of cluster members are assigned to each cluster head using the proposed DGA schemes such as Random Immigrants Genetic Approach (RIGA), Memory Immigrants Genetic Approach (MIGA), and Memory and Random Immigrants Genetic Approach (MRIGA). The simulation results show that the proposed DGA scheme MRIGA outperforms well as compared with RIGA and MIGA in terms of various performance metrics such as the number of nodes alive, residual energy level, packet delivery ratio, end-to-end delay, and overhead for the formation of clusters.","PeriodicalId":13624,"journal":{"name":"Int. Arab J. Inf. 
Technol.","volume":"11 19 1","pages":"575-583"},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87226218","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
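The random-immigrants scheme described above can be sketched with a generic bit-string GA; the fitness function, rates, and population sizes below are placeholders, not the paper's cluster-head model:

```python
import random

def evolve(fitness, length=12, pop_size=20, gens=30,
           immigrant_rate=0.2, seed=1):
    """Random-immigrants GA sketch: every generation, the worst
    individuals are replaced by fresh random immigrants, so the
    population keeps diversity when the environment (and hence the
    fitness landscape) changes."""
    rng = random.Random(seed)
    new = lambda: [rng.randint(0, 1) for _ in range(length)]
    pop = [new() for _ in range(pop_size)]
    for _ in range(gens):
        pop.sort(key=fitness, reverse=True)
        n_imm = int(immigrant_rate * pop_size)   # immigrants per generation
        parents = pop[: pop_size - n_imm]
        children = [parents[0][:]]               # elitism: keep the best
        while len(children) < pop_size - n_imm:
            a, b = rng.sample(parents, 2)        # one-point crossover
            cut = rng.randrange(1, length)
            child = a[:cut] + b[cut:]
            if rng.random() < 0.1:               # bit-flip mutation
                child[rng.randrange(length)] ^= 1
            children.append(child)
        pop = children + [new() for _ in range(n_imm)]
    return max(pop, key=fitness)
```

In the paper's setting the fitness would score a candidate cluster-head assignment by residual energy, centrality, and mobility; here a simple one-max fitness (`sum`) suffices to exercise the loop.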
Retina disorders are among the common types of eye disease and occur due to several reasons such as aging, diabetes, and premature birth. Optical Coherence Tomography (OCT) is a medical imaging method that captures volumetric scans of the human retina for diagnosis purposes. This research compared two pretraining approaches, Self-Supervised Learning (SSL) and Transfer Learning (TL), for training the ResNet34 neural architecture, aiming at building a computer-aided diagnosis tool for retina disorder recognition. The research methodology employs a convolutional auto-encoder model as a generative SSL pretraining method. The experiments are conducted on a dataset that contains 109,309 retina OCT images with three medical conditions, Choroidal Neovascularization (CNV), Diabetic Macular Edema (DME), and DRUSEN, as well as the NORMAL condition. The outcomes showed better performance in terms of overall accuracy, sensitivity, and specificity, namely 95.2%, 95.2%, and 98.4%, respectively, for SSL ResNet34, compared to 90.7%, 90.7%, and 96.9%, respectively, for TL ResNet34. In addition, the SSL pretraining approach significantly reduced the number of training epochs compared to both TL pretraining and previous research on the same dataset with comparable performance.
{"title":"Retina disorders classification via OCT scan: a comparative study between self-supervised learning and transfer learning","authors":"Saeed Shurrab, Yazan Shannak, R. Duwairi","doi":"10.34028/iajit/20/3/8","DOIUrl":"https://doi.org/10.34028/iajit/20/3/8","url":null,"abstract":"Retina disorders are among the common types of eye disease that occur due to several reasons such as aging, diabetes and premature born. Besides, Optical Coherence Tomography (OCT) is a medical imaging method that serves as a vehicle for capturing volumetric scans of the human eye retina for diagnoses purposes. This research compared two pretraining approaches including Self-Supervised Learning (SSL) and Transfer Learning (TL) to train ResNet34 neural architecture aiming at building computer aided diagnoses tool for retina disorders recognition. In addition, the research methodology employs convolutional auto-encoder model as a generative SSL pretraining method. The research efforts are implemented on a dataset that contains 109,309 retina OCT images with three medical conditions including Choroidal Neovascularization (CNV), Diabetic Macular Edema (DME), DRUSEN as well as NORMAL condition. The research outcomes showed better performance in terms of overall accuracy, sensitivity and specificity, namely, 95.2%, 95.2% and 98.4% respectively for SSL ResNet34 in comparison to scores of 90.7%, 90.7% and 96.9% respectively for TL ResNet34. In addition, SSL pretraining approach showed significant reduction in the number of epochs required for training in comparison to both TL pretraining as well as the previous research performed on the same dataset with comparable performance.","PeriodicalId":13624,"journal":{"name":"Int. Arab J. Inf. 
Technol.","volume":"28 1","pages":"357-367"},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84319388","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
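The "pretrain by reconstruction" idea behind generative SSL can be shown in miniature. The paper uses a convolutional auto-encoder on OCT scans; this toy uses a single encode weight `w` and decode weight `v`, trained by gradient descent to minimise reconstruction error on unlabeled numbers, purely to illustrate that the pretraining objective needs no labels:

```python
import random

def pretrain_scalar_autoencoder(xs, lr=0.01, epochs=500, seed=0):
    """Toy generative-SSL pretraining: minimise the average
    reconstruction error (v*w*x - x)**2 over unlabeled data.
    At the optimum the encode/decode pair satisfies v*w == 1."""
    rng = random.Random(seed)
    w, v = rng.uniform(0.2, 0.8), rng.uniform(0.2, 0.8)
    n = len(xs)
    for _ in range(epochs):
        gw = gv = 0.0
        for x in xs:
            err = v * w * x - x            # reconstruction error
            gw += 2 * err * v * x / n      # d(loss)/dw
            gv += 2 * err * w * x / n      # d(loss)/dv
        w -= lr * gw
        v -= lr * gv
    return w, v
```

After pretraining, the encoder weight would initialise a downstream classifier, which is the step where labels (and fewer epochs, per the paper's finding) come in.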
Rapid advancement of the internet has made many Social Networking Sites (SNS) popular among a huge population, and as various SNS accounts are interlinked with each other, the spread of an individual’s stored sensitive information is increasing. This has led to various security and privacy issues; one of them is impersonation or identity fraud. Identity fraud is the illegitimate or secret use of an account owner’s identity to invade his/her account and track personal information. Known persons such as parents, spouses, close friends, or siblings who are interested in the account owner’s online life may check his or her personal SNS accounts. Hence, an individual’s private SNS accounts can be invaded secretly by an illegitimate user without the account owner’s knowledge, which results in the compromise of private information. Thus, this paper proposes an in loco identity fraud detection strategy that employs a statistical analysis approach to continuously authenticate the authorized user, and it outperforms the previously known technique. This strategy may be used to prevent stalkers from penetrating a person's SNS account in real time. The accuracy attained in this research is greater than 90% after 1 minute and greater than 95% after 5 minutes of observation.
{"title":"In loco identity fraud detection model using statistical analysis for social networking sites: a case study with facebook","authors":"Shalini Hanok, Shankaraiah","doi":"10.34028/iajit/20/2/15","DOIUrl":"https://doi.org/10.34028/iajit/20/2/15","url":null,"abstract":"Rapid advancement in internet has made many Social Networking Sites (SNS) popular among a huge population, as various SNS accounts are interlinked with each other, spread of stored susceptible information of an individual is increasing. That has led to various security and privacy issues; one of them is impersonation or identity fraud. Identity fraud is the outcome of illegitimate or secret use of account owner’s identity to invade his/her account to track personal information. There are possibilities that known persons like parents, spouse, close friends, siblings who are interested in knowing what is going on in the account owner’s online life may check their personal SNS accounts. Hence an individual’s private SNS accounts can be invaded by an illegitimate user secretly without the knowledge of the account owner’s which results in compromise of private information. Thus, this paper proposes an in loco identity fraud detection strategy that employs a statistical analysis approach to constantly authenticate the authorized user, which outperforms the previously known technique. This strategy may be used to prevent stalkers from penetrating a person's SNS account in real time. The accuracy attained in this research is greater than 90% after 1 minute and greater than 95% after 5 minutes of observation.","PeriodicalId":13624,"journal":{"name":"Int. Arab J. Inf. 
Technol.","volume":"14 1","pages":"282-292"},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88497903","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
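A minimal sketch of statistical continuous authentication, assuming a z-score over behavioural timing samples (e.g. inter-click intervals); this is an illustrative statistic, not the paper's exact model:

```python
from statistics import mean, stdev

def intruder_score(profile, session):
    """Compare a session's behavioural samples against the owner's
    profile: a z-score of the session mean under the profile's
    distribution. Large persistent deviations suggest someone other
    than the owner is using the account."""
    mu, sigma = mean(profile), stdev(profile)
    std_err = sigma / len(session) ** 0.5
    return abs(mean(session) - mu) / std_err

def is_impostor(profile, session, threshold=3.0):
    """Flag the session when its deviation exceeds the threshold."""
    return intruder_score(profile, session) > threshold
```

Longer sessions shrink the standard error, which mirrors the paper's observation that accuracy rises from 1 minute to 5 minutes of observation.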
One of the basic characteristics of a Gabor filter is that it provides useful information about specific frequencies in a localized region. Such information can be used to locate snippets of code, i.e., localized code, in a program transformed into an image, in order to find embedded malicious patterns. Exploiting this phenomenon, we propose a novel technique that slides a Window over Gabor filters to mine the Dalvik Executable (DEX) bytecodes of an Android application (APK) for malicious patterns. We extract the structural and behavioral functionality and localized information of an APK through Gabor-filtered images of the 2D grayscale image of the DEX bytecodes. A Window is slid over these features, and each Window is assigned a weight based on its frequency of use. The Windows whose weights exceed a given threshold are used to train a classifier to detect malware APKs. Our technique does not require any disassembly or execution of the malware program and hence is much safer and more accurate. To further improve feature selection, we apply a greedy optimization algorithm to find the best-performing feature subset. The proposed technique, when tested using real malware and benign APKs, obtained a detection rate of 98.9% with 10-fold cross-validation.
{"title":"Mining android bytecodes through the eyes of gabor filters for detecting malware","authors":"Shahid Alam, A. K. Demir","doi":"10.34028/iajit/20/2/4","DOIUrl":"https://doi.org/10.34028/iajit/20/2/4","url":null,"abstract":"One of the basic characteristics of a Gabor filter is that it provides useful information about specific frequencies in a localized region. Such information can be used in locating snippets of code, i.e., localized code, in a program when transformed into an image for finding embedded malicious patterns. Keeping this phenomenon, we propose a novel technique using a sliding Window over Gabor filters for mining the Dalvik Executable (DEX) bytecodes of an Android application (APK) to find malicious patterns. We extract the structural and behavioral functionality and localized information of an APK through Gabor filtered images of the 2D grayscale image of the DEX bytecodes. A Window is slid over these features and a weight is assigned based on its frequency of use. The selected Windows whose weights are greater than a given threshold, are used for training a classifier to detect malware APKs. Our technique does not require any disassembly or execution of the malware program and hence is much safer and more accurate. To further improve feature selection, we apply a greedy optimization algorithm to find the best performing feature subset. The proposed technique, when tested using real malware and benign APKs, obtained a detection rate of 98.9% with 10-fold cross-validation.","PeriodicalId":13624,"journal":{"name":"Int. Arab J. Inf. 
Technol.","volume":"33 1","pages":"180-189"},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80422811","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
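The Window weighting step can be sketched on a 1-D feature sequence; the window size, step, and threshold below are illustrative stand-ins for the paper's 2-D Gabor-image Windows:

```python
from collections import Counter

def select_windows(features, size=4, step=2, threshold=2):
    """Slide a fixed-size window over a feature sequence, weight each
    distinct window by how often it occurs, and keep those whose
    weight meets the threshold as candidate classifier features."""
    windows = [tuple(features[i:i + size])
               for i in range(0, len(features) - size + 1, step)]
    weights = Counter(windows)          # weight = frequency of use
    return [w for w, count in weights.items() if count >= threshold]
```

Frequent windows correspond to recurring localized byte patterns; the thresholding discards one-off noise before classifier training.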
Traffic congestion is a major problem in many cities of the Hashemite Kingdom of Jordan, as in most countries. The rapid increase in vehicles on a fixed infrastructure has caused traffic congestion, and one of the main problems is that the current infrastructure cannot be expanded further. Therefore, rather than creating new infrastructure, the system needs to work differently and more intelligently to manage traffic better. In this research, a new adaptive traffic lights system is proposed to determine vehicle types, count the vehicles at a traffic junction using pattern detection methods, and suggest the necessary green time for each side of the junction using machine learning tools. In this context, the contributions of this paper are: (a) creating a new image-based dataset for vehicles, (b) proposing a new time-management formula for traffic lights, and (c) reviewing studies that contributed to the development of traffic lights systems in the past decade. For training the vehicle detector, we created an image-based dataset containing traffic images. We utilized the Region-Based Convolutional Neural Network (R-CNN), Fast R-CNN, Faster R-CNN, Single Shot Detector (SSD), and You Only Look Once v4 (YOLO v4) deep learning algorithms to train the model, derive the suggested mathematical formula, and assign the appropriate timeslot to every junction. For evaluation, we used the mean Average Precision (mAP) metric. The obtained results were 78.2%, 71%, 75.2%, 79.8%, and 86.4% for SSD, R-CNN, Fast R-CNN, Faster R-CNN, and YOLO v4, respectively; thus, YOLO v4 achieved the highest mAP for vehicle identification (86.4%). For time division (the junctions' timeslots), we proposed a formula that reduces the waiting time for vehicles by about 10%.
{"title":"An adaptive traffic lights system using machine learning","authors":"M. Ottom, A. Al-Omari","doi":"10.34028/iajit/20/3/13","DOIUrl":"https://doi.org/10.34028/iajit/20/3/13","url":null,"abstract":"Traffic congestion is a major problem in many cities of the Hashemite Kingdom of Jordan as in most countries. The rapidly increase of vehicles and dealing with the fixed infrastructure have caused traffic congestion. One of the main problems is that the current infrastructure cannot be expanded further. Therefore, there is a need to make the system work differently with more sophistication to manage the traffic better, rather than creating a new infrastructure. In this research, a new adaptive traffic lights system is proposed to determine vehicles type, calculate the number of vehicles in a traffic junction using patterns detection methods, and suggest the necessary time for each side of the traffic junction using machine learning tools. In this context, the contributions of this paper are: (a) creating a new image-based dataset for vehicles, (b) proposing a new time management formula for traffic lights, and (c) providing literature of many studies that contributed to the development of the traffic lights system in the past decade. For training the vehicle detector, we have created an image-based dataset related to our work and contains images for traffic. We utilized Region-Based Convolutional Neural Networks (R-CNN), Fast Region-Based Convolutional Neural Networks (Fast R-CNN), Faster Region-Based Convolutional Neural Networks (Faster R-CNN), Single Shot Detector (SSD), and You Only Look Once v4 (YOLO v4) deep learning algorithms to train the model and obtain the suggested mathematical formula to the required process and give the appropriate timeslot for every junction. For evaluation, we used the mean Average Precision (mAP) metric. 
The obtained results were as follows: 78.2%, 71%, 75.2%, 79.8%, and 86.4% for SSD, R-CNN, Fast R-CNN, Faster R-CNN, and YOLO v4, respectively. Based on our experimental results, it is found that YOLO v4 achieved the highest mAP of the identification of vehicles with (86.4%) mAP. For time division (the junctions timeslot), we proposed a formula that reduces about 10% of the waiting time for vehicles.","PeriodicalId":13624,"journal":{"name":"Int. Arab J. Inf. Technol.","volume":"358 1","pages":"407-418"},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84891721","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
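A hypothetical version of such a time-division formula, splitting a fixed signal cycle among junction sides in proportion to the detected vehicle counts (this is an assumed allocation rule, not the paper's exact formula):

```python
def green_times(counts, cycle=120, min_green=10):
    """Allocate green time per junction side proportionally to its
    vehicle count, with a minimum green per side so that empty
    approaches are not starved entirely."""
    total = sum(counts)
    if total == 0:
        return [cycle // len(counts)] * len(counts)   # no demand: split evenly
    spare = cycle - min_green * len(counts)           # time left to distribute
    return [min_green + round(spare * c / total) for c in counts]
```

A heavily loaded approach (e.g. the one the detector counted 30 vehicles on) gets a proportionally longer slot, which is where the waiting-time reduction would come from.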
The ability to post short text and media messages on social media platforms like Twitter, Facebook, etc., plays a huge role in the exchange of information following a mass emergency event like a hurricane, earthquake, or tsunami. Disaster victims, families, and relief operation teams use social media to help and support one another. Despite the benefits offered by these communication media, disaster-topic-related posts (posts indicating conversations about the disaster event in its aftermath) get lost in the deluge of posts, since the amount of data exchanged surges after a mass emergency event. This hampers the emergency relief effort, which in turn affects the delivery of useful information to disaster victims. Research in emergency coordination via social media has received growing interest in recent years, mainly focusing on developing machine learning-based models that can separate disaster-related posts from non-disaster-related posts. Among these, supervised machine learning approaches perform well when the source disaster dataset used for training and the target disaster dataset are similar. In the real world, however, this may not hold, as different disasters have different characteristics, so models developed with supervised approaches do not perform well on unseen disaster datasets. Therefore, domain adaptation approaches, which address this limitation by learning classifiers from unlabeled target data in addition to labelled source data, represent a promising direction for social media crisis data classification. Existing domain adaptation techniques for disaster tweet classification experiment with single disaster event dataset pairs; self-training is then performed on the source-target dataset pairs by considering the highly confident instances in subsequent training iterations. This could be improved with better feature engineering. Thus, this research proposes a Genetic Algorithm based Domain Adaptation Framework (GADA) for the classification of disaster tweets. The proposed GADA combines 1) a Hybrid Feature Selection component using the Genetic Algorithm and the Chi-Square Feature Evaluator for feature selection, and 2) a Classifier component using Random Forest to separate disaster-related posts from noise on Twitter. The framework addresses the lack of labeled data in the target disaster event with a Genetic Algorithm based approach. Experimental results on Twitter datasets for four disaster domain pairs show that the proposed framework improves the overall performance of previous supervised approaches and significantly reduces training time compared to previous domain adaptation techniques that do not use the Genetic Algorithm (GA) for feature selection.
{"title":"A Genetic Algorithm based Domain Adaptation Framework for Classification of Disaster Topic Text Tweets","authors":"Lokabhiram Dwarakanath, A. Kamsin, Liyana Shuib","doi":"10.34028/iajit/20/1/7","DOIUrl":"https://doi.org/10.34028/iajit/20/1/7","url":null,"abstract":"The ability to post short text and media messages on Social media platforms like Twitter, Facebook, etc., plays a huge role in the exchange of information following a mass emergency event like hurricane, earthquake, tsunami etc. Disaster victims, families, and other relief operation teams utilize social media to help and support one another. Despite the benefits offered by these communication media, the disaster topic related posts (posts that indicate conversations about the disaster event in the aftermath of the disaster) gets lost in the deluge of posts since there would be a surge in the amount of data that gets exchanged following a mass emergency event. This hampers the emergency relief effort, which in turn affects the delivery of useful information to the disaster victims. Research in emergency coordination via social media has received growing interest in recent years, mainly focusing on developing machine learning-based models that can separate disaster-related topic posts from non-disaster related topic posts. Of these, supervised machine learning approaches performed well when the machine learning model trained using source disaster dataset and target disaster dataset are similar. However, in the real world, it may not be feasible as different disasters have different characteristics. So, models developed using supervised machine learning approaches do not perform well in unseen disaster datasets. Therefore, domain adaptation approaches, which address the above limitation by learning classifiers from unlabeled target data in addition to source labelled data, represent a promising direction for social media crisis data classification tasks. 
The existing domain adaptation techniques for the classification of disaster tweets are experimented with using single disaster event dataset pairs; then, self-training is performed on the source target dataset pairs by considering the highly confident instances in subsequent iterations of training. This could be improved with better feature engineering. Thus, this research proposes a Genetic Algorithm based Domain Adaptation Framework (GADA) for the classification of disaster tweets. The proposed GADA combines the power of 1) Hybrid Feature Selection component using the Genetic Algorithm and Chi-Square Feature Evaluator for feature selection and 2) the Classifier component using Random Forest to classify disaster-related posts from noise on Twitter. The proposed framework addresses the challenge of the lack of labeled data in the target disaster event by proposing a Genetic Algorithm based approach. Experimental results on Twitter datasets corresponding to four disaster domain pair shows that the proposed framework improves the overall performance of the previous supervised approaches and significantly reduces the training time over the previous domain adaptation techniques that do not use the Genetic Algorithm (GA) for feature selection.","PeriodicalId":13624,"journal":{"name":"Int. Arab J. Inf. Technol.","volume":"53 1","pages":"57-65"},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91358427","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
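The hybrid feature-selection idea in the GADA abstract can be illustrated with a minimal sketch: a Chi-Square pre-filter reduces the feature set, then a tiny genetic algorithm searches binary feature masks, scoring each mask by the cross-validated accuracy of a Random Forest. The synthetic dataset, population size, mutation rate, and selection scheme below are all illustrative assumptions, not the paper's actual configuration.

```python
# Sketch of a Chi-Square + GA hybrid feature-selection loop (assumed parameters).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=200, n_features=30, n_informative=8,
                           random_state=0)
X = X - X.min()  # chi2 requires non-negative feature values

# Stage 1: Chi-Square pre-filter keeps the 15 highest-scoring features.
pre = SelectKBest(chi2, k=15).fit(X, y)
Xk = pre.transform(X)

def fitness(mask):
    """Mean 3-fold CV accuracy of a Random Forest on the masked subset."""
    if mask.sum() == 0:
        return 0.0
    clf = RandomForestClassifier(n_estimators=25, random_state=0)
    return cross_val_score(clf, Xk[:, mask.astype(bool)], y, cv=3).mean()

# Stage 2: tiny GA over binary feature masks (truncation selection).
pop = rng.integers(0, 2, size=(10, Xk.shape[1]))
for gen in range(5):
    scores = np.array([fitness(m) for m in pop])
    parents = pop[np.argsort(scores)[-4:]]        # keep the 4 best masks
    children = []
    for _ in range(len(pop) - len(parents)):
        a, b = parents[rng.integers(len(parents), size=2)]
        cut = rng.integers(1, Xk.shape[1])        # one-point crossover
        child = np.concatenate([a[:cut], b[cut:]])
        flip = rng.random(Xk.shape[1]) < 0.05     # bit-flip mutation
        children.append(np.where(flip, 1 - child, child))
    pop = np.vstack([parents, children])

best = pop[np.argmax([fitness(m) for m in pop])]
print("selected features:", int(best.sum()), "of", Xk.shape[1])
```

Any wrapper-style GA like this trades training time for subset quality; the abstract's claim is that pre-filtering with Chi-Square keeps the search space small enough for the GA to pay off.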
An Abdominal Aortic Aneurysm (AAA) is an abnormal focal dilation of the aorta. Most un-ruptured AAAs are asymptomatic, yet they can lead to abdominal malignancy, kidney damage, heart attack and even death. Because the condition is so dangerous, it requires a careful screening approach. The aim of this work is to locate the ruptured region precisely and to report the pathological condition of the AAA by computing the Ruptured Potential Index (RPI). To determine these two factors, image processing is performed on the retrieved aneurysm image. First, a high-quality image is obtained using an adaptive median filter. Segmentation is then carried out with an Artificial Neural Network-based method. After segmentation, 12 features are extracted from the segmented image using the Gray Level Co-occurrence Matrix (GLCM), and the best features among them are selected through Particle Swarm Optimization (PSO). Finally, a Grey Neural Network (GNN) classifier is applied to compare the training and test data. This classifier achieves the targeted objective with high accuracy.
{"title":"Highly accurate grey neural network classifier for an abdominal aortic aneurysm classification based on image processing approach","authors":"A. Bose, Vasuki Ramesh","doi":"10.34028/iajit/20/2/8","DOIUrl":"https://doi.org/10.34028/iajit/20/2/8","url":null,"abstract":"An Abdominal Aorta Aneurysm (AAA) is an abnormal focal dilation of the aorta. Most un-ruptured AAAs are asymptomatic, which leads to the problem of having abdominal malignancy, kidney damage, heart attack and even death. As it is ominous, it requires an astute scrutinizing approach. The significance of this proposed work is to scrutinize the exact location of the ruptured region and to make astute report of the pathological condition of AAA by computing the Ruptured Potential Index (RPI). To determine these two factors, image processing is performed in the retrieved image of aneurysm. Initially, it undergoes a process to obtain a high-quality image by making use of Adaptive median filter. After retrieving high quality image, segmentation is carried out using Artificial Neural Network-based segmentation. After segmenting the image into samples, 12 features are extracted from the segmented image by Gray Level Co-Occurrence Matrix (GLCM), which assists in extracting the best feature out of it. This optimization is performed by using Particle Swarm Optimization (PSO). Finally, Grey Neural Network (GNN) classifier is applied to analogize the trained and test set data. This classifier helps to achieve the targeted objective with high accuracy.","PeriodicalId":13624,"journal":{"name":"Int. Arab J. Inf. 
Technol.","volume":"112 1","pages":"215-223"},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79666666","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Gas-liquid two-phase flow is widely involved in many scientific and technological fields, such as energy, electricity, nuclear energy, aerospace and environmental protection. In some of these fields, extracting the accurate spatial position of bubbles not only captures the characteristics of bubbles in two-phase flow but also supports subsequent research such as bubble tracking. Some progress has been made in using Convolutional Neural Networks (CNNs) to identify bubbles in gas-liquid two-phase flow, but in many areas an accurate pixel-level segmentation map is more desirable. In this paper, a VGG16-FCN model and a U-Net model are used to identify bubbles in two-phase flow images from the perspective of semantic segmentation. LabelMe is used to label the images in the experiment, which removes the noise in the original images. In addition, because bubble pixels occupy a low ratio relative to the background, they affect the loss value only slightly, which makes traditional semantic segmentation losses evaluate the recognition unreasonably; thus, the Dice loss is used as the training loss function to improve recognition. The results show that the two deep learning models have strong feature-extraction ability and accurately detect bubble boundaries.
{"title":"Intelligent recognition of gas-liquid two-phase flow based on optical image","authors":"Shujuan Wang, Haofu Guan, Yuqing Wang, Kanghui Zhang, Yuntao Dai, S. Qiao","doi":"10.34028/iajit/20/4/7","DOIUrl":"https://doi.org/10.34028/iajit/20/4/7","url":null,"abstract":"Gas-liquid two-phase flow is widely involved in many scientific and technological fields, such as energy, electricity, nuclear energy, aerospace and environmental protection. In some fields, extracting the accurate position of bubbles in space can not only accurately capture the characteristics of bubbles in two-phase flow, but also plays an important role in the subsequent research like bubble tracking. It has got some progresses to use Convolutional Neural Network (CNNs) to identify bubbles in gas-liquid two-phase flow, while accurate pixel segmentation map in the bubble identification problem is more desirable in many areas. In this paper, VGG16-FCN model and U-Net model are utilized to identify bubbles in two-phase flow images from the perspective of semantic segmentation. LabelMe is used to label the images in the experiment, which can remove the noise in the original image. In addition, bubble pixels with low ratio relative to the background affects the loss function value tinily which cause the irrational evaluation for the recognition in traditional semantic segmentation, thus, Dice loss is used as the loss function for training to improve the recognition effect. The research results show that the two deep learning models have strong feature extraction ability and accurately detect the bubble boundary.","PeriodicalId":13624,"journal":{"name":"Int. Arab J. Inf. 
Technol.","volume":"263 1","pages":"609-617"},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76772650","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
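The class-imbalance argument for Dice loss in the abstract above can be made concrete with a minimal NumPy sketch: because bubble pixels are a small fraction of each image, a per-pixel loss barely moves when they are all misclassified, whereas Dice scores the overlap of predicted and true foreground directly. The smoothing term and the toy masks below are illustrative assumptions, not the paper's setup.

```python
# Minimal Dice loss sketch on a toy imbalanced mask (assumed eps smoothing).
import numpy as np

def dice_loss(pred, target, eps=1e-6):
    """1 - Dice coefficient; pred holds foreground probabilities in [0, 1]."""
    inter = np.sum(pred * target)
    return 1.0 - (2.0 * inter + eps) / (np.sum(pred) + np.sum(target) + eps)

# Toy 8x8 mask: 4 bubble pixels out of 64 (~6% foreground).
target = np.zeros((8, 8))
target[3:5, 3:5] = 1.0

perfect = target.copy()               # predicts every bubble pixel
miss = np.zeros_like(target)          # predicts background everywhere

print(round(dice_loss(perfect, target), 4))  # 0.0
print(round(dice_loss(miss, target), 4))     # 1.0
```

Note that predicting all-background costs `miss` nearly the maximum Dice loss even though it is ~94% accurate per pixel, which is exactly the imbalance behaviour the abstract says motivates replacing a plain per-pixel loss.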