
2022 7th International Workshop on Big Data and Information Security (IWBIS): Latest Publications

Modeling Person’s Creditworthiness over Their Demography and Personality Appearance in Social Media
Pub Date : 2022-10-01 DOI: 10.1109/IWBIS56557.2022.9924843
A. Alamsyah, D. P. Ramadhani, Syifa Afina Ekaputri
Financial institutions currently use credit history to decide whether to grant credit to borrowers. However, companies such as P2P lending platforms face a data shortage, especially of credit history data, so innovative credit models have emerged to improve the ability to assess borrowers. As technology develops, we have the opportunity to extract data from social media. This study uses social media data to create models for assessing creditworthiness. We collect data from social media and then process it using a credit scoring scorecard, a linear correlation formula, a credit scoring model weight composition, and a threshold set according to expert judgment. We find that giving greater weight to the demographic attributes places more records in the good credit category. This research on establishing model combinations helps lenders assess borrowers more easily and practically using available data.
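The weight composition and threshold step can be illustrated with a minimal sketch. It assumes two sub-scores already normalized to the range 0 to 1; the function names, the default weight of 0.6, and the threshold of 0.5 are hypothetical and are not taken from the paper.

```python
def credit_score(demographic: float, personality: float,
                 w_demographic: float = 0.6) -> float:
    """Combine two sub-scores (each in [0, 1]) into one weighted score.

    w_demographic is a hypothetical weight; the personality score
    receives the remaining weight (1 - w_demographic).
    """
    if not 0.0 <= w_demographic <= 1.0:
        raise ValueError("w_demographic must be in [0, 1]")
    return w_demographic * demographic + (1.0 - w_demographic) * personality


def classify(score: float, threshold: float = 0.5) -> str:
    """Label a score as 'good' or 'bad' credit using a hypothetical threshold."""
    return "good" if score >= threshold else "bad"


# One applicant with a strong demographic score and a weak personality score.
high_weight = credit_score(0.8, 0.2, w_demographic=0.75)
low_weight = credit_score(0.8, 0.2, w_demographic=0.25)
print(classify(high_weight), classify(low_weight))
```

In this toy case the same applicant moves between categories only because the weight changes, which is the kind of sensitivity the abstract reports for the demographic weight.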
Citations: 0
Application of Natural Language Processing Using Cosine-Similarity Algorithm in Making Chatbot Information on the New Capital City of the Republic of Indonesia
Pub Date : 2022-10-01 DOI: 10.1109/IWBIS56557.2022.9924902
H. T. Y. Achsan, D. Kurniawan, Diki Gita Purnama, Quintin Kurnia Dikara Barcah, Yuri Yusyana Astoria
The new capital city (IKN) of the Republic of Indonesia was ratified and inaugurated by President Joko Widodo in January 2022. Unfortunately, many Indonesian citizens still do not understand the information regarding the designation of the new capital city. Although the Indonesian Government has created an official website for the new capital city (www.ikn.go.id), it is still not optimal because visitors cannot interact actively with the information they need. Therefore, developing a Chatting Robot (Chatbot) application is deemed necessary as an interactive component for obtaining the information users need about the new capital city. In this study, a chatbot application was developed by applying Natural Language Processing (NLP), using the Term Frequency-Inverse Document Frequency (TF-IDF) method for term weighting and the cosine similarity algorithm to calculate the similarity of the questions asked by the user. The research successfully designed and developed a chatbot application using the cosine similarity algorithm. The testing phase of the chatbot model used several scenarios related to the points of NLP implementation. The test results show that the chatbot responded well to the questions in all scenarios.
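The TF-IDF weighting and cosine similarity matching described above can be sketched as follows. This is a generic illustration using only the standard library, not the authors' implementation; the smoothed IDF formula, the tokenizer, and all function names are assumptions.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def build_idf(docs):
    """Smoothed inverse document frequency for every term in the corpus."""
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(tokenize(doc)))
    return {term: math.log((1 + n) / (1 + count)) + 1.0
            for term, count in df.items()}


def tfidf(text, idf):
    """Sparse TF-IDF vector as a dict; terms unknown to the corpus are dropped."""
    tf = Counter(tokenize(text))
    return {term: count * idf[term] for term, count in tf.items() if term in idf}


def cosine(a, b):
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_match(question, known_questions):
    """Return (index, score) of the stored question most similar to the input."""
    idf = build_idf(known_questions)
    query = tfidf(question, idf)
    scores = [cosine(query, tfidf(q, idf)) for q in known_questions]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]
```

A chatbot built this way would store one answer per known question and reply with the answer at the returned index, typically falling back to a default reply when the score is below some cutoff.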
Citations: 0
Development of Ship Detection Using OPENCV YOLO Method on Unmanned Prototype Boat for Monitoring National Sea
Pub Date : 2022-10-01 DOI: 10.1109/IWBIS56557.2022.9924891
A. P. W. Aldryani, Hendrana Tjahjadi, I. A. Dahlan, I. Kholis, Reza Istoni, Angelita Friskilla Bangun, Anry Christiano Tambunan, Jhon Kristel Sabathino Pigome
Indonesia is a vast archipelago with a large sea area. The Preamble of the 1945 Constitution states that the purpose of the Government of the Republic of Indonesia is to protect the entire Indonesian nation and its homeland. Therefore, defensive aspects need to be taken into special account. The Sea Defense System requires an agile unmanned mini boat with maneuvering capability that can secure the sea area according to its function and detect foreign ships. Indonesia needs to be more vigilant and detect its underwater defenses so that intruders do not attempt to violate the sovereignty of the Republic of Indonesia. This research proposes a solution by designing a mini marine prototype to protect and strengthen Indonesia's sea border. The system is developed using OpenCV and the YOLO (You Only Look Once) method to detect ships. It is developed on a laptop running Linux. The ship detection system yields confidence levels ranging from 54% to 96%, and the research identifies several conditions under which ships go undetected, namely camouflaged ships, images showing only half of a ship's body, and sunset conditions.
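YOLO-style detectors emit many candidate boxes with confidence scores, which are usually filtered by a confidence threshold and by non-maximum suppression. The sketch below shows that post-processing step in plain Python. It is a generic illustration rather than the authors' pipeline, and the threshold values of 0.5 and 0.45 are assumptions.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) corner coordinates
Detection = Tuple[Box, float]            # (box, confidence score)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_detections(detections: List[Detection],
                      conf_threshold: float = 0.5,
                      iou_threshold: float = 0.45) -> List[Detection]:
    """Drop low-confidence boxes, then apply greedy non-maximum suppression."""
    candidates = sorted(
        (d for d in detections if d[1] >= conf_threshold),
        key=lambda d: d[1],
        reverse=True,
    )
    kept: List[Detection] = []
    for box, score in candidates:
        # Keep a box only if it does not overlap too much with a stronger one.
        if all(iou(box, kept_box) <= iou_threshold for kept_box, _ in kept):
            kept.append((box, score))
    return kept
```

With a confidence threshold of 0.5, a detection scored at 0.54 would survive while one at 0.30 would be discarded, which is one way partially visible or camouflaged ships can end up undetected.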
Citations: 0
Implementation of Data Visualization in Defense
Pub Date : 2022-10-01 DOI: 10.1109/IWBIS56557.2022.9924722
Manan, Gathut Imam Gunadi, G. R. Deksino
The use of data visualization in defense is very important. Data visualization can be applied through GIS applications. GIS can help a country maintain its integrity. This study aims to show how data visualization using GIS is used in defense. This study uses a qualitative research methodology. The results show that visualization data assisted by GIS applications can reveal the hazard and vulnerability of a country's terrain. Data visualization can also support the supply chain needed by the defense industry as well as in times of war. Everyone who needs this visualization data can also better plan the network plan and urban plan that will be used.
Citations: 0
Blockchain Security for 5G Network using Internet of Things Devices
Pub Date : 2022-10-01 DOI: 10.1109/IWBIS56557.2022.9924937
O. Khalifa, M. Z. Ahmed, R. Saeed, Saleh Hussaini, A. H. Hashim, Elmahdi A. El-Khazmi
Networks of vehicles using Internet of Things (IoT) frameworks have the efficient characteristics of a modern intelligent transportation system, with a few challenges in vehicular ad-hoc networks (VANETs). However, their security framework is required to handle trust management while preserving user privacy. The wireless mobile communication (5G) system is regarded as an outstanding technology that provides ultra-reliable, limited-latency wireless communication services. By extension, integrating Software Defined Network (SDN) with 5G-VANET enhances global information gathering and network control. Therefore, real-time IoT applications for monitoring transport services are efficiently supported. These ensure vehicular security in this framework. This paper provides a technical solution for a self-confidential framework for a smart transport system. The process exploits IoT for vehicle communication by incorporating SDN and 5G technology. Owing to some features of blockchain, this framework has been implemented to provide various alternative support for vehicular smart services. This involves real-time cloud access to stream video information and protection management for the vehicular network. The implemented framework presents a promising technique and a reliable vehicular IoT environment while ensuring user privacy. Simulation results show that malicious vehicular nodes and messages and overhead are detected, and that the impact on network performance is satisfactory when deployed in large-scale network scenarios.
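The blockchain feature most relevant to detecting tampered vehicular messages is that each block commits to the hash of the previous one. The sketch below shows a minimal hash-chained ledger in Python. It is not the framework from the paper: it has no consensus, SDN, or 5G component, and the block layout, function names, and sample payloads are hypothetical.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder previous hash for the first block


def block_hash(index, prev_hash, payload):
    """SHA-256 digest over a canonical JSON encoding of the block contents."""
    body = json.dumps(
        {"index": index, "prev_hash": prev_hash, "payload": payload},
        sort_keys=True,
    )
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_block(chain, payload):
    """Append a new block that links to the hash of the current last block."""
    index = len(chain)
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    chain.append({
        "index": index,
        "prev_hash": prev_hash,
        "payload": payload,
        "hash": block_hash(index, prev_hash, payload),
    })


def is_valid(chain):
    """Recompute every hash and link; any edited block breaks the chain."""
    prev_hash = GENESIS_HASH
    for i, block in enumerate(chain):
        if block["index"] != i or block["prev_hash"] != prev_hash:
            return False
        if block["hash"] != block_hash(i, prev_hash, block["payload"]):
            return False
        prev_hash = block["hash"]
    return True


ledger = []
append_block(ledger, {"vehicle": "V1", "message": "road clear"})
append_block(ledger, {"vehicle": "V2", "message": "accident ahead"})
print(is_valid(ledger))
```

Changing the payload of any stored block after the fact makes `is_valid` return `False`, which is the tamper-evidence property a vehicular trust layer could build on.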
Citations: 0
OmeTV Pretexting Phishing Attacks: A Case Study of Social Engineering
Pub Date : 2022-10-01 DOI: 10.1109/IWBIS56557.2022.9924801
Girinoto, Dimas Febriyan Priambodo, Tiyas Yulita, R. K. A. M. Zulkham, A. Rifqi, A. S. Putri
One of the most common types of social engineering attacks is phishing. This technique psychologically manipulates the target into unknowingly handing over the information the attacker wants. Our research tries to find out how a phishing attack can be executed by first pretexting on the OmeTV video chat application. The targets of our attack are OmeTV users from Indonesia who are over 18 years old. The proposed attack methodology is the Social Engineering Session (SES). This study also aims to provide an overview of what kind of information can be extracted from the target during an attack. The results show that the pretexting phishing attack on the OmeTV video chat application was successfully carried out to obtain some of the target’s personal information, including full name, date of birth or age, address, educational status, hobbies, Instagram account, and phone number.
Citations: 1
IndoKEPLER, IndoWiki, and IndoLAMA: A Knowledge-enhanced Language Model, Dataset, and Benchmark for the Indonesian Language
Pub Date : 2022-10-01 DOI: 10.1109/IWBIS56557.2022.9924844
Inigo Ramli, A. Krisnadhi, Radityo Eko Prasojo
Pretrained language models possess an ability to learn the structural representation of a natural language by processing unstructured textual data. However, the current language model design lacks the ability to learn factual knowledge from knowledge graphs. Several attempts have been made to address this issue, such as the development of KEPLER. KEPLER combines the BERT language model and the TransE knowledge embedding method to achieve a language model that can incorporate knowledge graphs as training data. Unfortunately, such a knowledge-enhanced language model is not yet available for the Indonesian language. In this experiment, we propose IndoKEPLER: a language model trained using Wikipedia Bahasa Indonesia and Wikidata. We also create a new knowledge probing benchmark named IndoLAMA to test the ability of a language model to recall factual knowledge. The benchmark is based on LAMA, which is designed to test the suitability of our language model to be used as a knowledge base. IndoLAMA tests a language model by giving it cloze-style questions and comparing the model's predictions to the factually correct answers. This experiment shows that IndoKEPLER increases the ability of a normal DistilBERT model to recall factual knowledge by 0.8%. Moreover, the most significant increase happens when dealing with many-to-one relationships, where IndoKEPLER outperforms its original text encoder model by 3%.
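TransE, the knowledge embedding method named above, models a fact (head, relation, tail) as a translation in vector space, so that head + relation should land close to tail. The sketch below scores and ranks candidate tails by that distance. The two-dimensional vectors and the entity labels are toy values chosen for illustration, not embeddings from IndoKEPLER.

```python
import math


def transe_distance(head, relation, tail):
    """Euclidean norm of (head + relation - tail); smaller means more plausible."""
    if not (len(head) == len(relation) == len(tail)):
        raise ValueError("all embeddings must have the same dimension")
    return math.sqrt(sum((h + r - t) ** 2
                         for h, r, t in zip(head, relation, tail)))


def rank_tails(head, relation, candidates):
    """Sort candidate tail names from most to least plausible."""
    return sorted(
        candidates,
        key=lambda name: transe_distance(head, relation, candidates[name]),
    )


# Toy embeddings (hypothetical values, 2 dimensions for readability).
indonesia = [1.0, 0.0]
capital_of_inverse = [0.0, 1.0]   # relation vector: country -> its capital
tails = {"Jakarta": [1.0, 1.0], "Bangkok": [3.0, 3.0]}

print(rank_tails(indonesia, capital_of_inverse, tails))
```

In KEPLER-style training, a distance of this kind is used as part of the loss so that the text encoder's entity representations respect the knowledge graph; a cloze probe such as IndoLAMA then checks whether the model ranks the correct answer first.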
Citations: 0
Large-scale 3D Point Cloud Semantic Segmentation with 3D U-Net ASPP Sparse CNN
Pub Date : 2022-10-01 DOI: 10.1109/IWBIS56557.2022.9924988
Naufal Muhammad Hirzi, M. A. Ma'sum, Mahardhika Pratama, W. Jatmiko
3D geometric modelling of urban areas has the potential for further development, not only for 3D urban visualization. Because 3D point clouds are commonly used as the data for 3D urban geometry modelling, objects need to be extracted from the point clouds to analyze urban landscapes. An automated method for analyzing objects in a 3D point cloud can be achieved by using semantic segmentation. Unlike other segmentation tasks on 3D point cloud data, 3D urban point cloud segmentation faces the challenge of segmenting objects of different sizes on various types of landscape contours with an imbalanced distribution of objects. Therefore, this study modifies 3D U-Net Sparse CNN by adding Atrous Spatial Pyramid Pooling (ASPP) as one of the modules in the model, called 3D U-Net ASPP Sparse CNN. The use of ASPP aims to obtain contextual multi-scale information about the input feature map from the encoder part of U-Net. Furthermore, 3D U-Net ASPP Sparse CNN is implemented using weighted dice loss as the loss function. The experimental results show that 3D U-Net ASPP Sparse CNN with weighted dice loss achieved the best evaluation score in our experiment, with OA = 96.53 and mIoU = 63.59.
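The two reported metrics, overall accuracy (OA) and mean intersection over union (mIoU), can be computed from per-point labels as in the sketch below. This is a generic definition in plain Python under the assumption that classes absent from both prediction and ground truth are skipped; the paper's exact evaluation protocol is not stated in the abstract.

```python
def overall_accuracy(pred, true):
    """Fraction of points whose predicted class equals the ground-truth class."""
    if len(pred) != len(true) or not true:
        raise ValueError("pred and true must be non-empty and of equal length")
    return sum(p == t for p, t in zip(pred, true)) / len(true)


def mean_iou(pred, true, num_classes):
    """Average per-class IoU, skipping classes absent from both label lists."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, true) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, true) if p == c or t == c)
        if union > 0:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy labels for four points and two classes.
predicted = [0, 0, 1, 1]
ground_truth = [0, 1, 1, 1]
print(overall_accuracy(predicted, ground_truth))
print(mean_iou(predicted, ground_truth, num_classes=2))
```

Because mIoU averages over classes rather than points, rare classes count as much as common ones, which is one reason OA can be far higher than mIoU on data with an imbalanced object distribution.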
Citations: 0
Modified MultiResUNet for Left Ventricle Segmentation from Echocardiographic Images
Pub Date : 2022-10-01 DOI: 10.1109/IWBIS56557.2022.9924685
Fityan Azizi, Akbar Fathur Sani, R. Priambodo, Wisma Chaerul Karunianto, M. M. L. Ramadhan, M. F. Rachmadi, W. Jatmiko
An accurate assessment of heart function is crucial in diagnosing cardiovascular disease. One way to evaluate or detect the disease is echocardiography, by measuring systolic and diastolic volumes. However, manual human assessment can be time-consuming and error-prone due to the low resolution of the images. One way to detect heart failure on an echocardiogram is to segment the left ventricle using deep learning. In this study, we modified the MultiResUNet model for left ventricle segmentation in echocardiography images by adding an Atrous Spatial Pyramid Pooling block and an Attention block. The MultiRes blocks from MultiResUNet can overcome the problem of multi-resolution segmentation objects, where the objects to be segmented have different sizes. Echocardiographic images have similar characteristics, since the systole and diastole segmentation objects differ in size. Performance was evaluated using the Echonet-Dynamic dataset. The proposed model achieves a dice coefficient of 92%, an additional 2% compared to MultiResUNet.
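The dice coefficient used as the evaluation measure compares a predicted mask with a reference mask as twice their overlap divided by their combined size. A minimal sketch on flattened binary masks follows; the convention of returning 1.0 when both masks are empty is an assumption, and the sample masks are toy values.

```python
def dice_coefficient(pred_mask, true_mask):
    """Dice = 2 * |A and B| / (|A| + |B|) for flat binary masks (0 or 1 values)."""
    if len(pred_mask) != len(true_mask):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for p, t in zip(pred_mask, true_mask) if p and t)
    total = sum(1 for p in pred_mask if p) + sum(1 for t in true_mask if t)
    if total == 0:
        return 1.0  # both masks empty: treated as perfect agreement
    return 2.0 * intersection / total


# Toy 1-D masks standing in for flattened left-ventricle segmentations.
prediction = [1, 1, 0, 0]
reference = [1, 0, 0, 0]
print(dice_coefficient(prediction, reference))
```

A score of 1.0 means the masks coincide exactly and 0.0 means they do not overlap at all.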
Citations: 0
Mining Digital Traces to Uncover Global Perception of Bali’s Topmost Destinations
Pub Date : 2022-10-01 DOI: 10.1109/IWBIS56557.2022.9924920
A. Alamsyah, D. P. Ramadhani, Herlambang Septiaji Basuseno
User-generated content (UGC) provides abundant tourist information about destinations. These textual digital traces bring great opportunities along with great challenges. Text mining approaches, including sentiment analysis, multiclass text classification, and network analysis, are suitable for extracting patterns buried in piles of unstructured data. We processed 18,721 reviews from tourists worldwide about Bali’s 15 topmost tourist attractions. This study uncovers tourist perception from textual data, using sentiment analysis to extract positive and negative perceptions and multiclass classification to extract the tourists' cognitive concern for each destination. We explore tourist visiting patterns in more depth by combining the perception tone and cognitive concern results, using network analysis to map the destinations’ popularity, interconnectivity, and major cognitive perception. Most tourists express positive sentiment, and their concern centers on Bali’s natural attractions. They rate the social setting and environment aspect most favorably and accessibility least favorably. Sacred Monkey Forest Sanctuary is the most popular destination and a potential connecting point for visits to other destinations. This research gives government and other tourism stakeholders insight into the global perception of Bali’s topmost destinations, to support the development and improvement of Bali’s tourism.
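The abstract does not say how the destination network was constructed. As a rough illustration of how popularity and interconnectivity could be derived from reviews, the sketch below assumes that two destinations are linked whenever the same reviewer reviewed both. The function names, the reviewer IDs, and the placeholder destinations "Destination B" and "Destination C" are hypothetical.

```python
from collections import Counter
from itertools import combinations


def co_review_network(reviewer_destinations):
    """Build popularity counts and weighted edges from reviewer -> destinations lists.

    Assumption: an edge links two destinations reviewed by the same reviewer,
    and its weight is the number of reviewers who reviewed both.
    """
    popularity = Counter()
    edges = Counter()
    for destinations in reviewer_destinations.values():
        unique = sorted(set(destinations))  # ignore repeat reviews by one reviewer
        popularity.update(unique)
        edges.update(combinations(unique, 2))
    return popularity, edges


def weighted_degree(edges):
    """Sum the edge weights incident to each destination (interconnectivity)."""
    degree = Counter()
    for (a, b), weight in edges.items():
        degree[a] += weight
        degree[b] += weight
    return degree


# Hypothetical toy data, not taken from the study.
reviews = {
    "reviewer_1": ["Sacred Monkey Forest Sanctuary", "Destination B", "Destination C"],
    "reviewer_2": ["Sacred Monkey Forest Sanctuary", "Destination B"],
    "reviewer_3": ["Sacred Monkey Forest Sanctuary", "Destination C"],
    "reviewer_4": ["Destination B"],
}

popularity, edges = co_review_network(reviews)
degree = weighted_degree(edges)
print(popularity.most_common())
print(degree.most_common())
```

In this toy data the destination with the highest weighted degree is the one most often reviewed together with others, which is one simple reading of "a potential connecting point" to other destinations.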
Citations: 1