Pub Date: 2019-04-24 DOI: 10.1109/ICWR.2019.8765277
Niloofar Arazkhani, M. Meybodi, Alireza Rezvanian
The popularity of online social network services makes them a suitable platform for rapid diffusion of information, both positive and negative. Although people may welcome positive information, negative content such as rumors, hate, and misinformation should be blocked. However, blocking inappropriate, unwanted, or contaminating diffusion is not trivial. In this paper, we study competing negative and positive campaigns in a social network by addressing the influence blocking maximization (IBM) problem, with the goal of minimizing the harmful effect of misinformation. The IBM problem can be defined as finding a subset of nodes from which to promote the positive influence, under the Multi-campaign Independent Cascade Model as the diffusion model, so as to minimize the number of nodes that adopt the negative influence by the end of both propagation processes. To this end, we propose a community-based algorithm called FC_IBM, which uses fuzzy clustering and centrality measures to find a good candidate subset of nodes for diffusing positive information and thereby solve the IBM problem. Experimental results on well-known network datasets show that the proposed algorithm outperforms the baseline algorithms with respect to both efficiency and the final number of positive nodes.
Title: An Efficient Algorithm for Influence Blocking Maximization based on Community Detection
Published in: 2019 5th International Conference on Web Research (ICWR), pp. 258-263
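The competing-cascade setting in the influence blocking abstract above can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the FC_IBM algorithm: the function name `simulate_mcicm`, the single shared activation probability `p`, and the tie-breaking rule (a node reached by both campaigns in the same step adopts the positive one) are all choices made here for illustration.

```python
import random


def simulate_mcicm(graph, neg_seeds, pos_seeds, p=0.1, rng=None):
    """Simulate one run of a two-campaign independent cascade.

    graph: dict mapping each node to an iterable of its neighbours.
    Returns a dict node -> "N" (negative) or "P" (positive) for every
    node that adopted a campaign; untouched nodes are absent.
    """
    rng = rng or random.Random(0)
    state = {v: "N" for v in neg_seeds}
    # Assumption: a node seeded by both campaigns counts as positive.
    state.update({v: "P" for v in pos_seeds})
    frontier = list(state)
    while frontier:
        attempts = {}
        # Every newly activated node gets one chance per inactive neighbour.
        for u in frontier:
            for v in graph.get(u, ()):
                if v not in state and rng.random() < p:
                    attempts.setdefault(v, set()).add(state[u])
        # Assumption: simultaneous arrival is resolved in favour of "P".
        for v, labels in attempts.items():
            state[v] = "P" if "P" in labels else "N"
        frontier = list(attempts)
    return state


def negative_count(state):
    """Number of nodes that ended up adopting the negative campaign."""
    return sum(1 for label in state.values() if label == "N")


# A 5-node path 0-1-2-3-4 with the rumour starting at node 0.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
unblocked = simulate_mcicm(path, neg_seeds=[0], pos_seeds=[], p=1.0)
blocked = simulate_mcicm(path, neg_seeds=[0], pos_seeds=[4], p=1.0)
```

With `p=1.0` the run is deterministic, which makes the blocking effect easy to see: seeding node 4 positively stops the rumour before it crosses the middle of the path. A seed-selection method for IBM would call such a simulator many times (with `p < 1`) to estimate how many negative adoptions a candidate positive seed set prevents.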
Pub Date: 2019-04-24 DOI: 10.1109/ICWR.2019.8765284
Azam Kargar, S. Emadi
Quality-of-Service (QoS)-aware web service composition is important for integrating individual services with respect to functional and non-functional requirements. Given the large number of candidate services, automating the composition is essential for producing a good combination of services. Although many existing methods offer an optimal solution, most of them have little flexibility. In some cases a component of the composite service fails, so the composition algorithm has to run again to find another optimal solution. In many situations, users also prefer to have several alternative solutions. Providing the top-k service compositions ranked by QoS is therefore more desirable. Because web services are unreliable, and because transactional support during the execution of a service composition is an important design requirement, this research implements a fault management procedure to ensure the transactional execution of service compositions. When a service fails, this procedure undoes the impact of that service by calling the equivalent service. The proposed method addresses three issues: 1) semantic selection of services; 2) QoS-aware web service composition with top-k solutions; and 3) a fault-handling/recovery procedure. In the proposed method, an ontology concept ranking algorithm is used for service selection, and a top-k method is employed to solve service composition. An error handling procedure is then reviewed and designed to ensure the transactional execution of the service composition. The evaluation results show that the proposed method not only finds optimal solutions but can also provide alternative solutions with optimal QoS.
Title: Fault Tolerance in Automatic Semantic Web Service Composition based on QoS-awareness Using BTSC-DFS Algorithm
Published in: 2019 5th International Conference on Web Research (ICWR), pp. 50-54
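The value of keeping top-k compositions, as argued in the service composition abstract above, is that a failure does not force a full re-run. The sketch below is illustrative only and does not reproduce the BTSC-DFS algorithm: it enumerates plans by brute force, assumes a sequential workflow whose only QoS attribute is additive response time, and falls back to the next ranked plan instead of performing the compensation step the abstract describes. The names `top_k_compositions` and `first_working_plan` are hypothetical.

```python
import heapq
from itertools import product


def top_k_compositions(candidates, k):
    """Return the k cheapest plans as (total_time, (service, ...)) tuples.

    candidates: one dict per workflow task, mapping a candidate service
    name to its response time. Brute-force enumeration; fine for a sketch,
    exponential in the number of tasks.
    """
    plans = (
        (sum(time for _, time in combo), tuple(name for name, _ in combo))
        for combo in product(*(task.items() for task in candidates))
    )
    return heapq.nsmallest(k, plans)


def first_working_plan(ranked_plans, failed):
    """Pick the best-ranked plan that uses no failed service, else None."""
    for cost, plan in ranked_plans:
        if not any(service in failed for service in plan):
            return cost, plan
    return None


tasks = [{"a1": 3, "a2": 5}, {"b1": 2, "b2": 1}]
ranked = top_k_compositions(tasks, k=3)
fallback = first_working_plan(ranked, failed={"a1"})
```

Here `ranked` holds the three fastest plans, and when service `a1` fails the already computed list still contains a usable alternative, so no new search is needed.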
Pub Date: 2019-04-24 DOI: 10.1109/ICWR.2019.8765264
Fatemeh Sedighipour Chafjiri, Mohamad Mehdi Esnaashari Esfahani
The number of devices using the Internet is increasing every day, which makes improving Internet of Things (IoT) protocols more necessary than ever. Data protection and privacy are among the key challenges in IoT technology. Dangers involved in the centralized technology of the Blockchain system have led to the idea of using the Tangle, which is a decentralized system. The main purpose of this new technology is to address the problems and limitations of Blockchain, such as the high cost and long time needed to confirm a transaction. In this architecture, every node is involved in maintaining network security: when a transaction is created, it must select and confirm two unconfirmed transactions issued before it. A walking algorithm is needed for this selection. The walking algorithms presented in the literature so far are either weighted or unweighted. An unweighted random walk algorithm can approve transactions nearly in proportion to their arrival times, whereas a weighted algorithm can better defend against lazy and malicious transactions. In this paper, a new random walk algorithm is presented that offers the benefits of both at the same time. The idea is to adapt the weight value to the current situation of transactions. Numerical results show that the proposed algorithm is superior to the existing algorithms in balancing the timeliness of transaction approval against protection from malicious activities.
Title: An Adaptive Random Walk Algorithm for Selecting Tips in the Tangle
Published in: 2019 5th International Conference on Web Research (ICWR), pp. 161-166
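The difference between the unweighted and weighted walks mentioned in the Tangle abstract above comes down to a single bias parameter. The sketch below assumes the exponential weighting commonly described for Tangle tip selection, where the chance of stepping to an approver grows with its cumulative weight; the function name, the parameter name `alpha`, and the example weights are illustrative. The adaptive rule proposed in the paper is not given in the abstract and is not shown here.

```python
import math


def transition_probabilities(approver_weights, alpha):
    """Probability of stepping from a transaction to each of its approvers.

    approver_weights: dict mapping approver id -> cumulative weight.
    alpha = 0 gives the unweighted (uniform) walk; larger alpha biases
    the walk towards heavy approvers.
    """
    heaviest = max(approver_weights.values())
    # Subtracting the maximum keeps exp() from overflowing for large weights.
    scores = {
        tx: math.exp(alpha * (weight - heaviest))
        for tx, weight in approver_weights.items()
    }
    total = sum(scores.values())
    return {tx: score / total for tx, score in scores.items()}


# One well-supported approver and one "lazy" approver with little weight.
weights = {"honest": 10, "lazy": 1}
uniform = transition_probabilities(weights, alpha=0.0)
biased = transition_probabilities(weights, alpha=1.0)
```

With `alpha=0.0` both approvers are equally likely, which treats transactions evenly but lets lazy ones get approved; with `alpha=1.0` almost all probability goes to the heavy approver. An adaptive scheme would vary `alpha` between these regimes.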
Pub Date: 2019-04-24 DOI: 10.1109/ICWR.2019.8765286
L. Gholamhosseini, F. Sadoughi, Hossein Ahmadi, A. Safaei
Based on recent developments in IoT technologies, a large number of innovative healthcare applications should now be created. Health Internet of Things (HIoT) is a novel technology used to connect many diverse medical sensors to IoT devices. HIoT facilitates remote data collection and monitoring in healthcare. This paper is a review study that addresses the impact of HIoT on healthcare applications. In this regard, we investigate the most important technologies used in HIoT and describe the main strengths, weaknesses, opportunities, and threats of HIoT-based applications. Based on the results, the potential strengths and opportunities of HIoT technology are varied and widespread; however, it has some significant weaknesses and threats, including the complexity of handling a huge number of heterogeneous objects and of achieving scalability, reliability, efficiency, availability, security, and interoperability with IoT systems across healthcare applications.
Title: Health Internet of Things: Strengths, Weakness, Opportunity, and Threats
Published in: 2019 5th International Conference on Web Research (ICWR), pp. 287-296
Pub Date: 2019-04-24 DOI: 10.1109/ICWR.2019.8765288
A. Mansouri, F. Taghiyareh, J. Hatami
Opinion formation models describe the opinion dynamics of interacting people. Social media use is growing drastically, and social media have become one of the most critical channels for interaction between people. According to psychological research, a person's emotion diffuses across the people they interact with. Furthermore, emotion affects people's opinions. Emotion contagion also happens through social media, where users' posts affect their readers. Emotion is therefore an essential element of opinion formation in a social network, yet it has attracted little attention in opinion formation models. In this paper, we show how considering emotion in an opinion formation model for online social networks improves the model. We used a dataset containing debates from the CreateDebate.com website. Two classifiers, with and without emotions, were implemented based on the social impact model of opinion formation to predict the stance of each user's next post, and the predictions were compared with the dataset. The experimental results lead us to conclude that considering emotions improves the accuracy and precision of the social impact model of opinion formation in social media.
Title: Improving Opinion Formation Models on Social Media Through Emotions
Published in: 2019 5th International Conference on Web Research (ICWR), pp. 6-11
Pub Date: 2019-04-01 DOI: 10.1109/ICWR.2019.8765292
F. Nejati, H. Sajedi, M. Mohammadi
Image authentication is an important technique for a large number of multimedia applications. When a digital image is sent over a non-secure channel such as the Internet, it may be changed and manipulated. For some important images, such as military and medical images, these manipulations are very harmful, and such images should be protected against them. There are several ways, such as fragile and semi-fragile watermarking, to authenticate images against malicious attacks. This paper presents a fragile watermarking algorithm for image authentication using QR factorization and the Fourier transform (FT). Applying the Fourier transform to the host image yields a frequency-domain representation, which preserves visual quality in watermarking. The transformed host image is then factorized by QR decomposition. QR factorization is also applied to the watermark image. After both images are factorized, a coefficient of the upper triangular matrix R of the watermark image is embedded into the upper triangular matrix R of the host image, so that a signature of the watermark image is hidden in the host image. Because the method is fragile, it is sensitive to even slight attacks: if an image is attacked while being transmitted over the Internet, the watermark image cannot be extracted, which reveals that the image has been changed. The experimental results show that the method is sensitive to even weak attacks and that the extraction stage cannot recover the watermark image from an attacked image.
Title: Fragile Watermarking for Image Authentication Using QR factorization and Fourier Transform
Published in: 2019 5th International Conference on Web Research (ICWR), pp. 45-49
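The embedding pipeline described in the watermarking abstract above can be written out step by step. This is a sketch only: the abstract does not state the embedding rule, so the additive form with a strength coefficient $\alpha$ is an assumption, as are the symbols $H$ (host image), $W$ (watermark image, assumed to have the same dimensions as $H$), and $F$ (transformed host).

```latex
\begin{aligned}
F    &= \mathcal{F}(H)                 && \text{Fourier transform of the host image}\\
F    &= Q_H R_H                        && \text{QR factorization of the transformed host}\\
W    &= Q_W R_W                        && \text{QR factorization of the watermark image}\\
R'_H &= R_H + \alpha R_W               && \text{assumed additive embedding with strength } \alpha\\
H'   &= \mathcal{F}^{-1}(Q_H R'_H)     && \text{watermarked image}
\end{aligned}
```

Under the same assumption, extraction from a received image $\tilde{H}$ would compute $\tilde{R}_W = \bigl(Q_H^{*}\,\mathcal{F}(\tilde{H}) - R_H\bigr)/\alpha$ and rebuild $\tilde{W} = Q_W \tilde{R}_W$. Any modification of $\tilde{H}$ perturbs $\mathcal{F}(\tilde{H})$ and therefore $\tilde{R}_W$, which is consistent with the fragility the abstract reports.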
Pub Date: 2019-04-01 DOI: 10.1109/ICWR.2019.8765287
Y. Norouzi, F. Hakimpour
In geographic information science, extracting spatiotemporal information from Web pages, especially unstructured documents, is a growing area of research. Abundant news is published on the Web every hour, and it contains valuable spatiotemporal information for its users. Searching unstructured texts to find events of interest is cumbersome and time-consuming. In this work, we show how to extract the spatiotemporal and semantic entities and relationships represented in cultural event news reports and how to search within that information. Natural Language Processing (NLP) and automatic ontology population are tightly coupled, and together they make it possible to represent Web documents semantically, so that machines can comprehend the documents and, as a result, users can find the information they want with ease. A spatiotemporal semantic search engine enables us to answer where and when an event will take place.
Title: A Spatiotemporal Semantic Search Engine For Cultural Events
Published in: 2019 5th International Conference on Web Research (ICWR), pp. 117-122
Pub Date: 2019-04-01 DOI: 10.1109/ICWR.2019.8765266
Zahra Nazari, A. Kamandi, M. Shabankhah
A major share of Internet users are devices that are connected to each other over the Internet and exchange data with Internet brokers to receive requested services. Managing and responding well to IoT requests requires maximum processing power, fast data transfer, and proper service composition in minimum time. The sheer number of IoT devices has led solutions in this area to rely on the capabilities and facilities of the cloud environment. Hence, service composition in cloud environments has recently received attention. In this research, we present an algorithm that aims to improve the factors raised in the service composition problem, such as the number of clouds involved in providing services, the number of services examined before fulfilling user requests, and the load balance between clouds. We use a similarity measure to find the most suitable cloud and composition plan in each phase. In addition to improving the QoS metrics proposed in previous papers, this improves load balancing between clouds, prevents bottlenecks from forming at cloud entry points, decreases the probability of temporary failure of any cloud, and consequently increases user satisfaction.
Title: An Optimal Service Composition Algorithm in Multi-Cloud Environment
Published in: 2019 5th International Conference on Web Research (ICWR), pp. 141-151
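The idea in the multi-cloud abstract above, choosing a cloud in each phase by how similar its offering is to what is still needed, can be sketched with a simple greedy loop. The abstract does not define its similarity measure, so Jaccard similarity is used here purely as a stand-in, and the names `jaccard` and `greedy_cloud_plan` as well as the example clouds are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two collections treated as sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def greedy_cloud_plan(request, clouds):
    """Cover the requested services phase by phase.

    request: iterable of required service names.
    clouds: dict mapping cloud name -> iterable of services it offers.
    In each phase the cloud most similar to the remaining request is
    chosen (ties broken by cloud name) and serves everything it can.
    """
    remaining = set(request)
    plan = []
    while remaining:
        best = max(sorted(clouds), key=lambda c: jaccard(remaining, clouds[c]))
        covered = remaining & set(clouds[best])
        if not covered:
            raise ValueError(f"no cloud offers: {sorted(remaining)}")
        plan.append((best, sorted(covered)))
        remaining -= covered
    return plan


clouds = {
    "A": ["s1", "s2", "s4", "s5", "s6"],
    "B": ["s1", "s2"],
    "C": ["s3"],
}
plan = greedy_cloud_plan(["s1", "s2", "s3"], clouds)
```

Cloud `A` and cloud `B` both cover two requested services, but the similarity score prefers `B` because it carries no unrelated services; a scoring of this kind is one way to keep large general-purpose clouds from attracting every request.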
Pub Date: 2019-04-01 DOI: 10.1109/ICWR.2019.8765267
Delaram Javdani, H. Rahmani, Milad Allahgholi, Fatemeh Karimkhani
Entity resolution refers to the process of identifying and integrating records that belong to unique entities. Standard methods use rule-based or machine learning models to compare a pair of records and assign it a score indicating whether the pair matches. However, comparing all pairs of records exhaustively leads to quadratic matching complexity. Blocking methods are therefore used before matching to group records of the same entity into small blocks; the matching operation is then performed exhaustively within each block. Several blocking methods have been proposed to partition the input data efficiently into manageable groups, including token blocking, which places records that share a token in the same block. Most previous methods did not take any semantic criteria into account. In this paper, we propose a new method, called DeepBlock, that uses deep learning for the blocking task in entity resolution. DeepBlock combines syntactic and semantic similarities to calculate the similarity between records. We evaluated DeepBlock on a real-world dataset and compared it with an existing blocking technique (token blocking). Our experimental results show that combining semantic and syntactic similarity can considerably improve the quality of blocking: DeepBlock significantly outperforms token blocking with respect to the pair quality (PQ) measure.
Title: DeepBlock: A Novel Blocking Approach for Entity Resolution using Deep Learning
Published in: 2019 5th International Conference on Web Research (ICWR), pp. 41-44
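Token blocking, the baseline named in the entity resolution abstract above, and the pair quality measure are both short enough to sketch directly. The code below illustrates that baseline only, not DeepBlock; whitespace tokenization, lowercasing, and the toy records are assumptions made for the example. Pair quality is taken here in its usual sense: the fraction of candidate pairs that are true matches.

```python
from collections import defaultdict
from itertools import combinations


def token_blocking(records):
    """Group record ids by shared token.

    records: dict mapping record id -> text.
    Returns token -> set of ids, keeping only blocks with 2+ records.
    """
    blocks = defaultdict(set)
    for rid, text in records.items():
        for token in set(text.lower().split()):
            blocks[token].add(rid)
    return {token: ids for token, ids in blocks.items() if len(ids) > 1}


def candidate_pairs(blocks):
    """All distinct record pairs that co-occur in at least one block."""
    pairs = set()
    for ids in blocks.values():
        pairs.update(combinations(sorted(ids), 2))
    return pairs


def pair_quality(pairs, true_matches):
    """Share of candidate pairs that are real matches (PQ)."""
    return len(pairs & true_matches) / len(pairs) if pairs else 0.0


records = {1: "John Smith", 2: "J. Smith", 3: "John Doe", 4: "Alice Brown"}
blocks = token_blocking(records)
pairs = candidate_pairs(blocks)
pq = pair_quality(pairs, true_matches={(1, 2)})
```

Blocking cuts the six possible pairs down to two, but one of them, records 1 and 3, is proposed only because of the shared token "john". Purely syntactic blocking produces false candidates like this one, and the abstract's combination of syntactic and semantic similarity is aimed at reducing them.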
Pub Date: 2019-04-01 DOI: 10.1109/icwr.2019.8765256
Title: ICWR 2019 Subject Index
Published in: 2019 5th International Conference on Web Research (ICWR)