Pub Date: 2020-06-09, DOI: 10.1142/s1793351x20400024
Nonyelum Ndefo, Enrico Franconi
The problem of determining the relative information capacity between two knowledge bases or schemas, of the same or different models, is inherent when implementing schema transformations. When rest...
Title: A Study on Information-Preserving Schema Transformations. Journal: Int. J. Semantic Comput.
Pub Date: 2020-06-09, DOI: 10.1142/s1793351x2040005x
James Obert, A. Chavez
In recent years, the use of security gateways (SG) located within the electrical grid distribution network has become pervasive. SGs in substations and renewable distributed energy resource aggrega...
Title: Graph Theory and Classifying Security Events in Grid Security Gateways. Journal: Int. J. Semantic Comput.
Pub Date: 2020-06-09, DOI: 10.1142/S1793351X20500026
Prabhakar Gupta, Mayank Sharma
We demonstrate the potential of aligned bilingual word embeddings for developing an unsupervised method to evaluate machine translations without the need for a parallel translation corpus or refe...
Title: Unsupervised Translation Quality Estimation for Digital Entertainment Content Subtitles. Journal: Int. J. Semantic Comput.
Pub Date: 2020-06-09, DOI: 10.1142/s1793351x20400048
Bohui Xia, Hiroyuki Seshime, Xueting Wang, T. Yamasaki
As the online advertisement industry continues to grow, it is predicted that online advertisement will account for about 45% of global advertisement spending by 2020. Thus, predicting the click-th...
Title: Click-Through Rate Prediction of Online Banners Featuring Multimodal Analysis. Journal: Int. J. Semantic Comput.
Pub Date: 2020-06-01, DOI: 10.1142/S1793351X20400097
Sebastian Weigelt, Vanessa Steurer, Tobias Hey, W. Tichy
Systems with conversational interfaces are popular nowadays, but their full potential is not yet exploited. For the time being, users are restricted to calling predefined functions. Soon, users will expect to customize systems to their needs and create their own functions using nothing but spoken instructions. Thus, future systems must understand how laypersons teach new functionality to intelligent systems. Understanding natural language teaching sequences is a first step toward comprehensive end-user programming in natural language. We propose to analyze the semantics of spoken teaching sequences with a hierarchical classification approach. First, we classify whether or not an utterance constitutes an effort to teach a new function. Afterward, a second classifier locates the distinct semantic parts of teaching efforts: declaration of a new function, specification of intermediate steps, and superfluous information. For both tasks we implement a broad range of machine learning techniques: classical approaches, such as Naïve Bayes, and neural network configurations of various types and architectures, such as bidirectional LSTMs. Additionally, we introduce two heuristic-based adaptations that are tailored to the task of understanding teaching sequences. As a data basis, we use 3168 descriptions gathered in a user study. For the first task, convolutional neural networks obtain the best results (accuracy: 96.6%); bidirectional LSTMs excel in the second (accuracy: 98.8%). The adaptations improve the first-level classification considerably (by 2.2 percentage points).
Title: Towards Programming in Natural Language: Learning New Functions from Spoken Utterances. Journal: Int. J. Semantic Comput.
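The two-level classification described in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the keyword rules below are toy stand-ins for the trained classifiers (CNNs, bidirectional LSTMs) that the abstract mentions, and the function names and label identifiers are hypothetical. It only shows how a first-level binary decision gates a second-level labeling of semantic parts.

```python
from typing import List, Optional, Tuple

# Hypothetical label identifiers; the abstract names the three semantic
# parts but not the identifiers used in the original system.
DECLARATION = "declaration"
SPECIFICATION = "specification"
MISCELLANEOUS = "miscellaneous"


def is_teaching_effort(utterance: str) -> bool:
    """First level: toy keyword rule standing in for a trained binary classifier."""
    text = utterance.lower()
    return "you have to" in text or "means" in text


def label_parts(utterance: str) -> List[Tuple[str, str]]:
    """Second level: toy rule standing in for a trained sequence labeler.

    Splits the utterance on commas and assigns one label per clause.
    """
    labels = []
    for clause in utterance.split(","):
        clause = clause.strip()
        if not clause:
            continue
        lowered = clause.lower()
        if "means" in lowered:
            labels.append((clause, DECLARATION))
        elif lowered.startswith(("first", "then", "you have to")):
            labels.append((clause, SPECIFICATION))
        else:
            labels.append((clause, MISCELLANEOUS))
    return labels


def analyze(utterance: str) -> Optional[List[Tuple[str, str]]]:
    """Run the second level only if the first level detects a teaching effort."""
    if not is_teaching_effort(utterance):
        return None
    return label_parts(utterance)
```

Calling `analyze` on an ordinary command returns `None`, while a teaching sequence yields one label per clause. In a real system each stand-in rule would be replaced by a trained model.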
Pub Date: 2020-06-01, DOI: 10.1142/S1793351X20400073
Md. Enamul Haque, Eddie C. Ling, Aminul Islam, M. E. Tozal
Microblog activity logs are useful for determining users’ interest in and sentiment towards specific and broader categories of events, such as natural disasters and national elections. In this paper, we present a corpus model to show how personal attitudes can be predicted from social media or microblog activities for a specific domain of events such as natural disasters. More specifically, given a user’s tweet and an event, the model is used to predict whether the user will be willing to help or show a positive attitude towards that event or similar events in the future. We present a new dataset related to a specific natural disaster event, Hurricane Harvey, that separates users’ tweets into positive and non-positive attitudes. We build Term Embeddings for Tweet (TEmT) to generate features to model personal attitudes for arbitrary users’ tweets. In addition, we present sentiment analysis on the same disaster event dataset using enhanced feature learning on TEmT-generated features by applying a Convolutional Neural Network (CNN). Finally, we evaluate the effectiveness of our method by employing multiple classification techniques and comparative methods on the newly created dataset.
Title: Predicting Domain Specific Personal Attitudes and Sentiment. Journal: Int. J. Semantic Comput.
Pub Date: 2020-06-01, DOI: 10.1142/S1793351X20400103
Yingcheng Sun, R. Kolacinski, K. Loparo
With the explosive growth of online discussions published every day on social media platforms, comprehension and discovery of the most popular topics have become a challenging problem. Conventional topic models have had limited success in online discussions because the corpus is extremely sparse and noisy. To overcome their limitations, we use the tree structure of discussion threads and propose a “popularity” metric, which quantifies the number of replies to a comment, to extend the frequency of word occurrences, and a “transitivity” concept to characterize topic dependency among nodes in a nested discussion thread. We build a Conversational Structure Aware Topic Model (CSATM) based on popularity and transitivity to infer topics and their assignments to comments. Experiments on real forum datasets demonstrate improved performance for topic extraction on six different coherence measures and high accuracy for topic assignments.
Title: Transitive Topic Modeling with Conversational Structure Context: Discovering Topics that are Most Popular in Online Discussions. Journal: Int. J. Semantic Comput.
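As a rough illustration of the “popularity” idea in the abstract above, the sketch below scales each comment's word counts by its number of replies. The exact formula used in CSATM is not given in the abstract, so the weighting `1 + direct replies`, the tuple layout `(comment_id, parent_id, text)`, and the function names are all assumptions made for this example.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple

# Hypothetical comment representation: (comment_id, parent_id, text).
Comment = Tuple[int, Optional[int], str]


def reply_counts(comments: List[Comment]) -> Counter:
    """Count direct replies per comment (assumption: only direct replies count)."""
    counts: Counter = Counter()
    for _, parent_id, _ in comments:
        if parent_id is not None:
            counts[parent_id] += 1
    return counts


def weighted_word_counts(comments: List[Comment]) -> Dict[int, Counter]:
    """Scale each comment's word frequencies by its popularity weight.

    Assumed weight: 1 + number of direct replies, so a comment without
    replies keeps its plain word counts.
    """
    replies = reply_counts(comments)
    weighted: Dict[int, Counter] = {}
    for comment_id, _, text in comments:
        weight = 1 + replies[comment_id]
        words = Counter(text.lower().split())
        weighted[comment_id] = Counter({w: c * weight for w, c in words.items()})
    return weighted
```

The weighted counts could then feed a standard topic model in place of raw counts, which is one plausible way to let heavily discussed comments contribute more evidence in a sparse corpus.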
Pub Date: 2020-06-01, DOI: 10.1142/S1793351X20400085
Mehrdad Alizadeh, Barbara Maria Di Eugenio
Visual Question Answering (VQA) concerns providing answers to Natural Language questions about images. Several deep neural network approaches have been proposed to model the task in an end-to-end fashion. Whereas the task is grounded in visual processing, if the question focuses on events described by verbs, the language understanding component becomes crucial. Our hypothesis is that models should be aware of verb semantics, as expressed via semantic role labels, argument types, and/or frame elements. Unfortunately, no VQA dataset exists that includes verb semantic information. Our first contribution is a new VQA dataset (imSituVQA) that we built by taking advantage of the imSitu annotations. The imSitu dataset consists of images manually labeled with semantic frame elements, mostly taken from FrameNet. Second, we propose a multi-task CNN-LSTM VQA model that learns to classify the answers as well as the semantic frame elements. Our experiments show that semantic frame element classification helps the VQA system avoid inconsistent responses and improves performance. Third, we employ an automatic semantic role labeler and annotate a subset of the VQA dataset (VQAsub). This way, the proposed multi-task CNN-LSTM VQA model can be trained with the VQAsub as well. The results show a slight improvement over the single-task CNN-LSTM model.
Title: Incorporating Verb Semantic Information in Visual Question Answering Through Multitask Learning Paradigm. Journal: Int. J. Semantic Comput.
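The multi-task setup in the abstract above, where one model classifies both the answer and the semantic frame elements, is commonly trained with a weighted sum of per-task losses. The sketch below shows that generic pattern in plain Python. It is not the authors' model: the abstract does not state how the two objectives are combined, so the additive form and the `frame_weight` parameter are assumptions.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits: List[float], target: int) -> float:
    """Negative log-probability of the target class."""
    return -math.log(softmax(logits)[target])


def multitask_loss(
    answer_logits: List[float],
    answer_target: int,
    frame_logits: List[float],
    frame_target: int,
    frame_weight: float = 0.5,  # hypothetical weighting, not from the paper
) -> float:
    """Joint loss: answer classification plus weighted frame-element classification."""
    answer_loss = cross_entropy(answer_logits, answer_target)
    frame_loss = cross_entropy(frame_logits, frame_target)
    return answer_loss + frame_weight * frame_loss
```

Because both heads would share the image and question encoders, gradients from the frame-element loss also shape the shared representation, which is the usual motivation for this kind of auxiliary task.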
Pub Date: 2020-03-01, DOI: 10.1142/s1793351x20400012
D. Lembo, Federico Maria Scafoglieri
Information Extraction (IE) is the task of automatically organizing data extracted from free-text documents into a structured form. In several contexts, it is desirable that the extracted data are then organized according to an ontology, which provides a formal and conceptual representation of the domain of interest. Ontologies allow for a better interpretation of the data, as well as for their semantic integration with other information, as in Ontology-based Data Access (OBDA), a popular declarative framework for data management where an ontology is connected to a data layer through mappings. However, the data layer considered so far in OBDA has consisted essentially of relational databases, and how to declaratively couple an ontology with unstructured data sources is still unexplored. By leveraging the recent study on document spanners for rule-based IE by Fagin et al., in this paper we propose a new framework that allows text documents to be mapped to ontologies, in the spirit of OBDA. We investigate the problem of answering conjunctive queries in this framework. For ontologies specified in the Description Logics [Formula: see text] and [Formula: see text], we show that the problem is polynomial in the size of the underlying documents. We also provide algorithms to solve query answering by rewriting the input query on the basis of the ontology and its mapping toward the source documents. Through these techniques, we pursue a virtual approach, similar to that typically adopted in OBDA, which allows us to answer a query without having to first populate the entire ontology. Interestingly, for [Formula: see text], both the spanners used in the mapping and the one computed by the rewriting algorithm belong to the same class of expressiveness. This also holds for [Formula: see text], modulo some limitations on the form of the mapping. These results indicate that in these cases our framework can be easily implemented by decoupling ontology management and document access, which can be delegated to an external IE system able to process the extraction rules we use in the mapping.
Title: Ontology-based Document Spanning Systems for Information Extraction. Journal: Int. J. Semantic Comput.
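To make the notion of mapping text to an ontology through a document spanner more concrete, the sketch below treats a regular expression with named capture groups as a simple spanner: it maps a document to a relation of spans, and a mapping step turns each row into an ontology-style assertion. The extraction rule, the `worksFor` role name, and the function names are hypothetical and do not come from the paper. Note also that `re.finditer` returns only non-overlapping matches, whereas the formal spanners of Fagin et al. consider all matches.

```python
import re
from typing import Dict, List, Tuple

Span = Tuple[int, int]  # (start, end) character offsets into the document

# Hypothetical extraction rule with two capture variables.
WORKS_FOR_RULE = re.compile(
    r"(?P<person>[A-Z][a-z]+) works for (?P<org>[A-Z][a-z]+)"
)


def spanner(document: str) -> List[Dict[str, Span]]:
    """Map a document to a relation whose rows assign a span to each variable."""
    return [
        {name: match.span(name) for name in ("person", "org")}
        for match in WORKS_FOR_RULE.finditer(document)
    ]


def to_assertions(document: str) -> List[Tuple[str, str, str]]:
    """Mapping step: turn each span tuple into a role assertion.

    'worksFor' is an assumed ontology role used only for illustration.
    """
    assertions = []
    for row in spanner(document):
        person = document[slice(*row["person"])]
        org = document[slice(*row["org"])]
        assertions.append(("worksFor", person, org))
    return assertions
```

In the virtual approach the abstract describes, such assertions would not be materialized up front. Instead, a query over the ontology would be rewritten into extraction rules that are evaluated against the documents on demand.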