Parallel corpora are a valuable resource for important natural language processing applications such as statistical machine translation, dictionary construction, and cross-language information retrieval. The Web is a huge knowledge resource, and part of it is bilingual content spread across various kinds of web pages. Building parallel corpora from this Internet resource currently attracts many studies. However, obtaining a parallel corpus with high accuracy is still a challenge. This paper focuses on extracting parallel texts from bilingual websites for the English-Vietnamese language pair. We first propose a new way of designing content-based features, and then combine them with structural features in a machine learning framework. In our experiment, we obtain a precision of 88.2% for the extracted parallel texts.
{"title":"Extracting Parallel Texts from the Web","authors":"Le Quang Hung, L. Cuong","doi":"10.1109/KSE.2010.14","DOIUrl":"https://doi.org/10.1109/KSE.2010.14","url":null,"abstract":"Parallel corpus is the valuable resource for some important applications of natural language processing such as statistical machine translation, dictionary construction, cross-language information retrieval. The Web is a huge resource of knowledge, which partly contains bilingual information in various kinds of web pages. It currently attracts many studies on building parallel corpora based on the Internet resource. However, obtaining a parallel corpus with high accuracy is still a challenge. This paper focuses on extracting parallel texts from bilingual web-sites of the English and Vietnamese language pair. We first propose a new way of designing content-based features, and then combining them with structural features under a framework of machine learning. In the experiment we obtain 88.2% of precision for the extracted parallel texts.","PeriodicalId":158823,"journal":{"name":"2010 Second International Conference on Knowledge and Systems Engineering","volume":"189 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132504784","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Recently, several algorithms have been proposed to tackle different conservation questions under phylogenetic diversity. Such questions are variants of the more general problem of budgeted reserve selection under split diversity, an NP-hard problem. Here, we present a novel framework, Split Diversity Algorithm* (SDA*), to unify all these attempts. More specifically, SDA* transforms the budgeted reserve selection problem into a binary linear programming (BLP) problem that can be solved by available linear optimization techniques. SDA* is guaranteed to find optimal solutions in reasonable time.
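To make the budgeted reserve selection problem concrete, here is a minimal Python sketch. The abstract does not give the BLP formulation, so everything below is an assumption for illustration: a split counts toward diversity only when selected taxa appear on both of its sides, each taxon has a cost, and the total cost must stay within the budget. The names `split_diversity`, `best_reserve`, and the toy data are hypothetical. The sketch enumerates all binary selection vectors by brute force, which is exponential and is not SDA* itself; it only shows the objective and constraint that a BLP solver would optimize.

```python
from itertools import product


def split_diversity(selected, splits):
    """Sum the weights of splits with at least one selected taxon on each side."""
    total = 0.0
    for side_a, side_b, weight in splits:
        if selected & side_a and selected & side_b:
            total += weight
    return total


def best_reserve(taxa, costs, splits, budget):
    """Try every 0/1 selection vector and keep the best one within the budget."""
    best_set, best_value = frozenset(), 0.0
    for bits in product((0, 1), repeat=len(taxa)):
        chosen = frozenset(t for t, b in zip(taxa, bits) if b)
        if sum(costs[t] for t in chosen) > budget:
            continue  # violates the budget constraint
        value = split_diversity(chosen, splits)
        if value > best_value:
            best_set, best_value = chosen, value
    return best_set, best_value


# Hypothetical toy instance: the five splits of the tree ((A,B),(C,D)).
taxa = ["A", "B", "C", "D"]
costs = {"A": 1, "B": 1, "C": 2, "D": 2}
splits = [
    (frozenset("A"), frozenset("BCD"), 2),
    (frozenset("B"), frozenset("ACD"), 1),
    (frozenset("C"), frozenset("ABD"), 2),
    (frozenset("D"), frozenset("ABC"), 3),
    (frozenset("AB"), frozenset("CD"), 4),
]

chosen, value = best_reserve(taxa, costs, splits, budget=3)
print(sorted(chosen), value)  # prints ['A', 'D'] 9.0
```

In a BLP version of this sketch, each taxon and each split would get a binary variable, and a split variable would be bounded above by the sum of the taxon variables on each of its sides; that linearization is likewise an assumption, not a detail taken from the abstract.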
{"title":"SDA*: A Simple and Unifying Solution to Recent Bioinformatic Challenges for Conservation Genetics","authors":"B. Minh, S. Klaere, A. von Haeseler","doi":"10.1109/KSE.2010.24","DOIUrl":"https://doi.org/10.1109/KSE.2010.24","url":null,"abstract":"Recently, several algorithms have been proposed to tackle different conservation questions under phylogenetic diversity. Such questions are variants of the more general problem of budgeted reserve selection under split diversity, an NP-hard problem. Here, we present a novel framework, Split Diversity Algorithm* (SDA*), to unify all these attempts. More specifically, SDA* transforms the budgeted reserve selection problem into a binary linear programming(BLP), that can be solved by available linear optimization techniques. SDA* guarantees to find optimal solutions in reasonable time.","PeriodicalId":158823,"journal":{"name":"2010 Second International Conference on Knowledge and Systems Engineering","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131295152","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Near infrared spectroscopy (NIRS) is an effective technique for examining functional brain activity during cognitive tasks, because it measures the concentration changes of oxy-hemoglobin and deoxy-hemoglobin. In NIRS data analysis, accurate estimation of the hemodynamic response function (HRF) is still under investigation. Most existing methods assume that the shape of the HRF is known. This assumption may not be appropriate when the HRF varies from subject to subject or from region to region. In this paper, a deconvolution algorithm to estimate the HRF is presented. The advantage of this method is that no prior hypothesis about the shape of the HRF is required. In addition, to increase the sensitivity of NIRS to functional brain activity, an adaptive filter is designed to remove physiological noise from the NIRS data. To verify the effectiveness of the proposed methods, numerical simulations were performed, and their results are provided herein.
{"title":"Investigation of the Hemodynamic Response in Near Infrared Spectroscopy Data Analysis","authors":"Le Hoa Nguyen, K. Hong","doi":"10.1109/KSE.2010.26","DOIUrl":"https://doi.org/10.1109/KSE.2010.26","url":null,"abstract":"Near infrared spectroscopy (NIRS) is an effective technique for examining functional brain activity during cognitive tasks by enabling the measurement of the concentration changes of oxy-hemoglobin and deoxy-hemoglobin. In NIRS data analysis, accurate estimation of the hemodynamic response function (HRF) is still under investigation. Most existing methods assume that the shape of the HRF to be known. This assumption may not be appropriate when the HRF varies from subject to subject or from region to region. In this paper, a deconvolution algorithm to estimate the HRF is presented. The advantage of this method is no prior hypothesis about the shape of the HRF is required. In addition, in order to increase the sensitivity of NIRS to functional brain activity, an adaptive filter is designed to remove physiological noises from the noisy NIRS data. In order to verify the effectiveness of the proposed methods, numerical simulations were performed, the results of which are provided herein.","PeriodicalId":158823,"journal":{"name":"2010 Second International Conference on Knowledge and Systems Engineering","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128509444","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Many concurrency models have been developed for high-level programming languages such as Java. One trend is towards more flexible concurrency control protocols that go beyond the original Java multi-threading treatment, which is based on a lexically scoped concurrency control mechanism. Two proposals supporting flexible, non-lexical concurrency control are lock handling via the Lock classes in Java 5 and Transactional Featherweight Java (TFJ), an extension of Featherweight Java with transactions. Although the two take quite different approaches to concurrency ("pessimistic" or lock-based vs. "optimistic" or transaction-based), the added flexibility of non-lexical use of the corresponding concurrency operators comes at a similar price: improper usage leads to run-time exceptions and unwanted behavior. This is in contrast with the more disciplined use under a lexically scoped regime, where each entrance to a critical region is syntactically accompanied by a corresponding exit (as, e.g., with traditional synchronized methods or with so-called atomic blocks). To assure safe use of locking and transactions, respectively, in these settings, we present abstractions in the form of two static type and effect systems, which ensure, for instance, that no lock is released by a thread that does not hold it and that no commit is executed outside any transaction. We furthermore compare the two approaches to concurrency control on the basis of these type abstractions.
{"title":"Safe Typing for Transactional vs. Lock-Based Concurrency in Multi-threaded Java","authors":"Thi Mai Thuong Tran, Olaf Owe, M. Steffen","doi":"10.1109/KSE.2010.9","DOIUrl":"https://doi.org/10.1109/KSE.2010.9","url":null,"abstract":"Many concurrency models have been developed for high-level programming languages such as Java. A trend here is towards more flexible concurrency control protocols, going beyond the original Java multi-threading treatment based on lexically-scoped concurrency control mechanism. Two proposals supporting flexible, non-lexical concurrency control are the lock-handling via the Lock-classes in Java5 and Transactional Featherweight Java (TFJ), an extension of Featherweight Java by transactions. Even if these two take quite different approaches towards dealing with concurrency —“pessimistic” or lock-based vs. “optimistic” or based on transactions— the added flexibility of non-lexical use of the corresponding concurrency operators comes at a similar price: improper usage leads to run-time exceptions and unwanted behavior. This is in contrast with the more disciplined use under a lexically scoped regime, where each entrance to a critical region is syntactically accompanied by a corresponding exit (as e.g. with traditional synchronized methods or as with so-called atomic blocks).To assure safe use of locking, resp. transactions in these settings, we present in this paper abstractions in the form of two static type and effect systems, which make sure that for instance, no lock is released by a thread which does not hold it, resp., that no commit is executed outside any transaction. We furthermore compare the two mentioned approaches to concurrency control on the basis of these type abstractions.","PeriodicalId":158823,"journal":{"name":"2010 Second International Conference on Knowledge and Systems Engineering","volume":"308 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131953131","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}