Pub Date: 2002-01-01 DOI: 10.1163/9789004334441_003
E. Csuhaj-Varjú, C. Martín-Vide, V. Mitrana
A parallel communicating finite transducer system is a translating device in which several finite transducers work in parallel, in a synchronized manner, and communicate with each other by requests. The communicated data are the current state of the transducer and the current contents of its output tape. Each computation step in such a system is either a usual translating step or a communication step; communication steps have priority over translating ones. Furthermore, whenever a component requests some data, that data must be communicated. We investigate the computational power of these systems. We then consider systems whose components are restricted to subsequential transducers and compare them with the general ones. These devices have turned out to be useful in computational linguistics. The paper closes with a short discussion of their possible relevance to the theory of discourse parsing and some directions for further work.
Title: Parallel Communicating Finite Transducer Systems. Journal: The Clinician, volume 23 1, pages 9-23.
Pub Date: 2002-01-01 DOI: 10.1163/9789004334441_004
S. V. Delden, F. Gomez
A finite state approach to determining the syntactic roles of commas has already been established for English. Here we extend this approach to Dutch. We identify syntactic differences in comma usage between English and Dutch and show how much effort is needed to extend the finite state approach to Dutch. Once adapted to Dutch, the system is tested across several Dutch sources and results are given.
Title: Extending a Finite State Approach for Parsing Commas in English to Dutch. Journal: The Clinician, volume 56 1, pages 25-38.
Pub Date: 2002-01-01 DOI: 10.1163/9789004334441_006
J. Hammerton
In recent years, a number of models of speech segmentation have been developed, including models based on artificial neural networks (ANNs). The latter involved training a recurrent network to predict the next phoneme or utterance boundary, and deriving a means of predicting word boundaries from its behaviour. Here, a different connectionist approach to the task is investigated employing self-organising maps (SOMs) (Kohonen 1990). SOMs differ from other ANNs in that they are unsupervised learners. The aim is to investigate whether the SOM can become sensitive to where word boundaries occur, when trained on phonetically transcribed speech.
Title: Learning to Segment Speech with Self-Organising Maps. Journal: The Clinician, volume 13 1, pages 51-64.
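To make the unsupervised learning mentioned in this abstract concrete, here is a minimal sketch of a self-organising map trained on toy vectors. It is not the model from the paper: the one-dimensional chain of units, the one-hot "phoneme" inputs, the linear decay schedule, and the names `best_matching_unit` and `train_som` are all assumptions made for illustration.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the index of the unit whose weight vector is closest to x."""
    return min(
        range(len(weights)),
        key=lambda i: sum((w - v) ** 2 for w, v in zip(weights[i], x)),
    )


def train_som(data, n_units=4, epochs=50, lr0=0.5, radius0=2.0, seed=0):
    """Train a one-dimensional SOM (a chain of units) without any labels."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = [[rng.random() for _ in range(dim)] for _ in range(n_units)]
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                    # learning rate decays linearly
        radius = max(radius0 * (1.0 - frac), 0.5)  # neighbourhood shrinks over time
        for x in data:
            bmu = best_matching_unit(weights, x)
            for i in range(n_units):
                # Gaussian neighbourhood measured along the chain of units.
                h = math.exp(-((i - bmu) ** 2) / (2 * radius ** 2))
                weights[i] = [w + lr * h * (v - w) for w, v in zip(weights[i], x)]
    return weights


if __name__ == "__main__":
    # Hypothetical one-hot codes standing in for three phonemes.
    phonemes = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
    som = train_som(phonemes)
    for p in phonemes:
        print(p, "->", best_matching_unit(som, p))
```

No target outputs are supplied at any point; the map organises itself from the input vectors alone, which is the property the abstract contrasts with the supervised recurrent networks.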
Pub Date: 2002-01-01 DOI: 10.1163/9789004334441_011
T. V. Wouden, Ineke Schuurman, M. Schouppe, Heleen Hoekstra
In this paper, we report on quantitative research into certain word order phenomena in Dutch. In our research, we use the Spoken Dutch Corpus (CGN), a major new resource for research into contemporary spoken Dutch. After briefly introducing the primary data, the annotations added, and some of the tools to explore the primary data and the annotations, we illustrate how the Corpus may be utilized to answer certain linguistic questions concerning the Dutch language.
Title: Harvesting Dutch Trees: Syntactic Properties of Spoken Dutch. Journal: The Clinician, volume 36 1, pages 129-141.
Pub Date: 2002-01-01 DOI: 10.1163/9789004334441_012
Menno van Zaanen, G. V. Huyssteen
In this paper we describe the development of an improved spelling checker for Afrikaans. We compare two currently available spelling checkers and discuss their shortcomings. The existing applications are restricted in their suggestion capabilities, as well as in their precision and recall, mainly because they cannot treat morphologically complex words correctly. Here, we focus mainly on improvements in precision and recall. The general architecture of the existing spelling checker is discussed and several improvements are implemented. We describe an improved lookup phase and a newly added morphological analysis phase. The morphological analysis poses some problems, which are also treated. Finally, some remaining problems are mentioned.
Title: Improving a Spelling Checker for Afrikaans. Journal: The Clinician, volume 30 1, pages 143-156.
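The two phases named in this abstract, a lexicon lookup followed by a morphological analysis for complex words, can be illustrated with a small sketch. The abstract does not say how the analysis works, so the plain compound splitting below is an assumed stand-in, and the toy lexicon, the `min_part` threshold, and the function name `is_known` are hypothetical.

```python
def is_known(word, lexicon, min_part=3):
    """Accept a word if it is in the lexicon or splits into known parts.

    Phase 1 is a plain lexicon lookup. Phase 2 is a naive compound
    analysis (an assumption, not the method from the paper): try every
    split point and accept the word if the left part is a lexicon entry
    and the remainder is itself known.
    """
    if word in lexicon:
        return True
    for i in range(min_part, len(word) - min_part + 1):
        if word[:i] in lexicon and is_known(word[i:], lexicon, min_part):
            return True
    return False


if __name__ == "__main__":
    # Toy lexicon; a real checker would load a full word list.
    lexicon = {"boek", "winkel", "rak"}
    for candidate in ["boek", "boekwinkel", "boekrak", "boekwinkl"]:
        status = "ok" if is_known(candidate, lexicon) else "flagged"
        print(candidate, status)
```

Accepting every concatenation of lexicon entries raises recall on valid compounds but can also let misspellings through when they happen to split into real words, so a design like this trades off exactly the precision and recall the abstract is concerned with.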
Pub Date: 2002-01-01 DOI: 10.1163/9789004334441_008
S. Konstantopoulos
This paper reports on the application of Inductive Logic Programming (ILP) to the task of BaseNP chunking. After ILP and NP Chunking are discussed, the experimental setup for using ILP to construct a BaseNP tagger in Prolog is described. Finally, the results are analysed quantitatively as well as qualitatively.
Title: BaseNP Chunking using ILP. Journal: The Clinician, volume 242 1, pages 77-91.
Pub Date: 2002-01-01 DOI: 10.1163/9789004334441_009
K. Spranger, U. Heid
We have developed a fully automatic recursive chunker for unrestricted Dutch text to be used as a basis for the extraction of linguistic and terminological information. The chunker is based on the approach adopted for the analysis of German in the YAC-chunker. Our tool builds up flat annotations of (maximal) syntactic constituents, using a multi-pass algorithm. We describe the chunking procedure and the coverage of the chunker with examples, e.g. PPs/NPs with prenominal modification, such as "tegen de uit ioniserende stralingen voortspruitende gevaren" or "de te fuseren vennootschappen". We also illustrate its use in term candidate extraction from about 20 million words of social security documents from Flanders.
Title: A Dutch Chunker as a Basis for the Extraction of Linguistic Knowledge. Journal: The Clinician, volume 1999 1, pages 93-109.
Pub Date: 2002-01-01 DOI: 10.1163/9789004334441_010
F. V. Eynde
For the treatment of agreement in Dutch NPs I adopt a distinction, familiar from HPSG, between morphosyntactic agreement and index agreement. In order to determine their respective roles, I make another distinction, proposed in Van Eynde (2003), between marked and unmarked nominals. Employing these distinctions, I will demonstrate that the combination of prenominal adjectives and determiners with unmarked nominals is subject to morphosyntactic agreement, whereas the combination of these prenominals with marked nominals is subject to index agreement.
Title: Morpho-Syntactic Agreement and Index Agreement in Dutch NPs. Journal: The Clinician, volume 193 1, pages 111-127.
Pub Date: 2002-01-01 DOI: 10.1163/9789004334441_007
Christer Johansson
Paradigmatic gaps are a problem for computational models of language acquisition, as most models that generalize online (eager learners, such as rule-based learning and neural networks) will not notice systematically missing input. This is mainly a problem for the plausibility of the model, since the missing forms and structures will not degrade recognition performance (because they will not be found often enough to matter). We are looking not only for a descriptive model of paradigmatic gaps, but also for an explanatory model of why they emerge. The use for computational linguistics is that we can show how a linguistically motivated feature makes it possible to notice a negative regularity (i.e. that forms are missing), and this suggests that a hypothesis-driven approach may be combined with statistical techniques (e.g. a memory-based learner) in interesting ways.
Title: How is Grammatical Gender Processed? Journal: The Clinician, volume 84 1, pages 65-76.
Pub Date: 2002-01-01 DOI: 10.1163/9789004334441_002
H. B. Corstius
Title: De desillusie van mijn leven of Remember November. Journal: The Clinician, volume 11 1, pages 1-7.