
LDV Forum: Latest Publications

Domain ontologies and wordnets in OWL: Modelling options
Pub Date : 2007-07-01 DOI: 10.21248/jlcl.22.2007.92
H. Lüngen, Angelika Storrer
Word nets are lexical reference systems that follow the design principles of the Princeton WordNet project (Fellbaum 1998, henceforth referred to as PWN). Domain ontologies (or domain-specific ontologies, e.g. GOLD or the GENE Ontology) represent knowledge about a specific domain in a format that supports automated reasoning about the objects in that domain and the relations between them (cf. Erdmann 2001, 78). Word nets have been used in various applications of text processing, e.g. discourse parsing, lexical and thematic chaining, cohesion analyses, automatic segmentation and linking, anaphora resolution, and information extraction. When these applications process documents dealing with a specific domain, one needs to combine knowledge about the domain-specific vocabulary represented in domain ontologies with lexical repositories representing general vocabulary (like PWN). In this context, it is useful to represent and interrelate the entities and relations in both types of resources using a common representation language. In our research group “Text-technological Information Modelling” we chose OWL as a common format for this purpose. Since our projects are mainly concerned with German documents, we developed an OWL model that relates the German wordnet GermaNet (henceforth referred to as GN) with domain-specific ontologies in an approach that was inspired by the Plug-In model proposed in Magnini/Speranza (2002). Our approach is described in Kunze et al. (to appear); it was evaluated using representative subsets of GN and of the domain ontology TermNet (henceforth referred to as TN) as data and Protégé
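As a rough illustration of the plug-in idea sketched in this abstract, the following rdflib snippet (Python) shows how a domain concept might be attached to a general-language synset in OWL. The namespaces, class names and the attachedToSynset property are invented for illustration and are not the model actually used by the authors.

```python
# A minimal rdflib sketch (hypothetical namespaces and property names) of attaching
# a domain-ontology concept to a general-language wordnet synset in OWL.
from rdflib import Graph, Literal, Namespace, RDF, RDFS, OWL

GN = Namespace("http://example.org/germanet#")   # hypothetical GermaNet namespace
TN = Namespace("http://example.org/termnet#")    # hypothetical TermNet namespace

g = Graph()
g.bind("gn", GN)
g.bind("tn", TN)
g.bind("owl", OWL)

# General-language side: a synset modelled as an individual of a synset class.
g.add((GN.NounSynset, RDF.type, OWL.Class))
g.add((GN.synset_Dokument, RDF.type, GN.NounSynset))
g.add((GN.synset_Dokument, RDFS.label, Literal("Dokument", lang="de")))

# Domain side: a terminological concept from the domain ontology.
g.add((TN.Hypertextdokument, RDF.type, OWL.Class))
g.add((TN.Hypertextdokument, RDFS.label, Literal("Hypertextdokument", lang="de")))

# Hypothetical plug-in property linking the domain concept to the general synset.
g.add((TN.attachedToSynset, RDF.type, OWL.ObjectProperty))
g.add((TN.Hypertextdokument, TN.attachedToSynset, GN.synset_Dokument))

# rdflib >= 6 returns a string here; older versions return bytes.
print(g.serialize(format="turtle"))
```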
Citations: 6
Analysis of E-Discussions Using Classifier Induced Semantic Spaces
Pub Date : 2007-07-01 DOI: 10.21248/jlcl.22.2007.87
Edda Leopold, J. Kindermann, G. Paass
We categorise contributions to an e-discussion platform using Classifier Induced Semantic Spaces and Self-Organising Maps. Analysing the contributions delivers insight into the nature of the communication process, makes it more comprehensible and renders the resulting decisions more transparent. Additionally, it can serve as a basis to monitor how the structure of the communication evolves over time. We evaluate our approach on a public e-discussion about an urban planning project, the Berlin Alexanderplatz, Germany. The proposed technique not only produces high-level features relevant to structuring and monitoring computer-mediated communication, but also provides insight into how typical a particular document is for a specific category.
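The core idea of a Classifier Induced Semantic Space can be sketched with off-the-shelf tools: documents are mapped to the vector of per-category classifier scores rather than to raw term weights. The toy contributions, labels and pipeline below are invented; they only illustrate the general technique, not the authors' system.

```python
# A minimal sketch (toy data) of a classifier induced semantic space: each
# contribution is mapped to the vector of per-category SVM decision values,
# which could then be clustered or fed to a self-organising map.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
from sklearn.multiclass import OneVsRestClassifier

# Toy e-discussion contributions with invented category labels.
docs = [
    "The square should stay open for pedestrians",
    "More green space and trees around the station",
    "Who pays for the proposed high-rise towers",
    "The funding plan for the project is unclear",
    "When will the next public hearing take place",
    "How can citizens submit comments on the plan",
]
labels = ["urban_design", "urban_design", "finance", "finance", "process", "process"]

vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(docs)

clf = OneVsRestClassifier(LinearSVC())
clf.fit(X, labels)

# The per-category decision values span the classifier induced semantic space.
semantic_space = clf.decision_function(X)
for doc, coords in zip(docs, semantic_space):
    print(doc[:40], coords)
```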
Citations: 0
Structural Classifiers of Text Types: Towards a Novel Model of Text Representation
Pub Date : 2007-07-01 DOI: 10.21248/jlcl.22.2007.95
Alexander Mehler, Peter Geibel, O. Pustylnikov
Texts can be distinguished in terms of their content, function, structure or layout (Brinker, 1992; Bateman et al., 2001; Joachims, 2002; Power et al., 2003). These reference points do not necessarily open orthogonal perspectives on text classification. As part of explorative data analysis, text classification aims at automatically dividing sets of textual objects into classes of maximum internal homogeneity and external heterogeneity. This paper deals with classifying texts into text types whose instances serve more or less homogeneous functions. Unlike mainstream approaches, which rely on the vector space model (Sebastiani, 2002) or some of its descendants (Baeza-Yates and Ribeiro-Neto, 1999) and, thus, on content-related lexical features, we refer solely to structural differentiae. That is, we explore patterns of text structure as determinants of class membership. Our starting point are tree-like text representations which induce feature vectors and tree kernels. These kernels are utilized in supervised learning based on cross-validation as a method of model selection (Hastie et al., 2001), using a corpus of press communication as an example. For a subset of categories we show that classification can be performed very well by structural differentiae only.
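A minimal sketch of classification by structural features alone might look as follows; the toy XML documents, the hand-picked features and the linear SVM stand in for the feature vectors and tree kernels used in the paper and are assumptions made purely for illustration.

```python
# A toy sketch of classifying texts by structure only: tree-like representations
# are flattened into simple structural feature vectors and evaluated with a
# cross-validated SVM.
import xml.etree.ElementTree as ET
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

def depth(elem):
    """Depth of an element tree (iterating an Element yields its children)."""
    return 1 + max((depth(child) for child in elem), default=0)

def structural_features(xml_text):
    root = ET.fromstring(xml_text)
    nodes = list(root.iter())
    sections = sum(1 for n in nodes if n.tag == "section")
    paragraphs = sum(1 for n in nodes if n.tag == "p")
    return [len(nodes), depth(root), sections, paragraphs]

# Toy corpus: two invented text types distinguished only by their structure.
docs = [
    ("<doc><section><p/><p/></section></doc>", "press"),
    ("<doc><section><p/></section><section><p/><p/></section></doc>", "press"),
    ("<doc><section><p/><p/><p/></section></doc>", "press"),
    ("<doc><p/><p/><p/><p/></doc>", "commentary"),
    ("<doc><p/><p/><p/></doc>", "commentary"),
    ("<doc><p/><p/></doc>", "commentary"),
]
X = np.array([structural_features(text) for text, _ in docs], dtype=float)
y = [label for _, label in docs]

scores = cross_val_score(SVC(kernel="linear"), X, y, cv=3)
print("cross-validated accuracy:", scores.mean())
```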
Citations: 32
Towards a Logical Description of Trees in Annotation Graphs
Pub Date : 2007-07-01 DOI: 10.21248/jlcl.22.2007.96
J. Michaelis, Uwe Mönnich
Artificial intelligence and computational linguistics have a long history of developing tools to extract semantic knowledge from syntactic information. In particular, from a text-technological point of view, the general research perspective is to extract (semantic) information from annotated documents. Regarding this aim, some of the relevant annotation models used are:
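For readers unfamiliar with annotation graphs, the following toy sketch shows one common way of representing them: nodes anchored to offsets in the primary text and labelled arcs between nodes, from which tree structure can be read off. The field names and the example annotation are invented.

```python
# A simplified annotation-graph sketch: nodes anchor to character offsets in the
# primary text, and labelled arcs between nodes carry the annotations.
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Node:
    node_id: str
    offset: int            # anchor into the primary text

@dataclass
class Arc:
    start: Node
    end: Node
    layer: str             # e.g. "syntax", "discourse"
    label: str             # e.g. "NP", "Elaboration"

@dataclass
class AnnotationGraph:
    text: str
    arcs: list = field(default_factory=list)

    def add(self, start, end, layer, label):
        self.arcs.append(Arc(start, end, layer, label))

    def spans(self, layer):
        """Return (label, covered text) pairs for one annotation layer."""
        return [(a.label, self.text[a.start.offset:a.end.offset])
                for a in self.arcs if a.layer == layer]

text = "Peter reads the report."
n0, n1, n2 = Node("n0", 0), Node("n1", 5), Node("n2", 23)
g = AnnotationGraph(text)
g.add(n0, n1, "syntax", "NP")
g.add(n1, n2, "syntax", "VP")
g.add(n0, n2, "syntax", "S")
print(g.spans("syntax"))
```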
Citations: 7
Manually vs. Automatically Labelled Data in Discourse Relation Classification: Effects of Example and Feature Selection
Pub Date : 2007-07-01 DOI: 10.21248/jlcl.22.2007.86
C. Sporleder
Wordnets are lexical reference systems that follow the design principles of the Princeton WordNet project (Fellbaum, ). Domain ontologies (or domain-specific ontologies such as GOLD, or the GENE Ontology) represent knowledge about a specific domain in a format that supports automated reasoning about the objects in that domain and the relations between them (Erdmann, ). In this paper, we will discuss how the Web Ontology Language OWL can be used to represent and interrelate the entities and relations in both types of resources. Our special focus will be on the question, whether synsets should be modelled as individuals (we use individual and instance as synonyms and will refer to this option as instance model) or as classes (we will refer to this option as class model). We will present three OWL models, each of which offers different solutions to this question. These models were developed in the context of the research group “Text-technological Modelling of Information” as a collaboration of the projects SemDok and HyTex. Since these projects are mainly concerned with German documents and with corpora that contain documents of a special technical or scientific domain, we used subsets of the German wordnet GermaNet (Kunze and Lemnitzer, ), henceforth referred to as GN, and the German domain ontology TermNet (Beiswenger et al., ), henceforth referred to as TN, to develop and evaluate the three models. To relate the general vocabulary of GN with the domain specific terms in TN, we developed an approach that was inspired by the plug-in model proposed by Magnini and Speranza (). In this approach, which has been developed in cooperation with the GermaNet research group (see Kunze et al. () for details), we adapted the OWL model for the English Princeton WordNet suggested by van Assem et al. () to GN, i.e. we modelled German synsets as instances of word-class-specific synset classes. For the reasons explained in section , we wanted to experiment with alternative models that implement the class model. In section  we will present three alternative OWL representations for GN and TN and discuss their benefits and drawbacks.
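The instance model and the class model mentioned in this abstract can be contrasted in a small rdflib sketch; the URIs below are hypothetical and the snippet does not reproduce the actual GermaNet OWL models.

```python
# A minimal rdflib sketch (hypothetical URIs) contrasting the two modelling options:
# a synset as an individual of a word-class-specific synset class (instance model)
# versus the synset itself as an OWL class (class model).
from rdflib import Graph, Namespace, RDF, RDFS, OWL, Literal

GN = Namespace("http://example.org/germanet#")  # hypothetical namespace

g = Graph()
g.bind("gn", GN)
g.bind("owl", OWL)

# Instance model: the synset is an individual typed by gn:NounSynset.
g.add((GN.NounSynset, RDF.type, OWL.Class))
g.add((GN.synset_Hypertext_i, RDF.type, GN.NounSynset))
g.add((GN.synset_Hypertext_i, RDFS.label, Literal("Hypertext", lang="de")))

# Class model: the synset is an OWL class and can enter subclass hierarchies.
g.add((GN.Synset_Hypertext_c, RDF.type, OWL.Class))
g.add((GN.Synset_Dokument_c, RDF.type, OWL.Class))
g.add((GN.Synset_Hypertext_c, RDFS.subClassOf, GN.Synset_Dokument_c))

print(g.serialize(format="turtle"))
```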
Citations: 8
Chatbots in der praktischen Fachlexikographie / Terminologie
Pub Date : 2007-07-01 DOI: 10.21248/jlcl.22.2007.89
Franziskus Geeb
Chat communication, in the sense of an interactive, text-based conversation between Internet users as part of the Internet, is established in a wide range of usage contexts and for the most diverse applications, from marketing to leisure. Besides other Internet users, computers also come into consideration as chat partners, and this form of communication is likewise well known both in business and in private use. The success of a chatbot rests essentially on its ability to conduct a dialogue with its chat partner and to make meaningful statements. As a knowledge base for this communication, recourse to specialised lexicographic / terminological data is conceivable alongside rule-based procedures, not least in domain-specific communication. The present contribution attempts to delimit this problem area and outlines the boundary conditions of a possible implementation.
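A deliberately simple sketch of the idea, a chatbot that answers definition questions from a terminological knowledge base, could look as follows; the rule, the termbase entries and the answer format are invented for illustration.

```python
# A toy rule-based chatbot backed by a terminological resource: definition
# questions are answered by looking up term entries in a small termbase.
import re

# Toy terminological entries: term -> (definition, related terms).
termbase = {
    "hypertext": ("Text organised as a network of linked nodes.", ["node", "link"]),
    "synset": ("A set of synonymous lexical units.", ["lexical unit"]),
}

def reply(utterance: str) -> str:
    # Simple rule: recognise definition questions of the form "what is (a/an) X".
    match = re.search(r"what is (?:a |an )?([\w-]+)", utterance.lower())
    if match:
        term = match.group(1)
        if term in termbase:
            definition, related = termbase[term]
            return f"{term}: {definition} Related terms: {', '.join(related)}."
        return f"Sorry, '{term}' is not in my termbase."
    return "Could you rephrase that as a question about a term?"

print(reply("What is a synset?"))
print(reply("What is hypertext?"))
print(reply("Tell me a joke."))
```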
Citations: 2
UniTerm - Formats and Terminology Exchange
Pub Date : 2006-07-01 DOI: 10.21248/jlcl.21.2006.79
W. Zenk
This article presents UniTerm, a typical representative of terminology management systems (TMS). The first part will highlight common characteristics of TMS and give further insight into the UniTerm entry format and database design. Practice has shown that automatic, i.e. blind, exchange of terminologies is difficult to achieve. The second section gives criteria where the exchange between different TMS can fail and points out the relationship between UniTerm-like TMS data formats and existing terminology standards. Finally, it will be discussed what requirements have to be met in order to enable a deeper integration of terminology standards in a TMS and thus also a smoother transition between different TMS. These requirements are evaluated with Acolada's next-generation TMS UniTerm Enterprise.
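The concept-oriented, three-level entry structure that is typical of TMS data models can be sketched as follows; the field names are simplified and do not reproduce the actual UniTerm entry format.

```python
# A simplified sketch of a concept-oriented terminology entry: a concept entry
# holds language sections, which in turn hold term sections.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Term:
    term: str
    usage: str = ""            # e.g. "preferred", "deprecated"

@dataclass
class LanguageSection:
    lang: str                  # e.g. "de", "en"
    definition: str = ""
    terms: List[Term] = field(default_factory=list)

@dataclass
class ConceptEntry:
    concept_id: str
    subject_field: str
    languages: List[LanguageSection] = field(default_factory=list)

entry = ConceptEntry(
    concept_id="C0042",
    subject_field="text technology",
    languages=[
        LanguageSection("en", "A set of synonymous lexical units.",
                        [Term("synset", "preferred")]),
        LanguageSection("de", "Eine Menge synonymer lexikalischer Einheiten.",
                        [Term("Synset", "preferred")]),
    ],
)
print(entry.concept_id, [t.term for lang in entry.languages for t in lang.terms])
```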
Citations: 1
Lexicon Exchange in MT - The Long Way to Standardization
Pub Date : 2006-07-01 DOI: 10.21248/jlcl.21.2006.80
Stefanie Geldbach
This paper discusses the question to what extent lexicon exchange in MT has been standardized during the last years. The introductory section is followed by a brief description of OLIF2, a format specifically designed for the exchange of terminological and lexicographical data (Section 2). Section 3 contains an overview of the import/export functionalities of five MT systems (Promt Expert 7.0, Systran 5.0 Professional Premium, Translate pro 8.0, LexShop 2.2, OpenLogos). This evaluation shows that despite the standardization efforts of the last years, the exchange of lexicographical data between MT systems is still not a straightforward task.
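To make the export/import round trip concrete, the sketch below passes a single MT lexicon entry through a small XML interchange representation; the element names are simplified placeholders and are not the actual OLIF2 schema.

```python
# A toy export/import round trip for one MT lexicon entry through an XML
# interchange file. Element names are invented placeholders, not OLIF2.
import xml.etree.ElementTree as ET

def export_entry(headword, pos, lang, transfer):
    entry = ET.Element("entry")
    ET.SubElement(entry, "canonicalForm").text = headword
    ET.SubElement(entry, "partOfSpeech").text = pos
    ET.SubElement(entry, "language").text = lang
    ET.SubElement(entry, "transfer").text = transfer
    return ET.tostring(entry, encoding="unicode")

def import_entry(xml_text):
    entry = ET.fromstring(xml_text)
    return {child.tag: child.text for child in entry}

xml_entry = export_entry("Rechner", "noun", "de", "computer")
print(xml_entry)
print(import_entry(xml_entry))
```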
Citations: 0
Terminology Exchange without Loss? Feasibilities and Limitations of Terminology Management Systems (TMS)
Pub Date : 2006-07-01 DOI: 10.21248/jlcl.21.2006.78
Uta Seewald-Heeg
The present article gives an overview of the exchange formats supported by Terminology Management Systems (TMS) available on the market. As translation is one of the oldest application domains for terminology work, most terminology tools analyzed here are components of computer-aided translation (CAT) tools. In big corporates as well as in the localization industry, linguistic data, first of all terminology, have to be shared by different departments using different systems, a situation that can best be solved by standardized formats. The evaluation of seven widely used TMS shows, however, that formats other than the standards proposed by organizations like LISA currently dominate the picture. In many cases, the only way to share data is to pass through flat structured data stored as tab-delimited text files.
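The tab-delimited fallback mentioned at the end of the abstract can be illustrated directly: entries are flattened to one term per row and read back, with any structure beyond the chosen columns lost on the way. Column names are invented.

```python
# A sketch of the lowest-common-denominator exchange path: flattening entries
# into a tab-delimited text and reading them back.
import csv
import io

entries = [
    {"concept_id": "C0042", "lang": "de", "term": "Synset",
     "definition": "Menge synonymer lexikalischer Einheiten"},
    {"concept_id": "C0042", "lang": "en", "term": "synset",
     "definition": "set of synonymous lexical units"},
]

# Export: one row per term; concept structure survives only via the shared id.
buffer = io.StringIO()
writer = csv.DictWriter(buffer,
                        fieldnames=["concept_id", "lang", "term", "definition"],
                        delimiter="\t")
writer.writeheader()
writer.writerows(entries)
tab_delimited = buffer.getvalue()
print(tab_delimited)

# Import: read the flat rows back; structure beyond the columns is lost.
reader = csv.DictReader(io.StringIO(tab_delimited), delimiter="\t")
print([row["term"] for row in reader])
```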
Citations: 2
Flexible Technologies to Visualize and Transform Terminological Representations - Modelling Representations instead of Programming using Smalltalk
Pub Date : 2006-07-01 DOI: 10.21248/jlcl.21.2006.83
Georg Heeg
This paper discusses a software design approach to allow interchange of linguistic data. It focuses on the modelling of the linguistic concepts represented in the data and describes the transfer between exchange formats as a multi-tier interpretation/generation. These concepts are implemented in Smalltalk, a programming environment enabling flexible conversion of data between formats supported by Terminology Management Systems (TMS).
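The multi-tier idea, interpreting a source format into an intermediate object model and generating the target format from that model, can be sketched in a few lines; the paper itself works in Smalltalk, while the illustration below uses Python and invented formats.

```python
# A minimal sketch of multi-tier interpretation/generation: the exchange file is
# first interpreted into an intermediate model of the terminological entry, and
# the target format is generated from that model, not by text-to-text mapping.
from dataclasses import dataclass

@dataclass
class TermEntry:                 # intermediate model shared by all formats
    concept_id: str
    lang: str
    term: str

def interpret_tab(line: str) -> TermEntry:
    """Tier 1: interpret a tab-delimited source line into the model."""
    concept_id, lang, term = line.rstrip("\n").split("\t")
    return TermEntry(concept_id, lang, term)

def generate_xml(entry: TermEntry) -> str:
    """Tier 2: generate a target representation from the model."""
    return (f'<entry id="{entry.concept_id}">'
            f'<term xml:lang="{entry.lang}">{entry.term}</term></entry>')

source_line = "C0042\tde\tSynset"
model = interpret_tab(source_line)
print(generate_xml(model))
```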
Citations: 0