Findings (Sydney (N.S.W.)): Latest Articles

Tracking Parking Search and Occupancy in Zurich
Pub Date : 2023-03-16 DOI: 10.32866/001c.72793
Laurin Maurer, Flurin Roca, Noel Treffinger, Ludwig Henke, Vitus Hofmann, Oliver Leonhartsberger
This article describes how parking search times and occupancy in a highly attractive area in central Zurich (Switzerland) develop on a weekday morning when a farmer’s market takes place on Bürkliplatz, a central market place inside the study area. Individual vehicles were tracked by bike, drivers were interviewed about their parking behaviour and their route was tracked using GPS. Additionally, the parking occupancy was registered every fifteen minutes during a five-hour time period. A connection between market opening hours and parking dynamics within the perimeter was observed. Drivers usually overestimate their parking search duration.
Citations: 0
How does ChatGPT Introduce Transport Problems and Solutions in North America?
Pub Date : 2023-03-06 DOI: 10.32866/001c.72634
Junghwan Kim, Jinhyung Lee
How does ChatGPT introduce transport problems and solutions in North America? By analyzing ChatGPT’s answers to four prompts related to transport issues and solutions in the United States and Canada, our results reveal that ChatGPT’s answers generally align well with transport researchers’ expectations. However, ChatGPT’s capability may be limited in providing trustworthy or sound solutions because of the potential issues (e.g., geographic biases, inaccuracy) in its training data. ChatGPT might be a decent starting point for discussing transport issues and solutions, but one should be aware of its limitations.
Citations: 6
Multigroup Multimodality Index: A Method to Solve the Issue of Transport Mode Classification in Measuring Multimodality
Pub Date : 2023-03-02 DOI: 10.32866/001c.72072
Xingxing Fu, Dea van Lierop, D. Ettema
Recent methods to measure multimodality only consider the diversity and evenness of mode use, while ignoring that the classification of transport modes also matters. This study proposes a multigroup multimodality index to measure the extent of being multimodal at both single mode and mode group levels in a nested manner. The index is compared with the two most commonly used indices, the Herfindahl-Hirschman index and the Shannon Entropy index, to assess its reliability and improvement over existing approaches. Results show that the multigroup multimodality index can simultaneously distinguish the degree of being multimodal at both mode level and group level, which addresses the classification issue in measuring multimodality.
Citations: 0
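
The two baseline indices named in the abstract above, the Herfindahl-Hirschman index and the Shannon entropy index, can be illustrated with a minimal sketch. It assumes the standard textbook definitions applied to mode shares (sum of squared shares, and negative sum of share times natural log of share); the function names and the trip-count input format are hypothetical, and this is not the authors' multigroup index.

```python
import math


def mode_shares(trip_counts):
    """Convert per-mode trip counts (hypothetical input format) into shares that sum to 1."""
    total = sum(trip_counts.values())
    if total <= 0:
        raise ValueError("trip_counts must contain at least one trip")
    return {mode: count / total for mode, count in trip_counts.items()}


def herfindahl_hirschman(shares):
    """Standard HHI: sum of squared shares. 1.0 means a single mode; lower means more multimodal."""
    return sum(s * s for s in shares.values())


def shannon_entropy(shares):
    """Standard Shannon entropy (natural log). 0.0 means a single mode; higher means more multimodal."""
    return -sum(s * math.log(s) for s in shares.values() if s > 0)


# Illustrative trip counts for one traveller (made-up numbers, not study data).
shares = mode_shares({"car": 5, "bike": 5})
hhi = herfindahl_hirschman(shares)
entropy = shannon_entropy(shares)
```

Both indices depend only on the shares, so they cannot tell whether two modes belong to the same group (for example, two kinds of public transport) or to different groups, which is the classification issue the abstract describes.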
Evaluation of an Influencer Campaign on Social Media Targeting Young E-scooter Users
Pub Date : 2023-03-01 DOI: 10.32866/001c.71347
A. Fyhri, V. Milch, Ingunn Ellis, Katrine Karlsen
This study evaluates the effect of an influencer campaign on e-scooter risk behavior among adolescent e-scooter users in Norway. The analysis shows no statistical differences in self-reported risk behaviors (dual riding, riding under the influence and mobile phone use) among respondents who had seen one of the campaign films, compared to respondents who had not seen the films. Neither did the campaign change norms or attitudes. Hence, the campaign did not appear to have intended effects. On the contrary, differences in perceived attitudes, descriptive norms and intentions were found, which could imply a backfire-effect. Respondents who had seen the campaign held poorer attitudes, were more likely to claim that it was normal, and were more inclined to perform some of the risky behaviors.
Citations: 0
Prompt-based Learning for Text Readability Assessment
Pub Date : 2023-02-25 DOI: 10.48550/arXiv.2302.13139
Bruce W. Lee, J. Lee
We propose the novel adaptation of a pre-trained seq2seq model for readability assessment. We prove that a seq2seq model - T5 or BART - can be adapted to discern which text is more difficult from two given texts (pairwise). As an exploratory study to prompt-learn a neural network for text readability in a text-to-text manner, we report useful tips for future work in seq2seq training and ranking-based approach to readability assessment. Specifically, we test nine input-output formats/prefixes and show that they can significantly influence the final model performance. Also, we argue that the combination of text-to-text training and pairwise ranking setup 1) enables leveraging multiple parallel text simplification data for teaching readability and 2) trains a neural model for the general concept of readability (therefore, better cross-domain generalization). At last, we report a 99.6% pairwise classification accuracy on Newsela and a 98.7% for OneStopEnglish, through a joint training approach. Our code is available at github.com/brucewlee/prompt-learning-readability.
Citations: 2
Fairness in Language Models Beyond English: Gaps and Challenges
Pub Date : 2023-02-24 DOI: 10.48550/arXiv.2302.12578
Krithika Ramesh, Sunayana Sitaram, M. Choudhury
With language models becoming increasingly ubiquitous, it has become essential to address their inequitable treatment of diverse demographic groups and factors. Most research on evaluating and mitigating fairness harms has been concentrated on English, while multilingual models and non-English languages have received comparatively little attention. In this paper, we survey different aspects of fairness in languages beyond English and multilingual contexts. This paper presents a survey of fairness in multilingual and non-English contexts, highlighting the shortcomings of current research and the difficulties faced by methods designed for English. We contend that the multitude of diverse cultures and languages across the world makes it infeasible to achieve comprehensive coverage in terms of constructing fairness datasets. Thus, the measurement and mitigation of biases must evolve beyond the current dataset-driven practices that are narrowly focused on specific dimensions and types of biases and, therefore, impossible to scale across languages and cultures.
Citations: 8
Improving User Controlled Table-To-Text Generation Robustness
Pub Date : 2023-02-20 DOI: 10.48550/arXiv.2302.09820
Hanxu Hu, Yunqing Liu, Zhongyi Yu, Laura Perez-Beltrachini
In this work we study user controlled table-to-text generation where users explore the content in a table by selecting cells and reading a natural language description thereof automatically produced by a natural language generator. Such generation models usually learn from carefully selected cell combinations (clean cell selections); however, in practice users may select unexpected, redundant, or incoherent cell combinations (noisy cell selections). In experiments, we find that models perform well on test sets coming from the same distribution as the train data but their performance drops when evaluated on realistic noisy user inputs. We propose a fine-tuning regime with additional user-simulated noisy cell selections. Models fine-tuned with the proposed regime gain 4.85 BLEU points on user noisy test cases and 1.4 on clean test cases; and achieve comparable state-of-the-art performance on the ToTTo dataset.
Citations: 2
Intent Identification and Entity Extraction for Healthcare Queries in Indic Languages
Pub Date : 2023-02-19 DOI: 10.48550/arXiv.2302.09685
Ankan Mullick, Ishani Mondal, Sourjyadip Ray, R. Raghav, G. Chaitanya, Pawan Goyal
Scarcity of data and technological limitations for resource-poor languages in developing countries like India pose a threat to the development of sophisticated NLU systems for healthcare. To assess the current status of various state-of-the-art language models in healthcare, this paper studies the problem by initially proposing two different Healthcare datasets, Indian Healthcare Query Intent-WebMD and 1mg (IHQID-WebMD and IHQID-1mg) and one real world Indian hospital query data in English and multiple Indic languages (Hindi, Bengali, Tamil, Telugu, Marathi and Gujarati) which are annotated with the query intents as well as entities. Our aim is to detect query intents and corresponding entities. We perform extensive experiments on a set of models in various realistic settings and explore two scenarios based on the access to English data only (less costly) and access to target language data (more expensive). We analyze context specific practical relevancy through empirical analysis. The results, expressed in terms of overall F-score, show that our approach is practically useful to identify intents and entities.
Citations: 1
The Role of Semantic Parsing in Understanding Procedural Text
Pub Date : 2023-02-14 DOI: 10.48550/arXiv.2302.06829
Hossein Rajaby Faghihi, Parisa Kordjamshidi, C. Teng, J. Allen
In this paper, we investigate whether symbolic semantic representations, extracted from deep semantic parsers, can help reasoning over the states of involved entities in a procedural text. We consider a deep semantic parser (TRIPS) and semantic role labeling as two sources of semantic parsing knowledge. First, we propose PROPOLIS, a symbolic parsing-based procedural reasoning framework. Second, we integrate semantic parsing information into state-of-the-art neural models to conduct procedural reasoning. Our experiments indicate that explicitly incorporating such semantic knowledge improves procedural understanding. This paper presents new metrics for evaluating procedural reasoning tasks that clarify the challenges and identify differences among neural, symbolic, and integrated models.
Citations: 0
Bag of Tricks for In-Distribution Calibration of Pretrained Transformers
Pub Date : 2023-02-13 DOI: 10.48550/arXiv.2302.06690
Jaeyoung Kim, Dongbin Na, Sungchul Choi, Sungbin Lim
While pre-trained language models (PLMs) have become a de-facto standard promoting the accuracy of text classification tasks, recent studies find that PLMs often predict over-confidently. Although calibration methods have been proposed, such as ensemble learning and data augmentation, most of the methods have been verified in computer vision benchmarks rather than in PLM-based text classification tasks. In this paper, we present an empirical study on confidence calibration for PLMs, addressing three categories, including confidence penalty losses, data augmentations, and ensemble methods. We find that the ensemble model overfitted to the training set shows sub-par calibration performance and also observe that PLMs trained with confidence penalty loss have a trade-off between calibration and accuracy. Building on these observations, we propose the Calibrated PLM (CALL), a combination of calibration techniques. The CALL complements shortcomings that may occur when utilizing a calibration method individually and boosts both classification and calibration accuracy. Design choices in CALL’s training procedures are extensively studied, and we provide a detailed analysis of how calibration techniques affect the calibration performance of PLMs.
Citations: 2
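
To make "over-confident prediction" in the abstract above concrete, here is a minimal sketch of expected calibration error (ECE) with equal-width confidence bins. ECE is a commonly used calibration measure, but the abstract does not say which metric the authors report, so treating it as the relevant one is an assumption; the function name, the bin count, and the toy inputs are hypothetical.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Equal-width-bin ECE: weighted mean of |accuracy - mean confidence| over non-empty bins.

    confidences: predicted probability of the chosen class, each in [0, 1]
    correct:     booleans, True where the prediction matched the label
    """
    n = len(confidences)
    if n == 0 or n != len(correct):
        raise ValueError("inputs must be non-empty and of equal length")

    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # A confidence of exactly 1.0 is placed in the last bin.
        index = min(int(conf * n_bins), n_bins - 1)
        bins[index].append((conf, ok))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(conf for conf, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / n) * abs(accuracy - mean_conf)
    return ece


# Toy example (made-up values): 90% confidence but only 75% accuracy.
ece = expected_calibration_error([0.9, 0.9, 0.9, 0.9], [True, True, True, False])
```

In this toy case the gap between stated confidence and observed accuracy is about 0.15, which is the kind of mismatch that calibration techniques aim to reduce.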