
Findings (Sydney (N.S.W.)): Latest Articles

Examining the Use and Non-use of Special Transport Services in Sweden’s Large City-Regions: The Last Resort?
Pub Date : 2022-12-05 DOI: 10.32866/001c.49873
Jean Ryan, M. Zingmark
This study examines the extent of the gap between the proportions of survey respondents reporting (1) having the possibility to use and (2) using special transport services (STS) compared to the corresponding gaps for other transport modes. For persons eligible for STS, differences between those who use them and those who do not use them are explored. The frequencies with which these two groups leave the home are then compared. Those aged 65-69, those with higher self-rated health and those cohabiting were less likely to use STS, despite being eligible. Those using STS tend to leave the home less often.
Citations: 0
Effect of Social Vulnerability on Taxi Trip Times during Hurricane Sandy
Pub Date : 2022-11-21 DOI: 10.32866/001c.53070
Avipsa Roy, B. Kar
The increase in the availability of GPS-based movement data has enabled the exploration of mobility patterns in urban transportation networks. Understanding the relationship between social vulnerability and transportation flows from big data during natural disasters is crucial for utilities and policymakers for decision-making purposes, such as evacuation and restoration planning. In this study, we use GPS trajectory data to explore the geographic variation of changes in taxi trip times in New York City (NYC) before and after Hurricane Sandy (2012), and relate it to the underlying socio-economic distribution of impacted populations using a localized regression technique, geographically weighted regression (GWR). The findings reveal how the spatial patterns of trip time changes vary with respect to the Social Vulnerability Index (SVI), income levels, and population density in NYC.
Citations: 0
An Online Interactive Dashboard to Explore Personal Exposure to Air Pollution
Pub Date : 2022-11-18 DOI: 10.32866/001c.49875
W. Lee, Kayla Schulte, T. Schwanen
Studies increasingly examine individual exposure to air pollution while accounting for person-specific activity-travel patterns. Supporting policymakers and local communities using the resulting data requires transparent and ethical communication of exposure levels to affected individuals and other stakeholders. This paper asks how an interactive online dashboard might represent individual-level air pollution exposure profiles to different audiences while respecting individuals’ geoprivacy. Using data from 37 Oxford (UK) residents, it shows that heterogeneous individual-level exposure profiles can be shared ethically through different combinations of visualisation method, spatial and temporal resolution of data representation, and geomasking techniques for different dashboard user groups.
Citations: 0
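The dashboard abstract above mentions geomasking without saying which technique is used. As an illustration only, here is a minimal sketch of one common variant, donut geomasking, in which a point is displaced by a random distance between a minimum and a maximum radius. The function name `donut_geomask`, the planar coordinates in metres, and the radii are assumptions made for this sketch and do not come from the paper.

```python
import math
import random


def donut_geomask(x, y, r_min, r_max, rng):
    """Displace a planar point (metres) by a random offset inside a ring.

    Hypothetical helper for illustration; not the method used by the paper.
    """
    if not 0.0 <= r_min <= r_max:
        raise ValueError("require 0 <= r_min <= r_max")
    # Random direction of displacement.
    angle = rng.uniform(0.0, 2.0 * math.pi)
    # Sampling the squared radius uniformly spreads points evenly over the ring area.
    radius = math.sqrt(rng.uniform(r_min ** 2, r_max ** 2))
    return (x + radius * math.cos(angle), y + radius * math.sin(angle))


rng = random.Random(42)  # fixed seed so the example is repeatable
home = (1000.0, 2000.0)  # made-up home location in a projected coordinate system
masked = donut_geomask(home[0], home[1], r_min=50.0, r_max=200.0, rng=rng)
shift = math.hypot(masked[0] - home[0], masked[1] - home[1])
print(masked, round(shift, 1))
```

The minimum radius keeps the masked point from landing on the true location, while the maximum radius bounds the spatial error shown to dashboard users; both would have to be chosen per audience.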
Exploring Key Correlates of Trail Satisfaction and their Nonlinear Relationships in Suburban Areas
Pub Date : 2022-11-15 DOI: 10.32866/001c.53105
Jasmine Cao, Chun Yin
Using data collected from trail users in Woodbury, MN, this study applies gradient-boosting decision trees to explore the nonlinear associations between trail elements and user overall satisfaction. Scenery, personal safety, and connection are the most important contributors to overall satisfaction. Several trail elements show nonlinear effects on overall satisfaction. Specifically, bumps and lighting greatly affect overall satisfaction when their performance is poor, whereas personal safety, home access to trails, and shade improve overall satisfaction when performing well. The results also showed that the city should prioritize improvements on bumps, lighting, roadway crossing, safety, and access to enhance user satisfaction effectively.
Citations: 0
Generating Synthetic Speech from SpokenVocab for Speech Translation
Pub Date : 2022-10-15 DOI: 10.48550/arXiv.2210.08174
Jinming Zhao, Gholamreza Haffari, Ehsan Shareghi
Training end-to-end speech translation (ST) systems requires sufficiently large-scale data, which is unavailable for most language pairs and domains. One practical solution to the data scarcity issue is to convert text-based machine translation (MT) data to ST data via text-to-speech (TTS) systems. Yet, using TTS systems can be tedious and slow. In this work, we propose SpokenVocab, a simple, scalable and effective data augmentation technique to convert MT data to ST data on-the-fly. The idea is to retrieve and stitch audio snippets, corresponding to words in an MT sentence, from a spoken vocabulary bank. Our experiments on multiple language pairs show that stitched speech helps to improve translation quality by an average of 1.83 BLEU, while performing as well as TTS-generated speech in improving translation quality. We also showcase how SpokenVocab can be applied in code-switching ST, for which often no TTS systems exist.
Citations: 2
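The SpokenVocab abstract above describes retrieving per-word audio snippets from a spoken vocabulary bank and stitching them together. A minimal sketch of that idea might look as follows. It assumes waveforms are plain lists of samples, and the names `stitch_sentence`, `bank`, `unk`, and `gap`, along with the fallback snippet for unknown words, are hypothetical choices for illustration rather than the authors' implementation.

```python
from typing import Dict, List

Waveform = List[float]  # mono audio samples; a stand-in for real audio arrays


def stitch_sentence(sentence: str, bank: Dict[str, Waveform],
                    unk: Waveform, gap: int = 0) -> Waveform:
    """Concatenate one snippet per word, looked up in a spoken vocabulary bank.

    Hypothetical sketch: words missing from the bank fall back to `unk`,
    and `gap` silent samples are inserted between consecutive words.
    """
    stitched: Waveform = []
    for index, word in enumerate(sentence.lower().split()):
        if index > 0:
            stitched.extend([0.0] * gap)  # silence between words
        stitched.extend(bank.get(word, unk))
    return stitched


# Toy bank with made-up sample values; real snippets would be recorded or synthesized once.
bank = {"hello": [0.1, 0.2], "world": [0.3]}
stitched = stitch_sentence("Hello brave world", bank, unk=[0.0], gap=1)
print(stitched)  # [0.1, 0.2, 0.0, 0.0, 0.0, 0.3]
```

Because the bank is built once and lookups are dictionary reads, the stitching can run on the fly during training, which is the speed advantage over per-sentence TTS that the abstract points to.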
Can Demographic Factors Improve Text Classification? Revisiting Demographic Adaptation in the Age of Transformers
Pub Date : 2022-10-13 DOI: 10.48550/arXiv.2210.07362
Chia-Chien Hung, Anne Lauscher, Dirk Hovy, Simone Paolo Ponzetto, Goran Glavaš
Demographic factors (e.g., gender or age) shape our language. Previous work showed that incorporating demographic factors can consistently improve performance for various NLP tasks with traditional NLP models. In this work, we investigate whether these previous findings still hold with state-of-the-art pretrained Transformer-based language models (PLMs). We use three common specialization methods proven effective for incorporating external knowledge into pretrained Transformers (e.g., domain-specific or geographic knowledge). We adapt the language representations for the demographic dimensions of gender and age, using continuous language modeling and dynamic multi-task learning for adaptation, where we couple language modeling objectives with the prediction of demographic classes. Our results, when employing a multilingual PLM, show substantial gains in task performance across four languages (English, German, French, and Danish), which is consistent with the results of previous work. However, controlling for confounding factors – primarily domain and language proficiency of Transformer-based PLMs – shows that downstream performance gains from our demographic adaptation do not actually stem from demographic knowledge. Our results indicate that demographic specialization of PLMs, while holding promise for positive societal impact, still represents an unsolved problem for (modern) NLP.
Citations: 5
Joint Reasoning on Hybrid-knowledge sources for Task-Oriented Dialog
Pub Date : 2022-10-13 DOI: 10.48550/arXiv.2210.07295
Mayank Mishra, Danish Contractor, Dinesh Raghu
Traditional systems designed for task-oriented dialog utilize knowledge present only in structured knowledge sources to generate responses. However, relevant information required to generate responses may also reside in unstructured sources, such as documents. Recent state-of-the-art models such as HyKnow (Gao et al., 2021b) and SEKNOW (Gao et al., 2021a), aimed at overcoming these challenges, make limiting assumptions about the knowledge sources. For instance, these systems assume that certain types of information, such as a phone number, are always present in a structured knowledge base (KB), while information about aspects such as entrance ticket prices is always available in documents. In this paper, we create a modified version of the MultiWOZ-based dataset prepared by Gao et al. (2021a) to demonstrate how current methods suffer significant degradation in performance when strict assumptions about the source of information are removed. Then, in line with recent work exploiting pre-trained language models, we fine-tune a BART-based model (Lewis et al., 2020) using prompts (Brown et al., 2020; Sun et al., 2021) for the tasks of querying knowledge sources as well as response generation, without making assumptions about the information present in each knowledge source. Through a series of experiments, we demonstrate that our model is robust to perturbations to knowledge modality (source of information), and that it can fuse information from structured as well as unstructured knowledge to generate responses.
Citations: 0
ezCoref: Towards Unifying Annotation Guidelines for Coreference Resolution
Pub Date : 2022-10-13 DOI: 10.48550/arXiv.2210.07188
Ankita Gupta, Marzena Karpinska, Wenlong Zhao, Kalpesh Krishna, Jack Merullo, Luke Yeh, Mohit Iyyer, Brendan T. O'Connor
Large-scale, high-quality corpora are critical for advancing research in coreference resolution. However, existing datasets vary in their definition of coreferences and have been collected via complex and lengthy guidelines that are curated for linguistic experts. These concerns have sparked a growing interest among researchers to curate a unified set of guidelines suitable for annotators with various backgrounds. In this work, we develop a crowdsourcing-friendly coreference annotation methodology, ezCoref, consisting of an annotation tool and an interactive tutorial. We use ezCoref to re-annotate 240 passages from seven existing English coreference datasets (spanning fiction, news, and multiple other domains) while teaching annotators only cases that are treated similarly across these datasets. Surprisingly, we find that reasonable quality annotations were already achievable (90% agreement between the crowd and expert annotations) even without extensive training. On carefully analyzing the remaining disagreements, we identify the presence of linguistic cases that our annotators unanimously agree upon but lack unified treatments (e.g., generic pronouns, appositives) in existing datasets. We propose the research community should revisit these phenomena when curating future unified annotation guidelines.
Citations: 3
Electric On Demand Transit Expands Network Coverage in Auckland
Pub Date : 2022-10-13 DOI: 10.32866/001c.38773
Benjamin Kaufman, Ainsley Hughes, Elena Pihera, Srishti Lal
AT Local is an On Demand rideshare service operating in South Auckland, New Zealand. The service directly replaces the low patronage 371 fixed route bus and extends coverage to areas not previously served by public transport. This paper evaluates how AT Local is being used by customers located in two new catchment areas: an area in Conifer Grove and an Eastern Expansion area. Ridership analysis illustrates how AT has enabled new trip patterns. Trips from Conifer Grove are characterised by feeder service to the train network, while trips from the Eastern area fulfill feeder services while also facilitating various other trip patterns.
Citations: 0
Best Practices in the Creation and Use of Emotion Lexicons
Pub Date : 2022-10-13 DOI: 10.48550/arXiv.2210.07206
Saif M. Mohammad
Words play a central role in how we express ourselves. Lexicons of word–emotion associations are widely used in research and real-world applications for sentiment analysis, tracking emotions associated with products and policies, studying health disorders, tracking emotional arcs of stories, and so on. However, inappropriate and incorrect use of these lexicons can lead to not just sub-optimal results, but also inferences that are directly harmful to people. This paper brings together ideas from Affective Computing and AI Ethics to present some of the practical and ethical considerations involved in the creation and use of emotion lexicons – best practices. The goal is to provide a comprehensive set of relevant considerations, so that readers (especially those new to working with emotions) can find relevant information in one place. We hope this work will facilitate more thoughtfulness when one is deciding on what emotions to work on, how to create an emotion lexicon, how to use an emotion lexicon, how to draw meaningful inferences, and how to judge success.
Citations: 9