Prioritizing computer security controls for home users
Pub Date: 2019-02-15 | DOI: 10.7287/peerj.preprints.27540v1
J. Fanelli, John Waxler
Hundreds of thousands of home users are victimized by cyber-attacks every year. Most experts agree that average home users are not doing enough to protect their computers and their information from cyber-attacks. Improperly managed home computers can lead to lost data, slow systems, identity theft, and ransom payments; en masse, attacks can act in concert to infect personal computers in business and government. Currently, home users receive conflicting guidance for a complicated terrain, often in the form of anecdotal 'Top 10' lists that are not appropriate for their specific needs, and in many instances users ignore all guidance. Often, these popular 'Top 10' lists appear to be based solely on opinion. Ultimately, we asked ourselves the following: how can we provide home users with better guidance for determining and applying appropriate security controls that meet their needs and can be verified by the cyber security community? In this paper, we propose a methodology for determining and prioritizing the most appropriate security controls for home computing. Using Multi-Criteria Decision Making (MCDM) and subject matter expertise, we identify, analyze, and prioritize security controls used by government and industry to determine which controls can substantively improve home computing security. We apply our methodology using examples to demonstrate its benefits.
{"title":"Prioritizing computer security controls for home users","authors":"J. Fanelli, John Waxler","doi":"10.7287/peerj.preprints.27540v1","DOIUrl":"https://doi.org/10.7287/peerj.preprints.27540v1","url":null,"abstract":"Hundreds of thousands of home users are victimized by cyber-attacks every year. Most experts agree that average home users are not doing enough to protect their computers and their information from cyber-attacks. Improperly managed home computers can lead to individuals losing data, systems performing slowly, loss of identity, and ransom payments; en masse attacks can act in concert to infect personal computers in business and government. Currently, home users receive conflicting guidance for a complicated terrain, often in the form of anecdotal 'Top 10' lists, that is not appropriate for their specific needs, and in many instances, users ignore all guidance. Often, these popular ‘Top 10’ lists appear to be based solely on opinion. Ultimately, we asked ourselves the following: how can we provide home users with better guidance for determining and applying appropriate security controls that meet their needs and can be verified by the cyber security community? In this paper, we propose a methodology for determining and prioritizing the most appropriate security controls for home computing. Using Multi Criteria Decision Making (MCDM) and subject matter expertise, we identify, analyze and prioritize security controls used by government and industry to determine which controls can substantively improve home computing security. We apply our methodology using examples to demonstrate its benefits.","PeriodicalId":93040,"journal":{"name":"PeerJ preprints","volume":"38 1","pages":"e27540"},"PeriodicalIF":0.0,"publicationDate":"2019-02-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87416354","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
GeoNode: an open source framework to build spatial data infrastructures
Pub Date: 2019-02-13 | DOI: 10.7287/peerj.preprints.27534v1
P. Corti, F. Bartoli, Alessio Fabiani, C. Giovando, Athanasios Tom Kralidis, A. Tzotsos
GeoNode is an open source framework designed to build geospatial content management systems (GeoCMS) and spatial data infrastructure (SDI) nodes. Its development was initiated by the Global Facility for Disaster Reduction and Recovery (GFDRR) in 2009, and it was adopted by a large number of organizations in the following years. Using an open source stack based on mature and robust frameworks and software such as Django, OpenLayers, PostGIS, GeoServer, and pycsw, an organization can build its SDI or geospatial open data portal on top of GeoNode. GeoNode provides a large number of user-friendly capabilities, broad interoperability using Open Geospatial Consortium (OGC) standards, and a powerful authentication/authorization mechanism. Supported by a vast, diverse, and global open source community, GeoNode is an official project of the Open Source Geospatial Foundation (OSGeo).
{"title":"GeoNode: an open source framework to build spatial data infrastructures","authors":"P. Corti, F. Bartoli, Alessio Fabiani, C. Giovando, Athanasios Tom Kralidis, A. Tzotsos","doi":"10.7287/peerj.preprints.27534v1","DOIUrl":"https://doi.org/10.7287/peerj.preprints.27534v1","url":null,"abstract":"GeoNode is an open source framework designed to build geospatial content management systems (GeoCMS) and spatial data infrastructure (SDI) nodes. Its development was initiated by the Global Facility for Disaster Reduction and Recovery (GFDRR) in 2009 and adopted by a large number of organizations in the following years. Using an open source stack based on mature and robust frameworks and software like Django, OpenLayers, PostGIS, GeoServer and pycsw, an organization can build on top of GeoNode its SDI or geospatial open data portal. GeoNode provides a large number of user friendly capabilities, broad interoperability using Open Geospatial Consortium (OGC) standards, and a powerful authentication/authorization mechanism. Supported by a vast, diverse and global open source community, GeoNode is an official project of the Open Source Geospatial Foundation (OSGeo).","PeriodicalId":93040,"journal":{"name":"PeerJ preprints","volume":"44 1","pages":"e27534"},"PeriodicalIF":0.0,"publicationDate":"2019-02-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73747957","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A use case centric survey of Blockchain: status quo and future directions
Pub Date: 2019-02-11 | DOI: 10.7287/peerj.preprints.27529v1
S. Perera, F. Leymann, Paul Fremantle
This paper presents an assessment of blockchain technology based on the Emerging Technology Analysis Canvas (ETAC) to evaluate its drivers and potential outcomes. The ETAC is a framework for critically analyzing emerging technologies. The assessment finds that blockchain can fundamentally transform the world. It is ready for specific applications in use cases such as digital currency, lightweight financial systems, ledgers, provenance, and disintermediation. However, blockchain faces significant technical gaps in other use cases and needs at least 5-10 years to come to full fruition in those spaces. Sustaining the current level of effort (e.g., startups, research) for this period of time may be challenging. We also find that the need for and merits of decentralized infrastructures compared to centralized and semi-centralized alternatives are not always clear. Given the risk involved and the significant potential returns, we recommend a cautiously optimistic approach to blockchain, with a focus on concrete use cases. The primary contributions of this paper are a use case centric categorization of blockchain, a detailed discussion of the challenges faced by those categories, and an assessment of their future.
{"title":"A use case centric survey of Blockchain: status quo and future directions","authors":"S. Perera, F. Leymann, Paul Fremantle","doi":"10.7287/peerj.preprints.27529v1","DOIUrl":"https://doi.org/10.7287/peerj.preprints.27529v1","url":null,"abstract":"This paper presents an assessment of blockchain technology based on the Emerging Technology Analysis Canvas (ETAC) to evaluate the drivers and potential outcomes. The ETAC is a framework to critically analyze emerging technologies.\u0000 The assessment finds that blockchain can fundamentally transform the world. It is ready for specific applications in use cases such as digital currency, lightweight financial systems, ledgers, provenance, and disintermediation.\u0000 However, Blockchain faces significant technical gaps in other use cases and needs at least 5-10 years to come to full fruition in those spaces. Sustaining the current level of effort (e.g. startups, research) for this period of time may be challenging. We also find that the need and merits of decentralized infrastructures compared to centralized and semi-centralized alternatives is not always clear. Given the risk involved and significant potential returns, we recommend a cautiously optimistic approach to blockchain with the focus on concrete use cases.\u0000 The primary contributions of this paper are a use case centric categorization of the blockchain, a detailed discussion on challenges faced by those categories, and an assessment of their future.","PeriodicalId":93040,"journal":{"name":"PeerJ preprints","volume":"5 1","pages":"e27529"},"PeriodicalIF":0.0,"publicationDate":"2019-02-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84615785","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Implementation and validity of the long jump knowledge-based system: Case of the approach run phase
Pub Date: 2019-02-08 | DOI: 10.7287/peerj.preprints.27524v1
T. Kamnardsiri, W. Janchai, P. Khuwuthyakorn, Wacharee Rittiwat
This study proposes a method for implementing a Knowledge-Based System (KBS) for the approach-run phase of the long jump. The proposed method was implemented to improve the performance of long jump athletes in the approach-run phase. Moreover, this study examined the concurrent validity of the KBS in distinguishing between professional and amateur populations, and its convergent validity against the Tracker video analysis tool. Seven professional runners aged 19 to 42 years and five amateurs aged 18 to 38 years were captured under ten conditions of different movements (C1 to C10) using a standard video camera (60 fps, 10 mm lens) fixed on a tripod. The results showed that age-related differences in speed measurement across the ten conditions were evident using the KBS. Good associations were found between the KBS and the Tracker 4.94 video analysis tool across the conditions for three variables: the starting position (r = 0.926 and 0.963), the maximum velocity (r = 0.972 and 0.995), and the location of maximum velocity (r = 0.574 and 0.919). In conclusion, the proposed method is a reliable tool for measuring the starting position, maximum speed, and position of maximum speed. Furthermore, the proposed method can also distinguish speed performance between professionals and amateurs across multiple movement conditions.
{"title":"Implementation and validity of the long jump knowledge-based system: Case of the approach run phase","authors":"T. Kamnardsiri, W. Janchai, P. Khuwuthyakorn, Wacharee Rittiwat","doi":"10.7287/peerj.preprints.27524v1","DOIUrl":"https://doi.org/10.7287/peerj.preprints.27524v1","url":null,"abstract":"This study aimed to propose the method of implementation of the Knowledge-Based System (KBS) in the case of approach-run phase. The proposed method was implemented for improving the long jump performance of athletes in the approach-run phase. Moreover, this study aimed to examine KBS concurrent validity in distinguishing between professional and amateur populations and then KBS convergent validity against a Tracker video analysis tool. Seven running professionals aged 19 to 42 years and five amateurs aged 18 to 38 years had captured with ten conditions of different movements (C1 to C10) using a standard video camera (60 fps, 10 mm lens). The camera was fixed on the tripod. The results showing an age-related difference in a speed measurement of ten conditions were evidently using the KBS. Good associations were found between KBS and Tracker 4.94 video analysis tool across various conditions of three variables that were the starting position (r=0.926 and 0.963), the maximum velocity (r=0.972 and 0.995) and the location of maximum velocity (r=0.574 and 0.919). In conclusion, the proposed method is a reliable tool for measuring the starting position, maximum speed and position of maximum speed. Furthermore, the proposed method can also distinguish speed performance between professional and amateur across multiple movement conditions.","PeriodicalId":93040,"journal":{"name":"PeerJ preprints","volume":"6 1","pages":"e27524"},"PeriodicalIF":0.0,"publicationDate":"2019-02-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78080447","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Deployment of coordinated worm-hole peer in MANETs
Pub Date: 2019-02-03 | DOI: 10.7287/peerj.preprints.27516v1
Fahmina Taranum, S. Mahin
As communication technology advances, researchers strive to provide security. Security means dispensing protection and privacy to the system, guarding channeled data against unwarranted access and modification. A MANET is a variant of wireless network used essentially by dynamic devices with high mobility and vulnerability. Characteristics such as a dynamic layout and constrained resources make MANETs susceptible to miscellaneous kinds of threats. One such attack is the wormhole, which sneaks and peeps at data with malicious intent and operates in either a coordinated or an uncoordinated fashion. In the coordinated version, the malicious nodes coordinate their operations, whereas in the uncoordinated version they operate solitarily, with the aim of degrading network performance. In this work, we propose an algorithm for deploying a wormhole attack in which a malicious node communicates with its peer through a tunnel. Planting this attack in the network lays the foundation for developing successful strategies to mitigate its effects on the system.
{"title":"Deployment of coordinated worm-hole peer in MANETs","authors":"Fahmina Taranum, S. Mahin","doi":"10.7287/peerj.preprints.27516v1","DOIUrl":"https://doi.org/10.7287/peerj.preprints.27516v1","url":null,"abstract":"With the enhancement in technical field of communication, efforts are made by the researchers to provide security. Security is dispensing protection and privacy to the system for the channeled data against any unwarranted access and refinements. MANET is a variant of wireless network used essentially by the dynamic devices with high motility and vulnerability. The distinctions like dynamic layout and curbed resources make them susceptible to miscellaneous kinds of threats. One such attack is wormhole which sneak and peep data with malicious intensions and operates either in coordinated or uncoordinated fashion. In its coordinated version, the malicious nodes coordinate their operations whereas in the uncoordinated version; they operate solitarily with the aim to decline the network performance. In this work, we aim to propose an algorithm for deployment of wormhole attack communicating with its peer through a tunnel. Planting of this attack in the network lays the foundation for developing successful strategies to mitigate their effects on the system.","PeriodicalId":93040,"journal":{"name":"PeerJ preprints","volume":"1 1","pages":"e27516"},"PeriodicalIF":0.0,"publicationDate":"2019-02-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83084571","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Automatically Generating Psychiatric Case Notes From Digital Transcripts of Doctor-Patient Conversations
Pub Date: 2019-01-28 | DOI: 10.18653/v1/W19-1918
Nazmul Kazi, Indika Kahanda
Electronic health records (EHRs) are notorious for reducing face-to-face time with patients while increasing screen time for clinicians, leading to burnout. This is especially problematic in psychiatric care, where maintaining consistent eye contact and non-verbal cues is just as important as the spoken words. In this ongoing work, we explore the feasibility of automatically generating psychiatric EHR case notes from digital transcripts of doctor-patient conversations using a two-step approach: (1) predicting semantic topics for segments of transcripts using supervised machine learning, and (2) generating formal text for those segments using natural language processing. Through a series of preliminary experimental results obtained from a collection of synthetic and real-life transcripts, we demonstrate the viability of this approach.
{"title":"Automatically Generating Psychiatric Case Notes From Digital Transcripts of Doctor-Patient Conversations","authors":"Nazmul Kazi, Indika Kahanda","doi":"10.18653/v1/W19-1918","DOIUrl":"https://doi.org/10.18653/v1/W19-1918","url":null,"abstract":"Electronic health records (EHRs) are notorious for reducing the face-to-face time with patients while increasing the screen-time for clinicians leading to burnout. This is especially problematic for psychiatry care in which maintaining consistent eye-contact and non-verbal cues are just as important as the spoken words. In this ongoing work, we explore the feasibility of automatically generating psychiatric EHR case notes from digital transcripts of doctor-patient conversation using a two-step approach: (1) predicting semantic topics for segments of transcripts using supervised machine learning, and (2) generating formal text of those segments using natural language processing. Through a series of preliminary experimental results obtained through a collection of synthetic and real-life transcripts, we demonstrate the viability of this approach.","PeriodicalId":93040,"journal":{"name":"PeerJ preprints","volume":"46 1 1","pages":"e27497"},"PeriodicalIF":0.0,"publicationDate":"2019-01-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78370082","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
GIZAChain: e-Government Interoperability Zone Alignment, based on blockchain technology
Pub Date: 2019-01-11 | DOI: 10.7287/peerj.preprints.27477v1
M. El-Dosuky, G. El-adl
E-government provides access to services anytime, anywhere. Many e-government frameworks already exist to integrate e-government services, but efficient full interoperability is still a challenge. Interoperability per se can be modeled via four maturity stages, in which the interoperability zone is the holy grail of full interoperability, to be reached ultimately through strategy alignment. As e-government services shift in the same way as e-commerce with the value chain, this implies the possibility of benefiting from blockchain in e-government. Blockchain is a nascent, promising architecture whose transactions are permanent, verifiable, and recorded in a distributed ledger. This research article suggests applying blockchain to achieve e-government interoperability. Forms are juxtaposed on the outer borders of the system. These forms adopt those used by the UK government, because they are standard and available to Python developers. Once a form has been completed, PySOA calls the requested service before storing the data in the Ontology blockchain. After the service is performed, the policies are analyzed in batch processing using quantgov. A report is submitted to the central government periodically. The Ontology blockchain has a dual effect: on the one hand, it works as secure data storage; on the other hand, it cooperates with PySOA in supporting both technology and semantic interoperability. The most important feature of the proposed method is the presence of the Government Interoperability Zone Alignment (GIZA), which acts as a backbone that coherently connects the internal subcomponents. This linkage is possible because each form has a title that corresponds to the appropriate service name, and each service in turn has a counterpart in the wallets stored in the Ontology blockchain. To measure interoperability empirically, metrics are needed. This study adopts and quantizes a standard interoperability matrix along three dimensions of interoperability: Conceptual (syntax & semantics), Organizational (responsibilities & organization per se), and Technology (platform & communication). The concerns are data, business, service, and process. Any deviation from the standard contributes to the interoperability score (counting mismatches) or the interoperability grade (counting absolute differences). An estimation is performed for 1000 random cases. It is estimated that the probability of obtaining a conceptual/technical interoperability score as large as the standard strategy score is 713/1000 = 0.713 (about 2 in 3), and that the probability of obtaining an organizational interoperability score as large as the standard strategy score is 712/1000 = 0.712 (about 2 in 3). Finally, a Markov model is proposed to provide an accurate representation of the evolution of the strategies over time.
{"title":"GIZAChain: e-Government Interoperability Zone Alignment, based on blockchain technology","authors":"M. El-Dosuky, G. El-adl","doi":"10.7287/peerj.preprints.27477v1","DOIUrl":"https://doi.org/10.7287/peerj.preprints.27477v1","url":null,"abstract":"E-government provides access to services anytime anywhere. There are many e-Government frameworks already exist to integrate e-government services, but efficient full interoperability still a challenge.\u0000 Interoperability per se can be modeled via four maturity stages, in which the interoperability zone is the holy grail of full interoperability to be reached ultimately with strategy alignment. As e-government services shift in the same way as e-commerce with value chain, this implicitly implies the possibility of benefiting from blockchain with e-government. Blockchain is a nascent promising architecture, whose transactions are permanent, verifiable, and recorded in a distributed ledger.\u0000 This research article suggests applying blockchain in achieving e- government interoperability. Forms are juxtaposed on the outer borders of the system. These forms adopt those used by UK government, because they are standard as well as they are available for Python developers. Once a form has been completed, PySOA calls the requested service, before storing the data in Ontology Blockchain. After the service is performed, the policies are analyzed in batch processing using quantgov. A report is submitted to the central government periodically. Ontology Blockchain has a dual effect. On the one hand, it works as a secure data storage. On the other hand, it cooperates with PySOA in supporting both technology and semantic interoperability . The most important feature of the proposed method is the presence of (Government Interoperability Zone Alignment; GIZA), which acts as a backbone that coherently connects the internal subcomponents. This linkage is possible, because each form has an title, that corresponds to the appropriate service name. Each service in turn has a counterpart in the wallets stored in Ontology blockchain.\u0000 To measure interoperability empirically, there is a need for metrics. This study adopts and quantizes a standard interoperability matrix along three dimensions of interoperability of Conceptual (Syntax& Semantics), Organizational (Responsibilities& Organization per se), and Technology (Platform& Communication). While concerns are : data, business, service, and process. Any deviation from the standard could contributes to the interoperability score (counting mismatches) or interoperability grade (counting absolute differences). An estimation is performed, for 1000 total random cases. It is estimated that the probability of getting a conceptual/technical interoperability score as large as the standard strategy score is (713 /1000 = 0.713 (2 in 3). It is estimated too that the probability of getting a organizational interoperability score as large as the standard strategy score is (712 /1000 = 0.712 (2 in 3). 
Then, Markov model is proposed to provide an accurate representation of the evolution of the strategies over time.","PeriodicalId":93040,"journal":{"name":"PeerJ preprints","volume":"9 1","pages":"e27477"},"PeriodicalIF":0.0,"publicationDate":"2019-01-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83202054","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
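The score and grade defined above can be computed directly from the quantized matrix: count mismatches for the score and sum absolute differences for the grade. A minimal sketch follows, with invented cell values on the abstract's dimensions and concerns.

```python
# Sketch of the interoperability metrics described above: a quantized
# matrix (dimensions x concerns) compared cell-by-cell against the
# standard. The maturity values below are invented.
import numpy as np

dims = ["conceptual", "organizational", "technology"]
concerns = ["data", "business", "service", "process"]

standard = np.array([[3, 3, 3, 3],
                     [3, 3, 3, 3],
                     [3, 3, 3, 3]])   # target maturity level per cell
observed = np.array([[3, 2, 3, 1],
                     [2, 3, 3, 3],
                     [3, 3, 1, 3]])

score = int(np.sum(observed != standard))          # count of mismatching cells
grade = int(np.sum(np.abs(observed - standard)))   # sum of absolute differences

print(f"interoperability score (mismatches): {score}")   # 4
print(f"interoperability grade (|diff| sum): {grade}")   # 6
```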
Parsing multi-ordered grammars with the Gray algorithm
Pub Date: 2019-01-05 | DOI: 10.7287/peerj.preprints.27465v2
Nick Papoulias
Background. Context-free grammars (CFGs) and parsing expression grammars (PEGs) are the two main formalisms used by formal specifications and parsing frameworks to describe programming languages. They mainly differ in the definition of the choice operator, which describes language alternatives. CFGs support non-deterministic choice (i.e., unordered choice), where all alternatives are equally explored. PEGs support deterministic choice (i.e., ordered choice), where alternatives are explored in strict succession. In practice, the two formalisms are used through concrete classes of parsing algorithms (such as left-to-right, rightmost-derivation (LR) parsing for CFGs and Packrat parsing for PEGs) that follow the semantics of the formal operators.

Problem Statement. Neither the two formalisms nor the accompanying algorithms are sufficient for a complete description of common cases arising in language design. In order to properly handle ambiguity, recursion, precedence, or associativity, parsing frameworks either introduce implementation-specific directives or ask users to refactor their grammars to fit the needs of the framework/algorithm/formalism combination. This introduces significant complexity even in simple cases and results in incompatible grammar specifications.

Our Proposal. We introduce multi-ordered grammars (MOGs) as an alternative to the CFG and PEG formalisms. MOGs aim for a better exploration of ambiguity, ordering, recursion, and associativity during language design. This is achieved by (a) allowing deterministic and non-deterministic choices to co-exist, and (b) introducing a form of recursive and scoped ordering. The formalism is accompanied by a new parsing algorithm (Gray) that extends chart parsing (normally used for natural language processing) with the proposed MOG operators.

Results. We conduct two case studies to assess the expressiveness of MOGs compared to CFGs and PEGs. The first consists of two idealized examples from the literature (an expression grammar and a simple procedural language). The second examines a real-world case (the entire Smalltalk grammar and eleven new Smalltalk extensions), probing the complexities of practical needs. We show that, in comparison, MOGs are able to reduce complexity and naturally express language constructs without resorting to implementation-specific directives.

Conclusion. We conclude that combining deterministic and non-deterministic choices in a single grammar specification is indeed not only possible but also beneficial. Moreover, augmented by operators for recursive and scoped ordering, the resulting multi-ordered formalism presents a viable alternative to both CFGs and PEGs. Concrete implementations of MOGs can be constructed by extending chart parsing with MOG operators for recursive and scoped ordering.
{"title":"Parsing multi-ordered grammars with the Gray algorithm","authors":"Nick Papoulias","doi":"10.7287/peerj.preprints.27465v2","DOIUrl":"https://doi.org/10.7287/peerj.preprints.27465v2","url":null,"abstract":"Background. Context-free grammars (CFGs) and Parsing-expression Grammars (PEGs) are the two main formalisms used by formal specifications and parsing frameworks to describe programming languages. They mainly differ in the definition of the choice operator, describing language alternatives. CFGs support the use of non-deterministic choice (i.e., unordered choice), where all alternatives are equally explored. PEGs support a deterministic choice (i.e., ordered choice), where alternatives are explored in strict succession. In practice the two formalisms, are used through concrete classes of parsing algorithms (such as Left-to-right, rightmost derivation (LR) for CFGs and Packrat parsing for PEGs), that follow the semantics of the formal operators. Problem Statement. Neither the two formalisms, nor the accompanying algorithms are sufficient for a complete description of common cases arising in language design. In order to properly handle ambiguity, recursion, precedence or associativity, parsing frameworks either introduce implementation specific directives or ask users to refactor their grammars to fit the needs of the framework/algorithm/formalism combo. This introduces significant complexity even in simple cases and results in incompatible grammar specifications. Our Proposal. We introduce Multi-Ordered Grammars (MOGs) as an alternative to the CFG and PEG formalisms. MOGs aim for a better exploration of ambiguity, ordering, recursion and associativity during language design. This is achieved by (a) allowing both deterministic and non-deterministic choices to co-exist, and (b) introducing a form of recursive and scoped ordering. The formalism is accompanied by a new parsing algorithm (Gray) that extends chart parsing (normally used for Natural Language Processing) with the proposed MOG operators. Results. We conduct two case-studies to assess the expressiveness of MOGs, compared to CFGs and PEGs. The first consists of two idealized examples from literature (an expression grammar and a simple procedural language). The second examines a real-world case (the entire Smalltalk grammar and eleven new Smalltalk extensions) probing the complexities of practical needs. We show that in comparison, MOGs are able to reduce complexity and naturally express language constructs, without resorting to implementation specific directives. Conclusion. We conclude that combining deterministic and non-deterministic choices in a single grammar specification is indeed not only possible but also beneficial. Moreover, augmented by operators for recursive and scoped ordering the resulting multi-ordered formalism presents a viable alternative to both CFGs and PEGs. 
Concrete implementations of MOGs can be constructed by extending chart parsing with MOG operators for recursive and scoped ordering.","PeriodicalId":93040,"journal":{"name":"PeerJ preprints","volume":"30 1","pages":"e27465"},"PeriodicalIF":0.0,"publicationDate":"2019-01-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84643768","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
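The semantic gap between the two choice operators can be shown in a few lines of code. The sketch below is a toy matcher over a two-alternative grammar, not the Gray algorithm: ordered (PEG-style) choice commits to its first success, while unordered (CFG-style) choice keeps every viable alternative.

```python
# Toy contrast of PEG ordered choice vs CFG unordered choice on the
# grammar S <- "a" / "aa" (PEG) versus S -> "a" | "aa" (CFG).
# Matchers return the list of positions reachable after a match.

def lit(t):
    """Matcher for a literal string."""
    def m(s, pos):
        return [pos + len(t)] if s.startswith(t, pos) else []
    return m

def ordered(*alts):
    """PEG choice: commit to the first alternative that succeeds."""
    def m(s, pos):
        for a in alts:
            r = a(s, pos)
            if r:
                return r[:1]   # later alternatives are never tried
        return []
    return m

def unordered(*alts):
    """CFG choice: keep every alternative's result (supports ambiguity)."""
    def m(s, pos):
        return [p for a in alts for p in a(s, pos)]
    return m

peg_S = ordered(lit("a"), lit("aa"))    # "a" shadows "aa"
cfg_S = unordered(lit("a"), lit("aa"))  # both alternatives survive

print(peg_S("aa", 0))   # [1]    -- committed to the short match, input left over
print(cfg_S("aa", 0))   # [1, 2] -- the full match at position 2 is also found
```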
A contrast of meta and metafor packages for meta-analyses in R
Pub Date: 2019-01-01 | DOI: 10.7287/peerj.preprints.27608v1
C. Lortie
There is extensive support and choice in R for meta-analyses. Two common packages in the natural sciences are meta and metafor. Here, the strengths of each are briefly contrasted for the synthesis scientist. Meta is a direct, intuitive choice for rapid implementation of general meta-analytical statistics. Metafor is a comprehensive package for analyses where the fitted models are more complex. Both packages provide estimates of heterogeneity, excellent visualization tools, and functions to explore publication bias. Preference and critical outcomes can facilitate the choice between these two options. Nonetheless, metafor has a steeper learning curve but greater rewards.
{"title":"A contrast of meta and metafor packages for meta-analyses in R","authors":"C. Lortie","doi":"10.7287/peerj.preprints.27608v1","DOIUrl":"https://doi.org/10.7287/peerj.preprints.27608v1","url":null,"abstract":"There is extensive support and choice in R to support meta-analyses. Two common packages in the natural sciences include meta and metafor. Here, a brief contrast of the strengths of each is described for the synthesis scientist. Meta is a direct, intuitive choice for rapid implementation of general meta-analytical statistics. Metafor is a comprehensive package for analyses if the fit models are more complex. Both packages provide estimates of heterogeneity, excellent visualization tools, and functions to explore publication bias. Preference and critical outcomes can facilitate choice between these two specific options. Nonetheless, metafor has a steeper learning curve but greater rewards.","PeriodicalId":93040,"journal":{"name":"PeerJ preprints","volume":"9 1","pages":"e27608"},"PeriodicalIF":0.0,"publicationDate":"2019-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89984378","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
ProPheno: An online dataset for completely characterizing the human protein-phenotype landscape in biomedical literature
Pub Date: 2019-01-01 | DOI: 10.7287/peerj.preprints.27479v1
Morteza Pourreza Shahri, Indika Kahanda
Identifying protein-phenotype relations is of paramount importance for applications such as uncovering rare and complex diseases. One of the best resources capturing protein-phenotype relationships is the biomedical literature. In this work, we introduce ProPheno, a comprehensive online dataset composed of human protein/phenotype mentions extracted from the complete corpora of Medline and PubMed. Moreover, it includes co-occurrences of protein-phenotype pairs within different spans of text, such as sentences and paragraphs. We use ProPheno to completely characterize the human protein-phenotype landscape in the biomedical literature. ProPheno, the reported findings, and the gained insights have implications for (1) biocurators, for expediting their curation efforts; (2) researchers, for quickly finding relevant articles; and (3) text mining tool developers, for training their predictive models. The RESTful API of ProPheno is freely available at http://propheno.cs.montana.edu.
{"title":"ProPheno: An online dataset for completely characterizing the human protein-phenotype landscape in biomedical literature","authors":"Morteza Pourreza Shahri, Indika Kahanda","doi":"10.7287/peerj.preprints.27479v1","DOIUrl":"https://doi.org/10.7287/peerj.preprints.27479v1","url":null,"abstract":"Identifying protein-phenotype relations is of paramount importance for applications such as uncovering rare and complex diseases. One of the best resources that captures the protein-phenotype relationships is the biomedical literature. In this work, we introduce ProPheno, a comprehensive online dataset composed of human protein/phenotype mentions extracted from the complete corpora of Medline and PubMed. Moreover, it includes co-occurrences of protein-phenotype pairs within different spans of text such as sentences and paragraphs. We use ProPheno for completely characterizing the human protein-phenotype landscape in biomedical literature. ProPheno, the reported findings and the gained insight has implications for (1) biocurators for expediting their curation efforts, (2) researches for quickly finding relevant articles, and (3) text mining tool developers for training their predictive models. The RESTful API of ProPheno is freely available at http://propheno.cs.montana.edu.","PeriodicalId":93040,"journal":{"name":"PeerJ preprints","volume":"35 1","pages":"e27479"},"PeriodicalIF":0.0,"publicationDate":"2019-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89391783","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}