Improving Yield Data Analysis Using Contextual Data

IF 0.8 | Zone 4 (Agricultural and Forestry Sciences) | Q4 AGRICULTURAL ENGINEERING | Applied Engineering in Agriculture | Pub Date: 2023-01-01 | DOI: 10.13031/aea.14655

Elizabeth M. Hawkins, Dennis R. Buckmaster

Abstract

Highlights:
- Context-driven yield data cleaning resulted in more accurate whole-field yield estimates.
- Using a context-driven yield data cleaning method can improve yield estimates for zones within fields.
- Identifying error-prone areas in a field where data quality is likely to be low, and removing those data in bulk, can reduce data cleaning bias.

Abstract. As agriculture becomes more data driven, decision-making has become the focus of the industry, and data quality will be increasingly important. Traditionally, yield data cleaning techniques have removed individual data points based on criteria focused primarily on the yield values themselves. When these methods are used, however, the underlying causes of the errors are often overlooked. As a result, these techniques may fail to remove all of the inaccurate (error-prone) data, may remove legitimate data, or both.

In this research, an alternative to data cleaning was developed. Data integrity zones (DIZ) within each field were identified by evaluating metadata, which included data collected by the combine on the operating conditions of the machinery (e.g., travel speed, crop mass flow), data about the field environment (e.g., soil type, topography, weather), and data on field operations (e.g., field logs, as-applied maps). Data in DIZ were isolated using buffers, and the analysis of the reduced datasets was compared to that of the raw data. The amount of data removed depended on the amount of variability (e.g., soil characteristics, topography) in the field.

Statistical comparisons showed that the mean yield estimates for soil type polygons increased by an average of 1.4 Mg/ha for corn when DIZ data were used instead of raw data. On average, the confidence around the mean remained similar even with a large amount (70%) of data removed. Notably, none of the mean estimates derived from raw datasets were contained in the confidence intervals produced from DIZ data.

This metadata-based (context-driven) alternative to data cleaning effectively removed errors and artifacts from yield data that could only be identified by looking beyond the yield measurements themselves. When similarly reduced datasets are used to analyze historical yield data, they should provide a clearer picture of the true yield effects of treatments, management zones, soil types, etc.; this will improve decisions on input and resource allocation, support wiser adoption of precision agricultural technologies, and refine future data collection.

Keywords: Combine yield monitor, Context, Data analysis, Integrity zones, Management zones, Metadata, Precision agriculture, Yield, Yield data.
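The context-driven idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `YieldPoint` fields, the speed range, the 20 m buffer distance, and the sample values are all hypothetical and chosen only for illustration. The sketch keeps a point only when its context (travel speed and distance from a polygon boundary) suggests reliable data, then compares the mean and an approximate 95% confidence interval for the raw and reduced datasets.

```python
import math
import statistics
from dataclasses import dataclass


@dataclass
class YieldPoint:
    yield_mg_ha: float         # yield monitor reading, Mg/ha
    speed_kmh: float           # combine travel speed (context, not yield)
    dist_to_boundary_m: float  # distance to nearest soil-polygon or field edge


def in_integrity_zone(p, speed_range=(4.0, 8.0), buffer_m=20.0):
    """Return True when context alone suggests the point is trustworthy.

    Thresholds are illustrative assumptions, not values from the paper.
    Note that the yield value itself is never inspected.
    """
    lo, hi = speed_range
    return lo <= p.speed_kmh <= hi and p.dist_to_boundary_m >= buffer_m


def mean_with_ci(values, z=1.96):
    """Mean and an approximate 95% CI (normal approximation)."""
    m = statistics.mean(values)
    half_width = z * statistics.stdev(values) / math.sqrt(len(values))
    return m, (m - half_width, m + half_width)


# Synthetic points for one soil-type polygon (made-up numbers).
points = [
    YieldPoint(11.8, 6.0, 45.0),
    YieldPoint(12.1, 6.2, 60.0),
    YieldPoint(12.4, 5.9, 38.0),
    YieldPoint(11.9, 6.1, 52.0),
    YieldPoint(6.3, 2.1, 5.0),    # slowing down near a headland
    YieldPoint(8.7, 3.0, 12.0),   # slow and inside the buffer
    YieldPoint(15.9, 9.5, 30.0),  # travel speed above the range
    YieldPoint(12.0, 6.0, 10.0),  # plausible yield, but inside the buffer
]

raw = [p.yield_mg_ha for p in points]
diz = [p.yield_mg_ha for p in points if in_integrity_zone(p)]

raw_mean, raw_ci = mean_with_ci(raw)
diz_mean, diz_ci = mean_with_ci(diz)
print(f"raw: n={len(raw)} mean={raw_mean:.2f} CI=({raw_ci[0]:.2f}, {raw_ci[1]:.2f})")
print(f"DIZ: n={len(diz)} mean={diz_mean:.2f} CI=({diz_ci[0]:.2f}, {diz_ci[1]:.2f})")
```

Because the filter looks only at context, the last sample point is dropped even though its yield looks plausible. That is the design trade-off the abstract describes: data from error-prone areas are removed in bulk rather than judged value by value.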

Source journal

Applied Engineering in Agriculture (Agricultural and Forestry Sciences: Agricultural Engineering)

CiteScore: 1.80

Self-citation rate: 11.10%

Articles published:

Review time: 6 months

Journal description: This peer-reviewed journal publishes applications of engineering and technology research that address agricultural, food, and biological systems problems. Submissions must include results of practical experiences, tests, or trials presented in a manner and style that will allow easy adaptation by others; results of reviews or studies of installations or applications with substantially new or significant information not readily available in other refereed publications; or a description of successful methods or techniques of education, outreach, or technology transfer.