Data Filtering and Correction Techniques for Generating Yield Maps from Multiple-Combine Harvesting Systems

S.A. Shearer, S.G. Higgins, S.G. McNeill, G.A. Watkins, R.I. Barnhisel, J.C. Doyle, J.H. Leach, and J.P. Fulton


Shearer, S.A., S.G. Higgins, S.G. McNeill, G.A. Watkins, R.I. Barnhisel, J.C. Doyle, J.H. Leach, and J.P. Fulton. 1997. Data Filtering and Correction Techniques for Generating Yield Maps from Multiple Combine Systems. ASAE Paper No. 971034. Annual International Meeting, Minneapolis, Minnesota, August 10-14.

 

Abstract

Yield maps generated from data obtained from multiple-combine harvest systems reflect variations that may be a result of calibration practices or operator influence. Post-processing of the data enabled the generation of yield maps in which the influence of calibration practices, operator technique, and the nature of the sensing methodologies was minimized. A systematic approach is proposed to eliminate suspect data and to scale the parameters of interest from each combine to match scaled weights of grain leaving the field. Data from four combines operated in a Central Kentucky field during the 1997 wheat harvest are presented.

Introduction

The advent of the Global Positioning System (GPS) has led to interest in geographic referencing of yield data for agricultural crops. GPS is a space-based radio-navigation facility that was developed by the U.S. Department of Defense (DoD). By combining both position and yield data, grain producers can generate yield maps. Yield monitors, a new product in the area of agricultural machinery, have been developed to allow grain producers to assess the effects of their management skills on grain production. While producers and managers debate the use of after-market versus factory- or dealer-installed monitors, the accuracy of these devices depends on appropriate installation, calibration and operation. The intent of this manuscript is to suggest criteria to aid in the generation of yield maps from data acquired using yield monitors. Ultimately, any additional income to be realized by incorporating a yield monitor in an operation will come from changes in crop production practices that result from identification of problem areas via review of accurate yield maps. Therefore, it is essential that crop production managers have confidence in the quality of yield maps generated from data acquired during the yield monitoring process.

Background and Review

Development of yield monitoring capabilities continues to evolve. Essentially two portions of the approach have been the subject and focus of recent research efforts in this area. Numerous efforts have been undertaken to evaluate and develop mass and volumetric grain flow sensing strategies for the combine. It is also widely acknowledged that the mass flow of material through a combine affects the resulting yield determination, and is in part responsible for a portion of variability in the yield map. Many studies have been conducted to develop reconstruction and filtering techniques to reduce the effect of combine dynamics and calibration inconsistencies in the yield maps. A brief description of pertinent studies is included to provide basis and justification for the work reported within this manuscript.

De Baerdemaeker et al. (1985) reported on the development of an impulse type grain sensor which is the basis of the majority of yield monitoring devices in use today. The impulse flow-measuring device was constructed of a 90-degree, long radius elbow with a load cell located at the outside periphery and in the middle of the bend. The elbow radius was four times the radius of the tube. The tube measured 12 cm in diameter and was fed with flowing grain from above such that the acceleration was from a vertical to horizontal direction. This momentum change was recorded as impulses using a load cell attached to and supporting the elbow. Under laboratory test conditions the authors noted measurement errors that varied with grain moisture content, flow rate, and inclination of the elbow-load cell instrument. Vansichen and De Baerdemaeker (1995) expanded on the initial work by exchanging the original tube configuration for a curved plate. The authors proposed and evaluated an indirect calibration approach that used a linear relationship to correct the mass flow sensor data. A low-pass filter was used to condition the sensor signal, resulting in a worst case estimate of yield error of 6% for this modified device.

Additional efforts in the area of flow sensor development have been reported since the work of De Baerdemaeker et al. (1985). Wagner and Schrock (1989) reported on the development of a pivoted auger sensor for determination of grain mass flow rate on a combine. Opposite the pivoted end of the horizontal auger a load cell was fixed to record differences in mass flow rates. Low-pass filtering of the load cell signal was required to overcome the signal noise generated by rotating elements on the combine. Comparisons of measured versus actual plot yields (total mass) revealed error rates of less than three percent. Colvin (1990) reported on the development of a weigh bin system for combines. No indication of system accuracy was given in this report. Stafford et al. (1991) detailed the performance of a capacitance-based sensing technology for grain flow rates at the discharge of the bin-loading auger. In addition, the same authors reported on the performance of a nucleonic device. The capacitance-based sensing method relied on the capacitance change of an air-grain mixture to determine mass flow rate. Of the two devices it was noted that the capacitance-based sensing device required more frequent calibrations. The accuracy of either device was found to be acceptable. Strubbe et al. (1996) reported the development of an optical volumetric flow rate sensing method. An array of four optical sensors was mounted transverse to paddle travel in a clean grain elevator. Light circuit interruptions were correlated to volumetric grain flow. The maximum deviation from sample regression was determined to be nine percent, which represented an improvement over a single sensor array where the maximum deviation was 13 percent.

Studies comparing yield monitoring methodologies have been somewhat limited. However, Auernhammer et al. (1993) reported on the evaluation of two yield measurement devices over a two-year period while operated on 300 ha of small grains. The first yield measurement system was a volumetric device, the same as investigated by Bae et al. (1989). The second device used radiometric principles to evaluate mass flow rate. Eighty percent of the measurement error with the radiometric device appeared randomly distributed while 50% of the error associated with the volumetric device could be attributed to the operator and calibration. Measurement accuracy was determined to be nearly identical for either system.

Bae et al. (1987) first merged the concepts of grain volumetric flow measurement and position determination at harvest for the purposes of logging data to support the generation of yield maps for grain sorghum. Combine position at harvest was determined using a microwave system with fixed and mobile transponders. The authors also reported on the correction of yield data by smoothing the grain flow data, and by modeling and assessing the combine dynamics. Determination of moisture content at harvest was not addressed in this manuscript. The authors did report that "moderate" accuracy was achievable. However, they also noted that yields in low yielding areas of the field were overestimated by 25%, resulting from the averaging effect of the combine dynamics. Additional details of the combine modeling work were reported in a subsequent publication (Searcy et al., 1989). A noteworthy highlight of either reference was the quantification of the time constants for grain flow rates as the threshing/separating mechanism was loaded or unloaded as the machine entered or exited the standing crop. Grain flow rates were modeled as a first order system with time constants and transport delays. The time constants were estimated at 2 and 10 seconds, respectively, for the combine upon entering and exiting the standing crop.
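The first order model with transport delay described above can be sketched as follows. This is only an illustration of the model form: the 2 s and 10 s time constants are the values quoted above, while the 3 s transport delay, the 1 s time step, the forward-Euler discretization, and the function name are assumptions made for the example and are not taken from the cited work.

```python
def first_order_response(inputs, tau_rise, tau_fall, delay_steps, dt=1.0):
    """Simulate sensed grain flow for a sequence of true input flow rates.

    A discrete first order lag is applied, using one time constant while the
    output is rising (entering the crop) and another while it is falling
    (exiting the crop). The lagged signal is then shifted by a pure
    transport delay of `delay_steps` samples.
    """
    lagged = []
    y = 0.0
    for u in inputs:
        tau = tau_rise if u > y else tau_fall
        # Forward-Euler update of dy/dt = (u - y) / tau
        y += (dt / tau) * (u - y)
        lagged.append(y)
    # Transport delay: the sensor reads zero until delay_steps have elapsed
    return ([0.0] * delay_steps + lagged)[: len(lagged)]


# Hypothetical pass: 20 s in the crop at unit flow, then 30 s out of the crop
true_flow = [1.0] * 20 + [0.0] * 30
sensed = first_order_response(true_flow, tau_rise=2.0, tau_fall=10.0, delay_steps=3)
```

With these assumed values the sensed flow rises quickly after the delay but decays slowly once the machine leaves the crop, which is the kind of averaging effect that leads to overestimated yields in low yielding areas.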

Efforts to reconstruct yield data and filter or smooth erroneous yield data have been the subject of several recent investigations. Stout et al. (1993) explored averaging and modeling techniques for the generation of yield maps from yield monitor data. The researchers considered the application of both first and second order models to describe grain flow in the combine. It was determined that variations in time constants for the first order model were too great relative to the mean, and therefore an average time constant was inappropriate. Investigation of the second order model, treating the combine as a spring-mass-damper system, resulted in unstable reconstruction of the yield data. The authors noted that this model was the best fit for grain flow rate data. A more significant element of their work involved the investigation of averaging techniques. They applied both arithmetic and moving average techniques to filter the data. Actual corn yields were found to be best approximated using either fourth order or 10 second moving averages.

Birrell et al. (1995) investigated several models to reconstruct instantaneous grain yields from yield monitor data. Two monitoring configurations were compared in this investigation: impact-based and volumetric measuring devices. Simple time delay and first order system models were both used to correct the data from either sensing methodology. The first order model created considerable noise in the corrected flow rate when compared to the simple time delay model reconstruction. Data smoothing prior to use of the time delay model reduced the noise. The authors concluded that while the first order system model more closely approximated the expected step change in input and did a good job of reconstructing the step input when entering the field, either model was acceptable for data reconstruction. Yields calculated for short periods with high accelerations were unreliable and should probably be disregarded. Perez-Munoz and Colvin (1996) investigated the interaction of measured parameters including moisture, ground speed, elevator speed, and grain impact force on the accuracy of yield prediction under both laboratory and field conditions. In the laboratory the correlation of calculated versus actual yield was reported to be 0.99. Correlation coefficients of 0.82 to 0.98 were reported from the field investigation portion of this work. The authors concluded that an impact-based yield sensing system is "a good tool to obtain yield estimates for fields as it was originally marketed."

Objectives

1) To develop post-processing criteria for excluding grain yield monitor data that is of questionable quality prior to generation of yield maps.

2) To develop a post-processing methodology to scale and integrate yield monitor data from multiple combines operated within the same field prior to generation of yield maps.

Overview of Commercial Yield Monitors

Yield monitors are a combination of several components including: a grain mass-flow or volumetric-flow sensor, moisture sensor, ground speed sensor, separator speed sensor, data storage device, an integral user interface (display and key pad) and control box. When yield data is coupled with information generated by a Differential Global Positioning System (DGPS) receiver, yield maps can be plotted. The integration and interaction of these components is controlled via a microprocessor and sensors are interfaced using both analog to digital, and direct digital inputs.

The most essential element of any yield monitoring system is the device used to assess the weight or volume of clean grain moving through the separator. While the nature of mass or volumetric-flow sensing may vary by manufacturer, the location of these devices is nearly universal. Most devices are placed at or near the top of the clean grain elevator. The intent of these devices is to measure the mass or volumetric-flow of grain. Mass-flow, the most common approach, is detected by assessing the impact force of grain hitting a plate. The grain is accelerated by the paddles of the clean grain elevator as the chain makes a 180° turn at the top of the elevator. It is the centrifugal acceleration as the paddles make this turn that causes the grain to separate from the paddles. Grain is caught by the enclosure at the top of the elevator and falls toward the base of the bin loading or fountain auger. Impact style sensors are located at the position in the clean grain elevator where the grain impacts the elevator housing. These sensors are composed of beams instrumented with either strain gages or a linear potentiometer to measure the deflection of the beams that support the plate.

Essential to the yield monitoring process is the determination of both ground and separator speeds. Separator speed is determined using a simple magnetic sensor on one of the shafts of the separator. This sensor generates a square wave with a frequency that is proportional to the speed at which a ferrous block passes the magnetic pick-up. For mass flow rate calibration purposes, the speed of the elevator chain must be known. By monitoring the speed of a shaft that is directly coupled to the elevator drive, a simple ratio can be used to determine the frequency of paddles passing the impact plate.

Harvested area is a product of distance traveled and effective header width. Distance traveled is simply a product of ground speed and sampling time. Ground speed or velocity is easily measured using the magnetic pick-up provided by the combine manufacturer. A square wave is generated as gear teeth within the transmission pass the magnetic pick-up. Ground speed is a constant multiple of this frequency. This constant is determined during the calibration phase of the yield monitor set up. An alternative is ground speed radar which is preferable when compared to the magnetic pick-up because errors from wheel slip are minimized especially in wet field conditions.
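The relationships in the preceding paragraph can be written out as a short sketch. The function names, the calibration constant expressed in meters per pulse, and the numerical values in the example are illustrative assumptions; they are not taken from any particular yield monitor.

```python
def ground_speed_m_per_s(pickup_hz, calibration_m_per_pulse):
    """Ground speed as a constant multiple of the pick-up frequency.

    calibration_m_per_pulse is the constant found during yield monitor
    calibration (assumed here to be expressed in meters per pulse).
    """
    return pickup_hz * calibration_m_per_pulse


def harvested_area_ha(speed_m_per_s, cycle_s, header_width_m):
    """Area harvested in one logging cycle, in hectares."""
    distance_m = speed_m_per_s * cycle_s           # distance = speed x sampling time
    return distance_m * header_width_m / 10_000.0  # 10,000 m^2 per hectare


# Hypothetical example: 100 Hz pick-up signal, 0.02 m per pulse,
# 1 s logging cycle, 30 ft (9.144 m) header
speed = ground_speed_m_per_s(100.0, 0.02)
area = harvested_area_ha(speed, 1.0, 9.144)
```

Because the area term appears in the denominator of the yield calculation, any error in ground speed or effective header width carries directly into the estimated yield.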

More recently, yield monitor manufacturers have begun using the velocity determined using a GPS receiver. However, if the GPS speed determination is to be used a functioning GPS receiver must be present. Secondly, GPS speed determination is subject to the same errors as position determination. The distance traveled by the combine during a sampling period is then found by multiplying the ground speed by the interval of the sampling period.

Moisture content determination is accomplished by sensing the dielectric properties of the harvested grain. The level of moisture within the grain affects the grain’s ability to store an electrical charge, or what is more commonly known as capacitance. Sensing of capacitance, or more appropriately the impedance of the grain which can be related to the capacitance, is accomplished by confining a predetermined volume of grain between two conductive metal surfaces. For most yield monitors this is accomplished by installation of a moisture sensor in the tank loading auger. The moisture sensor is for the most part an aluminum fin set in a non-conductive plastic to electrically isolate it from the auger tube. When operating in weedy or green small grains, any build-up of dirt or plant residue on this fin must be removed to ensure accurate moisture sensing. While the fin provides one electrically conductive surface, the steel auger tube provides the second.

Integration of the sensors, conversion of their output signals into numerical data for storage and later use, sensor calibration, GPS receiver interface, external data storage devices and user interface are incorporated into one or more boxes which are located in the cab of the combine. Controlling the interaction of all these devices is a microprocessor. Reconfiguration and upgrading of the yield monitor is done by downloading new computer code to the yield monitor. This is the area where the differences from manufacturer to manufacturer are most significant. To a large extent the success of any attempt to monitor yields depends on the combine operator’s ability to master the calibration and operation of the yield monitor. Thus the user interface is a crucial part of the overall process.

Data Acquisition

Data for this manuscript were acquired from Worth and Dee Ellis Farms in Shelby County, located in the Outer Blue Grass Region of Kentucky. The farm owners owned and operated a late model Gleaner R70 combine with a 30-foot-wide small grain platform and an Ag Leader 2000 yield monitor. In addition, three custom operators were hired, and paid in accordance with harvested area as determined using yield monitors. The machines operated by the custom harvesters included a Case IH model 1680 combine with an Ag Leader 2000 yield monitor, and two John Deere 9500 combines equipped with GreenStar yield monitors. The Case IH combine used a 25-foot-wide platform, as did one of the John Deere 9500s. The remaining John Deere 9500 combine was equipped with a 20-foot-wide platform.

All four machines were operated from July 3 through July 10 on 1550 acres of soft red winter wheat. Grain handling limitations required that all four combines be operated within the same field. Field sizes ranged from 3 acres to 174 acres, with an average size of 24.7 acres. The land of the Outer Blue Grass Region consists of silt loam soils in a rolling terrain. Many of the fields were aligned with a central ridge sloping toward the edge on either side. Elevation differences of 50 to 60 feet within a single field were not uncommon. The majority of fields were irregularly shaped and contained numerous grassed waterways which further defined the field boundaries.

Data from the yield monitors were downloaded via PCMCIA cards on a daily basis. Files were exported from the respective software packages in advanced file formats. Parameters of interest included grain mass flow rate and distance traveled per cycle, moisture contents, swath or header width, latitude, and longitude. These data were replicated for each cycle. Data were logged at cycle times of 2 s for the GreenStar yield monitors, and at 1 s intervals for the Ag Leader 2000 yield monitors. All data files were imported into a popular mapping package to verify that data were contained within the boundaries of respective fields.

In addition scale weights were obtained from all trucks leaving a particular field. Composite samples were extracted from flowing grain at the exit of hopper bottom trailers at random intervals to obtain a representative sample for analysis. On-farm grading consisted of test weight and moisture content determination. Moisture determinations were made using a Motomco Model 17 moisture meter. Temperature corrections were noted and applied to the resulting moisture contents. These data are presented in Table 1 along with field areas and average yields.

Data Filtering and Scaling

Yield is calculated as mass flow rate divided by harvested area. Mass flow must be corrected for moisture content. The correction is the ratio of percent dry matter at harvest to percent dry matter for marketing. Dry bushels in the case of wheat would contain 86.5 percent dry matter by weight. The exclusion of yield data should therefore be made on the basis of valid mass flow rate, harvested area and moisture content. It may be argued that moisture contents below 9 percent for wheat in Central Kentucky would be uncommon, and therefore these values should be corrected. Harvested area is a product of distance traveled and header width. From the authors' experiences, header width adjustments by the combine operators were infrequent. In general they either discontinued data logging, or made no effort to account for point rows or narrow trips. With respect to area, suspect data may result from low ground speeds where the distance traveled per cycle time is minimal. Large mass flow rates combined with short cycle distances result in large, and often unrealistic, yields. Cycle distance is determined by one of two principal methods: GPS positioning, or magnetic pick-ups to assess wheel speed. Either sensing method has its own unique limitation for assessing harvested area.
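The per-cycle yield calculation described above can be sketched as follows. The 86.5 percent dry matter basis is from the text; the units (mass flow in kg/s, distances in meters, yield in Mg/ha), the function name, and the example values are assumptions made for illustration.

```python
MARKET_DRY_MATTER_PCT = 86.5  # wheat marketing basis stated in the text


def cycle_yield_mg_per_ha(mass_flow_kg_s, cycle_s, moisture_pct,
                          distance_m, width_m):
    """Moisture-corrected yield for one logging cycle, in Mg/ha."""
    if distance_m <= 0 or width_m <= 0:
        # Zero or negative area gives an undefined (or unrealistic) yield
        raise ValueError("cycle distance and header width must be positive")
    wet_mass_kg = mass_flow_kg_s * cycle_s
    # Ratio of percent dry matter at harvest to percent dry matter for marketing
    dry_matter_ratio = (100.0 - moisture_pct) / MARKET_DRY_MATTER_PCT
    market_mass_kg = wet_mass_kg * dry_matter_ratio
    area_m2 = distance_m * width_m
    return market_mass_kg / area_m2 * 10.0  # 1 kg/m^2 equals 10 Mg/ha


# Hypothetical cycle: 6 kg/s for 1 s at 13.5% moisture over 2 m x 9 m
example_yield = cycle_yield_mg_per_ha(6.0, 1.0, 13.5, 2.0, 9.0)
```

The explicit check on distance reflects the concern raised above: as cycle distance approaches zero, the computed yield grows without bound for any nonzero mass flow.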

An initial screening of the data will be conducted to locate and eliminate data points where the cycle distance is equal to zero. Division of the grain mass flow rate by zero, or near zero values, will result in unrealistically high values of estimated yield. In addition, data points with yields greater than 50 Mg/ha will also be excluded. This value was chosen as it represents a wheat yield an order of magnitude greater than one might expect in Central Kentucky. Data manipulation will be accomplished via the use of spreadsheet software. The generation of normal probability plots for each of the logged attributes, and the final yield, will be used to assess the fit of a normal distribution to these data. Normal distribution parameters, the mean and standard deviation, will be estimated from the data of each combine. Outlying data points as judged from the distributions of moisture content, mass flow rate, and cycle distance will be disregarded if they are outside three standard deviations of the mean. Normal probability plots will again be used to review the appropriateness of the normal distribution model for application to yield estimation in the yield monitoring process.
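The two screening steps above can be sketched in a few lines. The work reported here was carried out in spreadsheet software; the Python below is only an equivalent illustration. The record layout (dictionaries with hypothetical keys `distance_m`, `mass_flow`, `moisture_pct`, and `yield_mg_ha`) is an assumption, as is the choice to compute all three sets of limits from the same screened data for a single combine.

```python
from statistics import mean, stdev

MAX_YIELD_MG_HA = 50.0  # upper yield limit stated in the text
SIGMA_LIMIT = 3.0       # exclude points beyond three standard deviations


def initial_screen(records):
    """Drop points with zero cycle distance or yield above 50 Mg/ha."""
    return [r for r in records
            if r["distance_m"] > 0 and r["yield_mg_ha"] <= MAX_YIELD_MG_HA]


def three_sigma_filter(records,
                       fields=("moisture_pct", "mass_flow", "distance_m")):
    """Keep records within three standard deviations of the mean for
    every listed attribute. Intended to be run on one combine's data."""
    limits = {}
    for field in fields:
        values = [r[field] for r in records]
        mu, sd = mean(values), stdev(values)
        limits[field] = (mu - SIGMA_LIMIT * sd, mu + SIGMA_LIMIT * sd)
    return [r for r in records
            if all(limits[f][0] <= r[f] <= limits[f][1] for f in fields)]
```

A typical use would be `three_sigma_filter(initial_screen(combine_records))`, applied separately to the records from each combine so that the mean and standard deviation reflect that machine alone.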

Data filtered using the above approach will then be corrected by first assigning a bias or offset to each combine to adjust the average moisture contents so that they are in line with the field average moisture contents as determined by standard grading practices. Mass flow data from each combine will then be scaled so that the estimated field-average yield is consistent with the field average yield as determined by scaling trucks that exited the field. This assumes that the sum of the product of header width and cycle distance matches the actual field area. To this end the operators have adjusted the effective header width to account for individual operating style and data logging practices. The operators were encouraged to stop logging yield data when less than half of the header width was engaged in the crop. This effective width was checked periodically by allowing the operators to harvest a field of known area independent of other operators, and then comparing the yield monitor area values with previously determined values.
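A minimal sketch of the bias and scale factor calculations follows. It assumes that each combine is adjusted separately and that the combine-average yield is area weighted; the function names and the sample readings are hypothetical. The 22.1% moisture content and 3.46 Mg/ha yield used as targets are the values reported in the Results and Discussion section.

```python
def moisture_bias(combine_moistures_pct, field_avg_moisture_pct):
    """Offset to add to every moisture reading from one combine so that
    its average matches the graded field-average moisture content."""
    combine_avg = sum(combine_moistures_pct) / len(combine_moistures_pct)
    return field_avg_moisture_pct - combine_avg


def mass_flow_scale(cycle_yields, cycle_areas, field_avg_yield):
    """Multiplier for one combine's mass flow data so that its
    area-weighted average yield matches the scale-weight average."""
    total_mass = sum(y * a for y, a in zip(cycle_yields, cycle_areas))
    combine_avg = total_mass / sum(cycle_areas)
    return field_avg_yield / combine_avg


# Hypothetical readings from one combine
bias = moisture_bias([20.0, 21.0, 22.0], field_avg_moisture_pct=22.1)
scale = mass_flow_scale([3.0, 4.0], [1.0, 1.0], field_avg_yield=3.46)
```

Because yield is proportional to mass flow for a fixed area, multiplying the mass flow by this factor scales the estimated yield by the same amount.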

The final step in this process was to determine running averages for the data such that a length of traverse equal to or exceeding the width of the header was used to estimate yields. Stout et al. (1993) illustrated both a second-order averaging approach, and a running average method utilizing 10 consecutive data points. For the purposes of this work a running average of seven data points was felt to be a reasonable compromise between the header width criterion and the findings of Stout et al. (1993). A final comparison between the original data, and data manipulated using the filtering and scaling techniques proposed within, will be accomplished by visual inspection of the resultant dot yield maps.
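A seven-point running average can be sketched as shown below. The text does not state whether the window was centered or trailing, or how the ends of a pass were handled; the centered window that shrinks at the ends of the series is an assumption, as is the function name.

```python
def running_average(values, window=7):
    """Centered running average; the window is truncated at either end
    of the series so the output has the same length as the input."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        segment = values[lo:hi]
        smoothed.append(sum(segment) / len(segment))
    return smoothed


# Hypothetical yields (Mg/ha) along one harvest pass
pass_yields = [3.1, 3.4, 9.8, 3.3, 3.0, 3.6, 3.2, 3.5, 3.4]
smoothed = running_average(pass_yields, window=7)
```

As a rough check on the header width criterion, at the 5.15 km/h (about 1.43 m/s) reported later for the Gleaner R70 with 1 s logging, seven cycles cover roughly 10 m, slightly more than its 30 ft (about 9.1 m) platform.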

Results and Discussion

Figure 1a. is a gray-scale dot map of yield constructed from the original unfiltered and unadjusted data set. A significant and undesirable characteristic of this map is the variability associated with each harvest pass. The border of the field was harvested by the first combine to enter the field, with remaining rounds or passes being harvested by alternating combines. This pattern is further illustrated in Figure 2, a plot of the traverse of the Case IH 1680 combine. Point to point and between harvest variability appears to be greater than one might expect when reviewing yield generated by individual combine harvests. A further review of summary statistics generated from individual combine data sets, as will be shown later, supports the belief that machine/operator dependent variability has been introduced to this map.

To begin the data filtering process the zero cycle distance and greater than 50 Mg/ha yield criteria were applied to the data set. Of 17,881 original data points, 195 points were excluded using the zero cycle distance criterion, and seven additional points were excluded using the maximum yield criterion. Table 1 summarizes exclusions by combine. Initial filtering of the data reduced the size of the data set by 1.13%.

Data were then assessed using normal probability plots for each logged attribute, and then for estimated yield. Figure 3 contains the probability plots generated from the raw data (minus the initial screened values as noted above) for the Gleaner R70 combine. Surprisingly, the data for the most part appear normally distributed. While several deviations from linearity are noted in the mass flow and cycle distance plots, excellent normal fits are indicated in the moisture content and yield approximation plots. Table 2 summarizes estimates of the mean and standard deviation derived by fitting the normal distribution to these data. Further evidence of the normal distribution fit is provided when reviewing the tight range of the confidence intervals, although this is due in part to the size of the data sets.

A further review of Table 2 provides additional insight into machine and operator induced variability. Means of the estimated yield for the Gleaner R70 and John Deere 9500 (II) differed by 0.41 Mg/ha, or 12%. This difference is significant in that mass flow sensors on either machine were calibrated prior to entering Field 25. The difference between field-average yields reported for the Case IH 1680 and the John Deere 9500 (I) combines was 1.48 Mg/ha, or 36.9%. The between-machine variability is illustrated by comparing the standard deviations of yield for each machine. As noted in Table 3, the Case IH 1680 harvested 35.3% of the field whereas the John Deere 9500 (II) covered 46.7% of the field. When comparing the coefficient of variation for the estimated yields from either machine, the coefficient of the John Deere 9500 (II) is nearly twice that of the Case IH 1680. The combination of harvested areas for these two machines represents 82.0% of the field. Also of interest is the implied difference in ground speed between the custom harvesters and the owner-operated machine. The ground speed of the Gleaner R70, the owner-operated machine, was 5.15 km/h. Speed for the custom operators ranged from 6.80 to 8.64 km/h, further supporting the contention that some of the yield variation may be operator dependent.

Further filtering of the data was accomplished by excluding data that fell outside three standard deviations of the mean. Applying this criterion to the cycle distance, 319 points, or 1.78% of the data set, were excluded. The mass flow criterion reduced the data set by 84 entries and the moisture content criterion by 210. Totaling all of the filtered observations (815), the data sets were reduced by a total of 4.56%. Normal probability plots of the filtered data for the Gleaner R70 combine are presented in Figure 4. As one might suspect, the probability plots appear more linear. Figure 5 shows a comparison of the probability distributions for the filtered, estimated yield for all combines. Noteworthy is the non-linear appearance of the Case IH 1680 estimated yield probability plot.

Moisture offsets or biases and scale factors are presented in Table 4. The moisture biases were set equal to the difference between the field-average moisture content determined by sampling grain from trucks exiting the field and the average moisture content logged by each combine. The grand average moisture content of wheat harvested from this field was determined to be 22.1%. Scale factors for the mass flow were then determined using a field average yield of 3.46 Mg/ha. It should be noted that this value is not a true field average; rather, it is a whole farm average. An error in logging scaled truck weights precluded the use of a true field average. However, the 3.46 Mg/ha value illustrates the application of the methodology proposed within.

Figures 5 is presented to illustrate the effect of the bias introduced to the moisture content data. In this case the between-combine variation is practically eliminated in the latter figure. Figure 1b. presents a gray-scale dot plot of the filtered and corrected data. This figure can be compared directly with Figure 1a. as all of the map parameters remain consistent from map to map, with the exception of the estimated yield data used to construct either map. Once again the appearance of machine and operator induced variability is practically eliminated.

The framework proposed within this manuscript is intended to be a starting point for software developers who desire to meet the needs of their clientele who generate and manage yield data from multiple-combine harvest systems. Additional work is needed to address the issue of machine-induced variability. Prior to that work, a quantification of machine-induced versus actual in-field yield variations is needed. Yield monitors in general are very good devices for measuring relative differences in yields. However, extra caution is urged when merging data from multiple machines to generate yield maps that represent the true yield variability within a field.

Conclusions

  1. Accurate yield monitoring calibration techniques are essential in multiple-combine harvest for the generation of yield maps.
  2. Post-processing techniques can reduce the influence of calibration, operator and machine related variability in the construction of yield maps.
  3. Yield monitor raw data (i.e., mass flow, cycle distance and moisture content) appear to be normally distributed as do the resulting estimated yields.

Acknowledgements

The authors of this manuscript gratefully acknowledge and appreciate the interest and cooperation of Mike, Bob and Jim Ellis, the owners and operators of Worth and Dee Ellis Farms in Shelby County, Kentucky. We are also indebted to the combine operators, Bill Fraizer, Gene Lacoumpt, Jack Trumbo and Steve Clark, for their patience and attention to details at harvest.

References

Auernhammer, H., M. Demmel, K. Muhr, J. Rottmeier, and K. Wild. 1993. Yield measurements on combine harvesters. ASAE Paper No. 93-1506. St. Joseph, Michigan: ASAE.

Bae, Y.H., S.C. Borglet, S.W. Searcy, J.K. Schueller, and B.A. Stout. 1987. Determination of spatially variable yield maps. ASAE Paper No. 87-1533. St. Joseph, Michigan: ASAE.

Birrell, S.J., S.C. Borglet, and K.A. Sudduth. 1995. Crop yield mapping: Comparison of yield monitors and mapping techniques. In Proceedings of Site-specific Management for Agricultural Systems, ed. P.C. Robert, R.H. Rust, and W.E. Larson, 15-31. Minneapolis, Minnesota, March 27-30.

Colvin, T.S. 1990. Automated weighing and moisture sampling for a field plot combine. Applied Engineering in Agriculture. 6(6):713-714.

De Baerdemaeker, J. Decroix, R. Lindemans. 1985. Monitoring the grain flow on combines. In Proceedings of the Agrimation I Conference, 329-338. Chicago, Illinois, February 25-28.

Perez-Munoz, F., and T.S. Colvin. 1994. Continuous grain yield monitoring. ASAE Paper No. 94-1053. St. Joseph, Michigan: ASAE.

Searcy, S.W., J.K. Schueller, Y.H. Bae, S.C. Borglet, and B.A. Stout. 1989. Mapping of spatially variable yield during grain combining. Transactions of the ASAE. 32(3):826-829.

Stafford, J.V., B. Ambler, and M.P. Smith. 1991. Sensing and mapping grain yield variation. In Proceedings of Automated Agriculture for the 21st Century, 356-365. Chicago, Illinois, December 16-17.

Stout, B.L., S.C. Borglet, and K.A. Sudduth. 1993. Yield determination using an instrumented Claas combine. ASAE Paper No. 93-1507. St. Joseph, Michigan: ASAE.

Vansichen, R., and J. De Baerdemaeker. 1991. Continuous wheat yield measurement on a combine. In Proceedings of Automated Agriculture for the 21st Century, 346-355. Chicago, Illinois, December 16-17.

Wagner, L.E., and M.D. Schrock. 1989. Yield determination using a pivoted auger flow sensor. Transactions of the ASAE. 32(2):409-413.