Steve E. Naranjo
Arid-Land Agricultural Research Center, USDA-ARS, Maricopa, AZ
William D. Hutchison
Department of Entomology, University of Minnesota, St. Paul, MN
The Resampling Approach
Resampling for Validation of Sampling Plans:
Enumerative Sampling Plans
Binomial Sampling Plans
Field Data Requirements
General Input Requirements
Sample Plan Input Requirements
Sets of tools for sample plan evaluation have recently been developed (e.g., Nyrop & Binns 1991). These Monte Carlo simulations can be used to evaluate sampling models during the developmental phase; however, they may not be adequate for testing model validity and performance under field conditions. This is primarily because they assume an underlying statistical distribution (e.g., negative binomial, normal) that may not adequately represent the actual distributions of insects in all instances. Here we present a method in which actual field data are resampled to evaluate sample plan performance (Hutchison 1994, Hutchison et al. 1988, Naranjo and Hutchison 1997). We originally developed DOS-based computer software for this purpose. A new version, which runs as an Excel Plug-in, is now available. This new version has the same functionality, and the general instructions below will still assist you in using RVSP. Please click the Help button on the new Plug-in for additional instructions.
Limitations: The major limitation of a resampling approach is that additional field data, independent of those used to develop the sampling plan, must be collected. Ideally, these independent data should also cover the range of population densities under which the sample plan will likely be used. Often, this task can be accomplished by withholding a certain amount of data during the developmental phase. We are currently evaluating the amount of data (both the number of data sets and the number of observations per data set) necessary to perform a robust analysis of sample plan performance.
Basically, RVSP randomly selects observations from an actual data set until either the sequential stopping rule is satisfied or a fixed number of samples has been drawn, depending on the sample plan being tested. This process is repeated numerous times (default=500) for each data set. Based on these iterations, RVSP then calculates averages and variances for precision and sample size, as well as operating characteristics for the binomial plans, which classify population densities relative to a specified threshold. RVSP provides both summary and detailed output that can be easily imported into spreadsheet and graphical programs for further examination and analysis.
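The resampling loop described above can be illustrated with a short Python sketch. This is not the RVSP source code: the function names, the stop_rule callback, the max_n cap, and the choice to sample with replacement are all assumptions made for illustration. Precision is computed as the SE-to-mean ratio of each resampled set of observations.

```python
import math
import random
import statistics


def resample_once(data, stop_rule, max_n, rng):
    """Draw observations at random until stop_rule(n, total) is true or max_n is reached."""
    draws = []
    total = 0
    while len(draws) < max_n:
        x = rng.choice(data)  # assumption: sampling with replacement
        draws.append(x)
        total += x
        if stop_rule(len(draws), total):
            break
    return draws


def resample_summary(data, stop_rule, iterations=500, max_n=1000, seed=0):
    """Repeat the resampling and summarize sample size and precision (SE/mean)."""
    rng = random.Random(seed)
    sizes, precisions = [], []
    for _ in range(iterations):
        draws = resample_once(data, stop_rule, max_n, rng)
        n, mean = len(draws), statistics.mean(draws)
        sizes.append(n)
        if n > 1 and mean > 0:  # precision is undefined otherwise
            se = statistics.stdev(draws) / math.sqrt(n)
            precisions.append(se / mean)
    return {
        "mean_n": statistics.mean(sizes),
        "var_n": statistics.variance(sizes),
        "mean_precision": statistics.mean(precisions) if precisions else None,
        "var_precision": statistics.variance(precisions) if len(precisions) > 1 else None,
    }
```

A fixed-sample-size plan of 50 samples would correspond to a rule such as stop_rule = lambda n, total: n >= 50, whereas a sequential plan would compare the running total with its stop line.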
Green's Plan: Green's (1970) fixed-precision sampling algorithm uses Taylor's power law, S^2 = am^b, to model the relationship between the mean (m) and variance (S^2). The sequential sampling model is given as: Tn = (an^[1-b]/D^2)^[1/(2-b)] where Tn is the cumulative count from n samples, and D is precision (SE to mean ratio).
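As a check on the formula, Green's stop line can be written as a small Python function. This is a sketch only; the function names and the parameter values in the usage note are hypothetical and are not taken from RVSP. The second function computes the precision that Taylor's power law predicts when the cumulative count sits exactly on the stop line, which should equal D.

```python
import math


def green_stop_line(n, a, b, D):
    """Cumulative count Tn on Green's stop line after n samples (requires b != 2)."""
    return (a * n ** (1.0 - b) / D ** 2) ** (1.0 / (2.0 - b))


def predicted_precision(n, Tn, a, b):
    """SE-to-mean ratio implied by Taylor's power law at mean m = Tn / n."""
    m = Tn / n
    variance = a * m ** b  # S^2 = a m^b
    return math.sqrt(variance / n) / m
```

For example, with the hypothetical values a = 2.0, b = 1.4 and D = 0.25, predicted_precision(20, green_stop_line(20, 2.0, 1.4, 0.25), 2.0, 1.4) returns 0.25 to within rounding error.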
Kuno's Plan: Kuno's (1969) fixed-precision sampling algorithm uses Iwao's patchiness regression, m* = a + bm, to model the mean-variance relationship, where m* is Lloyd's mean crowding index and the variance, S^2 = (a+1)m + (b-1)m^2. The sequential sampling model is given as: Tn = (a + 1)/(D^2 - [b - 1]/n)
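Kuno's stop line can be sketched the same way. The function names below are hypothetical and the code is illustrative rather than the RVSP implementation. One consequence of the formula is worth noting: the denominator is positive only when n > (b - 1)/D^2, so for b > 1 the stop line is undefined until a minimum number of samples has been taken.

```python
import math


def kuno_stop_line(n, a, b, D):
    """Cumulative count Tn on Kuno's stop line; infinite where the line is undefined."""
    denom = D ** 2 - (b - 1.0) / n
    if denom <= 0:
        return math.inf  # too few samples for precision D to be attainable
    return (a + 1.0) / denom


def min_sample_size(b, D):
    """Smallest n for which the stop line is finite, i.e. n > (b - 1) / D^2."""
    return max(1, math.floor((b - 1.0) / D ** 2) + 1)
```

With Iwao's b = 2.03 and D = 0.25, min_sample_size(2.03, 0.25) gives 17, so no stopping decision is possible before the 17th sample.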
Wald's SPRT: Wald's (1947) sequential plan allows population density to be classified as either above or below a threshold density. For binomial count data, the upper and lower sampling stop lines are defined as: Tn = Bx ± A where x is the number of sample units examined, Tn is the cumulative number of units infested with at least t insects, and B and A are parameters derived as standard functions of the specified type I and II error rates and of the upper and lower boundaries bracketing the threshold density.
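The text does not spell out the functions for B and A. The Python sketch below uses the textbook form of Wald's binomial SPRT and should be read as an assumption about what "standard functions" means here, not as the RVSP code. The names p0 and p1 (lower and upper boundary proportions), alpha and beta (error rates), and both function names are introduced for illustration. In this general form the two intercepts differ; they reduce to the symmetric ±A only when alpha equals beta.

```python
import math


def sprt_binomial_lines(p0, p1, alpha, beta):
    """Return (slope, lower_intercept, upper_intercept) for Wald's binomial SPRT."""
    if not 0 < p0 < p1 < 1:
        raise ValueError("require 0 < p0 < p1 < 1")
    k = math.log(p1 * (1 - p0) / (p0 * (1 - p1)))
    slope = math.log((1 - p0) / (1 - p1)) / k    # B in the text
    lower = math.log(beta / (1 - alpha)) / k     # negative intercept
    upper = math.log((1 - beta) / alpha) / k     # positive intercept
    return slope, lower, upper


def sprt_decision(x, Tn, slope, lower, upper):
    """Classify after x sample units with Tn infested units, or keep sampling."""
    if Tn >= slope * x + upper:
        return "above"
    if Tn <= slope * x + lower:
        return "below"
    return "continue"
```

Which error rate is called type I and which type II varies between authors, so the roles of alpha and beta should be checked against the plan being validated.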
Fixed-Sample-Size Plan: This plan uses a user-specified number of samples to determine the proportion of sample units infested with at least t insects. This value can then be compared to a specified threshold level.
Wald's SPRT or Fixed-Sample-Size Plan
Green's or Kuno's Plan: The summary tabulates the mean, SD and n of the original data set, and the mean, SD, maximum, and minimum of precision and required sample size over all resampling iterations.
Wald's SPRT: The summary tabulates the mean, n and the proportion infested of the original data set, and the mean, SD, maximum, and minimum of proportion infested and required sample size over all resampling iterations. The operating characteristic (probability of taking no action) is calculated directly as the proportion of iterations in which the proportion infested exceeded the upper boundary. The OC function can be estimated by plotting these probabilities against the observed mean.
Fixed-Sample-Size: Similar to Wald's SPRT, with the exception that sample size statistics are not given and the OC is calculated as the proportion of iterations in which the proportion infested exceeded the threshold.
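The fixed-sample-size calculation can be sketched in Python as follows. The function name, its arguments, and the use of sampling with replacement are assumptions for illustration and do not come from RVSP. The function returns the proportion of iterations in which the resampled proportion infested exceeded the threshold, which is the quantity the text describes; one minus this value is the proportion of iterations at or below the threshold.

```python
import random


def fraction_above_threshold(counts, t, threshold, n, iterations=500, seed=0):
    """Proportion of resampling iterations whose proportion infested exceeds threshold."""
    rng = random.Random(seed)
    # A sample unit is scored as infested when it holds at least t insects.
    infested = [1 if c >= t else 0 for c in counts]
    above = 0
    for _ in range(iterations):
        sample = [rng.choice(infested) for _ in range(n)]  # assumption: with replacement
        if sum(sample) / n > threshold:
            above += 1
    return above / iterations
```

Repeating this for each field data set and plotting the result against the observed mean density gives an empirical curve of the kind described for the OC function.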
Hutchison, W. D. 1994. Sequential sampling to determine population density. P. 207-244. In L. Pedigo & G. Buntin (eds.), Handbook of Sampling Methods for Arthropods in Agriculture. CRC Press.
Hutchison, W. D., D. B. Hogg, M. A. Poswall, R. C. Berberet & G. W. Cuperus. 1988. Implications of the stochastic nature of Kuno's and Green's fixed-precision stop-lines: Sampling plans for the pea aphid (Homoptera: Aphididae) in alfalfa as an example. J. Econ. Entomol. 81: 749-758.
Kuno, E. 1969. A new method of sequential sampling to obtain the population estimates with a fixed level of precision. Res. Popul. Ecol. (Kyoto) 11: 127-136.
Naranjo, S. E. & H. M. Flint. 1995. Spatial distribution of adult Bemisia tabaci in cotton and development and validation of fixed-precision sequential sampling plans for estimating population density. Environ. Entomol. 24: 261-270.
Naranjo, S. E., H. M. Flint & T. J. Henneberry. 1996. Binomial sampling plans for estimating and classifying population density of adult Bemisia tabaci in cotton. Entomol. Exp. Appl. 80: 343-353.
Naranjo, S. E. & W. D. Hutchison. 1997. Validation of arthropod sampling plans using a resampling approach: software and analysis. Am. Entomol. 43: 48-57.
Nyrop, J. P. & M. Binns. 1991. Quantitative methods for designing and analyzing sampling programs for use in pest management. P. 67-132. In D. Pimentel (ed.), Handbook of Pest Management in Agriculture, Vol. 2. CRC Press.
Wald, A. 1947. Sequential Analysis. John Wiley & Sons, New York.
After you have copied all the files (see APPENDIX B) onto your hard drive, simply type RVSP to execute the program. You will see the startup screen, from which you can use the up and down cursor keys to highlight the menu item of interest. Start by highlighting Kuno's Plan and pressing the [ENTER] key. This will bring up another menu screen with 5 choices. Highlight Display/Modify Initialization and press [ENTER]. This brings up a third screen in which you can input data for a resampling analysis.
The cursor is automatically positioned in the input data file field. Type BATCH.DAT in this field and use the down arrow to move to the next field. For this example we will accept the default value of D=0.25. Continue by entering data into the rest of the fields as follows: Iwao's a = -0.53, Iwao's b = 2.03, and Output data file = SAMPLE.OUT. Use the default values for the remaining fields. When all the data have been entered, press the [ESC] key. This will register your input and return you to the previous menu.
At this point you could save the data you just entered in an ASCII file by selecting Save Initialization File. You will be prompted for a file name. You could also retrieve an existing initialization file if one had been previously saved. This is handy for repetitive runs and saves the time of re-entering the same data. You can verify the data entered at any time by selecting Display/Modify Initialization. Once you are satisfied that the data are correct, select Run Simulation. This will begin the resampling analysis.
Before resampling begins, you will be asked whether you want to print a copy of the summary table and whether you want the program to generate raw data files for each field data set specified in BATCH.DAT. These are handy if you want to do additional analysis beyond the summary statistics already provided. For this example answer Y (this will require about 600K of disk space for the 20 example field data files). RVSP will then begin resampling each data set in turn, flashing the name of the data set being resampled as it proceeds. RVSP has several error traps that permit continuation of the program if errors occur. These include checking to see that file names listed in BATCH.DAT exist, and testing the adequacy of individual data sets for analyses as previously discussed.
Once all data sets have been resampled you will be returned to the Kuno screen, and the summary output table will be saved to disk under the name SAMPLE.OUT. This same table will also be printed if you asked the program to do so. Pressing the [ESC] key at this point will return you to the main menu, where you can select another sample plan to test or exit the program. Work through the remaining sampling plans using the parameters given on the following summary tables. In all cases use BATCH.DAT as the input file, but rename the output file or RVSP will overwrite the summary file generated in a previous execution. Any valid DOS filename is acceptable.
RVSP20.xlam Excel Plug-in file
RVSP20.chm Excel help file
RVSP Installation Instructions
RVSPMAN.DOC Software documentation (V1.2/2.0)
BATCH.DAT Example input file
TEST183.92 Example field data set files
It is recommended that you create a separate directory for RVSP and load all these files in that directory. RVSP20.xlam and RVSP20.chm must be in the same directory.
Questions regarding this material can be directed to: Dr. Steve Naranjo email@example.com