
Statistical Quality Control for Year 2003 Water Well Measurements

by
John C. Davis


Kansas Geological Survey
University of Kansas
Open-file Report No. 2003-8
Released January 2003, Electronic version created Jan. 2005

Introduction

The year 2003 Quality Control and Assurance Program for observation well water-level measurements in western Kansas is patterned after the quality assurance techniques developed during annual field work and statistical analyses conducted since 1997. This discussion of procedures is adapted from Miller, Davis, and Olea (1997), incorporating adjustments in the program that were noted in Davis (2001).

The primary variable measured in the water well observation program is depth to water in an observation well. This primary variable is associated with three secondary variables: the ground elevation, east-west coordinate, and north-south coordinate of the well. The secondary variables serve to locate the primary variable in space, and make it possible to determine spatial relationships between observation wells, including mapping the water table and calculating changes in aquifer volume. Historically, the three location variables were determined initially by the U.S. Geological Survey for each well and not re-determined unless a serious error in the original coordinates was suspected. In the 1997 ground water observation measurement program conducted by the Kansas Geological Survey, the geographic (latitude and longitude) coordinates of all wells were re-determined by GPS techniques. In subsequent years' measurement programs, the locations of all observation wells were again re-determined by GPS. "Selective Availability," which limited the resolution of GPS measurements, was turned off by the Federal government in 2001, so locations determined that year were substituted for previous determinations. For a few locations where year 2001 GPS measurements were not taken, measurements made in 2002 are used.

In addition, several secondary characteristics of the observation wells and of the measurement procedure were noted in order to determine if these influence the quality of the measurements being made (these measurements are referred to as exogenous variables). As part of the quality control program, water level measurements were repeated two or more times on 175 wells, yielding a collection of 203 quality control observations. Because these data include replicates, they provide an additional check on estimates of the influence of well conditions or measuring techniques on water levels. A subsequent round of measurements resampled 50 wells selected at random from the original set for quality assurance purposes. These wells were measured two or more times for a set of 58 quality assurance values.

The primary variable, depth to water, changes with geographic location and differences in topography so much that these factors will overwhelm all other sources of variation. Because of this, any errors in location may have a profound effect on the water table elevation. To avoid the complications of simultaneously considering uncertainties in the secondary variables, this statistical quality control study is based on first differences (specifically, the difference between 2003 and 2002 depth-to-water measurements). The secondary variables cancel out, leaving only the difference in depth, which is numerically identical to the year's change in water level. In this statistical quality control study, the difference between 2003 and 2002 corrected depth measurements is abbreviated "'03-'02." If the water table is lower this year, the variable '03-'02 will be a positive number. There were 492 wells measured in the current program, but three of these were not measured in 2002, so there are a total of 489 wells having the variable '03-'02. This is six fewer than the number of measurements available last year.
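The first-difference idea can be illustrated with a minimal sketch. The function names, the ground elevation, and the depth values below are hypothetical and chosen only for illustration; they are not taken from the KGS data file. The sketch shows that the ground elevation cancels when two years are differenced, and that a positive '03-'02 corresponds to a lower water table.

```python
def annual_change(depth_2003, depth_2002):
    """Return '03-'02 (ft), or None if the well lacks one of the two measurements."""
    if depth_2003 is None or depth_2002 is None:
        return None
    return depth_2003 - depth_2002


def water_table_elevation(ground_elevation, depth_to_water):
    """Water table elevation (ft) is ground elevation minus depth to water."""
    return ground_elevation - depth_to_water


# Hypothetical well: the ground elevation is the same in both years.
ground_elevation = 3000.0
depth_2002, depth_2003 = 150.0, 152.5

change = annual_change(depth_2003, depth_2002)
elevation_change = (water_table_elevation(ground_elevation, depth_2003)
                    - water_table_elevation(ground_elevation, depth_2002))

# The elevation term cancels: the change in water table elevation is the
# negative of '03-'02, whatever ground elevation is assumed.
print(change, elevation_change)

# A well with no 2002 measurement has no '03-'02 value and drops out.
print(annual_change(88.0, None))
```

Because only the difference enters the analysis, an error in a well's recorded elevation or coordinates does not affect '03-'02, provided the same well was measured in both years.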

The objective in our quality control study is to identify and assess possible sources of unwanted variation in water level measurements made by the KGS. The purpose of the analysis is to provide guidance to the KGS field measurement program, to suggest ways in which field measurements might be improved, and to provide information necessary to identify past or current measurements that are suspect. The statistical quality control and field measurement programs have been intimately intertwined from the outset when the KGS assumed responsibility in 1997 for measuring observation wells formerly measured by the USGS. A comparison of results from 2003 with those from previous years shows that the desired improvements in the measurement program continue to be achieved through quality control.

Statistical Procedures

Preliminary examination detected three wells that deviated from last year's measurement by large amounts; reexamination disclosed that their values contained typographical errors, which were corrected. Three wells that were measured in this year's program were not measured in 2002, so for them the variable '03-'02 cannot be calculated. All repeated measurements are excluded from this analysis to avoid inflating the total variance. A total of 489 observations are included in the initial statistical analysis, which is an unbalanced analysis of variance (ANOVA) procedure designed to estimate the influence of different well characteristics and procedural differences on variable '03-'02. The following variables have been recorded for each well.

1. Depth to water
2. GPS longitude
3. GPS latitude
4. Date
5. Measurer's initials
6. Well Access
1 = good
0 = poor
7. Weighted Tape
1 = yes
0 = no
8. Oil on Water
1 = yes
0 = no
9. Chalk Cut Quality
2 = excellent
1 = good
0 = poor

In addition, the data file contains several variables that do not enter into the analyses. These include a unique USGS ID number and KGS ID designation, a surface elevation, a legal description of the well location, and a decimal latitude and longitude (obtained by LEO conversion of the legal description). There are other variables that are used for statistical analyses, taken from the historical records. These are Well Use, the purpose for which water from the well is used, and Aquifer Code, which describes the primary source of water in the well. The manner in which aquifer code values were assigned is summarized in Miller, Davis, and Olea (1997).

10. Well Use
H = household water supply
S = stock water supply
I = irrigation
U = unused observation
Z = animal disposal
11. Aquifer Code
KD = Cretaceous Dakota aquifer
KJ = undifferentiated Cretaceous/Jurassic aquifer
KN = Cretaceous Niobrara aquifer
QA = Quaternary alluvium aquifer
QAQU = Quaternary alluvium and undifferentiated aquifers
QAQUTO = Quaternary alluvium and undifferentiated aquifers and Tertiary Ogallala aquifer
QATO = Quaternary alluvium and Tertiary Ogallala aquifers
QU = Quaternary undifferentiated aquifer
QUTO = Quaternary undifferentiated and Tertiary Ogallala aquifers
QUTOKJ = Quaternary undifferentiated, Tertiary Ogallala, and Cretaceous/Jurassic aquifers
QUTOKD = Quaternary undifferentiated, Tertiary Ogallala, and Cretaceous Dakota aquifers
TO = Tertiary Ogallala aquifer
TOKD = Tertiary Ogallala and Cretaceous Dakota aquifers
TOKJ = Tertiary Ogallala and undifferentiated Cretaceous/Jurassic aquifers

Note that, as in 2002, the set of aquifer codes used in 2003 differs slightly from that used prior to 2002 because of changes in the areas where the KGS measures wells. In addition, the aquifer code QUKD is not used because the single well assigned this code was not measured in 2003. The initial statistical model includes all exogenous variables recorded during the quality control study that may contribute to the variability in the response, '03-'02, plus the variables Well Use and Aquifer Code. In contrast to the 2002 measurement program, the only exogenous variables to contribute significantly to the total variance are an operator effect, measured by the variable Measurer, and Chalk Cut Quality. As expected, there is a significant contribution to total variance from Aquifer Code; the contribution from Well Use is not significant in this model.

Analysis of Variance table for initial model

Source               DF   Sum of Squares   Mean Square   F Ratio   Prob>F
Model                29         604.4319       20.8425    2.4089   <0.0001
Measurer              6         111.3776       18.5629    2.1454   0.0472*
Well Access           1          19.7194       19.7194    2.2791   0.1318ns
Weighted Tape         1          18.9069       18.9069    2.1852   0.1400ns
Well Use              4          37.4985        9.3746    1.0835   0.3641ns
Oil on Water          1           2.7463        2.7463    0.3174   0.5735ns
Chalk Cut Quality     2         136.1233       68.0617    7.8662   0.0004**
Aquifer Code         13         260.6781       20.0522    2.1520   0.0088**
Error               454        3928.1827        8.6524
Total               483        4532.6147

RSquare 0.13
ns = Not significant; * = Significant; ** = Highly significant

A revised model was run that combined aquifers into classes similar to those used in 1997 through 2002. This 5-part classification distinguishes between (1) wells that tap alluvial aquifers, (2) wells that tap both alluvial aquifers and other unconsolidated aquifers, (3) wells drawing from the High Plains aquifer, (4) wells into bedrock aquifers, and (5) wells that draw from both bedrock and unconsolidated aquifers. This has the effect of reducing the degrees of freedom required for the model and thus increasing the sensitivity of the analysis for detecting other influences.

Analysis of Variance table for grouped aquifers

Source               DF   Sum of Squares   Mean Square   F Ratio   Prob>F
Model                19         419.2629       22.0665    2.4892   0.0005
Measurer              6         106.5518       17.7586    2.0032   0.0638ns
Well Access           1          21.4159       21.4159    2.4158   0.1208ns
Weighted Tape         1          27.1791       27.1791    3.0659   0.0806ns
Well Use              4          62.6359       15.6590    1.7664   0.1344ns
Oil on Water          1           4.3418        4.3418    0.4898   0.4844ns
Chalk Cut Quality     2         140.5487       70.2744    7.9279   0.0004**
Aquifer Group         4          75.5091       18.8773    2.1294   0.0762ns
Error               464        4113.3518        8.8650
Total               483        4532.6147

RSquare 0.09
ns = Not significant; * = Significant; ** = Highly significant
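The Model and Error rows of the two tables are linked by standard ANOVA arithmetic: each mean square is a sum of squares divided by its degrees of freedom, the F ratio is the model mean square divided by the error mean square, and RSquare is the model sum of squares divided by the total. The short sketch below recomputes these quantities from the tabulated values; the function name anova_summary is a hypothetical helper, not part of the software used for the report.

```python
def anova_summary(ss_model, df_model, ss_error, df_error):
    """Recompute mean squares, F ratio, and R-square from sums of squares and DF."""
    ms_model = ss_model / df_model
    ms_error = ss_error / df_error
    return {
        "mean_square_model": ms_model,
        "mean_square_error": ms_error,
        "f_ratio": ms_model / ms_error,
        # Total sum of squares is the model plus the error sum of squares.
        "r_square": ss_model / (ss_model + ss_error),
    }


# Model and Error rows copied from the two tables above.
initial = anova_summary(604.4319, 29, 3928.1827, 454)
grouped = anova_summary(419.2629, 19, 4113.3518, 464)

for name, summary in (("initial model", initial), ("grouped aquifers", grouped)):
    print(name)
    for key, value in summary.items():
        print(f"  {key}: {value:.4f}")
```

Grouping the aquifers moves 10 degrees of freedom from the model (29 to 19) to the error term (454 to 464), which is the gain in sensitivity described for the revised model.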

The surprising result is that none of the exogenous variables except Chalk Cut Quality are significant when aquifers are grouped. Most unexpected of all is that Aquifer Group itself is not a significant source of variation although Aquifer Code is a significant source of variation, indicating that there are significant differences in '03-'02 between individual aquifers that are obscured when the aquifers are combined into groups.

Unfortunately, past models are not directly comparable because there are different numbers of degrees of freedom assigned to some variables, and the response (annual change in water level) has significantly different variances from year to year. The pattern of alternating magnitude of variance in the response variable continues this year, which has a significantly higher variance than measurements made in 2002. Although the year-to-year changes in total variance are highly significant, the cause remains speculative (Davis, 2001) but may be due in part to the fact that the response variable is a first-order difference.

One way to improve the statistical results of the measurement program is to discard wells in which exogenous variables make unusually high contributions to the total variance, arguing that the readings from such wells are atypical and likely erroneous. Of four wells exhibiting extreme changes in water level in 2003, only one (25S 25W 32CDD 01) exhibits the alternating annual rise and fall that suggests poorly controlled measurements. The other wells exhibit a pattern of a continuing and even accelerating decline in water level (34S 35W 26ACC 01 and 30S 32W 22BBB 01), or a steady rise in water level (33S 37W 35ACD 01).

Importance of contributing variables

We can determine the relative contributions of each category of the contributing variables by examining the least-squares means (averages) of '03-'02 for a specified state of a variable, while holding all other variables at their average value. (In statistical terms, these averages are referred to as the expected values of the variables.) A positive value indicates the average depth to water in a well is greater in 2003 than in 2002; that is, the water level has declined from last year's measurement and its elevation is lower than it was previously. The following list gives the least-squares means for the complete data set.

Operator
Level             Original Least Sq Mean
BBW               3.7414
DRL               3.9141
JMA               3.1801
JMH               3.7991
NC                4.1541
NP*               4.8270
RDM               4.0622
*indicates new operator in 2003

Well Access
Level             Original Least Sq Mean
0                 3.5657
1                 4.3423

Weighted Tape
Level             Original Least Sq Mean
0                 4.4525
1                 3.4555

Well Use
Level             Original Least Sq Mean
H                 2.6239
I                 3.5699
S                 2.2427
U                 3.6929
Z                 7.6405

Oil on Water
Level             Original Least Sq Mean
0                 4.0866
1                 3.8214

Chalk Cut
Level             Original Least Sq Mean
0                 6.6799
1                 2.3597
2                 2.8224

Geologic Group
Level             Original Least Sq Mean
1 (Cretaceous)    4.1139
2 (Alluvium)      2.9334
3 (Al. + Tert.)   4.3011
4 (Tertiary)      4.5027
5 (Tert. + K)     3.9189

Summary of the Analyses of Variance

Year 2003 measurements show significant or highly significant variations attributable to Measurer and Chalk Cut, in addition to differences among the aquifers tapped by the wells. The standard deviation of variable '03-'02 is 3.86 ft, which is more than the standard deviations of variables '02-'01 (2.56 ft), '01-'00 (3.07 ft), and '00-'99 (2.69 ft), but less than that of variable '99-'98 (4.21 ft). The median decline in water level from 2002 to 2003 is 2.50 ft, more than double the decline from 2001 to 2002 (1.09 ft) and greater than the median decline from 2000 to 2001 (1.39 ft). This year's decline is much greater than earlier declines, including those from 1999 to 2000 (0.31 ft), 1998 to 1999 (0.72 ft), and 1997 to 1998 (0.41 ft).
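One common way to compare the spread of the response between years is the ratio of variances, obtained by squaring the quoted standard deviations. The sketch below is only an illustration of that arithmetic; the report does not state which test was used to judge the year-to-year differences, and a formal F test would also need the number of wells in each year, which is not assumed here. The function name variance_ratio is hypothetical.

```python
# Standard deviations of the annual change in water level (ft), as quoted above.
std_dev_ft = {
    "'03-'02": 3.86,
    "'02-'01": 2.56,
    "'01-'00": 3.07,
    "'00-'99": 2.69,
    "'99-'98": 4.21,
}


def variance_ratio(sd_a, sd_b):
    """Ratio of the larger variance to the smaller, from two standard deviations."""
    var_a, var_b = sd_a ** 2, sd_b ** 2
    return max(var_a, var_b) / min(var_a, var_b)


current = "'03-'02"
for label, sd in std_dev_ft.items():
    if label != current:
        ratio = variance_ratio(std_dev_ft[current], sd)
        print(f"{current} vs {label}: variance ratio = {ratio:.2f}")
```

The ratio against '02-'01 is the largest of the four, consistent with the alternating pattern of high and low variance noted earlier.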

The significant differences between measurers are mostly attributable to NP (who tended to produce deeper than expected measurements). However, when aquifers are grouped into classes, the increase in degrees of freedom available for error results in the measurements made by JMA (whose measurements tended to be shallower than expected) also becoming significantly different. The same change in degrees of freedom results in Well Use becoming a significant source of variance, attributable to stock water wells (S) being shallower than expected.

An unexpected consequence of combining individual aquifers into groups is that the differences between Aquifer Groups for '03-'02 are not significant, although the individual Geologic Units themselves are a significant contributor to the variance in '03-'02. Water levels measured in 2003 in exclusively Cretaceous aquifers (Group 1) show mean declines of over 4.1 ft from 2002, which is less than the mean decline of 5.3 ft between 2001 and 2002. The water level in the Ogallala aquifer (Group 4) shows a greater mean decline (4.5 ft) than in the previous year (over 3.5 ft). Measurements made in wells tapping alluvial aquifers (Group 2) show the smallest decline of 2.9 ft, but this was greater than last year's decline of 2.2 ft or the previous year's slight increase in average water level. Wells in alluvial plus other sources (Group 3) show a decline in mean water level of 4.3 ft. Water levels in wells tapping Cretaceous aquifers plus Quaternary and/or Tertiary aquifers (Group 5) tend to be 3.9 ft deeper on average this year. The only significant difference in the annual change in water level among Aquifer Groups is due to the difference between Group 2 and Group 4. A comparison with last year's measurements shows that the decline in water level is significantly greater in '03-'02 than in '02-'01 for all groups except for Group 1. (Statistics for 2003 can only be compared in detail with those from 2002 and 2001 because of the change in responsibility for wells in two counties that occurred after year 2000.)

The ANOVA equation can be used to create an expected value and residual (difference between observed and expected value) for each well. The distribution of residuals should be approximately normal. Examination of the residual outliers will reveal any well measurements which cannot be explained by extreme combinations of the different sources of variation. The residual plot, shown in Figure 1, is more peaked than normal and skewed to negative values. Outliers, or extreme values, are measurements which differ from their expected values by more than ±10 feet. Five wells have been identified by this process. These wells show changes in water level between 2002 and 2003 that are outside the range expected. These well measurements may be correct and reflect unusual changes in aquifer level; the wrong wells may have been measured in one year or the other; or changes in well construction or other factors may have altered the measurability of a well. The five wells, with their residuals, are:

Well ID              Residual, ft.
25S 25W 32CDD 01     -11.96
33S 37W 35ACD 01     -11.91
23S 33W 28CDC 01      10.10
30S 32W 22BBB 01      13.04
34S 35W 26ACC 01      13.14

A positive residual indicates that the 2003 water level is lower than predicted in a well with a declining water level, or is not as high as predicted in a well with an increasing water level. A negative residual indicates that the 2003 water level has declined less than predicted in a well with a declining water level, or has risen more than predicted in a well with a rising water table. None of these wells have unusual characteristics that make their current behavior suspect. Because only a few wells had questionable measurements, the decision was again made not to have a post-season remeasurement program in 2003.
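The outlier screen amounts to a threshold on the absolute residual, with the sign read as described above. A minimal sketch follows, using the five listed residuals plus one made-up well (labeled hypothetical) that falls inside the limit. The function names are illustrative and are not those of the software used for the report.

```python
OUTLIER_THRESHOLD_FT = 10.0  # wells deviating by more than 10 ft are flagged

residuals_ft = {
    "25S 25W 32CDD 01": -11.96,
    "33S 37W 35ACD 01": -11.91,
    "23S 33W 28CDC 01": 10.10,
    "30S 32W 22BBB 01": 13.04,
    "34S 35W 26ACC 01": 13.14,
    "HYPOTHETICAL WELL A": 1.25,  # made-up value, within the expected range
}


def flag_outliers(residuals, threshold=OUTLIER_THRESHOLD_FT):
    """Return the wells whose residual exceeds the threshold in absolute value."""
    return {well: r for well, r in residuals.items() if abs(r) > threshold}


def describe(residual):
    """Interpret the sign: positive means the 2003 water level is lower than predicted."""
    if residual > 0:
        return "lower than predicted"
    return "higher than predicted"


flagged = flag_outliers(residuals_ft)
for well, r in sorted(flagged.items(), key=lambda item: item[1]):
    print(f"{well}: {r:+.2f} ft, 2003 water level {describe(r)}")
```

Flagging a well this way only marks it for review; as noted above, a large residual may reflect a genuine change in the aquifer rather than a measurement error.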

Quality Assurance (remeasurement) Program

The year 2003 Quality Assurance program of random remeasurements resulted in QA data that contained no statistically significant sources of variation. Fifty randomly selected QA wells were remeasured by experienced personnel during the period when the regular field measurement program was underway. These were combined with data from the regular measurement program to yield 125 measurements for statistical quality control. The fact that the QA program did not detect any significant exogenous sources of variation is a testament to the Survey's quality control efforts, and indicates that previous years' training in measurement techniques, plus the continuing refinement of the well measurement network, have been successful. The variance among the QA replicates is essentially identical to the variance of the complete data set. However, the most extreme value of '03-'02 among the QA wells is only -10.8 ft, compared to an extreme of 16.4 ft in the complete data set.

Conclusions

The purpose of the Quality Control and Assurance Program is to identify wells and procedural conditions that may contribute significantly to the variance of Depth to Water measured in observation wells, and which do not reflect true changes in the water table elevation. Gathering Quality Control information requires little additional effort by the field crews, emphasizes the importance of procedural consistency, and certifies performance. Quality Control for the year 2003 field season, like the preceding two seasons, is remarkably free of inconsistencies compared to earlier field seasons. The results can be interpreted as demonstrating the value of training and the desirability of deleting troublesome wells from the monitoring program. Although this year the QA process did not identify any specific wells as troublesome, and did not flag any well locations which required verification before being permanently incorporated into the WIZARD data base, the importance of the Quality Control and Assurance Program remains unchanged. The continual improvement of data collected in the Water Well Measurement Program indicates the value of the program.

The Quality Control program has achieved its objectives of identifying and quantifying sources of unwanted variation in observation well data collection, and in flagging wells whose measurements require verification. It detected only five suspect values, confirming the benefits of "cleaning" the data base in past years. As the Quality Control process is routinely applied to KGS observation well measurements in the future, and particularly if it is applied to the entire Kansas observation well network, the quality of the groundwater measurement data will continue to be progressively improved with time.

Figure 1--Histogram of residuals from predicted change in water level '03-'02, as estimated by regression model. Curve is fitted normal distribution with same mean and variance as residuals. Wells whose change in water level deviates more than 10 feet from the predicted value are indicated.

histogram is well centered, skewed somewhat to the negative residuals

References

Davis, J.C., 2001, Statistical Quality Control For Year 2001 Water Well Measurements: Kansas Geological Survey Open-File Report No. 2001-2, 23 p.

Miller, R.D., J.C. Davis, and R.A. Olea, 1997, Acquisition Activity, Statistical Quality Control, and Spatial Quality Control for 1997 Annual Water Level Data Acquired by the Kansas Geological Survey: Kansas Geological Survey Open-File Report No. 97-33, 45 p.

Miller, R.D., J.C. Davis, and R.A. Olea, 1998, 1998 Annual Water Level Raw Data Report for Kansas: Kansas Geological Survey Open-File Report No. 98-7, 275 p., 6 plates, and 1 compact disk.



Kansas Geological Survey, Water Level CD-ROM
Updated Jan. 5, 2005
Available online at URL = http://www.kgs.ku.edu/Magellan/WaterLevels/CD/Reports/OFR03_8/rep00.htm