Appendix C. GES Technical Notes
Standard Errors
The national estimates produced from GES data may differ from the true values, because they are based on a probability sample of crashes and not a census of all crashes. The size of these differences may vary depending on which sample of crashes was selected. [For a complete description of the GES sampling design, see National Accident Sampling System General Estimates System Technical Note (DOT HS 807 796) available from NCSA.] The standard error of an estimate is a measure of the precision or reliability with which an estimate from this particular GES sample approximates the results of a census.
In a report of this size, it is impractical to provide standard errors for each estimate. Instead, generalized standard errors for estimates of totals are provided in Table C1, which lists GES estimates alongside one standard error for each. Generalized errors were calculated separately for crash, vehicle, and person characteristics. Adding and subtracting two standard errors yields a 95 percent confidence interval for a GES estimate in this report. For example, the estimated number of injury crashes that occurred in the month of February is given in Table 23 as 144,000. Since 144,000 does not appear in the Crash Estimate column of Table C1, use linear interpolation between the standard error values for 100,000 (8,000) and 200,000 (14,600): 8,000 + 0.44 × (14,600 − 8,000) = 10,904, so one standard error is approximately 10,900. The 95 percent confidence interval for this estimate is 144,000 ± 2 × 10,900, or 122,200 to 165,800.
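The interpolation and confidence interval calculation described above can be sketched in Python. This is a minimal illustration, not part of the GES methodology documents: the function and variable names are hypothetical, and only a few rows of the crash columns of Table C1 are copied in.

```python
from bisect import bisect_left

# A few (estimate, one standard error) pairs copied from the crash
# columns of Table C1. The names used here are illustrative only.
CRASH_SE_TABLE = [
    (90_000, 7_300),
    (100_000, 8_000),
    (200_000, 14_600),
    (300_000, 21_000),
]


def interpolate_se(estimate, table):
    """Linearly interpolate one standard error between the bracketing rows."""
    estimates = [row[0] for row in table]
    if not estimates[0] <= estimate <= estimates[-1]:
        raise ValueError("estimate lies outside the tabulated range")
    i = bisect_left(estimates, estimate)
    if estimates[i] == estimate:
        # The estimate is tabulated exactly; no interpolation is needed.
        return float(table[i][1])
    (x0, se0), (x1, se1) = table[i - 1], table[i]
    return se0 + (estimate - x0) * (se1 - se0) / (x1 - x0)


def confidence_interval(estimate, se, k=2):
    """Return (low, high) as the estimate plus or minus k standard errors."""
    return estimate - k * se, estimate + k * se


# February injury crashes from the worked example: 144,000.
se = round(interpolate_se(144_000, CRASH_SE_TABLE), -2)  # nearest hundred
low, high = confidence_interval(144_000, se)
print(se, low, high)
```

Rounding the interpolated value to the nearest hundred mirrors the precision of the tabulated standard errors; skipping that step would shift the interval endpoints by a few units.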
Table C1. 2004 GES Estimates and Standard Errors
| Crash Estimate | Crash Standard Error\* | Vehicle Estimate | Vehicle Standard Error\*\* | Person Estimate | Person Standard Error\*\*\* |
|---|---|---|---|---|---|
| 1,000 | 400 | 1,000 | 400 | 1,000 | 400 |
| 5,000 | 900 | 5,000 | 900 | 5,000 | 900 |
| 6,000 | 1,000 | 10,000 | 1,400 | 10,000 | 1,400 |
| 7,000 | 1,100 | 20,000 | 2,300 | 20,000 | 2,100 |
| 8,000 | 1,200 | 30,000 | 3,100 | 30,000 | 2,800 |
| 9,000 | 1,300 | 40,000 | 3,800 | 40,000 | 3,500 |
| 10,000 | 1,400 | 50,000 | 4,500 | 50,000 | 4,100 |
| 20,000 | 2,300 | 60,000 | 5,200 | 60,000 | 4,700 |
| 30,000 | 3,100 | 70,000 | 5,900 | 70,000 | 5,300 |
| 40,000 | 3,800 | 80,000 | 6,600 | 80,000 | 5,800 |
| 50,000 | 4,600 | 90,000 | 7,200 | 90,000 | 6,400 |
| 60,000 | 5,300 | 100,000 | 7,900 | 100,000 | 6,900 |
| 70,000 | 6,000 | 200,000 | 14,200 | 200,000 | 12,200 |
| 80,000 | 6,700 | 300,000 | 20,300 | 300,000 | 17,200 |
| 90,000 | 7,300 | 400,000 | 26,300 | 400,000 | 22,200 |
| 100,000 | 8,000 | 500,000 | 32,400 | 500,000 | 27,100 |
| 200,000 | 14,600 | 600,000 | 38,500 | 600,000 | 31,900 |
| 300,000 | 21,000 | 700,000 | 44,600 | 700,000 | 36,800 |
| 400,000 | 27,400 | 800,000 | 50,700 | 800,000 | 41,600 |
| 500,000 | 33,800 | 900,000 | 56,900 | 900,000 | 46,500 |
| 600,000 | 40,300 | 1,000,000 | 63,100 | 1,000,000 | 51,400 |
| 700,000 | 46,900 | 2,000,000 | 127,200 | 2,000,000 | 100,700 |
| 800,000 | 53,400 | 3,000,000 | 194,700 | 3,000,000 | 151,700 |
| 900,000 | 60,100 | 4,000,000 | 265,200 | 4,000,000 | 204,200 |
| 1,000,000 | 66,700 | 5,000,000 | 338,500 | 5,000,000 | 258,100 |
| 2,000,000 | 136,300 | 6,000,000 | 414,200 | 6,000,000 | 313,400 |
| 3,000,000 | 210,300 | 7,000,000 | 492,200 | 7,000,000 | 370,000 |
| 4,000,000 | 288,100 | 8,000,000 | 572,400 | 8,000,000 | 427,800 |
| 5,000,000 | 369,400 | 9,000,000 | 654,500 | 9,000,000 | 486,600 |
| 6,000,000 | 453,800 | 10,000,000 | 738,600 | 10,000,000 | 546,600 |
| 6,500,000 | 497,100 | 11,000,000 | 824,400 | 11,000,000 | 607,500 |
| 7,000,000 | 541,000 | 12,000,000 | 912,000 | 12,000,000 | 669,400 |

\* $SE = e^{a + b(\ln x)^2}$, where
\*\* $SE = e^{a + b(\ln x)^2}$, where
\*\*\* $SE = e^{a + b(\ln x)^2}$, where
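
The footnotes to Table C1 express the generalized standard error as a function of the estimate, SE = e^(a + b(ln x)^2). A minimal Python sketch of that formula follows. It assumes x is the estimate being evaluated; the coefficients a and b are not reproduced in this text, so they are left as arguments to be supplied from the GES documentation. The function name is hypothetical.

```python
import math


def generalized_se(x, a, b):
    """Evaluate SE = exp(a + b * (ln x)^2) for an estimate x > 0.

    a and b are the footnote coefficients, which differ between the
    crash, vehicle, and person columns. Their values are not given in
    this text, so the caller must supply them.
    """
    if x <= 0:
        raise ValueError("x must be positive")
    return math.exp(a + b * math.log(x) ** 2)
```

With the published coefficients this would give a standard error directly, without interpolating between table rows.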
Unknowns
GES data are obtained either directly from an item on the police accident report (PAR) or by interpreting the information provided in the report through reviewing the crash diagram, the officer's written summary of the crash, or combinations of variables on the PAR. Because of this interpretation, and because the police officer may not have entered some item of information or provided complete information, data can be missing. Two statistical procedures are used on GES data to fill in values for unknown data. These procedures, univariate and hotdeck imputation, are described in a technical report available from NCSA, Imputation in the General Estimates System (DOT HS 807 985). Table C2 below gives the proportion of unknown values prior to imputation for variables with imputed values that were used in this report.
Table C2. Percent of Unknowns for 2004 GES Data Elements
| Data Element | Percent Unknown | Data Element | Percent Unknown |
|---|---|---|---|
| Crash Level | | | |
| Alcohol Involved in Crash | 5.8% | Manner of Collision | 0.2% |
| Atmospheric Condition | 1.7% | Minute of Crash | 0.5% |
| Crash Severity | 3.1% | Relation to Junction | 0.4% |
| Day of Week | 0.0% | Relation to Roadway | 0.2% |
| First Harmful Event | 0.1% | Roadway Surface Condition | 1.5% |
| Hour of Crash | 0.5% | Speed Limit | 15.7% |
| Light Condition | 1.0% | Traffic Control Device | 4.5% |
| Vehicle/Driver Level | | | |
| Driver Drinking in Vehicle | 8.7% | Rollover Type | 0.5% |
| Initial Point of Impact | 1.7% | Vehicle Type | 1.6% |
| Most Harmful Event | 0.1% | | |
| Person Level | | | |
| Age | 8.2% | Seating Position | 1.0% |
| Injury Severity | 4.3% | Sex | 5.8% |
| Police-Reported Alcohol Involvement | 4.2% | | |