2000 Florida Ballots Project
Raw Data File

The Raw Data File may be downloaded here:

  • ASCII dataset and SAS statements (ZIP file)
  • SAS (SD2) dataset (ZIP file)
  • SPSS (SAV) dataset (ZIP file)
  • Frequencies (Text file)

    The Raw Data File contains one record for each ballot examined in the Florida Ballots Project. There are two types of ballots: undervotes and overvotes. An undervote is a ballot for which no presidential vote was recorded. An overvote is a ballot on which more than one presidential candidate was selected.

    Each undervote ballot was examined by three independent coders. Each chad on the ballot was coded, and the coders' evaluations appear on one ballot-level record in the raw data file. Variables with the suffix "C1" refer to the evaluations made by the first of the three coders. Data for coders 2 and 3 are recorded in the variables with suffixes "C2" and "C3," respectively. In addition to the coder evaluations of the ballot, each record in the raw data file contains the county name and FIPS code, precinct number, ballot system (Votomatic, Datavote, or Optical Scan), and other identifying information pertaining to that ballot. The unique identifier for each record is recorded in the variable BALNUM, which is a sequential integer ranging from 1 to 175,010.

    For three Florida counties (Nassau, Pasco, and Polk), overvotes were also examined by three coders. For the remaining counties, overvotes were examined by only one coder. For these remaining overvotes, the data relating to the single coder's evaluation are contained in the variables with the suffix "C1." The variables with the suffixes "C2" and "C3" were assigned the reserve code -8 to indicate that the ballot was examined by only one coder. (For more information concerning the meaning of individual codes in the raw data file, please refer to the raw data layout contained in NORCLAY.XLS.)

    The raw data file contains 175,010 records. Of these, 61,190 are undervotes and 113,820 are overvotes. In total, 138,037 ballots were from counties using Votomatic technology, 5,198 from counties using Datavote, and 31,775 from counties using Optical Scan technology. The complete breakdown of ballots by ballot type (undervotes/overvotes, Votomatic/Datavote/Optical Scan) follows:

    Total Records (Ballots): 175,010
    Total Undervotes: 61,190
    Total Overvotes: 113,820
    Total Votomatic: 138,037
    Total Datavote: 5,198
    Total Optical Scan: 31,775
    Votomatic Undervotes: 53,215
    Votomatic Overvotes: 84,822
    Datavote Undervotes: 771
    Datavote Overvotes: 4,427
    Optical Scan Undervotes: 7,204
    Optical Scan Overvotes: 24,571
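
    The breakdown above can be cross-checked with simple arithmetic. The following Python sketch sums the six ballot-system by ballot-type cells and compares them with the stated totals. The counts are copied from the list above; the names `cells` and `marginal` are introduced here for illustration only.

```python
# Cell counts copied from the breakdown above (ballot system x ballot type).
cells = {
    ("Votomatic", "under"): 53_215,
    ("Votomatic", "over"): 84_822,
    ("Datavote", "under"): 771,
    ("Datavote", "over"): 4_427,
    ("Optical Scan", "under"): 7_204,
    ("Optical Scan", "over"): 24_571,
}

def marginal(cells, position, value):
    """Sum every cell whose key matches `value` at `position` (0 = system, 1 = type)."""
    return sum(n for key, n in cells.items() if key[position] == value)

total = sum(cells.values())
by_system = {s: marginal(cells, 0, s) for s in ("Votomatic", "Datavote", "Optical Scan")}
by_type = {t: marginal(cells, 1, t) for t in ("under", "over")}

print(total, by_system, by_type)
```

    The same check can be run on counts tabulated from the downloaded file to confirm that the file was read completely.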

    Note #1: There are 30 undervote ballots that did not have 3 codings and 11 overvote ballots that should have had 3 codings (Nassau, Pasco, or Polk counties) but did not. For the 30 undervotes and 6 of the 11 overvotes, the first two sets of codings are in the data, and the third set of codings has been assigned the reserve code -8 (to indicate no data for that coder). For the remaining 5 of the 11 overvote cases, the first set of codings is in the data, and the second and third are assigned the reserve code -8 (to indicate no data for those coders).
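
    Because the reserve code -8 marks a missing coder, the number of codings available for a ballot can be derived from the data. The Python sketch below assumes each record has been read into a dictionary keyed by variable name. The variable stem "MARK" and the toy values are hypothetical; the real variable names and any other reserve codes are listed in NORCLAY.XLS.

```python
NO_CODER = -8  # reserve code described above: no data for this coder

def coders_with_data(record, stem):
    """Return the suffixes (C1, C2, C3) whose value for `stem` is not -8."""
    return [s for s in ("C1", "C2", "C3") if record[stem + s] != NO_CODER]

# Toy records using a hypothetical stem "MARK"; all values are made up.
three_coders = {"BALNUM": 1, "MARKC1": 0, "MARKC2": 0, "MARKC3": 1}
two_coders = {"BALNUM": 2, "MARKC1": 0, "MARKC2": 1, "MARKC3": -8}
one_coder = {"BALNUM": 3, "MARKC1": 1, "MARKC2": -8, "MARKC3": -8}

for rec in (three_coders, two_coders, one_coder):
    print(rec["BALNUM"], coders_with_data(rec, "MARK"))
```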

    Note #2: Analysts should be aware of the presence of an unusual coder (Coder ID 75683). In Baker County (FIPS = 3), this coder's work on 79 undervote ballots indicates misunderstanding of instructions, bias, or other problems. For documentary purposes, and because of the relatively small number of ballots examined by this coder, the data were left in the database. Analysts are advised to conduct analyses without this coder's data.
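
    One way to follow this advice is to ignore the flagged coder's slot on each affected ballot. The Python sketch below is only an illustration: the coder-ID field names (CODERID1, CODERID2, CODERID3) are hypothetical placeholders, since this document does not name them, and the other coder IDs in the toy record are made up. The actual field names are in NORCLAY.XLS.

```python
FLAGGED_CODER = 75683  # coder ID named in Note #2

# Hypothetical field names for the coder IDs; replace with the real ones.
CODER_ID_FIELDS = {"C1": "CODERID1", "C2": "CODERID2", "C3": "CODERID3"}

def usable_suffixes(record, flagged=FLAGGED_CODER):
    """Return the coder suffixes on this record that were not coded by `flagged`."""
    return [
        suffix
        for suffix, field in CODER_ID_FIELDS.items()
        if record.get(field) != flagged
    ]

# Toy record: the second coder is the flagged one; the other IDs are invented.
ballot = {"BALNUM": 10, "CODERID1": 11111, "CODERID2": 75683, "CODERID3": 22222}
print(usable_suffixes(ballot))
```

    Dropping only the affected coder slot keeps the other two codings for those ballots; dropping the 79 ballots entirely is the simpler alternative.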

    Ballots from absentee precincts are indicated with a 1 in the variable ABSENTEE (0 otherwise).

    Undervote and overvote ballots that have been contested are identified with a C (contested) in the variable PRECVERS (BLANK otherwise).
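
    The two flags above can be tested per record as in the following Python sketch. It assumes ABSENTEE is read as an integer and that a blank PRECVERS may arrive as an empty string or as spaces, depending on how the file was imported; the toy records are made up.

```python
def is_absentee(record):
    """True when the ballot comes from an absentee precinct (ABSENTEE = 1)."""
    return record["ABSENTEE"] == 1

def is_contested(record):
    """True when PRECVERS holds C; blank (empty or spaces) means not contested."""
    return str(record["PRECVERS"]).strip().upper() == "C"

# Toy records; values are illustrative only.
records = [
    {"BALNUM": 1, "ABSENTEE": 0, "PRECVERS": " "},
    {"BALNUM": 2, "ABSENTEE": 1, "PRECVERS": ""},
    {"BALNUM": 3, "ABSENTEE": 0, "PRECVERS": "C"},
]

absentee_ids = [r["BALNUM"] for r in records if is_absentee(r)]
contested_ids = [r["BALNUM"] for r in records if is_contested(r)]
print(absentee_ids, contested_ids)
```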