Conference on Optimal Coding of Open-Ended Survey Data
Abstracts and Presentations
Department of Communication, Stanford University
The Challenges of Transparency in Collecting, Coding, and Analyzing Open-ended Survey Data
Department of Political Science, University of Michigan
Knowing the Supreme Court? A Reconsideration of Public Ignorance of the High Court
James L. Gibson
Department of Political Science, Washington University in St. Louis
Conventional wisdom holds that the American people are woefully ignorant about law and courts. In light of this putative ignorance, scholars and other commentators have questioned whether the public should play a role in the judicial process — for example, whether public preferences should matter for Supreme Court confirmation processes.
Unfortunately, however, much of what we know — or think we know — about public knowledge of the U.S. Supreme Court is based upon flawed measures and procedures. So, for instance, the American National Election Study, a prominent source of the conclusion that people know little if anything about the U.S. Supreme Court, codes as incorrect the reply that William Rehnquist is (was) a justice on the U.S. Supreme Court; respondents, to be judged knowledgeable, must identify Rehnquist as the Chief Justice of the U.S. Supreme Court (which, of course, technically, he was not). More generally, the use of open-ended recall questions leads to a serious and substantial under-estimation of the extent to which ordinary people know about the nation’s highest court.
Our purpose in this paper is to revisit the question of how knowledgeable the American people are about the Supreme Court. Based on two national surveys — using more appropriate, closed-ended questions — we demonstrate that levels of knowledge about the Court and its justices are far higher than nearly all earlier surveys have documented. And, based on an experiment embedded in one of the national surveys, we also show the dramatic effect of question-form on estimates of levels of knowledge. Finally, we draw out the implications of political knowledge for the degree to which people support the U.S. Supreme Court. Our findings indicate that greater knowledge of the Court is associated with stronger loyalty toward the institution. We conclude by re-connecting these findings to “positivity theory,” which asserts that paying attention to courts not only provides citizens information, but also exposes them to powerful symbols of judicial legitimacy.
Problems with Open-ended ANES Questions: Measuring What Respondents Like and Dislike about Candidates and Political Parties
This talk addresses the coding of the ANES questions on likes and dislikes of candidates and parties.
I. Two types of coding: reflective and qualitative.
- A. Definitions.
II. How to set up a reflective code.
- A. Basic structure.
- Code should be built for only one question.
- Two major subsets of codes: one for positive responses, one for negative.
- Developing major subsections.
- B. Methods and criteria for writing explicit code categories.
- C. Example of revamped code developed for use with like/dislike of candidate question.
III. Qualifications for coders using this code.
- A. Education and knowledge level.
- B. Checking reliability.
- C. Possible use of computer content analysis.
IV. Qualitative or evaluative coding.
- A. How the “levels of conceptualization” coding was done by various scholars.
- B. An attempt to measure voter rationality by David RePass (if time permits).
V. Qualifications for coders doing qualitative coding.
- A. Education and knowledge level.
- B. Reliability.
Open-ended Questions in the NLSY
Ohio State University
This presentation provides an overview of the different types of open-ended questions that are asked in the National Longitudinal Survey of Youth (NLSY). Results of evaluations on prior coder studies are discussed as well as possibilities for future evaluations. The presentation concludes with thoughts on data confidentiality and whether open-ended data should be released in public-use data sets.
Open-ended Questions in the GSS
National Opinion Research Center
This presentation provides an overview of the General Social Survey (GSS) and discusses the types of open-ended data that this survey collects. The difficulties and complexities of asking and answering open-ended questions are outlined and possible error sources associated with coding open-ended data are discussed. The presentation concludes with remarks on possible opportunities for analysts to access open-ended data collected from the GSS in prior years.
Assessing inflation in variance estimates due to coder error using a coder reliability study
Division of Social Statistics, University of Southampton, UK
Because verbatim responses to open-ended questionnaire items must be converted to nominal categories on a coding frame, an additional source of error is introduced into the data collection process, relative to using a fixed set of pre-coded response alternatives. This coding error can be of two kinds, correlated and uncorrelated (Cochran 1977; Kalton and Stowell 1979; Campanelli et al. 1997). Correlated error arises when the tendency to apply the wrong code is systematically associated with one or more coders, uncorrelated error when the application of a wrong code is randomly distributed across coders. While both correlated and uncorrelated coder error serve to reduce the precision of survey estimates, conventional variance estimators only incorporate the effect of uncorrelated error. In this paper, I show how coder reliability studies can be used to estimate the effect of correlated coder error on the precision of population estimates. I illustrate this point with examples from a time diary study, in which respondents record their primary activities throughout the day, in their own words.
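A minimal sketch of the kind of calculation involved, assuming binary codes, a balanced double-coded reliability sample, and a hypothetical production workload per coder: a one-way ANOVA intraclass correlation across coders is estimated from the reliability study, and the implied variance inflation follows a design-effect formula of the interviewer-effect type.

```python
# Hypothetical sketch: estimate the coder intraclass correlation (rho) from a
# balanced reliability study, and the implied variance inflation when each
# production coder handles a workload of b cases.

def coder_rho(codes_by_coder):
    """One-way ANOVA intraclass correlation of codes across coders.

    codes_by_coder: dict mapping coder id -> list of 0/1 codes that coder
    applied to his or her share of the reliability sample (balanced shares).
    """
    groups = list(codes_by_coder.values())
    all_codes = [c for g in groups for c in g]
    n, k = len(all_codes), len(groups)
    grand = sum(all_codes) / n
    # Between-coder and within-coder sums of squares
    ss_between = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
    ss_within = sum(sum((c - sum(g) / len(g)) ** 2 for c in g) for g in groups)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    m = n / k  # cases per coder in the reliability study (balanced case)
    return (ms_between - ms_within) / (ms_between + (m - 1) * ms_within)

def design_effect(rho, workload):
    """Variance inflation from correlated coder error: 1 + rho * (b - 1)."""
    return 1 + rho * (workload - 1)
```

Even a small rho inflates variance substantially when coder workloads are large, which is the point of incorporating the correlated component that conventional estimators ignore.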
Coding Verbal Data – What to Optimize?
The Annenberg School for Communication, University of Pennsylvania, Philadelphia
Three experiences with content analysis may shed light on various ways to optimize the coding of linguistic interview data into quantifiable terms.
First, defining recording or coding units. This is not too problematic in this context but has unusual implications for question-centered statistics.
Second, the choice of a code. The standard conception of a code is a many-to-one mapping from transcripts of verbal exchanges into quantifiable terms. Such codes are typical for much of content analysis and are built into structured interview situations in which answers are selected from a list. I suppose their inadequacy is the primary reason for asking questions with open-ended answers. Conceiving codes as many-to-one-set mappings proves somewhat closer to respondents’ conceptions, while conceiving them as many-to-many interpretative schemes might be semantically more valid still, though it makes the data more difficult to analyze.
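The three conceptions of a code can be made concrete as data structures (the answer text and category names here are hypothetical, purely for illustration):

```python
# Three conceptions of a code, sketched as mappings from verbatim answers
# to quantifiable terms (hypothetical categories).

# Many-to-one: each answer receives exactly one category.
many_to_one = {"lower taxes": "economy"}

# Many-to-one-set: each answer receives a set of categories.
many_to_one_set = {"lower taxes": {"economy", "policy"}}

# Many-to-many interpretative scheme: each answer maps to several
# alternative readings, here weighted by plausibility.
many_to_many = {"lower taxes": [({"economy"}, 0.7), ({"ideology"}, 0.3)]}
```

The progression makes the trade-off visible: each richer structure preserves more of the respondent's meaning while complicating downstream tabulation.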
Third, optimizing a code. I want to discuss five criteria:
- The reliability or reproducibility of the coding process (not an issue in computer applications);
- The relevance of the quantifiable terms to the research question;
- The semantic validity of the code, the degree to which the quantifiable terms represent what respondents had in mind when speaking;
- The generalizability of the code across diverse interviewing situations, an opportunity that most content analyses have not taken up; and
- The efficiency: the cost and time required to develop a code and generate relevant data.
Open-ended questions and text analysis
Department of Sociology, University of Groningen
Assume one has open-ended questions in a survey and seriously wants to analyze the answers to these questions. Text analysis might then be applied. This talk discusses a number of choices to be made when a thematic text analysis is applied. It starts with requirements on the open-ended questions themselves and sketches choices in the development of a category system; here the coding comes immediately into view. Coding can be performed from an instrumental or a representational perspective. In the first, coding is performed from the point of view of the investigator and can be carried out in a run of a computer program. In the second, the point of view of the respondent is acknowledged; the computer can then be used as a management tool, but the coding itself must be performed by a human coder. The choice between these methods depends on what the investigator is looking for and has consequences for how to proceed. When representational coding is applied, questions about intercoder reliability must also be posed.
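Instrumental coding of the kind that "can be performed in a run of a computer program" often amounts to dictionary matching. A minimal sketch, with an entirely hypothetical category system and keyword lists:

```python
# Minimal sketch of instrumental (dictionary-driven) thematic coding.
# The categories and keywords are hypothetical illustrations only.
import re

CATEGORIES = {
    "economy": {"tax", "taxes", "inflation", "jobs", "unemployment"},
    "health": {"hospital", "doctor", "insurance", "medicare"},
}

def code_answer(text):
    """Return the set of thematic categories whose keywords occur in the answer."""
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    return {cat for cat, keywords in CATEGORIES.items() if tokens & keywords}
```

Representational coding cannot be reduced to such a lookup, which is why the talk assigns it to a human coder with the computer as a management tool.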
Matters of Fact, Opinion, and Credibility: Distinguishing Stochastic from Substantive Information in Texts
Departments of Statistics and Sociology, Iowa State University
Testing respondents’ knowledge of the facts (“George Bush’s job in the U.S. government”) is fundamentally different from seeking their opinions (“George Bush’s job performance in the U.S. government”). In the former case the researcher is an expert, who judges the respondent. In the latter case the researcher is a novice, who is open to the respondent’s expertise on her or his own subjectivity. Decisions on who is novice vs. expert are less apparent when researchers begin encoding semantic relations among the words respondents provide. A concept’s “objective correctness” (i.e., its correspondence with a generally-accepted empirical referent) will likely be easier to evaluate than a conceptual relation’s “grammatical correctness,” since the latter depends on whose perspective is used for deriving one’s rules of grammaticality. Are researchers to be expert interpreters (diagnosticians?) of respondents’ words, or more humble inquirers into the perspectives their respondents use in formulating their words? I shall close by providing a brief illustration of this latter approach to text analysis.
Relations in Texts (and How to Get at Them)
Department of Sociology, Emory University
Available commercial textual-analysis packages (Atlas.ti, MAXQDA, NVivo) allow traditional kinds of content analysis along the lines of thematic analysis. They are excellent tools for the classification (and extraction) of themes in seriatim fashion, one after the other. Texts, however, convey meaning through the contextual inter-relation of words, expressions, concepts, and sentences. Furthermore, meaning is conveyed, through amplification and deletion, across different texts (inter-textuality). Dealing with these “relational” issues is very problematic in QDAS software. In this presentation, I will argue for a relational approach to textual meaning. Focusing on narrative as a specific text genre, I will show how Quantitative Narrative Analysis (QNA) provides a way to approach text coding in a relational framework. QNA approaches narrative texts using a “story grammar,” i.e., the simple Subject-Verb-Object (SVO) structure, where in narrative S and O are typically actors and V an action, each with a set of characteristics (in particular the Time and Space of action, the fundamental categories of narrative).
Coding Responses Generated by Open-Ended Questions: Meaning Matching or Meaning Inference?
Department of Communication, University of California-Santa Barbara
Can coders of responses generated by open-ended survey questions consistently capture the meaning provided by respondents to those surveys? This is the question of reliability and validity. Researchers who use content analysis typically regard coder inference as a threat to validity: when coders are allowed to make inferences instead of mechanically following a series of coding rules, they will introduce a wide range of variation in meaning, preventing researchers from making strong claims for reliability and validity. But is that always the case? In this presentation I argue that coder inference is not only allowable in certain situations, it is required and highly desirable. My argument begins with an examination of the standards we use to make our evaluative judgments of reliability and validity. Then I challenge some of the assumptions we make when designing coding rules, when training coders, and when assessing reliability and validity.
Classifying Open Occupation Descriptions in the Current Population Survey
Frederick G. Conrad
(Joint work with Mick Couper)
Program in Survey Methodology, University of Michigan
Joint Program in Survey Methodology, University of Maryland
An overlooked source of survey measurement error is the misclassification of open-ended responses. We report on efforts to understand the misclassification of occupation descriptions in the Current Population Survey (CPS). We first analyzed a set of occupation descriptions (n=32,362) reported by CPS respondents and entered by interviewers; each description was classified by two independent coders. One factor that was strongly related to reliability was the length of the occupation description: contrary to our intuitions, longer descriptions were less reliably coded than shorter ones. We followed this with an experimental study in which we constructed occupation descriptions (n=800) to vary on key features (e.g., specific terms that had led to low or high reliability in study one); these descriptions were again double-coded. Finally, we asked coders to classify 100 experimental descriptions and report on their thinking while doing so. One practice that was evident in coders’ verbal reports is that they use informal rules that may improve reliability but have no bearing on the accuracy of coding. Based on these data, we propose possible interventions to improve intercoder agreement.
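The length/reliability pattern can be checked with a simple tabulation of percent agreement between two coders, split at a word-count cutoff; the data, codes, and cutoff below are hypothetical.

```python
# Hypothetical check of the length/reliability pattern: percent agreement
# between two independent coders, split into short vs long descriptions.

def agreement_by_length(descriptions, codes_1, codes_2, cutoff_words=3):
    """Return (agreement among short descriptions, agreement among long ones)."""
    short = [a == b for d, a, b in zip(descriptions, codes_1, codes_2)
             if len(d.split()) <= cutoff_words]
    long_ = [a == b for d, a, b in zip(descriptions, codes_1, codes_2)
             if len(d.split()) > cutoff_words]
    return sum(short) / len(short), sum(long_) / len(long_)
```

In the CPS analysis described above, this kind of comparison ran against intuition: the longer, more detailed descriptions were the less reliably coded ones.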
CATA (Computer Aided Text Analysis) Options for the Coding of Open-Ended Survey Data
School of Communication, Cleveland State University
This presentation discusses how Computer Aided Text Analysis (CATA) techniques can be used for the coding of open-ended data by survey researchers. CATA is a quantitative content analysis method involving the automated “machine” coding of textual messages. It has several advantages over traditional “human” coding and these will be reviewed along with weaknesses of the method. The presentation will then shift to a step-by-step walkthrough of how CATA is conducted that includes advice on dictionary construction and other practical considerations. To aid in this, several CATA programs will be demonstrated using sample open-ended survey data. These programs (favorites of the presenter) include Diction 5.0, Wordstat 5.1, and CATPAC (an emergent coding program that does not require the use of a dictionary). The findings from these sample analyses will be used to demonstrate the types of information and knowledge CATA can generate for survey researchers. The presentation will conclude with a brief review of CATA programs with an eye toward emerging and future software, such as Language Logic’s Verbatim Coding System and programs for the automated coding of audiovisual data.
Computer coding of 1992 ANES Like/Dislike and MIP responses
David P. Fan
Department of Genetics, Cell Biology and Development, University of Minnesota
Open-ended questions can both surface unusual responses and assess the salience of ideas in a respondent population. This paper discusses results of surveys in an open-ended format called the essay survey. The surveys consisted of a single, very general, open-ended question beginning with “Please write your thoughts about cigarette smoking” and continuing with some very general guidelines offering minimal suggestions about what the thoughts should be. The survey was administered to both university undergraduates and a national Internet sample from Survey Sampling International. Salience was inferred from the frequency of scored ideas in the respondent text and was found to differ by no more than 3 percent between the two populations for 85 percent of the 26 scored ideas, based on about 1,000 sentences from each of the two populations. The method had high sensitivity for both rare and common ideas, which were present in 0.02 to 19 percent of the respondent sentences.
The responses were scored by computer using the InfoTrend software (U.S. Patent 5,371,673 and patents pending). This software allows the user to specify key words and phrases as well as how words in sentences or paragraphs can be combined to give complex meaning. For example, the user can specify that “not” to the left of “support” to the left of a political candidate name can lead to an unfavorable score for that candidate. The user writes scoring algorithms in ASCII text so that any subsequent user will have an unambiguous understanding of the meaning of the scores. Repeated benchmarking shows that the accuracy of the machine scoring is the same as that among human coders for complex ideas. The software is accessed via the Internet through a password protected site.
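The word-order rule described above ("not" to the left of "support" to the left of a candidate name) can be sketched as follows. This is not the InfoTrend syntax, just a hedged illustration of the kind of ordered-keyword logic such a rule encodes.

```python
# Sketch of an ordered-keyword scoring rule (not InfoTrend's own syntax):
# "not" before "support" before the candidate name -> unfavorable (-1);
# "support" before the candidate name without "not" -> favorable (+1).

def score_sentence(sentence, candidate):
    words = sentence.lower().split()
    def pos(w):
        return words.index(w) if w in words else -1
    i_not, i_sup, i_cand = pos("not"), pos("support"), pos(candidate.lower())
    if 0 <= i_not < i_sup < i_cand:
        return -1   # unfavorable mention of the candidate
    if 0 <= i_sup < i_cand:
        return +1   # favorable mention
    return 0        # sentence not scored
```

Writing such rules in plain text, as the abstract notes, is what gives later users an unambiguous record of what each score means.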
Machines that Learn how to Code Open-Ended Survey Data: Underlying Principles, Experimental Data, and Methodological Issues
Istituto di Scienza e Tecnologie dell’Informazione, Consiglio Nazionale delle Ricerche
In the last six years I have led projects aimed at developing software that learns how to code open-ended survey data from data manually coded by humans. These projects have led to the development of software now in operation at the Customer Satisfaction division of a large international banking group, and now integrated into a major software platform for the management of open-ended survey data. This software, which can code data at a rate of tens of thousands of responses per hour, is the result of contributions from different fields of computer science, including Information Retrieval, Machine Learning, Computational Linguistics, and Opinion Mining. In this talk I will discuss the basic philosophy underlying this software, I will present the results of experiments we have run on several datasets of respondent data in which we compare the accuracy of the software against the accuracy of human coders, and I will argue for a notion of “accuracy” defined in terms of inter-coder agreement rates. Finally, I will discuss the kind of characteristics that make a survey more or less amenable to automated coding by means of our system.
CAQDAS, Secondary Analysis and the Coding of Survey Data
Department of Sociology, University of Surrey, UK
Following a brief review of the problematic nature of open-ended questions in survey analysis, the paper characterizes the nature of coding and data analysis in qualitative research. While a common thread between survey and qualitative research is that data analysis is fundamentally a process of data reduction, the variety of approaches to qualitative data analysis obliges those who pursue a systematic practice of qualitative research to provide an ‘audit trail’ of analytic decisions. This customarily involves field diaries, coding commentaries and analytic memos, but the emergence of qualitative software (generically, ‘CAQDAS’) has brought digital resources to the task. The paper will profile computational affordances at the stages of data entry and coding, data analysis (highlighting Boolean retrieval strategies), and validation (highlighting system closure, Artificial Intelligence routines, and data integration). The role of developments in data archiving and secondary analysis in the drive for formal, systematic analysis will be outlined, and the conclusion will consider the disambiguation issue in content analysis and the possible susceptibility of open coding problems to Boolean and/or Artificial Intelligence solutions.
The Application of Concept Mapping to Text Analysis: Examples and Validity Issues
Merage School of Business, University of California-Irvine
Concept mapping as applied to text analysis is a hybrid methodology that contains elements of both content coding and mapping analysis techniques. It is based on a sort methodology and uses an iterative combination of statistical and human analysis to create a map, or visual representation, of structure in a dataset. The methodology is particularly useful for addressing inductive research questions, for comparing categorizations and ratings of different stakeholder groups, for contrasting naïve versus expert coding schemes, and for developing survey item content or interview protocols. This talk will discuss these applications as well as the reliability and validity issues to consider when using this methodology.
Methods for Assessing the Reliability of Coding
Intercoder reliability (more specifically intercoder agreement) is “near the heart of content analysis; if the coding is not reliable, the analysis cannot be trusted” (Singletary, 1993). But there are few standards or guidelines available concerning how to properly calculate and report the extent to which independent judges make the same coding decisions, and although a handful of tools are available to implement the sometimes complex formulae required, information about them is often difficult to find and they are often difficult to use. A 2001 study (Lombard, Snyder-Duch, & Bracken) of reports of content analyses in mass communication found low rates of reporting of intercoder reliability, as well as inconsistencies and the use of inappropriate indices among the articles that did report it. Following a brief introduction to the issues researchers planning a content analysis study face regarding reliability, the presentation will review challenges in improving this key aspect of the method, including communicating its importance; establishing and publicizing expert agreement, where possible, regarding appropriate procedures; assessing closed- and open-ended responses and variables at different levels of measurement; providing appropriate software tools; and more. The presentation will conclude with discussion of a proposal for a broader definition of reliability related to standardizing coding across studies and its potential benefits.
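For concreteness, one of the most common chance-corrected agreement indices, Cohen's kappa for two coders assigning nominal categories to the same units, can be computed in a few lines (a minimal sketch; the example codes are hypothetical):

```python
# Minimal sketch of Cohen's kappa for two coders and nominal categories:
# kappa = (observed agreement - chance agreement) / (1 - chance agreement).
from collections import Counter

def cohens_kappa(codes_a, codes_b):
    assert len(codes_a) == len(codes_b)
    n = len(codes_a)
    # Observed agreement: share of units on which the coders agree
    p_obs = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    # Chance agreement: from each coder's marginal category distribution
    dist_a, dist_b = Counter(codes_a), Counter(codes_b)
    cats = set(dist_a) | set(dist_b)
    p_exp = sum((dist_a[c] / n) * (dist_b[c] / n) for c in cats)
    return (p_obs - p_exp) / (1 - p_exp)
```

The formula illustrates why raw percent agreement is an inappropriate index: two coders who agree half the time on two equally common categories achieve a kappa of zero, exactly what guessing would produce.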