Module 4 Chapter 1: Working With Qualitative Data

Dr. Audrey Begun

Prior modules introduced qualitative approaches and methods for study design and data collection. You learned that exploratory and descriptive research questions are often addressed using qualitative methodologies: naturalistic observation, interview, focus group, social network, GIS, or open-ended survey data, to name a few. Qualitative studies do not necessarily test hypotheses about the data, although they can test hypotheses generated by prior theory (Glaser & Strauss, 1967). Qualitative studies often use data to develop an understanding of social work problems, social phenomena, or diverse populations. The focus of this chapter is on what investigators do with the collected qualitative data to begin answering their research questions.

In this chapter you will learn about:

moving from qualitative data collection to data preparation;
coding qualitative data.

Qualitative Data Preparation

Investigators engaged in qualitative research are careful to collect data in a manner that preserves, to the closest possible degree, the specific wording and context of what their study participants share. Technology tools can assist in capturing participants’ statements verbatim. For example, digital audio or video recording is possible with small, portable recorders, cell phones, or dictation devices. Digital recordings are preferable to recordings based on “tapes” because digital files are more durable and less vulnerable to accidental destruction. Furthermore, digital software exists to help transcribe these recordings into written text (more about this later). The entire process of data collection and preparation (recording and transcription) needs to be approved by the institutional review board (IRB) for human subject participation and consented to by the study participants.

However, seasoned investigators do not rely on technology alone. There are many painful stories of data lost through technology failures: recordings that never captured the narrative because microphones or cameras were not sufficiently sensitive or were improperly placed, batteries that ran out, damaged recording devices, accidentally erased recordings, and other epic failures. Audio and video recordings are usually supplemented with field notes, which are written (or typed) either by the person conducting the observation, interview, or focus group, or by a collaborating observer/recorder. It is difficult to effectively play both roles, interviewer and recorder, and maintain strong interview rapport with participants; engaging a collaborator as recorder/note-taker is well-advised.

There is another important reason for these field notes: they contribute rich descriptive detail about the context of statements made, supplementing the recorded and transcribed participant statements and infusing the record with greater meaning. For example, field notes can clarify who was speaking when recorded voices sound similar. The notes can also describe changes in body language, long pauses, facial expressions, making or losing eye contact, or other events that can help investigators interpret meaning from the context of what is said.

To demonstrate, consider an early study of children’s emerging sibling relationships (Nadelman & Begun, 1982). The investigators engaged firstborn preschool-aged children (2½ to 5 years old) from families expecting their second baby in projective doll play, a structured format for observational data. Each of the children, in his or her own home, was introduced to the standardized, portable doll house and doll family (parents, child, and baby matching the child’s own race and family composition). What the child said was audio-recorded throughout the doll play session. The investigator also kept field notes describing each child’s behavior with the dolls, with comments made every 30 seconds. These field notes were particularly helpful in understanding child utterances, since the children’s language skills were emerging. The notes also explained long silences that occurred when children, upon discovering the toy toilet, ran off to the potty themselves; potty training was an active part of the children’s reality at this age, and a reminder to go was often triggered by play with the toy toilet. Field notes also helped when the doll play sessions were repeated after the birth of the younger sibling. The children often ran to the baby when playing with the doll baby, and the investigators had notes characterizing a child’s touching, eye contact, and other interactions with the real baby brother or sister. Combining these different types of qualitative data allowed investigators to develop a rich description of emerging sibling relationships that extended far beyond the preexisting ‘sibling rivalry’ paradigm.

Qualitative Data Transcription. Investigators may choose to work with observational data in real time—coding observed behaviors as they occur. For the sake of reliable and verifiable data, however, they often choose to work from recordings. One challenge with recordings is the necessity to replay them, over and over, to analyze the data. Instead, investigators often choose to work from transcripts of the recorded interview, focus group, or observation sessions; reading the transcript is often faster than repeatedly rewinding and replaying content for coding purposes. As previously noted, however, some of the rich context may be lost in translation from audio or video recording to written transcription.

Data transcription is time consuming. Transcribing a one-hour interview could require four or more hours of transcription time, assuming the recording is clear and easy to interpret and there is only one person speaking. It could require upwards of ten hours to transcribe a one-hour group interview (a family with multiple members or a focus group, for example). Digital transcription software can help (e.g., Dragon Speak®) but often introduces inaccuracies, necessitating a great deal of time re-checking the transcription to be certain of its accuracy. These software packages MAY have contractual agreements that violate research standards for confidentiality and data security; reading the fine print before clicking on “ACCEPT” is critical, especially for “free” transcription software packages. Investigators need to weigh the cost of their own time, and the time and experience of study team members, against the cost of professional transcription (e.g., rates ranged from $1 to $5 per minute across several websites visited in August 2018). Furthermore, transcription services need to be approved by the institutional review board (IRB) when a study involving human participants is proposed, and participant consent is required as well.
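
To see how quickly these figures add up, the arithmetic can be sketched in a few lines of Python. This is only an illustration: the function name, the study size (12 individual interviews and 3 focus groups, each one hour long), and the choice to treat the four-hour and ten-hour figures as fixed multipliers are assumptions made for the example; the $1 to $5 per minute range comes from the text above.

```python
def estimate_transcription(recording_minutes, hours_per_recorded_hour,
                           rate_per_minute_range=(1.0, 5.0)):
    """Return (in-house transcription hours, (low, high) professional cost in dollars)."""
    in_house_hours = recording_minutes / 60 * hours_per_recorded_hour
    low_rate, high_rate = rate_per_minute_range
    professional_cost = (recording_minutes * low_rate, recording_minutes * high_rate)
    return in_house_hours, professional_cost


# Hypothetical study: 12 one-hour individual interviews and 3 one-hour focus groups.
# Assumed multipliers: 4 hours per recorded hour (one speaker), 10 hours (group).
interview_hours, interview_cost = estimate_transcription(12 * 60, hours_per_recorded_hour=4)
group_hours, group_cost = estimate_transcription(3 * 60, hours_per_recorded_hour=10)

print(interview_hours, interview_cost)  # 48.0 (720.0, 3600.0)
print(group_hours, group_cost)          # 30.0 (180.0, 900.0)
```

Under these assumptions, in-house transcription would take about 78 hours of team time, while professional transcription of the same 15 hours of recordings would cost somewhere between $900 and $4,500.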

Stop and Think
Try this out for yourself: watch a brief YouTube video, such as “Charlie Bit My Finger - Again.” While listening to the video but not watching the screen, write down everything you hear the children vocalize. Now replay the video, listening while reviewing your transcription.

How accurately did you record the children’s vocalizations?

Replay the video again, this time both watching and listening. Compare your audio transcription to what you see and hear, with what you see providing context for what you hear.

How accurately did your audio transcription reflect what happened—the children’s vocalizations and the context?
How much meaning do you think you lost without tone of voice and body language/gestures?
What do you think would improve the quality of the data you transcribed, moving from the video to the written word?

Qualitative Data Coding

The process of coding qualitative data is systematic and should be replicable, that being one hallmark of empirical evidence. This does not mean that different investigators would draw the same conclusions about the data; it means that others would be able to repeat the process, following the same systematic steps with the same raw data. This represents a major distinction between qualitative and quantitative research: in qualitative studies, the investigator is recognized and accepted as part of the data interpretation process; in quantitative studies, investigator impact is minimized to enhance internal validity. Either way, however, the procedures used are clearly identified and replicable, even if the results in qualitative coding differ with different investigators.

Qualitative data coding and analysis is about grouping terms, concepts, ideas, images, or other elements together into themes or categories. The themes and categories provide a means of organizing participants’ data in meaningful ways. Qualitative analysis:

“aims to make sense of and give meaning to the data collected. In general, the process of qualitative data analysis involves the identification and organization of themes or patterns from the words, text, and narratives obtained in the data collection” (Corcoran & Secret, 2013, p. 166).

Coding themes, patterns, or categories derive from one of two sources, depending on the qualitative study approach adopted and the nature of the qualitative data collected. Contents may be analyzed based on predetermined categories (a priori coding) or the categories may emerge from the contents being analyzed. Predetermined categories come from hypotheses based on theory or literature. Allowing categories to emerge from the data is a process referred to as open coding, which involves creating the categories or groupings, confirming that the contents of each have points of similarity or overlap, and providing the categories or groupings with meaningful labels or names.

Tools for helping interpret qualitative data may be as simple as colored highlighters applied to a printed transcription, statements copied onto index cards which can be shuffled and re-arranged, or digital highlighting applied to a Word® or .pdf document. A number of sophisticated analysis-assistance software tools also exist: four often reported in the literature are Atlas.ti®, NVivo®, MAXQDA®, and Dedoose®. Several qualitative analysis assistance software packages are free to download, at least in their “lite” form (e.g., Provalis QDA Miner®).

When selecting qualitative data analysis programs, investigators need to consider several factors:

These programs do not DO the analysis; they support the investigator doing analyses. Remember, the investigator is the “tool” for determining themes and codes.
Some programs only assist with text data, others assist with analysis of images and other forms of data, as well (e.g., Dedoose®).
Cloud-based programs and some free packages may have practices that violate data confidentiality or security requirements established by an institutional review board (IRB).

Coding and Coding Confirmation

As previously noted, an investigator may approach qualitative data with a pre-established list of categories that are applied to the data; this is deductive or a priori coding. The investigator applies the predetermined coding categories to the data in terms of whether each theme appears or, possibly, how frequently the theme appears (depending on the research question). For example, investigators were interested in exploring the nature and extent of personal information shared by adolescents on the MySpace social networking site (Hinduja & Patchin, 2008). The study addressed a widespread concern that individuals were providing information that left them vulnerable to sexual predators. The team began with an a priori coding scheme for the type of information youth publicly post: they conducted a content analysis with a randomly sampled set of 9,282 MySpace profile pages. The coding included first name, full name, birth date, telephone number, address or city/state of residence, school attended, email address, instant messaging screen name, pictures, pictures in swimsuit or underwear, and evidence of alcohol/other substance use. In terms of identification, over 38% of pages sampled provided the adolescent’s first name, almost 9% provided a full name, over 81% provided a city of residence, and almost 28% referenced their school. Furthermore, almost 57% of sampled profiles included a photograph, and over 15% had a friend’s photograph in a swimsuit or underwear, while slightly over 5% had a photograph of themselves in swimsuit/underwear; over 18% presented evidence of alcohol use, 7.5% tobacco use, and 1.7% marijuana use. About 40% of youth set access to “private,” meaning that the other 60% of profiles were potentially viewable by anyone. While single pieces of information alone might not lead to identifying an individual, in combination (especially with pictures included) the information shared makes identification and/or personal contact possible.

Since this research was completed, public education about safer Internet use may have had an impact on individuals’ posting patterns, and organizations have introduced security measures that may reduce vulnerability. However, concern remains that individuals, including adolescents, are vulnerable to predators or exploitation based on what is shared across public domains of the Internet.
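
The counting step in a priori coding, tallying how often each predetermined category appears, can be illustrated with a minimal Python sketch. Everything here is hypothetical: the four "profiles" are invented for the example, the category list is only a subset of the categories named above, and the function name is an assumption. It is not the procedure or data used by Hinduja and Patchin (2008).

```python
from collections import Counter

# Predetermined (a priori) categories: a subset of those named in the text.
CATEGORIES = ["first_name", "full_name", "city", "school", "photo"]

# Hypothetical coded profiles: each set holds the categories a coder marked as present.
coded_profiles = [
    {"first_name", "city", "photo"},
    {"city"},
    {"first_name", "full_name", "city", "school", "photo"},
    {"photo"},
]


def category_percentages(profiles, categories):
    """Percentage of profiles in which each predetermined category was coded as present."""
    counts = Counter(
        code for profile in profiles for code in profile if code in categories
    )
    total = len(profiles)
    return {category: 100 * counts[category] / total for category in categories}


percentages = category_percentages(coded_profiles, CATEGORIES)
for category, pct in percentages.items():
    print(f"{category}: {pct:.0f}%")
# first_name: 50%
# full_name: 25%
# city: 75%
# school: 25%
# photo: 75%
```

Notice that the categories are fixed before any profile is examined; the software only counts. Deciding whether a given profile actually shows, for example, "evidence of alcohol use" remains the coder's judgment.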

Open Coding. In contrast, when investigators approach qualitative data without predetermined or preconceived ideas about categories and themes present in participants’ responses, they engage in open coding (an inductive process). This is the foundation of the qualitative approach called grounded theory. Open coding was defined by Rubin and Babbie (2013) as:

“A qualitative data processing method in which, instead of starting out with a list of code categories derived from theory, one develops code categories through close examination of qualitative data” (p. 337).

Accompanying the investigators’ description of the resulting coding categories is a set of memos, notes, or journaling entries that depict what the investigator was thinking in making these coding decisions. This information helps guide others in how the results emerged to provide transparency and replicability. While qualitative research is science and results in empirical evidence, performing it well can be compared to the art of dance (Engel & Schutt, 2018). This is because it requires the investigator to maintain a state of openness in interpreting the variety of ideas shared by participants while concurrently maintaining objectivity in applying the methods and a subjective awareness and reflection about the dynamic processes by which themes and categories emerge.

For example, investigators reported on a study that utilized grounded theory methods to explore the ways that transgender persons are depicted in the media (McInroy & Craig, 2015). They conducted in-depth, semi-structured interviews of several hours in length with 19 young adults who self-identify as LGBTQ persons. The interviews were audio-recorded and transcribed; coding of each interview was conducted by three separate coders. The investigators found, first of all, that participants described their experiences with offline media (e.g., television and movies) differently from their experiences with online media (e.g., websites and social media). Transphobic representations (negative reactions or opinions concerning transgender persons) emerged from the thematic coding as an issue with offline media more significantly than issues of homophobia (negative reactions or opinions concerning gay, lesbian, or bisexual persons). The participants also expressed that offline media exhibits very little in the way of positive representations of transgender people; this was experienced as a contrast with more positive representations of LGBTQ persons. The participants also experienced transphobia to a greater extent in the online environment, possibly because of the anonymity allowed in this environment. On the other hand, the online environment offered greater numbers and range of supportive, helpful options for transgender youth. The study authors shared a large number of verbatim quotes from their study participants to demonstrate the coding categories that emerged from the data.

Cross-Checking Coding Decisions

In the example just presented, notice that the authors reported that three individuals coded each interview. This is an important aspect of preserving the integrity of qualitative analysis: one person’s coding decisions should be confirmed by others’ independent decisions. Furthermore, many studies take their findings back to a subsample of the original participants or a new sample to learn if their conclusions are a good fit with the participants’ lived experiences. These activities are part of the “assessing interpretations” step in the qualitative “data analysis spiral” (adapted from Creswell & Poth, 2018, p. 186):

data collection
managing and organizing the data
reading and making memo notes about emergent ideas (initial analysis)
describing and classifying codes into themes (data reduction)
developing interpretations (including how themes relate to one another, how they are distinct)
assessing interpretations (how themes/interpretations hold up in other examples, how well other investigators and study participants agree with interpretations)
representing and visualizing the data
presenting an account of the findings.
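
One simple way to see what cross-checking independent coding decisions can look like is to compare coders pair by pair and flag the segments where they disagree. The Python sketch below does this with simple percent agreement. It is an assumption chosen for illustration: the chapter does not prescribe an agreement statistic, the coder names, codes, and segments are invented, and this is not the procedure reported by McInroy and Craig (2015). Percent agreement also does not correct for agreement that would occur by chance.

```python
from itertools import combinations

# Hypothetical codes assigned independently by three coders
# to the same five transcript segments.
codings = {
    "coder_a": ["offline", "online", "offline", "support", "online"],
    "coder_b": ["offline", "online", "support", "support", "online"],
    "coder_c": ["offline", "online", "offline", "support", "offline"],
}


def percent_agreement(codes_1, codes_2):
    """Percentage of segments to which two coders assigned the same code."""
    matches = sum(a == b for a, b in zip(codes_1, codes_2))
    return 100 * matches / len(codes_1)


# Agreement for every pair of coders.
pairwise = {
    (first, second): percent_agreement(codings[first], codings[second])
    for first, second in combinations(codings, 2)
}

# Segments (by position) where the three coders did not all assign the same code.
disputed = [
    index
    for index, codes in enumerate(zip(*codings.values()))
    if len(set(codes)) > 1
]

for pair, agreement in pairwise.items():
    print(pair, f"{agreement:.0f}%")
print("Segments to discuss:", disputed)
# ('coder_a', 'coder_b') 80%
# ('coder_a', 'coder_c') 80%
# ('coder_b', 'coder_c') 60%
# Segments to discuss: [2, 4]
```

The flagged segments are where the interpretive work happens: coders return to the transcript and field notes, discuss their reasoning, and record the resolution in their memos.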

Chapter Summary

In this chapter, you were introduced to general concepts and issues involved in managing and analyzing qualitative data. There exist excellent textbooks and online demonstrations for learning to master these skills. This content also applies to mixed methods research since many aspects of mixed methods data are qualitative.

License

Icon for the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License