
June 13, 2025 | 4 minute read
Constructing Grounded Theory, chapter 5
by Kathy Charmaz
What I read
In this chapter (The Logic of Grounded Theory Coding Practices and Initial Coding), Charmaz introduces the coding methodology grounded theory uses to analyze and understand data generated during research. Emphasis is placed on using gerunds: verb forms ending in “-ing” that act as nouns.
First, Charmaz provides an example of initial grounded theory coding at work. The coding occurs in a two-column grid; on the right is the narrative (transcript), and on the left are the codes used; some examples are “Enduring recovery”, “Being in pain”, and “Feeling miserable.” She indicates that coding means “naming segments of data with a label that simultaneously categorizes, summarizes, and accounts for each piece of data.” These codes are short. Charmaz recommends coding early (even when research is still occurring), in order to “see where it takes you as you proceed.” Some of the notes Charmaz includes in her example coding are questions, which provide “clues to consider later.” Coding is conceptualizing what is happening in the data.
Charmaz explains that coding begins the process of creating theories that operate across time and place, and of analyzing actions and events in context. First, a researcher names segments of the data; next, they select the most significant parts to use for further cross-data interpretation. Grounded theory coding avoids preconceived categories; instead, researchers generate the codes by defining what they see in the data.
Next, Charmaz provides more detail on why codes are constructed rather than pre-determined. Fundamentally, this is because we experience the world around us through language and action; the language used in an interview or other research activity intertwines our words with the words of the participants. Coding is an “interactive” activity with a participant, even though the participant isn’t present while it happens. It is a study of emerging data, in which we try to understand the data from the perspective of the participant.
Four practical questions are provided that can drive coding. These include asking “What is this data a study of?”, “What do the data suggest, pronounce, and leave unsaid?”, “From whose point of view?”, and “What theoretical category does this specific datum indicate?”
Initial codes are provisional. They may be reworded later; a goal is wording that will “grab the reader immediately.” Charmaz also recommends coding quickly.
Charmaz then describes why coding should be done through gerunds rather than with topics or themes. Gerunds are action-based; coding with them “preserves the fluidity of their experience and gives you new ways of looking at it.” Charmaz shows how the same transcript appears when coded with “-ing” phrases and when coded with nouns or topics.
Coding can be performed on segments of data of different sizes. Some researchers code word-by-word; some code line-by-line; and others focus on incident-by-incident. Line-by-line coding forces a level of focus, and prompts a researcher to see nuances in the data. Seven strategies are offered to help code at this level, including breaking data into parts, defining actions, looking for unstated assumptions, making implicit actions explicit, formalizing significant parts, comparing data, and identifying gaps in data. Line-by-line coding also forces a researcher to avoid accepting a participant’s worldviews without question.
Each form of coding depends on using “comparative methods” to draw distinctions within what is being analyzed. Comparisons identify similarities and differences.
Next, Charmaz introduces “In Vivo Codes”: codes drawn from “participants’ special terms.” Sometimes, researchers use these codes in the titles of their published papers. These codes “give you grist for analysis.” They reflect the way a participant thinks or acts, and the assumptions they make. Terms like this should be pursued during research.
Charmaz ends the chapter by recommending coding full transcripts.
What I learned and what I think
I appreciate the approach; it’s very close to the way we run interpretation sessions in practice: working through transcript data at a close level, summarizing groups with gerunds, and avoiding “red truck” theming. We have never coded within transcripts, though, so that’s new to me. It forces a level of reflection and consideration—and rigor—that substantiates the process, gives it credibility, and really does force the empathetic ability to switch the camera around and try to view the world through a different perspective.
I also appreciate that Charmaz provided actual examples; when I read Gee’s work, I struggled with the practicality of it because there were so few examples of how to actually present the data. Operationally, it’s going to be a little weird; her two columns work for the examples, but any given line may generate multiple codes, and so soon the codes will become out of sync (visually, and therefore semantically) with the transcript itself. I’m also thinking about how this will work in Excel; I think I would want each code to live in its own cell, but that means merging rows in a single column… yuck. I’m sure there is software that “helps” with this. We’ll see if an old dog wants to explore new tricks when I work this way.
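Thinking out loud about the Excel problem: one way around the merged-cell mess might be to keep the coding in a “long” layout rather than a wide one, with one row per line-and-code pair, so an utterance that earns three codes simply appears on three rows. Here is a rough sketch of that idea in Python; the column names and placeholder utterances are mine, and the code labels are just the examples from Charmaz mentioned above, so none of this is her prescribed workflow.

```python
import pandas as pd

# "Long" layout: one row per (transcript line, code) pair.
# A line with two codes appears twice, so nothing has to be merged
# and every code stays attached to the exact line it describes.
# Utterance text is placeholder; the code labels are just the
# examples mentioned above, not a real coding of anything.
coded = pd.DataFrame([
    {"line": 1, "utterance": "<participant's words, line 1>", "code": "Being in pain"},
    {"line": 1, "utterance": "<participant's words, line 1>", "code": "Feeling miserable"},
    {"line": 2, "utterance": "<participant's words, line 2>", "code": "Enduring recovery"},
])

# To get Charmaz's side-by-side view back, collapse the codes per line:
side_by_side = (
    coded.groupby(["line", "utterance"])["code"]
         .agg("; ".join)
         .reset_index()
)
print(side_by_side)
```

The appeal of the long layout is that every code stays tied to its line without any merged cells, and the two-column reading view is just a pivot away.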
Charmaz needs an editor in a bad way. The scattershot, completely jumbled notes I have above reflect the all-over-the-placeness of the text. While the sub-chapter headings sort of contain the contents, the contents themselves are presented as staccato buckshot, and it makes reading this really difficult. The language shifts from being very practical (“Line-by-line coding works particularly well with detailed data about fundamental empirical problems or processes”) to really casual (“See what you can learn.”) Maybe it doesn’t matter in the overall world of method instruction. It made it really hard to absorb, though.
I’m thinking now about the way I want to approach my first data study. I know I want to leverage discourse analysis, because I want to try it (at Paul’s suggestion). I know I want to take a gerund approach to coding, and code at an utterance level. And I know I want to pop out the other end of this with meaningful statements across participants, and within participants. It’s something like this (I’ve sketched a possible spreadsheet shape after the list):
- Transcribe the interviews, and read them several times
- Put the utterances into Excel as rows (chunked as "stanzas", as Gee describes)
- Create columns for the most relevant of Gee's Discourse Analysis tools, which I think - for this study - are "significance", "relationships", and "sign systems and knowledge"
- Analyze the utterances for each interview through the lens of those tools; when something that was said seems to "fit", connect the utterance and the tool through a coded statement
- Write a discourse model for each participant, following the examples Gee has described
- Extract the most meaningful utterances across participants and put them in Miro as individual notes
- Develop categories in a bottom-up fashion through visual affinity mapping of similar utterances across interviews
- Group these statements into insight themes
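For the Excel piece of that plan, I’m imagining something like the layout below: one row per utterance, grouped by stanza, with a column per tool that holds a gerund-phrased coded statement when the utterance “fits.” This is only a sketch of my own plan, not anything Gee or Charmaz prescribes; the file name, column names, and placeholder values are all made up.

```python
import csv

# One row per utterance, grouped into stanzas, with a column per tool.
# A tool column is filled only when the utterance "fits" that tool, and
# the fill is a gerund-phrased coded statement. Everything here is a
# placeholder standing in for real transcript data.
FIELDS = [
    "participant",
    "stanza",
    "utterance",
    "significance",                # Gee's "significance" tool
    "relationships",               # Gee's "relationships" tool
    "sign_systems_and_knowledge",  # Gee's "sign systems and knowledge" tool
]

rows = [
    {
        "participant": "P1",
        "stanza": 1,
        "utterance": "<utterance text>",
        "significance": "<gerund-phrased coded statement>",
        "relationships": "",
        "sign_systems_and_knowledge": "",
    },
]

with open("p1_discourse_coding.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
```

From there, pulling the “most meaningful utterances” for Miro should just be a filter on the rows that ended up with at least one coded statement.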
Download Constructing Grounded Theory, chapter 5, by Kathy Charmaz. If you are the author or publisher and don't want your paper shared, please contact me and I will remove it.