The Art and Science of Text Analysis

August 21, 2018 | June 4th, 2019

When designing a survey, researchers need to consider whether to use quantitative questions, qualitative questions or both. The benefit of open text (qualitative) questions is that you can examine respondents’ answers in more depth. These answers can be rich in meaning and offer insight into why a respondent thinks a certain way. The flip side is that you end up with a mountain of unstructured words. The question then is, how do you turn hundreds, thousands or even millions of individual comments into usable data that tells a story? This is where text analysis comes into play.

What is text analysis?

Text analysis is also known as content analysis, text coding or text mining. At its simplest, text analysis is a way to compile, label and organise text data. To categorise data you can use codes, which represent a summary of the actual text. For example, you might use the codes “enjoyed” or “fun” to summarise “I enjoyed reading this blog and now think text analysis is fun!”. Using codes helps to retrieve data that is similar in meaning, letting the researcher quickly find and cluster pieces of text that relate to one another.

Text analysis has come a long way and today there are software packages that can aid the analysis process. The examples and images in this article relate to MAXQDA 12.

Why code?

Text coding allows you to evaluate open text and other documents in a systematic manner. This process converts qualitative data into quantitative data, allowing the researcher to recover and examine perceptions and trends. The University of Georgia believes it is an important link between pure quantitative and qualitative research methodologies.

Theoretical approaches

Text analysis can be approached in two ways: deductive or inductive analysis.

Deductive analysis, sometimes referred to as a priori coding, is where you start your analysis with codes already in mind. These codes are developed from your research or survey objectives.

Inductive analysis, based on grounded theory, starts your analysis without any predetermined codes; the data informs your codes. Inductive analysis may include in-vivo coding, where the words and phrases of participants are used as codes. The advantage of inductive analysis is that we are working directly with the data and any prior ideas are set aside.

Heather Stuckey believes that text analysis is often a combination of both.

How to code

Coding starts off with collecting and importing the data, also known as documents, that you wish to analyse and code. Traditionally this has been limited to text documents; however, software has evolved to allow analysis of pictures, video and audio. Content analysis continues to evolve as social media becomes an ever-increasing source of information. Some analysis software, including MAXQDA, allows you to import data directly from Twitter and other social media platforms.

Once your documents have been imported you can organise them into groups and sets using the document system. This can be very handy when working on large projects where data can come from different sources such as focus groups, surveys, submissions and social media.

Example of a document system from MAXQDA

Developing your code structure

Before you start coding you have to decide how you will approach it. This includes reviewing the data, developing a narrative and giving your coding a structure. Heather Stuckey recommends reading through the data and developing a storyline before you even start coding. This storyline is related to the overall research question and is based on what the data tells you. Stuckey believes this storyline can help with your coding scheme by defining concepts and guiding the organisation of your codes.

There is a range of tools that help you design your coding structure. In MAXQDA there are word frequency tools and creative coding.

Frequency analysis gives you a rough and ready overview of the data. The format varies between software packages; however, the principles behind them are the same. Frequency analysis searches your data and counts each time a word appears. Some software, such as MAXQDA, allows you to use stemming (lemmatization), which counts modified words under their base word, e.g. gave, given and gives would all be counted as give. You can also search by phrase, making the frequency analysis even more powerful. In the table below you can see we have done a word combination search, so instead of “emergency response” being counted as two separate words it is counted as one phrase.
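To make the principle concrete, here is a minimal Python sketch of a word and word-combination count. It is not how MAXQDA works internally; the function names, the tiny lemma map and the sample comments are all made up for illustration.

```python
import re
from collections import Counter

# Hypothetical lemma map for illustration only; a real tool uses a full dictionary.
LEMMAS = {"gave": "give", "given": "give", "gives": "give"}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def word_frequencies(comments, n=1):
    """Count n-word combinations across all comments (n=1 counts single words)."""
    counts = Counter()
    for comment in comments:
        # Replace each modified word with its base word before counting.
        tokens = [LEMMAS.get(tok, tok) for tok in tokenize(comment)]
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


# Made-up comments, not real survey data.
comments = [
    "The emergency response gave us confidence.",
    "Emergency response times have given us hope.",
    "She gives clear advice about the survival kit.",
]

single_words = word_frequencies(comments, n=1)
phrases = word_frequencies(comments, n=2)

print(single_words["give"])           # gave, given and gives counted together
print(phrases["emergency response"])  # counted as one phrase, not two words
```

Setting n=2 is what turns two separate words into a single countable phrase; a larger n would count longer word combinations in the same way.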

Most software tools also have a form of stop and go lists, which help you clean the output by removing certain words from future frequency analysis. Words you might wish to remove include “the”, “and”, “I”, “then”: words which are often the most frequent but don’t add much value to the analysis. In the table below you can see several word combinations with a stop sign next to them, meaning they have been removed.

Example of word frequency analysis output with stop list.
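A stop list can be thought of as a simple filter over the frequency table. The sketch below assumes the counts are already available as a Python Counter; the function name and all of the counts except the 42 for “emergency response” are invented for illustration, and this is not MAXQDA's own implementation.

```python
from collections import Counter

# Hypothetical stop list built from the example words mentioned in this article.
STOP_LIST = {"the", "and", "i", "then"}


def apply_stop_list(frequencies, stop_list):
    """Return a new Counter without the terms that appear on the stop list."""
    return Counter(
        {term: count for term, count in frequencies.items() if term not in stop_list}
    )


# Illustrative counts only; they are not taken from a real project.
frequencies = Counter(
    {"the": 120, "and": 95, "emergency response": 42, "i": 40, "survival kit": 12}
)

cleaned = apply_stop_list(frequencies, STOP_LIST)
for term, count in cleaned.most_common():
    print(term, count)
```

The original table is left untouched, so a term can be taken off the stop list later without having to recount the documents.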

The words or phrases from the frequency analysis can inform your coding. “Emergency response” appears in our documents 42 times, so having a code based on this will be very useful. As you go down the table you can see the word combinations “about deployment”, “public awareness” and “survival kit” appear frequently. On their own, these codes don’t mean anything as there is little to no context; this is where the analyst needs to dive deeper.

Creative coding

Creative Coding is a tool within MAXQDA that provides you with a blank workspace on which you can create codes, move them around and create meaningful groups as well as show links. For us older researchers, it’s like drawing the code framework on a whiteboard or using butcher paper and post-it notes.

Example of code planning using Creative Coding in MAXQDA

Creative coding allows you to manipulate codes and explore the relationships between them before locking down your code system. A code system is a hierarchical structure. You can have a parent code, subcodes and even subcodes of subcodes. In MAXQDA you can export the Creative Coding workspace to become the Code System.
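If it helps to picture the hierarchy, a code system can be sketched as a small tree. The Python below is only an illustration under that assumption; the Code class and the code names are hypothetical and have nothing to do with MAXQDA's file format.

```python
from dataclasses import dataclass, field


@dataclass
class Code:
    """A code that can hold any number of subcodes."""
    name: str
    subcodes: list = field(default_factory=list)

    def add(self, name):
        """Create a subcode under this code and return it."""
        child = Code(name)
        self.subcodes.append(child)
        return child

    def walk(self, depth=0):
        """Yield (depth, name) for this code and everything beneath it."""
        yield depth, self.name
        for subcode in self.subcodes:
            yield from subcode.walk(depth + 1)


# Hypothetical code names, loosely based on the phrases in the frequency table.
preparedness = Code("Preparedness")
kit = preparedness.add("Survival kit")
kit.add("Kit contents")  # a subcode of a subcode
preparedness.add("Public awareness")

for depth, name in preparedness.walk():
    print("  " * depth + name)
```

Because every subcode is itself a Code, the same structure handles a parent code, its subcodes and any further levels without special cases.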

It is important to note that even once you have developed your code system, you can still edit it throughout your project. This can include removing, adding or modifying codes as the data talks to you. In fact, the reviewing and editing of codes is called constant comparative analysis and is part of the coding process in grounded theory.

The Code System within MAXQDA

Code memos

Memoing is an important part of coding, especially if you’re using grounded theory, and occurs alongside the coding and analysis process. Stuckey explains that memos can be used for noting edits to a code, conceptual notes about how a code develops the storyline and details on how the code is used. Memos are often for the researchers’ insight and information only.

Memo describing a code and how it is applied

Memo describing a parent code and where it has come from

Coding

Each software package has its own way of coding text. Coding in MAXQDA is done by highlighting text and dragging it into the appropriate code. You can code a series of words or a whole paragraph, and a segment of text can have multiple codes.
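Behind the drag and drop, a coded segment is essentially a stretch of text with a code attached. The sketch below is a rough Python illustration of that idea, using the example sentence from earlier in this article; the class, the function names and the “positive feedback” code are assumptions made for the example, not MAXQDA features.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CodedSegment:
    """A stretch of text in a document with one code attached."""
    document: str
    start: int
    end: int
    code: str


def code_segment(document, text, phrase, code):
    """Attach a code to the first occurrence of a phrase in the document text."""
    start = text.find(phrase)
    if start == -1:
        raise ValueError(f"{phrase!r} not found in {document}")
    return CodedSegment(document, start, start + len(phrase), code)


def retrieve(segments, texts, code):
    """Return the text of every segment that carries the given code."""
    return [texts[s.document][s.start:s.end] for s in segments if s.code == code]


texts = {"comment_1": "I enjoyed reading this blog and now think text analysis is fun!"}
text = texts["comment_1"]

segments = [
    code_segment("comment_1", text, "I enjoyed reading this blog", "enjoyed"),
    code_segment("comment_1", text, "text analysis is fun", "fun"),
    # The whole comment also gets a broader, hypothetical code.
    code_segment("comment_1", text, text, "positive feedback"),
]

print(retrieve(segments, texts, "fun"))
```

Storing one record per code is what lets the same words sit under several codes at once, and retrieving by code is then just a filter.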

User interface with text coded

As you can see, there’s quite a lot involved in coding text. PublicVoice has completed projects that included 2000+ individual comments over a range of topics. The analysis of this data can be a pain point if you don’t have the expertise or software. If you would like guidance, advice or someone to do the analysis and/or reporting work for you, give us a call. We can take the hassle out of analysing your qualitative data!

Happy coding!
