Qualitative Research: AI’s Role in Analysis Advancement


Qualitative research plays a pivotal role in enriching our comprehension of individual narratives and experiences. It is a cornerstone methodology for design researchers seeking to forge a deep connection with user perspectives, particularly during the initial phases of the design process. This approach is instrumental in guiding iterative design developments, ensuring that end-user needs are comprehensively addressed. Qualitative data encompasses a diverse array of formats, including textual content, photographs, and videos. Typically, these studies involve a more focused sample size, often with 10 or fewer participants, to facilitate an intensive, detail-oriented analysis that quantitative methods may not capture.

Although qualitative research is the methodology of choice for design researchers, the approach requires a considerable time commitment. Qualitative data is known for being unwieldy at times, and words and images require more hours of analysis than numeric data. Often, our clients are eager to obtain research findings as quickly as possible to move a product or system into production. Therefore, large scale qualitative studies are not feasible for most design research projects. With the recent surge in the availability of AI language model tools, we speculated that ChatGPT could be used to analyze extremely large sets of qualitative data more efficiently. To that end, we conducted a 6-month project testing ChatGPT as a potential tool for qualitative data analysis.


Our aim in conducting this project was to determine if AI could produce insights from a large dataset that would otherwise be unmanageable and time prohibitive for a human researcher. We used data from 25,000 open response questions to explore the capacity and capability of ChatGPT as computer-assisted qualitative data analysis software (CAQDAS). The dataset we used was provided by the VIA Institute on Character, a local non-profit organization with which we are affiliated. We decided to experiment with ChatGPT to determine if it could reliably and accurately analyze text data. Our expectation was that if ChatGPT could analyze qualitative datasets with tens or hundreds of thousands of respondents, new pathways for qualitative researchers may develop. Using AI for data analysis could change the trajectory of a research design and lead to large-scale qualitative studies that were not possible before now.


To test the limits of ChatGPT 4.0 (the only version with the means to upload files), we tried two different approaches to determine the capability of the tool.


We started with a vague set of user queries to place the data preparation load on the CAQDAS and to determine if it would complete the same tasks a human researcher would.

User Query: Analyze the data in column AQ, identify themes, and provide 3-5 insights based on participant responses.

Result: Not useful.

  1. ChatGPT did not automatically clean the data without instruction, which caused an error. The output from ChatGPT indicated the data file was either too long or too complex and that it was unable to proceed with analysis. The raw data included responses such as “N/A” or random strings of letters, which a researcher would have deleted or ignored before analysis.
  2. Lacking more specific instruction, ChatGPT defaulted to a quantitative approach to data analysis, even though the data were text responses. One of the first outputs ChatGPT produced was a count of the most common phrases in the dataset.
We concluded that this approach to creating user queries was not useful. ChatGPT attempted to analyze the data but quickly became overwhelmed and either produced an error message or continued to attempt analysis, getting caught in the AI version of the Mac’s “spinning wheel of death.”
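A researcher can also strip these junk responses before upload rather than asking the AI to do it. The snippet below is a minimal sketch of that pre-cleaning step; the placeholder list and the no-vowels heuristic for keyboard-mash strings are illustrative assumptions, not part of our actual workflow.

```python
import re

# Illustrative placeholder values a researcher would discard before analysis.
PLACEHOLDERS = {"", "n/a", "na", "none", "no answer", "-"}

def is_junk(response: str) -> bool:
    """Heuristic filter for non-responses: known placeholders, plus
    single tokens of 4+ letters with no vowels (likely keyboard mash)."""
    text = response.strip().lower()
    if text in PLACEHOLDERS:
        return True
    # A lone "word" of consonants only (e.g. "sdfgh") is unlikely
    # to be a real answer.
    if re.fullmatch(r"[b-df-hj-np-tv-z]{4,}", text):
        return True
    return False

responses = ["N/A", "sdfgh", "I try to listen before I judge."]
cleaned = [r for r in responses if not is_junk(r)]
```

A filter like this is deliberately conservative; anything it cannot classify with confidence should be left in place for a human to review.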



We then provided ChatGPT with more specific instructions. We instructed it to clean, review, and code the data, then create insights using a theoretical framework as a guide for analysis.

User Query: I'd like to analyze some text data using Peace Psychology and Positive Psychology as theoretical frameworks. Include content from the VIA Institute on Character as an additional framework. Focus on data in the 'Open Responses_Political Differences' column.

First, ignore text that indicates a respondent did not want to answer such as 'N/A' or random strings of letters. Leave those cells blank. Next, use descriptive codes, such as a phrase that describes the content of the targeted data.

Create a new document and filter the data from columnAD. Group the data according to the codes created in columnAD and list each data point that corresponds to the code.

Create 3-5 insights using the coded data in columnAD, with positive psychology and peace psychology as theoretical frameworks.

Create a Word document and place the insights you created in it. Make the file available for download.

Result: Success

ChatGPT produced a list of five insights using the VIA classification of character strengths as a framework, supported by references to positive psychology and peace psychology.
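The clean-code-group-summarize sequence we asked ChatGPT to perform maps onto a conventional CAQDAS workflow. As a minimal sketch of the grouping step (the codes and responses below are hypothetical stand-ins, not our actual data):

```python
from collections import defaultdict

# Hypothetical (code, response) pairs a researcher might produce
# while descriptively coding an open-response column.
coded_responses = [
    ("seeking common ground", "We agreed on more than we expected."),
    ("avoiding conflict", "I just change the subject at family dinners."),
    ("seeking common ground", "Focusing on shared values helped."),
]

# Group every data point under its descriptive code, mirroring the
# "filter and group by code" step in our user query.
grouped = defaultdict(list)
for code, response in coded_responses:
    grouped[code].append(response)

for code, responses in grouped.items():
    print(f"{code}: {len(responses)} response(s)")
```

Insight generation then works from these grouped excerpts rather than from the raw column, which is what made the staged instructions succeed where the vague query failed.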

Our last step was to think about how to test the accuracy and reliability of the findings. Rachael has a strong background in qualitative methodology, so she extracted a sample of 20 data points and used a traditional qualitative approach to code the data. We also queried ChatGPT using the same data set with similar user queries to determine the degree of similarity between both sets of outputs.


Within the first 20 rows of participant responses, we identified at least three codes related to the five insights ChatGPT produced. We were pleased with the accuracy of the codes as they related to the insights from our sample data.


Both researchers requested insight generation from ChatGPT using similar but not identical queries. ChatGPT’s output for both sets of queries was closely aligned. Three of the five generated insights were nearly identical between the two researchers, and two were unique to each researcher. It is important to note that even within research teams, differences arise between researchers’ interpretation of qualitative data. Disagreements are resolved through discussion and are typical of a qualitative research project.
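One simple way to quantify how closely two such outputs align is the set overlap between the insight labels each researcher obtained. The sketch below uses the Jaccard index with hypothetical labels; a formal reliability check would instead use a chance-corrected statistic such as Cohen's kappa.

```python
def jaccard(a: set, b: set) -> float:
    """Share of labels the two output sets have in common:
    |A ∩ B| / |A ∪ B|."""
    return len(a & b) / len(a | b)

# Hypothetical insight labels from two researchers' ChatGPT runs:
# three shared, two unique to each, as in our comparison.
researcher_1 = {"hope", "fairness", "perspective", "humor", "bravery"}
researcher_2 = {"hope", "fairness", "perspective", "kindness", "curiosity"}

overlap = jaccard(researcher_1, researcher_2)  # 3 shared / 7 total
```

Pairing a quick overlap number like this with discussion of the divergent labels follows the same resolution process a human-only coding team would use.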

Finally, we tested some of the limits of ChatGPT with several ad hoc user queries. Our queries produced a great deal of information in a very short period.


  1. Code Participant Country of Origin: We queried the data set to determine whether each participant’s country of origin represented an individualistic or collectivist nation (coded as “country designation”). We asked ChatGPT to code each participant’s country of origin using this designation. The codes were placed in a new column created by ChatGPT.
  2. Group Data and Analyze Frequency: We asked ChatGPT to examine the most common keywords and themes by country of origin. It provided a summary for all countries represented in this data set with the most frequently used codes according to participants’ country of origin.
  3. Character Strength Frequency by Country Designation: We prompted ChatGPT to organize the data according to country designation and create a bar chart to illustrate the frequency of character strengths chosen as most beneficial by participants.
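The coding and tabulation behind these requests can be sketched in a few lines. The country designations below are illustrative only, loosely following Hofstede's individualism dimension, and the participant rows are hypothetical; a real analysis would cite a specific source for each designation.

```python
from collections import Counter

# Illustrative country designations (assumed, Hofstede-inspired).
DESIGNATION = {
    "United States": "individualist",
    "Germany": "individualist",
    "Japan": "collectivist",
    "Brazil": "collectivist",
}

# Hypothetical (country, most-beneficial character strength) rows.
rows = [
    ("United States", "hope"),
    ("Japan", "teamwork"),
    ("Germany", "hope"),
    ("Brazil", "kindness"),
]

# Code each participant's country, then count strengths per designation:
# the tabulation underlying the bar chart we requested.
freq = Counter(
    (DESIGNATION[country], strength) for country, strength in rows
)
```

A table like `freq` is exactly what a charting step would consume, which is why ChatGPT asked for a designation criterion (such as Hofstede's) before it would proceed.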

All the ad hoc requests were completed successfully. ChatGPT confirmed its approach with users before completing each step. For example, after requesting the data be coded according to country designation, ChatGPT responded:

Sure! In order to label countries as "collectivist" or "individualist," I would need a reference or criteria to determine which category each country falls into. There are various sources and studies that have categorized countries based on their cultural dimensions, including the work by Geert Hofstede.

Do you have a specific reference or criteria in mind that you'd like me to use? Alternatively, I can use Hofstede's cultural dimensions as a general guideline.

ChatGPT also encouraged us to use caution when interpreting the findings from the between-country analysis, because significantly more participants were from individualistic nations.


We learned through this project that ChatGPT exhibits several esoteric preferences for working with Excel files. We only used Excel to upload data sets, so our suggestions are restricted to this software.

1. ChatGPT cannot analyze data if it has been tagged with a data type. The output will state that it completed the user query, but new files will not show any changes.

SOLUTION: Remove any Data Types tags before uploading Excel files to ChatGPT.


2. ChatGPT prefers references to column names instead of the letters Excel uses to identify columns.

SOLUTION: If a user query contains a letter identifier instead of a column name, remove the space between the word “Column” and the letter.

CORRECT: “Provide a mean for the data in columnAI.”

INCORRECT: “Provide a mean for the data in Column AI.”

3. Unless instructed, ChatGPT will not automatically clean uploaded data. If a user attempts to request analysis before cleaning, it will respond with an error message.

SOLUTION: Provide explicit instructions for data cleaning before analysis.


We shared just a fraction of the user queries we submitted over a 6-month period to test ChatGPT as a qualitative analysis tool. We presented the successes and failures as linear, concise processes for readability. However, early in the project, ChatGPT was often overwhelmed with requests and our queries resulted in error messages. Queries usually required several back-and-forth inputs between researchers and the AI to clarify instructions. With little or no guidance, ChatGPT was unable to produce results. We found that the AI required specific instructions to function as computer-assisted qualitative data analysis software. Our bottom-line recommendation is that well-trained researchers test the tool using a data set for which they already possess human-produced findings, then compare those findings with ChatGPT's output to evaluate its reliability and accuracy.

Based on our brief examination of ChatGPT’s capability, we advise only well-trained researchers with extensive qualitative research experience to use AI as a computer-assisted data analysis tool. As in any other profession, expertise and training are the best predictors of quality work. As the saying goes: garbage in, garbage out. Users who do not know how to design a rigorous research study will not provide the input AI needs to perform adequately.

Our early work indicates the potential for AI to assist in qualitative data analysis. Like other CAQDAS products such as MAXQDA and NVivo, the software serves as a management and organizational tool. We envision ChatGPT as a marginally higher-level tool with the capacity for categorizing and summarizing qualitative data, given the proper guidance and instruction.



  • Rachael Clark


    Senior Design Researcher | [email protected]

    Rachael brings over 10 years of research experience to her role at Kaleidoscope Innovation. She has advanced training in clinical psychology and mixed methods research methodology. Guided by the principles of positive psychology, Rachael uses a human-centered lens for deeply understanding the user experience. Her work at Kaleidoscope focuses on human-machine interaction and identifying design changes capable of positively impacting well-being at the individual and institutional levels.

  • Grant Cothrel


    Senior Design Engineer | [email protected]

    Grant is a Senior Design Engineer who enjoys being faced with new challenges, and recognizes that well-designed products lead to better experiences and outcomes for users. He loves the design process, and has operated primarily in medical device and industrial applications. His passion to understand, innovate, and simplify has been supported and strengthened by the Kaleidoscope team and their talented partners. At home, he always has a fun project in the works (think: wooden bicycle, handheld Theremin, one-string electric guitar)!

Kaleidoscope Earns Gold

We are thrilled to share the exciting news that Kaleidoscope Innovation has achieved a Gold EcoVadis rating for 2023. This marks a major milestone in our journey to being a leader in environmental, social, and governance (ESG) excellence.​

EcoVadis, a globally recognized platform, evaluates companies based on their policies, training initiatives, and reporting mechanisms related to ESG practices. This comprehensive assessment provides a holistic view of our organization's sustainability performance, positioning us as a standout in our commitment to environmental stewardship, social responsibility, and corporate governance.​

Our latest EcoVadis assessment resulted in a remarkable increase of 15 points, propelling us to the 97th percentile among companies of similar size. This accomplishment not only highlights our dedication to sustainability but also underscores our team’s ability to outperform industry benchmarks.​

This achievement is a testament to the collective efforts of every member of the Kaleidoscope Innovation family, and it reflects our shared values and determination to make a positive impact on the world. Our success in this area wouldn't be possible without the dedication and hard work of each team member who has contributed to our sustainability initiatives.​

"We are grateful to every one of our dedicated team members for their contributions to this achievement," said Pam Ridgley, Associate Director of Business Planning. "Together, we are shaping a brighter and more sustainable future for Kaleidoscope Innovation and the communities we serve."​

Here is just a sample of how we achieved our goals in each of the three ESG categories.


Environmental

  • We created a Design for Environment (DfE) Guide to help our team members ensure we use sustainable practices not just in our everyday work environment but also in the products we design and develop.
  • We reduced facility energy consumption by nearly 18% by resealing windows and creating an energy conservation culture.
  • Our action team tested and implemented recommendations for sustainable 3D printing as they related to energy consumption, filament usage, and more eco-friendly materials.
  • Our K-Cup recycling program diverted over 13,000 pods from the landfill and converted 479 lbs of material to compost and 160 lbs of material to recycled products.


Social

  • Community outreach including May We Help, Adopt-a-Class, IDSA Cincinnati, Ronald McDonald House, Habitat for Humanity, and more.
  • Expanded employee engagement & wellness teams and programs including the Employee Engagement Group, DEI Committee, Sustainability Committee, Employee Lunch & Learns, Women in Product Development Monthly Lunch, and more.


Governance

  • Added numerous policies regarding diversity, equity, and inclusion, particularly with a focus on labor and human rights, to our employee handbook.
  • Created our own Institutional Review Board to ensure the ethical and safe development of healthcare products.

Our journey doesn't end here. This accomplishment serves as a stepping-stone toward even higher standards that continuously improve our ESG practices. The world is evolving, and so are the expectations placed upon businesses to operate sustainably. Kaleidoscope will remain agile and proactive in our approach to ensure we remain at the forefront of responsible corporate citizenship.​

“We are incredibly proud of our EcoVadis achievement,” said Kaleidoscope CEO Ben Ko. “And we’re not stopping there. Our DfE program sets a new standard for ensuring that all products Kaleidoscope creates are more sustainable than they have been in the past, thereby amplifying the impact we can have on the world through good design principles.”

Precision in Practice: Navigating Usability Studies

Conducting a successful usability study, particularly for summative validation of medical products, can require a great deal of preparation and coordination of many moving parts. Whether it’s successfully simulating your target use environment or nailing down the exact scenarios and tasks to be presented, everything must come together. After all, nobody wants to waste the time and money it requires to complete a study if the result doesn’t align with what was intended.

One of the best ways to ensure success is to perform a pilot study with your protocol before starting the actual study. A pilot study is like a miniature version of the actual study conducted with far fewer participants. This approach helps confirm the study design will work as expected, the desired data can be obtained, the participants understand the task prompts, and more. For best results, the pilot study participants should be as close to actual participants as feasible; the same applies to the use environment. You’ll also want to conduct the pilot study early enough before the actual study to ensure there is sufficient time to update the protocol according to the findings.

Many people use the terms “dry run” and “pilot study” interchangeably, but there is technically a difference. A dry run is done to practice the protocol both before the pilot study and after the finalization of the protocol, ensuring the moderator and notetaker/analyst are comfortable with the product and script. This activity can be done with proxy participants or with no participants at all. What’s notable about a dry run is that it can be done in place of a pilot study if the participant population is expensive or difficult to recruit, if the protocol is very simple or essentially a repeat of a previous study, or if the timing does not permit a full pilot study.

While there can never be guarantees, these are proven ways to significantly increase the odds of a study’s success. Taking the time upfront will ultimately save time and effort when it’s time to conduct the actual study. At Kaleidoscope, we've conducted hundreds of studies, so our process is as rigorous and dialed in as possible. If you're looking for a rock-solid research partner, we're ready to roll up our sleeves.
