Ethical AI for Interview Analysis
A five-step guide to using AI to analyze interview transcripts ethically
Artificial intelligence (AI) is transforming how researchers across industry and academia analyze data from both primary and secondary research. With its ability to process large amounts of data quickly, AI can provide valuable insights and save significant time and money. But, as with any technology, there are risks and limitations associated with its use. So let’s explore the benefits and challenges of using AI products like Ferret to analyze transcripts from primary research. Here’s a five-step checklist for using AI in primary research ethically and responsibly:
Determine whether using AI is appropriate for your project: Researchers should consider whether AI is appropriate for their project, especially if they are 1) gathering information from vulnerable populations or people who may be targeted for surveillance, or 2) collecting sensitive information from research participants. Data breaches happen all the time, so researchers should exercise discretion when deciding whether to use AI.
Obtain consent from research participants to use AI for analysis: Researchers should inform participants that they will be using AI technologies during both the screening and informed consent processes. Prospective participants should be able to opt in and provide consent before they begin answering research questionnaires. Offering an additional incentive for participants who allow the use of AI may be appropriate for some studies.
Anonymize and edit transcripts before analysis: Generally, researchers should remove any information that could identify research participants, such as names or specific locations, and replace it with pseudonyms or alphanumeric codes. The list matching identities with codes should be kept safe and private, then destroyed once the research is complete. Researchers should also review transcripts to remove any remaining unnecessary information, and for more sensitive research topics it is appropriate to provide the edited transcripts to participants for review and approval before analysis with AI. In these cases, participants should be given a specific time frame to review and provide feedback and/or confirmation, and this expectation should be conveyed before the research begins, when requesting express consent to conduct it.
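To make the pseudonymization pass concrete, here is a minimal Python sketch. The names, codes, and sample transcript are hypothetical, and no automated substitution replaces a careful manual review of each transcript:

```python
# Hypothetical mapping of identifying strings to alphanumeric codes.
# In practice, build this list during transcript review, store it
# securely and separately from the transcripts, and destroy it once
# the research is complete.
PSEUDONYMS = {
    "Jane Doe": "P-001",
    "Acme Corp": "ORG-A",
    "Springfield": "LOC-1",
}

def anonymize(transcript: str, mapping: dict[str, str]) -> str:
    """Replace each identifying string with its assigned code."""
    for identifier, code in mapping.items():
        transcript = transcript.replace(identifier, code)
    return transcript

raw = "Jane Doe described her work at Acme Corp in Springfield."
print(anonymize(raw, PSEUDONYMS))
# -> P-001 described her work at ORG-A in LOC-1.
```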
Double-check the prompts you use: While there are protocols and levels of encryption to protect data, once it is made available to large language models like GPT-4 via products like ChatGPT, researchers and research participants are no longer in control of what happens to it. As with transcripts, the prompts researchers use to direct AI systems may directly or indirectly convey sensitive data. AI products designed for industrial research, like Ferret, avoid using ingested data to train language models and are designed to be used with proprietary data. Consumer tools like ChatGPT, however, will often use prompts as well as ingested data for training. If you are using general-purpose consumer AI products for less sensitive aspects of research, such as ideation, you can usually disable training and chat history in the product’s settings. For products that are not designed to handle proprietary data, provide only information you would be comfortable sharing with the general public, such as on a blog.
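As a rough illustration of prompt hygiene, the sketch below screens a prompt against a few hypothetical patterns before it is sent to a consumer AI product. The patterns are examples only; an automated check is a backstop, not a substitute for reading what a prompt reveals:

```python
import re

# Hypothetical patterns for data that should never appear in a prompt
# sent to a consumer AI product; extend this list for your own study.
SENSITIVE_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),         # SSN-like numbers
    re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),  # email addresses
]

def screen_prompt(prompt: str) -> list[str]:
    """Return any sensitive matches; an empty list means the prompt
    passed this (very basic) automated check."""
    hits: list[str] = []
    for pattern in SENSITIVE_PATTERNS:
        hits.extend(pattern.findall(prompt))
    return hits

prompt = "Summarize attitudes toward remote work in transcript P-001."
if screen_prompt(prompt):
    raise ValueError("Prompt contains sensitive data; edit before sending.")
```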
Use AI as a tool, not a replacement for human expertise: While AI can provide useful insights and save time, it should not replace human analysis entirely. Researchers can use AI for general insights, and also to point them to the areas of raw data that require more in-depth analysis. The most important part of any research project is protecting participants, which may mean adding steps to existing protocols. Secondarily, best practices for research still apply: AI is not a means of circumventing human researchers. In fact, its new capabilities come with new ethical and methodological risks, so experienced human researchers are more important than ever when integrating AI.
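As one way to put this into practice, the sketch below asks a model to flag transcript passages for closer human reading rather than to draw conclusions. It assumes the openai Python client; the model name and prompt wording are placeholders to adapt, and the transcript should already be anonymized per step 3:

```python
from openai import OpenAI  # assumes the openai v1 Python client

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Hypothetical triage prompt: ask the model to flag passages worth a
# closer human read, rather than to produce final conclusions.
TRIAGE_PROMPT = (
    "List up to five passages in this anonymized transcript that an "
    "experienced researcher should re-read closely, with one line on why:\n\n"
)

def flag_for_review(anonymized_transcript: str) -> str:
    """Return the model's reading guide for a human analyst."""
    response = client.chat.completions.create(
        model="gpt-4o",  # assumed model name; substitute your own
        messages=[{"role": "user",
                   "content": TRIAGE_PROMPT + anonymized_transcript}],
    )
    return response.choices[0].message.content
```

The output is a reading guide for the human analyst, not a finding; the researcher still does the interpretive work on the flagged passages.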
Remember to think holistically about the information you provide to tools like ChatGPT. While an individual transcript may be safe to share with OpenAI and other third parties after following this guide, a corpus composed of several pieces of information can introduce new liabilities, such as the potential for cross-referencing to circumvent even basic safeguards. The threats range from exposing research participants to unexpected risks to making information you intended for private analysis available to potential competitors.
While AI can seem high-risk and high-reward, it’s important to remember that technology has always shaped how researchers do their work, and that avoiding digital technologies can also present significant risks. Using AI in primary research comes with unprecedented advantages that are counterbalanced by less predictable risks. But whether or not AI is part of your research stack, it’s always best to err on the side of caution.
Expert Request: We are seeking input from the tech and research communities for a study on how artificial intelligence is changing how industry and academia conduct research.
Subject Matter Expert Questionnaire:
5–8 minute duration; participants can opt in to longer sessions.
Participants’ names are included in the final report after they confirm consent.
Participants can opt in to having their work promoted via a hyperlink alongside their name.
Relevant areas of expertise include: artificial intelligence, tech ethics, research ethics, research methodology, innovation, design, and social/behavioral/human science.
Participants (i.e., co-creators) will be credited in the final publication after confirming that they consent to 1) being credited and 2) having a quote from them included (if one appears in the final draft).