---
title: "Synthetic Users Aren't Data"
description: "Synthetic users simulate what a person might say. That's a hypothetical with a UI, the exact bad data the Mom Test rejects. Where AI personas help, and where they lie."
url: "https://luc.so/articles/synthetic-users-vs-real-interviews/"
category: "AI customer research"
date: "2026-06-29T19:05:53.533545"
source: luc.so
---

# Synthetic Users Aren't Data

**TL;DR:** A synthetic user is a model imagining what a person might say. That is a hypothetical with a clean interface, and a hypothetical is the exact kind of answer [the Mom Test throws out](/manifesto). Simulations skip the one thing that makes research real: a specific person's past behavior. They have legitimate uses for rehearsal and rough hypotheses, but they are not evidence, and they cannot replace a real interview held to craft. "AI interviews" and "synthetic users" are two different things. Lùc runs the first, at volume, and refuses the second.

## The pitch for synthetic users is good

You should understand why this idea spread, because the pitch is genuinely appealing.

Research is slow and expensive. Recruiting takes weeks. Scheduling falls through. A traditional in-depth interview costs real money per respondent, and an agency eats that cost before it bills a deliverable. Synthetic users promise to erase all of it. Spin up a persona, ask it anything, get an answer in seconds. Versive built a self-serve product around AI personas after absorbing Snap, priced so a team can run it without a sales call. The vision is research that costs almost nothing and never sleeps.

If a model could give you what a real respondent gives you, this would be the end of the argument. It can't, and the reason is specific.

## Why a simulated answer is a hypothetical

When you ask a synthetic user a question, the model predicts the most plausible text a person like that might produce. It is pattern completion over training data. It has never used your product, never abandoned a checkout, never switched tools at 11pm because the old one lost its work again.

So what you get back is a guess about what someone would say. In interview terms, that has a name. It is hypothetical fluff, one of the three types of bad data the Mom Test names, alongside compliments and wishlists. The whole discipline exists to drag respondents off "I would probably..." and onto "here is what I actually did last Tuesday." A real interviewer spends the conversation fighting to get past the hypothetical, because the hypothetical is where people are confident, agreeable, and wrong.

A synthetic user is made entirely of that material. The thing the craft works hardest to remove is the only thing a simulation has to offer. You are not skipping the slow part of research. You are keeping the worst part and throwing away the rest.

## What real interviews have that simulations can't

Two things, and a model cannot manufacture either.

**Past behavior.** A real respondent carries a history. They did something, in a real situation, for reasons they may not even understand yet. Good interviewing excavates that: what triggered the search, what they tried first, what made them switch, what they were anxious about. Jobs to Be Done (JTBD) calls this the forces of progress and the switch moment. None of it exists inside a synthetic user, because nothing ever happened to it. There is no last Tuesday to walk back through.

**The flinch.** The most valuable moment in an interview is the small one. The pause before an answer. The word someone picks and then corrects. The story that contradicts the thing they said two minutes ago. A skilled interviewer follows the flinch, not the script, because the flinch is where the real motivation leaks out. A synthetic user has nothing to flinch about. It produces fluent, confident text on demand. It will never hesitate, never surprise you, never reveal a tension it didn't know it had. You can only learn what you already encoded into the prompt, which means you learn nothing you didn't already believe.

That last point is the quiet danger. Synthetic users are a mirror that talks back. They tend to confirm the assumptions baked into how you described them, which is the precise failure mode disciplined research is built to prevent.

## "AI interviews" is not the same thing as "synthetic users"

This is where the category gets muddled, so let's be exact, because the difference is the whole point.

**Synthetic users** replace the respondent. The human is gone. A model plays the customer.

**AI-moderated interviews** replace the interviewer's hours, not the customer. A real person sits on the other end. The AI asks, listens, probes, and follows up, with a live human answering from their own real experience.

These are opposite bets. One removes the source of truth. The other removes the labor of extracting it. *Harvard Business Review* treated AI-moderated interviewing as a serious, emerging research method in April 2026, which tells you the category is real and maturing. But "AI" in the headline hides the split. An AI that interviews 200 real people is doing research. An AI that imagines 200 people is doing autocomplete with a persona on top. Same buzzword, different epistemology.

Most competitors blur this on purpose, because "AI research" sounds like one big wave. It isn't. Read the fine print on whether a tool talks to people or invents them.

## Disciplined AI volume versus undisciplined AI volume

Putting an AI between you and a real respondent does not make the interview good. It makes it scalable, which is a different property, and scale cuts both ways.

Faster bad interviews are still bad interviews. An AI that runs your existing flawed discussion guide at scale will plant the answer inside the question 200 times instead of 12. It will accept the first compliment and move on. It will hear a vague answer and fail to probe, because [following up on a thin answer without leading the witness](/articles/customer-interviews-without-leading-questions) is the hardest move in the craft, and most automated interviewers don't even try. Strella's pitch is "run 100 customer interviews by tomorrow." The volume is real. The question nobody on that side answers is whether the hundredth interview is any more disciplined than the first.

That is the line Lùc is built on. The product is the discipline, not the speed. It asks about what people did, not what they would do. It treats enthusiasm as noise until a behavior backs it. It never feeds the answer inside the question. It follows the flinch. Then it runs that, held to craft, at the volume a project needs, so the researcher spends their judgment on synthesis instead of on the two-hundredth conversation. Volume only helps if the craft survives it. Otherwise you have automated the thing you should have stopped doing.
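One rough way to picture the first and third of those rules is a lint pass over a discussion guide before it reaches a respondent. The sketch below is an illustration only, not a description of how Lùc works. The phrase lists and the `lint_question` name are assumptions invented for this example, and a pattern match will only ever catch the most obvious offenders.

```python
import re

# Hypothetical phrase lists, made up for illustration; a real guide still needs a human read.
HYPOTHETICAL = [r"\bwould you\b", r"\bwill you\b", r"\bdo you think you\b",
                r"\bimagine\b", r"\bhow likely\b"]
LEADING = [r"\bdon't you\b", r"\bwouldn't you\b", r"\bisn't it\b",
           r"\bhow much do you (love|like)\b"]


def lint_question(question: str) -> list[str]:
    """Return the warnings a question triggers; an empty list means no pattern matched."""
    text = question.lower()
    warnings = []
    if any(re.search(p, text) for p in HYPOTHETICAL):
        warnings.append("hypothetical")  # asks what they would do, not what they did
    if any(re.search(p, text) for p in LEADING):
        warnings.append("leading")  # plants the answer inside the question
    return warnings


guide = [
    "Would you pay for a faster export?",
    "Don't you find the current dashboard confusing?",
    "Tell me about the last time you exported a report.",
]

for q in guide:
    print(lint_question(q) or ["ok"], q)
```

A check like this flags the first two questions and passes the third, which asks about something that already happened. It says nothing about the harder part of the craft, which is noticing a thin answer and probing it without leading.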

## Where synthetic data is fine, and where it lies

This is not a hit piece. Synthetic data has a real seat at the table. The mistake is letting it sit in the wrong chair.

**Where it earns its place:**

-   **Rehearsal.** Pressure-test a discussion guide against a simulated respondent before you spend a real one. Find the leading questions and the dead ends on a dummy, not on a paying interview.
-   **Draft hypotheses.** Generate a first pass at likely objections, segments, or edge cases to investigate. Use it to aim the research, never to conclude it.
-   **Synthetic control data and augmentation** in quantitative pipelines, where the method is understood, the limits are measured, and a human owns the error bars.
-   **Onboarding and training.** Let a junior researcher practice probing against a simulation before they sit with a real person.

**Where it lies, every time:**

-   As **evidence for a decision.** If a synthetic answer is the reason you shipped, killed, or repriced something, you decided on fiction.
-   As a **substitute for discovery.** It cannot tell you anything new about your customers, because it knows only its training data and what you told it.
-   As **proof for a stakeholder or client.** A verbatim from a real person who lived the problem carries weight. A verbatim from a model carries a lawsuit-shaped risk and zero credibility the moment anyone asks who said it.

The honest line is simple. Use synthetic data to prepare for evidence. Never let it stand in for evidence. The moment a simulation starts answering questions only a real customer can answer, it has stopped helping and started lying with a confident face.

## Frequently asked questions

### Are synthetic users or AI personas reliable for research?

For some jobs, yes. For evidence, no. A synthetic user is reliable for rehearsing a guide, drafting likely objections, or training a researcher to probe. It is not reliable as a finding, because the model is predicting plausible text rather than reporting what a real person did. Treat it as a sparring partner that helps you prepare, never as a witness whose answer you'd act on.

### What AI tools run disciplined customer interviews?

Disciplined AI interviewing means an AI moderates a real conversation with a real respondent and holds the craft: ask about past behavior, never plant the answer inside the question, treat compliments as noise until a behavior backs them, and follow the flinch instead of the script. That is what Lùc is built to do: it runs the volume of disciplined interviews with real people that a project needs, working alongside the human researcher. Watch the distinction when you evaluate any tool: does it talk to real people, or does it generate them?

### Can AI replace customer interviews?

No. AI can run the interview and speed up synthesis, but it cannot replace the real person on the other side. A simulated respondent has no past behavior to report and no body to flinch. AI changes who asks the questions and how many you can run. It does not change the fact that you still need a real human to answer them. For agencies, that shift is mostly economic rather than methodological, and it reshapes [how a discovery phase gets scoped and priced](/articles/ai-customer-research-agency-economics).

* * *

Lùc runs disciplined customer interviews with real people, at the volume a project needs. [Join the closed beta on the home page.](/)