Validate·Usability Testing·Automation·Emerging·VAL-038
Simulated Usability Testing
Value hypothesis
Agents simulate user interactions with a prototype, generating plausible behavioural data and task completion patterns without recruiting participants or scheduling sessions.
Velocity · Innovation
Following a typical task elicitation protocol, agents interact with a prototype or live interface, simulating user task attempts across a defined scenario set. Options range from tools modelling individual user journeys to systems generating thousands of simulated sessions in parallel, producing completion rates, error patterns, and drop-off points. Researchers review the findings, decide which behavioural patterns are plausible proxy outcomes for real users, and determine what warrants follow-up with live participants.
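As a rough illustration of the kind of output described here, the sketch below aggregates simulated sessions into completion rates, mean error counts, and drop-off points per scenario. The `SimulatedSession` record and its fields are hypothetical and do not reflect the export format of any particular tool.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class SimulatedSession:
    # Hypothetical record shape; real platforms will export something different.
    scenario: str
    completed: bool
    errors: int
    dropped_at: Optional[str] = None  # step where the agent abandoned the task, if any


def summarise(sessions):
    """Aggregate completion rate, mean errors, and drop-off points per scenario."""
    by_scenario = {}
    for session in sessions:
        by_scenario.setdefault(session.scenario, []).append(session)

    summary = {}
    for name, group in by_scenario.items():
        # Count only sessions that recorded an abandonment step.
        drop_offs = Counter(s.dropped_at for s in group if s.dropped_at)
        summary[name] = {
            "sessions": len(group),
            "completion_rate": sum(s.completed for s in group) / len(group),
            "mean_errors": sum(s.errors for s in group) / len(group),
            "drop_offs": dict(drop_offs),
        }
    return summary
```

A summary like this is an input to researcher review, not a finding in itself: each pattern still has to be judged as a plausible or implausible proxy for real users.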
Risks in application
Empathy Gap
While agents can simulate task completion, they cannot replicate the emotional state, cognitive load, confusion, or lived context that shape how real users experience a design. They can offer no response regarding brand credibility or product fit within a broader context of use or market options. Simulated findings may be structurally plausible but humanly wrong.
Shallow Solutions
High-volume simulation output can create false confidence in results; large numbers of simulated sessions do not compensate for the gap between agent behaviour and human behaviour.
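The point can be made concrete with a small sketch. More sessions narrow the confidence interval around the agents' completion rate, but that interval describes agent behaviour only; any gap between agents and humans is untouched by sample size. The function below uses a standard normal-approximation interval, and the rates in the usage note are invented for illustration.

```python
import math


def completion_rate_interval(successes, n, z=1.96):
    """Normal-approximation interval (95% by default) for a completion rate."""
    p = successes / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width
```

Suppose, hypothetically, that agents complete a task 90% of the time while real users would complete it 70% of the time. Calling `completion_rate_interval(90, 100)` and `completion_rate_interval(9000, 10000)` gives intervals centred on the same 0.9, the second about ten times narrower than the first. Neither moves towards 0.7: volume buys precision about the agents, not accuracy about people.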
Expertise that differentiates
Research and Insight
Determining which simulated behaviours are credible approximations for real user responses and which reflect model assumptions about how users approach tasks, as opposed to how they actually do.
Behavioural Reasoning
Interpreting task failure patterns in terms of underlying user mental models, rather than treating agent errors as equivalent to human errors.
AI Fluency that assures
Platform Awareness
The predictive accuracy of AI usability simulation against real user behaviour remains unvalidated. Whether a study type and its decision stakes warrant simulation over live testing must be assessed before planning simulations, and that assessment requires knowledge of platform and methodology limits.
Process Description
Task scenario configuration matters: underspecified scenarios let agents navigate paths of least resistance and miss the usability failures the study was designed to find.
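One way to catch underspecification before a run is a simple completeness check on each scenario. The sketch below is illustrative only: the `TaskScenario` fields (goal, start state, success criteria, constraints) are assumptions about what a well-specified scenario might contain, not a schema taken from any simulation platform.

```python
from dataclasses import dataclass, field


@dataclass
class TaskScenario:
    # Hypothetical fields; adapt them to whatever the chosen platform accepts.
    goal: str
    start_state: str = ""
    success_criteria: list = field(default_factory=list)
    constraints: list = field(default_factory=list)


def underspecified_fields(scenario):
    """Return the names of scenario fields that were left empty."""
    checks = {
        "goal": scenario.goal.strip(),
        "start_state": scenario.start_state.strip(),
        "success_criteria": scenario.success_criteria,
        "constraints": scenario.constraints,
    }
    return [name for name, value in checks.items() if not value]
```

A scenario defined only as `TaskScenario(goal="Buy a ticket")` would be flagged for a missing start state, success criteria, and constraints, which is the kind of gap that lets an agent take the easiest route through the interface.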
Related
Possible Indicators
Session generation speed
Time from prototype to behavioural findings relative to recruited usability testing baseline
Issue detection overlap
Proportion of AI-identified issues confirmed in subsequent live testing
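Both indicators reduce to simple ratios. The sketch below is one possible way to compute them, assuming that elapsed time is tracked in hours and that issues from simulation and live testing have already been matched to shared identifiers; that matching is itself a researcher judgement and is not automated here. The function names are illustrative.

```python
def speed_ratio(simulated_hours, recruited_hours):
    """How many times faster simulation reached findings than the recruited baseline."""
    if simulated_hours <= 0:
        raise ValueError("simulated_hours must be positive")
    return recruited_hours / simulated_hours


def issue_detection_overlap(ai_issues, live_issues):
    """Proportion of AI-identified issues that were confirmed in live testing."""
    ai = set(ai_issues)
    if not ai:
        return None  # the proportion is undefined when the simulation found no issues
    return len(ai & set(live_issues)) / len(ai)
```

For example, if simulation flags four issues and two of them reappear in live sessions, the overlap is 0.5. Issues found only in live testing do not affect this figure, so it is worth tracking them separately.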
Sources
Kublanow (2024). New in Maze: Introducing Interview Studies for simplified moderated research. Maze.
Author unknown (n.d.). UXAgent. HAI Lab.
Holter et al. (2026). UXCascade: Scalable Usability Testing with Simulated User Agents. arXiv.
Madan (2025). Insurance UX and AI-Simulated Personas. Designing with AI 2025.