Product discovery is fundamentally a process of interpreting ambiguity. Stakeholder conversations—whether held in meetings, captured in emails, or reflected in support tickets—are rich in context but often diffuse, overlapping, and unstructured. Transforming this raw input into a coherent, prioritized product backlog remains one of the most cognitively intensive tasks in product management.
A recent white paper from the University of Washington and the Allen Institute, titled “Prompt Candidates, then Distill: A Teacher-Student Framework for LLM-driven Data Annotation”, introduces a compelling solution to this class of problem. The authors propose CanDist, a two-phase framework that separates candidate generation (by a large language model, or LLM) from decision distillation (by a smaller, targeted model, or SLM). Rather than forcing an LLM to commit prematurely to a single answer, CanDist encourages it to surface multiple plausible interpretations, which are then refined downstream.
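As a rough illustration of that division of labor, the two phases reduce to a pair of functions with an explicit hand-off. Everything in the sketch below is an assumption, not anything prescribed by the paper: the `call_llm` helper is a hypothetical stand-in for a provider API, the "story | rationale" output format is invented, and the length-based selection heuristic is a placeholder for a trained student model.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    story: str
    rationale: str

def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for an LLM provider call; returns raw model text."""
    raise NotImplementedError("wire up your LLM provider here")

def generate_candidates(raw_input: str, n: int = 5) -> list[Candidate]:
    """Teacher phase: ask the LLM for several plausible interpretations
    rather than a single committed answer."""
    prompt = (
        f"Propose {n} distinct candidate user stories for the stakeholder "
        f"input below. Format each as 'story | rationale' on its own line.\n\n"
        f"{raw_input}"
    )
    candidates = []
    for line in call_llm(prompt).splitlines():
        if "|" in line:
            story, rationale = line.split("|", 1)
            candidates.append(Candidate(story.strip(), rationale.strip()))
    return candidates

def distill(candidates: list[Candidate]) -> Candidate:
    """Student phase: a smaller model or scoring routine picks one winner.
    A real student would be trained or tuned; this placeholder heuristic
    only keeps the interface explicit."""
    return max(candidates, key=lambda c: len(c.rationale))
```

The point is the interface, not the internals: the teacher never commits to one answer, and the student never has to interpret raw stakeholder text on its own.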
This decomposition of tasks reflects how experienced product teams often operate—brainstorming broadly before converging on a concise definition of value. The CanDist framework offers a blueprint for systematizing this process at scale.
| Stage | Role of LLM (Teacher) | Role of SLM (Student) |
| --- | --- | --- |
| Candidate Generation | Generate multiple candidate interpretations of raw stakeholder input | Collect and organize candidate stories with metadata (e.g., intent, urgency, dependencies) |
| Candidate Evaluation | Provide multiple priority rationales (e.g., "critical for MVP", "regulatory need") | Synthesize and score priority across competing rationales |
| Distillation & Selection | Summarize, deduplicate, and select the best story candidates | Rank backlog by value, urgency, and feasibility |
Stakeholder input is frequently underspecified or expressed in divergent ways. The LLM can generate multiple candidate interpretations, ensuring that no intent is prematurely excluded. The SLM then distills these options into a single, coherent user story that aligns with system constraints, prior roadmap decisions, and organizational goals.
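A minimal sketch of that distillation step might look like the following, assuming constraints arrive as forbidden terms and goal alignment is approximated by keyword overlap, a deliberately crude stand-in for whatever scoring model a team actually uses:

```python
def distill_with_constraints(
    candidates: list[str],
    constraints: list[str],
    goal_keywords: set[str],
) -> str:
    """Illustrative student-side selection: drop candidates that violate
    known constraints, then prefer the one best aligned with stated goals."""
    def violates(story: str) -> bool:
        # Toy rule: a candidate is out if it mentions a forbidden term.
        return any(term.lower() in story.lower() for term in constraints)

    def goal_alignment(story: str) -> int:
        # Keyword overlap (lowercase) as a crude alignment score.
        return len(set(story.lower().split()) & goal_keywords)

    viable = [c for c in candidates if not violates(c)]
    if not viable:
        raise ValueError("every candidate conflicts with constraints; escalate to a human")
    return max(viable, key=goal_alignment)
```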
By preserving a range of candidate formulations, the system avoids over-indexing on any individual stakeholder’s language, assumptions, or priorities. This reduces bias and supports multi-stakeholder alignment.
LLMs can propose justifications across various dimensions (e.g., "critical for compliance," "low engineering complexity," "high NPS impact"). The SLM evaluates and reconciles these dimensions into a structured priority ranking, using composite scoring or decision trees informed by historical delivery data.
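For example, composite scoring can be as simple as a weighted sum over the rationale dimensions the LLM proposes. The dimensions and weights below are assumptions for the sketch; in practice they would be calibrated against historical delivery data:

```python
# Assumed dimensions and weights; real values would be calibrated
# against historical delivery data.
DIMENSION_WEIGHTS = {
    "compliance": 0.4,
    "engineering_complexity": -0.2,  # higher complexity lowers priority
    "nps_impact": 0.3,
    "mvp_critical": 0.3,
}

def composite_priority(scores: dict[str, float]) -> float:
    """Reconcile LLM-proposed rationale dimensions (each scored 0..1)
    into a single weighted priority value."""
    return sum(DIMENSION_WEIGHTS.get(dim, 0.0) * value
               for dim, value in scores.items())

# A story flagged as compliance-critical, low-complexity, moderate NPS impact:
print(composite_priority({"compliance": 1.0,
                          "engineering_complexity": 0.2,
                          "nps_impact": 0.5}))  # 0.4 - 0.04 + 0.15 = 0.51
```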
The LLM can suggest multiple thematic tags per story (e.g., usability, data integrity, internationalization), while the SLM normalizes tags into a predefined taxonomy. This ensures consistent backlog segmentation for team ownership, reporting, and roadmap planning.
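A sketch of that normalization step, with an assumed taxonomy and synonym map standing in for a team's real backlog conventions:

```python
# Assumed taxonomy and synonym map; a real one would come from the
# team's existing backlog conventions.
TAXONOMY = {"usability", "data-integrity", "internationalization",
            "logistics", "real-time"}
SYNONYMS = {
    "ux": "usability",
    "i18n": "internationalization",
    "data quality": "data-integrity",
    "realtime": "real-time",
}

def normalize_tags(raw_tags: list[str]) -> list[str]:
    """Map free-form LLM tags onto the predefined taxonomy,
    discarding anything that cannot be resolved."""
    normalized = []
    for tag in raw_tags:
        key = tag.strip().lower()
        key = SYNONYMS.get(key, key)
        if key in TAXONOMY and key not in normalized:
            normalized.append(key)
    return normalized

print(normalize_tags(["UX", "realtime", "blockchain"]))
# -> ['usability', 'real-time']  ('blockchain' is outside the taxonomy)
```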
As stakeholder priorities shift, the LLM can produce alternative versions of affected backlog items. The SLM compares these changes to the existing backlog and resolves conflicts, maintaining coherence while adapting to change.
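One way to sketch that reconciliation logic is a simple merge policy, assumed here rather than prescribed by CanDist: unambiguous updates are applied, and priority conflicts are escalated instead of silently overwritten.

```python
from dataclasses import dataclass

@dataclass
class BacklogItem:
    story_id: str
    text: str
    priority: str

def reconcile(existing: dict[str, BacklogItem],
              revisions: list[BacklogItem]) -> list[str]:
    """Merge LLM-proposed revisions into the current backlog.
    Returns the IDs flagged for human review."""
    flagged = []
    for rev in revisions:
        current = existing.get(rev.story_id)
        if current is None:
            existing[rev.story_id] = rev      # genuinely new item
        elif current.priority != rev.priority:
            flagged.append(rev.story_id)      # conflicting priority: escalate
        else:
            existing[rev.story_id] = rev      # wording update, same priority
    return flagged
```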
Consider a worked example. Input: a transcript from a product strategy meeting.
LLM Output (illustrative candidate user stories):
- "As a logistics manager, I want real-time shipment status updates so that I can track orders across the chain."
- "As an operations lead, I want to be alerted when a delivery is delayed so that I can act before the SLA is breached."
- "As a warehouse supervisor, I want delay notifications drawn from warehouse data so that downstream teams stay informed."
SLM Tasks:
- Deduplicate and merge the overlapping candidates into a single story.
- Normalize tags against the backlog taxonomy.
- Score priority across the competing rationales.
- Attach known dependencies from the delivery system of record.
Final Backlog Item:
- User Story: "As a stakeholder, I want real-time shipping and delay notifications across the logistics chain."
- Tags: logistics, real-time, SLA
- Priority: High
- Dependencies: Alerting service upgrade, warehouse API access
To operationalize this architecture, a team needs little more than a thin orchestration layer that chains the two phases, records their intermediate outputs, and routes unresolved conflicts to a human reviewer.
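Such a layer might look like the sketch below, which chains the illustrative helpers from the earlier snippets; the constraint list, goal keywords, and ID scheme are all placeholders:

```python
def backlog_pipeline(transcript: str) -> BacklogItem:
    """End-to-end sketch: the teacher proposes, the student distills,
    and the result is a structured, machine-readable backlog item."""
    candidates = generate_candidates(transcript)                    # teacher (LLM)
    story = distill_with_constraints(
        [c.story for c in candidates],
        constraints=["deprecated-api"],                             # assumed constraint list
        goal_keywords={"real-time", "notifications", "logistics"},  # assumed goals
    )                                                               # student (SLM)
    return BacklogItem(
        story_id="BL-001",                                          # assumed ID scheme
        text=story,
        priority="High",
    )
```

In practice, each step would also persist its intermediate candidates and scores, so the final backlog item remains auditable back to the original transcript.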
The CanDist framework formalizes a highly effective pattern for transforming unstructured input into structured, decision-ready artifacts. For product managers, it represents a path toward scalable, auditable, AI-assisted backlog generation—without sacrificing stakeholder nuance or delivery quality.
At Forte Group, this approach aligns directly with our Concerto framework for AI-augmented delivery, in which humans orchestrate and supervise multi-agent AI systems to deliver faster, more traceable outcomes. The CanDist model is not simply an annotation technique—it is a design pattern for enterprise-grade reasoning at scale.
In the age of LLMs, the backlog should no longer be the product of manual synthesis alone. It can—and should—be the result of structured orchestration between models and humans, working in concert.