Add AssemblyAI universal-3-5-pro with conversation-context carryover #155

Open
dlange-aai wants to merge 2 commits into ServiceNow:main from dlange-aai:assemblyai-universal-3-5-pro-context-carryover

Conversation

@dlange-aai

What & why

Upgrades pipecat-ai to >=1.4.0 and adds first-class support for AssemblyAI's universal-3-5-pro streaming STT model (the Universal-3 Pro family), plus pipecat 1.4.0's new conversation-context carryover. With carryover, the agent's most recent reply seeds the STT before the user's next turn, which improves transcription of short answers and spelled-out entities (codes/emails/IDs) and helps with disambiguation.

Changes

  • pyproject.toml / uv.lock: bump pipecat-ai from >=1.0.0 to >=1.4.0 (one new transitive dep, pyyaml-include; no other churn).
  • services.py:
    • New update_stt_agent_context(stt, text) helper — forwards the agent's reply to AssemblyAISTTService.update_agent_context() when the STT exposes it (AssemblyAI U3 Pro), no-op otherwise.
    • Plumb vad_force_turn_endpoint through EVA_MODEL__STT_PARAMS — it's a constructor arg (not a Settings field), so the existing dataclasses.fields(...) forwarding didn't carry it. Default True (Pipecat-mode).
    • Model selection and the new carryover Settings fields (agent_context, previous_context_n_turns) already flow through the existing dataclass forwarding — no extra code.
  • pipecat_server.py: call the helper from the cascade on_assistant_response hook so each agent reply seeds STT context.
  • .env.example: documented AssemblyAI universal-3-5-pro example with carryover + tuning fields.
  • tests: AssemblyAI tests use universal-3-5-pro; cover carryover-Settings forwarding, the update_stt_agent_context helper (forward / no-op-absent / empty / None), and vad_force_turn_endpoint default + override.
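
The Settings-versus-constructor split described for services.py can be sketched as follows. This is a minimal illustration, not the code in this PR: `FakeSettings`, `FakeSTTService`, and `build_stt` are hypothetical stand-ins for the real Settings dataclass, `AssemblyAISTTService`, and `create_stt_service()`, and the params dict stands in for whatever `EVA_MODEL__STT_PARAMS` parses to.

```python
import dataclasses
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class FakeSettings:
    """Hypothetical stand-in for the STT Settings dataclass."""
    model: str = "universal-3-5-pro"
    agent_context: Optional[str] = None
    previous_context_n_turns: Optional[int] = None


class FakeSTTService:
    """Hypothetical stand-in for a service with a constructor-only argument."""

    def __init__(self, *, settings: FakeSettings, vad_force_turn_endpoint: bool = True):
        self.settings = settings
        self.vad_force_turn_endpoint = vad_force_turn_endpoint


def build_stt(params: dict[str, Any]) -> FakeSTTService:
    # Dataclass introspection: forward only keys that are real Settings fields.
    field_names = {f.name for f in dataclasses.fields(FakeSettings)}
    settings = FakeSettings(**{k: v for k, v in params.items() if k in field_names})
    # vad_force_turn_endpoint is not a Settings field, so the filter above drops it.
    # It has to be threaded to the constructor explicitly (default True).
    force = params.get("vad_force_turn_endpoint", True)
    return FakeSTTService(settings=settings, vad_force_turn_endpoint=force)
```

Without the explicit `params.get(...)` line, an override of `vad_force_turn_endpoint` in the params would be filtered out silently and the service would keep its constructor default.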

Why explicit carryover (not pipecat's automatic path)

In a standard pipecat bot, carryover fires automatically: the assistant context aggregator emits LLMContextAssistantTurnFrame, the upstream STT picks it up via _process_assistant_turn(), and the AssemblyAI override calls update_agent_context(). EVA's cascade pipeline drives the agent turn through a custom BenchmarkAgentProcessor that pushes TTSSpeakFrame directly and does not emit the standard LLM response frames, so that aggregation is empty and the frame is never produced. We therefore trigger the update explicitly from the existing assistant-response hook. The call is idempotent with the auto-path (update_agent_context replaces rather than accumulates), and universal-3-5-pro is recognized by pipecat's U3_PRO_MODEL_PREFIXES, so it gets the full feature set.
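
A rough sketch of how such an explicit trigger might look, assuming the helper is a coroutine that duck-types on `update_agent_context`. The name and `(stt, text)` signature come from this PR; the boolean return value and the `FakeCarryoverSTT` class are illustrative assumptions, and the real `AssemblyAISTTService` is not used here.

```python
import asyncio
from typing import Any, Optional


async def update_stt_agent_context(stt: Any, text: Optional[str]) -> bool:
    """Forward the agent's reply to the STT if it supports carryover.

    Returns True when the update was forwarded, False on a no-op
    (the return value is an assumption made for this sketch).
    """
    if not text:  # None or empty string: nothing to seed
        return False
    updater = getattr(stt, "update_agent_context", None)
    if updater is None:  # STT service without carryover support
        return False
    await updater(text)
    return True


class FakeCarryoverSTT:
    """Hypothetical stand-in: replaces (does not accumulate) the context."""

    def __init__(self) -> None:
        self.agent_context: Optional[str] = None

    async def update_agent_context(self, text: str) -> None:
        self.agent_context = text


async def demo() -> tuple:
    stt = FakeCarryoverSTT()
    first = await update_stt_agent_context(stt, "Your code is A7K9.")
    # Repeating the call with the same text leaves the state unchanged.
    await update_stt_agent_context(stt, "Your code is A7K9.")
    absent = await update_stt_agent_context(object(), "hello")
    empty = await update_stt_agent_context(stt, "")
    return first, absent, empty, stt.agent_context


result = asyncio.run(demo())
```

Because the stand-in replaces its context on every call, triggering the update from both the hook and an automatic path would leave the same final state, which is the idempotence property the paragraph above relies on.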

Verification

  • Full unit suite passes on pipecat 1.4.0 (1765 passed, 52 skipped, 3 xfailed).
  • End-to-end cascade runs with universal-3-5-pro + Cartesia sonic-3: conversations complete, and carryover is confirmed firing on the live run (AssemblyAI's _clip_agent_context logs the agent reply being sent each turn).

Notes

  • Scope is the cascade pipeline. The audio-LLM path transcribes via a separate branch and is intentionally not wired here.
  • The carryover update is awaited before the agent's TTS frame is pushed (by design — the context must be set before the user's next turn).
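
The ordering in the last note can be illustrated with a small sketch. `RecordingSTT` and the shared log are hypothetical, and appending `"tts_speak"` stands in for pushing a `TTSSpeakFrame`; the real hook in pipecat_server.py may be structured differently.

```python
import asyncio


class RecordingSTT:
    """Hypothetical STT stand-in that records when its context is set."""

    def __init__(self, log: list) -> None:
        self.log = log

    async def update_agent_context(self, text: str) -> None:
        await asyncio.sleep(0)  # yield once, as a real async call might
        self.log.append(("stt_context", text))


async def on_assistant_response(stt: RecordingSTT, log: list, text: str) -> None:
    # Await the carryover update first so the context is in place...
    await stt.update_agent_context(text)
    # ...and only then hand the reply to TTS (stand-in for pushing a TTSSpeakFrame).
    log.append(("tts_speak", text))


def run_turn(text: str) -> list:
    log: list = []
    asyncio.run(on_assistant_response(RecordingSTT(log), log, text))
    return log
```

Awaiting the update (rather than scheduling it as a background task) is what guarantees the STT context entry is recorded before the speech entry, at the cost of adding the update's latency to the turn.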

dlange-aai and others added 2 commits June 17, 2026 18:35
Upgrade pipecat-ai to >=1.4.0, which adds the universal-3-5-pro streaming
model (Universal-3 Pro family) and AssemblyAI conversation-context carryover
(`agent_context` Settings seed + `AssemblyAISTTService.update_agent_context()`).

In a standard pipecat bot, carryover is automatic: the assistant context
aggregator emits `LLMContextAssistantTurnFrame`, the upstream STT picks it up
via `_process_assistant_turn()`, and the AssemblyAI override forwards it to
`update_agent_context()`. EVA's cascade pipeline drives the agent turn through
a custom `BenchmarkAgentProcessor` that pushes `TTSSpeakFrame` directly and
never emits the standard LLM response frames, so the aggregation is empty and
that frame is not produced. We therefore trigger carryover explicitly.

- services.py: add `update_stt_agent_context()` helper — forwards the agent's
  reply to STT when it exposes `update_agent_context` (AssemblyAI U3 Pro),
  no-op otherwise. The existing Settings-forwarding already passes `model`,
  `agent_context`, and `previous_context_n_turns` (opt-out) through from config.
- pipecat_server.py: call the helper from the cascade `on_assistant_response`
  hook so each agent reply seeds STT before the user's next turn. Calling it
  alongside the auto-path is idempotent (update_agent_context replaces).
- .env.example: document the AssemblyAI universal-3-5-pro config + carryover.
- tests: AssemblyAI tests use universal-3-5-pro, assert carryover Settings
  forwarding, and cover the helper (forward / no-op / empty / None).

Verified: full unit suite passes on pipecat 1.4.0 (1765 passed); universal-3-5-pro
is recognized as a U3 Pro model (U3_PRO_MODEL_PREFIXES) so carryover applies.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

vad_force_turn_endpoint is an AssemblyAISTTService constructor arg, not a
Settings field, so the dataclass-field forwarding in create_stt_service() does
not carry it and it was stuck at the pipecat default. Thread it explicitly from
EVA_MODEL__STT_PARAMS (default True = Pipecat-mode: force the endpoint on Silero
VAD stop; False lets AssemblyAI's server-side min/max_turn_silence decide).

The Settings-level tuning fields (vad_threshold, min_turn_silence,
max_turn_silence) already forward via the existing dataclass introspection.

- .env.example: document the tuned AssemblyAI example
  (vad_threshold=0.1, min_turn_silence=100, max_turn_silence=100,
  vad_force_turn_endpoint=true).
- tests: assert vad_force_turn_endpoint defaults True and is overridable, and
  that vad_threshold/min_turn_silence/max_turn_silence forward.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>