MiniMax-M2.7 / M3: tool output not propagated between chained tool calls (design_workflow → validate_design) #19

@sxtay1914

Capability area: Agent harness / Tool use

What does M2.7 fail to do for you?
When chaining from design_workflow to validate_design, MiniMax-M2.7 does not pass a schema argument that conforms to the validate_design tool definition.

What would "good" look like in M3?
validate_design receives a valid JSON schema that conforms to the tool's parameter definition — populated from the actual design_workflow result, not reconstructed from model reasoning.


Summary

MiniMax-M2.7 and MiniMax-M3 both fail to propagate tool output across a sequential tool call chain. When design_workflow returns a schema object, both models call the next tool (validate_design) with a schema that does not match the returned output: it is reconstructed, incomplete, or structurally invalid against the tool definition.

MiniMax-M2.5 does not reproduce this behavior.


Environment

| Field | Value |
| --- | --- |
| Models affected | MiniMax-M2.7, MiniMax-M3 |
| Model passing | MiniMax-M2.5 |
| Endpoint | OpenRouter |
| Tool calling | Sequential 2-step chain |
| Observed | 2026-06-17 (UTC+8) |

Expected behavior

  1. design_workflow is called and returns a schema object
  2. The model reads the tool result
  3. validate_design is called with schema set to the exact object returned in step 1, conforming to the tool's JSON schema definition
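To make step 3 checkable, a harness could compare the schema argument of the validate_design call against the recorded design_workflow result. The following is a minimal sketch, not code from the benchmark repo: the function names and the sample payloads are hypothetical, and the comparison assumes both values are plain JSON-serialisable objects.

```python
import json


def canonical(value):
    """Serialise to JSON with sorted keys so key order does not affect the comparison."""
    return json.dumps(value, sort_keys=True, separators=(",", ":"))


def schema_forwarded_verbatim(tool_result, next_call_args):
    """Return True only if next_call_args["schema"] equals the earlier tool result."""
    if "schema" not in next_call_args:
        return False
    return canonical(tool_result) == canonical(next_call_args["schema"])


# Hypothetical sample payloads, shaped like the output described in this issue.
design_result = {
    "name": "Project tracker",
    "information": [{"name": "client", "type": "text", "ai_hint": "customer name"}],
    "states": ["draft", "active"],
}
good_call = {"schema": json.loads(json.dumps(design_result))}  # forwarded as-is
bad_call = {"schema": {"name": "Project tracker", "information": [], "states": ["draft", "active"]}}

print(schema_forwarded_verbatim(design_result, good_call))  # True
print(schema_forwarded_verbatim(design_result, bad_call))   # False
```

Comparing canonical JSON rather than raw strings avoids false failures when a model reorders keys but otherwise forwards the object unchanged.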

Actual behavior

The model calls validate_design with a schema argument that does not match the design_workflow output. The model appears to reconstruct the schema from internal reasoning rather than reading the tool result, producing a value that violates the tool's parameter schema — missing required fields or structurally incorrect.

Benchmark eval output:

issues: ["Validated design is missing fields: project name, client, start date, deadline, owner, budget"]

Tool call sequence observed

1. design_workflow(...)
   → returns: { name, information: [{name, type, ai_hint}, ...], states: [...], ... }

2. validate_design({ schema: <does not match step 1 output, fails tool schema validation> })
   → missing required information fields
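The gap between the two calls can be summarised by listing which information entries from step 1 are absent from the schema passed in step 2. The sketch below is an illustration only: it is not the benchmark's actual evaluation logic, the helper name is hypothetical, and the `type` and `ai_hint` values are placeholders. The field names are taken from the eval output quoted above.

```python
def missing_information_fields(returned, forwarded):
    """List names in returned["information"] that are absent from forwarded["information"]."""
    if not isinstance(forwarded, dict):
        forwarded = {}
    items = forwarded.get("information")
    if not isinstance(items, list):
        items = []
    present = {
        item["name"]
        for item in items
        if isinstance(item, dict) and isinstance(item.get("name"), str)
    }
    return [f["name"] for f in returned.get("information", []) if f["name"] not in present]


# Field names from the eval output in this issue; type and ai_hint are placeholders.
names = ["project name", "client", "start date", "deadline", "owner", "budget"]
returned = {
    "name": "example",
    "information": [{"name": n, "type": "text", "ai_hint": ""} for n in names],
    "states": [],
}
forwarded = {"name": "example", "information": [], "states": []}  # reconstructed, fields dropped

print(missing_information_fields(returned, forwarded))
```

With these sample inputs every field from step 1 is reported as missing, which is the same symptom the eval output shows.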

Steps to reproduce

We have an open-source MCP tool benchmark that reproduces this consistently:

Repo: https://github.com/Inistate/harness-test-bench

  1. Clone the repo and follow the setup instructions
  2. Run the smoke_all_tools_2 scenario against minimax/MiniMax-M2.7 or minimax/MiniMax-M3
  3. Inspect Task 2 (task_2_module_design) — validate_design will be called with a schema that does not conform to the tool's parameter definition
  4. Compare against a passing model (e.g. minimax/MiniMax-M2.5) running the same scenario

Questions

  1. Is this a known regression from M2.5?
  2. Does the model read tool results back into context between sequential calls, or is this not guaranteed?
  3. Is there a recommended pattern to ensure tool output is forwarded as-is rather than reconstructed?

Workaround

We currently exclude M2.7 and M3 from workflows requiring sequential tool output propagation. A fix at the model level would be preferable.
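Short of a model-level fix, one possible harness-side mitigation is to let the model pass a reference to an earlier tool result instead of the full object, and have the harness substitute the stored result before dispatching the next call. This is a sketch under stated assumptions: the `{"$ref": "<call id>"}` convention and the `ResultStore` class are hypothetical, are not part of MCP, OpenRouter, or any MiniMax API, and would require the tool description to tell the model that references are accepted.

```python
import copy


class ResultStore:
    """Keeps tool results by call id and expands {"$ref": call_id} placeholders."""

    def __init__(self):
        self._results = {}

    def put(self, call_id, result):
        # Store a copy so later mutation by the caller cannot alter the record.
        self._results[call_id] = copy.deepcopy(result)

    def resolve(self, args):
        """Recursively replace any {"$ref": known_id} value with the stored result."""
        if isinstance(args, dict):
            ref = args.get("$ref")
            if set(args) == {"$ref"} and isinstance(ref, str) and ref in self._results:
                return copy.deepcopy(self._results[ref])
            return {key: self.resolve(value) for key, value in args.items()}
        if isinstance(args, list):
            return [self.resolve(value) for value in args]
        return args


# Hypothetical usage: the harness records step 1, then expands the reference in step 2.
store = ResultStore()
store.put("call_1", {"name": "example", "information": [{"name": "client"}], "states": []})

model_args = {"schema": {"$ref": "call_1"}}  # what the model would emit under this convention
resolved_args = store.resolve(model_args)
print(resolved_args["schema"]["information"])  # [{'name': 'client'}]
```

The trade-off is that the harness, not the model, becomes responsible for forwarding, so this hides the model behavior reported here rather than fixing it.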


References

Reproduced across multiple runs in an independent MCP tool benchmark. M2.5 passed consistently. M2.7 scored 3/5 and M3 scored 2/5, with the Task 2 (task_2_module_design) chain failing on every run for both models.
