BofAI · Will-Guan · Jul 2, 2026 · Jul 1, 2026 · Jul 2, 2026 · Jul 2, 2026
diff --git a/docs/llmservice/models/claude-fable-5.md b/docs/llmservice/models/claude-fable-5.md
@@ -0,0 +1,53 @@
+# Claude Fable 5
+
+## Overview
+
+Claude Fable 5 is a high-capability Anthropic model available on B.AI for advanced reasoning, coding, long-context analysis, and agentic workflows. It is designed for complex tasks that require sustained context, tool-assisted execution, and high-quality structured outputs. Specific capabilities, context limits, tool support, and availability may vary by B.AI model catalog and platform configuration.
+
+## Key Features
+
+* **Advanced Reasoning**: Suitable for complex analytical, technical, and professional knowledge tasks.
+* **Software Engineering Workflows**: Designed for coding assistance, debugging, refactoring, code review, and multi-step implementation planning.
+* **Long-Context Tasks**: Supports extended analysis across large codebases, long documents, and multi-turn work sessions when enabled by the platform configuration.
+* **Agentic and Tool-Assisted Workflows**: Suitable for workflows that rely on tool use, function calling, code execution, MCP, or compatible agent environments.
+* **Multimodal Understanding**: Supports text and image input for document, screenshot, chart, and diagram understanding where available.
+
+## Best Use Cases
+
+* **Complex Software Engineering**: Large feature work, repository-scale refactors, migration planning, bug investigation, and code review.
+* **Extended Agentic Workflows**: Multi-step tasks that require planning, tool use, verification, and sustained context over longer sessions.
+* **Research and Knowledge Work**: Analysis and synthesis across technical documents, legal or financial materials, and structured research sources.
+* **Visual Document Analysis**: Understanding screenshots, diagrams, charts, PDFs, and other image-based materials when supported by the workflow.
+
+## Capabilities and Limitations
+
+| Capability         | Description                                                                                         |
+| :----------------- | :-------------------------------------------------------------------------------------------------- |
+| **Reasoning**      | Advanced reasoning for complex professional and technical tasks                                      |
+| **Coding**         | Strong coding, debugging, refactoring, and code review capabilities                                  |
+| **Agentic**        | Suitable for long-running tool workflows and multi-step agent tasks                                  |
+| **Computer Use**   | Can support browser and desktop interaction through compatible tools and environments                |
+| **Multimodal**     | Text and image input; text output                                                                   |
+| **Context Window** | Up to 1,000,000 tokens, subject to platform configuration                                            |
+| **Max Output**     | Up to 128,000 tokens, subject to platform configuration                                              |
+| **Tool Use**       | Function calling, code execution, MCP support, adaptive thinking, and compatible agent workflows     |
+| **Multilingual**   | Strong multilingual performance across major world languages                                         |
+
+### Known Limitations
+
+* Specific capability availability may depend on the B.AI integration, Anthropic platform support, plan settings, and rollout status.
+* Web access, code execution, computer use, and external actions require compatible tools or integrations.
+* Image input is supported, but native audio or video input is not listed for this model.
+* Public evaluations, third-party comparisons, policy behavior, and implementation details may change over time, so they are not treated as fixed guarantees in this documentation.
+
+## Credits Usage
+
+| Model | Input (Credits/Token) | Cache Write (Credits/Token) | Cache Read (Credits/Token) | Output (Credits/Token) | Web Search (Credits/Use) | Billing Notes |
+| :--- | --------------------: | --------------------------: | -------------------------: | ---------------------: | -----------------------: | :--- |
+| **Claude Fable 5** | `10.00` | `12.50` | `1.00` | `50.00` | `10,000` | - |
+
+:::info Pricing note
+Prices shown in the documentation are B.AI standard reference prices for base billing purposes. B.AI may provide lower actual usage costs through top-up bonuses and account benefits. Specific prices, bonus Credits, and account benefits are subject to the platform display and final billing records.
+:::
+
+* **Prompt caching**: Cache writes are charged at 1.25x base input price for the 5-minute TTL option, or 2x base input price for the 1-hour TTL option. Cache reads are charged at 0.1x base input price. Prompt caching requires a minimum of 1,024 tokens.
diff --git a/docs/llmservice/models/claude-sonnet-5.md b/docs/llmservice/models/claude-sonnet-5.md
@@ -0,0 +1,50 @@
+# Claude Sonnet 5
+
+## Overview
+
+Claude Sonnet 5, released by Anthropic on June 30, 2026, is the next generation of the Sonnet family and a drop-in upgrade for Claude Sonnet 4.6. It is designed for stronger agentic behavior, coding, tool use, computer-use workflows, and knowledge work at Sonnet-tier pricing.
+
+## Key Features
+
+* **Agentic Task Execution**: Designed to plan, use tools such as browsers and terminals, and complete multi-step work more reliably than Claude Sonnet 4.6.
+* **Adaptive Thinking by Default**: Requests run with adaptive thinking unless `thinking: {type: "disabled"}` is passed; the `effort` parameter controls the capability, latency, and token-spend tradeoff.
+* **1M Context Window**: Supports a 1M-token context window by default, with no smaller context variant and no long-context surcharge.
+* **Broad Platform Availability**: Available through the Claude API, Claude Code, Claude Platform on AWS, Amazon Bedrock, Google Cloud, Microsoft Foundry preview, and Claude consumer plans.
+
+## Best Use Cases
+
+* **Production Agent Workflows**: Multi-step automation, agentic search, tool-heavy reasoning, browser/terminal workflows, and long-running delegated tasks.
+* **Software Engineering**: Coding, debugging, refactoring, code review, test-fix loops, and brownfield repository work where follow-through matters.
+* **High-Volume Knowledge Work**: Research, analysis, structured extraction, business operations, legal research, customer workflows, and internal productivity tools that need a balance of capability and cost.
+
+## Capabilities and Limitations
+
+| Capability           | Description                                                                                                                                                                                                               |
+| :------------------- | :------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
+| **Reasoning**        | Adaptive thinking is on by default; `effort` supports `low`, `medium`, `high`, `xhigh`, and `max`, with `high` as the default. Anthropic reports stronger agentic, coding, and knowledge-work performance than Sonnet 4.6. |
+| **Creative Writing** | Supports general text generation and long-form writing; Anthropic recommends re-evaluating style prompts because prose style may shift versus earlier Sonnet models.                                                      |
+| **Multimodal**       | Text and image input, text output, multilingual capabilities, and vision. Native audio/video input or generation is not listed.                                                                                           |
+| **Response Speed**   | Listed with "Fast" comparative latency in Anthropic's model table. Lower effort can reduce latency and token usage; higher effort increases thinking/tool-use depth.                                                      |
+| **Context Window**   | 1M tokens.                                                                                                                                                                                                                |
+| **Max Output**       | 128K tokens.                                                                                                                                                                                                              |
+| **Tool Use**         | Supports the same tool and platform feature set as Claude Sonnet 4.6 except Priority Tier is not available; tool use is more readily triggered at higher effort levels.                                                   |
+| **Multilingual**     | Official docs state multilingual support across current Claude models, but do not publish a separate Sonnet 5 language benchmark.                                                                                         |
+
+### Known Limitations
+
+* Manual extended thinking (`thinking: {type: "enabled", budget_tokens: N}`) is removed and returns a 400 error; use adaptive thinking with `effort` instead.
+* Non-default sampling parameters (`temperature`, `top_p`, `top_k`) return a 400 error; use system instructions for tone and style control.
+* Assistant message prefilling remains unsupported and returns a 400 error.
+* The new tokenizer produces approximately 30% more tokens for the same text than Claude Sonnet 4.6, so token budgets, context usage, and equivalent-request costs should be remeasured.
+* Cybersecurity safeguards are enabled by default; high-risk or prohibited cybersecurity requests may be refused with `stop_reason: "refusal"`.
+
+## Credits Usage
+
+| Model | Pricing Period | Input (Credits/Token) | 5m Cache Write (Credits/Token) | 1h Cache Write (Credits/Token) | Cache Read (Credits/Token) | Output (Credits/Token) | Web Search (Credits/Use) |
+| :--- | :--- | --------------------: | -----------------------------: | -----------------------------: | -------------------------: | ---------------------: | -----------------------: |
+| **Claude Sonnet 5** | Through Aug 31, 2026 | `2.00` | `2.50` | `4.00` | `0.20` | `10.00` | `10,000` |
+| **Claude Sonnet 5** | From Sep 1, 2026 | `3.00` | `3.75` | `6.00` | `0.30` | `15.00` | `10,000` |
+
+:::info Pricing note
+The main pricing table shows the currently effective standard reference price. For Claude Sonnet 5, the current standard reference price applies through August 31, 2026. Prices shown in the documentation are B.AI standard reference prices for base billing purposes. B.AI may provide lower actual usage costs through top-up bonuses and account benefits. Specific prices, bonus Credits, and account benefits are subject to the platform display and final billing records.
+:::
diff --git a/docs/llmservice/pricing-and-usage.md b/docs/llmservice/pricing-and-usage.md
@@ -33,17 +33,23 @@ The platform uses a unified Credits system to measure and settle usage across al
 | GPT-5 Mini        |                  0.25 |                        0.25 |                      0.025 |                   2.00 |                   10,000 |
 | GPT-5.4 Nano      |                  0.20 |                        0.20 |                       0.02 |                   1.25 |                   10,000 |
 | GPT-5 Nano        |                  0.05 |                        0.05 |                      0.005 |                   0.40 |                        - |
+| Claude Fable 5    |                 10.00 |                       12.50 |                       1.00 |                  50.00 |                   10,000 |
 | Claude Opus 4.8   |                  5.00 |                        6.25 |                       0.50 |                  25.00 |                   10,000 |
 | Claude Opus 4.7   |                  5.00 |                        6.25 |                       0.50 |                  25.00 |                   10,000 |
 | Claude Opus 4.6   |                  5.00 |                        6.25 |                       0.50 |                  25.00 |                   10,000 |
 | Claude Opus 4.5   |                  5.00 |                        6.25 |                       0.50 |                  25.00 |                   10,000 |
+| Claude Sonnet 5   |                  2.00 |                        2.50 |                       0.20 |                  10.00 |                   10,000 |
 | Claude Sonnet 4.6 |                  3.00 |                        3.75 |                       0.30 |                  15.00 |                   10,000 |
 | Claude Sonnet 4.5 |                  3.00 |                        3.75 |                       0.30 |                  15.00 |                   10,000 |
 | Claude Haiku 4.5  |                  1.00 |                        1.25 |                       0.10 |                   5.00 |                   10,000 |
 | Gemini 3.1 Pro    |                  2.00 |                        2.00 |                       0.20 |                  12.00 |                   14,000 |
 | Gemini 3.5 Flash  |                  1.50 |                        1.50 |                       0.15 |                   9.00 |                   14,000 |
 | Gemini 3 Flash    |                  0.50 |                        0.50 |                       0.05 |                   3.00 |                   14,000 |
 
+:::caution Main table scope
+The main pricing table shows the currently effective standard reference price for each model. The `Cache Write` column represents the billing rate when cache writing occurs; it does not imply a unified cache TTL across all models. Cache behavior, retention time, and extended caching options may vary by model provider. If a model has special caching rules, 1-hour cache write pricing, or time-based pricing, please refer to the corresponding model detail page.
+:::
+
 :::info Pricing note
 Prices shown in the documentation are B.AI standard reference prices for base billing purposes. B.AI may provide lower actual usage costs through top-up bonuses and account benefits. Specific prices, bonus Credits, and account benefits are subject to the platform display and final billing records.
 :::

diff --git a/...Hans/docusaurus-plugin-content-docs/current/llmservice/models/claude-fable-5.md b/...Hans/docusaurus-plugin-content-docs/current/llmservice/models/claude-fable-5.md
@@ -0,0 +1,53 @@
+# Claude Fable 5
+
+## 概述
+
+Claude Fable 5 是 B.AI 上可用的 Anthropic 高能力模型，面向复杂推理、代码任务、长上下文分析和 Agent 工作流。它适合需要持续上下文、工具辅助执行和高质量结构化输出的复杂任务。具体能力、上下文长度、工具支持和可用状态可能会随 B.AI 模型目录和平台配置调整。
+
+## 核心特性
+
+* **高级推理能力**：适合复杂分析、技术任务和专业知识工作。
+* **软件工程工作流**：面向代码辅助、调试、重构、代码审查和多步骤实现规划。
+* **长上下文任务**：在平台配置支持时，可用于大型代码库、长文档和多轮工作会话的持续分析。
+* **Agent 与工具辅助工作流**：适合依赖工具调用、函数调用、代码执行、MCP 或兼容 Agent 环境的工作流。
+* **多模态理解**：在可用场景下，支持文本和图像输入，可用于文档、截图、图表和技术示意图理解。
+
+## 适用场景
+
+* **复杂软件工程**：大型功能开发、仓库级重构、迁移规划、Bug 排查和代码审查。
+* **长时间 Agent 工作流**：需要规划、工具调用、验证和持续上下文保持的多步骤任务。
+* **研究与知识工作**：技术文档、法律或金融材料以及结构化研究资料的分析与综合。
+* **视觉文档分析**：在工作流支持时，可处理截图、图表、PDF 和其他图像型材料。
+
+## 能力与限制
+
+| 能力维度 | 说明 |
+| :--- | :--- |
+| **推理能力** | 适合复杂专业任务和技术任务的高级推理 |
+| **编程能力** | 具备较强的编码、调试、重构和代码审查能力 |
+| **Agent 能力** | 适合长时间工具调用工作流和多步骤 Agent 任务 |
+| **计算机操作** | 可通过兼容工具和环境支持浏览器及桌面交互 |
+| **多模态能力** | 支持文本和图像输入；输出为文本 |
+| **上下文窗口** | 最高 1,000,000 tokens，具体以平台配置为准 |
+| **最大输出** | 最高 128,000 tokens，具体以平台配置为准 |
+| **工具调用** | 支持函数调用、代码执行、MCP、自适应思考和兼容 Agent 工作流 |
+| **多语言能力** | 在主要世界语言上具备较强的多语言表现 |
+
+### 已知限制
+
+* 具体能力可用性可能取决于 B.AI 集成、Anthropic 平台支持、套餐配置和功能上线状态。
+* 联网访问、代码执行、计算机操作和外部动作需要兼容工具或集成支持。
+* 支持图像输入，但该模型未标明原生音频或视频输入能力。
+* 公开评测、第三方对比、策略行为和实现细节可能随时间变化，因此本文档不将其作为固定承诺。
+
+## 积分消耗
+
+| 模型名称 | 输入 (Credits/Token) | Cache Write (Credits/Token) | Cache Read (Credits/Token) | 输出 (Credits/Token) | 网页搜索（Credits/次） | 计费说明 |
+| :--- | --------------------: | --------------------------: | -------------------------: | -------------------: | ---------------------: | :--- |
+| **Claude Fable 5** | `10.00` | `12.50` | `1.00` | `50.00` | `10,000` | - |
+
+:::info 价格说明
+文档价格为 B.AI 平台模型标准参考价，仅供基础计费说明使用。B.AI 可能会通过充值赠送及账户权益等方式，为用户提供更低的实际使用成本。具体价格、赠送积分及账户权益请以平台页面展示及最终账单为准。
+:::
+
+* **Prompt caching**：缓存写入按基础输入价格的 1.25x 计费（5 分钟 TTL），或按基础输入价格的 2x 计费（1 小时 TTL）。缓存读取按基础输入价格的 0.1x 计费。使用 Prompt caching 时，最低需要 1,024 tokens。