[Content Understanding] Update to_llm_input page markers and filter telemetry warnings#47326

Merged: yungshinlintw merged 10 commits into main from cu-sdk/llm-input-helper-update on Jun 10, 2026

@chienyuanchang

Member

Description

Updates the azure-ai-contentunderstanding to_llm_input() helper to align its rendered output with the upcoming service page-marker format and to remove non-user-facing telemetry from RAI warning output.

Changes made:

  • Updated SDK-injected document page markers from <!-- page N --> to <!-- InputPageNumber: N -->.
  • Added duplicate-marker defense: if service markdown already contains <!-- InputPageNumber:, to_llm_input() does not inject additional page markers.
  • Filtered service-emitted internal telemetry warnings whose message starts with LLMStats: from the rendered rai_warnings front matter.
  • Preserved LLMStats: text when it appears in the document markdown body; only structured warnings are filtered.
  • Updated unit tests and sample tests for the new marker format and warning-filter behavior.
  • Updated CHANGELOG.md.
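
The marker change and the duplicate-marker defense described above can be sketched roughly as follows. This is a minimal illustration, not the code in `_helpers.py`: the function name `inject_page_markers`, its parameters, and the assumption that pages arrive as a list of markdown strings are all hypothetical.

```python
from typing import Sequence

# Prefix used to detect markers that the service may already have emitted.
MARKER_PREFIX = "<!-- InputPageNumber:"


def inject_page_markers(pages: Sequence[str], first_page: int = 1) -> str:
    """Join per-page markdown, adding a page marker before each page.

    Hypothetical helper for illustration; the real to_llm_input() logic
    may be structured differently.
    """
    # Duplicate-marker defense: if any page already carries a marker,
    # return the markdown untouched instead of injecting a second set.
    if any(MARKER_PREFIX in page for page in pages):
        return "\n\n".join(pages)

    marked = [
        f"<!-- InputPageNumber: {number} -->\n{page}"
        for number, page in enumerate(pages, start=first_page)
    ]
    return "\n\n".join(marked)


if __name__ == "__main__":
    print(inject_page_markers(["First page text.", "Second page text."]))
```

Checking for the marker prefix rather than a full marker keeps the defense independent of the page number the service chose to emit.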

Relevant issues / context:

This PR is not based on regenerated SDK code from a new API spec.

All SDK Contribution checklist:

  • The pull request does not introduce [breaking changes]
    • No public API signatures are changed. This only changes rendered text produced by the preview to_llm_input() helper.
  • CHANGELOG is updated for new features, bug fixes or other significant changes.
  • I have read the contribution guidelines.

General Guidelines and Best Practices

  • Title of the pull request is clear and informative.
  • There are a small number of commits, each of which has an informative message.

Testing Guidelines

  • Pull request includes test coverage for the included changes.

Testing performed:

cd sdk/contentunderstanding/azure-ai-contentunderstanding

.venv/bin/python -m pytest tests/test_to_llm_input.py -q
# 84 passed

AZURE_TEST_RUN_LIVE=true .venv/bin/python -m pytest tests/samples/test_sample_to_llm_input.py::TestSampleToLlmInput::test_to_llm_input_multi_page_content_range -q -s
# 1 passed

@chienyuanchang chienyuanchang marked this pull request as ready for review June 3, 2026 20:42
Copilot AI review requested due to automatic review settings June 3, 2026 20:42

Copilot AI left a comment

Contributor

Pull request overview

This PR updates the azure-ai-contentunderstanding to_llm_input() helper output format to align with an upcoming service page-marker convention and to suppress non-user-facing telemetry warnings from the rendered rai_warnings YAML front matter.

Changes:

  • Switched SDK-injected page markers from <!-- page N --> to <!-- InputPageNumber: N -->, and avoided injecting markers when the service markdown already includes InputPageNumber markers.
  • Filtered service warning messages that begin with LLMStats: (after leading whitespace) from the rendered rai_warnings block.
  • Updated unit tests and sample tests to validate the new marker format and warning filtering, and bumped package version/changelog.
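
The warning filter summarized above, including the tolerance for leading whitespace, might look like the sketch below. The shape of a warning (a mapping with a `message` key) and the name `filter_rai_warnings` are assumptions made for illustration; the SDK's actual warning model is not shown in this PR conversation.

```python
from typing import Iterable, List, Mapping

# Prefix of the service-emitted internal telemetry warnings.
TELEMETRY_PREFIX = "LLMStats:"


def filter_rai_warnings(
    warnings: Iterable[Mapping[str, str]],
) -> List[Mapping[str, str]]:
    """Drop warnings whose message starts with 'LLMStats:'.

    Hypothetical helper: assumes each warning is a mapping with a
    'message' key. Leading whitespace is ignored before the prefix check.
    """
    return [
        warning
        for warning in warnings
        if not str(warning.get("message", "")).lstrip().startswith(TELEMETRY_PREFIX)
    ]


if __name__ == "__main__":
    sample = [
        {"code": "ContentFiltered", "message": "Some content was filtered."},
        {"code": "Internal", "message": "  LLMStats: tokens=123"},
    ]
    # Only the first, user-facing warning survives the filter.
    print(filter_rai_warnings(sample))
```

Because the filter only inspects structured warning entries, an `LLMStats:` string that appears in the document markdown body is left as-is, matching the behavior described in the PR.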

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| sdk/contentunderstanding/azure-ai-contentunderstanding/tests/test_to_llm_input.py | Updates assertions for the new InputPageNumber marker format and adds coverage for LLMStats: warning filtering and duplicate-marker defense. |
| sdk/contentunderstanding/azure-ai-contentunderstanding/tests/samples/test_sample_to_llm_input.py | Updates sample test expectations to the new page marker format. |
| sdk/contentunderstanding/azure-ai-contentunderstanding/tests/samples/test_sample_to_llm_input_async.py | Updates async sample test expectations to the new page marker format. |
| sdk/contentunderstanding/azure-ai-contentunderstanding/README.md | Adds 1.2.0b2 to the SDK-to-service-version compatibility table. |
| sdk/contentunderstanding/azure-ai-contentunderstanding/CHANGELOG.md | Adds an unreleased 1.2.0b2 entry documenting the marker change and telemetry-warning filtering. |
| sdk/contentunderstanding/azure-ai-contentunderstanding/azure/ai/contentunderstanding/_version.py | Bumps the package version to 1.2.0b2. |
| sdk/contentunderstanding/azure-ai-contentunderstanding/azure/ai/contentunderstanding/_helpers.py | Implements InputPageNumber marker injection and the duplicate-marker bypass, and filters LLMStats: entries from rendered RAI warnings. |

@chienyuanchang chienyuanchang force-pushed the cu-sdk/llm-input-helper-update branch from a248a81 to 02ca9c2 Compare June 10, 2026 15:21
@chienyuanchang chienyuanchang force-pushed the cu-sdk/llm-input-helper-update branch from cf846b7 to 4dfefad Compare June 10, 2026 15:31
@yungshinlintw yungshinlintw merged commit ab7e362 into main Jun 10, 2026
18 checks passed
@yungshinlintw yungshinlintw deleted the cu-sdk/llm-input-helper-update branch June 10, 2026 20:28