fix(ocr): sanitize Graph validationToken echo to prevent reflected XSS by gnjoseph · Pull Request #250 · microsoft/SharePoint-Embedded-Samples

gnjoseph · 2026-07-03T23:19:25Z

Purpose

Fixes the CodeQL js/reflected-xss (high severity) alert on #248 — the check currently failing on that PR (mergeStateStatus: UNSTABLE). Targets user/dluces/sample_app_test_plan so it flows into #248 after review (same pattern as #249).

The vulnerability

AI/ocr/server/onReceiptAdded.ts echoed the Microsoft Graph subscription validationToken (from req.query) straight into the HTTP response body:

const validationToken = req.query['validationToken'];
if (validationToken) {
  res.status(200).type('text/plain').send(String(validationToken));   // reflected XSS sink
}

The value is attacker-influenceable and was reflected verbatim. Even with Content-Type text/plain, this is a reflected-XSS sink, because some browsers may MIME-sniff the body and interpret it as HTML. The echo was introduced when the OCR backend was ported from restify to express (b4b864f).
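To make the data flow concrete, here is a minimal sketch of how a crafted link reaches the sink. The query string is a hypothetical example, and the single-parameter parsing is a simplification of what a real query parser does; it is not the sample's actual code.

```typescript
// Hypothetical crafted query string: the token value is URL-encoded markup.
const craftedQuery = 'validationToken=%3Cscript%3Ealert(1)%3C%2Fscript%3E';

// Decode the value roughly the way a query parser would
// (simplified here to a single parameter).
const encodedValue = craftedQuery.slice(craftedQuery.indexOf('=') + 1);
const decodedToken = decodeURIComponent(encodedValue);

// Pre-fix behavior: the decoded value became the response body unchanged.
const preFixBody = String(decodedToken);

console.log(preFixBody); // <script>alert(1)</script>
```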

The fix

Graph requires the opaque, URL-safe validationToken to be echoed back verbatim to complete the subscription handshake, so I strip any character outside the token's known-safe set (base64url/base64: A–Z a–z 0–9 . _ ~ + / = - and space) before reflecting it, and add X-Content-Type-Options: nosniff:

const sanitizeValidationToken = (value: unknown): string =>
  String(value).replace(/[^A-Za-z0-9._~+/=\- ]/g, '');
...
res.status(200).type('text/plain').set('X-Content-Type-Options', 'nosniff').send(safeValidationToken);

This is a no-op for legitimate tokens (so the Graph handshake still works) while removing the characters needed for XSS (<, >, ", etc.).
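The no-op claim can be checked in isolation. The sketch below reuses the sanitizer expression from this PR; the two sample inputs are hypothetical and chosen only for illustration.

```typescript
// Same character class as the PR's sanitizer: anything outside the
// base64/base64url alphabet (plus '.', '~' and space) is dropped.
const sanitizeValidationToken = (value: unknown): string =>
  String(value).replace(/[^A-Za-z0-9._~+/=\- ]/g, '');

// Hypothetical sample inputs, not taken from real Graph traffic.
const legitimateToken = 'abc123-XYZ_.~token==';
const hostileToken = '"><img src=x onerror=alert(1)>';

const echoedLegit = sanitizeValidationToken(legitimateToken);
const echoedHostile = sanitizeValidationToken(hostileToken);

console.log(echoedLegit === legitimateToken); // true: legitimate token is unchanged
console.log(echoedHostile);                   // img src=x onerror=alert1
```

With the quote, angle brackets, and parentheses gone, the second value is inert plain text rather than markup.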

Verification (Windows, Node 24)

npm run build:backend compiles clean.
validate-sample.ps1 OCR backend smoke passes.
Live handshake test against the running backend:
- Legit token abc123-XYZ_.~token== → echoed verbatim (MATCH: True), Content-Type: text/plain, X-Content-Type-Options: nosniff.
- <script>alert(1)</script> → returned as scriptalert1/script (angle brackets and parentheses stripped).

CodeQL will re-run on this PR to confirm the alert is resolved.
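The live handshake checks above could also be approximated offline. This sketch assumes a hypothetical `RecordedResponse` type and `buildValidationResponse` helper that only record what the fixed handler sets; neither exists in the sample, and the real Express `res.type('text/plain')` may append a charset to the Content-Type.

```typescript
// Hypothetical stand-in for the parts of an Express response used here.
interface RecordedResponse {
  statusCode: number;
  headers: Record<string, string>;
  body: string;
}

// Sanitizer expression as shown in this PR.
const sanitizeValidationToken = (value: unknown): string =>
  String(value).replace(/[^A-Za-z0-9._~+/=\- ]/g, '');

// Hypothetical helper mirroring the fixed echo:
// status 200, text/plain, nosniff, sanitized body.
function buildValidationResponse(validationToken: unknown): RecordedResponse {
  return {
    statusCode: 200,
    headers: {
      'Content-Type': 'text/plain', // simplified: no charset suffix
      'X-Content-Type-Options': 'nosniff',
    },
    body: sanitizeValidationToken(validationToken),
  };
}

// The two inputs used in the verification notes.
const legitResponse = buildValidationResponse('abc123-XYZ_.~token==');
const hostileResponse = buildValidationResponse('<script>alert(1)</script>');

console.log(legitResponse.body);   // abc123-XYZ_.~token==
console.log(hostileResponse.body); // scriptalert1/script
```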

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

CodeQL js/reflected-xss (high) flagged AI/ocr/server/onReceiptAdded.ts: the Microsoft Graph subscription validationToken from req.query was echoed straight into the HTTP response body. Although the response is sent as text/plain, a user-controlled value reflected verbatim is a reflected XSS sink (browsers can MIME-sniff, and the value is attacker-influenceable).

Graph requires the opaque, URL-safe validationToken to be echoed back to complete the subscription handshake, so strip any character outside the token's known-safe set (base64url/base64) before reflecting it. This is a no-op for legitimate tokens while removing the characters needed for XSS. Also set X-Content-Type-Options: nosniff as defense in depth.

Verified on Windows (Node 24): backend builds; validate-sample.ps1 backend smoke passes; a legitimate token echoes verbatim (handshake intact) while '<script>alert(1)</script>' is returned as 'scriptalert1/script' (angle brackets and parentheses stripped).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

gnjoseph wants to merge 1 commit into user/dluces/sample_app_test_plan from user/gnjoseph/pr-248-ocr-xss-fix
