
feat(oq): add GCF as --format gcf output option #216

Open
blackwell-systems wants to merge 3 commits into speakeasy-api:main from blackwell-systems:gcf-format

@blackwell-systems commented Jun 19, 2026

Summary

Add --format gcf as a new output format option for the oq query command, alongside existing table, json, markdown, and toon formats. Also available as an inline pipeline stage: format(gcf).

Why

Your TOON encoder has a silent data corruption bug

The hand-rolled toonValue function in format.go encodes arrays by joining elements with semicolons:

case expr.KindArray:
    return toonEscape(strings.Join(v.Arr, ";"))

If any array element contains a semicolon, the data silently corrupts on decode. A 2-element array becomes 4 elements:

Input:   ["v1;deprecated", "v2;current"]  (2 elements)
Encoded: v1;deprecated;v2;current
Decoded: ["v1", "deprecated", "v2", "current"]  (4 elements)

OpenAPI schemas have array fields (scopes, tags, enum values) where this can occur. GCF handles arrays natively with proper quoting, no lossy joins.
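The lossy round trip described above can be reproduced in isolation. This is a minimal sketch, not the oq code: `encodeJoined` mirrors only the `strings.Join(v.Arr, ";")` step (the `toonEscape` call is omitted), and `decodeJoined` is a hypothetical consumer that splits on the same delimiter.

```go
package main

import (
	"fmt"
	"strings"
)

// encodeJoined mirrors the join step quoted above: elements are
// concatenated with ";" and the delimiter is not escaped.
func encodeJoined(elems []string) string {
	return strings.Join(elems, ";")
}

// decodeJoined is a hypothetical decoder that splits on ";".
// It cannot tell a delimiter from a semicolon inside an element.
func decodeJoined(s string) []string {
	return strings.Split(s, ";")
}

func main() {
	in := []string{"v1;deprecated", "v2;current"}
	enc := encodeJoined(in)
	out := decodeJoined(enc)
	// prints: in=2 encoded="v1;deprecated;v2;current" out=4
	fmt.Printf("in=%d encoded=%q out=%d\n", len(in), enc, len(out))
}
```

Any consumer that treats `;` as the element separator will hit the same ambiguity, whatever language it is written in.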

LLM comprehension

When oq output is consumed by LLMs (agent workflows, AI-assisted API exploration), format comprehension accuracy matters:

  • GCF: 100% on general structured data across every frontier model (Claude, GPT-5.5, Gemini)
  • GCF: 90.7% on adversarial/complex payloads (500 symbols, 200 edges)
  • TOON: 68.5% on the same adversarial data
  • JSON: 53.6%

1,700+ evaluations across 10 models from 3 providers. No model has been trained on GCF.

Full eval data: GCF benchmarks

Data integrity

GCF is verified lossless across 43 billion+ round-trips in 5 formats (JSON, YAML, TOML, CSV, MessagePack) and 6 language implementations. Zero failures.

Unlike the hand-rolled TOON encoder, GCF has a formal spec, a decoder, conformance fixtures, and fuzz testing backing every claimed number.

Full verification data: Lossless verification

Zero dependencies

gcf-go has zero runtime dependencies beyond the Go standard library, same as the rest of the oq package.

Changes

| File | Change |
| --- | --- |
| oq/format.go | Add FormatGCF function, import gcf-go |
| oq/parse.go | Accept "gcf" in inline format() stage |
| oq/oq.go | Update FormatHint comment |
| cmd/openapi/commands/openapi/query.go | Add case "gcf", update --format help text |
| go.mod / go.sum | Add github.com/blackwell-systems/gcf-go |

Usage

openapi query petstore.yaml "schemas | where(isComponent) | take(5)" --format gcf
openapi query petstore.yaml "schemas | where(isComponent) | take(5) | format(gcf)"


Summary by cubic

Adds GCF as an output format to the oq query command and pipeline for lossless arrays and clearer LLM consumption. Use --format gcf or format(gcf).

  • New Features
    • Add --format gcf and pipeline stage format(gcf) to openapi query.
    • Implement FormatGCF using github.com/blackwell-systems/gcf-go v1.2.2; preserves types and emits native arrays.
    • Update help text, parser validation, and docs to include gcf.
    • Add tests for GCF output (count, groups, explain, special chars, inline stage).

Written for commit b882efa. Summary will update on new commits.


@blackwell-systems blackwell-systems requested a review from a team as a code owner June 19, 2026 05:18

@cubic-dev-ai (bot) left a comment

1 issue found across 6 files


Comment thread go.mod Outdated

@TristanSpeakEasy (Member) left a comment

Requesting changes. I don’t think this is ready to merge as-is.

Main blockers:

  • The Go version bump breaks the workspace locally: go test ./oq and go list -m all both fail with "module . listed in go.work file requires go >= 1.26.1, but go.work lists go 1.26.0". Please drop the bump unless it is truly required. If gcf-go requires it, we should either refactor or choose a dependency that supports the repo’s existing Go version, or make a deliberate workspace-wide toolchain change separately.
  • This adds a new third-party dependency for an optional output format. I audited the downloaded module and didn’t see obvious runtime network/process/unsafe behavior, and go list -deps only showed the package itself as a non-stdlib runtime import, but this repo is intentionally dependency-light and the existing table/json/markdown/toon formatters do not add format-specific third-party deps. The cost/benefit needs maintainer buy-in, especially since the dependency is new and from the PR author’s org.
  • No tests were added for FormatGCF, format(gcf), CLI wiring, count/empty output, or array/special-character behavior. Existing formats have coverage in oq/oq_test.go; this should match that bar.
  • The TOON semicolon issue cited in the PR body is not fixed by this PR. If TOON array elements containing ; are ambiguous, we should add a regression test and fix TOON directly rather than use it only as rationale for a new format.
  • CLI docs/help are incomplete: the long command help and cmd/openapi/commands/openapi/README.md still only list table, json, markdown, and toon.

Validation run:

  • go test ./oq — failed before compile due to the go.work / go.mod version mismatch.
  • go list -m all — same failure.
  • GOWORK=off go test ./oq — passed, after downloading github.com/blackwell-systems/gcf-go v1.2.1.
  • GOWORK=off go list -deps -f '{{if not .Standard}}{{.ImportPath}}{{end}}' github.com/blackwell-systems/gcf-go — only returned github.com/blackwell-systems/gcf-go.

Comment thread go.mod Outdated
Comment thread go.mod Outdated
Comment thread oq/format.go Outdated
@blackwell-systems (Author) commented

Thanks for the thorough review. Addressed everything:

  • Go version: reverted to 1.25.0.
  • Tests: added 7 tests matching existing coverage.
  • Code comment: stripped benchmark numbers, kept factual.
  • Docs: updated both READMEs and CLI help text.
  • Dependency: marked as direct.

On the TOON semicolon issue: after investigating, a proper fix would require changing the output contract of the TOON format (the ; delimiter is ambiguous when elements contain ;). Leaving this to your discretion.
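To illustrate why disambiguating the delimiter changes the output contract, here is a minimal sketch of one possible approach: quote any element that contains the delimiter, a double quote, or a backslash, and quote the empty string. The names `encodeQuoted` and `decodeQuoted` are hypothetical and are not the TOON encoder's actual code. Note that the encoded form of affected elements differs from today's output, which is the contract change mentioned above.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// encodeQuoted joins elements with ";" but wraps ambiguous elements in a
// Go-style quoted string so the delimiter can be told apart from data.
func encodeQuoted(elems []string) string {
	parts := make([]string, len(elems))
	for i, e := range elems {
		if e == "" || strings.ContainsAny(e, ";\"\\") {
			parts[i] = strconv.Quote(e)
		} else {
			parts[i] = e
		}
	}
	return strings.Join(parts, ";")
}

// decodeQuoted reverses encodeQuoted. An empty input decodes to no elements.
func decodeQuoted(s string) ([]string, error) {
	var out []string
	if s == "" {
		return out, nil
	}
	for {
		if strings.HasPrefix(s, `"`) {
			// Quoted element: take the quoted prefix and unquote it.
			q, err := strconv.QuotedPrefix(s)
			if err != nil {
				return nil, err
			}
			e, err := strconv.Unquote(q)
			if err != nil {
				return nil, err
			}
			out = append(out, e)
			s = s[len(q):]
		} else {
			// Bare element: runs up to the next delimiter or end of input.
			i := strings.IndexByte(s, ';')
			if i < 0 {
				return append(out, s), nil
			}
			out = append(out, s[:i])
			s = s[i:]
		}
		if s == "" {
			return out, nil
		}
		if s[0] != ';' {
			return nil, fmt.Errorf("unexpected character %q after element", s[0])
		}
		s = s[1:]
	}
}

func main() {
	in := []string{"v1;deprecated", "v2;current"}
	enc := encodeQuoted(in)
	out, err := decodeQuoted(enc)
	fmt.Println(enc, len(out), err)
}
```

Elements without special characters encode exactly as before, so only arrays that were already ambiguous would change shape under this scheme.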

@cubic-dev-ai (bot) left a comment

2 issues found across 7 files (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="oq/oq_test.go">

<violation number="1" location="oq/oq_test.go:1462">
P2: TestFormatGCF_SpecialChars does not test special characters despite its name, creating a misleading coverage gap for GCF's core special-character handling feature</violation>
</file>


Comment thread oq/format.go Outdated
Comment thread oq/oq_test.go Outdated
assert.Contains(t, out, "GCF profile=generic", "empty gcf should still have profile header")
}

func TestFormatGCF_SpecialChars(t *testing.T) {

@cubic-dev-ai (bot) commented Jun 19, 2026

P2: TestFormatGCF_SpecialChars does not test special characters despite its name, creating a misleading coverage gap for GCF's core special-character handling feature

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At oq/oq_test.go, line 1462:

<comment>TestFormatGCF_SpecialChars does not test special characters despite its name, creating a misleading coverage gap for GCF's core special-character handling feature</comment>

<file context>
@@ -1412,6 +1412,86 @@ func TestFormatToon_Explain(t *testing.T) {
+	assert.Contains(t, out, "GCF profile=generic", "empty gcf should still have profile header")
+}
+
+func TestFormatGCF_SpecialChars(t *testing.T) {
+	t.Parallel()
+	g := loadTestGraph(t)
</file context>

@TristanSpeakEasy (Member) commented

Dependency audit follow-up for github.com/blackwell-systems/gcf-go@v1.2.2, specifically checking whether OpenAPI data could be sent to a third party or processed remotely:

  • Verified the PR only imports the root package as github.com/blackwell-systems/gcf-go and calls gcf.EncodeGeneric(...) from oq.FormatGCF.
  • Audited the v1.2.2 tag in a temporary clone at commit 009f87b947ef9c0d9f8a1efb20554e2adb887d21.
  • The imported root package direct imports are local-only stdlib packages: crypto/sha256, encoding/json, fmt, io, math, reflect, regexp, sort, strconv, strings, sync, unicode, unicode/utf8.
  • The EncodeGeneric path builds output in memory with strings.Builder and returns a string. It does not accept a network/file/process sink and does not perform HTTP, DNS/socket, subprocess, filesystem, or environment access.
  • Repo-wide network/API/process references exist in eval/*_test.go benchmark/evaluation code, not in the imported runtime package. The only other compiled package is cmd/gcf, which imports os for CLI stdin/stdout/file handling and is not imported by this PR.
  • go.mod still lists gopkg.in/yaml.v3 as an indirect dependency in gcf-go, but rg 'gopkg.in/yaml|yaml\.' found no source usage at v1.2.2.
  • go test ./... in the dependency passed in the temporary clone.

Conclusion: I do not see a code path in the dependency used by this PR that can send OpenAPI/query data to a third party; the imported encoder appears to process locally in memory only.

@TristanSpeakEasy (Member) left a comment

Follow-up after re-reviewing the latest branch, validating the current cubic findings, and auditing github.com/blackwell-systems/gcf-go@v1.2.2:

I think this is close and I’m no longer concerned about the dependency sending OpenAPI/query data to a third party. The imported gcf.EncodeGeneric(...) path processes in memory and returns a string; I did not find HTTP/DNS/socket, subprocess, filesystem, environment, or callback-sink behavior in the imported runtime package.

Recommendation before approval:

  1. Please rebase/resolve the current conflict with main. The TOON finding is valid against this PR branch’s raw JSON fallback, but main already has the safer semicolon-array fix from #217 (toonArrayValue/toonQuote). After rebasing, this PR should keep the main implementation rather than reintroducing the JSON fallback.
  2. Please address the remaining GCF test cleanup: TestFormatGCF_SpecialChars currently does not exercise special characters despite its name. Either rename it to reflect what it checks or add a real special-character case.

Targeted validation run locally:

  • mise test -run 'TestExecute_ToonEscape_SpecialChars|TestFormatToon_SpecialChars|TestFormatGCF_SpecialChars|TestFormatGCF_Success|TestFormatGCF_Count_Success|TestFormatGCF_Groups_Success|TestFormatGCF_Empty_Success|TestFormatGCF_Explain|TestFormatGCF_InlinePipeline' ./oq — passed.

Once the branch is rebased and the misleading GCF test is cleaned up, I expect this should be ready to approve.

@blackwell-systems (Author) commented

Rebased on main (picks up #217's toonArrayValue fix). Also renamed TestFormatGCF_SpecialChars to TestFormatGCF_BoolAndIntFields and added a real special-characters test using hash/location fields.
