Skip to content

chore(deps): use html-to-markdown v2#163

Merged
omarluq merged 5 commits into
mainfrom
feat/fetch-url-tool
Jun 23, 2026
Merged

chore(deps): use html-to-markdown v2#163
omarluq merged 5 commits into
mainfrom
feat/fetch-url-tool

Conversation

@omarluq

@omarluq omarluq commented Jun 23, 2026

Copy link
Copy Markdown
Owner

No description provided.

@coderabbitai

coderabbitai Bot commented Jun 23, 2026

Copy link
Copy Markdown

Review Change Stack

Important

Review skipped

This PR was authored by the user configured for CodeRabbit reviews. CodeRabbit does not review PRs authored by this user. It's recommended to use a dedicated user account to post CodeRabbit review feedback.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 1fff5969-8020-4c02-8f3b-47235dccfd3d

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Migrates internal/tool/fetch.go HTML-to-Markdown conversion from the old converter to html-to-markdown/v2 with domain-aware ConvertString(..., WithDomain(finalURL)). Conversion errors are now propagated to callers. HTML extraction is refactored into fetchedDocumentHTML. Dependency entries in go.mod are updated accordingly. New internal tests cover decode errors, redirect policy, network validation edges, and rendering helpers.

Changes

HTML-to-Markdown Converter Migration

Layer / File(s) Summary
Dependency and import wiring
go.mod, internal/tool/fetch.go
Removes the direct html-to-markdown/v2 require entry, adds golang.org/x/net v0.55.0 as direct require, adds github.com/JohannesKaufmann/dom v0.3.1 as indirect, adds fetchRenderHTMLContext error constant, and replaces the converter import with the v2 htmltomarkdown and converter subpackages.
Markdown conversion flow and error handling
internal/tool/fetch.go
Rewrites formatFetchedHTML markdown case to propagate errors; replaces string-returning helpers with fetchedHTMLMarkdown (returns string+error, calls ConvertString with WithDomain) and fetchedDocumentHTML (prefers <body>, falls back to full document); removes old converter option and URL-resolution helpers.
Fetch tool execution and validation tests
internal/tool/fetch_internal_test.go
Adds net/url import; adds TestFetchTool_FetchFormattingErrors for rendering failure coverage; adds TestFetchTool_ExecuteDecodeError, TestFetchTool_RedirectValidation, TestFetchTool_NetworkValidationEdges, and TestFetchTool_URLAndRenderingHelpers; introduces fetchTestTruncation and fetchTestSelection deterministic builders for rendering assertions.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • omarluq/librecode#160: Introduced the fetch tool and its core fetch-path behavior that this PR extends with domain-aware conversion and error propagation.

Poem

🐇 Hippity-hop, the converter's new,
With domains resolved and errors in view!
The body's extracted, the markdown flows free,
WithDomain sets every relative URL spree.
No more silent failures — errors are shown,
This rabbit refactored right down to the bone! 🌿

🚥 Pre-merge checks | ✅ 3 | ❌ 1

❌ Failed checks (1 inconclusive)

Check name Status Explanation Resolution
Description check ❓ Inconclusive No pull request description was provided by the author, making this check inconclusive as it cannot be evaluated. Add a brief pull request description explaining the motivation for upgrading to html-to-markdown v2 and any breaking changes or behavioral improvements.
✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title 'chore(deps): use html-to-markdown v2' clearly summarizes the main change: updating the html-to-markdown dependency to version 2 and refactoring code to use it.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands.

@codecov

codecov Bot commented Jun 23, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 95.45455% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 82.77%. Comparing base (3191852) to head (3b07ad0).
⚠️ Report is 2 commits behind head on main.

Files with missing lines Patch % Lines
internal/tool/fetch.go 95.45% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #163      +/-   ##
==========================================
+ Coverage   82.59%   82.77%   +0.18%     
==========================================
  Files         285      285              
  Lines       22660    22653       -7     
==========================================
+ Hits        18715    18752      +37     
+ Misses       2782     2754      -28     
+ Partials     1163     1147      -16     
Flag Coverage Δ
unittests 82.77% <95.45%> (+0.18%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
internal/tool/fetch.go (1)

488-505: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value

Optional: extract the shared body-extraction logic.

fetchedHTMLForMarkdown (488-505) and fetchedHTMLBody (507-524) duplicate the same body-vs-full-document extraction; the only difference is the <html><body> wrapper. Extracting a shared helper would remove the duplication and also collapse the repeated "render fetched html" literal that SonarCloud flagged (duplicated 4×).

♻️ Sketch
const fetchRenderHTMLMsg = "render fetched html"

func fetchedDocumentHTML(doc *goquery.Document) (string, error) {
	if body := doc.Find("body").First(); body.Length() > 0 {
		html, err := body.Html()
		if err != nil {
			return "", oops.In("tool").Code("fetch_render_html").Wrapf(err, fetchRenderHTMLMsg)
		}
		return html, nil
	}
	html, err := doc.Html()
	if err != nil {
		return "", oops.In("tool").Code("fetch_render_html").Wrapf(err, fetchRenderHTMLMsg)
	}
	return html, nil
}

fetchedHTMLForMarkdown returns it directly; fetchedHTMLBody wraps the result in <html>\n<body>\n....

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/tool/fetch.go` around lines 488 - 505, Extract the shared
body-extraction logic from fetchedHTMLForMarkdown and fetchedHTMLBody functions
into a common helper function (such as fetchedDocumentHTML) that returns the
HTML content by checking if a body tag exists and falling back to the full
document. Define a constant for the repeated "render fetched html" error message
string. Have fetchedHTMLForMarkdown return the helper's result directly, and
have fetchedHTMLBody wrap the helper's result in the html and body tags. This
eliminates code duplication and consolidates the repeated error message literal.

Source: Linters/SAST tools

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@internal/tool/fetch.go`:
- Around line 488-505: Extract the shared body-extraction logic from
fetchedHTMLForMarkdown and fetchedHTMLBody functions into a common helper
function (such as fetchedDocumentHTML) that returns the HTML content by checking
if a body tag exists and falling back to the full document. Define a constant
for the repeated "render fetched html" error message string. Have
fetchedHTMLForMarkdown return the helper's result directly, and have
fetchedHTMLBody wrap the helper's result in the html and body tags. This
eliminates code duplication and consolidates the repeated error message literal.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 755cda4f-554e-419b-a1fd-67e46a2a0a8c

📥 Commits

Reviewing files that changed from the base of the PR and between 7e65266 and a3b387d.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (2)
  • go.mod
  • internal/tool/fetch.go

coderabbitai[bot]
coderabbitai Bot previously approved these changes Jun 23, 2026
coderabbitai[bot]
coderabbitai Bot previously approved these changes Jun 23, 2026
coderabbitai[bot]
coderabbitai Bot previously approved these changes Jun 23, 2026
coderabbitai[bot]
coderabbitai Bot previously approved these changes Jun 23, 2026
@sonarqubecloud

Copy link
Copy Markdown

@omarluq omarluq merged commit bbf8d3a into main Jun 23, 2026
17 of 20 checks passed
@omarluq omarluq deleted the feat/fetch-url-tool branch June 23, 2026 03:08
@sonarqubecloud

Copy link
Copy Markdown

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant