Introduce Sitemap, LLMs.txt, Address LLM observability by ensuring https://rspec.info/robots.txt allows bot crawling #246

Open
JulienDefrance wants to merge 1 commit into rspec:source from JulienDefrance:source

JulienDefrance commented Jun 20, 2026

Changelog

  • Introduces sitemap/sitemap-index.xml
  • Introduces llms.txt
  • Addresses LLM observability by ensuring https://rspec.info/robots.txt allows bot crawling (gptbot previously disallowed)

Fixes #245

JonRowe left a comment

Member

I'm open to adding a sitemap if it's auto-generated; we use Middleman and there's probably a plugin for it. That can be a separate PR (or dropped entirely from this one).

I will think about relaxing the robots.txt, given it's not likely to be that effective and is certainly not exhaustive enough; however, the changes requested to it need to be made.

Lastly, I'm also not interested in adding an llms.txt, given I don't want to use my limited time to maintain it.

Comment thread source/llms.txt
@@ -0,0 +1,184 @@
# RSpec

I'm not interested in adding this, so please remove.

@@ -0,0 +1,336 @@
<?xml version="1.0" encoding="UTF-8"?>

I'm happy to add a sitemap, but it needs to be autogenerated.
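
To illustrate what "autogenerated" could mean here, the following is a minimal Ruby sketch that builds a sitemap from a list of page paths at build time instead of committing a hand-written XML file. It is not a Middleman plugin and does not use any Middleman API; `build_sitemap`, `xml_escape`, and the example paths are hypothetical names chosen for illustration, and a real setup would take the paths from the site generator's own page list.

```ruby
# Hypothetical helper, not part of Middleman or this PR.
XML_ESCAPES = {
  "&" => "&amp;", "<" => "&lt;", ">" => "&gt;",
  '"' => "&quot;", "'" => "&apos;"
}.freeze

# Escape the five characters that are special in XML text.
def xml_escape(text)
  text.gsub(/[&<>"']/, XML_ESCAPES)
end

# Build a sitemap document for the given site root and page paths.
# Paths are assumed to start with "/"; duplicates are dropped and the
# output is sorted so that repeated builds produce identical files.
def build_sitemap(base_url, paths)
  root = base_url.chomp("/")
  lines = [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
  ]
  paths.uniq.sort.each do |path|
    lines << "  <url><loc>#{xml_escape(root + path)}</loc></url>"
  end
  lines << "</urlset>"
  lines.join("\n") + "\n"
end

# Example paths are illustrative only.
puts build_sitemap("https://rspec.info/", ["/documentation/", "/about/"])
```

Generating the file this way keeps it in step with the pages that are actually built, which is the maintenance concern raised in the review.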

Comment thread source/robots.txt
User-agent: GPTBot
Disallow: /
User-agent: *
Content-Signal: search=yes, ai-train=yes, ai-input=yes

I'm not willing to add this; this is the minimum I will consider:

Suggested change
- Content-Signal: search=yes, ai-train=yes, ai-input=yes
+ Content-Signal: search=yes, ai-train=no, ai-input=no
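
To make the difference between the two `Content-Signal` lines concrete, here is a small Ruby sketch that reads such a line into a hash of booleans. It assumes the line is a comma-separated list of `key=value` pairs with `yes`/`no` values, as in the suggestion above; `parse_content_signal` is a hypothetical helper, not a full robots.txt parser, and says nothing about how any particular crawler interprets the directive.

```ruby
# Hypothetical helper: parse one "Content-Signal:" line into a Hash.
# Returns {} when the line is not a Content-Signal directive.
def parse_content_signal(line)
  name, value = line.split(":", 2)
  return {} unless name && value && name.strip.casecmp?("Content-Signal")

  value.split(",").each_with_object({}) do |pair, signals|
    key, setting = pair.split("=", 2).map(&:strip)
    # Skip malformed entries that lack a key or a value.
    next if key.nil? || key.empty? || setting.nil?

    signals[key] = (setting.downcase == "yes")
  end
end

proposed  = parse_content_signal("Content-Signal: search=yes, ai-train=yes, ai-input=yes")
suggested = parse_content_signal("Content-Signal: search=yes, ai-train=no, ai-input=no")

# List the signals whose value differs between the two versions.
changed = proposed.keys.reject { |key| proposed[key] == suggested[key] }
puts changed.inspect
```

Under these assumptions, only the `ai-train` and `ai-input` signals differ between the proposed line and the reviewer's suggestion, while `search` stays allowed in both.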
