robots.txt: add /server/* and /app/health to list of disallowed paths #3275

Open
wants to merge 2 commits into
base: main

saschaszott
Contributor

@saschaszott saschaszott commented Aug 29, 2024

Description

This minor PR extends robots.txt with additional Disallow rules for /server/* and /app/health.

@saschaszott changed the title from "robots.txt: add /server/* to list of disallowed paths" to "robots.txt: add /server/* and /app/health to list of disallowed paths" on Aug 29, 2024
Member

@tdonohue tdonohue left a comment

Thanks @saschaszott. Overall, this looks good. Just one minor inline recommendation.

@@ -18,6 +18,10 @@ Disallow: /profile
Disallow: /workflowitems
# Crawlers should be able to access entity pages, but not the facet search links present on entity pages
Disallow: /entities/*?f
# do not crawl REST API endpoints and HAL browser
Disallow: /server/*
Member

This assumes that the backend is deployed at /server. It may not be deployed in that location. That said, I think it's likely many sites will deploy the backend to /server.

We may just want to make clear that this path may require customization, by adding something like:

# Do not crawl the REST API and HAL Browser
# NOTE: You may need to update this path if you deploy the REST API to a different location.
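
For illustration, with that note applied the new rules could read roughly as follows. This is only a sketch: it assumes the backend really is deployed at /server, and the /app/health line is inferred from the PR title, since its exact wording and position are not visible in the hunk above.

```
# Do not crawl the REST API and HAL Browser
# NOTE: You may need to update this path if you deploy the REST API to a different location.
Disallow: /server/*
# Assumed from the PR title; the exact line is not shown in this hunk
Disallow: /app/health
```

A site that serves the REST API from another location would replace /server with its own path in the Disallow rule.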

@tdonohue added the following labels on Aug 29, 2024: bug; component: SEO (Search Engine Optimization); 1 APPROVAL (pull request only requires a single approval to merge); port to dspace-7_x (this PR needs to be ported to the `dspace-7_x` branch for the next bug-fix release); port to dspace-8_x (this PR needs to be ported to the `dspace-8_x` branch for the next bug-fix release)
Projects
Status: 👀 Under Review
2 participants