Add baseurl to favor stable versions in search engines #1029
|
@Yann-P this is awesome!!! Is it really this simple? I've spent hours searching for how to make it so search engines point towards Does this also close #326? I guess my question is does this solve "all" the related problems like:
The reason that I'm asking is because I'd love to upstream this somehow so that folks like me can find it easier. I searched places, including the sphinx docs, and nothing ever stood out to me to meet this need. Certainly now that I have a bit more context, sphinx linking to RFC6596 makes more sense. However, it would be nice if the |
|
Hello @TimMonko
yes
https://github.com/vroncevic/readthedocs.org/blob/master/docs/guides/canonical.rst |
|
@Yann-P thank you for taking the time to help me learn. This is absolutely a highlight of my week and a highlight of what makes open source so special to me. You took the time to share your expertise across borders, and I am truly grateful. I learned something new with In addition, it helped me understand the canonical part better. For example, I went to view-source:https://docs.ray.io/en/releases-2.55.0/index.html and saw that it still points towards the latest URL. I'm merging this PR so it's in our (SoonTM) next release. Thank you so much again!!!! 😁 🌟 |
|
Sticky banners and index crawling are being addressed in the same PR, both to reduce commit bloat and because it breaks the GitHub API anyway, so it's not useful for review.

# Sticky Banners

[`e40b0ff` (this PR)](e40b0ff)

## References and Relevant Issues

Follow-up to napari/docs#1029
Closes napari/docs#1039
Closes napari/docs#1038

## Description

This uses a script (that I coaxed out of Deepseek), [`add_sticky_banner.py`](https://gist.github.com/TimMonko/ec2aa08803fa1d815f090a78402c12b0), to add the pydata-sphinx-theme sticky header CSS to the `<head>` of every necessary HTML file. I _did not_ target the CSS directly because... well, it's a bit of a mess due to the unversioned pages, I think. Go back to, for example, the live 0.5.6 docs and you'll see the homepage uses CSS from `/dev/_static/`, and this causes some issues.

The key changes are... adding the sticky CSS at the very end of `<head>`, just before `</head>`

```
<style>
.pst-sticky-header{position:sticky;top:0;z-index:1030}
.pst-sticky-header .bd-header{position:static}
</style>
</head>
```

And adding `<div class="pst-sticky-header">` above `<div class="pst-async-banner-revealer d-none">` (and a matching `</div>` later) to mirror the implementation observed in the HTML of napari/docs#1029

This results in a 6-line diff for each file, and has only touched 0.4.19 (where banners were introduced) to 0.7.0

<img width="1950" height="1381" alt="image" src="https://github.com/user-attachments/assets/3f08eecf-b867-4022-89bf-9a78afa6e8d8" />

# noindex and searching

[`7780976` (this PR)](7780976)

## References and relevant issues

closes napari/docs#364
closes #445
closes napari/docs#1038
rediraffe: napari/docs#946

## Description

This one was a doozy of learning. We ultimately had 3 things to decide between:

1. `<meta name="robots" content="noindex">`
2. `<link rel="canonical" href="https://napari.org/stable/PATH">`
3. `robots.txt`

`robots.txt` stops crawlers from reading the old files at all, so they can never update what they have already indexed. That is bad, because we don't have any other way to change the ranking.

I really wanted to take the `rel="canonical"` approach, because it would still allow someone to search "napari 0.4.15 docs" and find the old documentation. However, after spending significant time testing and running this on the files (with [`inject_canonical.py`](https://gist.github.com/TimMonko/0f7402c031558c2fedbdd53c300c980f)), I realized the primary issue shows up in the very search we are trying to fix: "napari docs" currently leads to https://napari.org/0.4.15/developers/docs.html and the canonical link for that page would be https://napari.org/stable/developers/docs.html which is a 404 😢 so I used [`inject_noindex.py`](https://gist.github.com/TimMonko/fd866cf371f73ac31f0595fc08667cd7) instead.

So the solution is, unfortunately, `noindex`. The upside is that it keeps the old pages from showing up in search.

This results in adding the line just after `<head>`

```
<head>
<meta name="robots" content="noindex">
```

and affects every doc before 0.7.1
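For readers who want to try the same kind of edit, here is a minimal sketch of the two `<head>` changes described above, assuming plain string insertion is enough. It is not the contents of the linked gists: the helper names (`inject_noindex`, `inject_sticky_style`, `rewrite_tree`) are hypothetical, and the `<div class="pst-sticky-header">` wrapper is left out because its closing position depends on the page layout.

```python
import re
from pathlib import Path

NOINDEX_TAG = '<meta name="robots" content="noindex">'
STICKY_STYLE = (
    "<style>\n"
    ".pst-sticky-header{position:sticky;top:0;z-index:1030}\n"
    ".pst-sticky-header .bd-header{position:static}\n"
    "</style>\n"
)

# Opening <head> tag, with or without attributes (does not match <header>).
HEAD_OPEN = re.compile(r"<head(\s[^>]*)?>", re.IGNORECASE)
HEAD_CLOSE = re.compile(r"</head\s*>", re.IGNORECASE)


def inject_noindex(html: str) -> str:
    """Insert the robots noindex tag on its own line right after <head>."""
    if NOINDEX_TAG in html:
        return html  # already present, keep the edit idempotent
    match = HEAD_OPEN.search(html)
    if match is None:
        return html  # no <head> found, leave the page untouched
    end = match.end()
    return html[:end] + "\n" + NOINDEX_TAG + html[end:]


def inject_sticky_style(html: str) -> str:
    """Insert the sticky-header <style> block just before </head>."""
    if ".pst-sticky-header{" in html:
        return html  # already present
    match = HEAD_CLOSE.search(html)
    if match is None:
        return html
    start = match.start()
    return html[:start] + STICKY_STYLE + html[start:]


def rewrite_tree(root: Path) -> int:
    """Apply inject_noindex to every .html file under root; return files changed."""
    changed = 0
    for path in root.rglob("*.html"):
        original = path.read_text(encoding="utf-8")
        updated = inject_noindex(original)
        if updated != original:
            path.write_text(updated, encoding="utf-8")
            changed += 1
    return changed
```

Both helpers return the input unchanged when the snippet is already there, so running the script twice over the same version folder should not grow the diff.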
References and relevant issues
#364
Description
Set `html_baseurl`, which Sphinx translates into a `<link rel="canonical">` tag on each page, so that the canonical URL points to the stable version.
However, for this to be effective we need to set noindex as suggested on #364, which is what I did in this additional pull request:
napari/napari.github.io#444
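As a rough illustration of what setting `html_baseurl` does, the sketch below shows a `conf.py`-style assignment and the canonical tag a page would end up with. The base URL value is an assumption taken from the `https://napari.org/stable/PATH` pattern mentioned in this thread, and `canonical_link` is a hypothetical helper that mimics the outcome, not Sphinx's internal code.

```python
# conf.py (sketch): base URL that canonical links are built from.
# The exact value is an assumption based on the URLs quoted in this thread.
html_baseurl = "https://napari.org/stable/"


def canonical_link(pagename: str, baseurl: str = html_baseurl, suffix: str = ".html") -> str:
    """Illustrative only: build the canonical tag for one page.

    pagename is the document path without extension, e.g. "developers/docs".
    """
    url = baseurl.rstrip("/") + "/" + pagename + suffix
    return f'<link rel="canonical" href="{url}">'


print(canonical_link("developers/docs"))
```

Every built version then advertises the stable URL for the same path, which only helps search engines when that path still exists under the stable docs.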