bazzite/docs/preprocessors/replace-urls.py
Zeglius 181497bc17
feat: Prepare mdBook workflow for documentation (#1441)
* feat: add mdbook docs

* chore: add several articles to docs

* docs: add documentation at surface level

Using Discourse urls as fallback for missing content for now

docs: add missing image files

* docs: Add missing chapter emojis

docs: Add missing warning in Advanced docs in summary

docs: add missing waydroid guide

docs: rename files to avoid spaces

docs: fix badly set docs build params

docs: remove unnecessary placeholders

* docs: Relocate 'Gaming' section under 'General'

* docs: Add 'Introduction' section

This section contains a table of contents of the documentation

* docs: Add unstable documentation warning

* docs: Add missing github url

docs: add missing symlink to resources

* docs: Add discourse scraper utility

* docs: minor discourse scraper docs changes

* docs: Add youtube embedding preprocessor

* minor reformat for youtube-embed

* docs: Add mdbook preprocessor template

* docs: add format-author preprocessor

* docs: add git lib to mdbook toolset

* docs: Always fetch the highest quality image by fetch_discourse_md

* docs: fix youtube-embed ignoring new line requirement

* docs: Add documentation transcription guide

* docs: Missing url in transcription guide

* docs: Remove YAML header from doc guide

* docs: Minor tweaks to transcription guide

* docs: Add utilities preprocessor module

docs: Move debug preprocessor util to utils

* docs: tweak debug function

* docs: Add 'replace-urls' preprocessor

* chore: Move mappings parameter in replace-urls preprocessor

* docs: add ignore field to replace-urls

* docs: add Mdbook python types

* docs: Add ignore field to replace-urls

Now we can exclude files from being processed with glob patterns

* chore(ci): add deploy_docs

* chore(ci): Add dynamic edit url template to deploy_docs

* chore(ci): Add html.site-url to deploy_docs

* chore(readme): Use relative paths for repo_content

* chore(ci): Add README to included paths for deploy_docs

* chore(ci): Disable deploy_docs

* chore(ci): Use main in deploy_docs.on.push.branches

* docs: Rephrase unstable docs warning

* chore(ci): Exclude docs from triggering build workflow

* chore(ci): Enable deploy_docs

* fix(docs): Remove unnecessary imports in preprocessors

* docs: Move unstable docs warning to index.hbs

* docs: Add page metadata inclusion with fetch_discourse_md.py

* docs: Move fetch_discourse_md.py to docs/utils

* docs: Add 'fetched_at' metadata field in fetch_discourse_md.py

* docs: Update fetch_discourse_md.py to format metadata in json

* Revert "chore(readme): Use relative paths for repo_content"

This reverts commit 6a781c6596.

* docs: Replace include with an url to repo README

* ci(docs): Add multilanguage doc build support

* docs: add Justfile utility

* docs: update Justfile utility

* ci(docs): Add stricter workflow trigger to deploy_docs

* docs: add 'preview_translation' to Justfile

* docs: add documentation translation guide

* ci(docs): Add mdbook cache

* ci(docs): Add i18n-report

* ci(docs): tweak deploy_docs workflow triggers

* ci(docs): remove unnecessary slash at build.yml

* ci(docs): remove unnecessary slash at deploy_docs.yml

* ci(docs): add docs/book.toml to deploy_docs trigger

* ci(docs): Add schedule trigger

* ci(docs): add github-pages cleaning

* ci(docs): Exclude docs from generate_changelog

* docs: Add dependencies installation script

* ci(docs): Add mdbook pdf build

* docs: Tweak Justfile to support pdf generation

* Revert "docs: Always fetch the highest quality image by fetch_discourse_md"

This reverts commit 74130ee1fe.

* ci(docs): Exclude deploy_docs.yml from cache-mdbook keys

* docs: Add 'mdbook_build' to Justfile

* docs: Add 'mdbook_serve' to Justfile

* docs: Add debug flag to fetch_discourse_md

* docs: Automate discourse documentation scraping

* docs: Add flock to fetch_discourse_md

* docs: Add translation file generation with Justfile

* docs: Prefix url replacements with site-url in replace-urls.py preprocessor

* docs: Add installation guides

docs: Replace print button

* Revert "docs: Prefix url replacements with site-url in replace-urls.py preprocessor"

This reverts commit a685de4dce.

* Reapply "docs: Prefix url replacements with site-url in replace-urls.py preprocessor"

This reverts commit 777d8055ea.

* docs: fix replace-urls.py

* docs: fix fetch_discourse_md.py hitting discourse ip_10_secs_limit

* ci(docs): Remove duplicate '/' in build translation step

* ci(docs): Update actions/cache

* ci(docs): Reduce deploy_docs schedule timespan between triggers

* docs: update install-deps.sh

* docs: Update Advanced docs

* docs: Add favicon

* docs: Reword unstable documentation warning

* docs: Change default theme to 'navy'

* ci(docs): Move permissions to job scope
2024-08-21 11:56:12 -07:00

110 lines
2.8 KiB
Python

__doc__ = """Replace urls across the entire book"""

import glob
import json
from pathlib import Path
import sys
from typing import List, cast
from urllib.parse import urljoin, urlparse

from libs.utils import debug as _debug
from libs.types import MdBook


PREPROCESSOR_NAME = "replace-urls"


def debug(*obj):
    return _debug("REPLACE-URLS:", *obj)


# Standard mdBook preprocessor settings; never treat these keys as url mappings
_IGNORE_STRINGS = [
    "before",
    "after",
    "command",
    "renderers",
]


def is_url(url) -> bool:
    """Return True if `url` has both a scheme and a network location."""
    res: bool = False
    try:
        tmp = urlparse(url)
        res = tmp.netloc != "" and tmp.scheme != ""
    except Exception as _:
        res = False
    return res


def main():
    # mdBook probes renderer support first; exit 0 to accept every renderer
    if len(sys.argv) > 1:
        if sys.argv[1] == "supports":
            sys.exit(0)

    # mdBook passes a `[context, book]` JSON pair on stdin
    context, book = json.load(sys.stdin)
    book = MdBook(book)

    config = context["config"]["preprocessor"][PREPROCESSOR_NAME]

    # Without a usable config table, return the book unchanged
    if not config or not isinstance(config, dict):
        print(json.dumps(book._data))
        sys.exit(0)

    book_src = cast(str, context["config"]["book"]["src"])

    # Prefix to prepend to replaced urls if output.html.site-url is set and the replacement starts with `/`
    try:
        site_url_prefix = cast(str, context["config"]["output"]["html"]["site-url"])
    except Exception as _:
        site_url_prefix = ""

    # Expand the `ignore` glob patterns relative to the book source directory
    ignore_paths_list_globs = cast(list[str], list(config.get("ignore") or []))
    ignore_paths: List[str] = list()
    root_dir = Path(context["root"], book_src)
    for p in ignore_paths_list_globs:
        ignore_paths += glob.glob(p, root_dir=root_dir)

    debug("My ignored paths:", ignore_paths)

    config_mappings: dict = config["mappings"]

    # Get the url mappings
    # A replacement that starts with `/` gets the site-url prefix prepended in the loop below
    url_mappings: list[tuple[str, str]] = [
        (k, v)
        for k, v in config_mappings.items()
        if k not in _IGNORE_STRINGS and is_url(k)
    ]

    # Replace the urls
    # book_s = json.dumps(book)
    # for mapp_old, map_new in url_mappings:
    #     book_s = book_s.replace(mapp_old, map_new)
    for section in book.sections:
        if not section.chapter:
            debug("Section skipped, was part title:", section.part_title)
            continue

        debug("section.chapter.path =", section.chapter.path)
        if section.chapter.path in ignore_paths:
            debug("Section skipped, was in ignore_paths:", section.chapter.path)
            continue

        for old_url, new_url in url_mappings:
            if new_url.startswith("/"):
                new_url_aux = urljoin(site_url_prefix, new_url.lstrip("/"))
            else:
                new_url_aux = new_url
            section.chapter.content = section.chapter.content.replace(
                old_url, new_url_aux
            )

    print(json.dumps(book._data))


if __name__ == "__main__":
    main()
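
The prefixing rule in `main()` depends on how `urljoin` treats the last path segment of `site-url`, which is easy to misread. The following is a minimal standalone sketch of that rule, not part of the preprocessor itself: `resolve_replacement` is a hypothetical helper that mirrors the `if new_url.startswith("/")` branch, and the `/bazzite/` prefix and `example.com` URLs are assumed values chosen for illustration.

```python
from urllib.parse import urljoin, urlparse


def is_url(url: str) -> bool:
    # Same rule as the preprocessor: both a scheme and a host are required.
    parsed = urlparse(url)
    return parsed.scheme != "" and parsed.netloc != ""


def resolve_replacement(site_url_prefix: str, new_url: str) -> str:
    # Hypothetical helper mirroring the prefixing branch in main().
    if new_url.startswith("/"):
        return urljoin(site_url_prefix, new_url.lstrip("/"))
    return new_url


# Only absolute URLs qualify as mapping keys.
print(is_url("https://example.com/t/some-topic/123"))  # True
print(is_url("/General/faq.html"))  # False

# A trailing slash on the prefix keeps its last path segment...
print(resolve_replacement("/bazzite/", "/General/faq.html"))  # /bazzite/General/faq.html
# ...while without it urljoin replaces that segment.
print(resolve_replacement("/bazzite", "/General/faq.html"))  # /General/faq.html
# With no site-url configured, the leading slash is simply dropped.
print(resolve_replacement("", "/General/faq.html"))  # General/faq.html
```

If this reading is right, `output.html.site-url` should end with `/` for root-relative replacements to land under the site prefix.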