Commit

Add sample path and URL for glossary pages
benoit74 committed Nov 13, 2024
1 parent c516446 commit 477304f
Showing 1 changed file with 6 additions and 0 deletions.
6 changes: 6 additions & 0 deletions scraper/src/mindtouch2zim/processor.py
@@ -478,6 +478,12 @@ def _process_page(
        if self.mindtouch_client.library_url.endswith(".libretexts.org") and re.match(
            r"^.*\/zz:_[^\/]*?\/20:_[^\/]*$", page.path
        ):
            # glossary pages on libretexts.org, e.g. "Courses/California_State_Universi
            # ty_Los_Angeles/Book:_An_Introduction_to_Geology_(Johnson_Affolter_Inkenbr
            # andt_and_Mosher)/zz:_Back_Matter/20:_Glossary", running at https://geo.li
            # bretexts.org/Courses/California_State_University_Los_Angeles/Book%3A_An_I
            # ntroduction_to_Geology_(Johnson_Affolter_Inkenbrandt_and_Mosher)/zz%3A_Ba
            # ck_Matter/20%3A_Glossary
            rewriten = rewrite_glossary(page_content.html_body)
        if not rewriten:
            rewriten = rewriter.rewrite(page_content.html_body).content

Codecov / codecov/patch: added line #L487 in scraper/src/mindtouch2zim/processor.py was not covered by tests.
Codecov / codecov/patch: added line #L489 in scraper/src/mindtouch2zim/processor.py was not covered by tests.
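
For readers who want to check the glossary condition against the sample path and URL given in the new comment, here is a minimal standalone sketch. The regular expression and the `.libretexts.org` suffix test are copied from the diff; the names `GLOSSARY_PATH_RE`, `is_libretexts_glossary` and `path_from_url` are hypothetical helpers introduced only for this illustration and are not part of mindtouch2zim. It also assumes that `page.path` is the percent-decoded URL path without its leading slash, which is what the sample in the comment suggests.

```python
import re
from urllib.parse import unquote, urlsplit

# Pattern copied verbatim from the diff: a "zz:_..." segment followed by a
# final "20:_..." segment.
GLOSSARY_PATH_RE = re.compile(r"^.*\/zz:_[^\/]*?\/20:_[^\/]*$")


def is_libretexts_glossary(library_url: str, path: str) -> bool:
    """Hypothetical helper mirroring the condition shown in the diff."""
    return (
        library_url.endswith(".libretexts.org")
        and GLOSSARY_PATH_RE.match(path) is not None
    )


def path_from_url(url: str) -> str:
    """Hypothetical helper: decode %3A etc. and drop the leading slash."""
    return unquote(urlsplit(url).path).lstrip("/")


# Sample path and URL taken from the comment added in this commit.
SAMPLE_PATH = (
    "Courses/California_State_University_Los_Angeles/"
    "Book:_An_Introduction_to_Geology_"
    "(Johnson_Affolter_Inkenbrandt_and_Mosher)/"
    "zz:_Back_Matter/20:_Glossary"
)
SAMPLE_URL = (
    "https://geo.libretexts.org/Courses/"
    "California_State_University_Los_Angeles/"
    "Book%3A_An_Introduction_to_Geology_"
    "(Johnson_Affolter_Inkenbrandt_and_Mosher)/"
    "zz%3A_Back_Matter/20%3A_Glossary"
)

if __name__ == "__main__":
    print(path_from_url(SAMPLE_URL) == SAMPLE_PATH)
    print(is_libretexts_glossary("https://geo.libretexts.org", SAMPLE_PATH))
```

Note that the pattern only accepts a path whose last segment starts with `20:_` and whose parent segment starts with `zz:_`, so sub-pages below the glossary, or back-matter pages numbered differently, would fall through to the generic rewriter.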
