Skip to content

Commit

Permalink
identify and map xpath patterns
Browse files Browse the repository at this point in the history
  • Loading branch information
CarsonDavis committed May 23, 2024
1 parent 5bd2178 commit be25d8a
Show file tree
Hide file tree
Showing 2 changed files with 18 additions and 0 deletions.
18 changes: 18 additions & 0 deletions scripts/xpath_cleanup/find_xpath_patterns.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,18 @@
# flake8: noqa
"""this script is used to find all the xpath patterns in the database, so that they can be mapped to new patterns in xpath_mappings.py"""

from sde_collections.models.pattern import TitlePattern

print(
"there are", TitlePattern.objects.filter(title_pattern__contains="xpath").count(), "xpath patterns in the database"
)

# Get all the xpath patterns and their candidate urls
xpath_patterns = TitlePattern.objects.filter(title_pattern__contains="xpath")
for xpath_pattern in xpath_patterns:
print(xpath_pattern.title_pattern)
# for url in xpath_pattern.candidate_urls.all():
# print(url.url)
print()

# not every xpath pattern has a candidate url, but I went ahead and fixed all of them anyway
File renamed without changes.

0 comments on commit be25d8a

Please sign in to comment.