fixed MySql query bug; Upgraded pyproject file; Updated documentation… #18

Merged
liseli merged 1 commit into main from DEV-1216-mysql_query_bug
Jun 10, 2024

Contributor

@liseli liseli commented Jun 7, 2024

This PR fixes an issue found in the MySQL query (tracked as a Jira task). It also improves the log messages, which now print the MySQL queries and the host name to indicate the environment (production or development).

Below are some of the MySQL queries that now appear in the logs:

2024-06-07 17:56:49 2024-06-07 21:56:49,282  :: document_generator.mysql_data_extractor :: INFO :: MySQL query: SELECT * FROM rights_current WHERE namespace="hvd" AND id="32044092647320"
2024-06-07 17:56:49 2024-06-07 21:56:49,283  :: document_generator.mysql_data_extractor :: INFO :: MySQL query: SELECT member_id FROM holdings_htitem_htmember WHERE volume_id="hvd.32044092647320"
2024-06-07 17:56:49 2024-06-07 21:56:49,284  :: document_generator.mysql_data_extractor :: INFO :: MySQL query: SELECT member_id FROM holdings_htitem_htmember WHERE volume_id="hvd.32044092647320" AND access_count > 0
2024-06-07 17:56:49 2024-06-07 21:56:49,284  :: document_generator.mysql_data_extractor :: INFO :: MySQL query: SELECT mb_item.MColl_ID FROM mb_coll_item mb_item, mb_collection mb_coll WHERE mb_item.extern_item_id="hvd.32044092647320" AND mb_coll.num_items > 1000 
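For readers unfamiliar with how lines like these are produced, here is a minimal sketch using Python's standard `logging` module. It is not the code from this PR: the format string is only inferred from the log lines above, and `build_logger`, `log_query`, and the host value `mysql-dev` are hypothetical names chosen for illustration.

```python
import io
import logging

# Assumed format, inferred from the log lines above:
# timestamp :: logger name :: level :: message
LOG_FORMAT = "%(asctime)s :: %(name)s :: %(levelname)s :: %(message)s"


def build_logger(name, stream=None):
    """Return a logger that writes ' :: ' separated records to `stream`."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    handler = logging.StreamHandler(stream)  # stream=None means sys.stderr
    handler.setFormatter(logging.Formatter(LOG_FORMAT))
    logger.handlers = [handler]  # replace handlers to avoid duplicate lines
    logger.propagate = False
    return logger


def log_query(logger, query, host):
    """Log the host (to show the environment) and the query text."""
    logger.info("MySQL host: %s", host)
    logger.info("MySQL query: %s", query)


if __name__ == "__main__":
    buffer = io.StringIO()
    log = build_logger("document_generator.mysql_data_extractor", buffer)
    # "mysql-dev" is a made-up host name used only for this example.
    log_query(log, 'SELECT * FROM rights_current WHERE namespace="hvd"', "mysql-dev")
    print(buffer.getvalue(), end="")
```

Logging the host next to each query makes it possible to tell from the log alone whether a query ran against production or development.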

As part of this PR, the file filter_ids.txt was updated with a new list of ht_ids. This file will be used to test the ht_indexer workflow in Kubernetes.

How do you test this PR?

  1. Clone the repository
    git clone ... or git fetch

  2. Use the branch of this PR
    git checkout DEV-1216-mysql_query_bug

  3. In the working directory,
    3.1. Create the image
    docker build -t document_generator .
    3.2. Run the container
    docker compose up document_generator -d
    3.3. Run Python tests related to document_generator_service
    docker compose exec document_generator pytest document_generator ht_document ht_queue_service ht_utils

If you see the output shown in the image below, the change works on your local machine.
[image]

… and Generated a new list of ht_ids (TXT file filter_ids.txt) to index in Kubernetes
Member

@aelkiss aelkiss left a comment


I think the changes/edits look good for now. Eventually, we will want to think about what we log at each level - e.g. at info level we may only want to log the items indexed and how long it took, and be more verbose with things like the mysql queries at the debug level.
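The split between levels suggested in this comment could look roughly like the sketch below. It is an illustration under assumptions, not the project's code: `index_items`, the `run_query` callable, and the logger name are hypothetical. The query text is logged at debug level, while a single summary with the item count and elapsed time is logged at info level.

```python
import logging
import time

# Hypothetical logger name, used only for this example.
logger = logging.getLogger("document_generator.example")


def index_items(ht_ids, run_query):
    """Process each id, logging queries at DEBUG and a summary at INFO.

    `run_query` is a stand-in callable taking (query, params); in a real
    service it would execute the statement against MySQL.
    """
    query = "SELECT member_id FROM holdings_htitem_htmember WHERE volume_id=%s"
    start = time.perf_counter()
    indexed = 0
    for ht_id in ht_ids:
        # Verbose detail: only emitted when the logger level is DEBUG.
        logger.debug("MySQL query: %s params=%s", query, (ht_id,))
        run_query(query, (ht_id,))
        indexed += 1
    elapsed = time.perf_counter() - start
    # Summary: emitted at the default operational level.
    logger.info("Indexed %d items in %.2f seconds", indexed, elapsed)
    return indexed


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)  # switch to DEBUG to see queries
    index_items(["hvd.32044092647320"], lambda query, params: None)
```

With this arrangement, switching the level to `logging.DEBUG` restores the per-query output without any code change.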

Contributor Author

liseli commented Jun 10, 2024

> I think the changes/edits look good for now. Eventually, we will want to think about what we log at each level - e.g. at info level we may only want to log the items indexed and how long it took, and be more verbose with things like the mysql queries at the debug level.

Thanks for this comment. I've added it to my notes. I will likely review the logs when I start implementing the monitoring with Prometheus.

@liseli liseli merged commit 4f38b7f into main Jun 10, 2024
1 check passed
@liseli liseli deleted the DEV-1216-mysql_query_bug branch June 10, 2024 16:03