Indexer service use Solr authentication to index documents & unites c… #24

Merged
merged 1 commit into from
Sep 26, 2024

Conversation

liseli
Contributor

@liseli liseli commented Sep 19, 2024

This PR is about this task, which aims to run the ht-indexer in Kubernetes (ICTC) using the new Solr cluster.

What changed?

  • The indexer service defines SOLR_USER and SOLR_PASSWORD to authenticate to the Solr server and index documents.
  • Unit tests changed to avoid Solr authentication (we need this only for development on a local machine).
  • The script init_ht_indexer.sh has been created to run all the services and tests.
  • In the docker-compose file, the Solr service has been updated to use the new lss_solr_config image, which requires authentication. The service also creates the collection.
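
As a rough illustration of the first item, the sketch below shows one way a service could read SOLR_USER and SOLR_PASSWORD from the environment and attach them to a Solr update request. It assumes Solr is configured for HTTP Basic authentication, which this PR does not state explicitly. The function names `solr_auth_header` and `build_update_request` are hypothetical and are not taken from ht_indexer; only the two environment variable names and the warning text come from this PR.

```python
import base64
import json
import os
import urllib.request
import warnings


def solr_auth_header(env=None):
    """Return a Basic-auth header dict, or {} when credentials are not set.

    Assumption: Solr uses HTTP Basic authentication.
    """
    env = os.environ if env is None else env
    user = env.get("SOLR_USER")
    password = env.get("SOLR_PASSWORD")
    if not user or not password:
        # Mirrors the warning this PR says can be ignored in local test runs.
        warnings.warn("SOLR_USER and SOLR_PASSWORD environment variables are not set")
        return {}
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {token}"}


def build_update_request(solr_url, collection, docs, env=None):
    """Build (but do not send) a JSON update request for a Solr collection."""
    headers = {"Content-Type": "application/json"}
    headers.update(solr_auth_header(env))
    return urllib.request.Request(
        url=f"{solr_url.rstrip('/')}/solr/{collection}/update?commit=true",
        data=json.dumps(docs).encode("utf-8"),
        headers=headers,
        method="POST",
    )
```

Passing the returned request to `urllib.request.urlopen` would send the documents. The Solr URL and collection name are supplied by the caller, so nothing here depends on a specific cluster.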

These changes have been successfully tested in Kubernetes, as the following images show. I manually updated the ht-indexer image in the ht-tanka repository and defined SOLR_USER and SOLR_PASSWORD in the terminal.

image

How to test this PR

git clone git@github.com:hathitrust/ht_indexer.git
cd ht_indexer
git checkout DEV-1349-solrAuth
  • Run the script to create the image, start all the services, and run the tests:
./init_ht_indexer.sh

Note: For these tests, you can ignore the following Warning: the SOLR_USER and SOLR_PASSWORD environment variables are not set.

To approve this PR, check that all the tests pass in your terminal.

document_retriever: 25 passed
document_generator: 29 passed
document_indexer: 12 passed

Member

@aelkiss aelkiss left a comment


I haven't tested locally (which someone should), but all the changes in the Docker setup and tests look reasonable to me.


1. Retriever service

`docker compose up document_retriever -d`
Member


Could we define dependencies or profiles in docker compose to avoid needing to manually start all 3?

Contributor Author


I created profiles for each service, and I also created separate services in the docker-compose file to group and independently run each service's tests, but with this setup the tests started to fail in GitHub Actions. For that reason, I abandoned it. I can open a GitHub issue to try it again in the future with K'Ron's support.

@liseli liseli requested a review from carylwyatt September 25, 2024 17:47
Member

@carylwyatt carylwyatt left a comment


I had to run it a few times because I already had some of those ports open to other processes in babel 😂 but they all passed eventually!
Approved! ✅

…hanged to avoid Solr authentication

Profiles were added to the docker-compose file to start and test each service (retriever, generator, indexer) independently, but I removed them because they did not work well with GitHub Actions. The Solr service has been updated to accept authentication and to create the collection.
Update README.md
@liseli liseli merged commit 262e533 into main Sep 26, 2024
1 check passed
@liseli liseli deleted the DEV-1349-solrAuth branch September 26, 2024 15:04