Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: pynvml is pinned when using TRTLLM v13 due to breaking change in 12.0.0 #485

Merged
merged 3 commits into from
Jan 21, 2025

Conversation

terrykong
Copy link
Collaborator

What does this PR do ?

Add a one line overview of what this PR aims to accomplish.

Changelog

  • Please update the CHANGELOG.md under next version with high level changes in this PR.

Usage

  • You can potentially add a usage example below
# Add a code snippet demonstrating how to use this 

Before your PR is "Ready for review"

Pre checks:

Checklist when contributing a new algorithm

  • Does the trainer resume and restore model state all states?
  • Does the trainer support all parallelism techniques(PP, TP, DP)?
  • Does the trainer support max_steps=-1 and validation?
  • Does the trainer only call APIs defined in alignable_interface.py?
  • Does the trainer have proper logging?

Additional Information

  • Related to # (issue)

Copy link
Collaborator

@ko3n1g ko3n1g left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What was the reason for placing this dependency here instead into the requirements.txt?

@terrykong
Copy link
Collaborator Author

What was the reason for placing this dependency here instead into the requirements.txt?

At the moment, we install --no-deps, so it actually wouldn't make a difference, but that's a good point that we can also document this in requirements.txt since we'll eventually rely on that

ko3n1g
ko3n1g previously approved these changes Jan 17, 2025
Copy link
Collaborator

@ko3n1g ko3n1g left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the reminder! SG, I think we’ll anyway be revisiting the dependency management soon

@terrykong
Copy link
Collaborator Author

@ko3n1g Updated requirements.txt + comment

@terrykong terrykong added Run CICD Set + un-set to retrigger and removed Run CICD Set + un-set to retrigger labels Jan 17, 2025
Copy link
Collaborator

@ko3n1g ko3n1g left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this change!

@terrykong terrykong enabled auto-merge (squash) January 21, 2025 17:07
@terrykong terrykong added Run CICD Set + un-set to retrigger and removed Run CICD Set + un-set to retrigger labels Jan 21, 2025
@terrykong terrykong merged commit 5f4f6d6 into main Jan 21, 2025
20 checks passed
@terrykong terrykong deleted the tk/pynvml-fix branch January 21, 2025 17:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Run CICD Set + un-set to retrigger
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants