Issue running parallelformers test script in a VM #23

Open
Mehrad0711 opened this issue Mar 15, 2022 · 1 comment
Labels
bug Something isn't working

Comments


Mehrad0711 commented Mar 15, 2022

How to reproduce

First of all, thanks for this great project!

I'm facing an issue running the test code provided here on Kubernetes.

This is what I'm running inside a Kubeflow pod:

python3 tests/seq2seq_lm.py --test-name=test --name=Helsinki-NLP/opus-mt-en-zh --gpu-from=0 --gpu-to=3 --use-pf

I'm using a g4dn.12xlarge AWS machine with four T4 GPUs.

The pod hangs when executing this line until I manually terminate it.

I suspected this change might have been the culprit, so I ran the same code with v1.2.4 of parallelformers. This time, the pod exits while executing the same line without printing any errors, which is odd.

Notably, if I run the same command without --use-pf, it runs fine.

I saw you've reported some problems using Docker. However, memory should not be an issue here, since I'm using the Helsinki-NLP/opus-mt-en-zh model, which is relatively small.
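Model size and container shared memory are separate limits, so a small model does not by itself rule out a shared-memory problem. The sketch below is one way to check how much shared memory the pod exposes. It assumes the Docker problems mentioned above relate to the size of /dev/shm, which this thread does not confirm; the helper name shared_memory_gib is hypothetical and not part of parallelformers.

```python
import os
import shutil


def shared_memory_gib(path="/dev/shm"):
    """Return (total, free) space of the filesystem at `path`, in GiB."""
    usage = shutil.disk_usage(path)
    gib = 1024 ** 3
    return usage.total / gib, usage.free / gib


if __name__ == "__main__":
    # Assumption: the container mounts its shared memory at /dev/shm.
    shm_path = "/dev/shm"
    if os.path.exists(shm_path):
        total, free = shared_memory_gib(shm_path)
        print(f"{shm_path}: total {total:.2f} GiB, free {free:.2f} GiB")
    else:
        print(f"{shm_path} not found on this system")
```

If the reported total is very small, the pod's shared-memory limit would be worth ruling out before looking elsewhere.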

Has the parallelformers code ever been tested on Kubernetes?
I would also appreciate it if you could look into this issue. Thanks!

Environment

  • OS : Linux
  • Python version : 3.8.3
  • Transformers version : 4.17.0
  • Whether to use Docker: Yes
  • Misc.:
  • branch: main
@Mehrad0711 Mehrad0711 added the bug Something isn't working label Mar 15, 2022
@hyunwoongko
Contributor

Can you try running that inside an if __name__ == '__main__': block?
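For context, a minimal sketch of why the guard matters, using only the standard library. It assumes the worker processes are launched with the "spawn" start method, in which each child re-imports the main module; without the guard, that re-import would try to start workers again. The names square and run_workers are hypothetical stand-ins for the model-parallel setup, not parallelformers APIs.

```python
import multiprocessing as mp


def square(x):
    # Worker function. It is defined at module top level so that
    # spawned child processes can import it by name.
    return x * x


def run_workers(values, num_workers=2):
    # "spawn" starts a fresh interpreter per worker and re-imports this module.
    ctx = mp.get_context("spawn")
    with ctx.Pool(processes=num_workers) as pool:
        return pool.map(square, values)


if __name__ == "__main__":
    # Only the parent process enters this block. Spawned children import the
    # module under a different name, so they skip it and do not launch
    # further workers.
    print(run_workers([1, 2, 3, 4]))
```

In the test script, the equivalent change would be to move the code that builds and parallelizes the model into a function and call it only from this guarded block.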
