model (deeplearning): Camembert baseline #48

Open
ezalos opened this issue Mar 13, 2022 · 2 comments
Comments


ezalos commented Mar 13, 2022

📖 Describe what you want

Make a deep learning baseline with Hugging Face and PyTorch.

Relevant tutorials to follow:

Relevant documentation:

✔️ Definition of done

  1. Being able to train the model and then save its weights: `poetry run python -n src models --model camembert --output-weights models/xxxxx --train-split xxx.csv`
  2. Save the weights to the remote S3 storage: `dvc add models/xxxxx`, then `git add && git commit`, then `dvc push -r s3-remote`
  3. Being able to load the model weights and predict on a given dataset
  4. Create a unit test that trains and predicts from a CSV of 100 examples, which is pushed to GitHub (in `src/tests/test_dataset.csv`)
  5. If necessary, update `.github/workflows/cicd.yaml` and `.42AI/pre-commit.git` so that the new unit test passes
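
A minimal sketch of steps 1 and 3, offered as one possible starting point rather than the project's actual implementation. It assumes the public `camembert-base` checkpoint, a training CSV with hypothetical `text` and `label` columns (labels numbered from 0), and hypothetical function names (`build_parser`, `read_split`, `train_and_save`, `load_and_predict`). The real `src models` entry point is not shown in this issue, so only the three flags named in step 1 are mirrored here.

```python
import argparse
import csv


def build_parser() -> argparse.ArgumentParser:
    """CLI mirroring the flags named in step 1 (hypothetical layout)."""
    parser = argparse.ArgumentParser(prog="models")
    parser.add_argument("--model", default="camembert")
    parser.add_argument("--output-weights", required=True)
    parser.add_argument("--train-split", required=True)
    return parser


def read_split(path):
    """Read a CSV with assumed `text` and `label` columns."""
    with open(path, newline="", encoding="utf-8") as f:
        rows = list(csv.DictReader(f))
    return [r["text"] for r in rows], [int(r["label"]) for r in rows]


def train_and_save(texts, labels, output_dir, epochs=1, batch_size=8):
    """Fine-tune CamemBERT for classification and save weights to output_dir."""
    # Heavy imports stay local so the CLI can be parsed without them installed.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("camembert-base")
    # Assumes labels are integers 0..k-1.
    model = AutoModelForSequenceClassification.from_pretrained(
        "camembert-base", num_labels=len(set(labels))
    )
    optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
    model.train()
    for _ in range(epochs):
        for start in range(0, len(texts), batch_size):
            batch = tokenizer(
                texts[start:start + batch_size],
                padding=True, truncation=True, return_tensors="pt",
            )
            target = torch.tensor(labels[start:start + batch_size])
            loss = model(**batch, labels=target).loss
            loss.backward()
            optimizer.step()
            optimizer.zero_grad()
    model.save_pretrained(output_dir)
    tokenizer.save_pretrained(output_dir)


def load_and_predict(weights_dir, texts):
    """Reload saved weights and return one predicted class id per text."""
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(weights_dir)
    model = AutoModelForSequenceClassification.from_pretrained(weights_dir)
    model.eval()
    with torch.no_grad():
        batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
        logits = model(**batch).logits
    return logits.argmax(dim=-1).tolist()


def main(argv=None):
    args = build_parser().parse_args(argv)
    texts, labels = read_split(args.train_split)
    train_and_save(texts, labels, args.output_weights)
```

Calling `main(["--output-weights", "models/xxxxx", "--train-split", "xxx.csv"])` would run one training epoch and write the weights directory, which could then be tracked with DVC as in step 2. The unit test in step 4 could call `train_and_save` followed by `load_and_predict` on the 100-example CSV and check that one prediction is returned per row.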
ezalos added the fixme label on Mar 13, 2022
madvid self-assigned this on Mar 16, 2022

madvid commented Mar 16, 2022

I am starting the first part. I will use the IMDB dataset if a test with data is necessary.


madvid commented Mar 21, 2022

Interesting library developed by Hugging Face: https://github.com/huggingface/datasets
It is particularly useful if we want to follow the first resource given by ezalos.
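
As a rough illustration of how that library could feed the 100-example test CSV mentioned in the definition of done, here is a small sketch. It assumes the `imdb` dataset from the Hugging Face Hub, which exposes `text` and `label` columns; the helper names `write_sample_csv` and `imdb_records` are hypothetical and not part of the project.

```python
import csv


def write_sample_csv(records, path, n=100):
    """Write up to `n` records (dicts with `text` and `label`) to a CSV file.

    Returns the number of rows actually written.
    """
    count = 0
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["text", "label"])
        writer.writeheader()
        for record in records:
            if count >= n:
                break
            writer.writerow({"text": record["text"], "label": record["label"]})
            count += 1
    return count


def imdb_records(n=100, seed=0):
    """Return `n` shuffled IMDB training examples (downloads the dataset)."""
    # Imported lazily so the CSV helper works without `datasets` installed.
    from datasets import load_dataset

    dataset = load_dataset("imdb", split="train")
    return dataset.shuffle(seed=seed).select(range(n))
```

For example, `write_sample_csv(imdb_records(), "src/tests/test_dataset.csv")` would produce a small fixture file at the path named in the issue.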

ezalos added this to the Models milestone on Mar 28, 2022