This repo contains code for a summarization task performed using the BART model.
The goal of this project was to fine-tune the model on scientific paper abstracts and have it generate paper titles.
This code makes use of the `transformers` Python library and tutorials from the awesome Hugging Face community!
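To show the end result up front, here is a minimal sketch of title generation with the `transformers` summarization pipeline. The checkpoint name `my-user/bart-arxiv-titles` is a placeholder for the fine-tuned model, and the abstract is invented:

```python
# Minimal sketch: generate a paper title from an abstract with a fine-tuned
# BART checkpoint. "my-user/bart-arxiv-titles" is a placeholder -- swap in
# the checkpoint actually produced by the training notebook.
from transformers import pipeline

title_generator = pipeline("summarization", model="my-user/bart-arxiv-titles")

abstract = (
    "We propose a sparse attention mechanism that reduces the quadratic "
    "memory cost of transformers while preserving accuracy on long inputs."
)
result = title_generator(abstract, max_length=32, min_length=4)
print(result[0]["summary_text"])
```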
This repo contains a folder called `src`, inside which you can find four files:

- `prepare_dataset.ipynb`: This notebook contains code for preparing the train, validation and test splits.
- `modeling.ipynb`: You will find code related to model training in this notebook; a minimal training sketch follows this list.
- `inference.ipynb`: This notebook contains code for model inference and hyperparameter tuning using the test set.
- `app.py`: This script contains code for deploying the model using Gradio.
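For orientation, the sketch below shows the general shape of a BART fine-tuning loop with `Seq2SeqTrainer`, assuming a dataset with `abstract` and `title` columns. The file names, base checkpoint and hyperparameters are illustrative placeholders, not the exact values used in `modeling.ipynb`:

```python
# A sketch of seq2seq fine-tuning: abstracts in, titles out. File names,
# checkpoint and hyperparameters below are illustrative placeholders.
from datasets import load_dataset
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

checkpoint = "facebook/bart-base"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

# Placeholder paths -- point these at the splits written by prepare_dataset.ipynb.
raw = load_dataset("json", data_files={"train": "train.json", "validation": "val.json"})

def tokenize(batch):
    # Abstracts are the encoder input; titles are the generation target.
    model_inputs = tokenizer(batch["abstract"], max_length=512, truncation=True)
    labels = tokenizer(text_target=batch["title"], max_length=64, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

tokenized = raw.map(tokenize, batched=True, remove_columns=raw["train"].column_names)

args = Seq2SeqTrainingArguments(
    output_dir="bart-arxiv-titles",
    learning_rate=2e-5,
    num_train_epochs=3,
    predict_with_generate=True,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["validation"],
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```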
You can find the dataset used for modeling here. It consists of all 3 splits: train, validation and test. The original dataset was obtained from Kaggle and it contains metadata for papers sourced from arXiv. For the purposes of this project, only the `abstract`, `title` and `categories` fields were used.
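As a rough illustration of that preparation step, the sketch below loads the Kaggle arXiv metadata file, keeps the three fields, and carves out train/validation/test splits. The file name, split ratios and seed are assumptions, not the exact choices made in `prepare_dataset.ipynb`:

```python
from datasets import load_dataset

# Assumed file name -- the line-delimited JSON from the Kaggle arXiv dataset.
raw = load_dataset("json", data_files="arxiv-metadata-oai-snapshot.json", split="train")

# Keep only the three fields used for modeling.
raw = raw.select_columns(["abstract", "title", "categories"])

# Illustrative 80/10/10 train/validation/test split.
split = raw.train_test_split(test_size=0.2, seed=42)
val_test = split["test"].train_test_split(test_size=0.5, seed=42)
train_ds, val_ds, test_ds = split["train"], val_test["train"], val_test["test"]
```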
Feel free to interact with the model here and use it to generate a title given your abstract! This Hugging Face Space was set up using Gradio.
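For reference, a Gradio app in the spirit of `app.py` can be as small as the sketch below; again, the checkpoint name is a placeholder:

```python
import gradio as gr
from transformers import pipeline

# Placeholder checkpoint -- swap in the fine-tuned model.
title_generator = pipeline("summarization", model="my-user/bart-arxiv-titles")

def generate_title(abstract: str) -> str:
    # The model's "summary" of an abstract is the generated paper title.
    return title_generator(abstract, max_length=32)[0]["summary_text"]

demo = gr.Interface(
    fn=generate_title,
    inputs=gr.Textbox(lines=10, label="Abstract"),
    outputs=gr.Textbox(label="Generated title"),
)

if __name__ == "__main__":
    demo.launch()
```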
This project would not have been possible without the NLP Course on Hugging Face, which taught me so much about transformers!! I can't stress enough how amazing the documentation on Hugging Face is. 🔥 They make it really easy to pick up a model and just get going, with all the tutorials, detailed code snippets and explanations available!
In addition, I found this article on hyperparameters extremely useful when learning about model tuning, and it gave me a good place to start.
This project has gotten me really hyped about the world of LLMs 👀 and I'm excited to explore this area of ML! 🙌