This repo contains code for a summarization task performed using the BART model.
The goal of this project was to fine-tune the model on scientific paper abstracts and have it generate paper titles.
This code makes use of the `transformers` Python library and tutorials from the awesome Hugging Face community!
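To show the end result up front, here is a minimal sketch of title generation with the `transformers` summarization pipeline. The checkpoint name `my-user/bart-arxiv-titles` is a placeholder for the fine-tuned model, and the abstract is invented:

```python
# Minimal sketch: generate a paper title from an abstract with a fine-tuned
# BART checkpoint. "my-user/bart-arxiv-titles" is a placeholder -- swap in
# the checkpoint actually produced by the training notebook.
from transformers import pipeline

title_generator = pipeline("summarization", model="my-user/bart-arxiv-titles")

abstract = (
    "We propose a sparse attention mechanism that reduces the quadratic "
    "memory cost of transformers while preserving accuracy on long inputs."
)
result = title_generator(abstract, max_length=32, min_length=4)
print(result[0]["summary_text"])
```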
This repo contains a folder called `src`, inside which you can find four files:

- `prepare_dataset.ipynb`: This notebook contains code for preparing the train, validation and test splits.
- `modeling.ipynb`: You will find code related to model training in this notebook; a minimal training sketch follows this list.
- `inference.ipynb`: This notebook contains code for model inference and hyperparameter tuning using the test set.
- `app.py`: This script contains code for deploying the model using Gradio.
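For orientation, the sketch below shows the general shape of a BART fine-tuning loop with `Seq2SeqTrainer`, assuming a dataset with `abstract` and `title` columns. The file names, base checkpoint and hyperparameters are illustrative placeholders, not the exact values used in `modeling.ipynb`:

```python
# A sketch of seq2seq fine-tuning: abstracts in, titles out. File names,
# checkpoint and hyperparameters below are illustrative placeholders.
from datasets import load_dataset
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

checkpoint = "facebook/bart-base"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

# Placeholder paths -- point these at the splits written by prepare_dataset.ipynb.
raw = load_dataset("json", data_files={"train": "train.json", "validation": "val.json"})

def tokenize(batch):
    # Abstracts are the encoder input; titles are the generation target.
    model_inputs = tokenizer(batch["abstract"], max_length=512, truncation=True)
    labels = tokenizer(text_target=batch["title"], max_length=64, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

tokenized = raw.map(tokenize, batched=True, remove_columns=raw["train"].column_names)

args = Seq2SeqTrainingArguments(
    output_dir="bart-arxiv-titles",
    learning_rate=2e-5,
    num_train_epochs=3,
    predict_with_generate=True,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["validation"],
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```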
You can find the dataset used for modeling here. It consists of all 3 splits: train, validation and test. The original dataset was obtained from Kaggle and it contains metadata for papers sourced from arXiv. For the purposes of this project, only the `abstract`, `title` and `categories` fields were used.
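As a rough illustration of that preparation step, the sketch below loads the Kaggle arXiv metadata file, keeps the three fields, and carves out train/validation/test splits. The file name, split ratios and seed are assumptions, not the exact choices made in `prepare_dataset.ipynb`:

```python
from datasets import load_dataset

# Assumed file name -- the line-delimited JSON from the Kaggle arXiv dataset.
raw = load_dataset("json", data_files="arxiv-metadata-oai-snapshot.json", split="train")

# Keep only the three fields used for modeling.
raw = raw.select_columns(["abstract", "title", "categories"])

# Illustrative 80/10/10 train/validation/test split.
split = raw.train_test_split(test_size=0.2, seed=42)
val_test = split["test"].train_test_split(test_size=0.5, seed=42)
train_ds, val_ds, test_ds = split["train"], val_test["train"], val_test["test"]
```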
Feel free to interact with the model here and use it to generate a title given your abstract! This Hugging Face Space was set up using Gradio.
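For reference, a Gradio app in the spirit of `app.py` can be as small as the sketch below; again, the checkpoint name is a placeholder:

```python
import gradio as gr
from transformers import pipeline

# Placeholder checkpoint -- swap in the fine-tuned model.
title_generator = pipeline("summarization", model="my-user/bart-arxiv-titles")

def generate_title(abstract: str) -> str:
    # The model's "summary" of an abstract is the generated paper title.
    return title_generator(abstract, max_length=32)[0]["summary_text"]

demo = gr.Interface(
    fn=generate_title,
    inputs=gr.Textbox(lines=10, label="Abstract"),
    outputs=gr.Textbox(label="Generated title"),
)

if __name__ == "__main__":
    demo.launch()
```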
This project would not have been possible without the NLP Course on Hugging Face, which taught me so much about transformers!! I can't stress enough how amazing the documentation on Hugging Face is. 🔥 They make it really easy to pick up a model and just get going, with all the tutorials, detailed code snippets and explanations available!
In addition, I found this article on hyperparameters extremely useful when learning about model tuning, and it gave me a good place to start.
This project has gotten me really hyped about the world of LLMs 👀 and I'm excited to explore this area of ML! 🙌