Official Repository for the Summer Project Lluminating Language offered by BCS

AritraAmbudhDutta/BCS-Lluminating-Language

 
 

Lluminating-Language

Project Objective: To learn Natural Language Processing (NLP) from the ground up and finally build our own custom, open-source, RAG-infused chatbot (we may look at fine-tuning as well).

Week 0

To start with, just to jog your memory, here are some resources to brush up on your basics.

  1. Building a Neural Network from Scratch using only Numpy

  2. Neural Networks from Ground Up

  3. Git :

    1. https://rogerdudler.github.io/git-guide/
    2. https://github.com/firstcontributions/first-contributions
  4. Markdown

  5. LaTeX:

    1. https://www.overleaf.com/learn/latex/Learn_LaTeX_in_30_minutes
    2. https://latex-tutorial.com/
  6. Basic Python:

    1. https://dabeaz-course.github.io/practical-python/
    2. https://automatetheboringstuff.com/

NOTE: Do NOT try to finish these resources in full, or to master every single concept. Get comfortable with things like printing, conditionals, loops, functions and importing libraries, and you should be good to go :)
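As a rough self-check for the Python basics listed in the note, here is a minimal sketch that touches each of them: an import, a conditional, a loop, two functions and printing. The function names `describe_number` and `square_roots` are made up for illustration and do not come from any of the linked resources.

```python
import math  # importing a standard-library module


def describe_number(n):
    """Return a short description of an integer using a conditional."""
    if n % 2 == 0:
        return f"{n} is even"
    return f"{n} is odd"


def square_roots(numbers):
    """Loop over a list and collect square roots rounded to 2 decimals."""
    roots = []
    for value in numbers:
        roots.append(round(math.sqrt(value), 2))
    return roots


print(describe_number(7))
print(square_roots([1, 4, 10]))
```

If reading and modifying a snippet like this feels comfortable, that is probably enough Python to start Week 1.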

Week 1

The first week will be dedicated to understanding the basics of NLP and the tools that we will be using throughout the project.

Task: Build a sentiment analysis model on the IMDB dataset, trying out and comparing different models and techniques.

Resources:

  1. EDA :

    1. https://www.geeksforgeeks.org/what-is-exploratory-data-analysis/
    2. https://www.youtube.com/watch?v=-o3AxdVcUtQ
  2. Pre-Processing :

    1. Tokenization
    2. Stemming and Lemmatization:
      1. https://www.ibm.com/topics/stemming-lemmatization
      2. https://www.youtube.com/watch?v=HHAilAC3cXw
  3. Feature Extraction :

    1. https://www.geeksforgeeks.org/ml-one-hot-encoding/
    2. https://neptune.ai/blog/vectorization-techniques-in-nlp-guide
  4. Model Selection :

    1. https://towardsdatascience.com/top-machine-learning-algorithms-for-classification-2197870ff501
  5. Evaluation:

    1. https://www.analyticsvidhya.com/blog/2021/07/metrics-to-evaluate-your-classification-model-to-take-the-right-decisions/
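To see how the stages in the resource list fit together, here is a minimal sketch of the flow from pre-processing to evaluation using only the Python standard library. It is not the project's reference solution. Everything in it is an assumption made for illustration: the six hand-written training reviews stand in for the IMDB data, bag-of-words counts are used for feature extraction, a multinomial Naive Bayes classifier with add-one smoothing is picked as one possible model, and stemming/lemmatization is left out to keep it short. In the actual task you would most likely rely on libraries for these steps.

```python
import math
import re
from collections import Counter

# Tiny hand-written reviews standing in for the IMDB data (illustrative only).
TRAIN = [
    ("a wonderful film with great acting", "pos"),
    ("great story and wonderful music", "pos"),
    ("i loved this great movie", "pos"),
    ("a boring film with terrible acting", "neg"),
    ("terrible story and awful pacing", "neg"),
    ("i hated this boring movie", "neg"),
]
TEST = [
    ("wonderful acting and great music", "pos"),
    ("awful boring story", "neg"),
]


def tokenize(text):
    """Pre-processing: lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train_naive_bayes(examples):
    """Feature extraction + training: per-class bag-of-words counts."""
    word_counts = {}          # label -> Counter of token counts
    class_counts = Counter()  # label -> number of documents
    vocab = set()
    for text, label in examples:
        tokens = tokenize(text)
        word_counts.setdefault(label, Counter()).update(tokens)
        class_counts[label] += 1
        vocab.update(tokens)
    return word_counts, class_counts, vocab


def predict(text, model):
    """Return the label with the highest log-probability (add-one smoothing)."""
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for token in tokenize(text):
            if token not in vocab:
                continue  # skip words never seen during training
            count = word_counts[label][token] + 1
            score += math.log(count / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def evaluate(examples, model, positive="pos"):
    """Evaluation: accuracy, plus precision and recall for the positive class."""
    tp = fp = fn = correct = 0
    for text, gold in examples:
        guess = predict(text, model)
        correct += guess == gold
        tp += guess == positive and gold == positive
        fp += guess == positive and gold != positive
        fn += guess != positive and gold == positive
    return {
        "accuracy": correct / len(examples),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


model = train_naive_bayes(TRAIN)
print(evaluate(TEST, model))
```

The scores on such a tiny toy set say nothing about real performance. The point is only the shape of the pipeline: swap in a different tokenizer, vectorization scheme or classifier at each stage and compare the metrics on held-out data.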

Mentors:

  1. Udbhav Agarwal
  2. Arin Dhariwal
  3. Himanshu Shekhar
  4. Shreya Gupta
