DrFAQ

  • DrFAQ is a plug-and-play question-answering chatbot that can be applied to any organisation's text corpora.
  • It implements an NLP question-answering architecture using spaCy, Hugging Face's BERT language model, Elasticsearch, and the Telegram Bot API, and is hosted on Heroku.

News

  • 4 Mar 2021 - Transfer learning of language models, alongside an evaluation study, is currently in progress.
  • 13 Dec 2019 - Implementation of 4-step question-answering methodology completed.

Objective

  • Given an organisation's corpus of documents, generate a chatbot to enable natural question-answering capabilities.

Methodology

When a question is asked, the following processes are performed:

  1. FAQ Question Matching using spaCy's Similarity - /match
    • From a given list of Frequently Asked Questions (FAQs), the chatbot finds the FAQ most similar to the question asked and returns its answer.
  2. NLP Question Answering using Hugging Face's BERT - /nlp
    • If the question is not similar to any existing FAQ, perform question answering on the knowledge base and return the answer if the model is sufficiently confident in it.
  3. Answer Search using Elasticsearch - /search
    • If the model is not sufficiently confident in its answer, perform a search on the document corpus and return the search results.
  4. Human Intervention
    • If the search results are still not relevant, either prompt a human to add the question-answer pair to the existing list of FAQs, or let the user speak to a human.
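The four steps above form a fallback cascade: each stage runs only if the previous one fails to produce a usable answer. The sketch below illustrates that control flow only and is not DrFAQ's actual code. The function names (`answer`, `match_faq`, `extract_answer`, `search_corpus`), the `Reply` type, and both threshold values are hypothetical, and "no search hits" stands in for "search results not relevant". In a real deployment the three callables would wrap spaCy similarity, a BERT question-answering model, and an Elasticsearch query respectively.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# Hypothetical thresholds; the README does not state the values DrFAQ uses.
MATCH_THRESHOLD = 0.9
QA_THRESHOLD = 0.5


@dataclass
class Reply:
    stage: str  # which step answered: "match", "nlp", "search", or "human"
    text: str


def answer(
    question: str,
    match_faq: Callable[[str], Tuple[Optional[str], float]],
    extract_answer: Callable[[str], Tuple[Optional[str], float]],
    search_corpus: Callable[[str], List[str]],
) -> Reply:
    """Run the four-step cascade, stopping at the first stage that succeeds."""
    # Step 1: return the stored answer of the most similar FAQ, if similar enough.
    faq_answer, similarity = match_faq(question)
    if faq_answer is not None and similarity >= MATCH_THRESHOLD:
        return Reply("match", faq_answer)

    # Step 2: extractive question answering over the knowledge base.
    span, confidence = extract_answer(question)
    if span is not None and confidence >= QA_THRESHOLD:
        return Reply("nlp", span)

    # Step 3: fall back to a document search (assumption: empty means irrelevant).
    hits = search_corpus(question)
    if hits:
        return Reply("search", "\n".join(hits))

    # Step 4: nothing usable was found, so hand over to a human.
    return Reply("human", "No confident answer found; forwarding to a human.")


if __name__ == "__main__":
    # Stub stages: the FAQ match is weak, but the QA stage is confident.
    reply = answer(
        "When does the library open?",
        match_faq=lambda q: ("We are closed on Sundays.", 0.4),
        extract_answer=lambda q: ("9 am", 0.8),
        search_corpus=lambda q: [],
    )
    print(reply.stage, reply.text)
```

Passing the stages in as callables keeps the cascade independent of any particular library, so each stage can be replaced or tested in isolation.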

Research

  • A benchmark study of transfer learning across language models suggests that:
    • If a large and clean QA dataset is available, RoBERTa is the best language model.
    • If only a small, noisy, generated QA dataset is available, MobileBERT is the best language model.
    • If the QA dataset contains many 'Who' questions, RoBERTa should be considered.

Future Work

  • Release DrFAQ as a pip package.
  • Make an interactive demo available.
  • Integrate abstractive question-answering into the methodology.
  • Leverage databases and cloud services.

References