This repository contains the files used to build a text summarizer.
Simply provide a URL to a text you want summarized and a question you want answered, and the model will do the rest.
Example:
Reliance on third-party services:
The primary model is run locally via Ollama (https://ollama.com). The repository uses LangChain to serve the model, and a web interface for the user is provided via Streamlit.
Only a CPU is required: the model is quantized and optimized for CPU, Metal, and CUDA by the awesome team behind ggml.
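To sketch how these pieces fit together, the example below wires LangChain's Ollama integration into a basic load/split/summarize pipeline. It is only an illustrative sketch under assumed names and defaults (WebBaseLoader, a map_reduce summarize chain, Ollama's default address), not a copy of the repository's src/main.py.

```python
# Illustrative sketch only: fetch a page, split it, and summarize it with an
# Ollama-served model via LangChain. Not the repository's actual src/main.py.
from langchain_community.document_loaders import WebBaseLoader
from langchain_community.llms import Ollama
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.chains.summarize import load_summarize_chain

# Assumes a local Ollama server on its default port and the "summarizev2" model.
llm = Ollama(model="summarizev2", base_url="http://localhost:11434")

# Fetch and chunk the page (WebBaseLoader needs beautifulsoup4 installed).
docs = WebBaseLoader("https://en.wikipedia.org/wiki/Francisco_Goya").load()
chunks = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100).split_documents(docs)

# map_reduce summarizes each chunk, then combines the partial summaries.
chain = load_summarize_chain(llm, chain_type="map_reduce")
result = chain.invoke({"input_documents": chunks})
print(result["output_text"])
```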
The following installation steps are required to run the project:
- Clone the repository
- Install the requirements
- Set up Ollama
  - Install Ollama
  - Pull the model "llama2" (7B chat): `ollama pull llama2:latest`
  - Use the provided Modelfile to create a QA model called "summarizev2" (a fine-tuned 7B-parameter model): `ollama create summarizev2 -f ./Modelfile`
- Either run the model via the command line or via the web interface
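As an optional sanity check after these steps, the snippet below asks the local Ollama server which models it has available. It is a hypothetical helper, not part of this repository; it only assumes Ollama's standard default address and its /api/tags endpoint.

```python
# Optional check: verify the local Ollama server is up and "summarizev2" exists.
# Assumes Ollama is listening on its default address, http://localhost:11434.
import requests

resp = requests.get("http://localhost:11434/api/tags", timeout=5)
resp.raise_for_status()

models = [m["name"] for m in resp.json().get("models", [])]
if any(name.startswith("summarizev2") for name in models):
    print("summarizev2 is available:", models)
else:
    print("summarizev2 not found; available models:", models)
```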
Command line: navigate to the repository folder and type the following in the command line.
Example usage: `python -m src.main --url="https://en.wikipedia.org/wiki/Francisco_Goya" --question="Who was Goya?"`
Arguments
Use `python main.py --help` to see the full list of arguments:
- `--url` - the URL of the text to summarize
- `--question` - the question to ask the model
- `--model` - the model to use for summarization
- `--base-url` - the base URL for Ollama
- `--verbose` - if True, print out debug information
- `--chunk-size` - size of the chunks to split the text into
- `--embedding-model` - embedding model to use
- `--retriever` - retriever to use
- `--device` - device to use
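For illustration, the flags above could be declared with argparse roughly as follows; the defaults and help strings here are assumptions, not the repository's actual src/main.py.

```python
# Hypothetical sketch of the CLI surface described above (not the actual src/main.py).
import argparse

parser = argparse.ArgumentParser(description="Summarize a web page and answer a question about it.")
parser.add_argument("--url", required=True, help="the URL of the text to summarize")
parser.add_argument("--question", required=True, help="the question to ask the model")
parser.add_argument("--model", default="summarizev2", help="the model to use for summarization")
parser.add_argument("--base-url", default="http://localhost:11434", help="the base URL for Ollama")
parser.add_argument("--verbose", action="store_true", help="print out debug information")
parser.add_argument("--chunk-size", type=int, default=1000, help="size of the chunks to split the text into")
parser.add_argument("--embedding-model", default="llama2:latest", help="embedding model to use")
parser.add_argument("--retriever", default="similarity", help="retriever to use")
parser.add_argument("--device", default="cpu", help="device to use")

args = parser.parse_args()
if args.verbose:
    print(args)
```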
Web interface: locate the file streamlit_app.py and run it via the command line.
Example usage: `streamlit run streamlit_app.py`. This should open a web interface in your browser; it should look similar to the image provided above.
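For reference, a minimal Streamlit front end for this kind of pipeline looks roughly like the sketch below; the widgets and the direct call to the model are assumptions, not the repository's actual streamlit_app.py.

```python
# Hypothetical sketch of a Streamlit front end for the summarizer
# (not the repository's actual streamlit_app.py).
import streamlit as st
from langchain_community.llms import Ollama

st.title("Text Summarizer")

url = st.text_input("URL of the text to summarize")
question = st.text_input("Question to ask the model")

if st.button("Summarize") and url and question:
    with st.spinner("Querying the model..."):
        # Assumes a local Ollama server with the "summarizev2" model available.
        llm = Ollama(model="summarizev2", base_url="http://localhost:11434")
        # In the real app the URL would be fetched, split, and retrieved over;
        # here the question is passed straight to the model as a placeholder.
        answer = llm.invoke(f"{question} (source: {url})")
    st.write(answer)
```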