baritone

Oh Grandma! What a deepVoice you have

A ready-to-use voice-to-text wrapper library built on a pretrained version of Mozilla and Baidu's DeepSpeech architecture. I made this to cut down the manual labour needed to run these models in personal projects, and wanted it to be easy enough that anyone can download it and start using it within a few minutes.

[Work in progress as of 2/4/20]

What I plan on doing (and have done):

Direct YouTube video-to-text support
Local MP3, MP4, WAV and M4A files supported
Ultimate caching using audio fingerprinting, so that if the system has heard something before, it doesn't have to go through the whole process again and just retrieves the result from the DB. (Thanks, Dejavu)
Automatic download and setup of pretrained model
Real-time audio stream compatible
A no-bullshit library that you can just import and run state-of-the-art voice to text in, without worrying about the hassle of file conversions, downloads, pretrained/training models, etc.
Dockerfile included
Support for pip
... and more features that I'll think of while making this over the next few weeks
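
The caching idea in the list above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in this repo: `fingerprint`, `TranscriptCache` and `transcribe_with_cache` are hypothetical names, and a plain SHA-256 hash of the audio bytes stands in for Dejavu's acoustic fingerprinting, so it only recognises byte-identical audio.

```python
import hashlib
import sqlite3


def fingerprint(audio_bytes):
    """Stand-in fingerprint: SHA-256 of the raw bytes.

    Assumption: this is NOT Dejavu's algorithm. It only matches
    byte-identical audio, which is enough to show the cache flow.
    """
    return hashlib.sha256(audio_bytes).hexdigest()


class TranscriptCache:
    """Hypothetical cache mapping an audio fingerprint to its transcript."""

    def __init__(self, db_path=":memory:"):
        self.conn = sqlite3.connect(db_path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS transcripts ("
            "fingerprint TEXT PRIMARY KEY, text TEXT NOT NULL)"
        )

    def get(self, fp):
        """Return the stored transcript, or None if this audio is new."""
        row = self.conn.execute(
            "SELECT text FROM transcripts WHERE fingerprint = ?", (fp,)
        ).fetchone()
        return row[0] if row else None

    def put(self, fp, text):
        """Store (or overwrite) the transcript for a fingerprint."""
        self.conn.execute(
            "INSERT OR REPLACE INTO transcripts (fingerprint, text) VALUES (?, ?)",
            (fp, text),
        )
        self.conn.commit()


def transcribe_with_cache(audio_bytes, transcribe, cache):
    """Run the expensive `transcribe` callable only for unseen audio."""
    fp = fingerprint(audio_bytes)
    cached = cache.get(fp)
    if cached is not None:
        return cached  # heard before: skip the model entirely
    text = transcribe(audio_bytes)
    cache.put(fp, text)
    return text
```

Here `transcribe` is whatever callable actually runs the model. Passing it in keeps the cache logic independent of how the model is loaded.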

Feel free to:

Add support for popular podcast platforms
Add more file type compatibility
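
For anyone adding file types, here is a rough sketch of what the conversion step could look like. It assumes `ffmpeg` is installed and on the PATH, and that the model wants 16 kHz, mono, 16-bit PCM WAV input (the format the released Mozilla DeepSpeech English models were trained on). `SUPPORTED`, `build_ffmpeg_command` and `convert_to_wav` are hypothetical names, and the repo may well handle conversion differently.

```python
import subprocess
from pathlib import Path

# Extensions taken from the feature list; extend this set to add formats.
SUPPORTED = {".mp3", ".mp4", ".wav", ".m4a"}


def build_ffmpeg_command(src, dst, sample_rate=16000):
    """Build the ffmpeg argument list for a 16-bit mono WAV conversion."""
    src_path = Path(src)
    if src_path.suffix.lower() not in SUPPORTED:
        raise ValueError(f"Unsupported file type: {src_path.suffix!r}")
    return [
        "ffmpeg",
        "-y",                    # overwrite the output file if it exists
        "-i", str(src_path),     # input file
        "-vn",                   # drop any video stream (e.g. from MP4)
        "-ac", "1",              # downmix to mono
        "-ar", str(sample_rate), # resample to the target rate
        "-acodec", "pcm_s16le",  # 16-bit little-endian PCM
        str(dst),
    ]


def convert_to_wav(src, dst):
    """Run the conversion; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(build_ffmpeg_command(src, dst), check=True)
```

Supporting a new format would then mostly be a matter of adding its extension to `SUPPORTED`, as long as ffmpeg can decode it.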


ramrathi/baritone
