PDF/Document to Speech Converter

This project is a web-based application built with Flask that converts text from PDF, DOCX, and TXT files to speech using Google Text-to-Speech (gTTS). It also supports language detection and translation, allowing users to select their desired output language for the audio.

Features

File Upload Support: Upload PDF, DOCX, and TXT files.
Language Detection: Automatically detects the language of the text.
Text Translation: Translates the text when the detected language differs from the selected output language.
Text-to-Speech (TTS): Converts the uploaded text to speech using gTTS.
Multiple Languages: Supports multiple languages for text-to-speech conversion.
Downloadable MP3: After the conversion, users can download the combined MP3 file.
Chunk-Based Processing: Large texts are split into chunks that are converted separately and then merged into a single MP3.
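
As a rough illustration of the chunk-based processing listed above, the sketch below splits text on whitespace into size-limited chunks and converts them in a thread pool while keeping their original order. The function names (split_into_chunks, convert_chunks, gtts_synthesize), the 500-character limit, and the worker count are assumptions made for this example; they are not taken from the project's code.

```python
import io
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def split_into_chunks(text: str, max_chars: int = 500) -> List[str]:
    """Split text on whitespace into chunks of at most max_chars characters.

    A single word longer than max_chars is kept whole in its own chunk.
    """
    chunks: List[str] = []
    current = ""
    for word in text.split():
        candidate = f"{current} {word}" if current else word
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks


def convert_chunks(
    chunks: List[str],
    synthesize: Callable[[str], bytes],
    max_workers: int = 4,
) -> list:
    """Run synthesize on every chunk in a thread pool.

    ThreadPoolExecutor.map returns results in input order, so the audio
    pieces can be merged in the same order as the text.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(synthesize, chunks))


def gtts_synthesize(chunk: str, lang: str = "en") -> bytes:
    """Hypothetical synthesize callable: one chunk to MP3 bytes via gTTS."""
    from gtts import gTTS  # third-party; imported lazily, needs network access

    buffer = io.BytesIO()
    gTTS(text=chunk, lang=lang).write_to_fp(buffer)
    return buffer.getvalue()
```

With gTTS installed, convert_chunks(split_into_chunks(text), gtts_synthesize) would return one MP3 byte string per chunk, in text order, ready to be merged.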

Supported File Formats

PDF
DOCX
TXT

Supported Languages

The application supports the following languages for speech synthesis:

Afrikaans (af)
Arabic (ar)
Bengali (bn)
Bosnian (bs)
Catalan (ca)
Czech (cs)
Welsh (cy)
Danish (da)
German (de)
Greek (el)
English (en)
Esperanto (eo)
Spanish (es)
Estonian (et)
Finnish (fi)
French (fr)
Gujarati (gu)
Hindi (hi)
Croatian (hr)
Hungarian (hu)
Armenian (hy)
Indonesian (id)
Icelandic (is)
Italian (it)
Japanese (ja)
Javanese (jw)
Khmer (km)
Kannada (kn)
Korean (ko)
Latin (la)
Latvian (lv)
Macedonian (mk)
Malayalam (ml)
Marathi (mr)
Burmese (my)
Nepali (ne)
Dutch (nl)
Norwegian (no)
Polish (pl)
Portuguese (pt)
Romanian (ro)
Russian (ru)
Sinhala (si)
Slovak (sk)
Albanian (sq)
Serbian (sr)
Sundanese (su)
Swedish (sv)
Swahili (sw)
Tamil (ta)
Telugu (te)
Thai (th)
Filipino (tl)
Turkish (tr)
Ukrainian (uk)
Urdu (ur)
Vietnamese (vi)
Chinese (zh-CN)

Technology Stack

Backend: Flask, Python, gTTS, PyMuPDF (for PDF processing), python-docx (for DOCX processing)
Frontend: HTML, CSS, JavaScript, Bootstrap
Other Libraries:
- pydub: For audio manipulation and merging.
- concurrent.futures: For multi-threaded text-to-speech processing.
- langdetect: For detecting the language of the uploaded text.
- googletrans: For translating text to the desired language.
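
The detect-then-translate step could look roughly like the sketch below. It is an assumption about how the pieces fit together, not the project's actual code: prepare_text, same_language, and detect_with_langdetect are hypothetical names, and the translate callable is left abstract because the googletrans call style may differ between releases. Codes are compared case-insensitively on the assumption that the detector may report a code in different casing than the one listed here (for example zh-cn versus zh-CN).

```python
from typing import Callable


def same_language(code_a: str, code_b: str) -> bool:
    """Compare two language codes, ignoring case and surrounding spaces."""
    return code_a.strip().lower() == code_b.strip().lower()


def prepare_text(
    text: str,
    target_lang: str,
    detect: Callable[[str], str],
    translate: Callable[[str, str, str], str],
) -> str:
    """Return text in target_lang, translating only when needed.

    detect(text) must return a language code; translate(text, src, dest)
    must return the translated string. Both are injected by the caller.
    """
    detected = detect(text)
    if same_language(detected, target_lang):
        return text  # already in the requested language, skip translation
    return translate(text, detected, target_lang)


def detect_with_langdetect(text: str) -> str:
    """Hypothetical detect callable backed by the langdetect package."""
    from langdetect import detect  # third-party; imported lazily

    return detect(text)
```

Passing the detector and translator as arguments keeps the decision logic testable without network access.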

FFmpeg Installation

FFmpeg is required for audio manipulation in this project. Install it as follows, depending on your operating system:

Windows

Download FFmpeg from the official FFmpeg Downloads page.
Add the folder containing the FFmpeg executable to your system's PATH environment variable.

macOS

brew install ffmpeg
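
To confirm that the installation worked, a quick check like the following can be run before starting the application. This is a minimal sketch using only the Python standard library; find_ffmpeg is a hypothetical helper name, not part of the project.

```python
import shutil
from typing import Optional


def find_ffmpeg(executable: str = "ffmpeg") -> Optional[str]:
    """Return the full path of the FFmpeg executable, or None if it is not on PATH."""
    return shutil.which(executable)


if __name__ == "__main__":
    path = find_ffmpeg()
    if path is None:
        print("FFmpeg was not found on PATH; audio merging will likely fail.")
    else:
        print(f"FFmpeg found at: {path}")
```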
