Emotibridge is an advanced video processing application designed to enhance video content by removing background noise, transcribing and translating audio, and generating speech from translated text. This application leverages state-of-the-art machine learning models and various Python libraries to deliver high-quality results.
- Background Noise Removal: Clean audio by removing unwanted background noise.
- Transcription and Translation: Convert audio to text and translate it into multiple languages.
- Speech Generation: Generate speech from translated text.
- User Interface: Interactive UI with progress indicators and text editing capabilities.
- Programming Language: Python
- Libraries and Frameworks: PyTorch, TTS, pydub, SpeechRecognition
- GUI Framework: Tkinter
- Audio Processing: pydub
- Machine Learning: PyTorch for model training and inference
To get started with Emotibridge, follow these steps:

- Clone the repository:

  ```bash
  git clone https://github.com/Akshitkt001/Emotibridge.git
  cd Emotibridge
  ```

- Set up a virtual environment:

  ```bash
  python -m venv venv
  source venv/bin/activate  # On Windows use `venv\Scripts\activate`
  ```

- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- Run the application:

  ```bash
  python main.py
  ```
Configuration: Update the configuration files as needed to set paths for models and other resources. Make sure to adjust any settings specific to your environment.
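The repository does not document a concrete configuration format, so as one possible illustration, a minimal Python settings module could centralize the model and output paths. Every name and path below is a hypothetical example, not part of the actual project:

```python
# config.py -- hypothetical settings module; all names and paths below are
# illustrative assumptions, not files shipped with the repository.

MODEL_DIR = "models"  # directory holding downloaded model weights

TTS_MODEL_PATH = f"{MODEL_DIR}/tts_model.pth"                # hypothetical filename
TRANSLATION_MODEL_PATH = f"{MODEL_DIR}/translation_model.pt"  # hypothetical filename

OUTPUT_DIR = "output"  # where processed videos and audio are written
```

Keeping such paths in one module makes it easy to adjust them per environment without touching the processing code.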
Launch the application by running the `main.py` script. The GUI will present the following screens:
- Main Screen: Displays the application title and a "Take me to app" button.
- Processing Screen: Allows users to input video files, select languages, and start processing.
- Output Screen: Displays the final processed video along with editable translated text.
Upload your video file using the file input section.
Choose the input and target languages from the dropdown menus.
Click the "Process" button to start the background noise removal, transcription, translation, and speech generation.
After processing, review the translated text and generated speech, and view the final video output.
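The four processing stages above run in a fixed order once "Process" is clicked. As a rough sketch of that orchestration (the stage callables here are placeholders, not the project's actual functions), the chaining could look like:

```python
from typing import Callable, Tuple

def run_pipeline(
    video_path: str,
    input_language: str,
    target_language: str,
    denoise: Callable[[str], str],
    transcribe: Callable[[str, str], str],
    translate: Callable[[str, str, str], str],
    synthesize: Callable[[str, str], str],
) -> Tuple[str, str]:
    """Chain the stages in the order the "Process" button triggers them.

    Each stage is injected as a callable so this sketch stays independent
    of any particular noise-removal, ASR, translation, or TTS backend.
    """
    clean_audio = denoise(video_path)                       # background noise removal
    text = transcribe(clean_audio, input_language)          # speech-to-text
    translated = translate(text, input_language, target_language)
    speech_path = synthesize(translated, target_language)   # text-to-speech
    return translated, speech_path
```

Returning the translated text alongside the generated speech path is what lets the Output Screen show editable text next to the final result.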
- /process-video: Processes the uploaded video, performs transcription, translation, and speech generation.
- Method: POST
- Parameters:
  - `video_file`: The video file to be processed.
  - `input_language`: Language of the original video audio.
  - `target_language`: Language to translate the audio into.
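As a hedged illustration of calling this endpoint from Python with `requests`, assuming the service runs at the default local FastAPI/uvicorn address (the URL and the JSON response shape are assumptions, not documented by the project):

```python
import requests

API_URL = "http://localhost:8000/process-video"  # assumed default local address

def build_payload(input_language: str, target_language: str) -> dict:
    # Form fields named in the parameter list above.
    return {"input_language": input_language, "target_language": target_language}

def process_video(video_path: str, input_language: str, target_language: str) -> dict:
    """Upload a video for processing and return the server's JSON response."""
    with open(video_path, "rb") as f:
        resp = requests.post(
            API_URL,
            files={"video_file": f},
            data=build_payload(input_language, target_language),
        )
    resp.raise_for_status()
    return resp.json()
```

For example, `process_video("clip.mp4", "en", "hi")` would submit `clip.mp4` for English-to-Hindi processing, assuming the server is running locally.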
For detailed API usage examples, refer to the FastAPI documentation.
- TTS Model: Link to TTS model
- Translation Model: Link to translation model
- FastAPI Documentation
- PyTorch Documentation
- pydub Documentation
- SpeechRecognition Documentation
- Tkinter Documentation
Contributions to Emotibridge are welcome! If you find a bug or want to add a new feature, please follow these steps:
- Fork the repository.
- Create a new branch (e.g., `git checkout -b feature/your-feature`).
- Make your changes.
- Commit your changes (e.g., `git commit -am 'Add new feature'`).
- Push to the branch (e.g., `git push origin feature/your-feature`).
- Create a new Pull Request.
This project is licensed under the Apache License - see the LICENSE file for details.
For any inquiries or issues, please contact:
- Akshit Kumar Tiwari - GitHub Profile
- Email: [email protected]