From dfaf35acae9068c305fa95bea45f512ff15b25bc Mon Sep 17 00:00:00 2001
From: Chris Carini <6374067+ChrisCarini@users.noreply.github.com>
Date: Thu, 4 Apr 2024 18:05:33 -0700
Subject: [PATCH 1/2] fix typos

---
 README.md | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/README.md b/README.md
index d15654d..a910354 100644
--- a/README.md
+++ b/README.md
@@ -24,11 +24,11 @@ To clone or edit an unseen voice, VoiceCraft needs only a few seconds of referen
 ## How to run TTS inference
 There are two ways:
 1. with docker. see [quickstart](#quickstart)
-2. without docker. see [envrionment setup](#environment-setup)
+2. without docker. see [environment setup](#environment-setup)
 
-When you are inside the docker image or you have installed all dependencies, Checkout [`inference_tts.ipynb`](./inference_tts.ipynb).
+When you are inside the docker image, or you have installed all dependencies, checkout [`inference_tts.ipynb`](./inference_tts.ipynb).
 
-If you want to do model development such as training/finetuning, I recommend following [envrionment setup](#environment-setup) and [training](#training).
+If you want to do model development such as training/finetuning, I recommend following [environment setup](#environment-setup) and [training](#training).
 
 ## QuickStart :star:
 To try out TTS inference with VoiceCraft, the best way is using docker. Thank [@ubergarm](https://github.com/ubergarm) and [@jayc88](https://github.com/jay-c88) for making this happen.
@@ -119,7 +119,7 @@ python phonemize_encodec_encode_hf.py \
   --batch_size 32 \
   --max_len 30000
 ```
-where encodec_model_path is avaliable [here](https://huggingface.co/pyp1/VoiceCraft). This model is trained on Gigaspeech XL, it has 56M parameters, 4 codebooks, each codebook has 2048 codes. Details are described in our [paper](https://jasonppy.github.io/assets/pdfs/VoiceCraft.pdf). If you encounter OOM during extraction, try decrease the batch_size and/or max_len.
+where encodec_model_path is available [here](https://huggingface.co/pyp1/VoiceCraft). This model is trained on Gigaspeech XL, it has 56M parameters, 4 codebooks, each codebook has 2048 codes. Details are described in our [paper](https://jasonppy.github.io/assets/pdfs/VoiceCraft.pdf). If you encounter OOM during extraction, try decrease the batch_size and/or max_len.
 The extracted codes, phonemes, and vocab.txt will be stored at `path/to/store_extracted_codes_and_phonemes/${dataset_size}/{encodec_16khz_4codebooks,phonemes,vocab.txt}`.
 
 As for manifest, please download train.txt and validation.txt from [here](https://huggingface.co/datasets/pyp1/VoiceCraft_RealEdit/tree/main), and put them under `path/to/store_extracted_codes_and_phonemes/manifest/`. Please also download vocab.txt from [here](https://huggingface.co/datasets/pyp1/VoiceCraft_RealEdit/tree/main) if you want to use our pretrained VoiceCraft model (so that the phoneme-to-token matching is the same).
@@ -160,4 +160,3 @@ We thank Feiteng for his [VALL-E reproduction](https://github.com/lifeiteng/vall
 
 ## Disclaimer
 Any organization or individual is prohibited from using any technology mentioned in this paper to generate or edit someone's speech without his/her consent, including but not limited to government leaders, political figures, and celebrities. If you do not comply with this item, you could be in violation of copyright laws.
-

From 7ff895f79b68ba7fd03358b37486598cd5cdacc7 Mon Sep 17 00:00:00 2001
From: Chris Carini <6374067+ChrisCarini@users.noreply.github.com>
Date: Sat, 1 Jun 2024 03:18:00 -0700
Subject: [PATCH 2/2] Update README.md

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 6e49afc..76553cc 100644
--- a/README.md
+++ b/README.md
@@ -18,7 +18,7 @@ see [quickstart command line](#quickstart-command-line).
 When you are inside the docker image or you have installed all dependencies, Checkout [`inference_tts.ipynb`](./inference_tts.ipynb).
 
-If you want to do model development such as training/finetuning, I recommend following [envrionment setup](#environment-setup) and [training](#training).
+If you want to do model development such as training/finetuning, I recommend following [environment setup](#environment-setup) and [training](#training).
 
 ## News
 :star: 04/22/2024: 330M/830M TTS Enhanced Models are up [here](https://huggingface.co/pyp1), load them through [`gradio_app.py`](./gradio_app.py) or [`inference_tts.ipynb`](./inference_tts.ipynb)!
 Replicate demo is up, major thanks to [@chenxwh](https://github.com/chenxwh)!
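Editor's note, outside the patches above: the README context in the first patch advises lowering `--batch_size` and/or `--max_len` when the encodec extraction hits OOM. A minimal sketch of that retry-with-backoff idea is below. It is an illustration only, not part of either patch; `extract_with_backoff` and the `run` callable are hypothetical names standing in for an invocation of `phonemize_encodec_encode_hf.py`.

```python
def extract_with_backoff(run, batch_size=32, max_len=30000):
    """Retry `run(batch_size, max_len)` with halved batch_size until it succeeds.

    `run` is a hypothetical callable wrapping the extraction script; it should
    return True on success and False on failure (e.g. CUDA out-of-memory).
    Returns the batch size that finally fit in memory.
    """
    while batch_size >= 1:
        if run(batch_size, max_len):
            return batch_size
        # OOM (or other failure): halve the batch size and try again,
        # mirroring the README's "decrease the batch_size" advice.
        batch_size //= 2
    raise MemoryError("extraction failed even at batch_size=1; try lowering max_len")
```

Under the same assumptions, one could further lower `max_len` once `batch_size` bottoms out; the README suggests adjusting either knob.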