Update to tensorflow 2 & numpy & others #8

Open
YaraAlkaka opened this issue Apr 18, 2024 · 2 comments
Comments

@YaraAlkaka
To run the Colab code successfully:

  1. Run the first cell only.
  2. Go to /content/tacotron_pytorch/hparams.py and change it to this:
import tensorflow as tf
import types

# Default hyperparameters:
hparams_dict = {
    # Comma-separated list of cleaners to run on text prior to training and eval. For non-English
    # text, you may want to use "basic_cleaners" or "transliteration_cleaners". See TRAINING_DATA.md.
    'cleaners': 'english_cleaners',
    'use_cmudict': False,  # Use CMUDict during training to learn pronunciation of ARPAbet phonemes

    # Audio:
    'num_mels': 80,
    'num_freq': 1025,
    'sample_rate': 20000,
    'frame_length_ms': 50,
    'frame_shift_ms': 12.5,
    'preemphasis': 0.97,
    'min_level_db': -100,
    'ref_level_db': 20,

    # Model:
    # TODO: add more configurable hparams
    'outputs_per_step': 5,
    'padding_idx': None,
    'use_memory_mask': False,

    # Data loader
    'pin_memory': True,
    'num_workers': 2,

    # Training:
    'batch_size': 32,
    'adam_beta1': 0.9,
    'adam_beta2': 0.999,
    'initial_learning_rate': 0.002,
    'decay_learning_rate': True,
    'nepochs': 1000,
    'weight_decay': 0.0,
    'clip_thresh': 1.0,

    # Save
    'checkpoint_interval': 5000,

    # Eval:
    'max_iters': 200,
    'griffin_lim_iters': 60,
    'power': 1.5,              # Power to raise magnitudes to prior to Griffin-Lim
}
# Convert the dictionary to a namespace
hparams = types.SimpleNamespace(**hparams_dict)


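def hparams_debug_string():
    # SimpleNamespace is neither iterable nor subscriptable, so read its attribute dict.
    values = vars(hparams)
    hp = ['  %s: %s' % (name, values[name]) for name in sorted(values)]
    return 'Hyperparameters:\n' + '\n'.join(hp)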

  3. Go to /content/tacotron_pytorch/lib/tacotron/util/audio.py and change np.complex on line 70 to complex.
  4. Go to /content/pytorch-dc-tts/datasets/emovdb.py and change np.long on line 45 to np.int64.
  5. Go to /content/pytorch-dc-tts/audio.py and change line 61 to:
return librosa.istft(spectrogram, hop_length=hp.hop_length, win_length=hp.win_length, window="hann")

and line 47 to:

est = librosa.stft(X_t, n_fft=hp.n_fft, hop_length=hp.hop_length, win_length=hp.win_length)
  6. Remove %tensorflow_version 1.x from the second cell in the Colab notebook.
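
For anyone wondering how the types.SimpleNamespace object in hparams.py behaves, here is a minimal sketch. The three-entry dictionary is a shortened, illustrative subset of the real one, and the try/except is only there to demonstrate the access patterns. It shows that attribute access works, while dict-style access does not, which is why the debug helper has to go through vars().

```python
import types

# Illustrative subset of the hyperparameters from hparams.py (not the full list).
hparams_dict = {
    'num_mels': 80,
    'sample_rate': 20000,
    'batch_size': 32,
}

# Attribute-style access: hparams.num_mels, hparams.batch_size, ...
hparams = types.SimpleNamespace(**hparams_dict)

# Dict-style access is not supported on a SimpleNamespace.
try:
    hparams['num_mels']
    subscriptable = True
except TypeError:
    subscriptable = False


def hparams_debug_string():
    # vars() returns the namespace's attribute dict, which can be sorted and indexed.
    values = vars(hparams)
    hp = ['  %s: %s' % (name, values[name]) for name in sorted(values)]
    return 'Hyperparameters:\n' + '\n'.join(hp)


print(hparams.num_mels)
print(subscriptable)
print(hparams_debug_string())
```

Because vars() returns the live attribute dict, values changed after creation (for example hparams.batch_size = 16) also show up in the debug string.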

It now works, although the Amused emotion does not sound correct. I'll update this when I fix it.
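
If editing the files by hand is tedious, the two NumPy alias changes (np.complex to complex, np.long to np.int64) could be scripted. This is only a rough sketch: patch_source and patch_file are hypothetical helper names, not part of either repository, and unlike the manual steps they rewrite every occurrence in a file rather than one specific line. The word boundaries keep names such as np.complex64 and np.longlong untouched.

```python
import re

# Replacements taken from the manual steps above; both aliases were removed
# from newer NumPy releases.
REPLACEMENTS = [
    (re.compile(r'\bnp\.complex\b'), 'complex'),
    (re.compile(r'\bnp\.long\b'), 'np.int64'),
]


def patch_source(text):
    """Return text with the removed NumPy aliases replaced."""
    for pattern, replacement in REPLACEMENTS:
        text = pattern.sub(replacement, text)
    return text


def patch_file(path):
    """Rewrite the file at path in place; return True if anything changed."""
    with open(path, encoding='utf-8') as f:
        original = f.read()
    patched = patch_source(original)
    if patched != original:
        with open(path, 'w', encoding='utf-8') as f:
            f.write(patched)
    return patched != original
```

In a Colab cell this would be called as, for example, patch_file('/content/pytorch-dc-tts/datasets/emovdb.py').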

@ruobingli1103
Just wanted to say a huge thanks for sharing this!

@nguyenlamvu123
nguyenlamvu123 commented Sep 24, 2024

Excellent!
Also, don't forget to pip install docopt.
