🚧 Work in Progress 🚧
A speech recognition project in Python using TensorFlow and Keras.
I train the model using a recurrent neural network with a linear output layer. My dataset consists of audio clips, each paired with a sentence. I split each sentence into phonemes and associate each phoneme with its corresponding sound.
The architecture is as follows:
- One convolution with 8 filters (9x9) [ELU activation]
- Max pooling pool_size=[2,2]
- LSTM with 128 units
- Flatten layer
- Dropout: 0.4
- Fully connected: 256 [ELU]
- Dropout: 0.2
- Fully connected: len(vocab) [linear output]
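
The layer list above could be assembled in Keras roughly as follows. This is a minimal sketch, not the project's actual code. It rests on several assumptions that are not stated above: the input is a spectrogram-like tensor of shape `(time_steps, n_features, 1)`, the convolution uses `"same"` padding, a `Reshape` layer bridges the 4-D convolution output to the 3-D input the LSTM expects, and the LSTM returns its full sequence so that the following Flatten layer has something to flatten. The function name `build_model` and the sizes 64, 40 and 50 are illustrative placeholders.

```python
from tensorflow import keras
from tensorflow.keras import layers


def build_model(time_steps, n_features, vocab_size):
    """Assemble the layer stack from the list above (input shape is assumed)."""
    inputs = keras.Input(shape=(time_steps, n_features, 1))
    # One convolution with 8 filters of size 9x9, ELU activation
    x = layers.Conv2D(8, (9, 9), padding="same", activation="elu")(inputs)
    # Max pooling halves both the time and the feature axes
    x = layers.MaxPooling2D(pool_size=(2, 2))(x)
    # Assumption: merge the feature and channel axes so the LSTM gets 3-D input
    x = layers.Reshape((time_steps // 2, (n_features // 2) * 8))(x)
    # Assumption: "128 filters" is read as 128 LSTM units
    x = layers.LSTM(128, return_sequences=True)(x)
    x = layers.Flatten()(x)
    x = layers.Dropout(0.4)(x)
    x = layers.Dense(256, activation="elu")(x)
    x = layers.Dropout(0.2)(x)
    # No activation on the last layer, which gives the linear output
    outputs = layers.Dense(vocab_size)(x)
    return keras.Model(inputs, outputs)


# Placeholder sizes: 64 frames, 40 features per frame, 50 vocabulary entries
model = build_model(time_steps=64, n_features=40, vocab_size=50)
```

Calling `model.summary()` prints the shape after each layer, which is a quick way to check that the reshape matches the real input dimensions.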