•Trained a DS-CNN model in TensorFlow to recognize keywords in human speech with 91% accuracy, then quantized it to an 8-bit fixed-point model to lower computation cost on mobile devices.
•Improved RNN speech-dereverberation performance by 17% through data augmentation, generating new reverberant audio from 100+ simulated impulse responses.
•Implemented, with a team, a large-scale GMM-UBM-based speaker recognition system on both a Linux microcomputer and an Android phone, achieving 83% accuracy, and eliminated audio reverberation using an RNN.
•Created an algorithm in C that packs every 4 weights of a ternary neural network into 1 byte and unpacks them during inference, reducing model size by 75%.
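
The ternary weight packing in the last bullet can be illustrated with a minimal sketch. It is not the original implementation. It assumes weights take the values -1, 0, and +1, that the uncompressed baseline stores one weight per byte (which is what makes 4-into-1 packing a 75% reduction), and that each weight is given a 2-bit code. The code assignment and the function names `pack_ternary` and `unpack_ternary` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed 2-bit codes (hypothetical choice): 0 -> 0, +1 -> 1, -1 -> 2; code 3 is unused. */
static uint8_t encode_ternary(int8_t w) {
    assert(w >= -1 && w <= 1);          /* only ternary values are valid input */
    return (uint8_t)(w < 0 ? 2 : w);
}

static int8_t decode_ternary(uint8_t code) {
    if (code == 1) return 1;
    if (code == 2) return -1;
    return 0;                           /* code 0, and the unused code 3, decode to 0 */
}

/* Pack n ternary weights, 4 per byte, lowest bits first.
   `packed` must hold at least (n + 3) / 4 bytes. Returns the number of bytes written. */
size_t pack_ternary(const int8_t *weights, size_t n, uint8_t *packed) {
    size_t n_bytes = (n + 3) / 4;
    for (size_t b = 0; b < n_bytes; b++) {
        uint8_t byte = 0;
        for (size_t k = 0; k < 4; k++) {
            size_t i = 4 * b + k;
            if (i < n) {
                byte |= (uint8_t)(encode_ternary(weights[i]) << (2 * k));
            }
        }
        packed[b] = byte;               /* trailing slots of the last byte stay 0 */
    }
    return n_bytes;
}

/* Unpack n weights from `packed`; this could run layer by layer during inference. */
void unpack_ternary(const uint8_t *packed, size_t n, int8_t *weights) {
    for (size_t i = 0; i < n; i++) {
        uint8_t code = (uint8_t)((packed[i / 4] >> (2 * (i % 4))) & 0x3u);
        weights[i] = decode_ternary(code);
    }
}
```

Under these assumptions, the six weights {1, -1, 0, 1, -1, -1} occupy 2 bytes instead of 6. For weight counts that are a multiple of 4, the packed size is exactly one quarter of the one-byte-per-weight baseline.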