
Speech Model Curriculum Design

Designing optimal big-data curricula for large speech models


I worked on this research project at Deepgram, a company building advanced speech-to-text APIs.

Deepgram uses large datasets to train even larger models. While this yields powerful, high-capacity end-to-end speech transcription systems, it is also computationally expensive and limits the pace of experimental R&D. I find that intentionally restricting the training dataset can improve training efficiency by a factor of two to five, and in particular yields substantial absolute gains in the diminishing-returns regime of large-model training.
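The writeup linked below covers the actual selection criteria. Purely as an illustration of the core idea, here is a minimal Python sketch of restricting a training corpus to a fixed fraction of utterances chosen by an arbitrary scoring function; the function names and the duration-based heuristic are hypothetical stand-ins, not Deepgram's method.

```python
import random

def restrict_dataset(samples, keep_fraction=0.4, score_fn=None, seed=0):
    """Return a reduced training set.

    If score_fn is given, keep the highest-scoring samples; otherwise
    subsample uniformly at random.
    """
    rng = random.Random(seed)
    n_keep = max(1, int(len(samples) * keep_fraction))
    if score_fn is None:
        return rng.sample(samples, n_keep)
    return sorted(samples, key=score_fn, reverse=True)[:n_keep]

# Toy usage: keep the 40% of utterances with the longest audio, a
# stand-in for whatever criterion an actual curriculum would use.
corpus = [{"id": i, "duration_s": d} for i, d in enumerate([3.1, 8.2, 1.0, 12.5, 6.7])]
subset = restrict_dataset(corpus, keep_fraction=0.4, score_fn=lambda s: s["duration_s"])
print([s["id"] for s in subset])  # -> [3, 1]
```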

View a writeup here.
