I've really been into testing various ASR, TTS, and other audio-related models. This space showcases the NVIDIA Canary-Qwen 2.5B model. It transcribes audio incredibly fast and pairs the transcription with Qwen so you can ask questions about the transcript.