facebook/wav2vec2-base-960h • Automatic Speech Recognition • 94.4M • Updated • 1.68M • 384
Generate spatial audio from images (and optionally text)
Paper Whisperer