Testing CELT
May. 14th, 2008 08:32 pm
I've been conducting a listening test for a paper on the CELT codec. I've been comparing it to AAC-LD, G.722.1C (aka Siren14) and MP3. Here are the results for the 48 kbit/s MUSHRA test (95% confidence intervals):

And here are the results for the 64 kbit/s MUSHRA test (95% confidence intervals):

Considering that I was just hoping CELT wouldn't be too much worse than these codecs, it's a pleasant surprise. My expectations were modest because the version of CELT I tested had a latency of 8.7 ms, while the latency of AAC-LD was 34.8 ms (I know it's possible to get down to 20 ms, but the Apple implementation doesn't do it), G.722.1C was at 40 ms and MP3 (LAME) was probably well above 100 ms.
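To put those latency figures side by side, here is a small arithmetic sketch. The millisecond values are the ones quoted above; the 44.1 kHz sample rate is purely an assumption for illustration (the post doesn't state the rate used in the test), and MP3 is left out because no exact figure is given for it.

```python
# Latencies quoted in the post, in milliseconds.
latency_ms = {
    "CELT": 8.7,
    "AAC-LD": 34.8,
    "G.722.1C": 40.0,
}

# Assumption: the post does not state the sample rate used in the test.
ASSUMED_SAMPLE_RATE_HZ = 44100


def latency_ratio(codec, reference="CELT"):
    """How many times longer the codec's latency is than the reference's."""
    return latency_ms[codec] / latency_ms[reference]


def latency_in_samples(codec, sample_rate_hz=ASSUMED_SAMPLE_RATE_HZ):
    """Latency converted to a whole number of samples at the given rate."""
    return round(latency_ms[codec] * sample_rate_hz / 1000.0)


for name in latency_ms:
    print(f"{name}: {latency_ratio(name):.1f}x CELT's latency, "
          f"about {latency_in_samples(name)} samples at {ASSUMED_SAMPLE_RATE_HZ} Hz")
```

So AAC-LD carries roughly four times CELT's delay and G.722.1C a bit more than that, whatever the sample rate turns out to be.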
In the graphs above, the error bars don't take into account the fact that the MUSHRA test is paired, so there are more statistically significant results than is apparent. Basically, CELT and AAC-LD come out ahead of both G.722.1C and MP3 in both tests. CELT comes out ahead of AAC-LD at 48 kbit/s and the two are tied (i.e. no statistically significant difference could be observed) at 64 kbit/s.
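To illustrate why pairing matters, here is a minimal sketch with made-up scores (these are NOT the results of the listening test, and `codec_a`/`codec_b` are hypothetical names). Each position in the two lists is the same listener rating the same item. The per-codec 95% intervals overlap because listeners differ a lot from one another, but the interval on the per-listener differences is much tighter.

```python
import math
import statistics

# Two-sided 95% critical value of Student's t for 7 degrees of freedom
# (8 paired scores). Hard-coded because the stdlib has no t distribution.
T_CRIT_95_DF7 = 2.365


def mean_ci(values, t_crit):
    """Return the (low, high) confidence interval of the mean."""
    m = statistics.mean(values)
    half_width = t_crit * statistics.stdev(values) / math.sqrt(len(values))
    return m - half_width, m + half_width


# Hypothetical MUSHRA-style scores, invented for this example only.
codec_a = [80, 62, 71, 90, 55, 68, 77, 84]
codec_b = [75, 58, 68, 84, 52, 62, 73, 79]

# Unpaired view: one interval per codec, as error bars on a graph would show.
ci_a = mean_ci(codec_a, T_CRIT_95_DF7)
ci_b = mean_ci(codec_b, T_CRIT_95_DF7)
intervals_overlap = ci_a[0] <= ci_b[1] and ci_b[0] <= ci_a[1]

# Paired view: one interval on the per-listener differences.
diffs = [a - b for a, b in zip(codec_a, codec_b)]
ci_diff = mean_ci(diffs, T_CRIT_95_DF7)
paired_significant = ci_diff[0] > 0 or ci_diff[1] < 0

print("per-codec intervals overlap:", intervals_overlap)
print("paired difference interval excludes zero:", paired_significant)
```

With these invented numbers the error bars overlap while the paired interval still excludes zero, which is the effect described above.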
Despite those results, I still think CELT can do better. Among the things I'd like to try once I'm done with the paper:
- Add a psycho-acoustic mode and start changing the bit allocation based on the frequency content
- Do lots of tuning
- Do something to prevent time smearing of impulses (not TNS)
- Encoding (or guessing) the spectral tilt in each band
- Better stereo support
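As a rough picture of what "changing the bit allocation based on the frequency content" could mean, here is a toy sketch. It is not how CELT allocates bits, and it is not a psycho-acoustic model; the function name `allocate_bits`, the 60 dB floor and the proportional rule are all assumptions made up for the example. It just splits a fixed bit budget across bands in proportion to how far each band's level sits above a floor.

```python
import math


def allocate_bits(band_energies, total_bits, floor_db=-60.0):
    """Split total_bits across bands in proportion to level above floor_db.

    Hypothetical rule for illustration: bands at or below the floor get
    nothing, louder bands get proportionally more.
    """
    levels = [
        max(0.0, 10.0 * math.log10(max(e, 1e-12)) - floor_db)
        for e in band_energies
    ]
    total_level = sum(levels)
    if total_level == 0.0:
        return [0] * len(band_energies)

    # Ideal (fractional) share for each band, then round down.
    raw = [total_bits * level / total_level for level in levels]
    bits = [int(r) for r in raw]

    # Hand the bits lost to rounding to the largest fractional remainders.
    leftover = total_bits - sum(bits)
    by_remainder = sorted(range(len(raw)),
                          key=lambda i: raw[i] - bits[i],
                          reverse=True)
    for i in by_remainder[:leftover]:
        bits[i] += 1
    return bits


# Made-up band energies: three audible bands and one essentially silent band.
example_energies = [1.0, 0.1, 0.01, 1e-12]
example_allocation = allocate_bits(example_energies, total_bits=96)
print(example_allocation)
```

The whole budget is always spent, and the silent band ends up with no bits.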
no subject
Date: 2008-05-26 11:28 am (UTC)
Are they useless here?

no subject
Date: 2008-05-26 01:14 pm (UTC)
Most wavelets are, by definition, best applied to non-sinusoidal sources. As far as I remember (I touched them a few years ago), there are some wavelet bases that are derived from cosines and may be better suited.