jmvalin: (Default)
[personal profile] jmvalin
I've been conducting a listening test for a paper on the CELT codec. I've been comparing it to AAC-LD, G.722.1C (aka Siren14) and MP3. Here are the results for the 48 kbit/s MUSHRA test (95% confidence intervals):

And here are the results for the 64 kbit/s MUSHRA test (95% confidence intervals):

Considering that I was just hoping CELT wouldn't be too much worse than these codecs, it's a pleasant surprise. My expectations were modest because the version of CELT I tested had a latency of 8.7 ms, while the latency of AAC-LD was 34.8 ms (I know it's possible to get down to 20 ms, but the Apple implementation doesn't do it), G.722.1C was 40 ms and MP3 (LAME) was probably well above 100 ms.

In the graphs above, the error bars don't take into account the fact that the MUSHRA test is paired, so there are more statistically significant differences than is apparent. Basically, CELT and AAC-LD come out ahead of both G.722.1C and MP3 in both tests. CELT comes out ahead of AAC-LD at 48 kbit/s and the two are tied (i.e. no statistically significant difference could be observed) at 64 kbit/s.
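
To illustrate why a paired analysis can reveal differences that per-codec error bars hide, here is a minimal sketch. The scores are made up for illustration and are not the listening test data; the names `codec_a`, `codec_b` and `ci95` are hypothetical. Each listener has a different personal baseline, which widens the per-codec intervals, but the per-listener difference between the two codecs is very consistent.

```python
import math
import statistics

# Hypothetical MUSHRA-style scores (0-100) from 10 listeners.
# These numbers are invented for illustration; they are NOT the test results.
codec_b = [40, 55, 62, 48, 70, 58, 45, 66, 52, 60]
gains = [4, 6, 5, 7, 3, 5, 6, 4, 5, 5]          # per-listener advantage of A over B
codec_a = [b + g for b, g in zip(codec_b, gains)]

# Two-sided 95% Student-t critical value for 9 degrees of freedom (10 listeners).
T_CRIT = 2.262

def ci95(samples):
    """Return the (low, high) 95% confidence interval of the mean."""
    mean = statistics.mean(samples)
    half = T_CRIT * statistics.stdev(samples) / math.sqrt(len(samples))
    return mean - half, mean + half

# Unpaired view: one interval per codec, as in a plain error-bar plot.
ci_a = ci95(codec_a)
ci_b = ci95(codec_b)
bars_overlap = ci_a[0] <= ci_b[1] and ci_b[0] <= ci_a[1]

# Paired view: one interval on the per-listener differences.
ci_diff = ci95([a - b for a, b in zip(codec_a, codec_b)])

print("codec A CI:", ci_a)
print("codec B CI:", ci_b)
print("error bars overlap:", bars_overlap)
print("paired difference CI:", ci_diff)
```

With this synthetic data the two per-codec intervals overlap, yet the interval on the paired differences stays entirely above zero, which is the situation described above.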

Despite those results, I still think CELT can do better. Among the things I'd like to try once I'm done with the paper:
  • Add a psycho-acoustic model and start changing the bit allocation based on the frequency content
  • Do lots of tuning
  • Do something to prevent time smearing of impulses (not TNS)
  • Encode (or guess) the spectral tilt in each band
  • Better stereo support

Date: 2008-05-26 11:28 am (UTC)
From: [identity profile]
What did your experiments with wavelets reveal?
Are they useless here?

Date: 2008-05-26 11:56 am (UTC)
From: [identity profile]
I've only experimented a bit with wavelets but found it next to impossible to get both decent leakage (or lack thereof) and low latency. Not to mention complexity. In theory, the only thing wavelets could improve is sharp transients (e.g. castanets), and it may be possible to still have low latency, but I've no clue how. In any case, since I plan on releasing CELT 1.0 within the next 5 years, wavelets are out.

Date: 2008-05-26 12:34 pm (UTC)
From: [identity profile]
What do you mean by 'leakage' here?

Date: 2008-05-26 12:42 pm (UTC)
From: [identity profile]
By leakage, I mean that when you apply some transform to a sinusoid, you get lots of non-zero coefficients around the strongest peak.
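
A minimal sketch of that effect, assuming a plain DFT with no window (chosen only because it is easy to show; it is not the transform used in CELT). The helper names `dft_magnitudes` and `significant_bins` are hypothetical. A sinusoid that fits a whole number of cycles in the frame lands in a single bin, while one that falls between two bins spreads energy over its neighbours.

```python
import cmath
import math

N = 64  # frame length in samples (arbitrary choice for the illustration)

def dft_magnitudes(signal):
    """Magnitudes of DFT bins 0..N/2 of a real signal (direct O(N^2) sum)."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(signal)))
        for k in range(n // 2 + 1)
    ]

def significant_bins(cycles, threshold=0.1):
    """Count bins whose magnitude exceeds `threshold` times the peak."""
    signal = [math.sin(2 * math.pi * cycles * t / N) for t in range(N)]
    mags = dft_magnitudes(signal)
    peak = max(mags)
    return sum(1 for m in mags if m > threshold * peak)

on_bin = significant_bins(8.0)   # exactly 8 cycles per frame: no leakage
off_bin = significant_bins(8.5)  # halfway between bins 8 and 9: leakage

print("bins above 10% of peak, 8.0 cycles:", on_bin)
print("bins above 10% of peak, 8.5 cycles:", off_bin)
```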

Date: 2008-05-26 01:14 pm (UTC)
From: [identity profile]
Got it. Which wavelet bases did you try?
Most wavelets are, by definition, best suited to non-sinusoidal sources. AFAIR (I worked with them a few years ago), there are some wavelet bases that are derived from cosines and may be better suited.

Date: 2008-05-26 01:35 pm (UTC)
From: [identity profile]
Tried designing my own, based on the lifting scheme. Basically, the goal was to design an asymmetric (in time) high-pass/low-pass pair with as little delay (non-causal taps) as possible. I did some optimisation, but in the end, to have good enough stop bands, I ended up with too much latency.
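
For readers unfamiliar with the lifting scheme, here is a minimal sketch using the Haar wavelet, the simplest possible case. It is not the asymmetric filter pair described above, only an illustration of the split/predict/update structure; `lifting_forward` and `lifting_inverse` are hypothetical names. Longer predict and update filters give better stop bands, but any taps that reach into future samples add delay.

```python
def lifting_forward(x):
    """One Haar lifting stage: returns (low-pass, high-pass) half-rate signals."""
    even, odd = x[0::2], x[1::2]                          # split
    detail = [o - e for e, o in zip(even, odd)]           # predict odd from even
    approx = [e + d / 2 for e, d in zip(even, detail)]    # update: pairwise mean
    return approx, detail

def lifting_inverse(approx, detail):
    """Undo the lifting steps in reverse order to recover the input."""
    even = [a - d / 2 for a, d in zip(approx, detail)]    # undo update
    odd = [d + e for e, d in zip(even, detail)]           # undo predict
    x = [0.0] * (2 * len(even))
    x[0::2], x[1::2] = even, odd                          # merge
    return x

samples = [3.0, 5.0, 4.0, 8.0, 10.0, 6.0, 1.0, 1.0]      # even length assumed
low, high = lifting_forward(samples)
restored = lifting_inverse(low, high)

print("low-pass: ", low)
print("high-pass:", high)
print("perfect reconstruction:", restored == samples)
```

Because each lifting step is undone by simply subtracting what was added, reconstruction is exact whatever predict and update filters are chosen, which is what makes the scheme attractive for experimenting with custom filter pairs.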

Date: 2008-05-26 01:56 pm (UTC)
From: [identity profile]
hum, got it, thanks.

