jmvalin | Testing CELT

I've been conducting a listening test for a paper on the CELT codec. I've been comparing it to AAC-LD, G.722.1C (aka Siren14) and MP3. Here are the results for the 48 kbit/s MUSHRA test (95% confidence intervals):

And here are the results for the 64 kbit/s MUSHRA test (95% confidence intervals):

Considering that I was just hoping CELT wouldn't be too much worse than these codecs, it's a pleasant surprise. That's because the version of CELT I tested had a latency of 8.7 ms, while the latency of AAC-LD was 34.8 ms (I know it's possible to get down to 20 ms, but the Apple implementation doesn't do it), G.722.1C was 40 ms and MP3 (LAME) was probably way above 100 ms.

In the graphs above, the error bars don't consider the fact that the MUSHRA test is paired, so there are more statistically significant results than is apparent. Basically, CELT and AAC-LD come out ahead of both G.722.1C and MP3 in both tests. CELT comes out ahead of AAC-LD at 48 kbit/s and the two are tied (i.e. no statistically significant difference could be observed) at 64 kbit/s.
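To illustrate why a paired analysis can reveal differences that per-codec error bars hide, here is a minimal sketch in Python. The scores are made up for illustration (they are not the listening test data), and the names `codec_a`, `codec_b` and `mean_ci` are hypothetical. Each listener rates both codecs, so the listener-to-listener spread cancels out when looking at the per-listener differences.

```python
import math
import statistics

# Two-sided 95% Student-t critical value for 7 degrees of freedom (8 listeners).
T_CRIT_DF7 = 2.365

def mean_ci(values, t_crit):
    """Return the (low, high) confidence interval of the mean of `values`."""
    m = statistics.mean(values)
    half = t_crit * statistics.stdev(values) / math.sqrt(len(values))
    return m - half, m + half

# Hypothetical MUSHRA-style scores: one entry per listener, same order in both lists.
codec_a = [60, 75, 50, 85, 70, 55, 80, 65]
codec_b = [55, 71, 47, 79, 66, 52, 74, 61]

# Unpaired view: one interval per codec, as in a plot with per-codec error bars.
ci_a = mean_ci(codec_a, T_CRIT_DF7)
ci_b = mean_ci(codec_b, T_CRIT_DF7)
intervals_overlap = ci_a[0] <= ci_b[1] and ci_b[0] <= ci_a[1]

# Paired view: interval of the per-listener score differences.
diffs = [a - b for a, b in zip(codec_a, codec_b)]
ci_diff = mean_ci(diffs, T_CRIT_DF7)

print("per-codec intervals overlap:", intervals_overlap)
print("paired difference interval excludes zero:", ci_diff[0] > 0 or ci_diff[1] < 0)
```

With these made-up numbers the two per-codec intervals overlap heavily, yet the interval on the paired differences stays clear of zero, which is the situation described above.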

Despite those results, I still think CELT can do better. Among the things I'd like to try once I'm done with the paper:

Add a psycho-acoustic mode and start changing the bit allocation based on the frequency content
Do lots of tuning
Do something to prevent time smearing of impulses (not TNS)
~~Encoding (or guessing) the spectral tilt in each band~~
Better stereo support

tactus.livejournal.com

What did your experiments with wavelets reveal?
Are they useless here?

jmspeex.livejournal.com

I've only experimented a bit with wavelets but found it next to impossible to get both decent leakage (or lack thereof) and low latency. Not to mention complexity. In theory, the only thing wavelets could improve is sharp transients (e.g. castanets), and it may be possible to still have low latency, but I've no clue how. In any case, since I plan on releasing CELT 1.0 within the next 5 years, wavelets are out.

What do you mean by 'leakage' here?

By leakage, I mean when you apply some transform to a sinusoid and you have lots of non-zero coefficients around the strongest peak.
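A small Python sketch of that effect, using a plain DFT with no window purely because it is the simplest transform to write down (this is an illustrative assumption, not the transform under discussion). The helper names `dft_magnitudes`, `sinusoid` and `count_significant` are hypothetical. A sinusoid that falls exactly on a bin produces a single non-zero coefficient, while one that falls between two bins spreads over many.

```python
import cmath
import math

def dft_magnitudes(x):
    """Magnitudes of the DFT of a real signal for bins 0..N/2 (direct O(N^2) sum)."""
    n = len(x)
    return [
        abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)
    ]

def sinusoid(cycles, n):
    """A sine wave with `cycles` periods over `n` samples."""
    return [math.sin(2 * math.pi * cycles * t / n) for t in range(n)]

def count_significant(mags, rel=0.01):
    """Number of coefficients larger than `rel` times the strongest one."""
    peak = max(mags)
    return sum(1 for m in mags if m > rel * peak)

N = 64
on_bin = count_significant(dft_magnitudes(sinusoid(8.0, N)))   # exactly on bin 8
off_bin = count_significant(dft_magnitudes(sinusoid(8.5, N)))  # halfway between bins 8 and 9

print("significant coefficients, on-bin sinusoid: ", on_bin)
print("significant coefficients, off-bin sinusoid:", off_bin)
```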

Got it. Which wavelet bases did you try?
Most wavelets are by definition best applied to non-sinusoidal sources. AFAIR (I touched them a few years ago), there are some wavelet bases that are derived from cosines and may be better suited.

Tried designing my own, based on the lifting scheme. Basically, the goal was to design an asymmetric (in time) high-pass/low-pass pair with as little delay (non-causal taps) as possible. I did some optimisation, but in the end, to have good enough stop bands, I ended up with too much latency.
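For readers unfamiliar with the lifting scheme, here is a minimal Python sketch of one lifting step. It uses the Haar wavelet because it is the simplest case, so it is an assumption for illustration and not the asymmetric filter pair described above. The function names `haar_lift` and `haar_unlift` are hypothetical. The signal is split into even and odd samples, a predict step produces the high-pass (detail) output, and an update step produces the low-pass (smooth) output. The inverse just undoes the steps in reverse order.

```python
def haar_lift(x):
    """One Haar lifting step on an even-length list: returns (low_pass, high_pass)."""
    even, odd = x[0::2], x[1::2]
    # Predict: each odd sample is predicted from its even neighbour; keep the error.
    detail = [o - e for e, o in zip(even, odd)]
    # Update: correct the even samples so they carry the pairwise average.
    smooth = [e + d / 2 for e, d in zip(even, detail)]
    return smooth, detail

def haar_unlift(smooth, detail):
    """Invert haar_lift by undoing the update step, then the predict step."""
    even = [s - d / 2 for s, d in zip(smooth, detail)]
    odd = [d + e for e, d in zip(even, detail)]
    x = []
    for e, o in zip(even, odd):
        x.extend([e, o])
    return x

signal = [4, 6, 10, 12, 8, 6, 5, 5]
low, high = haar_lift(signal)
restored = haar_unlift(low, high)

print("low-pass: ", low)
print("high-pass:", high)
print("perfect reconstruction:", restored == signal)
```

Lifting always reconstructs perfectly, whatever predict and update filters are chosen. Longer filters give better stop bands, but any taps that look ahead in time add delay.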

hum, got it, thanks.

Jean-Marc Valin
