jmvalin | Code-Excited Lapped Transform (CELT), Part 1: Overview (Reply)

Here's a bit more info on the CELT experimental codec I just released. First, the goals. For the past two years, Monty and I have been discussing what the next-generation free audio codec would be. Monty's goal is basically to be better than Vorbis in terms of quality vs. compression. My main goal, on the other hand, is to have a high-quality codec with very low latency, even if it means being less efficient. So, we're trying to combine both into the Ghost codecs. Whether that will succeed, or whether we'll have to go with two separate codecs, is still an open question. For now, I'm working on CELT, which I hope will be both a low-latency codec and a noise encoder to be used in a lower-bit-rate Ghost codec like the one Monty wants.

Below is an overview of how CELT works:

It may look a bit hairy, but it's actually relatively simple. The four main ideas are:

We use a lapped transform (here an MDCT) on very short windows (128-256 samples)

The spectrum is divided into bands, and the energy in each band is encoded and kept constant

We use a time-domain pitch predictor, with frequency-domain gains

The residual is encoded using a pulse codebook
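To make the first idea a little more concrete, here is a minimal sketch of a lapped transform on short windows: a direct (slow, O(N^2)) MDCT with 50% overlap and a sine window, followed by overlap-add reconstruction. This is only an illustration, not the CELT code. The function names, the choice of a sine window, and reading "128" as the hop size N (so each window spans 2N samples) are all assumptions made for the example.

```python
import math

def sine_window(N):
    """Sine window of length 2N (assumed here; it satisfies w[n]^2 + w[n+N]^2 = 1)."""
    return [math.sin(math.pi * (n + 0.5) / (2 * N)) for n in range(2 * N)]

def mdct(block, window):
    """Windowed forward MDCT: 2N input samples -> N coefficients."""
    N = len(block) // 2
    n0 = 0.5 + N / 2
    return [
        sum(window[n] * block[n] * math.cos(math.pi / N * (n + n0) * (k + 0.5))
            for n in range(2 * N))
        for k in range(N)
    ]

def imdct(coeffs, window):
    """Windowed inverse MDCT: N coefficients -> 2N time-aliased samples."""
    N = len(coeffs)
    n0 = 0.5 + N / 2
    return [
        (2.0 / N) * window[n]
        * sum(coeffs[k] * math.cos(math.pi / N * (n + n0) * (k + 0.5))
              for k in range(N))
        for n in range(2 * N)
    ]

def mdct_roundtrip(signal, N):
    """Analyse and resynthesise `signal` (length must be a multiple of N, N even)."""
    window = sine_window(N)
    # Pad with N zeros on each side so every real sample is covered by two frames.
    padded = [0.0] * N + list(signal) + [0.0] * N
    out = [0.0] * len(padded)
    for start in range(0, len(padded) - 2 * N + 1, N):
        frame = imdct(mdct(padded[start:start + 2 * N], window), window)
        for n in range(2 * N):
            out[start + n] += frame[n]  # overlap-add cancels the aliasing
    return out[N:N + len(signal)]

# Hypothetical test signal; N = 128 is taken from the low end of the range above.
signal = [math.sin(0.05 * i) for i in range(256)]
rebuilt = mdct_roundtrip(signal, 128)
print("max reconstruction error:", max(abs(a - b) for a, b in zip(signal, rebuilt)))
```

Each frame alone is aliased; only the sum of two overlapping frames reconstructs the input. In this sketch the overlap is a full hop (N samples), which is why short windows matter so much when low latency is the goal.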

I'll address each of these (and more) in later posts.
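In the meantime, here is a rough sketch of how the second and fourth ideas can fit together: each band's energy is stored separately, the band's shape is approximated with a handful of signed unit pulses, and the decoder rescales the pulses so the band comes out with exactly the stored energy. This is a simplified illustration under several assumptions, not the actual CELT algorithm: the greedy pulse search, the band edges, the pulse counts, and all function names are made up for the example, the pitch predictor is left out (so the band itself is quantized rather than a residual), and the energy is passed through unquantized.

```python
import math

def band_energy(band):
    """Energy of a band: the sum of squared coefficients."""
    return sum(x * x for x in band)

def pulse_quantize(band, num_pulses):
    """Greedily place `num_pulses` signed unit pulses to match the band's shape.

    Each step adds the pulse that maximises (correlation^2 / pulse energy),
    i.e. the squared cosine between the band and the pulse vector.
    """
    pulses = [0] * len(band)
    corr = 0.0    # running sum of band[i] * pulses[i]
    energy = 0    # running sum of pulses[i]^2
    for _ in range(num_pulses):
        best_i, best_score = 0, -1.0
        for i, x in enumerate(band):
            new_corr = corr + abs(x)
            new_energy = energy + 2 * abs(pulses[i]) + 1
            score = new_corr * new_corr / new_energy
            if score > best_score:
                best_i, best_score = i, score
        energy += 2 * abs(pulses[best_i]) + 1
        corr += abs(band[best_i])
        pulses[best_i] += 1 if band[best_i] >= 0 else -1
    return pulses

def decode_band(pulses, energy):
    """Rescale the pulse vector so its energy equals the stored band energy."""
    pulse_energy = sum(p * p for p in pulses)
    if pulse_energy == 0:
        return [0.0] * len(pulses)
    gain = math.sqrt(energy / pulse_energy)
    return [gain * p for p in pulses]

def code_spectrum(spectrum, edges, pulses_per_band):
    """Encode and decode every band; returns the reconstructed spectrum."""
    decoded = []
    for (lo, hi), k in zip(zip(edges, edges[1:]), pulses_per_band):
        band = spectrum[lo:hi]
        decoded += decode_band(pulse_quantize(band, k), band_energy(band))
    return decoded

# Hypothetical 16-coefficient spectrum split into three bands.
spectrum = [0.9, -0.1, 0.3, 0.0, 0.2, -0.7, 0.1, 0.05,
            0.02, -0.01, 0.3, 0.0, -0.25, 0.04, 0.0, 0.1]
edges = [0, 4, 8, 16]
decoded = code_spectrum(spectrum, edges, [3, 3, 4])
for lo, hi in zip(edges, edges[1:]):
    print(band_energy(spectrum[lo:hi]), band_energy(decoded[lo:hi]))
```

However coarse the pulse approximation is, the printed energies match band by band: quantization error changes the shape inside a band, but not how loud the band is.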
