jmvalin: (Default)
[personal profile] jmvalin
Before reading this, I recommend reading part 1 and part 2. As I explained in part 1, CELT achieves really low latency by using very short MDCT windows. In the current setup, we have two 256-sample overlapping (input) MDCT windows per frame. The reason for not using a single 512-sample MDCT instead is latency (the look-ahead of the MDCT is shorter). With that setup, we get 256 output samples per frame to encode (128 per MDCT window). Now, at 44.1 kHz, it means a resolution of 172 Hz, not to mention the leakage. That's far from enough to separate female pitch harmonics, much less male ones. To the MDCT, a periodic voice signal thus looks pretty much like noise, with no clear structure that can be used to our advantage.
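The resolution figure follows directly from the window size. Here is a rough sketch of the arithmetic; the 100 Hz and 200 Hz pitch values are illustrative assumptions for a typical male and female voice, not numbers taken from the codec.

```python
SAMPLE_RATE = 44100          # Hz
WINDOW_SIZE = 256            # input samples per MDCT window
COEFFS = WINDOW_SIZE // 2    # 128 output coefficients per window

# The 128 coefficients span 0 Hz to SAMPLE_RATE / 2.
bin_spacing_hz = (SAMPLE_RATE / 2) / COEFFS


def bins_between_harmonics(pitch_hz):
    """How many MDCT bins separate two consecutive pitch harmonics."""
    return pitch_hz / bin_spacing_hz


# Illustrative pitch values (assumptions, not measured data).
for label, pitch_hz in (("male, ~100 Hz", 100.0), ("female, ~200 Hz", 200.0)):
    print(f"{label}: {bins_between_harmonics(pitch_hz):.2f} bins per harmonic")
```

With harmonics spaced about one bin apart or less, and leakage on top of that, there is no room for the valleys between harmonics to show up in the spectrum.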

To work around the poor MDCT resolution, we introduce a pitch predictor. Instead of trying to extract the structure from a single (small) frame, the pitch predictor looks outside the current frame (in the past of course) for similar patterns. Pitch prediction itself is not new. Most speech codecs (and all CELP codecs, including Speex) use a pitch predictor. It usually works in the excitation domain, where we find a time offset in the past (we use the decoded signal because the original isn't available to the decoder) that looks similar to the current frame. The time offset (pitch period) is encoded, along with a gain (the prediction gain). When the signal is highly periodic (as is often the case with voice), the gain is close to 1 and the error after the prediction is small.
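To make the conventional (time-domain) idea concrete, here is a minimal sketch of a single-tap pitch search. The function name `best_pitch` and the restriction that the lag be at least one frame long are simplifying assumptions for illustration; this is not how Speex or any particular codec implements it.

```python
import math


def best_pitch(history, frame, min_lag, max_lag):
    """Search past decoded samples for the lag that best predicts `frame`.

    Returns (lag, gain). To keep the sketch simple, lags shorter than the
    frame are not supported.
    """
    n = len(frame)
    if min_lag < n or max_lag > len(history):
        raise ValueError("need len(frame) <= lag <= len(history)")
    best_lag, best_gain, best_score = None, 0.0, 0.0
    for lag in range(min_lag, max_lag + 1):
        start = len(history) - lag
        pred = history[start:start + n]
        corr = sum(x * p for x, p in zip(frame, pred))
        energy = sum(p * p for p in pred)
        if corr <= 0.0 or energy == 0.0:
            continue  # anti-correlated or silent candidates are of no use
        # Maximising corr^2 / energy minimises the prediction error energy.
        score = corr * corr / energy
        if score > best_score:
            best_lag, best_gain, best_score = lag, corr / energy, score
    return best_lag, best_gain


# Synthetic periodic signal with a period of 50 samples (test data only).
signal = [math.sin(2 * math.pi * t / 50) + 0.5 * math.sin(4 * math.pi * t / 50)
          for t in range(240)]
lag, gain = best_pitch(signal[:200], signal[200:], min_lag=40, max_lag=90)
```

On this perfectly periodic input the search lands on the true period with a gain very close to 1, which is the "highly periodic" case described above.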

Unlike CELP, CELT doesn't operate in the time domain, so doing pitch prediction is a bit trickier. What we need to do is find the offset in the time domain, then apply the MDCTs (remember we have two MDCT windows per frame) and do the rest in the frequency domain. Another complication is that periodicity is generally only present at lower frequencies. For speech, the pitch harmonics tend to get weaker (compared to the noisy part) after about 3 kHz, with very little present past 8 kHz. Most CELP codecs only have a single gain that is applied throughout the entire frame (across all frequencies). While Speex has a 3-tap predictor that allows a small amount of control over the gain as a function of frequency, it's still very basic. Working in the frequency domain, on the other hand, allows a great deal of flexibility. What we do is apply the pitch prediction only up to a certain frequency (e.g. 6 kHz) and divide that range into several (e.g. 5) bands, each with its own gain. For the example from part 2 (corresponding to mode1 of the 0.0.1 release), we use the following bands for the pitch (different from the bands on which we normalise energy):

{0, 4, 8, 12, 20, 36}
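As a quick sanity check, the band edges can be converted to frequencies. This sketch assumes the edges are MDCT bin indices and uses the 172 Hz per bin spacing from the 256-sample window at 44.1 kHz; `band_ranges_hz` is a name made up for the illustration.

```python
SAMPLE_RATE = 44100
WINDOW_SIZE = 256
BIN_SPACING_HZ = SAMPLE_RATE / WINDOW_SIZE   # about 172 Hz per MDCT bin

PITCH_BANDS = [0, 4, 8, 12, 20, 36]          # band edges from the post


def band_ranges_hz(edges, bin_spacing_hz=BIN_SPACING_HZ):
    """Turn consecutive band edges (in bins) into (low_hz, high_hz) pairs."""
    return [(lo * bin_spacing_hz, hi * bin_spacing_hz)
            for lo, hi in zip(edges, edges[1:])]


for lo_hz, hi_hz in band_ranges_hz(PITCH_BANDS):
    print(f"{lo_hz:7.1f} Hz to {hi_hz:7.1f} Hz")
```

Under that assumption the five bands get wider with frequency and the last one ends a little above 6 kHz, in line with the cut-off mentioned above.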

Another particularity of the pitch predictor in CELT (unlike any other algorithm I know of) is that the pitch prediction is computed on the normalised bands. That is, we apply the energy normalisation to both the current signal (X) and the delayed (pitch prediction from the past) signal (P). Because of that, the pitch gain can never exceed unity, which is a nice property when it comes to making things stable despite transmission losses. Despite a maximum value of one in the normalised domain, the "effective value" (not normalised) can be greater than one when the energy is increasing, which is the desired effect. The pitch gain for band i is computed simply as g_i = <X_i, P_i>, where <,> is the inner product and X_i is the sub-vector of X that corresponds to band i (same for P_i).
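The per-band gain can be sketched in a few lines. As a simplification, this example normalises each pitch-band sub-vector to unit norm directly (the codec normalises energy on a different set of bands), and the names `unit_norm` and `band_gains` are hypothetical.

```python
import math


def unit_norm(v):
    """Scale v to unit Euclidean norm (an all-zero vector is left as is)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0.0 else list(v)


def band_gains(X, P, edges):
    """Return g_i = <X_i, P_i> for each band, computed on unit-norm sub-vectors."""
    gains = []
    for lo, hi in zip(edges, edges[1:]):
        x_i = unit_norm(X[lo:hi])
        p_i = unit_norm(P[lo:hi])
        gains.append(sum(a * b for a, b in zip(x_i, p_i)))
    return gains
```

Since both sub-vectors have unit norm, the Cauchy-Schwarz inequality bounds every gain to the range [-1, 1]. If X is simply a louder copy of P, every band gain is 1 even though the un-normalised ratio is larger than 1, which is the "effective value" effect described above.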

Here's what the distribution of the gains looks like for each band:



It's clear from the figure above that the lower bands (lower frequencies) tend to have a much higher pitch value. Because of that, a single gain for all the bands wouldn't work very well. Once the gains are computed, they need to be encoded efficiently. Again, using naive scalar quantisation and encoding each gain separately (using 3 or 4 bits each) would be a bit wasteful. So far, I've been using a trained (non-algebraic) vector quantiser (VQ) with 32 entries, which means a total of 5 bits for all gains. The advantage of VQ for that kind of data is that it eliminates all redundancy so it tends to be more efficient. There are a few disadvantages as well. Trained VQ codebooks are not as flexible and can end up taking too much space when there are many entries (I don't think 32 entries is enough for 5 gains).
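For readers unfamiliar with VQ, encoding boils down to a nearest-neighbour search in the codebook, and only the index is transmitted. The sketch below uses a made-up placeholder codebook (`TOY_CODEBOOK`), not the trained one from the codec, purely so the example can run.

```python
def vq_encode(gains, codebook):
    """Return the index of the codebook entry closest to `gains` (squared error)."""
    best_index, best_dist = 0, float("inf")
    for index, entry in enumerate(codebook):
        dist = sum((g - e) ** 2 for g, e in zip(gains, entry))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


def index_bits(num_entries):
    """Bits needed to transmit an index into a codebook of `num_entries` entries."""
    return (num_entries - 1).bit_length()


# Placeholder codebook: 32 entries of 5 gains that decay with the band index.
# A real codebook would be trained on data; these values are an assumption.
TOY_CODEBOOK = [[(k / 31.0) * 0.8 ** band for band in range(5)]
                for k in range(32)]

index = vq_encode([0.9, 0.7, 0.6, 0.5, 0.35], TOY_CODEBOOK)
quantised = TOY_CODEBOOK[index]   # what the decoder would reconstruct
```

With 32 entries the index costs 5 bits for all five gains, compared with 15 to 20 bits for scalar quantisation at 3 or 4 bits per gain.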

The last point to address about the pitch predictor is calculating the pitch period. We could try all delays, apply the MDCTs and compute the gains for each, and at the end decide which is best. Unfortunately, the computational cost would be huge. Instead, it's easier to do it in "open loop" just like in Speex (and many other CELP codecs). We compute the generalised cross-correlation (GCC) in the frequency domain (cheaper than computing in the time domain). The cross-spectrum (before computing the IFFT) is weighted by an approximation of the psychoacoustic masking curve just so each band contributes to the result (instead of having the lower frequencies dominate everything else).
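The open-loop search can be sketched as a cross-correlation computed through the frequency domain. This is only an illustration: it uses a naive DFT instead of an FFT to stay self-contained, the function names are made up, and the `weights` argument stands in for the masking-curve approximation, which is not reproduced here (flat weights give a plain cross-correlation).

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(N^2) DFT, for clarity only; a real codec would use an FFT."""
    n = len(x)
    sign = 1.0 if inverse else -1.0
    scale = 1.0 / n if inverse else 1.0
    return [scale * sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
                        for t in range(n))
            for k in range(n)]


def open_loop_pitch(past, frame, weights=None):
    """Return the lag (in samples) at which `past` best matches `frame`.

    `weights` is one real value per frequency bin applied to the
    cross-spectrum; it should be symmetric (weights[k] == weights[N - k])
    so that the correlation stays real.
    """
    m, n = len(past), len(frame)
    if n > m:
        raise ValueError("past must be at least as long as frame")
    padded = list(frame) + [0.0] * (m - n)
    X, Y = dft(padded), dft(past)
    if weights is None:
        weights = [1.0] * m
    cross = [w * x.conjugate() * y for w, x, y in zip(weights, X, Y)]
    corr = dft(cross, inverse=True)
    # corr[d].real == sum_t frame[t] * past[d + t] when the weights are flat;
    # offsets d <= m - n are free of circular wrap-around.
    best_d = max(range(m - n + 1), key=lambda d: corr[d].real)
    return m - best_d


# Synthetic signal with a 20-sample period (test data only).
signal = [math.sin(2 * math.pi * t / 20) + 0.5 * math.sin(4 * math.pi * t / 20)
          for t in range(200)]
lag = open_loop_pitch(signal[:160], signal[160:])
```

On a perfectly periodic input every multiple of the period is an equally good match, so the returned lag is some multiple of 20 here. Real signals are less ambiguous, and the weighting matters much more on them than in this toy case.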

Now the results: how much benefit does pitch prediction give? Quite a bit actually, hear for yourself. Here's the same speech sample encoded with and without pitch prediction. Even on music, which is not always periodic, pitch prediction can help a bit, though not as much. I think there's potential to do better on music. There are a few leads I'd like to investigate (and again, I'm open to ideas):
  • Using two pitch periods
  • Frequency-domain prediction
Feel free to ask questions below in the (likely) case something's not clear.

Loss recovery

Date: 2007-12-25 11:16 pm (UTC)
From: [identity profile] nullc.livejournal.com
How do you prevent pitch prediction from propagating loss forever?

One thought I had for realtime applications would be to support a back-channel from the remote decoder indicating lost frames. In packet-based duplex operation this channel could be nearly free, and if it was used to create forbidden zones for the predictor it would help ensure recovery within ~1 RTT plus whatever the rest of the predictors needed to recover.

This wouldn't help with the short pitch period you are currently using, but if you include a second longer predictor it might be more important.

One point you don't go into is how the pitch period is transmitted. It appears that it is just sent as a straight sample offset with 10 bits? (I see MAX_PERIOD is 1024 in the codec.) If pitch prediction is only of value up to 6 kHz or so, you may be able to only align mod 2 (or 4) and thus reduce the bandwidth needed by one (or two) bit(s) per frame.

Re: Loss recovery

Date: 2007-12-25 11:47 pm (UTC)
From: [identity profile] jmspeex.livejournal.com
One thing I didn't mention explicitly is that I'm actually bounding the gain to something slightly below unity (e.g. 0.9 or 0.95). That way, the error just "fades out", just like it does for a CELP codec (e.g. Speex). There are other good reasons for limiting the pitch value, which I'll describe in part 4. As for a second pitch period, it would make a "conventional" (not normalised domain) pitch predictor unstable for any decent gain values, but I'm hoping it would actually be stable because of the normalisation.
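To give a feel for how quickly the error fades, here is a toy model. It assumes the worst case, where the corrupted signal is reused by the predictor every time with the full bounded gain, so the error shrinks by that factor at each reuse. The function name and the 1% threshold are arbitrary choices for illustration.

```python
def reuses_until_below(gain_bound, threshold):
    """Count how many predictions it takes for a unit error to drop below threshold."""
    if not 0.0 < gain_bound < 1.0:
        raise ValueError("gain_bound must be strictly between 0 and 1")
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be strictly between 0 and 1")
    error, count = 1.0, 0
    while error >= threshold:
        error *= gain_bound   # each reuse scales the error by the gain bound
        count += 1
    return count


for bound in (0.9, 0.95):
    print(bound, reuses_until_below(bound, 0.01))
```

In this model a bound of 0.9 needs 44 reuses to bring the error under 1% and a bound of 0.95 needs 90, whereas a gain of exactly 1 would never recover.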

You are right that I am encoding the pitch using 10 bits (actually slightly less because, of the 1024 possible lags, some overlap with the current window). The reason why I don't align it to multiples of 2 or 4 is that it gives me more accuracy on where the harmonics are. If you look at some narrowband CELP codecs (I think G.729 does that), they have what they call "fractional pitch" exactly because an integer number of samples at 8 kHz isn't enough to model pitch properly (Speex instead uses a 3-tap predictor, which can approximate the effect). The fractional pitch is a sort of up-sampling (4x-8x typically) of the excitation that allows more resolution on the pitch period. Back to CELT, using 44.1 kHz (or 48 kHz) means we don't actually need to use fractional pitch because there's already more resolution.
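The trade-off discussed in this exchange can be put into numbers. This sketch assumes a plain fixed-length code for the lag and uses a 200 Hz pitch as an illustrative value; both function names are made up.

```python
def lag_bits(num_lags, step=1):
    """Bits for a fixed-length code over lags 0, step, 2*step, ... below num_lags."""
    candidates = -(-num_lags // step)   # ceiling division
    return (candidates - 1).bit_length()


def period_step_percent(sample_rate, pitch_hz):
    """Relative change of the pitch period caused by a one-sample lag step."""
    period_samples = sample_rate / pitch_hz
    return 100.0 / period_samples


for step in (1, 2, 4):
    print(f"step {step}: {lag_bits(1024, step)} bits")

for rate in (8000, 44100):
    print(f"{rate} Hz: {period_step_percent(rate, 200.0):.2f}% per sample")
```

Aligning to multiples of 2 or 4 saves one or two bits out of 10, at the cost of a coarser period. For a 200 Hz pitch, one sample is 2.5% of the period at 8 kHz but under 0.5% at 44.1 kHz, which is why fractional pitch matters for narrowband codecs and much less here.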

