jmvalin wrote on 2017-09-26 at 10:24 pm

RNNoise: Learning Noise Suppression

This demo presents the RNNoise project, showing how deep learning can be applied to noise suppression. The main idea is to combine classic signal processing with deep learning to create a real-time noise suppression algorithm that's small and fast. No expensive GPUs required — it runs easily on a Raspberry Pi. The result is much simpler (easier to tune) and sounds better than traditional noise suppression systems (been there!).

Possible alternative uses for this algorithm ?

(Anonymous) 2017-09-29 07:49 am (UTC)(link)
A long time ago I used to make amateur remixes, and one tricky part was isolating the vocals from the track being remixed. To do that I used a noise removal tool: select a part of the track without vocals, run a spectral analysis on it, and then subtract the result from the whole track. Most of the time the result was terribly mangled, but sometimes I got something usable.

Your demo got me thinking: if I want to remove something very specific from one track instead of learning a generalized filter, can I train this model with a smaller dataset, like a few seconds from that track?
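
For readers unfamiliar with the noise-removal trick described in this comment, here is a minimal sketch of magnitude spectral subtraction on a single frame. It is not RNNoise's method and not any particular tool's implementation. The function names, the naive DFT, and the `floor` parameter are illustrative assumptions, and a real tool would use overlapping windowed frames and an FFT.

```python
import cmath
import math

def dft(frame):
    """Naive O(n^2) discrete Fourier transform; fine for short demo frames."""
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    """Inverse of dft(); returns the real part of each output sample."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]

def noise_profile(noise_frames):
    """Average magnitude spectrum over frames assumed to contain only noise."""
    mags = [[abs(c) for c in dft(f)] for f in noise_frames]
    return [sum(col) / len(mags) for col in zip(*mags)]

def spectral_subtract(frame, profile, floor=0.0):
    """Subtract the noise magnitude from each bin and keep the original phase.

    `floor` (hypothetical parameter) keeps at least that fraction of the
    original magnitude in every bin, which limits over-subtraction.
    """
    out = []
    for c, noise_mag in zip(dft(frame), profile):
        mag = max(abs(c) - noise_mag, floor * abs(c))
        out.append(cmath.rect(mag, cmath.phase(c)))
    return idft(out)

# Toy example: a tone in bin 1 plus a steady "hum" in bin 3.
n = 16
tone = [math.sin(2 * math.pi * 1 * t / n) for t in range(n)]
hum = [0.5 * math.sin(2 * math.pi * 3 * t / n) for t in range(n)]
cleaned = spectral_subtract([a + b for a, b in zip(tone, hum)], noise_profile([hum]))
```

In this toy case the hum is perfectly stationary, so the subtraction recovers the tone almost exactly. With real music the "noise" changes over time and overlaps the vocals in frequency, which is one plausible reason the results were often mangled.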

Re: Possible alternative uses for this algorithm ?

(Anonymous) 2017-09-29 02:07 pm (UTC)(link)
Hi Jean Marc,

There was a $1.6 million Indiegogo project for a snoring noise suppression device that went bust.
https://www.indiegogo.com/projects/silent-partner-quiets-snoring-noise-like-magic-sleep/x/14463062#/

Do you think you could help them?

Fab!

(Anonymous) 2017-09-29 03:34 pm (UTC)(link)
This is great, thanks for your work! Do you think this might be of use with non-audio data?

Re: Fab!

(Anonymous) 2017-10-02 12:27 am (UTC)(link)
Hi, this looks really interesting. I am working with a team to try to improve optical character recognition on old and "noisy" text images that humans can read but OCR cannot. We have been using a neural net to do this with some success on training data (https://github.com/digiah/oldOCR, my github username is rcrath) using OCRopy, but I have always thought that treating the text as a data stream, using department to get a noise sample of the garbage text that OCR makes when it fails, and subtracting that from the signal might reduce the garbage in an OCR file and make other approaches better focused. I realize that is far from what you are doing, but do you think it would be feasible for us to try to adapt your code?

One thing that strikes my ear in the samples, most obviously in the street noise one, is that the algorithm is acting more like a gate than noise removal since the horns and traffic are clearly audible still in the speech sections.

I would love to see this adapted to guitar noise suppression!

Thanks for this work.

Re: Fab!

(Anonymous) 2017-10-02 12:30 am (UTC)(link)
Lol, plz excuse autocorrect. I have no idea what "department" was supposed to be!

What about reducing noise in music?

(Anonymous) 2017-10-02 03:56 pm (UTC)(link)
Do you think this approach would be helpful in reducing noise on old 78 rpm records? archive.org's Great 78 project has a large number of albums you could experiment with.

Re: What about reducing noise in music?

(Anonymous) 2017-10-06 03:47 pm (UTC)(link)
Hmm, perhaps you can store the RNN's coefficients in a file so one can train the filter to match his needs?

(Anonymous) 2017-10-03 01:36 pm (UTC)(link)
Hi Jean Marc,

Impressive work! Really nice results too. I've been working on a similar project, but one meant to be used exclusively with Ardour, so it's an LV2 plugin (https://github.com/lucianodato/noise-repellent). I used the ME method (Rangachari and Loizou) to estimate noise with it, but your method seems to work miles better. Do you mind if I make an LV2 plugin out of your library?
Also, did you evaluate using a discrete wavelet transform instead of FFT + Bark scale? I've read a few papers in the past that get near-zero latency with that architecture.
Thank you very much for this. It has already taught me a few things I wasn't aware of.

ladspa-plugin

(Anonymous) 2017-10-07 11:44 am (UTC)(link)
I'd love to see a LADSPA plugin, so that I could watch YouTube videos and university lectures denoised with mpv.
Does anyone have any idea how hard it would be to create a LADSPA plugin based on it?

I've never done anything like this (but I have basic C/C++ knowledge)…

Also did you evaluate using discrete wavelet transform instead of fft+bark scale?

(Anonymous) 2018-01-17 06:49 pm (UTC)(link)
I was wondering myself, if you use a typical FFT\bark.scale algorithm on the A.I. enhanced granular wavetable frequency bands, as virtual bin (size?) calculated predominantly by of course the FFT\bark.scale relying for its final or real-time results, which are limited by or from mainly the CPU/RAM/memory capacity.
Do you have a unique transform algorithm that uses the "keras" or "theano" CNTK binary libraries? Say, something that could be like a hybrid of the FFT & GFT, DTMF? I understand that using a GFT\bark.scale transform algorithm alone could be a massive difference in data\frequency band to using an FFT, as they can potentially use unlimited/infinite granular wavetable band resolution virtual bin data-size/processing power per noise scale or frequency bands!
I understand that there are not large numbers of people who acknowledge/understand the difference between those two transform algorithms to begin with.

RESIDUAL NOISE PROBLEM

(Anonymous) 2017-10-09 08:21 am (UTC)(link)
Hi Jean Marc,
According to the audio samples you provided, it seems that RNNoise leaves more residual noise than Speex. Do you think it would perform better with more noise samples for training?

(Anonymous) 2017-10-30 03:54 am (UTC)(link)
May I ask how much data you trained on to achieve the performance shown in your demo?

Possible Use Cases

(Anonymous) 2017-12-03 09:02 pm (UTC)(link)
Great work you are doing, Jean Marc!
I was wondering if this could be applicable to active noise cancellation as well, with a real-time algorithm running on, say, a Raspberry Pi, or maybe a hardware implementation on an FPGA. Of course, this would have to be trained on a different training set; I was thinking of Aurora II? Could it be paired with another algorithm such as beamforming to also attenuate noise not from a specific relative location?

Sorry for the plethora of questions.
Thanks.

Re: Possible Use Cases

(Anonymous) 2019-01-02 04:43 pm (UTC)(link)
Hi Jean,

For active noise cancellation, can you suggest an approach? Also, are there any projects around that to your knowledge?

Real-time Algorithm

(Anonymous) 2017-12-05 02:48 pm (UTC)(link)
Looking at the denoised output on the spectrogram feels so good, great work!

But how can we modify it to make the spectrogram real-time, since you purposely delayed it by a few seconds?

Thank you sir.

Using RNNoise as VAD source / Way to improve VAD

(Anonymous) 2017-12-08 08:28 am (UTC)(link)
Hello.
First of all, thanks for the great work.

Recently, I needed VAD in my application and found that RNNoise has a VAD output, so I tried using it.
I commented out the code not related to VAD and made a prototype of the app. It works quite nicely despite its low computational cost. But on some audio samples, the VAD fails.

I'm trying to improve the VAD by increasing the feature size and the neural net size.
Can you give me some hints for improving VAD quality?

Amateur Radio

(Anonymous) 2017-12-16 08:08 am (UTC)(link)
Hi JMV,

In Amateur Radio, noise suppression is always a big topic. I've added your project to an SDR receiver (if you're not into SDR, see www.sdr-radio.com). Anyway, it works well when the audio is above the noise; when the noise and audio are at the same level, I can lose the audio.

This isn't a complaint, just an observation; your code isn't really designed for this situation. Now if only I could get you interested in Amateur Radio, your skills would have a big impact on noise reduction.

Thanks for this project, I'll be following it closely.

Re: Amateur Radio

(Anonymous) 2017-12-16 08:09 am (UTC)(link)
BTW - I'm Simon@sdr-radio.com - I should have said that :) .

Simon G4ELI

Cutting out environmental noise

[personal profile] h3yfinn 2018-01-11 08:47 am (UTC)(link)
Hey there, I'm trying to find a way of detecting any voice in a public (rural New Zealand) area. This is so I can scrub private conversations from publicly recorded files and then analyse the remaining files for birdsong.
We have quite poor quality recordings, so I thought your algorithm might be useful for removing background noise so that we can more easily detect voice (which we otherwise cannot detect at all using a webrtcVAD algorithm). I'm just setting it up now. I thought I'd put this here in case you had any idea whether your algorithm would work the way I hope.

Thanks a lot,

Re: Cutting out environmental noise

(Anonymous) 2018-06-07 10:37 am (UTC)(link)
For such purposes, you need a source separation framework and not noise cancellation. Ping me at praveshb @ iiitd dot ac dot in if you want to know more.

Thanks

The Audio Player is awfully nice!

(Anonymous) 2018-02-16 04:34 pm (UTC)(link)
The player, with the play line and the babble/dB/noise reduction on/off buttons, is nicely done.
Slightly off topic, but how did you do that?

Training Data

(Anonymous) 2018-03-12 02:14 pm (UTC)(link)
Hi Jean Marc, I am a college student. Your great work really impresses me. I want to reproduce your work to learn more, but when I checked your open-source code, the raw data file named 'denoise_data9.h5' was missing. Can you share this data with me? My email address is xiangkai@hust.edu.cn. Thank you very much.

Spectral non-stationarity metric

(Anonymous) 2018-06-20 11:22 am (UTC)(link)
Hi, will you disclose the secret spectral non-stationarity metric mentioned on the xiph page?

Re: Spectral non-stationarity metric

(Anonymous) 2018-08-27 06:39 am (UTC)(link)
Thanks, I found it. Great work on this! Will you help me settle something I've been thinking way too much about: do you pronounce it "RN-Noise," or "RNN-Noise?"

Asking permission to use Javascript version of RNNoise

(Anonymous) 2018-08-02 10:18 am (UTC)(link)
Hi, my name is Herru. I’ve seen your web demo https://people.xiph.org/~jm/demo/rnnoise/ about RNN for noise removal, and it’s awesome! After testing the live recording demo, I see that the noise removal runs in JavaScript. I would like to ask your permission to use the JavaScript version in my project (which is a non-commercial project), since the original code you link to is written in C and released under a BSD license.

Re: Asking permission to use Javascript version of RNNoise

(Anonymous) 2018-08-03 03:12 am (UTC)(link)
Thank you very much Jean-Marc.

Input and output data dimensions

(Anonymous) 2018-11-08 01:20 pm (UTC)(link)
Dear Jean-Marc,

Many thanks for publishing your exciting work and sharing your code.
I have two points that are not 100% clear to me after reading your documentation and code:

(1) Network training input and output data samples are finite sequences of 42- and 23-element vectors, respectively. But in the operation mode, the trained network is fed sequentially with a single input vector and outputs a single vector?

(2) Is the training data extracted from overlapping spectrogram segments?

Kind regards

Re: Input and output data dimensions

(Anonymous) 2018-11-08 04:33 pm (UTC)(link)
Thanks a lot for your quick answer!
Regarding 2), I think I have to specify my question:

Looking at your training code (rnnoise/training/rnn_train.py), you feed the network with sequences of 2000 42-element vectors/frames (= 1 training sample). Now I wonder if two distinct training samples might share a certain number of frames?
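
To make the question above concrete, here is a small sketch of the two ways a long run of per-frame feature vectors could be cut into fixed-length training sequences. This is an illustration only: `split_sequences` and its `hop` parameter are hypothetical names, and the thread does not state which scheme the training script actually uses. The comment's figures (2000 frames of 42 features) are scaled down to 4 frames so the result is easy to inspect.

```python
def split_sequences(frames, window_size, hop=None):
    """Cut a list of per-frame feature vectors into fixed-length sequences.

    With hop == window_size (the default) the sequences are disjoint.
    With a smaller hop, consecutive sequences share window_size - hop frames.
    Trailing frames that do not fill a whole window are dropped.
    """
    if hop is None:
        hop = window_size
    if window_size <= 0 or hop <= 0:
        raise ValueError("window_size and hop must be positive")
    return [frames[start:start + window_size]
            for start in range(0, len(frames) - window_size + 1, hop)]

# Ten frames of 42 features each; frame i is filled with the value i.
features = [[float(i)] * 42 for i in range(10)]

disjoint = split_sequences(features, window_size=4)            # starts at 0, 4
overlapping = split_sequences(features, window_size=4, hop=2)  # starts at 0, 2, 4, 6
```

With the disjoint scheme no frame appears in two training samples; with `hop=2`, each sample shares its last two frames with the start of the next one.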

Negative SNR cases

(Anonymous) 2019-01-24 05:44 am (UTC)(link)
Hi,
This application works very well. But if I give it inputs with a negative SNR (-6 dB or -3 dB), some parts of the speech are corrupted.
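
For anyone who wants to reproduce a negative-SNR test like this one, the sketch below mixes a speech signal and a noise signal at a chosen SNR in dB. It uses the standard definition SNR = 10·log10(P_speech / P_noise); the function names are illustrative and are not taken from the RNNoise code.

```python
import math

def power(signal):
    """Mean squared amplitude of a list of samples."""
    return sum(s * s for s in signal) / len(signal)

def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so the speech-to-noise power ratio equals snr_db, then add.

    Returns (mixture, gain). The mixture is as long as the shorter input.
    """
    noise_power = power(noise)
    if noise_power == 0:
        raise ValueError("noise must not be all zeros")
    gain = math.sqrt(power(speech) / (noise_power * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, noise)], gain

# Toy signals standing in for real recordings.
speech = [math.sin(2 * math.pi * 5 * t / 480) for t in range(480)]
noise = [math.sin(2 * math.pi * 37 * t / 480 + 1.0) for t in range(480)]

mixture, gain = mix_at_snr(speech, noise, snr_db=-6.0)
measured_snr = 10 * math.log10(power(speech) / power([gain * n for n in noise]))
```

At -6 dB the scaled noise carries roughly four times the power of the speech, so some loss of speech detail is not surprising for any suppressor.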

Processing stereo input seems to work great

(Anonymous) 2019-02-28 05:08 pm (UTC)(link)
I'm working on automating the processing of stereo 48000 Hz recording files to clean up noise.

Stereo appears to work fine. However, processing just one channel did a slightly better job.

Do you know of any consultants who may be able to assist with our project? We want to test this inserted in real time on the server and in the client of a WebRTC app as a proof of concept.

We are also trying to understand stereo AEC methods.
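
One way to handle stereo files with a mono, frame-based denoiser is to split the interleaved samples into channels, run each channel through its own denoiser instance, and interleave the results again. The sketch below shows only that plumbing. `make_denoiser` is a hypothetical factory returning a per-frame callable, and the 480-sample frame (10 ms at 48 kHz) is an assumption of this sketch; neither is an RNNoise API.

```python
FRAME_SIZE = 480  # assumption: 10 ms frames at 48 kHz

def deinterleave(samples, channels=2):
    """Split interleaved samples (L, R, L, R, ...) into per-channel lists."""
    return [samples[c::channels] for c in range(channels)]

def process_channel(channel, denoise_frame, frame_size=FRAME_SIZE):
    """Run a per-frame denoiser over one channel, zero-padding the last frame."""
    out = []
    for start in range(0, len(channel), frame_size):
        frame = channel[start:start + frame_size]
        padded = frame + [0.0] * (frame_size - len(frame))
        # Drop the padding again so the output length matches the input.
        out.extend(denoise_frame(padded)[:len(frame)])
    return out

def process_stereo(samples, make_denoiser):
    """Denoise each channel with its own denoiser, then re-interleave."""
    channels = [process_channel(ch, make_denoiser()) for ch in deinterleave(samples)]
    return [s for pair in zip(*channels) for s in pair]

# Stand-in denoiser that returns each frame unchanged.
def make_identity_denoiser():
    return lambda frame: list(frame)

stereo = [float(i) for i in range(2 * 1000)]
processed = process_stereo(stereo, make_identity_denoiser)
```

Using one denoiser instance per channel matters when the denoiser keeps state between frames, as a recurrent network does; sharing one instance across channels would mix the two histories.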

training

(Anonymous) 2019-03-01 05:53 am (UTC)(link)
Hi Jean Marc,

Thanks for this great project. I'd like to train this suppressor myself and have downloaded the GitHub version, but I don't see any scripts or other way of creating the noisy mixtures as you've described in your paper (with the random filtering, etc.). Are you willing to share the training scripts with us?

Thanks,
Chris