Input and output data dimensions
Many thanks for publishing your exciting work and sharing your code.
Two points are still not entirely clear to me after reading your documentation and code:
(1) The network's training input and output samples are finite sequences of 42- and 23-element vectors, respectively. In operation mode, however, is the trained network fed one input vector at a time, producing one output vector per step?
(2) Is the training data extracted from overlapping spectrogram segments?
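To make point (1) concrete, here is a minimal sketch of how I currently picture it. It is a toy recurrent layer with random weights, not the model from your repository; only the sizes 42 and 23 come from the question, while `HID_DIM`, `step`, and `run_sequence` are names I made up for illustration. My assumption is that feeding a whole sequence and feeding frames one at a time give the same outputs as long as the hidden state is carried between calls.

```python
import numpy as np

# 42 and 23 come from the question; HID_DIM is an arbitrary, hypothetical size.
IN_DIM, HID_DIM, OUT_DIM = 42, 24, 23

rng = np.random.default_rng(0)
# Random weights for a toy recurrent layer (not the repository's trained model).
W_in = rng.standard_normal((HID_DIM, IN_DIM)) * 0.1
W_rec = rng.standard_normal((HID_DIM, HID_DIM)) * 0.1
W_out = rng.standard_normal((OUT_DIM, HID_DIM)) * 0.1


def step(x, h):
    """Process one 42-element frame; return a 23-element output and the new state."""
    h_new = np.tanh(W_in @ x + W_rec @ h)
    y = 1.0 / (1.0 + np.exp(-(W_out @ h_new)))  # sigmoid keeps outputs in (0, 1)
    return y, h_new


def run_sequence(frames):
    """Training-style call: a whole (T, 42) sequence in, a (T, 23) sequence out."""
    h = np.zeros(HID_DIM)
    outputs = []
    for x in frames:
        y, h = step(x, h)
        outputs.append(y)
    return np.stack(outputs)


frames = rng.standard_normal((100, IN_DIM))
batch_out = run_sequence(frames)

# Operation-style use: one frame per call, with the state kept between calls.
h = np.zeros(HID_DIM)
stream = []
for x in frames:
    y, h = step(x, h)
    stream.append(y)
stream_out = np.stack(stream)

print(batch_out.shape, np.allclose(batch_out, stream_out))
```

If this matches what the code does, then the sequence length used in training would only matter for backpropagation, not for how the network is run afterwards.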
Kind regards