bin2bin Music Signals Inpainting

This page provides listening samples for evaluating bin2bin, a spectrogram inpainting approach based on a conditional GAN (cGAN), described in: A Score-aware Generative Approach for Music Signals Inpainting - C. Aironi, L. Gabrielli, S. Cornell, S. Squartini (2023)

Architecture overview

The bin2bin architecture is built around a U-Net generator (G) that inpaints CQT spectrograms. The U-Net is trained with a multi-objective loss that combines three terms: the binary cross-entropy (BCE) loss from a PatchGAN discriminator, a Spectral Convergence loss (LSC), and the mean squared error (MSE) between the true, input-aligned piano roll and the piano roll obtained from a pitch estimation module.
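As a rough illustration of how the three loss terms could be combined, here is a minimal NumPy sketch. It is not the authors' implementation. The function names, the weights `w_adv`, `w_sc`, `w_roll`, and the array shapes are all hypothetical, and the Spectral Convergence term assumes the commonly used definition (Frobenius norm of the spectrogram error divided by the Frobenius norm of the reference), which the paper may define differently.

```python
import numpy as np

def spectral_convergence(s_true, s_pred, eps=1e-8):
    # Assumed definition: ||S_true - S_pred||_F / ||S_true||_F
    return float(np.linalg.norm(s_true - s_pred) / (np.linalg.norm(s_true) + eps))

def bce(probs, target, eps=1e-7):
    # Binary cross-entropy averaged over all discriminator patches
    probs = np.clip(probs, eps, 1.0 - eps)
    return float(-np.mean(target * np.log(probs)
                          + (1.0 - target) * np.log(1.0 - probs)))

def mse(a, b):
    # Mean squared error between two arrays of the same shape
    return float(np.mean((a - b) ** 2))

def generator_loss(d_patch_probs, s_true, s_pred, roll_true, roll_est,
                   w_adv=1.0, w_sc=1.0, w_roll=1.0):
    # Hypothetical weighted sum of the three terms; the weights are
    # placeholders, not values taken from the paper.
    adv = bce(d_patch_probs, np.ones_like(d_patch_probs))  # generator wants "real"
    sc = spectral_convergence(s_true, s_pred)
    roll = mse(roll_true, roll_est)
    return w_adv * adv + w_sc * sc + w_roll * roll

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    s_true = np.abs(rng.normal(size=(84, 32)))     # toy CQT magnitude (bins x frames)
    s_pred = s_true + 0.1 * rng.normal(size=s_true.shape)
    roll_true = (rng.random((88, 32)) > 0.9).astype(float)  # toy piano roll
    roll_est = np.clip(roll_true + 0.05 * rng.normal(size=roll_true.shape), 0, 1)
    d_probs = rng.random((4, 4))                   # toy PatchGAN output probabilities
    print(generator_loss(d_probs, s_true, s_pred, roll_true, roll_est))
```

In this sketch the adversarial term rewards the generator when the discriminator labels its patches as real, while the other two terms tie the output to the reference spectrogram and to the score information respectively.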

Small gap (375 ms)

clean (reference)

corrupted

restored


clean (reference)

corrupted

restored


clean (reference)

corrupted

restored


clean (reference)

corrupted

restored

Medium gap (750 ms)

clean (reference)

corrupted

restored


clean (reference)

corrupted

restored


clean (reference)

corrupted

restored