Grupo de Tratamiento de Imágenes

Deep Learning System for Music Detection

On March 26th at 12:00, Room B-221.

Music identification in TV content has become one of the main aims for copyright purposes. Detecting the boundaries of musical segments, even when the music is quiet relative to other sound sources, is therefore a key starting point for this objective, and it will be followed by source separation and fingerprint identification steps. Using mel-spectrograms and deep learning techniques, an audio signal segment is turned into a stream of 2D images and fed to a ResNet-based Convolutional Neural Network (CNN), which classifies the segment according to the kinds of sources producing its sound. A post-processing stage then filters the label stream to reduce errors, and the onset and duration of the music segments are extracted.
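The post-processing stage described in the abstract could be sketched roughly as follows. This is only an illustration under stated assumptions, not the method presented in the talk: the labels are assumed to be binary (1 = music present, 0 = no music), the filter is assumed to be a sliding majority vote, and the names `smooth_labels`, `music_segments`, `window` and `hop_seconds` are hypothetical.

```python
from itertools import groupby


def smooth_labels(labels, window=5):
    """Majority-vote filter over a sliding window of binary frame labels.

    Assumption: 1 means "music present", 0 means "no music". The actual
    filter used in the talk is not specified in the abstract.
    """
    half = window // 2
    smoothed = []
    for i in range(len(labels)):
        lo, hi = max(0, i - half), min(len(labels), i + half + 1)
        neighbourhood = labels[lo:hi]
        # Keep the label held by a strict majority of the neighbourhood.
        smoothed.append(1 if 2 * sum(neighbourhood) > len(neighbourhood) else 0)
    return smoothed


def music_segments(labels, hop_seconds=1.0):
    """Return (onset, duration) pairs in seconds for each run of music labels.

    hop_seconds is a hypothetical parameter: the time between two labels.
    """
    segments = []
    index = 0
    for value, run in groupby(labels):
        length = len(list(run))
        if value == 1:
            segments.append((index * hop_seconds, length * hop_seconds))
        index += length
    return segments


# Toy label stream with one spurious gap and one isolated false positive.
raw = [0, 0, 1, 1, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0]
clean = smooth_labels(raw, window=3)
segments = music_segments(clean, hop_seconds=0.5)
print(segments)  # [(1.0, 3.0)]
```

In this toy stream the short gap inside the music run is filled and the isolated positive label is discarded, so a single segment with its onset and duration remains.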

Raquel Dueñas received the Bachelor of Engineering in Telecommunication Technologies and Services (intensification in Sound and Image) in 2018 and is currently studying for the Master in Signal Theory and Communications (Track on Signal Processing and Machine Learning for Big Data), both at the Universidad Politécnica de Madrid (UPM), Madrid, Spain. She has been a member of the Grupo de Tratamiento de Imágenes (Image Processing Group) at the UPM since 2018. Her current research is in the areas of video analysis and processing and deep learning.
