• No results found

Mining Large-Scale Music Data Sets

N/A
N/A
Protected

Academic year: 2021

Share "Mining Large-Scale Music Data Sets"

Copied!
15
0
0

Loading.... (view fulltext now)

Full text

(1)

1.

Music Audio Representations

2.

The Million Song Dataset

3.

Finding Similar Items (Cover Songs)

4.

Current Work

Mining Large-Scale

Music Data Sets

Dan Ellis & Thierry Bertin-Mahieux

Lab

oratory for

R

ecognition and

O

rganization of

S

peech and

A

udio

Dept. Electrical Eng., Columbia Univ., NY USA

(2)

The Problem

One Million Songs

Query track

“Similar” tracks

Time Frequency 0 1 2 3 4 5 6 7 8 9 10 0 1000 2000 3000 4000 Time Frequency 0 1 2 3 4 5 6 7 8 9 10 0 1000 2000 3000 4000 Time Frequency 0 1 2 3 4 5 6 7 8 9 10 0 1000 2000 3000 4000 Time Frequency 0 1 2 3 4 5 6 7 8 9 10 0 1000 2000 3000 4000
(3)

1.

Music Audio Representations

We need a description of the audio that

contains “

suitable

” detail

Speech Recognition uses

Mel-Frequency

Cepstral Coefficients

(MFCCs)

freq / Hz Coefficient freq / Hz

Let It Be (LIB-1) - log-freq specgram

329 1396 5915 MFCCs 2 4 6 8 10 12 time / sec

Noise excited MFCC resynthesis (LIB-2)

0 5 10 15 20 25

329 1396 5915

(4)

Chroma Features

We’d like to preserve the notes

at least within one

octave

Let It Be - log-freq specgram (LIB-1)

Chroma features C D E G A B

Shepard tone resynthesis of chroma (LIB-3)

MFCC-filtered shepard tones (LIB-4)

freq / Hz 300 1400 6000 freq / Hz chroma bin 300 1400 6000 freq / Hz 300 1400 6000 time / sec 0 5 10 15 20 25

W

ar

re

n

et

al

. 2003

(5)

Beat-Synchronous Chroma

Perform

Beat Tracking

.. and record one chroma vector per beat

compact representation of harmonies

Let It Be - log-freq specgram (LIB-1)

Onset envelope + beat times

Beat-synchronous chroma

Beat-synchronous chroma + Shepard resynthesis (LIB-6)

freq / Hz 300 1400 6000 freq / Hz 300 1400 6000 C D E G A B chroma bin time / sec 0 5 10 15 20 25

(6)

2.

Million Song Dataset (MSD)

Commercial-scale dataset

available to MIR researchers

1M pop songs

250 GB of features

(6 years of listening)

Many facets:

Features,

Lyrics,

Tags,

Covers,

Listeners ...

http://labrosa.ee.columbia.edu/millionsong

Thierry Bertin-Mahieux

(7)

MSD Audio Features

Use

Echo Nest

“Analyze” features

segment audio into variable-length “events”

represent by

12 chroma +

12 “timbre”

supports

a crude

resynthesis

:

240 761 2416 C D E G A B time / sec freq / Hz freq / Hz 0 2 4 6 8 10 12 14 240 761 2416

TRKUYPW128F92E1FC0 - Tori Amos - Smells Like Teen Spirit

Original

Resynth

EN

Features

(8)

MSD Metadata

artist: 'Tori Amos'

release: 'LIVE AT MONTREUX'

title: 'Smells Like Teen Spirit' id: 'TRKUYPW128F92E1FC0' key: 5 mode: 0 loudness: -16.6780 tempo: 87.2330 time_signature: 4 duration: 216.4502 sample_rate: 22050 audio_md5: '8' 7digitalid: 5764727 familiarity: 0.8500 year: 1992

%5489,4468, Smells Like Teen Spirit TRTUOVJ128E078EE10 Nirvana

TRFZJOZ128F4263BE3 Weird Al Yankovic TRJHCKN12903CDD274 Pleasure Beach TRELTOJ128F42748B7 The Flying Pickets

TRJKBXL128F92F994D Rhythms Del Mundo feat. Shanade TRIHLAW128F429BBF8 The Bad Plus

TRKUYPW128F92E1FC0 Tori Amos

12 hello 11 i 10 a 9 and 7 it 6 are 6 we 6 now 6 here 6 us 6 entertain 4 the 4 feel 4 yeah 3 to 3 my 3 is 3 with 3 oh 3 out 3 an 3 light 3 less 3 danger 100.0 – cover 57.0 – covers 43.0 – female vocalists 42.0 – piano 34.0 – alternative 14.0 – singer-songwriter 11.0 – acoustic 8.0 – tori amos 7.0 – beautiful 6.0 – rock 6.0 – pop 6.0 – Nirvana 6.0 – female vocalist 6.0 – 90s

5.0 – out of genre covers

5.0 – cover songs 4.0 – soft rock 4.0 – nirvana cover 4.0 – Mellow 4.0 – alternative rock 3.0 – chick rock 3.0 – Ballad 3.0 – Awesome Covers 2.0 – melancholic 2.0 – k00l ch1x 2.0 – indie 2.0 – female vocalistist 2.0 – female 2.0 – cover song 2.0 – american

EN Metadata

SHS Covers

MxM Lyric Bag-of-Words

(9)

3.

Finding Similar Items

If it’s really the same, we can use

fingerprinting

(e.g. Shazam):

Query audio 0 500 1000 1500 2000 2500 3000 3500 4000

Match: 05−Full Circle at 0.032 sec

0.5 1 1.5 2 2.5 3 3.5 4 4.5 5 0 500 1000 1500 2000 2500 3000 3500 4000 time / sec freq / Hz

find local “landmarks”

quantize by pairs

inverted index

(10)

Dynamic Programming Alignment

We can match the chroma representations

by

DP alignment

robust to timing changes

quite efficient

best performer

in MIREX Cover Song evaluations

(11)

Large-Scale Cover Matching

How can we find covers in

1M songs

?

@ 1 sec / comparison, one search = 11.5 CPU-days

full N

2

mining = 16,000 CPU-years

Hashing

?

landmarks from chroma patches (like Shazam)

represent item by distribution of contour fragments

one stage in a filtering hierarchy...

(12)

Euclidean Cover Matching

Reduce cover matching to

nearest-neighbor

search

in fixed-size beat-chroma patches

Right “

distance

”?

principal components

learned weightings

Alignment/

segmentation?

music segmentation

multiple probes

framing-insensitive

representation

Ellis & Poliner ’08

Bertin-Mahieux & Ellis ’12

Between the Bars − Elliot Smith − pwr .25

120 130 140 150 160 170 2 4 6 8 10 12

Between the Bars − Glenn Phillips − pwr .25 − −17 beats − trsp 2

120 130 140 150 160 170 2 4 6 8 10 12

pointwise product (sum(12x300) = 19.34)

time / beats chroma bin 120 130 140 150 160 170 2 4 6 8 10 12

(13)

4.

Other Work: Pattern Mining

Cluster

beat-synchronous chroma

patches

#32273 - 13 instances #51917 - 13 instances #65512 - 10 instances #9667 - 9 instances #10929 - 7 instances #61202 - 6 instances #55881 - 5 instances 5 10 15 20 #68445 - 5 instances 5 10 15 time / beats G F D C A G F D C A G F D C A G F D C A chroma bins

a20-top10x5-cp4-4p0

(14)

Other Work: MSD Challenge

Using “Taste Profile” data

User-Song-playcount

Challenge:

Rank 1M songs to

complete half a playlist

using audio, metadata, lyrics, whatever

Kaggle.com based contest

self-service evaluation, leader board

Results at MIREX, AdMIRe

b80344d063b5ccb3212f76538f3d9e43d87dca9e SOAKIMP12A8C130995 1 b80344d063b5ccb3212f76538f3d9e43d87dca9e SOAPDEY12A81C210A9 1 b80344d063b5ccb3212f76538f3d9e43d87dca9e SOBBMDR12A8C13253B 2 b80344d063b5ccb3212f76538f3d9e43d87dca9e SOCNMUH12A6D4F6E6D 1 b80344d063b5ccb3212f76538f3d9e43d87dca9e SODACBL12A8C13C273 1 b80344d063b5ccb3212f76538f3d9e43d87dca9e SODDNQT12A6D4F5F7E 5

(15)

Summary

Finding Musical Similarity at Large Scale

Beat Chroma

Representation

Million Song Dataset

http://labrosa.ee.columbia.edu/ http://labrosa.ee.columbia.edu/millionsong

References

Related documents

GREEK SCHOOL GRADUATION AND DINNER DANCE Our Annual Afternoon School End of the Year Program will take place on Saturday, June 6th at 4:00 pm at our Community Center.. Mark the

Weed lines and quality blanks on auctions, bluegill color dae yang crankbait lure would catch bass is the water crank is really want, i like these.. Smoked the square crankbait

The BIOMIN Mycotoxin Survey also determines the risk levels of the mycotoxins analyzed according to the percentage of samples in the different contamination ranges and based on

We support and fund research to help people living with dementia and their families, as well as research to develop new treatments for the future.. From improvements in

Bosnia Kosovo Macedonia Montenegro Serbia Burundi Rwanda Liberia OPT 0% 25% 50% 75% 100% Youth (18-35y) Women EURED MFS I MFS II UNDP UNDP-MRBD YES Male Owner Female Owner 0% 25%

This is the ability to detect a non-pregnant cow as early as 24 days after service (artificial insemination and naturally using a male animal) as compared to the conventional method

Introduction to crystallography and mineralogy. Crystal systems and classes. Fundamental concepts of mineral chemistry. Classification of minerals by crystal systems and chemical

The results showed that (1) the liquidity management of Islamic banking, there are three options that can be used the Certificates of Bank Indonesia Sharia (SBIS),