
WO2007036846A2 - Procede et dispositif d'analyse automatique de structures de pistes musicales - Google Patents

Procede et dispositif d'analyse automatique de structures de pistes musicales

Info

Publication number
WO2007036846A2
Authority
WO
WIPO (PCT)
Prior art keywords
signal
music track
channels
channel
beat
Prior art date
Application number
PCT/IB2006/053398
Other languages
English (en)
Other versions
WO2007036846A3 (fr)
Inventor
Aweke N. Lemma
Francesco F. M. Zijderveld
Original Assignee
Koninklijke Philips Electronics N.V.
Priority date
Filing date
Publication date
Application filed by Koninklijke Philips Electronics N.V. filed Critical Koninklijke Philips Electronics N.V.
Publication of WO2007036846A2 publication Critical patent/WO2007036846A2/fr
Publication of WO2007036846A3 publication Critical patent/WO2007036846A3/fr

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H1/00 Details of electrophonic musical instruments
    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11B INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B27/00 Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
    • G11B27/02 Editing, e.g. varying the order of information signals recorded on, or reproduced from, record carriers
    • G11B27/031 Electronic editing of digitised analogue information signals, e.g. audio or video signals
    • G11B27/038 Cross-faders therefor
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2210/00 Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
    • G10H2210/031 Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal
    • G10H2210/061 Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal for extraction of musical phrases, isolation of musically relevant segments, e.g. musical thumbnail generation, or for temporal structure analysis of a musical piece, e.g. determination of the movement sequence of a musical work
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2210/00 Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
    • G10H2210/031 Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal
    • G10H2210/076 Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal for extraction of timing, tempo; Beat detection

Definitions

  • The present invention relates to a method and apparatus for automatic structure analysis of a piece of music or music track.
  • In particular, it relates to analysis of structural features such as the location of the accented beats of the music track and the meter (or bar) of the music track.
  • Of particular interest are the rhythmically coherent beats, or accented beats, within the meter.
  • The accent is the musical stress applied to a note and contributes to the sensation of the beat. Therefore, when the accented beats of two tracks are not matched, the transition between them can be annoying.
  • This matters for AutoDJ systems, which select and sort music tracks based on similarity criteria and play them in a smooth, rhythmically consistent way.
  • Music tracks stored in a database 101 are analysed to extract representative parameters 103. These include, among other things, the end of the intro section, the beginning of the outro section, phrase or bar boundaries, tempo, beat locations, and the harmonic signature. These parameters are usually computed offline and stored in a linked database 105.
  • A playlist 107 is generated from the store 101 of music tracks.
  • The playlist is compiled on the basis of the parameters in the database 105 and a set of user preferences.
  • A transition planner 109 compares the extracted parameters corresponding to the music tracks in the playlist and generates a set of mix parameters to be used by the player 111.
  • The player 111 streams the music tracks in the playlist to the output device 113, for example a loudspeaker, using the mix parameters, which represent an optimal compromise between user preferences and computed features, to provide a smooth transition.
  • US6542869 discloses yet another method of extracting song structure.
  • The evolution of the cepstral frequency coefficients is used to determine events in songs.
  • However, it does not provide a means of sorting these detected events. That is, the method detects structural events that can represent any of an intro, an outro, a beat onset, a phrase boundary, break points and so on, but it does not differentiate between these events, making it practically inapplicable to an AutoDJ implementation.
  • This is achieved by a method for determining a location of accented beats of a music track, the method comprising the steps of: extracting a signal from the music track; demultiplexing the signal across a number of channels, the number of channels corresponding to a candidate meter of the music track such that each consecutive channel contains a consecutive portion of the signal, each portion of the signal corresponding to a consecutive beat period; and determining the location of the accented beats as a specific one of the channels having signal properties different from those of the other channels.
  • The invention also provides apparatus for determining a location of accented beats of a music track, comprising: an extractor for extracting a signal from the music track; a demultiplexer for demultiplexing the signal across a number of channels such that each consecutive channel contains a consecutive portion of the signal, each portion of the signal corresponding to a consecutive beat period, the number of channels corresponding to a candidate meter of the music track; a comparator for determining the channel having signal properties different from those of the other channels; and a selector for selecting a specific one of the channels having signal properties different from those of the other channels and determining this channel as the location of the accented beats.
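  The demultiplexing idea at the heart of these claims can be sketched in a few lines of Python. This is a minimal illustration, not the patent's implementation: the names `demultiplex` and `find_accented_channel` are invented here, and the comparison measure (deviation of a channel's mean from the overall mean) is one assumed stand-in for "signal properties different from those of the other channels".

```python
def demultiplex(feature, meter):
    """Route consecutive beat-period values of `feature` into `meter`
    channels, so that every meter-th beat lands in the same channel."""
    channels = [[] for _ in range(meter)]
    for i, value in enumerate(feature):
        channels[i % meter].append(value)
    return channels

def find_accented_channel(channels):
    """Return the index of the channel whose mean deviates most from the
    overall mean of all channels -- a simple stand-in for the comparator
    and selector stages."""
    means = [sum(c) / len(c) for c in channels]
    overall = sum(means) / len(means)
    return max(range(len(means)), key=lambda m: abs(means[m] - overall))

# Per-beat energies with an accent on every fourth beat:
energies = [1.0, 1.0, 1.0, 3.0] * 8
channels = demultiplex(energies, 4)
print(find_accented_channel(channels))  # channel index 3 holds the accents
```

  If the candidate meter matches the track, all accents fall into one channel and that channel's statistics differ clearly from the rest, which is exactly the property the claim exploits.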
  • In this way, the location of the accented beats can be determined accurately.
  • The method may further comprise the step of determining the beat onsets of the music track to ensure that each consecutive channel contains a corresponding portion of the signal aligned with the corresponding consecutive beat onsets; the temporal duration of each portion of the signal may correspond to the beat period of the music track. The method may further comprise the step of estimating the candidate meter of the music track to increase the accuracy of locating the accented beats.
  • The difference in the signal properties may be derived from the corresponding correlation coefficients, the difference in the signal profile of the music track, the number of zero crossings of the signal profile of the music track, and so on, which enables a simple and effective comparison to be carried out to locate the channel containing the accented beats.
  • Preferably, the demultiplexing step is repeated for different candidate meters. The results can then be compared and the most consistent result used to locate the accented beats.
  • The method may further comprise the step of determining the meter structure of the music track as the candidate meter for which a channel having signal properties different from those of the other channels can be distinguished.
  • The demultiplexing stage may comprise at least two demultiplexers connected in parallel, and the correlating stage may comprise at least two correlators connected in series with the respective demultiplexers.
  • A beat onset estimator for estimating the tempo and beat onset may be incorporated in the apparatus for controlling the width of each channel such that the width of each channel corresponds to a beat period.
  • The present invention may also provide a method for determining the meter of a music track, the method comprising the steps of: (a) extracting a signal from the music track; (b) demultiplexing the signal across a plurality of channels such that each consecutive channel contains a consecutive portion of the signal, each portion of the signal corresponding to a consecutive beat period; (c) computing the cross correlation of the signal across the plurality of channels; (d) carrying out steps (b) and (c) for a plurality of different hypotheses in which at least the number of channels is varied; and (e) determining the meter of the music track as the number of channels of the hypothesis having the most consistent result.
  • The meter may be determined as the argument corresponding to the minimum over the plurality of hypotheses.
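  The hypothesis search in steps (b) to (e) can be sketched as follows. The function name `estimate_meter` and the scoring rule (how strongly one channel's mean stands out under each candidate meter) are assumptions chosen for illustration; the claim only requires selecting the hypothesis giving the most consistent result.

```python
def estimate_meter(beat_feature, candidates=(3, 4, 5)):
    """Demultiplex the per-beat feature into M channels for each candidate
    meter M and keep the M for which one channel's mean stands out most
    from the overall mean."""
    best_meter, best_score = None, -1.0
    for M in candidates:
        n = (len(beat_feature) // M) * M          # use whole bars only
        channels = [beat_feature[i:n:M] for i in range(M)]
        means = [sum(c) / len(c) for c in channels]
        overall = sum(means) / len(means)
        score = max(abs(mu - overall) for mu in means)
        if score > best_score:
            best_meter, best_score = M, score
    return best_meter

# An accent every fourth beat is only consistent under the M = 4 hypothesis:
print(estimate_meter([1.0, 1.0, 1.0, 3.0] * 6))  # 4
```

  Under a wrong hypothesis the accents smear evenly across all channels, so no channel stands out and the score stays low; the correct meter concentrates the accents in a single channel.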
  • Figure 1 is a schematic diagram of the functions of a known automatic DJ system.
  • Figure 2 is a schematic of apparatus according to a first embodiment of the present invention.
  • Figure 3 is a schematic of apparatus according to a second embodiment of the present invention.
  • Figures 4a and 4b illustrate graphical representations of an example of an input and output, respectively, of the demultiplexer of the apparatus of the embodiments of the present invention.
  • Figure 5 is a schematic of apparatus according to a third embodiment of the present invention.
  • Figure 6 is a schematic of apparatus according to a fourth embodiment of the present invention.
  • The apparatus comprises an input terminal 201 connected to the input of a framing means 203.
  • The output of the framing means 203 is connected to the input of a Fast Fourier Transform (FFT) processor 205.
  • The output of the FFT processor 205 is connected to a weighted averaging means 207.
  • The output of the weighted averaging means 207 is connected to the input of a buffer 209.
  • The output of the buffer 209 is connected to the input of a demultiplexer 211.
  • The control input of the demultiplexer 211 is connected to a genre or meter input terminal 213 of the apparatus 200.
  • The plurality of outputs of the demultiplexer 211 are connected to the respective inputs of a signal comparator 215.
  • The plurality of outputs of the signal comparator 215 are connected to respective inputs of a selector 217.
  • The output of the selector 217 is connected to the output terminal 219 of the apparatus 200.
  • The apparatus 200 identifies the accented beats of an input music track x[n].
  • The input music track x[n] is processed to generate a song feature signal E[k].
  • The input music track (or song) is divided into a plurality of frames by the framing means 203.
  • The Fast Fourier Transform of each frame is derived by the processor 205 and weighted by the weighted averaging means 207.
  • The song feature signal output from the weighted averaging means 207 is temporarily stored in the buffer 209 for delivery to the demultiplexer 211. More particularly, if X_k[f] represents the FFT of the signal x[n] corresponding to the k-th frame, then the output E[k] in the buffer 209 is a weighted average of the FFT magnitudes, of the form

    E[k] = (1/N) Σ_{f=0}^{N-1} C[f] · |X_k[f]|,

    where C[f] is the weighting factor for the frequency component f and N is the FFT frame size in number of samples.
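  The computation of E[k] can be sketched with a direct DFT (an FFT would be used in practice; the direct form keeps the sketch dependency-free). The magnitude-based weighted average is an assumption for illustration, since the patent's exact formula is not reproduced in this text.

```python
import cmath

def feature_signal(x, frame_size, weights):
    """E[k]: weighted average of the DFT magnitudes of the k-th frame of x."""
    E = []
    for k in range(len(x) // frame_size):
        frame = x[k * frame_size:(k + 1) * frame_size]
        e = 0.0
        for f in range(frame_size):
            # DFT bin f of the k-th frame, i.e. X_k[f]
            Xkf = sum(frame[n] * cmath.exp(-2j * cmath.pi * f * n / frame_size)
                      for n in range(frame_size))
            e += weights[f] * abs(Xkf)
        E.append(e / frame_size)
    return E

# Two identical impulse frames have a flat spectrum, so each E[k] is 1.0:
print(feature_signal([1.0, 0.0, 0.0, 0.0] * 2, 4, [1.0] * 4))
```

  The weights C[f] allow the feature to emphasise the frequency bands (e.g. low frequencies for bass-drum accents) that carry the most rhythmic information.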
  • The song meter information, or a priori acquired genre data (for instance, dance songs usually have a 4/4 meter structure), is input on the terminal 213.
  • The song feature signal E[k] is then demultiplexed into M channels in the demultiplexer 211 such that each consecutive channel contains a corresponding consecutive beat.
  • The M signals are compared against each other in the comparator 215 to determine which channel has differentiating properties. Since accented beats have differentiating properties, the channel identified as containing such a signal will be the one carrying the accented beat. This channel is then selected and marked as the accented beat by the selector 217 and output on the output terminal 219 of the apparatus 200.
  • The output of the buffer 209 is connected to the input of a tempo and beat onset estimator 301 and to the input of a demultiplexer 303.
  • The output of the tempo and beat onset estimator 301 is also connected to the input of the demultiplexer 303.
  • The plurality of outputs of the demultiplexer 303 are connected to respective inputs of a correlator 307.
  • The control input of the demultiplexer 303 is connected to a meter input terminal 305.
  • The plurality of outputs of the correlator 307 are connected to a selector 309.
  • The output of the selector 309 is connected to an output terminal 219 of the apparatus 300.
  • As before, the song feature signal E[k] is extracted and temporarily stored in the buffer 209.
  • Given the feature signal E[k], the tempo and the beat onset positions are first estimated by the estimator 301. Subsequently, the demultiplexer 303 performs block-wise demultiplexing of the feature signal over M channels, where the block size is equal to the beat period and M is the meter of the song.
  • An example of a typical input signal E[k] and output of the demultiplexer 303 are shown in Figures 4a and 4b, respectively.
  • The beat onset positions are determined using known beat detection algorithms such that the feature signal E[k] is segmented into blocks of beat periods as shown in Figure 4a.
  • Wave shapes corresponding to accented beats are routed into one channel, channel 4, as shown in Figure 4b.
  • The correlator 307 then computes the cross correlation function φ_m between each channel m and the remaining channels.
  • The fourth channel has a slightly different property from all the other channels. Thus, it corresponds to the accented beat.
  • The accented beat is thus located by the selector 309. Once the accented beat has been selected, two songs can be mixed by aligning their accented beats to provide a rhythmically coherent transition.
  • In an alternative embodiment, a different correlation function may be used to compare the channels.
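  One way to realise such a correlation-based comparator is sketched below. The exact definition of φ_m is not reproduced in this text, so the normalized correlation of each channel against the element-wise mean of the other channels is an assumption chosen for illustration; the accented channel then shows the lowest correlation with the rest.

```python
def channel_correlations(channels):
    """Normalized correlation of each channel with the element-wise mean
    of the remaining channels; the outlier (accented) channel scores lowest."""
    def corr(a, b):
        num = sum(x * y for x, y in zip(a, b))
        den = (sum(x * x for x in a) * sum(y * y for y in b)) ** 0.5
        return num / den if den else 0.0
    result = []
    for m, channel in enumerate(channels):
        others = [c for i, c in enumerate(channels) if i != m]
        mean_other = [sum(v) / len(v) for v in zip(*others)]
        result.append(corr(channel, mean_other))
    return result

# Channel 3 carries the accents and correlates least with the others:
chans = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [3.0, 0.5, 2.0]]
scores = channel_correlations(chans)
print(min(range(4), key=lambda m: scores[m]))  # 3
```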
  • In a further embodiment, the demultiplexer is followed by a filtering stage to remove transients similar to those seen in channel 4 of Figure 4b.
  • The apparatus comprises a demultiplexer 303, a correlator 307 and a selector 309 similar to those of Figure 3.
  • The first stage is the same as that of Figure 3 and is not shown here.
  • A filter 501 is incorporated between the output of the demultiplexer 303 and the correlator 307. The filter 501 removes transients; the correlation results are therefore less affected by transient noise, and the apparatus gives more reliable results.
  • The wave shapes output in the different channels by the demultiplexer of the embodiments above can be compared in several different ways. For example, correlation coefficients, the signal difference, or the number of zero crossings could be used as a measure to differentiate the beat types.
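  As one concrete example of the simpler measures just mentioned, a zero-crossing count per channel can be sketched as follows; the function name and the mean-removal step are assumptions for illustration.

```python
def zero_crossings(signal):
    """Count sign changes of the mean-removed signal; a channel with a
    markedly different count can be flagged as the accented one."""
    mean = sum(signal) / len(signal)
    centred = [s - mean for s in signal]
    return sum(1 for a, b in zip(centred, centred[1:]) if a * b < 0)

print(zero_crossings([1.0, -1.0, 1.0, -1.0]))  # 3 sign changes
print(zero_crossings([1.0, 1.0, 1.0, 1.0]))    # 0, a flat channel
```

  Such a count is cheap to compute and, like the correlation measures, only needs to separate one channel from the rest rather than characterise the signal fully.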
  • The meter of the song may be derived from the genre of the song. However, this may give an inaccurate result, since in some cases two songs of the same genre have different meters. Therefore, according to a preferred embodiment, the meter may be determined by the apparatus shown in Figure 6.
  • The input music track (or song) x[n] is applied to the input of a beat detector 601 and a feature extractor 603.
  • The output of the beat detector 601 is connected to the feature extractor 603 and to the inputs of a first demultiplexer 605 and a second demultiplexer 607. Only two demultiplexers are shown in Figure 6; however, it will be appreciated that the apparatus may comprise any number of demultiplexers, depending on the number of hypotheses.
  • The plurality of outputs of the first and second demultiplexers 605, 607 are connected to respective inputs of a first and a second cross correlator 609, 611.
  • The outputs of the first and second cross correlators 609, 611 are connected to first and second selectors 613, 615.
  • The outputs of the first and second selectors 613, 615 are connected to a comparator 617.
  • The output of the comparator 617 is connected to the output terminal of the apparatus.
  • The beat onset positions are determined using known beat detection algorithms in the detector 601, and a feature signal r[k] is output.
  • The feature signal r[k] is segmented into blocks of beat periods as shown in Figure 4a.
  • The segmented signal r[k] is block-wise demultiplexed into M channels.
  • For example, the first demultiplexer 605 demultiplexes the signal r[k] into 4 channels.
  • If the fourth channel has a slightly different property from all the other channels then, according to the invention, it corresponds to the accented beat.
  • For each hypothesis M_i, the minimum value χ(M_i) of the cross correlation is computed.
  • The meter M of the song is then determined by the comparator 617 as the argument corresponding to the minimum over the n hypotheses.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Electrophonic Musical Instruments (AREA)
  • Auxiliary Devices For Music (AREA)

Abstract

A method of determining the location of accented beats in music tracks comprises: extracting a signal from the music track; demultiplexing the signal across a number of channels, the number of channels corresponding to the meter of the music track, such that each consecutive channel contains a consecutive portion of the signal, each portion of the signal corresponding to a consecutive beat period; and finally, analysing the content of each channel to determine the channel having different signal properties.
PCT/IB2006/053398 2005-09-30 2006-09-20 Procede et dispositif d'analyse automatique de structures de pistes musicales WO2007036846A2 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
EP05109083.5 2005-09-30
EP05109083 2005-09-30

Publications (2)

Publication Number Publication Date
WO2007036846A2 true WO2007036846A2 (fr) 2007-04-05
WO2007036846A3 WO2007036846A3 (fr) 2007-11-29

Family

ID=37900145

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IB2006/053398 WO2007036846A2 (fr) 2005-09-30 2006-09-20 Procede et dispositif d'analyse automatique de structures de pistes musicales

Country Status (1)

Country Link
WO (1) WO2007036846A2 (fr)


Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5734731A (en) * 1994-11-29 1998-03-31 Marx; Elliot S. Real time audio mixer
EP1162621A1 (fr) * 2000-05-11 2001-12-12 Hewlett-Packard Company, A Delaware Corporation Compilation automatique de chansons
JP4555072B2 (ja) * 2002-05-06 2010-09-29 シンクロネイション インコーポレイテッド ローカライズされたオーディオ・ネットワークおよび関連するディジタル・アクセサリ
WO2007072394A2 (fr) * 2005-12-22 2007-06-28 Koninklijke Philips Electronics N.V. Analyse de structure audio

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8525012B1 (en) 2011-10-25 2013-09-03 Mixwolf LLC System and method for selecting measure groupings for mixing song data
US9070352B1 (en) 2011-10-25 2015-06-30 Mixwolf LLC System and method for mixing song data using measure groupings
US9111519B1 (en) 2011-10-26 2015-08-18 Mixwolf LLC System and method for generating cuepoints for mixing song data
GB2539875B (en) * 2015-06-22 2017-09-20 Time Machine Capital Ltd Music Context System, Audio Track Structure and method of Real-Time Synchronization of Musical Content
GB2550090A (en) * 2015-06-22 2017-11-08 Time Machine Capital Ltd Method of splicing together two audio sections and computer program product therefor
GB2550090B (en) * 2015-06-22 2019-10-09 Time Machine Capital Ltd Method of splicing together two audio sections and computer program product therefor

Also Published As

Publication number Publication date
WO2007036846A3 (fr) 2007-11-29


Legal Events

Date Code Title Description
NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 06809357

Country of ref document: EP

Kind code of ref document: A2
