WO2012163013A1 - Music query method and apparatus - Google Patents
Music query method and apparatus
- Publication number
- WO2012163013A1 (PCT/CN2011/080977)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- music
- fingerprint
- queried
- segment
- feature
- Prior art date
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
- G10H1/00—Details of electrophonic musical instruments
- G10H1/0008—Associated control or indicating means
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/60—Information retrieval; Database structures therefor; File system structures therefor of audio data
- G06F16/68—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/683—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11B—INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
- G11B27/00—Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
- G11B27/10—Indexing; Addressing; Timing or synchronising; Measuring tape travel
- G11B27/19—Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier
- G11B27/28—Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier by using information signals recorded by the same method as the main recording
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
- G10H2240/00—Data organisation or data communication aspects, specifically adapted for electrophonic musical tools or instruments
- G10H2240/121—Musical libraries, i.e. musical databases indexed by musical parameters, wavetables, indexing schemes using musical parameters, musical rule bases or knowledge bases, e.g. for automatic composing methods
- G10H2240/131—Library retrieval, i.e. searching a database or selecting a specific musical piece, segment, pattern, rule or parameter set
- G10H2240/141—Library retrieval matching, i.e. any of the steps of matching an inputted segment or phrase with musical database contents, e.g. query by humming, singing or playing; the steps may include, e.g. musical analysis of the input, musical feature extraction, query formulation, or details of the retrieval process
Definitions
- the present invention relates to the field of communications technologies, and in particular, to a music query method and apparatus.
- A music fingerprint, referred to as an Audio Fingerprint, is a sequence of features derived from an audio clip that characterizes the "identity" of the music.
- Music retrieval based on pattern recognition differs significantly from traditional music retrieval based on metadata such as song title and singer.
- A music fingerprint does not contain all the information of a piece of music, but it suffices to identify that piece uniquely; that is, through the fingerprint, the desired music can be queried from massive data.
- the existing music query technology generally has specific requirements on the length and starting point of the song query segment, and the query efficiency is low.
- Embodiments of the present invention provide a music query method and apparatus to improve the query efficiency of music.
- An embodiment of the present invention provides a music query method, including:
- Querying, according to the fingerprint features of the frame segments included in the music segment to be queried, the fingerprint database for stored fingerprint features that match the fingerprint features of the music segment to be queried, and returning a query result according to the degree of similarity between the fingerprint features of the music segment to be queried and the queried fingerprint features.
- the embodiment of the invention further provides a music query device, including:
- An intercepting module configured to intercept the music segment to be queried from the music file to be queried;
- a framing module configured to frame the music segment to be queried;
- an extracting module configured to extract the fingerprint features of the frame segments included in the music segment to be queried, to obtain the fingerprint features of the music segment to be queried;
- a querying module configured to query, according to the fingerprint features of the frame segments extracted by the extraction module, the fingerprint database for stored fingerprint features matching the fingerprint features of the music segment to be queried;
- a returning module configured to return a query result according to the degree of similarity between the fingerprint features of the music segment to be queried and the fingerprint features queried by the querying module.
- In the embodiments, the music segment to be queried is first intercepted from the music file to be queried and framed; the fingerprint features of the frame segments included in the music segment are then extracted to obtain the fingerprint features of the music segment; finally, according to those fingerprint features, the fingerprint database is queried for stored fingerprint features matching them, and the query result is returned according to the degree of similarity between the fingerprint features of the music segment to be queried and the queried fingerprint features.
- The embodiments of the present invention impose no requirement on the length or starting point of the music query segment, and can improve the query efficiency of music.
- FIG. 1 is a flowchart of an embodiment of a music query method according to the present invention;
- FIG. 2 is a flowchart of an embodiment of a fingerprint feature extraction process of the present invention;
- FIG. 3 is a schematic diagram of an embodiment of extracting a spectral envelope and dimensionality reduction according to the present invention;
- FIG. 4 is a schematic structural diagram of an embodiment of a music query apparatus according to the present invention;
- FIG. 5 is a schematic structural diagram of another embodiment of a music query apparatus according to the present invention;
- FIG. 6 is a schematic structural diagram of an embodiment of a computer device according to the present invention.
- The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings.
- The described embodiments are a part of the embodiments of the invention, not all of them. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the scope of the present invention.
- the music query method may include:
- Step 101: Intercept the music segment to be queried from the music file to be queried, and frame the music segment to be queried.
- Step 102: Extract the fingerprint features of the frame segments included in the music segment to be queried, to obtain the fingerprint features of the music segment to be queried.
- Step 103: Query, according to the fingerprint features of the frame segments included in the music segment to be queried, the fingerprint database for stored fingerprint features matching the fingerprint features of the music segment to be queried, and return the query result according to the degree of similarity between the fingerprint features of the music segment to be queried and the queried fingerprint features.
- Before querying, the fingerprint database must be built: the known music files are first framed; the fingerprint features of the frame segments included in each known music file are then extracted to obtain that file's fingerprint features; finally, the fingerprint features of the known music files are stored in the fingerprint database.
- Extracting the fingerprint features of the frame segments included in the music segment to be queried may include: performing time-frequency conversion on the frame segments; taking the modulus of the frequency-domain data obtained by the conversion; selecting, according to the auditory characteristics of the human ear, frequency-domain data on a predetermined frequency band; extracting the spectral envelope of the frequency-domain data on that band; and performing dimensionality reduction on the feature matrix obtained from the spectral envelope, to obtain the fingerprint features of the frame segments included in the music segment to be queried.
- Extracting the fingerprint features of the frame segments included in a known music file follows the same steps: time-frequency conversion, taking the modulus, selecting frequency-domain data on the predetermined frequency band according to human auditory characteristics, extracting the spectral envelope, and dimensionality reduction of the resulting feature matrix, yielding the fingerprint features of the frame segments included in the known music file.
- Step 103 may proceed as follows: first, according to the fingerprint features of the frame segments included in the music segment to be queried, the fingerprint database is searched for stored fingerprint features matching them; second, starting from the position (within the song it belongs to) of each matched fingerprint feature, a predetermined number of fingerprint features is read from the fingerprint database, the predetermined number being equal to the number of frame segments included in the music segment to be queried; finally, the degree of similarity between the read fingerprint features and the fingerprint features of all frame segments of the music segment to be queried is compared, and the query result is returned according to that degree of similarity.
- The above embodiment imposes no requirement on the length or starting point of the music query segment and can improve query efficiency; moreover, in a noisy environment the music fingerprint query can still be completed effectively and matching results returned despite the noise.
- The music query method provided by the embodiments of the present invention is described in detail below in three parts: the music fingerprint extraction process, the fingerprint database establishment process, and the music fingerprint query process.
- The music fingerprint extraction process may include decoding, downsampling, and fingerprint feature extraction, each described separately below:
- Decoding process: since music files are generally encoded and compressed, before extracting fingerprint features the music file is first decoded into a waveform (Wave; hereinafter WAV) file; the decoded file has the same sampling rate as the original music.
- The sampling rate of common music files is generally 44 kHz or 22 kHz.
- Downsampling process: since the sampling rate of music files is generally high, they contain much high-frequency information that is unhelpful for identification, so the decoded music files also undergo downsampling.
- The decoded music file is reduced from a higher sampling rate such as 44 kHz or 22 kHz to a lower one; in this embodiment, it is uniformly reduced to a 5 kHz sampling rate, and the downsampled file is stored as Pulse Code Modulation (hereinafter PCM) data.
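The downsampling step just described can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: it resamples by linear interpolation and deliberately omits the anti-aliasing low-pass filter a production resampler would apply first.

```python
import numpy as np

def downsample(pcm: np.ndarray, src_rate: int, dst_rate: int = 5000) -> np.ndarray:
    """Reduce decoded PCM from src_rate (e.g. 44 kHz or 22 kHz) to dst_rate.

    Linear interpolation onto the 5 kHz sample grid; a production system
    would low-pass filter first to avoid aliasing (e.g. with
    scipy.signal.resample_poly), which this sketch omits.
    """
    n_out = int(len(pcm) * dst_rate / src_rate)
    t_src = np.arange(len(pcm)) / src_rate   # original sample times
    t_dst = np.arange(n_out) / dst_rate      # target 5 kHz sample times
    return np.interp(t_dst, t_src, pcm)
```

One second of 44 kHz audio thus becomes 5000 samples, which every later stage of the pipeline operates on.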
- FIG. 2 is a flowchart of an embodiment of the fingerprint feature extraction process of the present invention, including:
- Step 201: Frame the music file after the downsampling process.
- Framing is implemented by windowing.
- A Hanning window may be used for framing, with a window size of 2048 points.
- Step 202: Perform time-frequency conversion on the frame segments obtained by framing.
- The frame segments may be time-frequency converted in a number of ways; this embodiment does not limit the implementation of the conversion.
- For example, a Fast Fourier Transform (hereinafter FFT) may be used.
- The data overlap of two adjacent frames is 31/32; that is, each frame is FFT'd with approximately 64 new PCM samples (2048 × 1/32) relative to the previous frame.
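Steps 201-203 (2048-point Hanning-window framing with 31/32 overlap, FFT, and taking the modulus) can be sketched as follows; the function name and the use of NumPy are assumptions for illustration.

```python
import numpy as np

FRAME = 2048        # Hanning window size in points, as stated above
HOP = FRAME // 32   # 31/32 overlap -> 64 new PCM samples per frame

def frames_to_spectra(pcm: np.ndarray) -> np.ndarray:
    """Window each frame with a Hanning window, FFT it, and take the modulus.

    Returns one magnitude spectrum per frame; consecutive frames share
    31/32 of their samples, so each frame advances by HOP samples.
    """
    window = np.hanning(FRAME)
    n_frames = 1 + (len(pcm) - FRAME) // HOP
    spectra = np.empty((n_frames, FRAME // 2 + 1))
    for i in range(n_frames):
        segment = pcm[i * HOP : i * HOP + FRAME] * window
        spectra[i] = np.abs(np.fft.rfft(segment))  # complex FFT -> modulus
    return spectra
```

The modulus in the last line corresponds to step 203 below, since the FFT output is complex.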
- the value obtained in step 202 is a complex number.
- Step 203: Take the modulus of the frequency-domain data obtained by the time-frequency conversion.
- Step 204: Select, according to the auditory characteristics of the human ear, frequency-domain data on a predetermined frequency band from the frequency-domain data after the modulus is taken.
- 33 sub-bands are selected from the frequency-domain data after the modulus is taken, according to the auditory characteristics of the human ear; the sub-bands cover the frequency range 0 to 2.5 kHz, and their bandwidths are distributed linearly in the logarithmic domain (i.e., logarithmically spaced).
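A sketch of the sub-band selection follows. The source does not give the exact band edges, so the lower bound `f_min` of the first band is a hypothetical value (a strictly logarithmic scale cannot start at 0 Hz); the geometric spacing reflects the "linear in the logarithmic domain" bandwidth rule stated above.

```python
import numpy as np

def subband_edges(n_bands: int = 33, f_min: float = 100.0,
                  f_max: float = 2500.0) -> np.ndarray:
    """Band edges whose spacing is linear in the log domain (geometric).

    f_min is an assumption: the patent states the bands span 0-2.5 kHz,
    but a log scale needs a nonzero first edge.
    """
    return np.geomspace(f_min, f_max, n_bands + 1)

def band_energies(magnitude: np.ndarray, sample_rate: int = 5000) -> np.ndarray:
    """Sum one frame's magnitude spectrum into the 33 log-spaced sub-bands."""
    edges = subband_edges()
    freqs = np.linspace(0, sample_rate / 2, len(magnitude))  # bin frequencies
    energies = np.zeros(len(edges) - 1)
    for b in range(len(energies)):
        in_band = (freqs >= edges[b]) & (freqs < edges[b + 1])
        energies[b] = magnitude[in_band].sum()
    return energies
```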
- Step 205 Extract a spectrum envelope of frequency domain data on the predetermined frequency band.
- This embodiment does not limit the method for extracting the spectral envelope.
- As an example, the spectral envelope of the frequency-domain data is extracted using a wavelet transform.
- The wavelet transform is a localized transformation in space and frequency; through operations such as scaling and translation it can perform multi-scale analysis of a function or signal, so that information can be extracted from signals effectively.
- The standard Haar wavelet is used to analyze the above frequency-domain data, and only the 300 largest wavelet coefficients (by absolute spectral energy) are retained; coefficients not among the largest 300 are quantized as "00", and each of the largest 300 is quantized as "10" if positive and "01" otherwise.
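The Haar analysis and 2-bit quantization just described can be sketched as follows; representing the codes "10"/"01"/"00" as +1/-1/0 integers is an implementation convenience for illustration, not the patent's storage format.

```python
import numpy as np

def haar_1d(x: np.ndarray) -> np.ndarray:
    """Full 1-D standard Haar wavelet transform (length: a power of two)."""
    out = x.astype(float)
    n = len(out)
    while n > 1:
        half = n // 2
        avg = (out[:n:2] + out[1:n:2]) / np.sqrt(2)  # coarse approximation
        det = (out[:n:2] - out[1:n:2]) / np.sqrt(2)  # detail coefficients
        out[:half], out[half:n] = avg, det
        n = half
    return out

def quantize_top(coeffs: np.ndarray, keep: int = 300) -> np.ndarray:
    """Keep the `keep` largest coefficients by absolute value.

    A retained positive coefficient becomes +1 ("10"), a retained negative
    one becomes -1 ("01"), and every other coefficient becomes 0 ("00").
    """
    codes = np.zeros(len(coeffs), dtype=int)
    top = np.argsort(np.abs(coeffs))[-keep:]
    codes[top] = np.sign(coeffs[top]).astype(int)
    return codes
```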
- Step 206 Perform dimension reduction processing on the feature matrix obtained after extracting the spectrum envelope, and obtain fingerprint features of the segmented segments included in the music file.
- The output of the wavelet quantization is a high-dimensional 0-1 feature matrix, so dimensionality reduction is required.
- The minimum-hash (MinHash) algorithm is used for the dimensionality reduction: the positions of each 0-1 feature vector are randomly permuted P times, and the position of the first 1 is recorded each time. In general, the probability of the first 1 occurring beyond position 255 is very small, so any larger value is uniformly recorded as 255. The high-dimensional 0-1 feature matrix is thus compressed into a P-dimensional vector of integers in 0-255.
- P may be taken as 100, so that a 100-dimensional vector of integers in 0-255 is obtained after the dimensionality reduction.
- Each such group of 100 integers in 0-255 is called a sub-fingerprint of the music pattern.
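A minimal MinHash sketch of the dimensionality reduction follows; fixing one random seed so that every feature vector is hashed with the same P permutations is an assumption about how the permutations would be shared across vectors.

```python
import numpy as np

def minhash_signature(bits: np.ndarray, n_perm: int = 100, seed: int = 0) -> np.ndarray:
    """Compress a high-dimensional 0-1 feature vector into n_perm bytes.

    For each of n_perm random permutations (fixed by `seed`), record the
    position of the first 1 in the permuted vector, clipped to 255: a
    first 1 beyond position 255 almost never occurs, as noted above.
    """
    rng = np.random.default_rng(seed)
    signature = np.empty(n_perm, dtype=np.uint8)
    for i in range(n_perm):
        perm = rng.permutation(len(bits))
        ones = np.nonzero(bits[perm])[0]
        signature[i] = min(int(ones[0]), 255) if ones.size else 255
    return signature
```

Two vectors that share many set bits tend to agree on many signature bytes, which is why similarity can later be estimated by simple byte-wise comparison.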
- FIG. 3 is a schematic diagram of an embodiment of extracting a spectral envelope and a dimensionality reduction process according to the present invention.
- PCM data is read in chronological order, each frame advancing by approximately 64 PCM samples relative to the previous frame; this continues until the end of the PCM data is reached.
- The spectral envelope of each frame of PCM data is extracted according to the method of step 205, and the feature matrix obtained from the spectral envelope then undergoes dimensionality reduction according to the method of step 206, yielding the fingerprint features of the frame segments included in the music file.
- The fingerprint feature of a frame segment is referred to as a sub-fingerprint, and the fingerprint feature of a whole music file is referred to as a music fingerprint.
- A music fingerprint is a sequence of sub-fingerprints, whose order reflects the temporal order of the frame segments corresponding to the sub-fingerprints.
- Each entry in the index table stores a sub-fingerprint, an identifier of that sub-fingerprint in the fingerprint database, and the specific time position of the sub-fingerprint in the song it belongs to.
- Each entry in the music sheet stores the music fingerprint of one song, that is, all the sub-fingerprints contained in the song.
- In the music fingerprint query process, the fingerprint features of the music segment to be queried are first extracted according to the music fingerprint extraction process described above.
- The index table is then searched for fingerprint features matching the sub-fingerprints of the frame segments included in the music segment to be queried.
- According to the identifier of each matched sub-fingerprint saved in the index table, the corresponding entry in the music sheet is located, and a predetermined number of sub-fingerprints is read starting from the position of the matched sub-fingerprint, the predetermined number being equal to the number of frame segments included in the music segment to be queried. Finally, the degree of similarity between the read sub-fingerprints and the fingerprint features of all frame segments of the music segment to be queried is compared, and a query result list is returned according to the degree of similarity.
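The query procedure just described can be sketched as follows; the dictionary layouts standing in for the index table and music sheet, and the equality-based similarity measure, are assumptions for illustration.

```python
def query(sub_prints, index, tunes, top_k=5):
    """Match a queried segment's sub-fingerprints against the database.

    `index` maps a hashable sub-fingerprint to (song_id, position) entries,
    and `tunes` maps song_id to the song's full sub-fingerprint sequence;
    both loosely mirror the index table and music sheet described above.
    """
    scores = {}
    n = len(sub_prints)
    for offset, sp in enumerate(sub_prints):
        for song_id, pos in index.get(sp, []):
            start = pos - offset                # align query with the song
            sequence = tunes[song_id]
            if start < 0 or start + n > len(sequence):
                continue
            window = sequence[start:start + n]  # n consecutive sub-fingerprints
            similarity = sum(a == b for a, b in zip(window, sub_prints)) / n
            key = (song_id, start)
            scores[key] = max(scores.get(key, 0.0), similarity)
    # best candidates first, forming the query result list
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
```

Because each hit fixes an alignment between the query and a song position, no assumption about the query's length or starting point is needed, matching the claim above.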
- the music query method provided by the embodiment of the present invention has the following advantages:
- First, the music fingerprint achieves a compression ratio of more than 100:1 while remaining strongly representative.
- The wavelet transform extracts features from the spectral details, and the high-energy part of the spectrogram is hash-compressed, so that one frame of data is compressed from the original 8192 points to 100 bytes; the feature data is thereby reduced to a few hundredths of the original. The feature compression ratio is therefore large while the features remain representative.
- Second, the music fingerprint design has a certain degree of noise resistance.
- The feature matrix, with up to 8192 dimensions, is reduced in dimension by the minimum-hash algorithm, after which feature similarity can be computed by simple comparison.
- Locality-sensitive hashing is introduced in consideration of the variation of local music features; it is broadly applicable and greatly reduces the range of candidate music fingerprints. Because noise tolerance is accounted for in the fingerprint extraction stage, no dedicated denoising system is needed, and the fingerprint extracted from clean music retains a certain noise resistance.
- Third, the embodiments of the present invention can return a query result list ordered by degree of similarity.
- Embodiments of the present invention may also compare the similarity of, and measure the overlap between, similar pieces of music.
- Because the music fingerprints extracted by the embodiments are ordered in time, it is easy to determine the source of two segments and their positions within the songs they belong to, and thereby judge the similarity of two similar music segments and the proportion of their overlap.
- The temporal ordering of the fingerprint data in the fingerprint database and the efficiency of the query make such requirements feasible.
- the music query apparatus may include: an intercepting module 41, a framing module 42, an extracting module 43, a querying module 44, and a returning module 45.
- the intercepting module 41 is configured to intercept the music segment to be queried from the music file to be queried;
- the framing module 42 is configured to frame the music segment to be queried;
- the extracting module 43 is configured to extract a fingerprint feature of the segmented segment included in the music segment to be queried to obtain a fingerprint feature of the music segment to be queried;
- the querying module 44 is configured to query, according to the fingerprint feature of the frame segment included in the music segment to be queried extracted by the extraction module 43 , a fingerprint feature matching the fingerprint feature of the music segment to be queried in the fingerprint feature stored in the fingerprint database;
- the returning module 45 is configured to return the query result according to the degree of similarity between the fingerprint feature of the music piece to be queried and the fingerprint feature queried by the query module 44.
- The music query apparatus imposes no requirement on the length or starting point of the music query segment and can improve query efficiency; moreover, in a noisy environment the music fingerprint query can still be completed effectively and matching results returned despite the noise.
- Figure 5 is a schematic diagram of another embodiment of the music query device of the present invention, as shown in Figure 5, the music query device may further include: a storage module 46;
- the framing module 42 is further configured to frame the known music files;
- the extracting module 43 is further configured to extract the fingerprint features of the frame segments included in the known music files, to obtain the fingerprint features of the known music files;
- the storage module 46 is configured to store the fingerprint feature of the above-mentioned known music file obtained by the extraction module 43 into the fingerprint database.
- The extraction module 43 may include: a conversion submodule 431, a modulus submodule 432, a selection submodule 433, an envelope extraction submodule 434, and a dimension reduction submodule 435.
- The conversion submodule 431 is configured to perform time-frequency conversion on the frame segments included in the music segment to be queried;
- the modulus submodule 432 is configured to take the modulus of the frequency-domain data obtained by the time-frequency conversion; the selection submodule 433 is configured to select, according to the auditory characteristics of the human ear, frequency-domain data on a predetermined frequency band from the data produced by the modulus submodule 432;
- the envelope extraction submodule 434 is configured to extract the spectral envelope of the frequency-domain data on the predetermined frequency band;
- the dimension reduction submodule 435 is configured to perform dimensionality reduction on the feature matrix obtained by the envelope extraction submodule 434, to obtain the fingerprint features of the frame segments included in the music segment to be queried.
- the query module 44 may include: a feature query sub-module 441 and a feature reading sub-module 442;
- The feature query submodule 441 is configured to query, according to the fingerprint features of the frame segments included in the music segment to be queried, the fingerprint database for stored fingerprint features matching them;
- the feature reading submodule 442 is configured to read, starting from the position of each matched fingerprint feature stored in the fingerprint database, a predetermined number of fingerprint features from the fingerprint database, the predetermined number being equal to the number of frame segments included in the music segment to be queried.
- The returning module 45 may compare the degree of similarity between the fingerprint features read by the feature reading submodule 442 and the fingerprint features of all frame segments included in the music segment to be queried, and return the query result according to the degree of similarity.
- The music query apparatus imposes no requirement on the length or starting point of the music query segment and can improve query efficiency; moreover, in a noisy environment the music fingerprint query can still be completed effectively and matching results returned despite the noise.
- FIG. 6 is a schematic structural diagram of an embodiment of a computer device according to the present invention.
- the computer device in this embodiment can implement the function of the music query device in the embodiment shown in FIG. 4 or FIG. 5, as shown in FIG.
- the system may include: a central processing unit (hereinafter referred to as: CPU) 61, a bus control logic 62, a system bus 63, a memory 64, an interface 65, and an input/output (I/O) subsystem 66.
- the I/O subsystem 66 includes an I/O device 661 and a memory 662.
- The CPU 61 is configured to: intercept the music segment to be queried from the music file to be queried; frame the music segment to be queried; extract the fingerprint features of the frame segments included in the music segment to be queried, to obtain the fingerprint features of the music segment to be queried; query, according to the extracted fingerprint features of the frame segments, the fingerprint database for stored fingerprint features matching the fingerprint features of the music segment to be queried; and return the query result according to the similarity between the fingerprint features of the music segment to be queried and the queried fingerprint features. The CPU 61 in this embodiment can implement the functions of the intercepting module 41, framing module 42, extraction module 43, and query module 44 of the embodiment shown in FIG. 4 or FIG. 5.
- the fingerprint database is stored in the memory 662.
- When the CPU 61 returns the query result, it sends the result to the bus control logic 62, which passes it through the system bus 63 and the interface 65 to the I/O device 661.
- During this process, the query result may first be cached in the memory 64. That is, in this embodiment, the CPU 61, bus control logic 62, system bus 63, memory 64, interface 65, and I/O device 661 together implement the returning module 45 of the embodiment shown in FIG. 4 or FIG. 5.
- The CPU 61 may further frame the known music files and extract the fingerprint features of the frame segments included in them, to obtain the fingerprint features of the known music files.
- the memory 662 is configured to save the fingerprint database, and store the fingerprint feature of the known music file obtained by the CPU 61 into the fingerprint database.
- The memory 662 in this embodiment can implement the functions of the storage module 46 in the embodiment of FIG. 5 of the present invention.
- The above computer device imposes no requirement on the length or starting point of the music query segment and can improve query efficiency; moreover, in a noisy environment the music fingerprint query can still be completed effectively and matching results returned despite the noise.
- The modules of the apparatus in the embodiments may be distributed in the apparatus as described in the embodiments or, with corresponding changes, located in one or more apparatuses different from those of the embodiments.
- the modules of the above embodiments may be combined into one module, or may be further split into a plurality of sub-modules.
Abstract
The present invention relates to a music query method and apparatus. The music query method comprises: intercepting a music segment to be queried from a music file to be queried, and framing the music segment to be queried; extracting fingerprint features of the frame segments contained in the music segment to be queried, so as to obtain the fingerprint features of the music segment to be queried; according to the fingerprint features of the frame segments contained in the music segment to be queried, searching the fingerprint features stored in a fingerprint database for fingerprint features matching those of the music segment to be queried, and returning a query result according to the degree of similarity between the fingerprint features of the music segment to be queried and the queried fingerprint features. The present invention imposes no requirements on the length or starting point of the music segment to be queried, thereby improving query efficiency.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201180002170.8A CN103180847B (zh) | 2011-10-19 | 2011-10-19 | 音乐查询方法和装置 |
PCT/CN2011/080977 WO2012163013A1 (fr) | 2011-10-19 | 2011-10-19 | Procédé et appareil de recherche de musique |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2011/080977 WO2012163013A1 (fr) | 2011-10-19 | 2011-10-19 | Procédé et appareil de recherche de musique |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2012163013A1 true WO2012163013A1 (fr) | 2012-12-06 |
Family
ID=47258328
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2011/080977 WO2012163013A1 (fr) | 2011-10-19 | 2011-10-19 | Procédé et appareil de recherche de musique |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN103180847B (fr) |
WO (1) | WO2012163013A1 (fr) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2018027607A1 (fr) * | 2016-08-10 | 2018-02-15 | 董访问 | Procédé de pousser d'informations pour un système de mise en correspondance et de partage de chanson basé sur un enregistrement sonore |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2018027606A1 (fr) * | 2016-08-10 | 2018-02-15 | 董访问 | Procédé d'acquisition de données pour une technologie de mise en correspondance et d'analyse de musique, et système de partage |
WO2018027605A1 (fr) * | 2016-08-10 | 2018-02-15 | 董访问 | Procédé de partage de musique basé sur un enregistrement sonore, et système de partage |
CN107633078B (zh) * | 2017-09-25 | 2019-02-22 | 北京达佳互联信息技术有限公司 | 音频指纹提取方法、音视频检测方法、装置及终端 |
Citations (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1620684A (zh) * | 2001-05-25 | 2005-05-25 | 多尔拜实验特许公司 | 利用基于听觉事件的表征比较音频 |
CN1628303A (zh) * | 2002-02-06 | 2005-06-15 | 皇家飞利浦电子股份有限公司 | 基于杂乱数据的多媒体对象元数据的快速检索 |
CN1661600A (zh) * | 2004-02-24 | 2005-08-31 | 微软公司 | 生成音频缩略图的系统和方法 |
CN1672211A (zh) * | 2002-05-16 | 2005-09-21 | 皇家飞利浦电子股份有限公司 | 信号处理方法和装置 |
CN1708758A (zh) * | 2002-11-01 | 2005-12-14 | 皇家飞利浦电子股份有限公司 | 改进的音频数据指纹搜索 |
CN1820511A (zh) * | 2003-07-11 | 2006-08-16 | 皇家飞利浦电子股份有限公司 | 用于生成并探测多媒体信号中起到触发标记作用的指纹的方法和设备 |
CN101014953A (zh) * | 2003-09-23 | 2007-08-08 | 音乐Ip公司 | 音频指纹识别系统和方法 |
US20090012638A1 (en) * | 2007-07-06 | 2009-01-08 | Xia Lou | Feature extraction for identification and classification of audio signals |
CN101673262A (zh) * | 2008-09-12 | 2010-03-17 | 未序网络科技(上海)有限公司 | 音频内容的搜索方法 |
CN101673267A (zh) * | 2008-09-12 | 2010-03-17 | 未序网络科技(上海)有限公司 | 音频、视频内容的搜索方法 |
CN101673264A (zh) * | 2008-09-12 | 2010-03-17 | 未序网络科技(上海)有限公司 | 音频内容的搜索装置 |
CN101673266A (zh) * | 2008-09-12 | 2010-03-17 | 未序网络科技(上海)有限公司 | 音频、视频内容的搜索方法 |
CN101882439A (zh) * | 2010-06-10 | 2010-11-10 | 复旦大学 | 一种基于Zernike矩的压缩域音频指纹方法 |
CN102096780A (zh) * | 2010-12-17 | 2011-06-15 | 华中科技大学 | 大规模用户环境下数字指纹的快速检测方法 |
US20110173208A1 (en) * | 2010-01-13 | 2011-07-14 | Rovi Technologies Corporation | Rolling audio recognition |
CN102214219A (zh) * | 2011-06-07 | 2011-10-12 | 盛乐信息技术(上海)有限公司 | 音视频内容检索系统及其方法 |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101959191B (zh) * | 2010-09-25 | 2012-12-26 | 华中科技大学 | 一种无线网络安全认证方法及其系统 |
2011
- 2011-10-19: WO — PCT/CN2011/080977 (published as WO2012163013A1), active, Application Filing
- 2011-10-19: CN — CN201180002170.8A (granted as CN103180847B), not active, Expired - Fee Related
Also Published As
Publication number | Publication date |
---|---|
CN103180847A (zh) | 2013-06-26 |
CN103180847B (zh) | 2016-03-02 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN102314875B (zh) | 一种音频文件的识别方法和装置 | |
CA2899657C (fr) | Procede et dispositif de reconnaissance audio | |
US8411977B1 (en) | Audio identification using wavelet-based signatures | |
WO2015027751A1 (fr) | Système de récupération de musique basé sur une caractéristique d'empreinte acoustique | |
CN103403710A (zh) | 对来自音频信号的特征指纹的提取和匹配 | |
CN108197319A (zh) | 一种基于时频局部能量的特征点的音频检索方法和系统 | |
Patil et al. | Music genre classification using MFCC, K-NN and SVM classifier | |
EP2791935A1 (fr) | Détection de répétition à faible complexité dans des données multimédia | |
JP2013534645A (ja) | オーディオメディア認識のためのシステム及び方法 | |
CN102063904A (zh) | 一种音频文件的旋律提取方法及旋律识别系统 | |
CN110472097A (zh) | 乐曲自动分类方法、装置、计算机设备和存储介质 | |
CN104050259A (zh) | 一种基于som算法的音频指纹提取方法 | |
WO2012163013A1 (fr) | Procédé et appareil de recherche de musique | |
Thiruvengatanadhan | Music Classification using MFCC and SVM | |
CN110059218A (zh) | 一种基于快速傅里叶逆变换的语音检索方法及系统 | |
CN103294696B (zh) | 音视频内容检索方法及系统 | |
CN105741853A (zh) | 一种基于共振峰频率的数字语音感知哈希方法 | |
CN104900239B (zh) | 一种基于沃尔什-哈达码变换的音频实时比对方法 | |
Yao et al. | An efficient cascaded filtering retrieval method for big audio data | |
Wang et al. | Robust audio fingerprint extraction algorithm based on 2-D chroma | |
CN111382303B (zh) | 一种基于指纹权重的音频样例检索方法 | |
CN108268572B (zh) | 一种歌曲同步方法及系统 | |
You et al. | Music Identification System Using MPEG‐7 Audio Signature Descriptors | |
CN112037815B (zh) | 音频指纹提取方法、服务器、存储介质 | |
Ribbrock et al. | A full-text retrieval approach to content-based audio identification |
Legal Events
Date | Code | Title | Description
---|---|---|---
 | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 11866713; Country of ref document: EP; Kind code of ref document: A1
 | NENP | Non-entry into the national phase | Ref country code: DE
 | 122 | Ep: pct application non-entry in european phase | Ref document number: 11866713; Country of ref document: EP; Kind code of ref document: A1