WO2017068032A1

WO2017068032A1 - Cross-fading between audio files

Info

Publication number: WO2017068032A1
Application number: PCT/EP2016/075207
Authority: WO
Inventors: Robert TOULSON; Justin PATERSON
Original assignee: Anglia Ruskin University; The University Of West London
Priority date: 2015-10-22
Filing date: 2016-10-20
Publication date: 2017-04-27
Also published as: GB201518717D0

Abstract

A method is provided for forming cross-fades between audio files, which includes providing a plurality of audio files, each audio file comprising a plurality of respective audio streams pertaining to a single song, and providing a library of different fades, wherein each fade defines a respective characteristic shape of change in playback level over time to fade-in or fade-out an audio stream. The method further includes associating a respective fade, selected from the library of fades, with each audio stream, such that a cross-fade from a first one of the audio files to a second one of the audio files can be formed by applying the respective fades of the audio streams of the first audio file to fade-out the first audio file, and simultaneously applying the respective fades of the audio streams of the second audio file to fade-in the second audio file. These fades might be placed intelligently through real-time analysis of the plurality of audio streams and matching of musical features such as harmony. The method can enhance the ability of a content producer to generate musical content that can be changed as desired by a listener, by means of the cross-fade, to reflect, e.g. different preferences, moods or environments.

Description

CROSS-FADING BETWEEN AUDIO FILES

Field of the Invention

The present invention relates to cross-fading between audio files.

Background

The playback quality of musical audio files on a playback device, such as an MP3 player, can typically be changed by a music listener changing the bass, treble or playback level of the output. Graphic equalizers provide further levels of control. However, these controls give the listener little or no ability to influence the overall style and genre of the music.

Other systems, such as the digital music system presented in WO 2005083675, allow a listener to transition between different audio files which are being performed simultaneously using a cross-fade. Similarly, mixing desks allow a listener to produce cross-fades which transition between different audio files.

These cross-fades are changes in playback level which fade-out a first audio file and simultaneously fade-in a second audio file such that the playback levels of the first and second audio files are reduced and increased respectively. Thus, at the beginning of a cross-fade, the first audio file is usually audible and the second audio file is substantially silent, but by the end of the cross-fade, the second audio file is audible and the first audio file is substantially silent.

Many systems rely on applying a generic cross-fade to all audio file transitions or applying a listener defined cross-fade. For example, US 8787594 presents a cross-fade controller with programmable fades.

Summary

It would be desirable to provide a listener with an enhanced ability to manipulate music playback, e.g. of a given work such as a given song, based on personal preference, mood or environment whilst ensuring the quality of the fade maintains the musical integrity of the music.

Accordingly, in a first aspect, the present invention provides a (preferably real-time) method for forming cross-fades between audio files, the method including: providing a plurality of audio files, each audio file comprising a plurality of respective audio streams (all of which typically pertain to a single work);

providing a library of different fades, wherein each fade defines a respective characteristic shape of change in playback level over time to fade-in or fade-out an audio stream; and

associating a respective fade selected from the library of fades with each audio stream, such that a cross-fade from a first one of the audio files to a second one of the audio files can be formed by applying the respective fades of the audio streams of the first audio file to fade-out the first audio file, and simultaneously applying the respective fades of the audio streams of the second audio file to fade-in the second audio file.

Advantageously, the library of different fades enables each audio stream to be associated with a respective fade which maintains the rhythm, harmony and overall musical integrity of the stream as it is faded in or out. Thus, as all the audio streams of each audio file can be associated with respective preselected fades, the musical integrity of the streams can be maintained, and cross-fades can be formed between any one of the audio files and any one of the other audio files, on demand, with a reduced likelihood of producing undesirable musical clashes. The method thus enhances the ability of a content producer to generate musical content that can be changed as desired by a listener, by means of the cross-fade, to reflect, e.g. different preferences, moods or environments. The method of the first aspect may have any one or, to the extent that they are compatible, any combination of the following optional features.

Each audio file may be a different version of a given work (for example, a different component or arrangement of a given work) such that any point in time in any one of the versions is mappable to an equivalent point in time in the, or each, other version. For example, the work may be a musical work, such as a song. The different versions may then be different musical arrangements of that work. A content producer can then supply musical content in the form of different musical arrangements (e.g. styles and/or genres) of a given work, and the listener can select which arrangement is heard by changing arrangements mid-play. The versions may all be run at the same time such that they are synchronised, with only the currently played audio file (or pair of audio files when one audio file is being cross-faded to another) having a non-zero playback level. Another option is to apply sample-accurate analysis to map from any point in time in one of the versions to an equivalent time in another version (i.e. real-time audio analysis across the versions). In this latter case, as soon as a cross-fade is actioned, the data sample point (i.e. point in time) of the cross-fade is identified in the first audio file and the equivalent data sample point of the second audio file is also identified, allowing the audio files to be cross-faded at corresponding data sample points, thereby ensuring that the two audio streams are synchronised. Such sample-accurate analysis is advantageous because it allows the number of parallel audio streams playing at any one time to be reduced, as long as those that are currently involved in a cross-fade are accurately synchronised at the initiation of the fade.
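The mapping of a data sample point in one version to the equivalent point in another can be illustrated with a minimal sketch. It assumes, purely for illustration, that the versions share a common musical timeline and differ at most in sample rate and a fixed start offset; the function name `equivalent_sample` and its parameters are hypothetical and do not come from the specification.

```python
def equivalent_sample(index_a: int, rate_a: int, rate_b: int, offset_b: int = 0) -> int:
    """Map a sample index in version A to the equivalent index in version B.

    Assumption (not from the source): both versions follow the same musical
    timeline, so equal elapsed time means equal musical position.
    """
    if rate_a <= 0 or rate_b <= 0:
        raise ValueError("sample rates must be positive")
    seconds = index_a / rate_a                  # elapsed time in version A
    return offset_b + round(seconds * rate_b)   # nearest sample in version B


# A cross-fade actioned 2.5 s into a 44.1 kHz file maps to 2.5 s in a 48 kHz file.
start_b = equivalent_sample(110_250, 44_100, 48_000)
```

With such a mapping, only the file being faded to needs to be started at `start_b`, rather than keeping every version running in parallel.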

Each audio file may comprise one or more further respective audio streams, which further streams are unaffected by the cross-fade. This allows a partial cross-fade to be formed in which the respective fades of the audio streams of the first audio file are applied to fade-out the first audio file, and simultaneously the respective fades of the audio streams of the second audio file are applied to fade-in the second audio file, while the further audio streams of the first audio file are maintained un-faded. In this way, mixed versions of a given work can be generated, i.e. the un-faded further audio streams of the first audio file can be played with the faded-in audio streams of the second audio file. The method may further include: defining one or more times in each audio file to perform a cross-fade. Thus a content producer can preselect when cross-fades are to take place.

The shapes of change in playback level of the fades may be selected from the group consisting of: linear, logarithmic, inverted logarithmic, exponential, S-curve, polynomial and step. Other non-linear shapes such as, for example, bespoke fade profiles made up of combinations of these standard shapes may be used.

The audio streams may be mono, stereo, or surround sound.

Second, third and fourth aspects of the present invention provide: a computer program comprising code which, when run on a computer, causes the computer to perform the method of the first aspect; a computer-readable medium storing a computer program comprising code which, when run on a computer, causes the computer to perform the method of the first aspect; and a computer system programmed to perform the method of the first aspect.

For example, a computer system (such as a content management system) for forming cross-fades between audio files may include:

a computer-readable medium or media storing: (1) a plurality of audio files, each audio file comprising a plurality of respective audio streams; and (2) a library of different fades, wherein each fade defines a respective characteristic shape of change in playback level over time to fade-in or fade-out an audio stream; and

a user interface configured to receive a user input (and preferably dynamic real-time user input) to: associate a respective fade selected from the library of fades with each audio stream, such that a cross-fade from a first one of the audio files to a second one of the audio files can be formed by applying the respective fades of the audio streams of the first audio file to fade-out the first audio file, and simultaneously applying the respective fades of the audio streams of the second audio file to fade-in the second audio file. The user interface may also be configured to receive a user input to: define one or more times in each audio file to perform a cross-fade.

In a fifth aspect, the present invention provides a method of cross-fading between audio files, the method including:

providing a plurality of audio files, each audio file comprising a plurality of respective audio streams, and each audio file being a different version of a given work such that any point in time in any one of the versions is mapped to an equivalent point in time in the, or each, other version;

providing a respective fade for each audio stream, each fade being selected from a library of different fades, wherein each fade defines a respective characteristic shape of change in playback level over time to fade-in or fade-out an audio stream;

playing a first one of the audio files; and

cross-fading from the first audio file to an equivalent point in time of a second one of the audio files by applying the respective fades of the audio streams of the first audio file to fade-out the first audio file, and simultaneously applying the respective fades of the audio streams of the second audio file to fade-in the second audio file.

Advantageously, as each audio file is mappable onto the other audio files, a cross-fade can be formed between a first one of the audio files and a second one of the audio files such that the correct musical timing of the work is maintained throughout the cross-fade. Further, as each audio stream is associated with a respective fade which is selected from a library of different fades, the likelihood of producing unwanted musical clashes during a cross-fade is reduced. Furthermore, by providing a plurality of audio files, each being a different version of a given work, a listener can cross-fade between different versions of the work such that the style and/or genre of the music is changed. Thus, a listener can more easily swap between different versions of a musical work to suit their mood, personal preference, or environment, with an increased likelihood of maintaining the musical integrity of the work.

The method of the fifth aspect may have any one or, to the extent that they are compatible, any combination of the following optional features. The versions may all be run at the same time such that they are synchronised, with only the currently played audio file (or pair of audio files when one audio file is cross-fading to another) having a non-zero playback level. Another option is to apply sample-accurate (and preferably real-time) analysis to map from any point in time in one of the versions to an equivalent time in another version. Each audio file may comprise one or more further respective audio streams, which further streams are unaffected by the cross-fade. This allows the cross-fade to be a partial cross-fade, in which the respective fades of the audio streams of the first audio file are applied to fade-out the first audio file, and simultaneously the respective fades of the audio streams of the second audio file are applied to fade-in the second audio file, but the further audio streams of the first audio file are maintained un-faded.

The method may further include providing a pre-selection unit which pre-selects the second audio file from the plurality of audio files before the cross-fading from the first audio file to the second audio file. The second audio file may be selected randomly or pseudo-randomly. It may be selected from a subset of audio files defined by a musical content creator. The method may further include providing a user interface configured to receive a user input to initiate a cross-fade; wherein the cross-fade from the first audio file to the second audio file is performed, preferably in real time, on receipt of the user input. For example, the user interface may be a Graphical User Interface (GUI). The user interface may be configured to receive a user input which signifies the duration of the cross-fade, and wherein the duration of the cross-fade from the first audio file to the second audio file which is performed on receipt of the user input is in accordance with the signified duration. The cross-fade from the first to the second audio file may be controlled dynamically in real-time by user interaction, and hence may allow the user to influence the rate of cross-fade directly through gesture.

The cross-fade from the first audio file to the second audio file may be performed when one or more streams of the first audio file and one or more corresponding streams of the second audio file are below a threshold playback level. For example, the cross-fade may be programmed to occur only during periods of play of the streams in which their simultaneous playback levels are substantially zero (i.e. near silent, or at least below a specified volume threshold). In particular, the performance of the cross-fade may be delayed (e.g. after receipt of a user input to initiate a cross-fade) until a time is reached in the playing of the first audio file at which both of the streams are below the threshold playback level.
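The delayed, threshold-gated cross-fade described above can be sketched as a simple search. This is only an illustration under stated assumptions: the per-frame level envelopes are taken as already computed, and the function name `first_quiet_index` and the example values are hypothetical.

```python
from typing import Optional, Sequence


def first_quiet_index(level_a: Sequence[float], level_b: Sequence[float],
                      requested: int, threshold: float) -> Optional[int]:
    """Return the first index at or after `requested` where the playback
    levels of both corresponding streams are below `threshold`, or None."""
    end = min(len(level_a), len(level_b))
    for i in range(max(requested, 0), end):
        if abs(level_a[i]) < threshold and abs(level_b[i]) < threshold:
            return i
    return None


# Hypothetical per-frame level envelopes for two corresponding streams.
drums_a = [0.9, 0.4, 0.02, 0.8, 0.01, 0.7]
drums_b = [0.8, 0.03, 0.5, 0.6, 0.02, 0.9]

# The user requests a cross-fade at frame 1; it is deferred to the next
# frame at which both streams are near silent.
fade_at = first_quiet_index(drums_a, drums_b, requested=1, threshold=0.05)
```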

Additionally or alternatively, the cross-fade from the first audio file to the second audio file may be performed when one or more streams of the first audio file and one or more corresponding streams of the second audio file are in periods of play in which corresponding streams meet criteria of a predefined musical (e.g. harmonic) relationship. This can be, for example, when one or more streams of the first audio file and one or more corresponding streams of the second audio file are in periods of play in which the respective musical pitches of corresponding streams are within a predefined frequency range of each other. In particular, cross-fades may be programmed to implement only during periods of play in which the musical pitches of corresponding streams are substantially the same. In particular, the performance of the cross-fade may be delayed (e.g. after receipt of a user input to initiate a cross-fade) until a time is reached in the playing of the first audio file at which the respective musical pitches of the first and second audio files are within the predefined frequency range. However, more complex strategies may be applied. Thus rather than merely musical pitch, the predefined musical relationship may be a predefined harmony (e.g. between two notes or between multiple polyphonic clusters) or other musical pattern. The cross-fade may be implemented according to real-time analysis of the audio content across the plurality of files.
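The simplest form of the pitch criterion, in which the cross-fade waits until corresponding streams are within a predefined frequency range of each other, might look like the sketch below. It assumes per-frame pitch estimates in hertz are supplied by some separate analysis step, with 0.0 marking unpitched frames; the name `first_pitch_match` and that convention are assumptions made for illustration, and harmonic or polyphonic relationships are not covered.

```python
from typing import Optional, Sequence


def first_pitch_match(pitch_a_hz: Sequence[float], pitch_b_hz: Sequence[float],
                      requested: int, max_diff_hz: float) -> Optional[int]:
    """Return the first frame at or after `requested` where both streams are
    pitched and their pitches differ by at most `max_diff_hz`, or None."""
    end = min(len(pitch_a_hz), len(pitch_b_hz))
    for i in range(max(requested, 0), end):
        fa, fb = pitch_a_hz[i], pitch_b_hz[i]
        if fa > 0.0 and fb > 0.0 and abs(fa - fb) <= max_diff_hz:
            return i
    return None


# Hypothetical pitch tracks (Hz) for corresponding streams of two versions.
track_a = [220.0, 0.0, 246.9, 261.6]
track_b = [329.6, 0.0, 247.5, 261.6]
match_at = first_pitch_match(track_a, track_b, requested=0, max_diff_hz=2.0)
```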

The method may further include: comparing the fades of the audio streams of the first audio file with the fades of the audio streams of the second audio file; and adjusting the characteristic shape of change in playback level over time of any one or more of the fades.

The shapes of change in playback level of the fades may be selected from the group consisting of: linear, logarithmic, inverted logarithmic, exponential, S-curve, polynomial and step. Other non-linear shapes such as, for example, bespoke fade profiles made up of combinations of these standard shapes may be used. The audio streams may be mono, stereo, or surround sound. The method may further include suspending the cross-fade in an incomplete state such that the first and the second audio files play simultaneously and indefinitely according to the state of completion of the cross-fade. Indeed, the cross-fading may be performed simultaneously from the first audio file to more than one second audio file, the plural cross-fades being suspended in incomplete states such that the first audio file and the second audio files play simultaneously and indefinitely according to the states of completion of the cross-fades.
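Suspending a cross-fade in an incomplete state can be pictured as freezing a progress value so that both files keep playing at the gains reached so far. The sketch below is a simplification: it uses a single linear fade for the whole file rather than per-stream fades from the library, and the class name `SuspendableCrossFade` and its attributes are hypothetical.

```python
from typing import Tuple


class SuspendableCrossFade:
    """Linear cross-fade whose progress can be frozen part-way (illustrative)."""

    def __init__(self, duration_s: float) -> None:
        if duration_s <= 0:
            raise ValueError("duration must be positive")
        self.duration_s = duration_s
        self.progress = 0.0      # 0.0 = only first file, 1.0 = only second file
        self.suspended = False

    def advance(self, dt_s: float) -> None:
        """Move the fade forward by dt_s seconds unless it is suspended."""
        if not self.suspended:
            self.progress = min(1.0, self.progress + dt_s / self.duration_s)

    def gains(self) -> Tuple[float, float]:
        """Return (gain of first file, gain of second file)."""
        return 1.0 - self.progress, self.progress


fade = SuspendableCrossFade(duration_s=4.0)
fade.advance(1.0)        # one second into a four-second cross-fade
fade.suspended = True    # user suspends the cross-fade
fade.advance(10.0)       # further time passes, but the mix is held
held = fade.gains()
```

Several such objects, one per second audio file, could be held at different progress values to keep more than two versions audible at once.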

Further aspects of the present invention provide: a computer program comprising code which, when run on a computer, causes the computer to perform the method of the fifth aspect; a computer-readable medium storing a computer program comprising code which, when run on a computer, causes the computer to perform the method of the fifth aspect; and a computer system programmed to perform the method of the fifth aspect.

For example, a computer-based system (such as a playback device) for cross-fading between audio files may include:

a computer-readable medium or media storing: (1) a plurality of audio files, each audio file comprising a plurality of respective audio streams, and each audio file being a different version of a given work such that any point in time in any one of the versions is mapped to an equivalent point in time in the, or each, other version; and (2) a respective fade for each audio stream, each fade being selected from a library of different fades, wherein each fade defines a respective characteristic shape of change in playback level over time to fade-in or fade-out an audio stream;

one or more processors configured to: (1) play a first one of the audio files; and (2) cross-fade from the first audio file to an equivalent point in time of a second one of the audio files by applying the respective fades of the audio streams of the first audio file to fade-out the first audio file, and simultaneously applying the respective fades of the audio streams of the second audio file to fade-in the second audio file.

The computer-based system may further include a pre-selection unit which pre-selects the second audio file from the plurality of audio files before the cross-fading from the first audio file to the second audio file.

The system may further include: a user interface configured to receive a user input to initiate a cross-fade, wherein the cross-fade from the first audio file to the second audio file is performed on receipt of the user input, and preferably is dynamically performed on receipt of the user input in real-time. The user interface may be configured to receive a user input which signifies the duration of the cross-fade, and wherein the duration of the cross-fade from the first audio file to the second audio file which is performed on receipt of the user input is in accordance with the signified duration. The user interface may be configured to accept a user input which signifies a suspension of the cross-fade in an incomplete state, such that the first and the second audio files can be played simultaneously and indefinitely according to the state of completion of the cross-fade.

The one or more processors may be configured to perform the cross-fade from the first audio file to the second audio file when one or more streams of the first audio file and one or more corresponding streams of the second audio file are below a threshold playback level. The one or more processors may be configured to perform the cross-fade from the first audio file to the second audio file when one or more streams of the first audio file and one or more corresponding streams of the second audio file are in periods of play in which corresponding streams meet criteria of a predefined musical relationship (e.g. the respective musical pitches of corresponding streams are within a predefined frequency range of each other or possess another polyphonic harmonic relationship). The one or more processors may be configured to: compare the fades of the audio streams of the first audio file with the fades of the audio streams of the second audio file; and adjust the characteristic shape of change in playback level over time of any one or more of the fades on the basis of the comparison. The system may further include: a further user interface configured to receive a (preferably real-time) user input to adjust the characteristic shape of change in playback level over time of any one or more of the fades.

Brief Description of the Drawings

Embodiments of the invention will now be described by way of example with reference to the accompanying drawings in which:

Figure 1 shows an example cross-fade between a first audio file and a second audio file;

Figure 2 shows possible fade shapes;

Figure 3 shows another example cross-fade, in this case at a moment of synchronised silence;

Figure 4 shows four playback device Graphical User Interfaces (a) to (d); and

Figure 5 shows part of a content management system Graphical User Interface.

Detailed Description and Further Optional Features

The ensuing description provides preferred exemplary embodiment(s) only, and is not intended to limit the scope, applicability or configuration of the invention. Rather, the ensuing description of the preferred exemplary embodiment(s) will provide those skilled in the art with an enabling description for implementing a preferred exemplary embodiment of the invention, it being understood that various changes may be made in the function and arrangement of elements without departing from the scope of the invention.

Specific details are given in the following description to provide a thorough understanding of the embodiments. However, it will be understood by one of ordinary skill in the art that embodiments may be practiced without these specific details. For example, well-known circuits, processes, algorithms, structures, and techniques may be shown without unnecessary detail in order to avoid obscuring the embodiments.

As disclosed herein, the term "computer readable medium" may represent one or more devices for storing data, including read only memory (ROM), random access memory (RAM), magnetic RAM, core memory, magnetic disk storage mediums, optical storage mediums, flash memory devices and/or other machine readable mediums for storing information. The term "computer-readable medium" includes, but is not limited to portable or fixed storage devices, optical storage devices, wireless channels and various other mediums capable of storing, containing or carrying instruction(s) and/or data.

Furthermore, embodiments may be implemented by hardware, software, firmware, middleware, microcode, hardware description languages, or any combination thereof. When implemented in software, firmware, middleware or microcode, the program code or code segments to perform the necessary tasks may be stored in a machine readable medium such as a storage medium. A processor(s) may perform the necessary tasks. A code segment may represent a procedure, a function, a subprogram, a program, a routine, a subroutine, a module, a software package, a class, or any combination of instructions, data structures, or program statements. A code segment may be coupled to another code segment or a hardware circuit by passing and/or receiving information, data, arguments, parameters, or memory contents. Information, arguments, parameters, data, etc. may be passed, forwarded, or transmitted via any suitable means including memory sharing, message passing, token passing, network transmission, etc.

Figure 1 shows a first audio file, including audio streams 1, 2 and 3, being cross-faded into a second audio file, including audio streams 4, 5 and 6. At point A the first audio file has a non-zero playback level and the second audio file has a zero playback level such that only the first audio file is audible. The cross-fade is initiated after point A to reduce the first audio file to a zero playback level and to increase the second audio file to a non-zero playback level such that only the second audio file is audible at point B.

The first and second audio files are different versions of a given musical work (e.g. they can be different components of a musical work that combine to represent the work) and any point in time in each version is mapped to an equivalent point in time in the other version. In Figure 1, the first audio file can be a pop song and the second audio file a dance remix of the same song. As any point in time in each version is mapped to an equivalent point in time in the other version, it is possible to cross-fade from the pop song to the dance remix whilst maintaining the correct timing of the musical work and reducing or preventing disruption to the flow of the musical work. Such mapping can be achieved by running both audio files simultaneously such that they are synchronised, but only playing the desired version at a non-zero playback level. Another method of mapping is to use (preferably real-time) sample-accurate analysis for synchronising audio files while playback is already in progress. Each audio stream includes samples, each sample of that stream relating to a different point in time of the corresponding version of the musical work. At a given point in time, a sample from an audio stream of the first audio file may therefore be mapped to a sample from a corresponding audio stream of the second audio file corresponding to the same point in time. Thus, when a cross-fade is initiated, the second file to be faded to can be accessed from data memory and actioned to play back from the exact corresponding sample value as the first audio file, thus ensuring that when the fade commences the two audio files are perfectly mapped and synchronised. Sample-accurate analysis can be particularly advantageous in the context of large systems where many audio files are present and there is insufficient computer processing power to play back all audio files simultaneously at all times.

To set up the cross-fade, a content management system can be used to reduce the processing overhead associated with real-time analysis. The system can be used to associate each audio stream of the respective audio file with a fade which defines a respective characteristic shape of change in playback level of the audio stream. As shown in Figure 1, audio streams 1, 2 and 3 are associated with a step, a linear and an S-curve shaped fade respectively, which fade-out the respective audio streams, i.e. the shape of the fade is such that when the audio file is played on a playback device, the playback levels of the respective streams decrease from an initial higher playback level to a final lower playback level. Audio streams 4, 5 and 6 are associated with a step, a linear and an S-curve shaped fade respectively such that these streams are faded-in, i.e. the shape of the fade is such that when the audio file is played on a playback device, the playback levels of the respective streams increase from an initial lower playback level to a final higher playback level.
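The Figure 1 arrangement, in which each stream carries its own fade shape and the fade-outs and fade-ins are applied simultaneously, can be sketched as follows. This is a minimal illustration, not the described implementation: streams are plain lists of samples of equal length covering only the cross-fade region, the three gain formulas are assumed, and the names `FADES` and `cross_fade` are hypothetical.

```python
import math
from typing import Callable, Dict, List, Sequence

# Each fade maps progress t in [0, 1] to a fade-in gain in [0, 1];
# the matching fade-out gain is taken here as 1 - fade-in gain (an assumption).
FADES: Dict[str, Callable[[float], float]] = {
    "step": lambda t: 0.0 if t < 0.5 else 1.0,
    "linear": lambda t: t,
    "s_curve": lambda t: 0.5 - 0.5 * math.cos(math.pi * t),
}


def cross_fade(out_streams: Sequence[Sequence[float]], out_fades: Sequence[str],
               in_streams: Sequence[Sequence[float]], in_fades: Sequence[str]) -> List[float]:
    """Mix the streams of two files, applying each stream's own fade shape."""
    n = len(out_streams[0])
    mixed = [0.0] * n
    for i in range(n):
        t = i / (n - 1) if n > 1 else 1.0          # progress through the fade
        for samples, name in zip(out_streams, out_fades):
            mixed[i] += samples[i] * (1.0 - FADES[name](t))   # fade-out
        for samples, name in zip(in_streams, in_fades):
            mixed[i] += samples[i] * FADES[name](t)           # fade-in
    return mixed


# Streams 1-3 fade out and streams 4-6 fade in with step, linear and S-curve fades.
shapes = ["step", "linear", "s_curve"]
first_file = [[1.0] * 5 for _ in range(3)]
second_file = [[1.0] * 5 for _ in range(3)]
mix = cross_fade(first_file, shapes, second_file, shapes)
```

Streams that are to remain un-faded in a partial cross-fade would simply be added to the mix at unit gain instead of being passed to `cross_fade`.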

Figure 2 shows a number of different shapes of possible fades to fade-out an audio stream. As well as the linear, logarithmic, inverted logarithmic, S-curve and step (instantaneous) shaped fades shown, exponential, polynomial and other non-linear shaped fades may be used. Although not shown in Figure 2, these general shapes can also be used to fade-in audio files. The step shaped fade shown in Figure 2 takes a finite period of time to complete. As will be appreciated by a person skilled in the art, the gradients, durations, initial and final playback levels of the fade shapes may be varied whilst maintaining the overall shapes of the fades. A combination of such differently shaped fades produces a library of different fades. A library of fades forms part of a content management system.
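A library of fades of the kind described can be sketched as a table of named shape functions plus a helper that scales a shape to a chosen duration and to chosen initial and final playback levels. The formulas below are illustrative assumptions only, since the text names the shapes without defining them; in particular the step is modelled as an instantaneous jump at the midpoint, and `FADE_LIBRARY` and `fade_level` are hypothetical names.

```python
import math
from typing import Callable, Dict

# Each shape maps normalised time t in [0, 1] to progress in [0, 1].
FADE_LIBRARY: Dict[str, Callable[[float], float]] = {
    "linear": lambda t: t,
    "logarithmic": lambda t: math.log10(1.0 + 9.0 * t),
    "inverted_logarithmic": lambda t: 1.0 - math.log10(1.0 + 9.0 * (1.0 - t)),
    "exponential": lambda t: math.expm1(4.0 * t) / math.expm1(4.0),
    "s_curve": lambda t: 0.5 - 0.5 * math.cos(math.pi * t),
    "polynomial": lambda t: t ** 3,
    "step": lambda t: 0.0 if t < 0.5 else 1.0,
}


def fade_level(shape: str, elapsed_s: float, duration_s: float,
               start_level: float, end_level: float) -> float:
    """Playback level `elapsed_s` seconds into a fade of the given shape.

    A fade-out uses start_level > end_level; a fade-in uses the reverse.
    """
    t = min(max(elapsed_s / duration_s, 0.0), 1.0)   # clamp to the fade region
    progress = FADE_LIBRARY[shape](t)
    return start_level + (end_level - start_level) * progress


# One second into a four-second linear fade-out from full level to silence.
level = fade_level("linear", 1.0, 4.0, start_level=1.0, end_level=0.0)
```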

Thus the content management system can be used to select the fades of Figure 1 from the library of different fades and associate them with the respective audio files such that loss of the musical integrity of the work is reduced or prevented throughout the cross-fade and a smooth transition is heard between the audio files as the playback head moves from A to B when played on a playback device. For example, a linearly shaped fade may be associated with a smooth cello line fading into a sustained synthesizer performance such that a gradual change is heard. The content management system can be manually operated or it can intelligently evaluate audio streams to identify transient and pitch profiles and/or harmonic profiles, and thus automatically select a suitable fade for each audio stream from the library. Such identification might be implemented or augmented by forms of signal analysis known to the skilled person to enhance its accuracy. The fade can then be fine-tuned by a user (e.g. a programmer or listener) either through the content management system or in real time through a playback system. The user may also adjust the volume of each audio file for corrective purposes via the content management system.

The time at which a cross-fade is to be performed in each audio file may be defined. For example, cross-fades may be defined to occur based on elements of the musical structure, such as the beginning of a chorus, on the bar line or on a specific beat of the bar. These times may be defined by a user through the content management system. Advantageously, pre-defining when a fade is to occur can further reduce or prevent loss of musical integrity during a cross-fade.

Once each audio stream has been associated with a respective fade from the library and (optionally) timings for planned cross-fades defined, the content management system outputs a data file of control information that can be used by a playback system. The playback system plays the audio files based on the control information provided by the content management system and user inputs provided in real time.
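The specification does not define a format for the data file of control information. Purely as an illustration, the sketch below serialises one possible layout as JSON; every field name and value (`work`, `files`, `streams`, `cross_fade_times_s`, the identifiers and the times) is an assumption invented for this example.

```python
import json

# Hypothetical control information for the two files of Figure 1.
control_info = {
    "work": "example-song",
    "files": {
        "pop": {"streams": {"1": "step", "2": "linear", "3": "s_curve"}},
        "dance_remix": {"streams": {"4": "step", "5": "linear", "6": "s_curve"}},
    },
    # Optional pre-defined cross-fade times in seconds (made-up values).
    "cross_fade_times_s": [32.0, 64.0],
}

# The content management system side writes the data...
encoded = json.dumps(control_info, indent=2)

# ...and the playback system side reads it back before playing.
decoded = json.loads(encoded)
fade_for_stream_2 = decoded["files"]["pop"]["streams"]["2"]
```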

Advantageously, the data file of control information may be commercialised. This can be as an alternative to the real-time analysis. For example, digital music packages may be sold which contain more than one production version of a song and wherein each audio stream is already associated with a pre-defined fade shape such that the likelihood of the musical integrity of the work being maintained throughout a cross-fade is increased. These digital music packages may be accessed through a bespoke, portable playback device or other playback systems, such as computers, tablets, smartphones and smart-televisions. Music content owners can therefore create bespoke interactive audio products simply and efficiently within the confines of the control information. This control information can be shared with other owners of the song as a social activity. Advantageously, this approach combines more than one production version of a song within a single commercial package, and so represents an attractive concept for commercial music packaging and sale which allows the listener to explore alternative versions and representations of the music.

With reference again to Figure 1, the playback system may compare the fade-outs of audio streams 1, 2 and 3 with the fade-ins of audio streams 4, 5 and 6. Based on this comparison, the characteristic shape of change in playback level over time of any one or more of the fades may be altered to reduce the occurrence of undesirable musical clashes.

The playback system may include a pre-selection unit which pre-selects a second audio file before cross-fading between the first audio file and the second audio file occurs. The pre-selection may be pseudo-random. For example, the pre-selection unit may pseudo-randomly pre-select song B, made up of five audio streams (vocals, hand percussion, acoustic guitar, piano and cello), to be cross-faded into from song A, also made up of five audio streams (vocals, drums, bass, electric guitar and synthesiser). The pre-selection unit may be intelligent. For example, if five audio streams make up a "rock" version of a song and five other streams make up an "acoustic" version of the same song, then the user can select to hear an "acoustic rock" version of the song, in which case the pre-selection unit could randomly (e.g. pseudo-randomly) select three streams from each file to cross-fade.

However, the intelligence could ensure that if a drum stream from one file is chosen then a bass stream from that same file is also chosen, to ensure a cohesive rhythm. Yet another option is for a user simply to decide which version, and thus which audio file, they wish to hear next. This selection may be scheduled via the content management system or made in real time using the playback system. Thus, a user can multiplex between multiple takes of a given musical performance to produce unique composite performances that are machine switched according to a control algorithm, wherein the algorithm may be user influenced or machine controlled.

Figure 3 shows a cross-fade between audio stream 7, part of audio file 7, and audio stream 8, part of audio file 8. Audio stream 7 is initially at a non-zero playback level and has a step-shaped fade to a zero playback level. Conversely, audio stream 8 is initially at a zero playback level and has a step-shaped fade to a non-zero playback level. Both audio streams are percussion tracks and so the step-change fades are timed to occur when both audio stream 7 and audio stream 8 have playback levels that are substantially silent and hence in-between percussion beats. Other audio streams with pronounced transients, such as certain styles of vocals, may also benefit from cross-fading during periods of silence or when the playback level is below a configured threshold. Thus the playback system may be configured to cross-fade between the audio files when some or all of the streams of audio file 7 and some or all of the corresponding streams of audio file 8 are substantially zero.

Advantageously, fading between audio files when one or more of the corresponding audio streams' playback levels are substantially zero or are below a threshold can reduce or prevent unwanted musical clashes.
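As a minimal sketch of how such periods might be located, the following assumes that each stream has already been reduced to a sequence of playback-level values (one per analysis frame, on a 0 to 1 scale). The function name, the frame-based representation and the `min_len` parameter are assumptions made for illustration and do not appear in the description.

```python
def quiet_windows(levels_a, levels_b, threshold, min_len=1):
    """Return (start, end) frame ranges, end exclusive, in which both
    streams stay below `threshold` for at least `min_len` frames."""
    windows = []
    start = None  # start frame of the quiet run currently being tracked
    n = min(len(levels_a), len(levels_b))
    for i in range(n):
        both_quiet = levels_a[i] < threshold and levels_b[i] < threshold
        if both_quiet:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                windows.append((start, i))
            start = None
    # Close a quiet run that extends to the end of the data.
    if start is not None and n - start >= min_len:
        windows.append((start, n))
    return windows


# Two percussion-like envelopes: beats are high values, gaps are zero.
stream_7 = [0.9, 0.0, 0.0, 0.8, 0.0, 0.0, 0.0]
stream_8 = [0.0, 0.0, 0.0, 0.7, 0.0, 0.9, 0.0]
candidates = quiet_windows(stream_7, stream_8, threshold=0.05, min_len=2)
```

A step fade of the kind shown in Figure 3 could then be scheduled at any frame inside one of the returned ranges.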

Another option is to cross-fade between audio files when one or more audio streams of the first audio file and one or more corresponding audio streams of the second audio file are in periods of play in which the respective musical pitches of corresponding streams are within a predefined frequency range of each other, or meet criteria of some other predefined musical relationship, such as polyphonic harmony. This relationship may be tied to the harmonic structure across a plurality of audio streams. To facilitate such fading, each audio stream may be scanned by the content management system and a map created of which pitches occur at what point in time in the musical work and for how long. This map may then form part of the data file of control information outputted to the playback system by the content management system. The playback system may then use the map to time the cross-fade to occur during periods of play in which the musical pitches of corresponding streams are substantially the same for a period sufficient to complete the cross-fade. More generally, cross-fades may occur when the same pitch families are present in both of the corresponding audio streams and thus, fading may occur when a specific chord is present in both audio streams. The same pitch may be defined within a tolerance to allow for tuning discrepancies and vibrato.
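The use of such a pitch map can be sketched as follows. The representation of the map as a list of `(start_s, end_s, frequency_hz)` entries, and the expression of the tolerance in cents (hundredths of a semitone), are assumptions made for this example; the description leaves both open.

```python
import math


def cents_apart(f1_hz, f2_hz):
    """Absolute pitch distance between two frequencies, in cents."""
    return abs(1200.0 * math.log2(f1_hz / f2_hz))


def matching_windows(map_a, map_b, tolerance_cents, fade_s):
    """Return (start_s, end_s) periods in which an entry of map_a overlaps
    an entry of map_b, the two pitches agree within `tolerance_cents`,
    and the overlap is long enough to complete a fade of `fade_s` seconds."""
    windows = []
    for start_a, end_a, freq_a in map_a:
        for start_b, end_b, freq_b in map_b:
            start = max(start_a, start_b)
            end = min(end_a, end_b)
            long_enough = end - start >= fade_s
            if long_enough and cents_apart(freq_a, freq_b) <= tolerance_cents:
                windows.append((start, end))
    return sorted(windows)


# Hypothetical pitch maps for two corresponding streams.
map_first = [(0.0, 2.0, 440.0), (2.0, 4.0, 494.0)]
map_second = [(0.5, 3.0, 442.0), (3.0, 4.0, 330.0)]
windows = matching_windows(map_first, map_second, tolerance_cents=20.0, fade_s=1.0)
```

Here 440 Hz and 442 Hz differ by roughly 8 cents, so the overlap from 0.5 s to 2.0 s qualifies, whereas the other overlaps fail the pitch test.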

Figure 4 shows four possible GUIs (a)-(d) for a playback device. GUIs (a), (b) and (c) have a circular, a triangular, and a square selection zone 30 respectively and each has a selector icon 32 which can be moved within the respective selection zone. Different regions of each selection zone correspond to different audio files. When the icon is moved to a different position within the selection zone by a listener, a cross-fade is initiated between the original audio file corresponding to the icon's original location and the new audio file corresponding to the icon's new location. Similarly, selection of different rectangles 34 on GUI (d) initiates a cross-fade between an initial audio file corresponding to the initial rectangle and a new audio file corresponding to the chosen rectangle. A listener can thus initiate cross-fades between audio files in real time. In GUIs (a), (b) and (c), depending on the position of the selector icon 32 on the selection zone 30, a cross-fade may be run to completion or may be suspended such that first and second audio files can be played simultaneously and indefinitely according to a given state of completion of the cross-fade. GUIs (a) and (b) can also allow two or more suspended cross-fades to be selected, resulting in the simultaneous and indefinite playing of three or more different audio files.

The GUIs of Figure 4 may be configured to receive an input which signifies the duration of the cross-fade. For example, the duration may be proportional to the duration of time the listener holds the icon 32 in its new location or holds down a chosen rectangle 34, or it may be dependent on speed of finger movement. The input may take many forms including a swipe or tap. Thus the duration of the cross-fade can be controlled in real-time, and may indeed be halted and left at a half-way point if desired by the listener. In other words, the listener may choose to accept a playback scenario that sits between different audio files meaning that all tracks (i.e. streams) of the files are audible at different playback levels until another cross-fade is actioned. Through a GUI a listener may therefore control which audio files are cross-faded between, when the cross-fade is initiated and the rate at which the cross-fade occurs. Thus, the listener may manipulate music playback dependent on their personal preference, mood and environment.
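The suspended cross-fade and the hold-dependent duration might be sketched as below. A linear fade is used purely for simplicity, in place of whichever fades from the library are associated with the streams; the function names, the `scale` factor and the duration limits are assumptions made for illustration.

```python
def crossfade_gains(progress):
    """Linear gains (outgoing, incoming) at a cross-fade progress in [0, 1].
    Values outside the range are clamped."""
    p = min(max(progress, 0.0), 1.0)
    return 1.0 - p, p


def fade_duration_from_hold(hold_s, scale=2.0, min_s=0.1, max_s=30.0):
    """Cross-fade duration proportional to how long the listener holds
    the icon or rectangle, clamped to assumed limits."""
    return min(max(hold_s * scale, min_s), max_s)


# A listener holds the icon for 1.5 s, then halts the fade half-way.
duration_s = fade_duration_from_hold(1.5)
gain_out, gain_in = crossfade_gains(0.5)
```

With the progress held at 0.5, the streams of both audio files remain audible at the returned gains until another cross-fade is actioned.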

Figure 5 shows another example GUI which is part of a content management system. This GUI allows a user (e.g. a programmer or listener) to control the type and shape of eight fades, each associated with a respective audio stream. The type category allows the user to choose between a step fade (type "cut") or a gradient fade (type "fade"), and the shape category allows the user to alter the point of change of the step fade or the form of the gradient fade. Additionally, for each audio stream, the "same playback level" feature, "same pitch" feature and/or "same harmonic analysis" feature can be enabled or disabled to alter the cross-fade's initiation point. By enabling "same playback level" for a specific audio stream, the audio file, comprising all eight audio streams, will be cross-faded by a playback system when that audio stream and the corresponding audio stream in the subsequent audio file are within a predefined playback level of each other. Typically this means that both audio streams are below a threshold playback level. Similarly, by enabling "same pitch" for a specific audio stream, the audio file will be cross-faded by a playback device when the audio stream and the corresponding audio stream in the subsequent audio file are both within a predefined frequency range of each other or meet criteria of some other predefined musical relationship. In a more sophisticated version of the GUI, the "same pitch" feature can be substituted by a "same harmonic analysis" feature, in which the audio file will be cross-faded by a playback device when the audio stream and the corresponding audio stream in the subsequent audio file are both harmonically related to each other or meet criteria of some other predefined musical relationship. Thus, a cross-fade can be made to occur only when all the initiation point conditions for all eight audio streams are met.
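One possible form of the check that all initiation point conditions are met is sketched below. The `StreamState` structure, the threshold values and the reading of "same playback level" as "both streams below a threshold" (the typical case noted above) are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class StreamState:
    """Instantaneous state and settings of one audio stream (hypothetical)."""
    level: float                      # current playback level, 0 to 1
    pitch_hz: float                   # current detected pitch
    same_level_enabled: bool = False  # "same playback level" switch
    same_pitch_enabled: bool = False  # "same pitch" switch


def may_initiate(current, subsequent, level_threshold=0.05, pitch_range_hz=5.0):
    """True only if every enabled condition holds for every pair of
    corresponding streams in the current and subsequent audio files."""
    for cur, nxt in zip(current, subsequent):
        if cur.same_level_enabled:
            if not (cur.level < level_threshold and nxt.level < level_threshold):
                return False
        if cur.same_pitch_enabled:
            if abs(cur.pitch_hz - nxt.pitch_hz) > pitch_range_hz:
                return False
    return True


current_file = [
    StreamState(level=0.01, pitch_hz=0.0, same_level_enabled=True),
    StreamState(level=0.60, pitch_hz=440.0, same_pitch_enabled=True),
]
subsequent_file = [
    StreamState(level=0.02, pitch_hz=0.0),
    StreamState(level=0.50, pitch_hz=442.0),
]
ready = may_initiate(current_file, subsequent_file)
```

A playback system could evaluate such a check on each analysis frame and start the cross-fade on the first frame for which it returns true.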

The content management system GUI may be used to configure audio streams to work in an autonomous "unlinked" manner, or in a "linked" manner which selects the cross-fade initiation point during run-time, dependent on the initiation point settings of the audio stream in question and the respective destination audio stream. For example, in a "linked" manner, an audio stream can be configured to use "same playback level" if both that audio stream and a destination audio stream are "same playback level" enabled (such as may occur when fading from one drum pattern to another drum pattern), but to use a long, smooth, linear fade if the transition is not between two audio streams with "same playback level" enabled (such as may occur when fading from a drum pattern to a strummed acoustic guitar part).
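The run-time decision made in the "linked" manner could be expressed, as a sketch only, in the following way. The returned dictionary keys and the 4-second duration standing in for a "long" fade are assumptions and are not taken from the description.

```python
def choose_linked_transition(source_same_level, destination_same_level,
                             long_fade_s=4.0):
    """Pick the transition for one stream pair in "linked" mode.

    If both streams have "same playback level" enabled, wait for that
    condition; otherwise fall back to a long, smooth, linear fade."""
    if source_same_level and destination_same_level:
        return {"initiation": "same_playback_level"}
    return {"initiation": "immediate", "shape": "linear",
            "duration_s": long_fade_s}


drums_to_drums = choose_linked_transition(True, True)
drums_to_guitar = choose_linked_transition(True, False)
```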

While the invention has been described in conjunction with the exemplary embodiments described above, many equivalent modifications and variations will be apparent to those skilled in the art when given this disclosure. Accordingly, the exemplary embodiments of the invention set forth above are considered to be illustrative and not limiting. Various changes to the described embodiments may be made without departing from the spirit and scope of the invention.

Claims

1. A method for forming cross-fades between audio files, the method including:

providing a plurality of audio files, each audio file comprising a plurality of respective audio streams;

2. A method according to claim 1, wherein each audio file is a different version of a given work such that any point in time in any one of the versions is mappable to an equivalent point in time in the, or each, other version.

3. A method according to claim 1 or 2, further including: defining one or more times in each audio file to perform a cross-fade.

4. A method of cross-fading between audio files, the method including:

playing a first one of the audio files; and

5. A method according to claim 4 further including providing a pre-selection unit which pre-selects the second audio file from the plurality of audio files before the cross-fading from the first audio file to the second audio file.

6. A method according to claims 4 or 5, further including providing a user interface configured to receive a user input to initiate a cross-fade;

wherein the cross-fade from the first audio file to the second audio file is performed on receipt of the user input.

7. A method according to claim 6, wherein the user interface is configured to receive a user input which signifies the duration of the cross-fade, and wherein the duration of the cross-fade from the first audio file to the second audio file which is performed on receipt of the user input is in accordance with the signified duration.

8. A method according to any one of claims 4 to 7, wherein the cross-fade from the first audio file to the second audio file is performed when one or more streams of the first audio file and one or more corresponding streams of the second audio file are in periods of play in which the streams meet criteria of a predefined musical relationship.

9. A method according to any one of claims 4 to 8, further including:

comparing the fades of the audio streams of the first audio file with the fades of the audio streams of the second audio file; and

adjusting the characteristic shape of change in playback level over time of any one or more of the fades.

10. A method according to any one of the previous claims, wherein the shapes of change in playback level of the fades are selected from the group consisting of: linear, logarithmic, inverted logarithmic, exponential, S-curve, polynomial and step.

11. A computer program comprising code which, when run on a computer, causes the computer to perform the method of any of the previous claims.

12. A computer-readable medium storing the computer program of claim 11.

13. A computer system programmed to perform the method of any of the previous claims.