
WO2016003206A1 - Method and device for processing multichannel audio signals - Google Patents

Method and device for processing multichannel audio signals

Info

Publication number
WO2016003206A1
Authority
WO
WIPO (PCT)
Prior art keywords
signal
channel
input
output
matrix
Prior art date
Application number
PCT/KR2015/006788
Other languages
English (en)
Korean (ko)
Inventor
백승권
서정일
성종모
이태진
장대영
김진웅
Original Assignee
한국전자통신연구원
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 한국전자통신연구원
Priority to DE112015003108.1T priority Critical patent/DE112015003108B4/de
Priority to CN201580036477.8A priority patent/CN106471575B/zh
Priority to CN201911108867.8A priority patent/CN110970041B/zh
Priority to US15/323,028 priority patent/US9883308B2/en
Priority to CN201911107604.5A priority patent/CN110895943B/zh
Priority to CN201911107595.XA priority patent/CN110992964B/zh
Priority claimed from KR1020150094195A external-priority patent/KR102144332B1/ko
Publication of WO2016003206A1 publication Critical patent/WO2016003206A1/fr
Priority to US15/870,700 priority patent/US10264381B2/en
Priority to US16/357,180 priority patent/US10645515B2/en

Classifications

    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/008Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing

Definitions

  • The present invention relates to a method and apparatus for processing a multichannel audio signal, and more particularly, to a method and apparatus for processing a multichannel audio signal more efficiently using an N-N/2-N structure.
  • MPEG Surround (MPS) is an audio codec for coding multichannel signals such as 5.1-channel and 7.1-channel signals. It is an encoding and decoding technology capable of compressing and transmitting a multichannel signal at a high compression rate. MPS is constrained by backward compatibility in its encoding and decoding process: a bitstream compressed through MPS and transmitted to the decoder must remain reproducible in mono or stereo even when a legacy audio codec is used.
  • The bitstream transmitted to the decoder must therefore include an encoded mono signal or stereo signal.
  • The decoder may further receive additional information with which the mono signal or stereo signal transmitted through the bitstream can be upmixed.
  • The decoder may recover the multichannel signal from the mono signal or the stereo signal using the additional information.
  • The present invention provides a method and apparatus for processing a multichannel audio signal via an N-N/2-N structure.
  • A multichannel audio signal processing method comprises: identifying an N/2-channel downmix signal and a residual signal generated from an N-channel input signal; applying the N/2-channel downmix signal and the residual signal to a first matrix; outputting, through the first matrix, a first signal that is input to N/2 decorrelators corresponding to N/2 OTT boxes and a second signal that is transmitted to a second matrix without being input to the N/2 decorrelators; outputting uncorrelated signals from the first signal through the N/2 decorrelators; applying the uncorrelated signals and the second signal to the second matrix; and generating an N-channel output signal through the second matrix.
  • The N/2 decorrelators may correspond to the N/2 OTT boxes.
  • The indices of the decorrelators may be repeatedly reused according to a reference value.
  • The number of decorrelators used may be N/2 minus the number of LFE channels, and an LFE channel may not use the decorrelator of its OTT box.
  • The second matrix may receive as input a vector including the second signal, the uncorrelated signal derived from the decorrelator, and the residual signal derived from the decorrelator.
  • The second matrix may receive as input a vector corresponding to a direct signal, consisting of the second signal and the residual signal derived from the decorrelator, and a vector corresponding to a diffuse (spread) signal, consisting of the uncorrelated signal derived from the decorrelator.
  • Generating the N-channel output signal may include, when subband domain temporal processing (STP) is used, applying a scale factor based on the diffuse signal and the direct signal to the diffuse-signal portion of the output signal so as to shape the temporal envelope of the output signal.
  • Generating the N-channel output signal may include, when guided envelope shaping (GES) is used, flattening and reshaping the envelope of the direct-signal portion for each channel of the N-channel output signal.
  • The size of the first matrix may be determined by the number of downmix channels to which the first matrix is applied and the number of decorrelators, and the elements of the first matrix may be determined by the CLD parameter or the CPC parameter.
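  • The two-matrix signal flow summarized above can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the residual signal is omitted, the decorrelator is replaced by a plain sample delay, the row ordering of the first matrix (direct rows first, decorrelator rows second) is chosen for convenience, and the names `delay_decorrelator`, `matmul`, and `decode_n_half_n` are hypothetical.

```python
def delay_decorrelator(signal, delay=1):
    """Stand-in decorrelator (assumption): a pure sample delay, not a real decorrelation filter."""
    if delay < 1:
        raise ValueError("delay must be at least 1 sample")
    return [0.0] * min(delay, len(signal)) + list(signal[:-delay])


def matmul(matrix, channels):
    """Multiply a (rows x cols) matrix by a list of `cols` channel signals, sample by sample."""
    n_samples = len(channels[0])
    return [
        [sum(row[c] * channels[c][t] for c in range(len(channels))) for t in range(n_samples)]
        for row in matrix
    ]


def decode_n_half_n(downmix, m1, m2, delay=1):
    """Upmix N/2 downmix channels to N output channels.

    downmix: list of N/2 channel signals
    m1: first matrix, N rows x N/2 columns (assumed row order: direct rows, then decorrelator rows)
    m2: second matrix, N rows x N columns
    """
    half = len(downmix)
    v = matmul(m1, downmix)        # apply the first matrix
    second_signal = v[:half]       # bypasses the decorrelators
    first_signal = v[half:]        # fed to the N/2 decorrelators
    uncorrelated = [delay_decorrelator(ch, delay) for ch in first_signal]
    return matmul(m2, second_signal + uncorrelated)  # apply the second matrix
```

  • With identity-like matrices the sketch simply passes the downmix through as the direct part and appends its delayed copy as the uncorrelated part; in a real decoder the matrix elements would come from the transmitted spatial parameters.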
  • A method of processing a multichannel audio signal according to another embodiment includes: identifying an N/2-channel downmix signal and an N/2-channel residual signal; and inputting the N/2-channel downmix signal and the N/2-channel residual signal to N/2 OTT boxes to generate an N-channel output signal, wherein the N/2 OTT boxes are arranged in parallel without being connected to each other, and an OTT box that outputs an LFE channel among the N/2 OTT boxes (1) receives only the downmix signal and not the residual signal, (2) uses only the CLD parameter of the CLD and ICC parameters, and (3) does not output an uncorrelated signal through a decorrelator.
  • An apparatus for processing a multichannel audio signal includes a processor that performs a multichannel audio signal processing method, the method comprising: identifying an N/2-channel downmix signal and a residual signal generated from an N-channel input signal; applying the N/2-channel downmix signal and the residual signal to a first matrix; outputting, through the first matrix, a first signal that is input to N/2 decorrelators corresponding to N/2 OTT boxes and a second signal that is transmitted to a second matrix without being input to the N/2 decorrelators; outputting uncorrelated signals from the first signal through the N/2 decorrelators; applying the uncorrelated signals and the second signal to the second matrix; and generating an N-channel output signal through the second matrix.
  • The N/2 decorrelators may correspond to the N/2 OTT boxes.
  • The indices of the decorrelators may be repeatedly reused according to a reference value.
  • The number of decorrelators used may be N/2 minus the number of LFE channels, and an LFE channel may not use the decorrelator of its OTT box.
  • The second matrix may receive as input a vector including the second signal, the uncorrelated signal derived from the decorrelator, and the residual signal derived from the decorrelator.
  • The second matrix may receive as input a vector corresponding to a direct signal, consisting of the second signal and the residual signal derived from the decorrelator, and a vector corresponding to a diffuse (spread) signal, consisting of the uncorrelated signal derived from the decorrelator.
  • Generating the N-channel output signal may include, when subband domain temporal processing (STP) is used, applying a scale factor based on the diffuse signal and the direct signal to the diffuse-signal portion of the output signal so as to shape the temporal envelope of the output signal.
  • Generating the N-channel output signal may include, when guided envelope shaping (GES) is used, flattening and reshaping the envelope of the direct-signal portion for each channel of the N-channel output signal.
  • The size of the first matrix may be determined by the number of downmix channels to which the first matrix is applied and the number of decorrelators, and the elements of the first matrix may be determined by the CLD parameter or the CPC parameter.
  • An apparatus for processing a multichannel audio signal according to another embodiment includes a processor that performs a method for processing a multichannel audio signal, the method including: identifying an N/2-channel downmix signal and an N/2-channel residual signal; and inputting the N/2-channel downmix signal and the N/2-channel residual signal to N/2 OTT boxes to generate an N-channel output signal,
  • wherein the N/2 OTT boxes are arranged in parallel without being connected to each other, and an OTT box that outputs an LFE channel among the N/2 OTT boxes (1) receives only the downmix signal and not the residual signal, (2) uses only the CLD parameter of the CLD and ICC parameters, and (3) does not output an uncorrelated signal through a decorrelator.
  • FIG. 1 is a diagram illustrating a 3D audio decoder, according to an exemplary embodiment.
  • FIG. 2 is a diagram for a domain processed by a 3D audio decoder, according to an exemplary embodiment.
  • FIG. 3 illustrates a USAC 3D encoder and a USAC 3D decoder, according to an exemplary embodiment.
  • FIG. 4 is a first diagram illustrating a detailed configuration of a first encoding unit of FIG. 3 according to an embodiment.
  • FIG. 5 is a second diagram illustrating a detailed configuration of a first encoding unit of FIG. 3 according to an embodiment.
  • FIG. 6 is a third diagram illustrating a detailed configuration of the first encoding unit of FIG. 3 according to an embodiment.
  • FIG. 7 is a fourth diagram illustrating a detailed configuration of the first encoding unit of FIG. 3 according to an embodiment.
  • FIG. 8 is a first diagram illustrating a detailed configuration of a second decoding unit of FIG. 3 according to an embodiment.
  • FIG. 9 is a second diagram illustrating a detailed configuration of a second decoding unit of FIG. 3 according to an embodiment.
  • FIG. 10 is a third diagram illustrating a detailed configuration of a second decoding unit of FIG. 3 according to an embodiment.
  • FIG. 11 is a diagram illustrating an example of implementing FIG. 3 according to an embodiment.
  • FIG. 12 is a diagram schematically illustrating FIG. 11 according to an embodiment.
  • FIG. 13 is a diagram illustrating a detailed configuration of a second encoding unit and a first decoding unit of FIG. 12 according to an embodiment.
  • FIG. 14 is a diagram illustrating a result of combining the first encoding unit and the second encoding unit of FIG. 11 and combining the first decoding unit and the second decoding unit, according to an exemplary embodiment.
  • FIG. 15 is a diagram schematically illustrating FIG. 14 according to an embodiment.
  • FIG. 16 is a diagram illustrating an audio processing scheme for an N-N/2-N structure according to an embodiment.
  • FIG. 17 is a diagram illustrating an N-N/2-N structure in a tree form according to an embodiment.
  • FIG. 18 illustrates an encoder and a decoder for an FCE structure according to an embodiment.
  • FIG. 19 illustrates an encoder and a decoder for a TCE structure according to an embodiment.
  • FIG. 20 illustrates an encoder and a decoder for an ECE structure according to an embodiment.
  • FIG. 21 illustrates an encoder and a decoder for a SiCE structure according to an embodiment.
  • FIG. 22 illustrates a process of processing an audio signal of 24 channels according to an FCE structure according to an embodiment.
  • FIG. 23 is a diagram illustrating a process of processing an audio signal of 24 channels according to an ECE structure according to an embodiment.
  • FIG. 24 is a diagram illustrating a process of processing an audio signal of 14 channels according to an FCE structure according to an embodiment.
  • FIG. 25 is a diagram illustrating a process of processing an audio signal of 14 channels according to an ECE structure and a SiCE structure according to an embodiment.
  • FIG. 26 illustrates a process of processing an 11.1 channel audio signal according to a TCE structure according to an embodiment.
  • FIG. 27 illustrates a process of processing an 11.1 channel audio signal according to an FCE structure according to an embodiment.
  • FIG. 28 is a diagram illustrating a process of processing an audio signal of 9.0 channels according to a TCE structure according to an embodiment.
  • FIG. 29 is a diagram illustrating a process of processing an audio signal of 9.0 channels according to an FCE structure according to an embodiment.
  • FIG. 1 is a diagram illustrating a 3D audio decoder, according to an exemplary embodiment.
  • A multichannel audio signal may be downmixed at an encoder, and the downmix signal may be upmixed at a decoder to restore the multichannel audio signal.
  • The description of the decoder corresponds to FIG. 1.
  • FIGS. 2 to 29 illustrate processes of processing a multichannel audio signal, which may correspond to any one of the bitstream, USAC 3D decoder, DRC-1, and format conversion components in FIG. 1.
  • FIG. 2 is a diagram for a domain processed by a 3D audio decoder, according to an exemplary embodiment.
  • the USAC decoder described in FIG. 1 is for coding a core band and processes an audio signal in one of a time domain and a frequency domain.
  • the DRC-1 processes the audio signal in the frequency domain when the audio signal is multiband.
  • Format conversion processes audio signals in the frequency domain.
  • FIG. 3 illustrates a USAC 3D encoder and a USAC 3D decoder, according to an exemplary embodiment.
  • The USAC 3D encoder may include both a first encoding unit 301 and a second encoding unit 302.
  • Alternatively, the USAC 3D encoder may include only the second encoding unit 302.
  • Similarly, the USAC 3D decoder may include both a first decoding unit 303 and a second decoding unit 304.
  • Alternatively, the USAC 3D decoder may include only the first decoding unit 303.
  • N may have a value larger than M.
  • When N is even, M may be N/2.
  • When N is odd, M may be (N-1)/2 + 1. In summary, M may be expressed as Equation 1.
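  • The even and odd cases stated above can be checked with a short helper. This is only an illustrative sketch; the function name `downmix_channel_count` is hypothetical and does not appear in the source.

```python
def downmix_channel_count(n: int) -> int:
    """Return M for an N-channel input: N/2 if N is even, (N-1)/2 + 1 if N is odd."""
    if n < 2:
        raise ValueError("n must be at least 2 so that N is larger than M")
    if n % 2 == 0:
        return n // 2
    return (n - 1) // 2 + 1
```

  • For instance, a 24-channel input would give M = 12 and a 9-channel input would give M = 5 under these rules.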
  • the second encoder 302 may generate a bitstream by encoding the downmix signal of the M channel.
  • the second encoder 302 may encode the downmix signal of the M channel, and a general audio coder may be utilized.
  • For example, the second encoding unit 302 may encode and transmit 24 channel signals.
  • When the N-channel input signal is encoded using only the second encoding unit 302, more bits are required than when it is encoded using both the first encoding unit 301 and the second encoding unit 302, and sound quality degradation can also occur.
  • The first decoding unit 303 may output an M-channel downmix signal by decoding the bitstream generated by the second encoding unit 302. The second decoding unit 304 may then generate an N-channel output signal by upmixing the M-channel downmix signal. The N-channel output signal may be restored so that it closely resembles the N-channel input signal input to the first encoding unit 301.
  • The second decoding unit 304 may decode the M-channel downmix signal, and a general audio coder may be utilized.
  • For example, when the decoding unit is a USAC coder, that is, an extended HE-AAC, it may decode a 24-channel downmix signal.
  • FIG. 4 is a first diagram illustrating a detailed configuration of a first encoding unit of FIG. 3 according to an embodiment.
  • the first encoding unit 301 may include a plurality of downmixing units 401.
  • The N-channel input signals input to the first encoding unit 301 may be grouped into pairs of two channels and then input to the downmixing units 401.
  • Each downmixing unit 401 may represent a two-to-one (TTO) box.
  • The downmixing unit 401 may extract spatial cues, namely Channel Level Difference (CLD), Inter Channel Correlation/Coherence (ICC), Inter Channel Phase Difference (IPD), Channel Prediction Coefficient (CPC), or Overall Phase Difference (OPD), from the two input channel signals, and may downmix the two-channel (stereo) input signal to generate a one-channel (mono) downmix signal.
  • The plurality of downmixing units 401 included in the first encoding unit 301 may form a parallel structure. For example, when an N-channel input signal is input to the first encoding unit 301 and N is an even number, N/2 downmixing units 401 implemented as TTO boxes may be required in the first encoding unit 301. In the case of FIG. 4, the first encoding unit 301 may downmix an N-channel input signal through N/2 TTO boxes to generate a downmix signal of M channels (N/2 channels).
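  • The parallel TTO arrangement for even N can be sketched as follows. The sketch makes several assumptions that the text does not specify: adjacent channels are paired, the downmix is a plain average of the two channels, and only the CLD cue is extracted (as an energy ratio in dB); the names `tto_downmix` and `first_encoding_unit_even` are hypothetical.

```python
import math


def tto_downmix(ch1, ch2, eps=1e-12):
    """Two-to-one box: return (mono downmix, CLD in dB) for one channel pair."""
    mono = [0.5 * (a + b) for a, b in zip(ch1, ch2)]  # assumed downmix rule: average
    e1 = sum(a * a for a in ch1) + eps                # energy of the first channel
    e2 = sum(b * b for b in ch2) + eps                # energy of the second channel
    return mono, 10.0 * math.log10(e1 / e2)


def first_encoding_unit_even(channels):
    """Downmix N channels (N even) to N/2 channels with N/2 parallel TTO boxes."""
    if len(channels) % 2 != 0:
        raise ValueError("this sketch covers only an even number of channels")
    downmix, clds = [], []
    for i in range(0, len(channels), 2):              # assumed pairing: adjacent channels
        mono, cld = tto_downmix(channels[i], channels[i + 1])
        downmix.append(mono)
        clds.append(cld)
    return downmix, clds
```

  • Because the boxes do not depend on each other, each pair can be processed independently, which is what the parallel structure in FIG. 4 conveys.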
  • FIG. 5 is a second diagram illustrating a detailed configuration of a first encoding unit of FIG. 3 according to an embodiment.
  • FIG. 4 illustrates a detailed configuration of the first encoding unit 301 when an input signal of N channels is input to the first encoding unit 301 and N is an even number.
  • FIG. 5 illustrates a detailed configuration of the first encoding unit 301 when an input signal of N channels is input to the first encoding unit 301 and N is an odd number.
  • the first encoding unit 301 may include a plurality of downmixing units 501.
  • the first encoding unit 301 may include (N-1) / 2 downmixing units 501.
  • The first encoding unit 301 may also include a delay unit 502 to process the one remaining channel signal.
  • The N-channel input signals input to the first encoding unit 301 may be grouped into pairs of two channels and then input to the downmixing units 501.
  • Each downmixing unit 501 may represent a TTO box.
  • The downmixing unit 501 may extract the spatial cues CLD, ICC, IPD, CPC, or OPD from the two input channel signals and may downmix the two-channel (stereo) input signal to generate a one-channel (mono) downmix signal.
  • The number of channels M of the downmix signal output from the first encoding unit 301 is determined by the number of downmixing units 501 and the number of delay units 502.
  • The delay value applied to the delay unit 502 may be the same as the delay value applied to the downmixing unit 501. If the M-channel downmix signal output by the first encoding unit 301 is a PCM signal, the delay value may be determined according to Equation 2 below.
  • Here, Enc_Delay represents the delay value applied to the downmixing unit 501 and the delay unit 502.
  • Delay1 (QMF Analysis) represents the delay generated by QMF analysis, and Delay2 (Hybrid QMF Analysis) represents the delay generated by hybrid QMF analysis.
  • The factor of 64 is applied because hybrid QMF analysis is performed after QMF analysis over 64 bands.
  • Alternatively, the delay value may be determined according to Equation 3.
  • FIG. 6 is a third diagram illustrating a detailed configuration of the first encoding unit of FIG. 3 according to an embodiment.
  • FIG. 7 is a fourth diagram illustrating a detailed configuration of the first encoding unit of FIG. 3 according to an embodiment.
  • The N-channel input signal is composed of an N′-channel input signal and a K-channel input signal.
  • The N′-channel input signal is input to the first encoding unit 301, and the K-channel input signal is not input to the first encoding unit 301.
  • M, the number of channels of the M-channel downmix signal input to the second encoding unit 302, may be determined by Equation 4.
  • FIG. 6 illustrates a structure of the first encoding unit 301 when N 'is an even number
  • FIG. 7 illustrates a structure of the first encoding unit 301 when N' is an odd number.
  • In FIG. 6, the N′-channel input signals may be input to the plurality of downmixing units 601, and the K-channel input signals may be input to the plurality of delay units 602.
  • Specifically, the N′-channel input signals may be input to the downmixing units 601, which represent N′/2 TTO boxes, and the K-channel input signals may be input to K delay units 602.
  • In FIG. 7, the N′-channel input signals may be input to the plurality of downmixing units 701 and one delay unit 702.
  • The K-channel input signals may be input to the plurality of delay units 702.
  • Specifically, the N′-channel input signals may be input to the downmixing units 701, which represent (N′-1)/2 TTO boxes, and to one delay unit 702.
  • The K-channel input signals may be input to K delay units 702.
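  • The unit counts described for FIGS. 6 and 7 can be tallied with a small helper. Equation 4 itself is not reproduced in this text, so the sketch only counts units from the structural description, assuming that each TTO box and each delay unit contributes exactly one downmix channel and that an odd N′ uses (N′-1)/2 TTO boxes plus one delay unit, by analogy with the odd-N case of FIG. 5. The name `count_units` is hypothetical.

```python
def count_units(n_prime: int, k: int) -> dict:
    """Count TTO boxes, delay units, and resulting downmix channels M.

    n_prime: number of channels routed to the TTO-based downmixing path
    k: number of channels routed only through delay units
    """
    if n_prime < 0 or k < 0:
        raise ValueError("channel counts must be non-negative")
    tto_boxes = n_prime // 2            # one TTO box per channel pair
    leftover = n_prime % 2              # odd N': one unpaired channel goes to a delay unit
    delay_units = k + leftover
    return {
        "tto_boxes": tto_boxes,
        "delay_units": delay_units,
        "m_channels": tto_boxes + delay_units,  # assumed: one output channel per unit
    }
```

  • Under these assumptions, N′ = 22 with K = 2 and N′ = 23 with K = 1 both yield 11 TTO boxes, 2 delay units, and M = 13.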
  • FIG. 8 is a first diagram illustrating a detailed configuration of a second decoding unit of FIG. 3 according to an embodiment.
  • The second decoding unit 304 may generate an N-channel output signal by upmixing the M-channel downmix signal transmitted from the first decoding unit 303.
  • The first decoding unit 303 may decode the M-channel downmix signal included in the bitstream.
  • The second decoding unit 304 may generate the N-channel output signal by upmixing the M-channel downmix signal using the spatial cues transmitted from the first encoding unit 301 of FIG. 3.
  • When N is even, the second decoding unit 304 may include a plurality of decorrelating units 801 and an upmixing unit 802.
  • When N is odd, the second decoding unit 304 may include a plurality of decorrelating units 801, an upmixing unit 802, and a delay unit 803. That is, when N is even, the delay unit 803 may be unnecessary, unlike the configuration shown in FIG. 8.
  • The delay value of the delay unit 803 may be different from the delay value applied in the encoder. FIG. 8 illustrates a case where N is an odd number in the N-channel output signal derived from the second decoding unit 304.
  • the delay value of the delay unit 803 may be determined according to Equation 5 below.
  • Dec_Delay represents the delay value of the delay unit 803.
  • Delay1 represents a delay value generated according to QMF analysis
  • Delay2 represents a delay value generated from hybrid QMF analysis
  • Delay3 represents a delay value generated from QMF synthesis.
  • Delay4 represents a delay value generated when the decorrelation filter is applied in the decorrelating unit 801.
  • the delay value of the delay unit 803 may be determined according to Equation 6 below.
  • Each of the plurality of decorrelating units 801 may generate an uncorrelated signal from the M-channel downmix signal input to the second decoding unit 304.
  • The uncorrelated signal generated in each of the plurality of decorrelating units 801 may be input to the upmixing unit 802.
  • The plurality of decorrelating units 801 may generate the uncorrelated signals using the M-channel downmix signal. That is, when the M-channel downmix signal transmitted from the encoder is used to generate the uncorrelated signals, sound quality degradation may be avoided when reproducing the sound field of the multichannel signal.
  • The M uncorrelated signals generated using the M-channel downmix signal can be represented as a vector.
  • The N-channel output signal output through the second decoding unit 304 can likewise be represented as a vector.
  • the second decoding unit 304 may generate an output signal of the N channel according to Equation 7 below.
  • M(n) denotes a matrix for upmixing the M-channel downmix signal at sample time n.
  • M(n) may be defined by Equation 8 below.
  • In Equation 8, 0 is a 2×2 zero matrix, and each nonzero block is a 2×2 matrix that may be defined as in Equation 9.
  • The spatial cues actually transmitted from the encoder are determined for each index b, which is a frame unit, whereas the matrix applied on a per-sample basis may be determined by interpolation between adjacent frames.
  • The elements of the 2×2 matrix in Equation 9 may be determined by Equation 10 according to the MPS method.
  • In Equation 10, some terms can be derived from the CLD, and the others can be derived from the CLD and the ICC. Equation 10 may be derived according to the processing method for spatial cues defined in MPS.
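  • As an illustration of how a 2×2 upmix block can be derived from a CLD and an ICC, the sketch below uses a parameterization commonly associated with MPEG Surround style OTT boxes: two gains from the CLD, and two angles from the CLD and the ICC. This form is written from memory and is an assumption; it is not necessarily identical to Equation 10, which is not reproduced in this text. The function name `ott_matrix` is hypothetical.

```python
import math


def ott_matrix(cld_db, icc):
    """Return a 2x2 matrix mapping [downmix, uncorrelated] to two output channels.

    cld_db: channel level difference in dB
    icc: inter-channel correlation, clipped to [-1, 1]
    """
    ratio = 10.0 ** (cld_db / 10.0)
    c1 = math.sqrt(ratio / (1.0 + ratio))      # gain of output 1, from the CLD only
    c2 = math.sqrt(1.0 / (1.0 + ratio))        # gain of output 2, from the CLD only
    alpha = 0.5 * math.acos(max(-1.0, min(1.0, icc)))              # from the ICC
    beta = math.atan(math.tan(alpha) * (c2 - c1) / (c2 + c1))      # from the CLD and ICC
    return [
        [c1 * math.cos(alpha + beta), c1 * math.sin(alpha + beta)],
        [c2 * math.cos(-alpha + beta), c2 * math.sin(-alpha + beta)],
    ]
```

  • With this choice, the ratio of the row energies reproduces the CLD and the normalized inner product of the two rows reproduces the ICC, which is the property a decoder needs when the downmix and uncorrelated inputs have equal energy and are mutually uncorrelated.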
  • In Equation 7, the operator applied to the two vectors interlaces their elements to create a new column vector.
  • In Equation 7, the interlaced vector may be determined according to Equation 11 below.
  • Equation 7 may be represented by Equation 12 below.
  • In Equation 12, grouping notation is used to clearly indicate the processing of the input signal and the output signal.
  • The M-channel downmix signals and the uncorrelated signals may be paired with each other and input to the upmixing matrix of Equation 12. That is, according to Equation 12, applying an uncorrelated signal to each of the M downmix channels minimizes sound quality distortion during upmixing and produces a sound field as close to the original signal as possible.
  • Equation 12 described above may also be represented by Equation 13 below.
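  • The interlacing of downmix and uncorrelated signals and the block-diagonal upmixing matrix described above can be sketched for a single sample time as follows. The equations themselves are not reproduced in this text, so the layout (pairs ordered as downmix then uncorrelated, one 2×2 block per pair, zeros elsewhere) is an assumption drawn from the surrounding description, and the names `interlace`, `block_diagonal`, and `upmix` are hypothetical.

```python
def interlace(m, d):
    """Interleave two equal-length vectors: [m0, d0, m1, d1, ...]."""
    if len(m) != len(d):
        raise ValueError("downmix and uncorrelated vectors must have equal length")
    out = []
    for mi, di in zip(m, d):
        out.extend([mi, di])
    return out


def block_diagonal(blocks):
    """Place 2x2 blocks on the diagonal of a square matrix; all other entries are zero."""
    size = 2 * len(blocks)
    big = [[0.0] * size for _ in range(size)]
    for i, blk in enumerate(blocks):
        for r in range(2):
            for c in range(2):
                big[2 * i + r][2 * i + c] = blk[r][c]
    return big


def upmix(m, d, blocks):
    """Upmix M downmix samples and M uncorrelated samples to 2M output samples."""
    x = interlace(m, d)
    return [sum(row[j] * x[j] for j in range(len(x))) for row in block_diagonal(blocks)]
```

  • Because the large matrix is block diagonal, the result equals applying each 2×2 block to its own (downmix, uncorrelated) pair, which is why the same upmix can be written either as one matrix or as one small matrix per OTT box.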
  • FIG. 9 is a second diagram illustrating a detailed configuration of a second decoding unit of FIG. 3 according to an embodiment.
  • the second decoding unit 304 may decode an M-channel downmix signal transmitted from the first decoding unit 303 to generate an N-channel output signal.
  • The second decoding unit 304 may also reflect the processing applied in the encoder.
  • Accordingly, the second decoding unit 304 may include a plurality of delay units 903.
  • The second decoding unit 304 may have the structure shown in FIG. 9. If N′ is an even number for the M-channel downmix signal satisfying Equation 4, the one delay unit 903 located below the upmixing unit 902 in the second decoding unit 304 of FIG. 9 may be excluded.
  • FIG. 10 is a third diagram illustrating a detailed configuration of a second decoding unit of FIG. 3 according to an embodiment.
  • the second decoding unit 304 may generate an N-channel output signal by upmixing an M-channel downmix signal transmitted from the first decoding unit 303.
  • The upmixing unit 1002 may include a plurality of signal processing units 1003, each representing a one-to-two (OTT) box.
  • Each of the plurality of signal processing units 1003 may generate a two-channel output signal using the downmix signal of one channel among the M-channel downmix signals and the uncorrelated signal generated by the decorrelating unit 1001.
  • The plurality of signal processing units 1003 arranged in parallel in the upmixing unit 1002 may generate output signals of N-1 channels.
  • When the delay unit 1004 is excluded from the second decoding unit 304, the plurality of signal processing units 1003 arranged in parallel in the upmixing unit 1002 may generate output signals of N channels.
  • Each signal processing unit 1003 may upmix according to Equation 13.
  • The upmixing performed by all the signal processing units 1003 may be represented by one upmixing matrix, as shown in Equation 12.
  • FIG. 11 is a diagram illustrating an example of implementing FIG. 3 according to an embodiment.
  • The first encoding unit 301 may include a plurality of TTO-box downmixing units 1101 and a plurality of delay units 1102.
  • The second encoding unit 302 may include a plurality of USAC encoders 1103.
  • The first decoding unit 303 may include a plurality of USAC decoders 1106, and the second decoding unit 304 may include a plurality of OTT-box upmixing units 1107 and a plurality of delay units 1108.
  • the first encoding unit 301 may output a downmix signal of M channels by using an input signal of N channels.
  • the downmix signal of the M channel may be input to the second encoding unit 302.
  • Among the M-channel downmix signals, pairs of one-channel downmix signals that have passed through the TTO-box downmixing units 1101 may be encoded in stereo form by a USAC encoder 1103 included in the second encoding unit 302.
  • Among the M-channel downmix signals, a downmix signal that has passed through a delay unit 1102 without passing through a TTO-box downmixing unit 1101 may be encoded in mono form or stereo form by a USAC encoder 1103.
  • That is, a one-channel downmix signal that has passed through one delay unit 1102 may be encoded in mono form by a USAC encoder 1103.
  • Two-channel downmix signals that have passed through two delay units 1102 may be encoded in stereo form by a USAC encoder 1103.
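  • The mono and stereo assignment described above can be sketched as a simple grouping step. This is a hypothetical illustration: the labels "tto" and "delay", the function name `assign_core_coders`, and the rule that an unpaired leftover channel is coded in mono form are all assumptions, since the text does not say how an odd number of TTO-derived channels would be handled.

```python
def assign_core_coders(sources):
    """Group downmix channels into stereo or mono coding units.

    sources: list with one label per downmix channel, "tto" if the channel came
             from a TTO-box downmixing unit, "delay" if it came from a delay unit
    Returns a list of (mode, channel_indices) tuples.
    """
    tto_idx = [i for i, s in enumerate(sources) if s == "tto"]
    delay_idx = [i for i, s in enumerate(sources) if s == "delay"]
    plan = []
    for group in (tto_idx, delay_idx):
        for j in range(0, len(group) - 1, 2):         # pair channels of the same origin
            plan.append(("stereo", (group[j], group[j + 1])))
        if len(group) % 2 == 1:                       # assumed: leftover channel is mono
            plan.append(("mono", (group[-1],)))
    return plan
```

  • For example, two TTO-derived channels plus one delayed channel would be planned as one stereo unit and one mono unit.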
  • the M channel signals may be encoded by the second encoding unit 302 to generate a plurality of bitstreams.
  • the plurality of bitstreams may be reformatted into one bitstream through the multiplexer 1104.
  • The bitstream generated by the multiplexer 1104 is transferred to the demultiplexer 1105, and the demultiplexer 1105 may demultiplex it into a plurality of bitstreams corresponding to the USAC decoders 1106 included in the first decoding unit 303.
  • The plurality of demultiplexed bitstreams may each be input to a USAC decoder 1106 included in the first decoding unit 303.
  • Each USAC decoder 1106 may decode its bitstream according to the method used for encoding by the USAC encoder 1103 included in the second encoding unit 302. The first decoding unit 303 may then output the M-channel downmix signal from the plurality of bitstreams.
  • the second decoding unit 304 may generate an output signal of the N channel using the downmix signal of the M channel.
  • the second decoding unit 304 may upmix a portion of the downmix signal of the input M channel using the upmixing unit 1107 of the OTT box.
  • Specifically, a one-channel downmix signal among the M-channel downmix signals is input to an upmixing unit 1107, and the upmixing unit 1107 may generate a two-channel output signal using the one-channel downmix signal and a signal uncorrelated with it.
  • The upmixing unit 1107 may generate the two-channel output signal using Equation 13.
  • As the plurality of upmixing units 1107 together perform upmixing M times, each using an upmixing matrix corresponding to Equation 13, the second decoding unit 304 may generate an N-channel output signal.
  • M in Equation 12 may be equal to the number of upmixing units 1107 included in the second decoding unit 304.
  • In the first encoding unit 301, K channels of the N-channel input signal may be included in the M-channel downmix signal through the delay units 1102 instead of the TTO-box downmixing units 1101.
  • In this case, the K-channel audio signal may be processed in the second decoding unit 304 by the delay units 1108 instead of the OTT-box upmixing units 1107.
  • The number of channels of the output signal output through the upmixing units 1107 may then be N-K.
  • FIG. 12 is a diagram schematically illustrating FIG. 11 according to an embodiment.
  • N-channel input signals may be input to the downmixing unit 1201 included in the first encoding unit 301 in pairs of two channels.
  • the downmixer 1201 may be configured as a TTO box, and downmix the two input signals to generate one downmix signal.
  • the first encoding unit 301 may generate an M-channel downmix signal from the N-channel input signals by using the plurality of downmixing units 1201 arranged in parallel.
  • N is an integer greater than M
  • M may be N / 2.
  • the stereo-type USAC encoder 1202 included in the second encoding unit 302 may generate a bitstream by encoding the two downmix signals output from the two downmixing units 1201.
  • the stereo-type USAC decoder 1203 included in the first decoding unit 303 may restore, from the bitstream, two one-channel downmix signals that form part of the M-channel downmix signal.
  • Two one-channel downmix signals may be input to two upmixing units 1204 respectively representing OTT boxes included in the second decoding unit 304. Then, the upmixing unit 1204 may generate two channel output signals constituting the N channel output signals using signals uncorrelated with one channel downmix signal.
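  • The parallel TTO arrangement of FIG. 12 can be sketched as follows. This is a hedged illustration only: the averaging downmix rule, the function names `tto_downmix` and `parallel_downmix`, and the sample values are assumptions made for the example. The channel level difference (CLD) is computed in its usual form as an energy ratio in dB, which may differ in detail from the patent's equations.

```python
import math


def tto_downmix(ch_a, ch_b, eps=1e-12):
    """Downmix two channels to one and estimate a CLD in dB (illustrative rule)."""
    mono = [0.5 * (a + b) for a, b in zip(ch_a, ch_b)]
    e_a = sum(a * a for a in ch_a)
    e_b = sum(b * b for b in ch_b)
    # eps keeps the ratio finite when one channel is silent.
    cld_db = 10.0 * math.log10((e_a + eps) / (e_b + eps))
    return mono, cld_db


def parallel_downmix(channels):
    """Apply one TTO box per channel pair: N channels in, N/2 channels out."""
    if len(channels) % 2 != 0:
        raise ValueError("the number of input channels must be even")
    return [tto_downmix(channels[i], channels[i + 1])
            for i in range(0, len(channels), 2)]


inputs = [[1.0, 1.0], [1.0, 1.0], [2.0, 0.0], [0.0, 0.0]]
result = parallel_downmix(inputs)
```

  • Each entry of `result` holds one downmix channel and the CLD of the pair it came from, so four input channels yield a two-channel downmix signal.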
  • FIG. 13 is a diagram illustrating a detailed configuration of a second encoding unit and a first decoding unit of FIG. 12 according to an embodiment.
  • the USAC encoder 1302 included in the second encoding unit 302 may include a downmixing unit 1303 of the TTO box, a spectral band replication (SBR) unit 1304, and a core encoding unit 1305.
  • the downmixing unit 1301 of the TTO box included in the first encoding unit 301 may downmix two of the N-channel input signals to generate one channel of the M-channel downmix signal.
  • the number of channels of the M channel may be determined according to the number of the downmixing units 1301.
  • the downmixer 1303 may generate a downmix signal of one channel by downmixing a pair of downmix signals of one channel output from the two downmixers 1301.
  • the SBR unit 1304 may extract only the low frequency band excluding the high frequency band from the mono signal. Then, the core encoding unit 1305 may generate a bitstream by encoding the mono signal of the low frequency band corresponding to the core band.
  • TTO-type downmixing may be performed in successive stages to generate a bitstream including an M-channel downmix signal from an N-channel input signal.
  • the downmixing unit 1301 of the TTO box may downmix two channel input signals having a stereo form among the N channel input signals.
  • the result output from each of the two downmixing units 1301 may be input to the downmixing unit 1303 of the TTO box as a part of the M-channel downmix signal. That is, four of the N-channel input signals may be reduced to a one-channel downmix signal through two successive stages of TTO-type downmixing.
  • the bitstream generated by the second encoding unit 302 may be input to the USAC decoder 1306 of the first decoding unit 303.
  • the USAC decoder 1306 included in the first decoding unit 303 may include a core decoding unit 1307, an SBR unit 1308, and an upmixing unit 1309 of an OTT box.
  • the core decoding unit 1307 may output a mono signal of the core band corresponding to the low frequency band using the bitstream. Then, the SBR unit 1308 may restore the high frequency band by copying the low frequency band of the mono signal.
  • the upmixing unit 1309 may generate a stereo signal constituting the downmix signal of the M channel by upmixing the mono signal output from the SBR unit 1308.
  • the upmixing unit 1310 of the OTT box included in the second decoding unit 304 may generate a stereo signal by upmixing a mono signal included in the stereo signal generated by the first decoding unit 303.
  • an OTT-type upmixing process may be performed in parallel to recover an N-channel output signal from a bitstream.
  • the upmixing unit 1309 of the OTT box may generate a stereo signal by upmixing a mono signal (one channel).
  • the two mono signals constituting the stereo signal as the output signal of the upmixing unit 1309 may be input to the upmixing unit 1310 of the OTT box.
  • the upmixing unit 1310 of the OTT box may output a stereo signal by upmixing the input mono signal. That is, a four-channel output signal can be generated by successively upmixing the mono signal in the OTT form.
  • FIG. 14 is a diagram illustrating a result of combining the first encoding unit and the second encoding unit of FIG. 11 and combining the first decoding unit and the second decoding unit, according to an exemplary embodiment.
  • the first encoding unit and the second encoding unit of FIG. 11 may be combined to be implemented as one encoding unit 1401 as illustrated in FIG. 14.
  • similarly, the first decoding unit and the second decoding unit of FIG. 11 may be combined and implemented as one decoding unit 1402, as shown in FIG. 14.
  • the encoding unit 1401 of FIG. 14 may include an encoding unit 1403 in which a downmixing unit 1404 of the TTO box is added to a USAC encoder that includes a downmixing unit 1405 of the TTO box, an SBR unit 1406, and a core encoding unit 1407.
  • the encoding unit 1401 may include a plurality of encoding units 1403 arranged in a parallel structure.
  • the encoding unit 1403 may correspond to a USAC encoder including the downmixing unit 1404 of the TTO box.
  • the encoding unit 1403 may generate a mono signal of one channel by continuously applying a TTO-type downmixing to four input signals of N channels.
  • the decoding unit 1402 of FIG. 14 may include a decoding unit 1410 in which an upmixing unit 1414 of an OTT box is added to a USAC decoder that includes a core decoding unit 1411, an SBR unit 1412, and an upmixing unit 1413 of an OTT box. In this case, the decoding unit 1402 may include a plurality of decoding units 1410 arranged in a parallel structure. Alternatively, the decoding unit 1410 may correspond to a USAC decoder including the upmixing unit 1414 of the OTT box.
  • the decoding unit 1410 may generate an output signal of four channels of the output signals of the N channel by continuously applying the OTT-type upmixing to the mono signal.
  • FIG. 15 is a diagram schematically illustrating FIG. 14 according to an embodiment.
  • the encoding unit 1501 may correspond to the encoding unit 1403 of FIG. 14.
  • the encoding unit 1501 may correspond to a modified USAC encoder. That is, the modified USAC encoder may be implemented by adding the downmixing unit 1503 of the TTO box to the original USAC encoder, which includes the downmixing unit 1504 of the TTO box, the SBR unit 1505, and the core encoding unit 1506.
  • the decoding unit 1502 may correspond to the decoding unit 1410 of FIG. 14.
  • the decoding unit 1502 may correspond to a modified USAC decoder. That is, the modified USAC decoder may be implemented by adding the upmixing unit 1510 of the OTT box to the original USAC decoder, which includes the core decoding unit 1507, the SBR unit 1508, and the upmixing unit 1509 of the OTT box.
  • FIG. 16 is a diagram illustrating an audio processing scheme for an N-N/2-N structure according to an embodiment.
  • an N-N / 2-N structure in which a structure defined in MPEG SURROUND is changed is illustrated.
  • spatial synthesis may be performed in a decoder as shown in Table 1.
  • spatial synthesis can transform the input signals from the time domain into a non-uniform subband domain through a hybrid Quadrature Mirror Filter (QMF) analysis bank.
  • here, non-uniform corresponds to hybrid.
  • the decoder then operates in the hybrid subband.
  • the decoder may generate an output signal from the input signals by performing spatial synthesis based on the spatial parameters passed by the encoder.
  • the decoder can then use the hybrid QMF synthesis bank to transform the output signals from the hybrid subband domain back to the time domain.
  • FIG. 16 illustrates a process of processing a multi-channel audio signal through a mixed matrix of spatial synthesis performed by a decoder.
  • MPEG SURROUND defines a 5-1-5 structure, a 5-2-5 structure, a 7-2-7 structure, and a 7-5-7 structure, but the present invention proposes an N-N / 2-N structure.
  • the decoder may generate the N-channel output signal by upmixing the N / 2 channel downmix signal.
  • the number of N channels in the N-N / 2-N structure of the present invention is not limited. That is, the N-N / 2-N structure may support not only a channel structure supported by the MPS but also a channel structure of a multichannel audio signal not supported by the MPS.
  • NumInCh refers to the number of channels of the downmix signal, and NumOutCh refers to the number of channels of the output signal.
  • in the N-N/2-N structure, NumInCh is N/2 and NumOutCh is N.
  • X_0 to X_{NumInCh-1} represent the downmix signals of the N/2 channels.
  • since the number of one-to-two (OTT) boxes is N/2, N, the number of channels of the output signal, must be even in order to process the N/2-channel downmix signal.
  • the input vector X to which the matrix is applied is a vector containing the N/2-channel downmix signal.
  • at most N/2 decorrelators may be used. However, if N, the number of channels of the output signal, exceeds 20, the filters of the decorrelators can be reused.
  • N, the number of channels of the output signal in the N-N/2-N structure, needs to be less than twice a limited specific number (e.g., N < 20). If an LFE channel is included in the output signal, the bound on N is set somewhat above twice the specific number to account for the number of LFE channels (e.g., N < 24).
  • the output result of the decorrelators may be replaced with the residual signal for a specific frequency region depending on the bitstream. If the LFE channel is one of the outputs of the OTT box, no decorrelator is used for the OTT box based on the upmix.
  • the decorrelators labeled 1 to M (e.g., M = NumInCh - NumLfe), the output results of the decorrelators (uncorrelated signals), and the residual signals correspond to different OTT boxes.
  • d_1 to d_M denote the uncorrelated signals that are the output results of the decorrelators D_1 to D_M, and res_1 to res_M denote the residual signals corresponding to the decorrelators D_1 to D_M.
  • the decorrelators D_1 to D_M correspond to different OTT boxes, respectively.
  • the vectors and matrices used in the N-N/2-N structure are defined as follows.
  • in the N-N/2-N structure, the input signals to the decorrelators are defined as the vector given by Equation 14.
  • some elements of the vector in Equation 14 may be input directly to the matrix M2 without being input to the N/2 decorrelators corresponding to the N/2 OTT boxes; these elements may be defined as direct signals. The remaining elements of the vector may be input to the N/2 decorrelators corresponding to the N/2 OTT boxes.
  • the vector w is composed of the direct signals, d_1 to d_M, which are the decorrelated signals output from the decorrelators, and res_1 to res_M, which are the residual signals. The vector w may be determined by Equation 15 below.
  • in Equation 15, the index set denotes the set of all k satisfying the given condition, and each decorrelated signal is the uncorrelated signal output from a decorrelator when a signal is input to that decorrelator.
  • for an OTT box OTTx with a given residual signal, the corresponding term denotes the signal output from the decorrelator.
  • the subbands of the output signal can be defined dependently for all time slots n and all hybrid subbands k.
  • the output signal y can be determined by Equation 16 from the vector w and the matrix M2.
  • the matrix M2 consists of NumOutCh rows and NumInCh - NumLfe columns, and may be defined by Equation 17 below.
  • the hybrid synthesis filter bank is a combination of the QMF synthesis bank and Nyquist synthesis banks, and the output signal can be transformed from the hybrid subband domain to the time domain through the hybrid synthesis filter bank.
  • the vectors are the same as described above, but the vector w may be divided into two vectors as shown in Equation 19 and Equation 20 below.
  • in these equations, the index set denotes the set of all k satisfying the given condition, and the decorrelated signal is the uncorrelated signal output from a decorrelator when its input signal is applied to that decorrelator.
  • using the two vectors of Equation 19 and Equation 20, the final output signal can be divided into two parts: one includes the direct signal and the other includes the diffuse signal. In other words, the first part is the result derived from the direct signal input directly to the matrix M2 without passing through the decorrelator, and the second part is the result derived from the diffuse signal output from the decorrelator and input to the matrix M2.
  • a diffuse signal is generated through the decorrelator for spatial synthesis.
  • the generated diffuse signal may be mixed with the direct signal.
  • the temporal envelope of the diffuse signal does not match the envelope of the direct signal.
  • subband domain temporal processing (STP) is used to shape the envelope of each diffuse signal portion of the output signal to match the temporal shape of the downmix signal transmitted from the encoder.
  • this processing may be implemented with envelope estimation, such as envelope ratio calculation for the direct and diffuse signals, or shaping of the upper spectral portion of the diffuse signal.
  • the temporal energy envelopes of the portion corresponding to the direct signal and the portion corresponding to the diffuse signal in the output signal generated through upmixing can be estimated.
  • the shaping factor may be calculated as the ratio between the temporal energy envelope of the portion corresponding to the direct signal and that of the portion corresponding to the diffuse signal.
  • whether STP is applied may be signaled in the bitstream. If so signaled, the diffuse signal portion of the output signal generated through upmixing can be processed via STP.
  • the downmix of the spatial upmix is approximated by the transmitted original downmix signal.
  • the direct downmix signal for (NumInCh-NumLfe) can be defined by Equation 21 below.
  • the broadband envelope of the downmix and the envelope of the diffuse signal portion of each upmix channel can be estimated according to Equation 22 using the normalized direct energy.
  • Equation 22 involves a band-pass factor and a spectral flattening factor.
  • a scale factor for the N-N/2-N structure can then be defined.
  • the scale factor is then applied to the diffuse signal portion of the output signal, thereby mapping the temporal envelope of the output signal substantially onto the temporal envelope of the downmix signal.
  • the diffuse signal portion processed by the scale factor in each channel of the N-channel output signal may be mixed with the direct signal portion.
  • whether the diffuse signal portion has been processed with the scale factor may be signaled for each channel of the output signal; a signaled flag value indicates that the diffuse signal portion was processed with the scale factor.
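  • The envelope-matching idea behind STP can be sketched as follows. This is a simplified illustration under stated assumptions, not the procedure of Equations 21 to 23: the function names `shape_diffuse` and `mix` are hypothetical, one scale factor is computed per time slot from plain energy ratios, and the band-pass factor, the spectral flattening factor, and any smoothing or limiting of the scale factor are omitted.

```python
import math


def shape_diffuse(direct, diffuse, eps=1e-12):
    """Scale each time slot of the diffuse (spread) part to the direct-part energy.

    direct, diffuse: lists of time slots, each a list of subband samples.
    """
    shaped = []
    for dir_slot, dif_slot in zip(direct, diffuse):
        e_dir = sum(abs(x) ** 2 for x in dir_slot)
        e_dif = sum(abs(x) ** 2 for x in dif_slot)
        # Shaping factor: square root of the direct-to-diffuse energy ratio.
        scale = math.sqrt(e_dir / (e_dif + eps))
        shaped.append([scale * x for x in dif_slot])
    return shaped


def mix(direct, shaped):
    """Add the shaped diffuse part back to the direct part."""
    return [[a + b for a, b in zip(ds, ss)] for ds, ss in zip(direct, shaped)]


direct = [[2.0, 0.0], [0.0, 0.0]]
diffuse = [[1.0, 0.0], [1.0, 1.0]]
shaped = shape_diffuse(direct, diffuse)
output = mix(direct, shaped)
```

  • In the second time slot the direct part is silent, so the diffuse part is scaled to zero; this is the behavior that keeps decorrelator output from smearing across transients.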
  • guided envelope shaping (GES) can recover the broadband envelope of the synthesized output signal.
  • GES includes a modified upmixing process that follows flattening and reshaping of the envelope of the direct signal portion for each channel of the output signal.
  • additional information of a parametric broadband envelope included in the bitstream may be used.
  • the additional information includes the envelope ratio of the envelope of the original input signal and the envelope of the downmix signal.
  • the envelope ratio at the decoder may be applied to the direct signal portion of each time slot included in the frame for each channel of the output signal.
  • the GES does not alter the spread signal portion for each channel of the output signal.
  • the diffuse signal and the direct signal of the output signal may be synthesized separately using the post-mixing matrix M2 modified in the hybrid subband domain according to Equation 24.
  • in Equation 24, the direct signal portion of the output signal y carries the direct signal and the residual signal, and the diffuse signal portion of the output signal y carries the diffuse signal. Overall, only the direct signal portion is processed by the GES.
  • the result of processing the GES may be determined according to Equation 25 below.
  • depending on the tree structure, the GES can extract the envelope for a particular channel of the upmixed output signal, except the LFE channel, from the downmix signal that the decoder spatially synthesizes.
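  • The way GES treats the two signal portions can be sketched as follows. This is a minimal sketch, assuming one transmitted envelope ratio per time slot; the function name `apply_ges` and the sample values are hypothetical, and Equations 24 and 25 are not reproduced here.

```python
def apply_ges(direct, diffuse, env_ratio):
    """Scale only the direct portion per time slot, then add the diffuse portion.

    direct, diffuse: lists of time slots, each a list of subband samples.
    env_ratio: one envelope ratio per time slot (assumed already decoded).
    """
    if not (len(direct) == len(diffuse) == len(env_ratio)):
        raise ValueError("all inputs must cover the same number of time slots")
    out = []
    for dir_slot, dif_slot, gain in zip(direct, diffuse, env_ratio):
        # The diffuse (spread) portion is left unchanged, as the text states.
        out.append([gain * a + b for a, b in zip(dir_slot, dif_slot)])
    return out


direct = [[1.0, 2.0], [1.0, 2.0]]
diffuse = [[0.5, 0.5], [0.5, 0.5]]
shaped_output = apply_ges(direct, diffuse, [2.0, 0.5])
```

  • Unlike the STP sketch style of processing, the gain here comes from side information rather than from energies measured in the decoder.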
  • the output signal in the N-N/2-N structure may be defined as shown in Table 3 below.
  • the input signal in the N-N/2-N structure may be defined as shown in Table 4 below.
  • the downmix signals in the N-N/2-N structure may be defined as shown in Table 5 below.
  • the matrix M1 and the matrix M2, which are defined for all time slots n and all hybrid subbands k, will now be described. These matrices are interpolated versions of matrices that are defined for a given parameter time slot and a given processing band m, based on the CLD, ICC, and CPC parameters valid for that parameter time slot and processing band.
  • the matrix M1 may be expressed as a pre-matrix.
  • the size of the matrix M1 depends on the number of channels of the downmix signal input to the matrix M1 and the number of decorrelators used in the decoder.
  • the elements of the matrix M1 may be derived from the CLD and / or CPC parameters.
  • M1 may be defined by Equation 26 below.
  • the matrix used to form the matrix M1 may be defined as follows.
  • the OTT box matrix may be defined differently according to the channel structure.
  • all channels of an input signal may be input in pairs by 2 channels to the OTT box. So, for the NN / 2-N structure, the number of OTT boxes is N / 2.
  • the size of the matrix depends on the number of OTT boxes, which equals the column size of the vector containing the input signal.
  • LFE upmixes based on OTT boxes are not considered in the N-N/2-N structure, since no decorrelator is needed for them.
  • all elements of the matrix may be either 1 or 0.
  • in the N-N/2-N structure, the matrix may be defined by Equation 28 below.
  • OTT boxes in the N-N/2-N structure represent a parallel processing stage, not a cascade. Therefore, no OTT box in the N-N/2-N structure is connected to any other OTT box. Thus, the matrix can be configured from unit matrices. In this case, the unit matrix may be a unit matrix of size N * N.
  • the calibration factor matrix can be applied to the downmix signal or to an externally supplied downmix signal.
  • in the N-N/2-N structure, the matrix may be defined by Equation 29 below.
  • Equation 29 uses a unit matrix of size NumInCh * NumInCh and a zero matrix of size NumInCh * NumInCh.
  • the number of channels of the downmix signal may be more than five.
  • the inverse matrix H may be, for all parameter sets and processing bands, a unit matrix whose size equals the number of columns of the input signal vector.
  • the matrix M2 defines how to combine the direct and uncorrelated signals to regenerate the multi-channel output signal, and may be defined by Equation 32 below.
  • the element of can be calculated from the equivalent model of the OTT box.
  • the OTT box includes a decorrelator and a mixing section.
  • the mono input signal input to the OTT box is transmitted to the decorrelator and the mixing unit, respectively.
  • the mixing unit may generate a stereo output signal using a mono input signal, an uncorrelated signal output through the decorrelator, and the CLD and ICC parameters.
  • the CLD controls localization in the stereo field
  • the ICC controls the stereo wideness of the output signal.
  • the result output from any OTT box can be defined by Equation 34 below.
  • in Equation 34, the elements are those of an arbitrary matrix for the OTT box with a given label, time slot, and parameter band.
  • the post gain matrix may be defined as in Equation 35 below.
  • CLD and ICC may be defined by Equation 37 below.
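  • The role of the CLD and ICC in an OTT box can be illustrated with a widely used parametric mixing form. This is a sketch under stated assumptions: Equations 34 to 37 are not reproduced in this text, so the gain formulas below are a commonly used parameterization and may differ from the patent's exact definitions, and the names `ott_gains` and `ott_upmix` are hypothetical. The sketch assumes the mono and decorrelated inputs have equal energy and are mutually uncorrelated.

```python
import math


def ott_gains(cld_db, icc):
    """Return the 2x2 mixing gains ((h11, h12), (h21, h22)) for one OTT box."""
    ratio = 10.0 ** (cld_db / 10.0)          # energy ratio between the outputs
    c1 = math.sqrt(ratio / (1.0 + ratio))
    c2 = math.sqrt(1.0 / (1.0 + ratio))
    alpha = 0.5 * math.acos(max(-1.0, min(1.0, icc)))
    beta = math.atan(math.tan(alpha) * (c2 - c1) / (c2 + c1))
    h11 = c1 * math.cos(alpha + beta)
    h12 = c1 * math.sin(alpha + beta)
    h21 = c2 * math.cos(beta - alpha)
    h22 = c2 * math.sin(beta - alpha)
    return (h11, h12), (h21, h22)


def ott_upmix(mono, decorr, cld_db, icc):
    """Mix a mono signal and its decorrelated version into two output channels."""
    (h11, h12), (h21, h22) = ott_gains(cld_db, icc)
    ch1 = [h11 * m + h12 * d for m, d in zip(mono, decorr)]
    ch2 = [h21 * m + h22 * d for m, d in zip(mono, decorr)]
    return ch1, ch2


(h11, h12), (h21, h22) = ott_gains(6.0, 0.5)
e1 = h11 * h11 + h12 * h12
e2 = h21 * h21 + h22 * h22
level_diff_db = 10.0 * math.log10(e1 / e2)
correlation = (h11 * h21 + h12 * h22) / math.sqrt(e1 * e2)
```

  • Under the stated assumptions, `level_diff_db` reproduces the requested CLD and `correlation` reproduces the requested ICC, which is the sense in which the CLD controls localization and the ICC controls stereo width.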
  • the decorrelators may be implemented by reverberation filters in the QMF subband domain.
  • the reverberation filters exhibit different filter characteristics depending on which of the hybrid subbands is currently being processed.
  • the reverberation filter is an IIR lattice filter.
  • the IIR lattice filters have different filter coefficients for different decorrelators in order to produce mutually uncorrelated, orthogonal signals.
  • the decorrelation process carried out by a decorrelator consists of several steps.
  • first, the output of the matrix M1 is input to a set of all-pass decorrelation filters.
  • the filtered signals can then be energy shaped.
  • energy shaping is shaping the spectral or temporal envelope to match uncorrelated signals more closely to the input signal.
  • the decorrelation filter consists of a plurality of all-pass (IIR) sections preceded by a fixed frequency-dependent delay.
  • the frequency axis may be divided into different regions so as to correspond to the QMF division frequency.
  • the length of the delay and the length of the filter coefficient vectors are the same.
  • the filter coefficients of the decorrelator with fractional delay due to additional phase rotation depend on the hybrid subband index.
  • the filters of the decorrelators have different filter coefficients to ensure orthogonality between the uncorrelated signals output from the decorrelators.
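  • The delay-plus-all-pass structure described above can be sketched with a single first-order section. This is an illustrative assumption, not the patent's filter: the function name `allpass_decorrelate`, the delay of 3 samples, and the coefficient 0.5 are hypothetical, and the actual per-subband delays, coefficient tables, fractional-delay phase rotation, and energy shaping are not reproduced.

```python
def allpass_decorrelate(x, delay, g):
    """Fixed delay followed by one first-order all-pass section.

    Transfer function of the section: H(z) = (-g + z^-1) / (1 - g z^-1),
    which has unit magnitude at all frequencies for |g| < 1.
    """
    if not -1.0 < g < 1.0:
        raise ValueError("|g| must be below 1 for a stable section")
    delayed = [0.0] * delay + list(x)
    y = []
    x_prev = 0.0
    y_prev = 0.0
    for sample in delayed:
        out = -g * sample + x_prev + g * y_prev
        y.append(out)
        x_prev, y_prev = sample, out
    return y


impulse = [1.0] + [0.0] * 255
response = allpass_decorrelate(impulse, delay=3, g=0.5)
energy = sum(v * v for v in response)
```

  • The impulse response keeps the input energy while spreading it in time; giving each decorrelator a different coefficient set is one way to make their outputs differ from one another, in line with the requirement stated above.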
  • N / 2 decorrelators are required.
  • the number of decorrelators may be limited to ten.
  • when there are more than 10 OTT boxes, the decorrelators can be reused according to a modulo-10 operation on the index.
  • Table 6 shows the index of the decorrelator in the decoder of the N-N/2-N structure.
  • the N/2 decorrelators are indexed in units of 10. That is, the 0th decorrelator and the 10th decorrelator have the same index.
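  • The reuse rule can be written in one line. The sketch below assumes zero-based OTT box indices, as suggested by the mention of the 0th and 10th decorrelators; the names `NUM_DECORRELATOR_FILTERS` and `decorrelator_index` are hypothetical, and Table 6 itself is not reproduced.

```python
NUM_DECORRELATOR_FILTERS = 10  # limit on distinct decorrelator filters stated in the text


def decorrelator_index(ott_index: int) -> int:
    """Map an OTT box index to a decorrelator filter index by modulo-10 reuse."""
    if ott_index < 0:
        raise ValueError("OTT box index must be non-negative")
    return ott_index % NUM_DECORRELATOR_FILTERS


indices = [decorrelator_index(i) for i in range(12)]
```

  • For twelve OTT boxes, boxes 10 and 11 reuse the filters of boxes 0 and 1.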
  • the N-N/2-N structure may be implemented by the syntax of Table 7.
  • bsTreeConfig may be implemented by Table 8.
  • bsNumInCh which is the number of channels of the downmix signal in the N-N / 2-N structure, may be implemented as shown in Table 9 below.
  • the number of LFE channels among the output signals may be implemented as shown in Table 10 below.
  • the channel order of the output signal may be implemented as shown in Table 11 according to the number of channels of the output signal and the number of LFE channels.
  • the audioChannelLayout shows the layout of the loudspeakers for actual playback.
  • when the loudspeaker layout includes an LFE channel, each LFE channel should be processed using one OTT box together with a non-LFE channel and may be located last in the channel list.
  • the LFE channel is located last in the channel lists L, Lv, R, Rv, Ls, Lss, Rs, Rss, C, LFE, Cvr, and LFE2.
  • FIG. 17 is a diagram illustrating an N-N/2-N structure in a tree form according to an embodiment.
  • the N-N / 2-N structure illustrated in FIG. 16 may be represented in a tree form as shown in FIG. 17.
  • all OTT boxes can regenerate two channels of output signals based on CLD, ICC, residual signal and input signal.
  • OTT boxes and their corresponding CLD, ICC, residual and input signals may be numbered in the order in which they appear in the bitstream.
  • the decoder which is a multichannel audio signal processing apparatus, may generate N-channel output signals from N / 2-channel downmix signals using N / 2 OTT boxes.
  • N / 2 OTT boxes are not implemented through a plurality of layers. That is, the OTT boxes may perform upmixing in parallel for each channel of the downmix signal of the N / 2 channel. In other words, one OTT box is not connected to another OTT box.
  • the left figure shows a case where the LFE channel is not included in the N-channel output signal
  • the right figure shows a case where the LFE channel is included in the N-channel output signal.
  • the N / 2 OTT boxes may generate the output signal of the N channel using the residual signal res and the downmix signal M.
  • the OTT box in which the LFE channel is output among the N / 2 OTT boxes may use only the downmix signal except the residual signal.
  • an OTT box that does not output the LFE channel upmixes the downmix signal using the CLD and the ICC, whereas an OTT box that outputs the LFE channel can upmix the downmix signal using only the CLD.
  • an OTT box that does not output the LFE channel generates an uncorrelated signal through the decorrelator, whereas an OTT box that outputs the LFE channel does not perform the decorrelation process and therefore does not generate an uncorrelated signal.
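  • The per-box differences between LFE and non-LFE OTT boxes can be summarized in a small configuration sketch. The function name `ott_box_plan` and the dictionary keys are hypothetical, and placing the LFE boxes at the end of the list is an assumption that follows the statement that LFE channels may be located last in the channel list.

```python
def ott_box_plan(num_out_ch: int, num_lfe: int) -> list:
    """List which parameters and signals each of the N/2 parallel OTT boxes uses."""
    if num_out_ch % 2 != 0:
        raise ValueError("the number of output channels must be even")
    num_boxes = num_out_ch // 2
    if not 0 <= num_lfe <= num_boxes:
        raise ValueError("each LFE channel needs its own OTT box")
    plan = []
    for box in range(num_boxes):
        is_lfe = box >= num_boxes - num_lfe   # assumption: LFE boxes come last
        plan.append({
            "box": box,
            "outputs_lfe": is_lfe,
            "uses_cld": True,                 # every box uses the CLD
            "uses_icc": not is_lfe,           # LFE boxes use only the CLD
            "uses_decorrelator": not is_lfe,  # no decorrelation for LFE boxes
            "uses_residual": not is_lfe,      # LFE boxes use only the downmix signal
        })
    return plan


plan = ott_box_plan(12, 2)
num_decorrelators = sum(1 for box in plan if box["uses_decorrelator"])
```

  • For a 12-channel output with two LFE channels, six boxes are listed and four decorrelators are needed, which matches the count NumInCh - NumLfe.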
  • FIG. 18 illustrates an encoder and a decoder for an FCE structure according to an embodiment.
  • a Four Channel Element (FCE) corresponds to a device that downmixes an input signal of four channels to generate an output signal of one channel, or upmixes an input signal of one channel to generate an output signal of four channels.
  • the FCE encoder 1801 may generate an output signal of one channel from four input signals using two TTO boxes 1803 and 1804 and the USAC encoder 1805.
  • the TTO boxes 1803 and 1804 may each downmix two of the four input signals to generate a one-channel downmix signal.
  • the USAC encoder 1805 may perform encoding in the core band of the downmix signal.
  • the FCE decoder 1802 performs the inverse of the operation performed by the FCE encoder 1801.
  • the FCE decoder 1802 may generate four channels of output signals from one channel of input signals using the USAC decoder 1806 and two OTT boxes 1807 and 1808.
  • OTT boxes 1807 and 1808 may upmix the input signals of one channel, respectively, decoded by USAC decoder 1806 to produce four channels of output signals.
  • the USAC decoder 1806 may perform decoding in the core band of the FCE downmix signal.
  • the FCE decoder 1802 may perform coding at a low bitrate in order to operate in a parametric mode using spatial cues such as CLD, IPD, and ICC.
  • the parametric type may be changed based on at least one of the operation bit rate and the total number of channels of the input signal, the resolution of the parameter, and the quantization level.
  • the FCE encoder 1801 and the FCE decoder 1802 can be widely used from 128 kbps to 48 kbps.
  • the number of channels (four) of the output signal of the FCE decoder 1802 is the same as the number of channels (four) of the input signal input to the FCE encoder 1801.
  • FIG. 19 illustrates an encoder and a decoder for a TCE structure according to an embodiment.
  • a Three Channel Element (TCE) corresponds to an apparatus for generating an output signal of one channel from three input signals or generating an output signal of three channels from an input signal of one channel.
  • the TCE encoder 1901 may include one TTO box 1903 and one QMF converter 1904 and one USAC encoder 1905.
  • the QMF converter may include a hybrid analyzer / synthesizer.
  • input signals of two channels may be input to the TTO box 1903, and input signals of one channel may be input to the QMF converter 1904.
  • the TTO box 1903 may downmix the input signals of the two channels to generate the downmix signal of one channel.
  • the QMF converter 1904 may convert an input signal of one channel into a QMF domain.
  • the output result of the TTO box 1903 and the output result of the QMF converter 1904 may be input to the USAC encoder 1905.
  • the USAC encoder 1905 may encode the core bands of the two channel signals input as the output result of the TTO box 1903 and the output result of the QMF converter 1904.
  • the TCE encoder 1901 may be mainly applied when the number of channels of the input signal is 11.1 or 9.0.
  • the TCE decoder 1902 may include one USAC decoder 1906, one OTT box 1907, and one QMF inverse converter 1908. The input signal of one channel received from the TCE encoder 1901 is decoded through the USAC decoder 1906, which may decode the core band from that input signal.
  • Input signals of two channels output through the USAC decoder 1906 may be input to the OTT box 1907 and the QMF inverse converter 1908 for each channel.
  • QMF inverse transformer 1908 may include a hybrid analyzer / synthesizer.
  • the OTT box 1907 may generate an output signal of two channels by upmixing an input signal of one channel.
  • the QMF inverse converter 1908 may inversely convert the input signal of one of the two channels of the input signal output through the USAC decoder 1906 from the QMF domain to the time domain or frequency domain.
  • the number of channels of the output signal of the TCE decoder 1902 (three) is equal to the number of channels of the input signal input to the TCE encoder 1901 (three).
  • FIG. 20 illustrates an encoder and a decoder for an ECE structure according to an embodiment.
  • an Eight Channel Element (ECE) corresponds to a device that downmixes an input signal of eight channels to generate an output signal of one channel, or upmixes an input signal of one channel to generate an output signal of eight channels.
  • the ECE encoder 2001 may generate an output signal of one channel from eight input signals using six TTO boxes 2003 to 2008 and USAC encoder 2009. First, input signals of eight channels are input as input signals of two channels, respectively, by four TTO boxes 2003 to 2006. Then, each of the four TTO boxes 2003 to 2006 may generate an output signal of one channel by downmixing input signals of two channels. The output results of the four TTO boxes 2003 to 2006 are input to two TTO boxes 2007 and 2008 connected to the four TTO boxes 2003 to 2006.
  • the two TTO boxes 2007 and 2008 may downmix the output signals of two channels among the output signals of the four TTO boxes 2003 to 2006 to generate the output signal of one channel. Then, the output results of the two TTO boxes 2007 and 2008 are input to the USAC encoder 2009 connected to the two TTO boxes 2007 and 2008. The USAC encoder 2009 may encode the input signal of two channels to generate the output signal of one channel.
  • the ECE encoder 2001 may generate an output signal of one channel from an input signal of eight channels using TTO boxes connected in a two-stage tree form.
  • the four TTO boxes 2003 to 2006 and the two TTO boxes 2007 and 2008 may be connected to each other in a cascade to form a tree of two layers.
  • the ECE encoder 2001 may be used in 48kbps mode or 64kbps mode for the case where the channel structure of the input signal is 22.2 or 14.0.
  • the ECE decoder 2002 may generate eight channels of output signals from one channel of input signals using six OTT boxes 2011 to 2016 and the USAC decoder 2010.
  • an input signal of one channel generated by the ECE encoder 2001 may be input to the USAC decoder 2010 included in the ECE decoder 2002.
  • the USAC decoder 2010 may then decode the core band of the input signal of one channel to generate an output signal of two channels.
  • the output signals of the two channels output from the USAC decoder 2010 may be input to the OTT box 2011 and the OTT box 2012 for each channel.
  • the OTT box 2011 may generate an output signal of two channels by upmixing an input signal of one channel.
  • the OTT box 2012 may upmix the input signal of one channel to generate an output signal of two channels.
  • output results of the OTT boxes 2011 and 2012 may be input to the OTT boxes 2013 to 2016 connected to the OTT boxes 2011 and 2012, respectively.
  • Each of the OTT boxes 2013 to 2016 may receive one channel of the two-channel output signals that are the upmix results of the OTT boxes 2011 and 2012. That is, each of the OTT boxes 2013 to 2016 may generate an output signal of two channels by upmixing an input signal of one channel. The total number of channels of the output signals generated by the four OTT boxes 2013 to 2016 is then eight.
  • the ECE decoder 2002 may generate eight channels of output signals from one channel of input signals using OTT boxes connected in a two-stage tree form.
  • the four OTT boxes 2013 to 2016 and the two OTT boxes 2011 and 2012 may be connected to each other in a cascade to form a tree of two layers.
  • the number of channels of the output signal of the ECE decoder 2002 (eight) is equal to the number of channels of the input signal input to the ECE encoder 2001 (eight).
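  • The two-layer tree of the ECE can be sketched as channel-count bookkeeping. This is a hedged illustration: `tto_stage` uses an averaging downmix as an assumption, `ott_stage` is a placeholder that only duplicates channels (a real OTT box would use spatial parameters and a decorrelator), and the USAC core coding between the two halves is omitted.

```python
def tto_stage(channels):
    """One layer of TTO boxes: downmix neighbouring pairs, halving the channel count."""
    if len(channels) % 2 != 0:
        raise ValueError("a TTO stage needs an even number of channels")
    return [[0.5 * (a + b) for a, b in zip(channels[i], channels[i + 1])]
            for i in range(0, len(channels), 2)]


def ott_stage(channels):
    """One layer of OTT boxes: placeholder upmix that doubles the channel count."""
    out = []
    for ch in channels:
        out.extend([list(ch), list(ch)])
    return out


eight = [[float(i)] for i in range(8)]
stage1 = tto_stage(eight)        # four TTO boxes: 8 -> 4 channels
stage2 = tto_stage(stage1)       # two TTO boxes: 4 -> 2 channels, fed to the core coder
decoded = ott_stage(ott_stage(stage2))  # two OTT layers: 2 -> 4 -> 8 channels
```

  • The encoder side uses 4 + 2 = 6 TTO boxes and the decoder side 2 + 4 = 6 OTT boxes, matching the box counts given for the ECE encoder and decoder.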
  • 21 illustrates an encoder and a decoder for a SiCE structure according to an embodiment.
  • a six channel element corresponds to an apparatus for generating one channel output signal from six channel input signals or six channel output signals from one channel input signal. .
  • The SiCE encoder 2101 may include four TTO boxes 2103 to 2106 and one USAC encoder 2107. Input signals of six channels may be input to three TTO boxes 2103 to 2105. Each of the three TTO boxes 2103 to 2105 may generate an output signal of one channel by downmixing an input signal of two channels among the input signals of six channels. Two of the three TTO boxes 2103 to 2105 may be connected to another TTO box. In the case of FIG. 21, the TTO boxes 2103 and 2104 may be connected to the TTO box 2106.
  • The output results of the TTO boxes 2103 and 2104 may be input to the TTO box 2106. As shown in FIG. 21, the TTO box 2106 may downmix the two input signals to generate an output signal of one channel. The output result of the TTO box 2105, on the other hand, is not input to the TTO box 2106. That is, the output result of the TTO box 2105 bypasses the TTO box 2106 and is input to the USAC encoder 2107.
  • The USAC encoder 2107 may generate an output signal of one channel by encoding the core band of the two-channel input signal formed by the output results of the TTO box 2105 and the TTO box 2106.
  • The SiCE encoder 2101 can process an input signal having a 14.0 channel structure at 48 kbps and 64 kbps.
  • The SiCE decoder 2102 may include one USAC decoder 2108 and four OTT boxes 2109 to 2112.
  • The output signal of one channel generated by the SiCE encoder 2101 may be input to the SiCE decoder 2102.
  • The USAC decoder 2108 of the SiCE decoder 2102 may then decode the core band of the one-channel input signal to generate two output signals. One of the two output signals generated by the USAC decoder 2108 is input to the OTT box 2109, and the other bypasses the OTT box 2109 and is input directly to the OTT box 2112.
  • The OTT box 2109 may then upmix the one-channel input signal delivered from the USAC decoder 2108 to generate an output signal of two channels. One of the two channels generated by the OTT box 2109 may be input to the OTT box 2110, and the other may be input to the OTT box 2111. Thereafter, each of the OTT boxes 2110 to 2112 may upmix an input signal of one channel to generate an output signal of two channels.
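The routing through the SiCE decoder, including the bypass of the OTT box 2109, can be traced with a small sketch. It is an illustration only: the helper `ott`, the function `sice_decode`, and the path strings are hypothetical, and the USAC decoder 2108 is modeled as a simple one-to-two splitter.

```python
from __future__ import annotations


def ott(signal: str, box: str) -> tuple[str, str]:
    """Model a one-to-two upmix by extending the routing path of a signal."""
    return (f"{signal}>{box}[0]", f"{signal}>{box}[1]")


def sice_decode(core: str = "core") -> list[str]:
    """Return the routing path of each of the six SiCE decoder outputs."""
    ch_a, ch_b = ott(core, "usac2108")   # USAC decoder 2108: one channel to two
    up_a0, up_a1 = ott(ch_a, "ott2109")  # only the first channel enters OTT box 2109
    return [
        *ott(up_a0, "ott2110"),
        *ott(up_a1, "ott2111"),
        *ott(ch_b, "ott2112"),           # second channel bypasses OTT box 2109
    ]


if __name__ == "__main__":
    for path in sice_decode():
        print(path)
```

Four of the six output paths pass through the OTT box 2109 and two do not, so the tree is unbalanced: one branch is upmixed twice after the USAC decoder and the other only once.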
  • The encoders of the FCE structure, the TCE structure, the ECE structure, and the SiCE structure described above with reference to FIGS. 18 to 21 may generate an output signal of one channel from an N-channel input signal using a plurality of TTO boxes.
  • One TTO box may exist inside the USAC encoder included in the encoder of the FCE structure, the TCE structure, the ECE structure, and the SiCE structure.
  • The encoders of the ECE structure and the SiCE structure may be configured with two layers of TTO boxes.
  • The TTO box may be bypassed.
  • The decoders of the FCE structure, the TCE structure, the ECE structure, and the SiCE structure may generate an N-channel output signal from an input signal of one channel using a plurality of OTT boxes.
  • One OTT box may exist inside the USAC decoder included in the decoder of the FCE structure, the TCE structure, the ECE structure, and the SiCE structure.
  • The decoders of the ECE structure and the SiCE structure may be configured with two layers of OTT boxes.
  • When the number of channels of the input signal is odd, as in the TCE structure and the SiCE structure, an OTT box may be bypassed.
  • FIG. 22 illustrates a process of processing an audio signal of 24 channels according to an FCE structure according to an embodiment.
  • The configuration of FIG. 22 handles a 22.2 channel structure and may operate at 128 kbps and 96 kbps.
  • The 24-channel input signal may be input, four channels at a time, to six FCE encoders 2201.
  • Each FCE encoder 2201 may generate a one-channel output signal from a four-channel input signal.
  • The one-channel output signal from each of the six FCE encoders 2201 illustrated in FIG. 22 may be output in the form of a bitstream through the bitstream formatter. That is, the bitstream may include six output signals.
  • The bitstream deformatter can then derive the six output signals from the bitstream. The six output signals may be input to six FCE decoders 2202, respectively. Then, as described with reference to FIG. 18, each FCE decoder 2202 may generate a four-channel output signal from a one-channel input signal. A total of 24 channels of output signals may be generated through the six FCE decoders 2202.
  • FIG. 23 is a diagram illustrating a process of processing an audio signal of 24 channels according to an ECE structure according to an embodiment.
  • FIG. 23 assumes a case where an input signal of 24 channels is input, as in the 22.2 channel structure described with reference to FIG. 22. However, the operation mode of FIG. 23 is assumed to operate at 48 kbps and 64 kbps, which are lower bit rates than those of FIG. 22.
  • The 24-channel input signal may be input, eight channels at a time, to three ECE encoders 2301. Then, as described with reference to FIG. 20, each ECE encoder 2301 may generate an output signal of one channel from input signals of eight channels. The one-channel output signal from each of the three ECE encoders 2301 illustrated in FIG. 23 may be output in the form of a bitstream through the bitstream formatter. That is, the bitstream may include three output signals.
  • The bitstream deformatter can then derive the three output signals from the bitstream.
  • The three output signals may be input to three ECE decoders 2302, respectively.
  • Each ECE decoder 2302 may generate an output signal of eight channels from an input signal of one channel.
  • A total of 24 channels of output signals may be generated through the three ECE decoders 2302.
  • FIG. 24 is a diagram illustrating a process of processing an audio signal of 14 channels according to an FCE structure according to an embodiment.
  • FIG. 24 illustrates a process of generating four channels of output signals from input signals of fourteen channels through three FCE encoders 2401 and one CPE encoder 2402. FIG. 24 shows a case in which operation is performed at a relatively high bit rate such as 128 kbps or 96 kbps.
  • Each of the three FCE encoders 2401 may generate a one-channel output signal from a four-channel input signal.
  • The one CPE encoder 2402 may generate an output signal of one channel by downmixing an input signal of two channels. The bitstream formatter may then generate a bitstream including four output signals from the output results of the three FCE encoders 2401 and the output result of the one CPE encoder 2402.
  • The bitstream deformatter extracts the four output signals from the bitstream; three of them can be delivered to three FCE decoders 2403 and the remaining one to one CPE decoder 2404. Each of the three FCE decoders 2403 may generate four channels of output signals from one channel of input signal. In addition, the one CPE decoder 2404 may generate two channels of output signals from one channel of input signal. That is, a total of 14 output signals may be generated through the three FCE decoders 2403 and the one CPE decoder 2404.
  • FIG. 25 is a diagram illustrating a process of processing an audio signal of 14 channels according to an ECE structure and a SiCE structure according to an embodiment.
  • The ECE encoder 2501 and the SiCE encoder 2502 process 14 input signals. Unlike FIG. 24, FIG. 25 applies to a relatively low bit rate (e.g., 48 kbps, 64 kbps).
  • The ECE encoder 2501 may generate an output signal of one channel from input signals of eight channels among the input signals of 14 channels.
  • The SiCE encoder 2502 may generate an output signal of one channel from input signals of six channels among the input signals of 14 channels.
  • The bitstream formatter may generate a bitstream using the two output signals that are the output results of the ECE encoder 2501 and the SiCE encoder 2502.
  • The bitstream deformatter may extract the two output signals from the bitstream. The two output signals may then be input to the ECE decoder 2503 and the SiCE decoder 2504, respectively.
  • The ECE decoder 2503 can generate eight channels of output signals using one channel of input signal.
  • The SiCE decoder 2504 can generate six channels of output signals using one channel of input signal. That is, a total of 14 output signals may be generated through the ECE decoder 2503 and the SiCE decoder 2504.
  • FIG. 26 illustrates a process of processing an 11.1 channel audio signal according to a TCE structure according to an embodiment.
  • Four CPE encoders 2601 and one TCE encoder 2602 may generate five channels of output signals from 11.1 channels of input signals.
  • In FIG. 26, an audio signal may be processed at a relatively high bit rate such as 128 kbps and 96 kbps.
  • Each of the four CPE encoders 2601 may generate a one-channel output signal from a two-channel input signal. Meanwhile, the one TCE encoder 2602 may generate a one-channel output signal from a three-channel input signal.
  • The output results of the four CPE encoders 2601 and the one TCE encoder 2602 may be input to a bitstream formatter and output as a bitstream. That is, the bitstream may include output signals of five channels.
  • The bitstream deformatter may extract the five channels of output signals from the bitstream.
  • The five output signals may then be input to four CPE decoders 2603 and one TCE decoder 2604.
  • Each of the four CPE decoders 2603 may then generate two channels of output signals from one channel of input signal.
  • The TCE decoder 2604 may generate three channels of output signals from one channel of input signal.
  • Output signals of 11 channels may be output through the four CPE decoders 2603 and the one TCE decoder 2604.
  • FIG. 27 illustrates a process of processing an 11.1 channel audio signal according to an FCE structure according to an embodiment.
  • The configuration of FIG. 27 may operate at a relatively low bit rate (e.g., 64 kbps, 48 kbps).
  • Three FCE encoders 2701 may generate three channels of output signals from twelve channels of input signals. Specifically, each of the three FCE encoders 2701 may generate an output signal of one channel from input signals of four channels among the input signals of twelve channels. The bitstream formatter may then generate a bitstream using the three channels of output signals output from the three FCE encoders 2701.
  • The bitstream deformatter may extract the three channels of output signals from the bitstream. The three output signals may then be input to three FCE decoders 2702, respectively. Thereafter, each FCE decoder 2702 may generate an output signal of four channels by using an input signal of one channel. Output signals of 12 channels may thus be generated through the three FCE decoders 2702.
  • FIG. 28 is a diagram illustrating a process of processing an audio signal of 9.0 channels according to a TCE structure according to an embodiment.
  • In FIG. 28, a process of processing input signals of nine channels is illustrated.
  • The configuration of FIG. 28 can process input signals of nine channels at relatively high bit rates (e.g., 128 kbps, 96 kbps).
  • The nine channels of input signals may be processed based on three CPE encoders 2801 and one TCE encoder 2802.
  • Each of the three CPE encoders 2801 may generate a one-channel output signal from a two-channel input signal.
  • The one TCE encoder 2802 may generate a one-channel output signal from a three-channel input signal. A total of four channels of output signals can then be input to the bitstream formatter and output as a bitstream.
  • The bitstream deformatter may extract the output signals of four channels included in the bitstream. The four channels of output signals may then be input to three CPE decoders 2803 and one TCE decoder 2804. Each of the three CPE decoders 2803 may generate two channels of output signals from one channel of input signal. Meanwhile, the one TCE decoder 2804 may generate a three-channel output signal from a one-channel input signal. A total of nine channels of output signals can then be generated.
  • FIG. 29 is a diagram illustrating a process of processing an audio signal of 9.0 channels according to an FCE structure according to an embodiment.
  • The configuration of FIG. 29 can process nine channels of input signals at relatively low bit rates (e.g., 64 kbps, 48 kbps).
  • The nine channels of input signals may be processed based on two FCE encoders 2901 and one SCE encoder 2902.
  • Each of the two FCE encoders 2901 may generate a one-channel output signal from a four-channel input signal.
  • The one SCE encoder 2902 may generate an output signal of one channel from an input signal of one channel. A total of three channels of output signals may then be input to the bitstream formatter and output as a bitstream.
  • The bitstream deformatter may extract the output signals of three channels included in the bitstream. The three output signals may then be input to two FCE decoders 2903 and one SCE decoder 2904. Each of the two FCE decoders 2903 may generate four channels of output signals from one channel of input signal. Meanwhile, the one SCE decoder 2904 may generate a one-channel output signal from a one-channel input signal. A total of nine channels of output signals can then be generated.
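The element combinations described for FIGS. 22 to 29 can be cross-checked with a small table. This is a sketch under stated assumptions: the dictionary layout and the names `ELEMENT_CHANNELS`, `CONFIGS`, and `summarize` are hypothetical, the element counts are taken from the description above, and each element is assumed to contribute exactly one downmixed signal to the bitstream.

```python
# Channels handled by each element type, as named in the description.
ELEMENT_CHANNELS = {"SCE": 1, "CPE": 2, "TCE": 3, "FCE": 4, "SiCE": 6, "ECE": 8}

# Number of elements of each type per figure (counts taken from the text).
CONFIGS = {
    "FIG. 22": {"FCE": 6},
    "FIG. 23": {"ECE": 3},
    "FIG. 24": {"FCE": 3, "CPE": 1},
    "FIG. 25": {"ECE": 1, "SiCE": 1},
    "FIG. 26": {"CPE": 4, "TCE": 1},
    "FIG. 27": {"FCE": 3},
    "FIG. 28": {"CPE": 3, "TCE": 1},
    "FIG. 29": {"FCE": 2, "SCE": 1},
}


def summarize(elements):
    """Return (total channels covered, number of downmix signals in the bitstream)."""
    channels = sum(ELEMENT_CHANNELS[name] * count for name, count in elements.items())
    streams = sum(elements.values())  # one downmixed signal per element
    return channels, streams


if __name__ == "__main__":
    for figure, elements in CONFIGS.items():
        channels, streams = summarize(elements)
        print(f"{figure}: {channels} channels -> {streams} downmix signals")
```

In this tabulation the lower bit rate configurations use wider elements, so the same channel count is carried by fewer downmix signals: 24 channels map to six signals with FCE elements but to three with ECE elements.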
  • Table 12 shows a configuration of a parameter set according to the number of channels of an input signal when spatial coding is performed.
  • bsFreqRes means the number of analysis bands equal to the number of USAC encoders.
  • The USAC encoder can encode the core band of the input signal.
  • The USAC encoder can control a plurality of encoders according to the number of input signals by using channel-to-object mapping information, which is based on metadata representing relationship information between channel elements (CPEs and SCEs), objects, and rendered channel signals.
  • Table 13 shows the bit rate and sampling rate used in the USAC encoder. According to the sampling rate of Table 13, encoding parameters of spectral band replication (SBR) may be appropriately adjusted.
  • Methods according to an embodiment of the present invention can be implemented in the form of program instructions that can be executed by various computer means and recorded in a computer readable medium.
  • The computer readable medium may include program instructions, data files, data structures, and the like, alone or in combination.
  • Program instructions recorded on the media may be those specially designed and constructed for the purposes of the present invention, or they may be of the kind well-known and available to those having skill in the computer software arts.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Stereophonic System (AREA)

Abstract

The invention relates to a multichannel audio signal processing method and a multichannel audio signal processing device. The multichannel audio signal processing method can generate N-channel output signals from N/2-channel downmix signals according to an N-N/2-N structure.
PCT/KR2015/006788 2014-07-01 2015-07-01 Procédé et dispositif de traitement de signaux audio multicanal WO2016003206A1 (fr)

Priority Applications (8)

Application Number Priority Date Filing Date Title
DE112015003108.1T DE112015003108B4 (de) 2014-07-01 2015-07-01 Verfahren und Vorrichtung zur Verarbeitung eines Mehrkanal-Audiosignals
CN201580036477.8A CN106471575B (zh) 2014-07-01 2015-07-01 多信道音频信号处理方法及装置
CN201911108867.8A CN110970041B (zh) 2014-07-01 2015-07-01 处理多信道音频信号的方法和装置
US15/323,028 US9883308B2 (en) 2014-07-01 2015-07-01 Multichannel audio signal processing method and device
CN201911107604.5A CN110895943B (zh) 2014-07-01 2015-07-01 处理多信道音频信号的方法和装置
CN201911107595.XA CN110992964B (zh) 2014-07-01 2015-07-01 处理多信道音频信号的方法和装置
US15/870,700 US10264381B2 (en) 2014-07-01 2018-01-12 Multichannel audio signal processing method and device
US16/357,180 US10645515B2 (en) 2014-07-01 2019-03-18 Multichannel audio signal processing method and device

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
KR20140082030 2014-07-01
KR10-2014-0082030 2014-07-01
KR1020150094195A KR102144332B1 (ko) 2014-07-01 2015-07-01 다채널 오디오 신호 처리 방법 및 장치
KR10-2015-0094195 2015-07-01

Related Child Applications (2)

Application Number Title Priority Date Filing Date
US15/323,028 A-371-Of-International US9883308B2 (en) 2014-07-01 2015-07-01 Multichannel audio signal processing method and device
US15/870,700 Continuation US10264381B2 (en) 2014-07-01 2018-01-12 Multichannel audio signal processing method and device

Publications (1)

Publication Number Publication Date
WO2016003206A1 true WO2016003206A1 (fr) 2016-01-07

Family

ID=55019650

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2015/006788 WO2016003206A1 (fr) 2014-07-01 2015-07-01 Procédé et dispositif de traitement de signaux audio multicanal

Country Status (1)

Country Link
WO (1) WO2016003206A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10645515B2 (en) 2014-07-01 2020-05-05 Electronics And Telecommunications Research Institute Multichannel audio signal processing method and device

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050195981A1 (en) * 2004-03-04 2005-09-08 Christof Faller Frequency-based coding of channels in parametric multi-channel coding systems
WO2007078254A2 (fr) * 2006-01-05 2007-07-12 Telefonaktiebolaget Lm Ericsson (Publ) Decodage personnalise de son d'ambiance multicanal
WO2007111568A2 (fr) * 2006-03-28 2007-10-04 Telefonaktiebolaget L M Ericsson (Publ) Procede et agencement pour un decodeur pour son d'ambiance multicanaux
WO2010050740A2 (fr) * 2008-10-30 2010-05-06 삼성전자주식회사 Appareil et procédé de codage/décodage d’un signal multicanal
KR20120099191A (ko) * 2006-01-11 2012-09-07 삼성전자주식회사 다운믹스된 신호로부터 멀티채널 신호 생성방법 및 그 기록매체

Similar Documents

Publication Publication Date Title
WO2010107269A2 (fr) Appareil et méthode de codage/décodage d'un signal multicanaux
WO2016024847A1 (fr) Procédé et dispositif de génération et de lecture de signal audio
WO2014137159A1 (fr) Procédé et appareil pour appliquer des transformées secondaires sur des résidus de couche d'amélioration
WO2012091464A4 (fr) Appareil et procédé pour coder/décoder une extension de largeur de bande haute fréquence
WO2013183977A1 (fr) Procédé et appareil de masquage d'erreurs de trames et procédé et appareil de décodage audio
WO2020242260A1 (fr) Procédé et dispositif de compression d'image basée sur l'apprentissage machine utilisant un contexte global
WO2017222140A1 (fr) Procédés et dispositifs de codage et de décodage comprenant un filtre en boucle à base de cnn
WO2012144878A2 (fr) Procédé de quantification de coefficients de codage prédictif linéaire, procédé de codage de son, procédé de déquantification de coefficients de codage prédictif linéaire, procédé de décodage de son et support d'enregistrement
WO2009157715A2 (fr) Procédé de conception de livre de codes pour système à multiples entrées et multiples sorties et procédé d'utilisation du livre de codes
WO2012144877A2 (fr) Appareil de quantification de coefficients de codage prédictif linéaire, appareil de codage de son, appareil de déquantification de coefficients de codage prédictif linéaire, appareil de décodage de son et dispositif électronique s'y rapportant
WO2010062123A2 (fr) Codec vocal/audio unifié (usac) pour le traitement d’une séquence de fenêtres sur la base d’une commutation de mode
WO2010008229A1 (fr) Appareil de codage et de décodage audio multi-objet prenant en charge un signal post-sous-mixage
WO2009131376A2 (fr) Système de communication à antennes multiples comprenant la mise à jour et le changement adaptatifs de livres de codes
WO2020032632A1 (fr) Procédé de codage/décodage d'images et dispositif associé
AU2012246799A1 (en) Method of quantizing linear predictive coding coefficients, sound encoding method, method of de-quantizing linear predictive coding coefficients, sound decoding method, and recording medium
WO2016204581A1 (fr) Procédé et dispositif de traitement de canaux internes pour une conversion de format de faible complexité
AU2012246798A1 (en) Apparatus for quantizing linear predictive coding coefficients, sound encoding apparatus, apparatus for de-quantizing linear predictive coding coefficients, sound decoding apparatus, and electronic device therefor
EP2443750A2 (fr) Appareil et procédé de codage arithmétique à base de contexte et appareil et procédé de décodage arithmétique à base de contexte
WO2016195455A1 (fr) Procédé et dispositif de traitement de signal vidéo au moyen d'une transformée basée graphique
WO2018139884A1 (fr) Procédé de traitement audio vr et équipement correspondant
WO2009116815A2 (fr) Appareil et procédé permettant d’effectuer un codage et décodage au moyen d’une extension de bande passante dans un terminal portable
WO2022158943A1 (fr) Appareil et procédé de traitement d'un signal audio multicanal
WO2015093742A1 (fr) Procédé et appareil destinés à l'encodage/au décodage d'un signal audio
WO2016003206A1 (fr) Procédé et dispositif de traitement de signaux audio multicanal
WO2016204579A1 (fr) Procédé et dispositif de traitement de canaux internes pour une conversion de format de faible complexité

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15815538

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 15323028

Country of ref document: US

WWE Wipo information: entry into national phase

Ref document number: 112015003108

Country of ref document: DE

122 Ep: pct application non-entry in european phase

Ref document number: 15815538

Country of ref document: EP

Kind code of ref document: A1
