WO2006009275A1 - Method and system for editing audiovisual files

Info

Publication number: WO2006009275A1
Application number: PCT/JP2005/013540
Authority: WO
Inventors: Pi-Chung Hsu; Hon-Wen Pon
Original assignee: Matsushita Electric Industrial Co., Ltd.
Priority date: 2004-07-19
Filing date: 2005-07-19
Publication date: 2006-01-26
Also published as: CN1725360A

Abstract

A method for editing audiovisual files is used to edit an input bit stream according to at least one copy task. The input bit stream has a plurality of video frames and a plurality of audio frames. The method is characterized in that, during execution of the copy task, a synchronization error table is used to first correct the positional relationship between the video frames and the audio frames before proceeding with copy selection so as to achieve the effect of effectively avoiding accumulation and propagation of stream errors during repeated editing.

Description

DESCRIPTION

METHOD AND SYSTEM FOR EDITING AUDIOVISUAL FILES

Technical Field

The invention relates to a method and system for editing audiovisual files, more particularly to a method and system for editing audiovisual files which can effectively prevent accumulation and propagation of stream errors during repeated editing.

Background Art

Motion Pictures Experts Group (MPEG) is a system set up by the International Standards Organization (ISO) specifically for digital video and audio compression. When bit streams are encoded, the system requires a minimum number of rules to be followed during encoding so that the receiver can unambiguously decode the encoded bit streams received thereby. A bit stream generally includes a video component, an audio component, and a system component. The system component defines information of how the video and audio components in a single bit stream can be combined and synchronized. Therefore, the MPEG standard defines a system for compressing encoded video and audio bit streams.

However, when a copied audiovisual segment is joined to another copied audiovisual segment, the audio components can hardly be kept synchronized. Depending on the MPEG audio layer, the audio frames may vary in size. Therefore, the synchronization problem is partly due to the fact that there is rarely a one-to-one correlation between the audio frames and the video frames. Thus, when an audiovisual segment is identified from a file for copying, the number of audio frames corresponding to the identified video frames will not be a predetermined value. Accordingly, when a video segment is copied from a file and is joined to another copied segment, the audio components in the copied segment might no longer be synchronized with the corresponding video frames. Once the video and audio frames are no longer synchronized, an error representing a synchronization discrepancy between the video and audio frames will be introduced into the resultant bit stream. This error can be expressed in terms of the number or percentage of the audio frames. For example, a synchronization error that is introduced by the joining of two bit stream segments may be as small as a fraction of an audio frame or as large as several audio frames. Although the error associated with the joining of only two bit stream segments may be merely a few audio frames under certain conditions, when a number of bit stream segments are joined in a more complicated editing task, the errors of the connected segments are summed. Therefore, the errors thus generated may be very large, and the resultant audio frames may be so seriously unsynchronized that they cannot be played back.

Furthermore, the sounds produced at the positions where the non-synchronized audio and video bit streams are joined are generally discontinuous; for instance, popping sounds are produced. Therefore, if discontinuities are introduced into the joined bit stream segments, annoying popping sounds will be introduced thereinto as well, so that not only are the resultant audio bit streams unsynchronized, but the playback effect is also intolerable.

In view of the foregoing, U.S. Patent No. 6,262,777 discloses a conventional method for synchronizing edited audiovisual files. This method can prevent bit stream errors from exceeding half an audio frame regardless of the number of joined segments after a continuous copying operation so that the video frames are substantially synchronized with the audio frames.

The flow of tab processing in the conventional method to avoid generation of errors exceeding half an audio frame when a plurality of audio and video segments are joined is illustrated with reference to Figure 1 (i.e., Figure 13 of U.S. Patent No. 6,262,777). Generally, the start audio frame 706, 710, 714, 718 of each audiovisual segment is referred to as a tab-in audio frame, whereas the end audio frame 708, 712, 716, 720 is referred to as a tab-out audio frame (see Figure 2). Tab processing is performed for each of the tab-in audio frame and the tab-out audio frame. When specific conditions are met, the tab-in and tab-out audio frames may be dropped or retained. In the following text, the tab-in audio frames and the tab-out audio frames are referred to as tabs. To facilitate description, the tab processing flow in Figure 1 will be illustrated by way of an example with reference to Figures 2 and 3 (i.e., Figures 14 and 15 of the aforesaid U.S. Patent No. 6,262,777). In the example of Figure 2, four segments SEGMENT A, SEGMENT B, SEGMENT C, and SEGMENT D are to be connected together.

Initially, in step 602, the existing stream errors present prior to processing of the tab 706 are determined. As shown in Figure 2, SEGMENT A is the first segment, and there is no previous tab that introduces an existing stream error. Therefore, the existing stream error is 0. Once it is determined in step 602 that the existing stream error is 0, step 604 is executed to determine the tab error of the tab 706. As shown in Figure 3, the tab error of the tab 706 is assumed to be 0.2 (i.e., 20% of an audio frame).

Further, in step 606, it is determined whether summation (herein referred to as cumulative error) of the existing stream error and the tab error is greater than half a frame (i.e., 0.5). If it is determined in step 606 that the summation is greater than half a frame, step 610 is executed to drop the tab processed in the aforesaid step 604. Conversely, if it is determined in step 606 that the summation is not greater than half a frame, step 608 is executed to retain the tab processed in the aforesaid step 604. Step 612 is executed after steps 608 and 610 are ended. In step 606, since the cumulative error of the tab 706 is 0.2, which is smaller than 0.5, step 608 is executed to retain the tab 706. Thereafter, in step 612, it is determined whether there are other tabs. If it is determined in step 612 that there are other tabs, the flow skips back to step 602. Otherwise, the tab processing flow is ended. In this example, since the aforementioned steps are concerned with the processing of the tab 706 only, it is determined in step 612 that there are other tabs, and the flow skips back to step 602.

It is determined in step 602 that the existing stream error of the tab 708 is 0.2, whereas it is determined in step 604 that the tab error of the tab 708 is 0.5. In the subsequent step 606, the flow will skip to step 610 to drop the tab 708 because the cumulative error of the tab 708 is 0.7. After the tab 708 is dropped, the new stream error becomes -0.3 (i.e., 0.7 - 1 = -0.3). Then, steps 602-612 are repeated to continue processing of the tabs 710, 712, 714, 716, 718, and 720 until all of the tabs have been processed. After tab processing, all the segments SEGMENT A, SEGMENT B, SEGMENT C, and SEGMENT D can be connected together.
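The conventional tab processing loop described above (steps 602 to 612) can be sketched as follows. This is only an illustrative sketch: the function name `process_tabs` and the list-based input are assumptions made here, and the rule that a retained tab carries the cumulative error forward as the next existing stream error is inferred from the worked values for tabs 706 and 708.

```python
def process_tabs(tab_errors, threshold=0.5):
    """Decide, tab by tab, whether to drop or retain each tab.

    tab_errors: tab errors in processing order, in fractions of an audio frame.
    Returns (decisions, stream_errors), where stream_errors[i] is the stream
    error left behind after tab i has been processed.
    """
    stream_error = 0.0                          # step 602: no error before the first tab
    decisions, stream_errors = [], []
    for tab_error in tab_errors:                # step 604: tab error of the current tab
        cumulative = stream_error + tab_error   # step 606: cumulative error
        if cumulative > threshold:
            decisions.append("drop")            # step 610
            stream_error = cumulative - 1.0     # dropping removes one whole audio frame
        else:
            decisions.append("retain")          # step 608
            stream_error = cumulative
        # Rounding only keeps floating-point noise out of the illustration.
        stream_errors.append(round(stream_error, 6))
    return decisions, stream_errors


# Tabs 706 and 708 from the example: tab errors 0.2 and 0.5.
decisions, stream_errors = process_tabs([0.2, 0.5])
print(decisions, stream_errors)
```

With the two tab errors given in the text, the sketch retains tab 706 and drops tab 708, leaving a stream error of -0.3.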

The conventional method utilizes the tab processing operation as described above to ensure that the stream error does not exceed 0.5 audio frame so that the video frames are substantially synchronized with the audio frames regardless of the number of segments that are joined during a copying operation. However, while the conventional method can effectively prevent accumulation of errors during a single editing task of the audiovisual segments, growth and propagation of cumulative stream errors are hardly avoided when these edited audiovisual segments are subjected to repeated editing tasks.

For example, the original audiovisual segment (s) shown in Figure 4 is connected to another audiovisual segment (X). According to the conventional method, to avoid growth of cumulative stream errors, the audio frame (a) will be dropped, and it is assumed that the current existing stream error is -0.5 audio frame. Therefore, in Figure 5, the tab-processed audiovisual segment (s) is joined to the audiovisual segment (X), and all the audio frames will be shifted to the left by 0.5 audio frame beginning from (b) so as to form a new audiovisual segment (s').

Further, the second editing is commenced to cut an audiovisual segment beginning from a video frame 2 from the audiovisual segment (s') for joining to another audiovisual segment (Y). During tab processing of the audiovisual segment cut from the audiovisual segment (s'), the audio frame (d) will be dropped, and the current existing stream error is set to -0.5 audio frame. As shown in Figure 6, when the cut audiovisual segment is connected to the audiovisual segment (Y), all the audio frames will be shifted to the left by 0.5 audio frame beginning from the audio frame (e) so as to form a new audiovisual segment (s''). Therefore, the error of the audiovisual segment that has been edited for a second time becomes one audio frame, i.e., |-0.5 - 0.5| = 1.

Thereafter, editing is carried out once again to cut an audiovisual segment beginning from the video frame 3 from the audiovisual segment (s'') for joining to yet another audiovisual segment (Z). During tab processing of the audiovisual segment cut from the audiovisual segment (s''), the audio frame (g) will be dropped, and the existing stream error is assumed to be -0.5. As shown in Figure 7, when the tab-processed audiovisual segment is connected to the audiovisual segment (Z), all the audio frames will be shifted to the left by 0.5 audio frame beginning from the audio frame (h) so as to form a new audiovisual segment (s'''). Therefore, the stream error thus accumulated after the three editing operations is 1.5, i.e., |-0.5 - 0.5 - 0.5| = 1.5.

If editing is executed for the fourth time, an audiovisual segment beginning from the video frame 4 is cut from the audiovisual segment (s'''), and the audio frame (j) is dropped and the existing stream error is set to -0.5 during tab processing. As shown in Figure 8, when the processed audiovisual segment is connected to a further audiovisual segment (V), the audio frames as a whole will be shifted to the left by 0.5 audio frame beginning from the audio frame 5 so as to form a new audiovisual segment (s''''). In this way, the stream error thus accumulated after four editing operations is increased to 2, i.e., |-0.5 - 0.5 - 0.5 - 0.5| = 2. For instance, the audio frames which lie below the video frame 5 in the unedited audiovisual segment (s) are three audio frames 5. However, two of the three audio frames 5 of the audiovisual segment (s'''') have shifted to the left to below the video frame 4 after the fourth editing operation.
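The growth of the error over the four editing passes described above amounts to a running sum of the per-pass stream errors. The following sketch merely reproduces that arithmetic; the function name `accumulated_errors` is an assumption, and the per-pass value of -0.5 is the one assumed in the example.

```python
def accumulated_errors(per_edit_errors):
    """Return the magnitude of the accumulated stream error after each edit."""
    total = 0.0
    magnitudes = []
    for error in per_edit_errors:
        total += error              # each pass shifts the audio by a further amount
        magnitudes.append(abs(total))
    return magnitudes


# Four editing passes, each leaving a stream error of -0.5 audio frame.
print(accumulated_errors([-0.5] * 4))  # [0.5, 1.0, 1.5, 2.0]
```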

According to the foregoing, the conventional method is still unable to inhibit the growth and propagation of stream errors during repeated editing, so that the audio components in the bit stream that was edited repeatedly might no longer be synchronized with the proper video components.

Disclosure of Invention

In view of the inability of the prior art to avoid accumulation of stream errors during repeated editing of a bit stream, the inventors of this application contemplated the logging or recording of the amount of shifting of audio frames during joining of audiovisual segments so that, during repeated editing, the audio frames are first restored to their positions prior to the joining of the audiovisual segments based on the recorded amount of the shifting of the audio frames before further tab processing and segment joining are performed, thereby avoiding accumulation of stream errors during repeated editing.

Therefore, the object of the present invention is to provide a method and system for editing audiovisual files, which can effectively avoid accumulation and propagation of stream errors during repeated editing.

Accordingly, a method for editing audiovisual files of this invention is adapted to process an input bit stream according to a copy task. The input bit stream has a plurality of video frames and a plurality of audio frames. The method comprises the following steps:

A) identifying a mark-in video frame and a mark-out video frame from the video frames according to the copy task;

B) when presence of a synchronization error table corresponding to the input bit stream is determined, correcting positional relationship between the video frames and the audio frames according to the synchronization error table, the synchronization error table recording a stream error of the audio frames;

C) identifying a tab-in audio frame associated with the mark-in video frame and a tab-out audio frame associated with the mark-out video frame from the audio frames; and

D) copying an audiovisual segment from the input bit stream, the audiovisual segment including a plurality of video frames from the mark-in video frame to the mark-out video frame, and a plurality of audio frames from the tab-in audio frame to the tab-out audio frame.

This invention utilizes the synchronization error table to correct the positional relationship between the video and audio frames before performing tab processing and segment joining so as to achieve the effects of effectively avoiding accumulation and propagation of stream errors during repeated editing.

Brief Description of Drawings

Other features and advantages of the present invention will become apparent in the following detailed description of the preferred embodiment with reference to the accompanying drawings, of which:

Figure 1 is a flowchart of tab processing according to U.S. Patent No. 6,262,777;

Figure 2 is a schematic diagram of an example of a plurality of audiovisual segments to be joined together according to the method of U.S. Patent No. 6,262,777;

Figure 3 is a table showing calculation results during processing of a plurality of the tabs shown in Figure 2;

Figures 4 to 8 are schematic diagrams of the audiovisual segments when the audiovisual segments are subjected to repeated editing according to the method of U.S. Patent No. 6,262,777;

Figure 9 is a schematic diagram of the preferred embodiment of a system for editing audiovisual files according to this invention;

Figure 10 is a schematic diagram of a bit stream in this invention;

Figure 11 is a flowchart of the preferred embodiment of a method for editing audiovisual files according to this invention;

Figure 12 is a flowchart of a copying operation in Figure 11;

Figure 13 is a schematic diagram showing a bit stream cut from that of Figure 10 according to the flowchart of Figure 12;

Figure 14 is a schematic diagram illustrating copying of an audiovisual segment according to the flowchart of Figure 12;

Figure 15 is a flowchart of tab processing in the flowchart of Figure 11;

Figure 16 is a schematic view showing how a plurality of audiovisual segments are joined together during tab processing in this preferred embodiment; and

Figures 17 to 22 are schematic diagrams of audiovisual segments that are subjected to repeated editing according to this invention.

Best Mode for Carrying Out the Invention

Referring to Figure 9, the preferred embodiment of a system for editing audiovisual files according to this invention is shown to include an editing engine 11 for executing audiovisual file editing, and a storage medium 12 such as a hard disk. The editing engine 11 is used to edit one or more input bit streams. To facilitate description, the editing engine 11 is exemplified herein as one that is used for editing two input bit streams, i.e., A.MPEG and B.MPEG. Of course, those skilled in the art are aware that the editing engine 11 can also be used to edit one or more input bit streams, and is not limited to that which is disclosed in this embodiment. As shown in Figure 10, an audiovisual bit stream 13 generally has a video bit stream 131 and an audio bit stream 132. The video bit stream 131 has a plurality of video frames, each of which has an exclusive serial number. The audio bit stream 132 has a plurality of audio frames. The storage medium 12 stores an edit list 121. The edit list 121 records a plurality of copy tasks inputted by an operator. Each copy task contains a specific input bit stream, a start serial number, and an end serial number, e.g., COPY 10...25 A.MPEG. The storage medium 12 is also provided for access of data by the editing engine 11.
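As an illustration of how a copy task in the edit list 121 might be represented, the following sketch parses a line of the form given in the example (COPY 10...25 A.MPEG). The specification gives only that single example, so the exact syntax accepted here, the `CopyTask` record, and the function name `parse_copy_task` are all assumptions.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class CopyTask:
    start: int    # start serial number (mark-in video frame)
    end: int      # end serial number (mark-out video frame)
    stream: str   # name of the input bit stream


# Assumed syntax: the word COPY, "<start>...<end>", then the stream name.
_COPY_PATTERN = re.compile(r"^COPY\s+(\d+)\.\.\.(\d+)\s+(\S+)$")


def parse_copy_task(line):
    """Parse one edit-list line into a CopyTask, or raise ValueError."""
    match = _COPY_PATTERN.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized copy task: {line!r}")
    start, end = int(match.group(1)), int(match.group(2))
    if start > end:
        raise ValueError("start serial number exceeds end serial number")
    return CopyTask(start, end, match.group(3))


print(parse_copy_task("COPY 10...25 A.MPEG"))
```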

Further, the operation of the editing engine 11 in the preferred embodiment will be first described in succinct terms with reference to Figure 11. It is first noted that the system for editing audiovisual files in this embodiment is an open system, which receives the aforesaid input bit streams from the outside and which outputs edited output bit streams to the outside. Therefore, in step 21, the editing engine 11 receives the edit list 121 inputted by the operator, and stores the edit list 121 temporarily in the storage medium 12. Next, in step 22, the editing engine 11 generates a brand new synchronization error table 122 for use in subsequent processing. Thereafter, in step 23, the editing engine 11 executes each of the copy tasks in the edit list 121 in sequence so as to generate a corresponding copying operation 111, such as an object, to decide the number of audio frames to be copied, i.e., the number of audio frames having a tab-in and a tab-out, so as to generate an audiovisual segment, thereby forming an output bit stream. When the audiovisual segment is outputted, the first audio frame (i.e., the tab-in audio frame) in the audiovisual segment has to be aligned with the first video frame (i.e., a mark-in video frame) so that it is necessary to perform tab processing and update the synchronization error table 122. Finally, step 24 is performed to insert the synchronization error table 122 into the output bit stream. In this embodiment, the synchronization error table 122 is used to record stream errors of the audio frames that have undergone tab processing.

Therefore, the job executed by the copying operation 111 as generated by the editing engine 11 generally includes copying and tab processing. Hereinafter, the task flowchart of the copying operation in step 23 will be first described in detail with reference to Figure 12.

Initially, in step 31, the copying operation 111 selects a mark-in position and a mark-out position from the video bit stream 131 based on the start serial number and end serial number specified in the copy task so as to identify a mark-in video frame and a mark-out video frame to thereby decide the number of video frames to be copied (i.e., the video frames from the mark-in position to the mark-out position). For example, as shown in Figure 10, if the start serial number and the end serial number are specified as 10 and 25, respectively, the mark-in position is video frame 10, and the mark-out position is video frame 25. Hence, the video frame 10 is the mark-in video frame, and the video frame 25 is the mark-out video frame. The segment to be copied includes the frames from the video frame 10 to the video frame 25. Certainly, the audiovisual segment to be copied will also include the audio frames associated with the above-identified video frames. These audio frames lie below their associated video frames. To avoid accumulation of stream errors after editing, in this embodiment, after the video and audio frames are edited, the synchronization error table 122 will be used to record the stream errors, and the synchronization error table 122 will be inserted into the bit stream at a suitable position (as in step 24), e.g., the position of a system component in the bit stream or a header or the like in an audiovisual segment, such as PES_private_data in the PES header.

Therefore, in step 32, it is first determined whether a synchronization error table is present in the input bit stream. If it is determined in step 32 that there is a synchronization error table, this means that the input bit stream has been previously edited, such as an input bit stream A.MPEG, and there are stream errors between the audio bit stream 132 and the video bit stream 131. Therefore, step 33 is first executed. If it is determined in step 32 that there is not a synchronization error table, this means that the input bit stream has yet to be edited, such as an input bit stream B.MPEG, and step 36 is executed right away.

In step 33, the copying operation 111 first loads the synchronization error table present in the input bit stream. This synchronization error table is different from the synchronization error table 122 generated in step 22. The synchronization error table in this example records the numbers of the audio frames that were shifted in the previous editing, and the stream errors after the shifting thereof. In this example, in order to reduce the amount of audio frames recorded in the synchronization error table, only those audio frames that are tab audio frames are recorded therein. This is because, for an audio segment that is selected for editing, shifting of the entire audio segment starts from a tab audio frame thereof. Since the amount of shifting of each of the audio frames in the entire audio segment will be consistent, it will be sufficient to record only the tab audio frames.

Furthermore, if a synchronization error table is present, this means that the audiovisual bit stream 13 has been edited, and may be composed of a plurality of interconnected audiovisual segments such that the synchronization error table records several entries of audio frames and the corresponding stream errors thereof. To ensure accurate correction of the positional relationship between the video and audio frames in the input bit stream, in step 34, the video bit stream 131 is divided into a plurality of sub-audiovisual segments based on the serial numbers of the audio frames recorded in the synchronization error table. This means that the audiovisual bit stream 13 is split at the positions of the audio frames recorded in the synchronization error table. During splitting, the video frame located above the recorded audio frame is separated from an immediately preceding video frame so as to form a sub-audiovisual segment. If there are two video frames above the audio frame, these two video frames are separated. Besides, during splitting, the copying operation 111 selects an associated correction tab audio frame for and based on each of the two adjacent video frames that are to be split.

The split two adjacent video frames refer to the video frame that is located above the audio frame recorded in the synchronization error table and its immediately preceding video frame, or the two video frames that are located above the recorded audio frame. Similarly, since the length of the audio segment preferably exceeds or is equal to the length of the video segment, if the video frame is located at an earlier time, the end time of the selected audio frame is required to be the same as or later than the end time of the video frame that is at the earlier time. If the video frame is located at a later time, the start time of the selected audio frame must be the same as or earlier than the start time of the video frame that is at the later time. Certainly, in order to achieve this requirement, the audio frame located below the splitting position may be duplicated so that the two video frames are associated with the aforesaid audio frame.

Then, in step 35, the positional relationship between the video and audio frames in each sub-audiovisual segment is corrected based on the corresponding stream error of each recorded audio frame. Step 36 is executed after step 35 is ended.
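Steps 33 to 35 can be sketched roughly as follows. This is a simplified model rather than the embodiment itself: audio frame positions are expressed in units of one audio frame, the synchronization error table is modelled as a mapping from the serial number of a recorded tab audio frame to its stream error, and the duplication of the audio frame at a splitting position is ignored. The sign convention (a positive recorded error is undone by shifting left, a negative one by shifting right) follows the worked example in this description; the names `correct_audio_positions`, `positions` and `error_table` are assumptions.

```python
from bisect import bisect_right


def correct_audio_positions(positions, error_table):
    """Restore audio frames to their positions prior to the previous editing.

    positions:   {audio frame number: current start position, in audio frames}
    error_table: {number of a recorded tab audio frame: stream error of the
                  sub-segment that starts at that frame}
    """
    split_points = sorted(error_table)              # step 34: split at recorded frames
    corrected = {}
    for frame_no, position in positions.items():
        index = bisect_right(split_points, frame_no)
        if index == 0:
            # Frames before the first recorded tab have no error information.
            corrected[frame_no] = position
        else:
            error = error_table[split_points[index - 1]]
            # Step 35: shift the whole sub-segment back by its recorded error.
            corrected[frame_no] = round(position - error, 6)
    return corrected


# Hypothetical stream: frames 2-3 were shifted by +0.3, frames 4-5 by -0.2.
positions = {0: 0.0, 1: 1.0, 2: 2.3, 3: 3.3, 4: 3.8, 5: 4.8}
error_table = {2: 0.3, 4: -0.2}
print(correct_audio_positions(positions, error_table))
```

Because every audio frame in a sub-segment is shifted by the same amount, looking up the nearest preceding recorded tab is sufficient, which is why only tab audio frames need to be stored in the table.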

For example, if the synchronization error table loaded from the input bit stream in step 33 is such as that shown in Table 1 below, in step 34, as shown in Figure 13, since video frame 9 is above the audio frame 134 and since video frames 19, 20 are above audio frame 135, the copying operation 111 will execute splitting of the audiovisual bit stream 13 between video frames 8, 9 and between video frames 19, 20 to thereby form three sub-audiovisual segments 136, 137, 138. In order that the lengths of the audio segments in the sub-audiovisual segments are not smaller than those of the video segments, the audio frame at the start position of the sub-audiovisual segment 136, which serves as a correction tab audio frame, is the audio frame 139 that precedes the audio frame 134; the audio frame at the end position of the sub-audiovisual segment 136 is a duplicated audio frame 135; and the audio frame at the start position of the sub-audiovisual segment 138 is also the audio frame 135. Thus, in this example, the audio frame 135 is duplicated.

Further, in step 35, the positional relationship between the video and audio frames in each sub-audiovisual segment 137, 138 is corrected according to the corresponding stream error of each audio frame 134, 135. Specifically, since the first sub-audiovisual segment 136 does not have any corresponding error information, the relationship between the video and audio frames therein is maintained. Since the stream error of the first audio frame 134 in the audio bit stream in the second sub-audiovisual segment 137 is 0.3 audio frame, the entire audio bit stream in the sub-audiovisual segment 137 is shifted to the left by 0.3 audio frame to eliminate the stream error caused to the audiovisual segment 137 during the previous editing. As for the audio frame 135 in the third sub-audiovisual segment 138, since the stream error is -0.2 audio frame, the entire audio bit stream in the sub-audiovisual segment 138 is shifted to the right by 0.2 audio frame to correct the positional relationship between the video and audio frames. Thus, the correction of the positional relationship between the video frames and the audio frames using the stream error information in the synchronization error table in the input bit stream in steps 33-35 is to restore the audio frames to their positions prior to their being edited.

Then, in step 36, a tab-in audio frame and a tab-out audio frame are selected based on the mark-in video frame and the mark-out video frame so as to determine the size of the audio segment to be copied. Since the length of the audio segment to be copied preferably exceeds or is equal to the length of the copied video segment, as shown in Figure 14, the start time of the tab-in audio frame, which serves as the initial audio frame 143, is preferably equal to or earlier than the start time 141 of the mark-in video frame 10, and the start time of the tab-out audio frame, which serves as the end audio frame 144, is preferably equal to or earlier than the end time 142 of the mark-out video frame 25. In short, if an audio frame cannot be aligned with the start time 141 of the mark-in video frame 10 or the end time 142 of the mark-out video frame 25, the tab-in audio frame 143 has an earlier start time compared to the start time 141 of the mark-in video frame 10, and the tab-out audio frame 144 has an earlier start time compared to the end time 142 of the mark-out video frame 25.
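The selection in step 36 can be sketched as a search over the start times of the audio frames. In this sketch the tab-in audio frame is the last audio frame starting at or before the start time of the mark-in video frame, and the tab-out audio frame is taken to be the last audio frame starting strictly before the end time of the mark-out video frame. The strict comparison for the tab-out frame is an assumption made here so that the selected frame overlaps the end of the video; the function name `select_tabs` and the frame durations in the usage example (24 ms audio frames, 40 ms video frames) are likewise illustrative only.

```python
from bisect import bisect_left, bisect_right


def select_tabs(audio_starts, mark_in_start, mark_out_end):
    """Return (tab_in_index, tab_out_index) into the sorted list audio_starts.

    audio_starts:  start times of the audio frames, in ascending order
    mark_in_start: start time of the mark-in video frame
    mark_out_end:  end time of the mark-out video frame
    """
    # Last audio frame whose start time is equal to or earlier than mark_in_start.
    tab_in = bisect_right(audio_starts, mark_in_start) - 1
    # Last audio frame whose start time is earlier than mark_out_end (assumption).
    tab_out = bisect_left(audio_starts, mark_out_end) - 1
    if tab_in < 0 or tab_out < tab_in:
        raise ValueError("no audio frames cover the requested video range")
    return tab_in, tab_out


# Hypothetical timing in milliseconds: video frame 10 starts at 400 ms and
# video frame 25 ends at 1040 ms; audio frames start every 24 ms.
audio_starts = list(range(0, 2000, 24))
print(select_tabs(audio_starts, 400, 1040))  # (16, 43)
```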

Finally, in step 37, the selected audiovisual segment (i.e., the segment containing the selected video frames 10-25 and audio frames 143-144) is outputted after tab processing (to be described in detail hereinafter). After the copying operation 111 has executed step 37, the copying operation 111 will be terminated. Thus, by utilizing the synchronization error table present in the input bit stream to first correct the positional relationship between the video and audio frames, the stream error generated in the previous editing operation can be eliminated beforehand so as to achieve the effect of effectively avoiding accumulation of stream errors due to repeated editing.

It is noted that, although the determination of the presence of the synchronization error table and the corresponding position correcting process (as in steps 32-35) in this embodiment are performed after selecting the video frame segment to be copied (as in step 31), it should be apparent to those skilled in the art that the determination of the presence of the synchronization error table and the corresponding position correcting process can be executed before selecting the video frame segment to be copied. In addition, the operation of selecting the correction tab audio frame in step 34 may be executed together with the selection of audio frames in step 36, and should not be limited to that which is disclosed in this embodiment. Thereafter, the editing engine 11 may generate a new copying operation 111 according to a next copy task in the edit list 121 so as to execute the next copy task.

Furthermore, when the copying operation 111 executes the copy task, it will be terminated after outputting the copied audiovisual segment so that the editing engine 11 continues to generate copying operations 111 to execute the next copy task. If the audiovisual segment in the subsequent copy task needs to be joined to the audiovisual segment of the previous copy task, the stream error of the previous audiovisual segment will be considered during tab processing of the audiovisual segment generated by the subsequent copying operation. Therefore, in the following description, the tab processing mentioned in step 37 of the aforesaid copying operation will be described by way of an example with reference to Figures 15 and 16. It is supposed in the example of Figure 16 that there are four sub-audiovisual segments 15, 16, 17, and 18, of which the sub-audiovisual segments 15, 16 belong to the copied audiovisual segments of the first copying operation, and the sub-audiovisual segments 17, 18 belong to the copied audiovisual segments of the second copying operation. Each of the sub-audiovisual segments 15, 17 has a tab-in audio frame 151, 171 and a correction tab audio frame 152, 172 (hereinafter referred to as tabs) at the start and the end of the audio segment, respectively. The audio segment in each sub-audiovisual segment 16, 18 also has a correction tab audio frame 161, 181 and a tab-out audio frame 162, 182 (hereinafter referred to as tabs) at the start and the end thereof, respectively.

Initially, in step 41, the existing stream error of the current tab 151 is determined. As shown in Figure 15, the current tab 151 belongs to the first segment 15, and there is no previous tab to introduce an existing stream error. Therefore, the existing stream error is 0.

Then, in step 42, the tab error of the tab 151 is determined. In this example, the tab error of the tab 151 is assumed to be 0.3.

Next, in step 43, it is determined whether the sum of the existing stream error and the tab error (herein referred to as cumulative error) is greater than half a frame (i.e., 0.5). If so determined in step 43, step 45 is executed to drop the tab 151 processed in the previous step 42. Otherwise, step 44 is executed to retain the tab 151 processed in the previous step 42. According to this example, in step 43, since the cumulative error of the tab 151 is 0.3, which is smaller than 0.5, step 44 is executed to retain the tab 151. In addition, the cumulative error of the tab 151 is also referred to as a new stream error. After executing step 44 or 45, step 46 is executed.

In step 46, the new stream error of the tab 151 is recorded in the synchronization error table 122 generated in the aforesaid step 22 so as to update the synchronization error table 122. In this example, as shown in Table 2, the new stream error of the tab 151 is recorded in the synchronization error table 122. In addition, as the shifting (i.e., stream error) of the two tabs which are respectively at the start and end positions of the same audiovisual segment will be the same during joining, it is only necessary to record at least one of the two tabs in order to permit restoration of the positional relationship between the audio segment and the video segment. In this embodiment, the stream error of the tab 151 at the start position is selected for recording in the synchronization error table 122. Each entry of the recorded stream error is about 4-6 bytes.
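By way of illustration only, one possible way of holding an entry of the synchronization error table within the stated 4-6 bytes is sketched below. The layout is an assumption and is not taken from the embodiment: a 4-byte unsigned audio frame index followed by a 2-byte signed stream error expressed in thousandths of an audio frame. The names ENTRY_FORMAT, SCALE, pack_entry, and unpack_entry, as well as the frame index 42, are hypothetical.

```python
import struct

# Hypothetical entry layout (an assumption, not specified in the description):
#   4-byte unsigned audio frame index + 2-byte signed error in 1/1000 audio frame.
ENTRY_FORMAT = ">Ih"   # big-endian, no padding: 4 + 2 = 6 bytes
SCALE = 1000


def pack_entry(frame_index, stream_error):
    """Serialize one table entry into 6 bytes."""
    return struct.pack(ENTRY_FORMAT, frame_index, round(stream_error * SCALE))


def unpack_entry(data):
    """Recover (frame_index, stream_error) from a 6-byte entry."""
    frame_index, scaled = struct.unpack(ENTRY_FORMAT, data)
    return frame_index, scaled / SCALE


# Example: a start tab at the hypothetical audio frame index 42
# with the new stream error 0.3 from the example above.
entry = pack_entry(42, 0.3)
entry_size = len(entry)
restored_index, restored_error = unpack_entry(entry)
```

With this assumed layout, each entry occupies 6 bytes, which lies at the upper end of the range stated above. A narrower frame index field would yield a smaller entry.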

Afterwards, it is determined in step 47 whether there are other tabs. If it is determined that there are other tabs in step 47, the flow returns to step 41. Otherwise, the tab processing flow is ended. In this example, since only the tab 151 was processed using the aforesaid steps, it will be determined in step 47 that there are other tabs, and the flow will return to step 41.

It is determined in step 41 that the existing stream error of the tab 152 is 0.3, and it is determined in step 42 that the tab error of the tab 152 is 0.5. Then, in step 43, since the cumulative error of the tab 152 is 0.8, the flow will skip to step 45 to drop the tab 152. After the tab 152 is dropped, the new stream error will become -0.2 (i.e., 0.8 - 1 = -0.2).

Subsequently, in step 46, since the tab 152 is at the end position of the segment, the synchronization error table 122 is not updated, and step 47 is then executed to proceed to the next tab 161. Thereafter, steps 41-47 are repeated until all the tabs have been processed. In addition, when the tab 161 of the second segment 16 is being processed, the existing stream error of the tab 161 in step 41 is -0.2, and the tab error of the tab 161 is determined to be 0.3 in step 42. Since the cumulative error of the tab 161 is 0.1, the tab 161 will be retained, and the cumulative error 0.1 (i.e., the new stream error) of the tab 161 will be recorded in the synchronization error table 122 (as shown in Table 2) in step 46. Similarly, the stream errors of the tabs 171, 181, which are at the start positions of the third and fourth segments 17, 18, respectively, will be recorded in the synchronization error table 122.
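The tab processing flow of steps 41-47 described above may be sketched as follows. This is a minimal illustration under stated assumptions and not the actual implementation of the embodiment: stream errors are expressed in units of one audio frame, dropping a tab is assumed to reduce the stream error by exactly one audio frame (as in the computation 0.8 - 1 = -0.2 above), and the function name process_tabs and the tuple layout are hypothetical. Only the tabs 151, 152, and 161 are processed, since the description gives no tab error values for the remaining tabs.

```python
from decimal import Decimal

HALF_FRAME = Decimal("0.5")  # threshold used in step 43


def process_tabs(tabs, existing_error=Decimal("0")):
    """Process tabs in stream order.

    tabs: list of (tab_id, tab_error, is_start_tab) tuples (hypothetical layout).
    Returns (decisions, sync_error_table, final_stream_error).
    """
    decisions = {}
    sync_error_table = {}
    stream_error = existing_error            # step 41: existing stream error
    for tab_id, tab_error, is_start_tab in tabs:
        cumulative = stream_error + tab_error    # steps 42-43: cumulative error
        if cumulative > HALF_FRAME:
            decisions[tab_id] = "drop"           # step 45
            stream_error = cumulative - 1        # one whole audio frame removed
        else:
            decisions[tab_id] = "retain"         # step 44
            stream_error = cumulative
        if is_start_tab:                         # step 46: record start tabs only
            sync_error_table[tab_id] = stream_error
    return decisions, sync_error_table, stream_error   # step 47: no more tabs


# Worked example from the description: tabs 151, 152 and 161.
example_tabs = [
    (151, Decimal("0.3"), True),    # tab-in of segment 15 (start position)
    (152, Decimal("0.5"), False),   # correction tab at the end of segment 15
    (161, Decimal("0.3"), True),    # correction tab at the start of segment 16
]
decisions, sync_error_table, final_error = process_tabs(example_tabs)
print(decisions)  # {151: 'retain', 152: 'drop', 161: 'retain'}
```

In this sketch, the table receives the entries 0.3 for the tab 151 and 0.1 for the tab 161, which agrees with the values stated above. Decimal arithmetic is used merely so that the fractional values of the example are reproduced exactly.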

Finally, after tab processing is completed, all the sub-audiovisual segments 15, 16, 17, and 18 can be joined together. Thus, during tab processing, the synchronization error table 122 can be concurrently updated to record the amount of shifting (i.e., stream error) of the audio frames in each of the sub-audiovisual segments 15, 16, 17, and 18. When the editing engine 11 subsequently outputs the processed bit stream (e.g., the joined sub-audiovisual segments 15, 16, 17, and 18), the editing engine 11 will insert the synchronization error table 122 into the output bit stream (as in step 24 of Figure 11) so that, when the bit stream is re-edited, the stream error information in the synchronization error table 122 can be used to first correct the positional relationship between the audio and video frames before proceeding with the editing job, thereby effectively avoiding accumulation and propagation of stream errors.

It is noted that the copying operation 111 outputs the output bit stream after it finishes processing the audiovisual segment generated in response to the copy task. In the aforesaid example of Figure 16, the sub-audiovisual segments 15, 16 are outputted to the output bit stream after tab processing, and the sub-audiovisual segments 17, 18 are outputted to the output bit stream after tab processing in the next copying operation.

How this embodiment can avoid stream error accumulation will be described by way of an example hereinbelow. First, the original audiovisual segment 19 of Figure 17 is joined to another audiovisual segment (X). To avoid growth of cumulative stream errors, the audio frame (a) is dropped, and the currently existing stream error is assumed to be -0.5 audio frame. Therefore, as in Figure 18, the audiovisual segment 19, which has been tab processed, is joined to the audiovisual segment (X), and all the audio frames are shifted to the left by 0.5 audio frame beginning from the audio frame (b) so as to form a new audiovisual segment 19'. At the same time, the amount of shifting (i.e., stream error) of the audio frame (b), i.e., -0.5, is recorded in the synchronization error table 122. Moreover, before outputting the bit stream including the audiovisual segment 19', the editing engine 11 will insert the synchronization error table 122 into the bit stream.

Next, a second editing operation is commenced to cut an audiovisual segment beginning from the video frame 2 from the audiovisual segment 19' for joining to another audiovisual segment (Y). When determining the segment of the audiovisual segment 19' to be copied, as in Figure 19, the synchronization error table present in the bit stream will be loaded before the selection of the audiovisual segment to be cut so that the stream error information can be used to correct the positional relationship between the audio and video frames. Thereafter, when the cut audiovisual segment undergoes tab processing, the audio frame (c) will be dropped, and the currently existing stream error is set to 0.4 audio frame. As in Figure 20, when the cut audiovisual segment is joined to the audiovisual segment (Y), all the audio frames will be shifted to the right by 0.4 audio frame beginning from the audio frame (d) so as to form a new audiovisual segment 19", and the stream error of the audio frame (d) will be recorded in the synchronization error table 122. Moreover, the synchronization error table 122 will be inserted into the bit stream including the audiovisual segment 19" for output together with the audiovisual segment 19". Herein, as opposed to the accumulation of stream errors in the prior art, due to the retention of the stream error information in the bit stream and the position correcting operation in this embodiment, the stream error after two editing operations is merely the stream error of the second editing operation and is not the cumulative stream error of the two editing operations.

Similarly, if editing is performed for the third time, an audiovisual segment beginning from the video frame 3 is cut from the audiovisual segment 19" for joining to another audiovisual segment (Z). Likewise, when an audiovisual segment is cut from the audiovisual segment 19", as in Figure 21, the positional relationship between the audio and video frames will be first corrected using the synchronization error table. Then, in the subsequent tab processing operation, the audio frame (f) will be retained, and the stream error will be assumed to be 0.4. As in Figure 22, when the audiovisual segment which has undergone tab processing is joined to the audiovisual segment (Z), all the audio frames will be shifted to the right by 0.4 audio frame beginning from the audio frame (f) so as to form a new audiovisual segment 19'", and the stream error of the audio frame (f) is recorded in the synchronization error table 122. When the new audiovisual segment 19'" is outputted, the synchronization error table 122 is outputted therewith. Therefore, the stream error of the audiovisual segment 19'", which has been edited three times, is merely 0.4, and will not accumulate.
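The position correction performed before each re-editing operation in the examples above may be sketched as follows. This is an illustration only and rests on several assumptions that are not stated in the description: audio frame positions are expressed in units of one audio frame, a negative stream error denotes a shift to the left and a positive stream error a shift to the right, each recorded stream error applies from its audio frame up to the next recorded audio frame, and correction consists of subtracting the recorded stream error from each affected position. The function name correct_positions and the index-keyed table are hypothetical.

```python
from decimal import Decimal


def correct_positions(positions, sync_error_table):
    """Undo the recorded shifts of the audio frames.

    positions: audio frame positions in stream order, in audio-frame units.
    sync_error_table: {index of first audio frame of a sub-segment: stream error}.
    """
    corrected = []
    current_error = Decimal("0")   # no shift before the first recorded entry
    for index, position in enumerate(positions):
        if index in sync_error_table:
            # A new sub-audiovisual segment begins at this audio frame.
            current_error = sync_error_table[index]
        corrected.append(position - current_error)
    return corrected


# First edit of the example: frames from (b) onward were shifted left by 0.5,
# and -0.5 was recorded for the audio frame (b), here taken as index 0.
shifted = [Decimal("0.5"), Decimal("1.5"), Decimal("2.5")]
table_after_first_edit = {0: Decimal("-0.5")}
restored = correct_positions(shifted, table_after_first_edit)

# Two sub-segments with different recorded errors (-0.5, then 0.4).
mixed = [Decimal("0.5"), Decimal("1.5"), Decimal("3.4"), Decimal("4.4")]
restored_mixed = correct_positions(mixed, {0: Decimal("-0.5"), 2: Decimal("0.4")})
```

Because the recorded shift is removed before the next segment is selected, the stream error introduced by the next tab processing operation starts from the corrected positions rather than being added to the earlier shift.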

It is noted that although the system for editing audiovisual files in the aforesaid embodiment is an open system so that the synchronization error table is located in the bit stream to facilitate communication with the outside, it should be apparent to those skilled in the art that the synchronization error table 122 can be stored in the storage medium 12 if the system for editing audiovisual files is a closed system, and that it is only necessary to load the synchronization error table 122 into the storage medium 12 in order to correct the positional relationship between the video and audio frames. The synchronization error table 122 in the storage medium 12 can be updated directly during tab processing, which differs from the aforesaid embodiment. In a closed system for editing audiovisual files, it is not necessary to insert the synchronization error table 122 into the bit stream.

In sum, the present invention utilizes a synchronization error table 122 to record stream error information so that, during repeated editing, the stream error information can be used to correct the positional relationship between the audio and video frames before proceeding with editing. Moreover, the new stream error information will be recorded in the newly generated synchronization error table after editing, and the synchronization error table recording the new stream error information is inserted into the edited output bit stream or stored in the storage medium 12, thereby achieving the effect of effectively avoiding accumulation and propagation of stream errors due to repeated editing.

While the present invention has been described in connection with what is considered the most practical and preferred embodiment, it is understood that this invention is not limited to the disclosed embodiment but is intended to cover various arrangements included within the spirit and scope of the broadest interpretation so as to encompass all such modifications and equivalent arrangements.

Claims

1. A method for editing audiovisual files, which is adapted to process an input bit stream according to a copy task, the input bit stream having a plurality of video frames and a plurality of audio frames, said method comprising the following steps:

C) identifying a tab-in audio frame associated with the mark-in video frame and a tab-out audio frame associated with the mark-out video frame from the audio frames; and

D) copying an audiovisual segment from the input bit stream, the audiovisual segment including a plurality of video frames from the mark-in video frame to the mark-out video frame, and a plurality of audio frames from the tab-in audio frame to the tab-out audio frame.

2. The method as claimed in Claim 1, wherein, in step B), the synchronization error table contains the stream error of at least one of the audio frames.

3. The method as claimed in Claim 1 or 2, wherein, in step B), the audio frames are shifted based on the stream error in the synchronization error table so as to correct the positional relationship between the video frames and the audio frames.

4. The method as claimed in Claim 2, wherein, in step B), if the synchronization error table is present, the input bit stream is split into a plurality of sub-audiovisual segments at a position corresponding to that of said at least one of the audio frames recorded in the synchronization error table, and relationship between the video and audio frames in the corresponding sub-audiovisual segment is corrected according to the stream error of said at least one of the audio frames.

5. The method as claimed in Claim 4, wherein, in step B), the input bit stream is split into the sub-audiovisual segments such that the audio segment in each of the sub-audiovisual segments has a length not smaller than that of the video segment.

6. The method as claimed in Claim 1, further comprising a step E) of performing tab processing and recording the stream error of the audiovisual segment in the synchronization error table.

7. The method as claimed in Claim 1, wherein, in step B), if the synchronization error table is present, the synchronization error table is located in the input bit stream.

8. The method as claimed in Claim 6, wherein, in step E), the synchronization error table is further inserted into an output bit stream including the audiovisual segment that has undergone tab processing before outputting the output bit stream.

9. A system for editing audiovisual files, which is adapted to process an input bit stream according to a copy task, the input bit stream having a plurality of video frames and a plurality of audio frames, said system comprising: an editing engine for editing the input bit stream according to the copy task; and a storage medium provided for access of data by said editing engine; wherein, according to the copy task, said editing engine corrects positional relationship between the video frames and the audio frames based on a synchronization error table upon determining presence of the synchronization error table that corresponds to the input bit stream, identifies a mark-in video frame and a mark-out video frame from the video frames, and identifies a tab-in audio frame associated with the mark-in video frame and a tab-out audio frame associated with the mark-out video frame from the audio frames before copying an audiovisual segment, the audiovisual segment including a plurality of video frames from the mark-in video frame to the mark-out video frame and a plurality of audio frames from the tab-in audio frame to the tab-out audio frame.

10. The system as claimed in Claim 9, wherein the synchronization error table records a stream error of at least one of the audio frames.

11. The system as claimed in Claim 9 or 10, wherein said editing engine shifts the audio frames according to the stream error in the synchronization error table so as to correct the positional relationship between the video frames and the audio frames.

12. The system as claimed in Claim 10, wherein, if the synchronization error table is present, said editing engine splits the input bit stream into a plurality of sub-audiovisual segments at a position corresponding to that of said at least one of the audio frames recorded in the synchronization error table, said editing engine further correcting the relationship between the video and audio frames in a corresponding sub-audiovisual segment according to the stream error of said at least one of the audio frames.

13. The system as claimed in Claim 12, wherein, when the input bit stream is being split into the sub-audiovisual segments, said editing engine ensures that the length of the audio segment of each of the sub-audiovisual segments is not smaller than that of the video segment.

14. The system as claimed in Claim 9, wherein said editing engine further performs tab processing and records a stream error of the audiovisual segment in the synchronization error table.

15. The system as claimed in Claim 14, wherein, if said system is an open system, the synchronization error table is located in the input bit stream.

16. The system as claimed in Claim 9, wherein, if said system is a closed system, the synchronization error table is located in said storage medium.

17. The system as claimed in Claim 15, wherein said editing engine further inserts the synchronization error table into an output bit stream which includes the audiovisual segment that has undergone tab processing before outputting the output bit stream.

18. The system as claimed in Claim 9, wherein said editing engine generates a copying operation corresponding to the copy task to execute copying of the audiovisual segment.

19. A computer program product for causing an electronic device to execute steps of editing audiovisual files, the electronic device being used to process an input bit stream according to a copy task, the input bit stream having a plurality of video frames and a plurality of audio frames, the steps of editing the audiovisual files comprising: