CN110290425A

CN110290425A - A kind of method for processing video frequency, device and storage medium

Info

Publication number: CN110290425A
Application number: CN201910691577.4A
Authority: CN
Inventors: 段聪; 吴江红
Original assignee: Tencent Technology Shenzhen Co Ltd
Current assignee: Tencent Technology Shenzhen Co Ltd
Priority date: 2019-07-29
Filing date: 2019-07-29
Publication date: 2019-09-27
Anticipated expiration: 2039-07-29
Also published as: CN110290425B

Abstract

The present invention provides a kind of method for processing video frequency, device and storage mediums；Method includes: acquisition target video；In response to the cutting operation for target object in the target video, obtain from the target video using the target object as the foreground video of prospect；The foreground video includes at least one foreground video frame；Obtain background video；The background video includes at least one background video frame；In response to being directed to the synthetic operation of the foreground video and the background video, the foreground video frame in the foreground video is overlapped with the background video frame in the background video, and the video frame that superposition obtains is encapsulated as synthetic video.By means of the invention it is possible to carry out the synthesis of dynamic video.

Description

A kind of method for processing video frequency, device and storage medium

Technical field

This application involves multimedia technology more particularly to a kind of method for processing video frequency, device and storage medium.

Background technique

With the continuous development of communication and mobile Internet, the epoch based on text and picture have become past, net Network live streaming and short video traffic start to be skyrocketed through, the appearance of various video class application programs, significantly reduce people's production The threshold of video, more and more users begin participating in video creation.

But video production scheme in the related technology, object that only can be static are synthesized in template video, can not carry out The synthesis of dynamic video.

Summary of the invention

The embodiment of the present invention provides a kind of method for processing video frequency, device and storage medium, is able to carry out the conjunction of dynamic video At.

The technical solution of the embodiment of the present invention is achieved in that

The embodiment of the present invention provides a kind of method for processing video frequency, comprising:

Obtain target video；

In response to the cutting operation for target object in the target video, obtain from the target video with described Target object is the foreground video of prospect；The foreground video includes at least one foreground video frame；

Obtain background video；The background video includes at least one background video frame；

In response to being directed to the synthetic operation of the foreground video and the background video, by the prospect in the foreground video Video frame is overlapped with the background video frame in the background video, and

The video frame that superposition obtains is encapsulated as synthetic video.

The embodiment of the present invention provides a kind of video process apparatus, comprising:

First acquisition unit, for obtaining target video；

Cutting unit, for being regarded from the target in response to the cutting operation for target object in the target video It obtains in frequency using the target object as the foreground video of prospect；The foreground video includes at least one foreground video frame；

Second acquisition unit, for obtaining background video；The background video includes at least one background video frame；

Synthesis unit, for the synthetic operation in response to being directed to the foreground video and the background video, before described Foreground video frame in scape video and the background video frame in the background video into being superimposed, and

The video frame that superposition obtains is encapsulated as synthetic video.

In the above scheme, the cutting unit, is also used to:

The batch splitting received at least two target videos operates；

It operates in response to the batch splitting, obtains from each target video using the target object as prospect Video clip, and it is determined as corresponding foreground video.

In the above scheme, the synthesis unit, is also used to:

Receive the batch synthetic operation for being directed at least two foreground videos and the background video；

In response to the batch synthetic operation, the foreground video frame at least two foreground video is added to respectively In background video frame in the background video.

In the above scheme, the second acquisition unit, is also used to:

Loaded and displayed has the video selection window of alternative background video；

The video selection received for the video selection window operates；

Obtain the selected background video of the video selection operation.

In the above scheme, described device further include: preview unit is used for:

In response to being directed to the preview operation of the foreground video and the background video, the foreground video frame and institute is presented State the Overlay of background video frame.

In the above scheme, the cutting unit, is also used to:

Target area where identifying the target object in the video frame of the target video, and by the video frame Described in areas transparentization processing except target area；

Video frame after transparency process is encapsulated as the foreground video.

In the above scheme, the cutting unit, is also used to:

Identify the target area where target object described in the video frame of the target video, and according to the target area Domain obtains the corresponding image array of video frame of the target video；Element in described image matrix characterizes corresponding view respectively The pixel of frequency frame belongs to the probability of the target area；

Described image matrix is subjected to mask process with corresponding video frame, the target area will be removed in the video frame Areas transparent except domain.

In the above scheme, the synthesis unit, is also used to:

Obtain the timestamp alignment relation of the foreground video frame Yu background video frame；

The timestamp alignment relation pair will be met in foreground video frame and the background video in the foreground video The background video frame answered is overlapped.

In the above scheme, the synthesis unit, is also used to:

In response to the edit operation for the foreground video and background video setting synthetic parameters, by the prospect Video frame covers the background video frame, and overlay area of the foreground video frame in the background video frame meets setting Synthetic parameters.

In the above scheme, the synthesis unit, is also used to:

Construct initial matrix identical with the foreground video frame sign；

The element in the initial matrix is adjusted according to the edit operation, obtains the synthetic parameters of characterization setting Variable quantity objective matrix.

In the above scheme, the synthesis unit, is also used to:

The objective matrix is multiplied with the foreground video frame in the foreground video, the foreground video after being adjusted Frame；

The foreground video frame adjusted is covered into the background video frame.

Memory, for storing executable instruction；

Processor when for executing the executable instruction stored in the memory, is realized provided in an embodiment of the present invention Method for processing video frequency.

The embodiment of the present invention provides a kind of storage medium, is stored with executable instruction, real when for causing processor to execute Existing method for processing video frequency provided in an embodiment of the present invention.

The prospect that will be split as foreground video using target object in target video, and will be divided from target video The video frame that the foreground video frame of video is synthesized with the background video frame of background video is encapsulated as synthetic video, to be based on video Content, new view is synthesized as background using the target object in target video as prospect and using the video frame of background video Frequently, the dynamic video of image content coordination is obtained.

Detailed description of the invention

Fig. 1 is an optional structural schematic diagram of processing system for video framework provided in an embodiment of the present invention；

Fig. 2 is an optional structural schematic diagram of video process apparatus provided in an embodiment of the present invention；

Fig. 3 is an optional flow diagram of method for processing video frequency provided in an embodiment of the present invention；

Fig. 4 is that the embodiment of the present invention provides an optional display interface schematic diagram；

Fig. 5 A is that the embodiment of the present invention provides an optional Overlay schematic diagram；

Fig. 5 B is that the embodiment of the present invention provides an optional Overlay schematic diagram；

Fig. 6 is an optional flow diagram of method for processing video frequency provided in an embodiment of the present invention；

Fig. 7 is that the embodiment of the present invention provides an optional training sample schematic diagram；

Fig. 8 is that the embodiment of the present invention provides an optional editing interface schematic diagram；

Fig. 9 is that the embodiment of the present invention provides an optional editing interface schematic diagram；

Figure 10 is an optional encoding and decoding configuration diagram of video encoder provided in an embodiment of the present invention；

Figure 11 is an optional flow diagram of method for processing video frequency in the related technology；

Figure 12 is the synthetic effect schematic diagram of method for processing video frequency in the related technology；

Figure 13 is an optional flow diagram of method for processing video frequency in the related technology；

Figure 14 is an optional flow diagram of method for processing video frequency provided in an embodiment of the present invention；

Figure 15 is an optional flow diagram of method for processing video frequency provided in an embodiment of the present invention；

Figure 16 is one provided in an embodiment of the present invention optional display interface schematic diagram.

Specific embodiment

To make the objectives, technical solutions, and advantages of the present invention clearer, below in conjunction with attached drawing to the present invention make into It is described in detail to one step, described embodiment is not construed as limitation of the present invention, and those of ordinary skill in the art are not having All other embodiment obtained under the premise of creative work is made, shall fall within the protection scope of the present invention.

In the following description, it is related to " some embodiments ", which depict the subsets of all possible embodiments, but can To understand, " some embodiments " can be the same subsets or different subsets of all possible embodiments, and can not conflict In the case where be combined with each other.

Unless otherwise defined, all technical and scientific terms used herein and belong to technical field of the invention The normally understood meaning of technical staff is identical.Term used herein is intended merely to the purpose of the description embodiment of the present invention, It is not intended to limit the present invention.

Before the embodiment of the present invention is further elaborated, to noun involved in the embodiment of the present invention and term It is illustrated, noun involved in the embodiment of the present invention and term are suitable for following explanation.

1) background, the scenery of main body behind, can show space-time environment locating for personage or event, example in the picture of video The building, wall at such as personage rear, ground.

2) prospect is the main body that video shows compared with background closer to the content of camera lens in video pictures, such as stands and building People before building object.

3) target video, carry out Video Composition when for extracting the video of prospect.

4) background video, carry out Video Composition when for extracting the video of background.

5) it is superimposed, using the partial region in one (or multiple images) as prospect, and using another image as background It is synthesized, obtains new image.Such as: by image A some region and image B synthesize, obtain image C.This In, image can be the video frame in video.

6) exposure mask (mask) is the image for being shielded to (some or all of pixel) in image to be processed Matrix, so that the part in specific image highlights.Exposure mask can be two-dimensional matrix array, also use multivalue matrix function sometimes According to.

7) mask process shields the processing of (such as transparence) based on exposure mask to some regions in image.In image Binary number (also referred to as mask) progress and operation of same position in each pixel and exposure mask, such as 1&1=1；1&0=0.

8) it encapsulates, multiple video frames is converted to by video file based on certain frame per second and video format.Wherein, frame per second table Show frame number per second, such as: 25 frames/per second (fps), 60fps etc..Video format can include: Matroska multimedia container (Matroska Multimedia Container, MLV), Audio Video Interleaved format (Audio Video Interleaved, AVI), the video file formats such as dynamic image expert group (Moving Picture Experts Group, MPEG) -4.

Analytic explanation is carried out to the technical solution for being directed to Video Composition in the related technology first.

Technical solution 1) still image and dynamic video synthesis

To still image carry out AI segmentation, be partitioned into the corresponding region of target object, using the region divided as with work It is merged for the background video of background, the video after being synthesized.Here, the object of fusion is static image and dynamic Video, the target object in video after synthesis are static, that is to say, that in video in post synthesis, in each video frame Target object be static.

Technical solution 2) video pictures whole splicing

The video frame or so of two videos is stitched together, synthesize a bigger video, for the background to video into Row processing, the picture of the video of synthesis are not merged in terms of content.

Above-mentioned several technical solutions there are aiming at the problem that, the embodiment of the present invention provides a kind of method for processing video frequency, by mesh It is split from target video in mark video using target object as foreground video, and the prospect for the foreground video divided is regarded The video frame that frequency frame is synthesized with the background video frame of background video is encapsulated as synthetic video, thus based on the content of video to dynamic Video synthesized, obtain image content coordination dynamic video.

Illustrate the exemplary application for realizing the video process apparatus of the embodiment of the present invention below, it is provided in an embodiment of the present invention Video process apparatus is desirably integrated into various forms of electronic equipments, and the electronic equipment that the present invention implements to provide may be embodied as Various terminals, such as mobile phone (mobile phone), tablet computer, laptop etc. have the mobile terminal of wireless communication ability, In another example desktop computer, desktop computer etc..In addition, electronic equipment also may be embodied as a server or by multiple servers The server cluster of composition, is not limited herein.

It is an optional configuration diagram of processing system for video 100 provided in an embodiment of the present invention referring to Fig. 1, Fig. 1, Both terminal 400 connects server 200 by network 300, and network 300 can be wide area network or local area network, or be Combination realizes that data are transmitted using Radio Link.Operation has video processing applications program in terminal 400, and video processing is answered It is provided with interface 410 with program, to receive the relevant operation of the synthetic video of user.

For being provided with video process apparatus provided in an embodiment of the present invention in server 200, exemplary answered at one In, when terminal 400 needs synthetic video, target video and background video can be the video recorded using terminal, at this point, Target video and background video can be sent to server by terminal 400, and request server 200 carries out Video Composition.At this point, clothes Device 200 be engaged in after receiving target video and background video, using method for processing video frequency provided in an embodiment of the present invention, by target Target object in video is split, and is prospect and using background video as background using the foreground video split, by prospect Foreground video frame in video and the background video frame in background video are overlapped, and superimposed video frame is sealed Dress, obtains synthetic video, the synthetic video after encapsulation is finally sent to terminal 400 again.

Such as: as shown in Figure 1, target video is video 101, background video is video 102, and terminal 400 is by 101 He of video Video 102 is sent to server 200, and server 200 extracts the foreground video 104 with portrait 103 for prospect from video 101, And by the foreground video frame 1041 of foreground video 104 (the background video frame including 1041-1 to 1041-n) and background video 102 1021 (including 1021-1 to 1021-n) is overlapped respectively, and obtaining the video frame 1051 of synthetic video 105, (including 1051-1 is extremely 1051-n), wherein n is the integer greater than 1.

To be provided with another exemplary application of video process apparatus provided in an embodiment of the present invention in server 200 In, when terminal 400 needs synthetic video, the identification information of target video and background video can be sent to server 200.Clothes Business device 200 is based on the received identification information of institute and determines corresponding target video and background video, is mentioned using the embodiment of the present invention The method for processing video frequency of confession splits the target object in target video, using the foreground video that splits as prospect simultaneously Using background video as background, the foreground video frame in foreground video and the background video frame in background video are overlapped, and Superimposed video frame is packaged, synthetic video is obtained, the video after encapsulation is finally sent to terminal 400 again.Terminal 400 can go out the video distribution after synthesis.

In an example using terminal 400 as electronic equipment, target video and background video be can be in terminal 400 The video file of encapsulation, by terminal 400 itself using the method provided in an embodiment of the present invention in video processing, by foreground video In foreground video frame and background video in background video frame be overlapped, and superimposed video frame is packaged, is obtained Video file after to synthesis.

It is carried out for being provided with video process apparatus provided in an embodiment of the present invention in server and terminal respectively above Explanation, it is possible to understand that ground, video process apparatus provided in an embodiment of the present invention, which can be distributed, to be arranged in terminal and server, from And method for processing video frequency provided in an embodiment of the present invention is completed by terminal and server collaboration.

It should be noted that in embodiments of the present invention, the type of target video and background video can be identical, it can also not Together.Such as: target video and background video are encapsulated video file.For another example: target video is video flowing, background Video is encapsulated video file.

Video process apparatus provided in an embodiment of the present invention may be embodied as the side of hardware, software or software and hardware combining Formula.

As the example of software implementation, video process apparatus may include one or more software module, for independent Or cooperative achievement method for processing video frequency provided in an embodiment of the present invention, software module can be using the various of various front ends or rear end Programming language.

As the example that hardware is implemented, video process apparatus may include one or more hardware modules, and hardware module can With using (ASIC, Application Specific Integrated Circuit), Complex Programmable Logic Devices (CPLD, Complex Programmable Logic Device), field programmable gate array (F PGA, Field-Programmable Gate Array) etc. hardware decoders, be programmed to realize video processing side provided in an embodiment of the present invention alone or synergistically Method.

Below again by taking software and hardware combining as an example, illustrate the exemplary reality of video process apparatus provided in an embodiment of the present invention It applies.

Referring to fig. 2, Fig. 2 is 20 1 optional structural schematic diagrams of video process apparatus provided in an embodiment of the present invention, root According to Fig. 2 shows video process apparatus 20 structure, it is anticipated that other exemplary structures of video process apparatus 20, therefore Structure as described herein is not construed as limiting, such as can be omitted members described below, alternatively, adding hereafter institute The component that do not record is to adapt to the specific demands of certain applications.

Video process apparatus 20 shown in Fig. 2 includes: at least one processor 210, memory 240, at least one network Interface 220 and user interface 230.Various components in video process apparatus 20 are coupled by bus system 250.It can manage Solution, bus system 250 is for realizing the connection communication between these components.Bus system 250 is in addition to including data/address bus, also Including power bus, control bus and status signal bus in addition.But for the sake of clear explanation, in Fig. 2 all by various buses It is designated as bus system 250.

Memory 240 can be volatile memory or nonvolatile memory, may also comprise volatile and non-volatile Both memories.Wherein, nonvolatile memory can be read-only memory (ROM, Read Only Memory), and volatibility is deposited Reservoir can be random access memory (RAM, Random Access Memory).The memory of description of the embodiment of the present invention 240 are intended to include the memory of any suitable type.

Memory 240 in the embodiment of the present invention can storing data to support the operation of server 200.These data Example includes: any computer program for operating on video process apparatus 20, such as operating system and application program.Its In, operating system includes various system programs, such as ccf layer, core library layer, driving layer etc., for realizing various basic businesses And the hardware based task of processing.Application program may include various application programs.

As the example that method for processing video frequency provided in an embodiment of the present invention uses software and hardware combining to implement, the present invention is implemented Method provided by example can be embodied directly in be combined by the software module that processor 210 executes, and software module, which can be located at, deposits In storage media, storage medium is located at memory 240, and processor 210 reads the executable finger that software module includes in memory 240 It enables, it is real to complete the present invention in conjunction with necessary hardware (e.g., including processor 210 and the other assemblies for being connected to bus 250) The method for processing video frequency of example offer is provided.

The exemplary application and implementation of the video process apparatus above-mentioned for realizing the embodiment of the present invention will be combined, illustrates to realize The method for processing video frequency of the embodiment of the present invention.It is to be appreciated that method for processing video frequency shown in Fig. 3 can be held by various electronic equipments Row, such as executed by terminal or server, or, it is performed in unison with by terminal and server.

It is an optional flow diagram of method for processing video frequency provided in an embodiment of the present invention referring to Fig. 3, Fig. 3, it will The step of showing in conjunction with Fig. 3 is illustrated.

Step S301 obtains target video.

Target video can be encapsulated video file.Target video can also be video flowing, such as: the stream of live video Media data.

The quantity of target video can be one or more.

Operation has video processing applications program in terminal, and target video selection window is provided in video processing applications program Mouthful, the identification information of alternative target video is provided in target video selection window, such as: video thumbnails, video name Claim etc..Terminal receives the selection operation of user, and the corresponding video of identification information that selection operation is selected is as target video.Its In, when electronic equipment is server, the video processing applications program run in terminal is the client of server.

For the target video of user's selection, target video can be presented at the terminal, so that user is to selected mesh It marks video and carries out preview, whether be required target video with the selected target video of determination, if not needed for user When the target video wanted, window can be selected to carry out reselecting for target video based on target video.

Illustratively, target video selection window can icon 401, icon as shown in window 401 in Fig. 4, in window 401 402, icon 403 is respectively the icon of alternative target video, when selection operation selectes icon 402, then the corresponding view of icon 402 Frequency is target video.In window 401 further include: the more multi-option 404 of the more alternative target videos of triggering, when more multi-option 404 when receiving the touch control operation of user, and the identification information of more alternative target videos can be presented.When acquisition target video Afterwards, window 405 can be regard as the preview window, the picture of target video is presented in window 405.

Step S302, in response to the cutting operation for target object in the target video, from the target video It obtains using the target object as the foreground video of prospect.

The segmentation entrance for receiving the cutting operation for being directed to target object can be loaded in terminal.Terminal can be based on receiving Cutting operation generate instruction from target video obtain using the target object as the split order of the foreground video of prospect.

In one example, electronic equipment be terminal when, terminal be based on split order local from target video obtain before Scape video.

In another example, when electronic equipment is server, split order is sent to server by terminal, and server is based on The split order received obtains foreground video from target video.

Electronic equipment is based on split order, can call the interface of video encoder, be regarded target based on the interface called Target video frame is decomposed into video frame by video encoder by frequency input video encoder.Electronic equipment to each video frame into Row image recognition identifies target object from each video frame, obtains constituting foreground video based on the region where target object Foreground video frame.Target object can be the object in the prospect in the video frames of target videos such as people, animal.Here, by structure It is known as foreground video frame at the video frame of foreground video, the foreground video includes at least one foreground video frame.

It identifies from target video and is regarded by the video of prospect of target object, can be known at least one of in the following manner Target object in the video frame of other target video:

Identification method one, calibration mode

Receive the proving operation for the video frame that user is directed in target video, pair that the proving operation of user is calibrated As being determined as target object.Wherein, the object that the proving operation of user is calibrated can be specific object, for example, multiple A people in people, is also possible to a class object, for example, male, women.

Identification method two, image recognition model automatic identification

Target object is used as by the prospect (such as people, animal) of image recognition model automatic identification video frame.

Step S303 obtains background video.

Background video can be encapsulated video file.Background video can also be video flowing.

The video processing applications program run in terminal can provide video selection window to receive user and select background view The video selection of frequency operates, and the identification information of determining background video is operated based on the video selection of user.

Here, the video frame in background video is known as background video frame, the background video includes at least one background Video frame.

It should be noted that in embodiments of the present invention, step S301, the execution sequence of step S302 and step S303 is not Successively, step S301, step S302 can be first carried out, step S303 can also be first carried out.

Step S304 regards the prospect in response to being directed to the synthetic operation of the foreground video and the background video Foreground video frame in frequency is overlapped with the background video frame in the background video, and the video frame that superposition is obtained encapsulates For synthetic video.

The video processing applications program run in terminal can provide the interactive entrance of triggering Video Composition, to receive instruction The synthetic operation that foreground video and background video are synthesized, and synthetic instruction is generated based on synthetic operation.

When electronic equipment is server, synthetic instruction is sent to server by terminal, server be based on synthetic instruction into The superposition of foreground video frame in row foreground video and the background video frame in background video, to realize foreground video and background view The synthesis of frequency.

As shown in Figure 5A, such as: when prospect video bag includes: video A', when background video is video D, by the view of video A' Frequency frame is overlapped with the background video frame in background video, and Overlay can be as shown in Figure 5A, wherein background area 501 is The picture of video D, object 502 are the corresponding region object a in the prospect in video A.

Electronic equipment carries out the background video frame of the foreground video frame of foreground video and background video according to synthetic parameters Superposition.Here, can using relative position of the target object in target video and/or opposite imaging size as synthetic parameters, The edit operation that user can be received based on edit page, so that user is adjusted synthetic parameters.

It should be noted that the behaviour of the user's operations such as selection operation, cutting operation, synthetic operation in the embodiment of the present invention The mode of work can are as follows: touch-control, voice, gesture etc., the embodiment of the present invention is to the mode of operation of user without any restriction.

It will be the foreground video of prospect from target using target object in method for processing video frequency provided in an embodiment of the present invention The view for being split in video, and the foreground video frame for the foreground video divided being superimposed with the background video frame of background video Frequency frame is encapsulated as synthetic video, to synthesize based on the content of video to dynamic video, obtains image content coordination Dynamic video.Here, it is based on image Segmentation Technology, target object is extracted from a video in real time, with another Video is synthesized, and realizes two automatic syncretizing effects of video, the efficiency of energy significant increase user video production, and excite use More interesting videos are created at family, allow ordinary user that can also produce the video of similar film special efficacy.

In some embodiments, when the quantity of target video is at least two, step S302 can be executed are as follows: reception is directed to The batch splitting of at least described two target videos operates；It is operated in response to the batch splitting, from each target video Middle acquisition is determined as corresponding foreground video using the target object as the video clip of prospect.

When the quantity of target video is multiple, cutting operation can operate for batch splitting.For multiple target videos, often A target video can be identical as the target object of prospect, can not also be identical.Wherein, the corresponding target pair of different target videos As that can be same class object.Here, the foreground video being partitioned into is the video clip that the target object in target video is constituted.

At this point, step S303 can be executed are as follows: receive at least two foreground videos and the background video Batch synthetic operation；In response to the batch synthetic operation, the foreground video frame at least two foreground video is distinguished It is overlapped with the background video frame in the background video.

Such as: when prospect video bag includes: video A', video B' and video C', background video be video D when, by video A', The video frame of video B' and video C' are overlapped with the background video frame in background video jointly, and Overlay can be such as Fig. 5 B institute Show, wherein background area 501 be video D picture, object 502, object 503 and object 504 be respectively video A, video B and Object a, object b and the corresponding region object c in prospect in video C.

In some embodiments, when step S303 may be so performed that loaded and displayed has the video of alternative background video Select window；The video selection received for the video selection window operates；Obtain the selected back of the video selection operation Scape video.

It is provided with video selection window in the video processing applications program run in terminal, is shown in video selection window The identification information of alternative background video.In video selection window the identification information of alternative background video can from local acquisition, It can also be obtained from network side.Terminal is operated based on video selection window reception video selection, so that user is grasped based on video selection Make the background video that the selection from alternative background video carries out Video Composition.

For the background video of user's selection, background video can be presented at the terminal, so that user is to selected back Whether scape video carries out preview, be required background video with the selected background video of determination, if not needed for user When the background video wanted, reselecting for background video can be carried out based on video selection window, to selected background video It is replaced.Illustratively, video selection window can be as shown in window 401 in Fig. 4, here not to the selection course of background video It is repeated again.

In some embodiments, in response to the preview operation for the foreground video and the background video, institute is presented State the Overlay of foreground video frame Yu the background video frame.

The video processing applications program run in terminal can provide the interactive entrance for receiving preview operation, to receive instruction The preview operation of the Overlay of preview foreground video and background video.

In some embodiments, as shown in fig. 6, after receiving cutting operation in step S302, following steps can be passed through Foreground video is partitioned into from target video frame:

Step S3021, the target area where identifying the target object in the video frame of the target video, and will Areas transparentization processing except target area described in the video frame；

The target of target object is identified from the video frame of target video by way of image recognition model or calibration Region after identifying target area, keeps the pixel value for the pixel for belonging to target area constant, will belong to other than target area The pixel value of pixel in region be set as 0, so that the areas transparentization except target area is handled, be partitioned into target video Video frame in target object.

Video frame after transparency process is encapsulated as the foreground video by step S3022.

The foreground video frame of transparency process is encapsulated as foreground video based on Video Codec.

In some embodiments, step S3021 this can be implemented so that

Identify the target area where target object described in the video frame of the target video, and according to the target area Domain obtains the corresponding image array of video frame of the target video；Element in described image matrix characterizes corresponding view respectively The pixel of frequency frame belongs to the probability of the target area；Described image matrix is subjected to mask process with corresponding video frame, with By the areas transparent in the video frame in addition to the target area.

Here, the target area where the target object in target video frame, image can be identified by image recognition model Image array of the identification model based on the target area output binaryzation identified.Target can also be identified by the calibration of user The target area where target object in video frame, and the image array of binaryzation is obtained according to determining target area.? In image array, the corresponding element of pixel other than target area is 0, characterizes the pixel and is not belonging to target area, target area The corresponding element of the pixel in domain is 1, characterizes the pixel and belongs to target area.By the video frame of image array and target video into Line mask processing, the pixel value of the pixel of target area is constant, and the pixel value of the pixel in the region other than target area is 0, thus by the areas transparent in video frame in addition to the target area.

Here, image recognition model can be trained by carrying out the sample set of target object mark.Work as target object When for portrait, the training sample in sample set can be as shown in fig. 7, be labeled portrait 702 in portrait picture 701.

In some embodiments, the background in the foreground video frame by the foreground video and the background video Video frame is overlapped, comprising:

Obtain the timestamp alignment relation of the foreground video frame Yu background video frame；By the prospect in the foreground video Video frame is overlapped with the background video frame for meeting the timestamp alignment relation in the background video.

Here, before by foreground video frame and the superposition of background video frame, each foreground video frame in foreground video is obtained Timestamp, and the timestamp of each background video frame in background video is obtained, and according to the timestamp of acquisition, determine foreground video frame With the timestamp alignment relation of background video frame, that is to say, that the pass between the period of foreground video and the period of background video System, and by with timestamp alignment relation foreground video frame and background video frame be overlapped.Wherein, timestamp alignment relation It can be and automatically determined according to the position of position and each background video frame on a timeline of each foreground video frame on a timeline, The editting function that can also be provided based on video processing applications program is determined.Wherein, the editor that video processing applications program provides Function can be existed based on the position of the timestamp of user adjustment operation adjustment foreground video frame on a timeline or foreground video frame Position on time shaft.

Such as: background video when it is 2 minutes a length of, period on time shaft is 0 to 2 minute, foreground video when it is a length of 30 seconds, and its timestamp is aligned with the 1st point of 16 seconds to the 1st point 45 seconds this period of background video, then in foreground video frame First frame and the 1st point of 16 seconds first frame of background video there is timestamp alignment relation, and correspond to frame by frame, by foreground video frame In in each foreground video frame and background video the 1st point of each background video frame in 45 seconds 16 seconds to the 1st point be overlapped.Here, preceding Scape video frame and the frame per second of background video frame can be identical.

For another example: ibid example is adjusted such as Fig. 8 institute the timestamp alignment relation of foreground video frame and background video frame Show, before adjustment, the initial time of foreground video is aligned with the T1 of background video, wherein T1 be 1 point 16 seconds, foreground video It is aligned with the 1st point of 16 seconds to the 1st point 45 seconds this period of background video.User is based on slidably control prolongs arrow shown in dotted line Head shown in direction be adjusted, the initial time of foreground video is adjusted to the T2 of background video, wherein T2 be 1 point 06 second, Then by time adjustment operation by initial position of the foreground video frame in the time shaft of background video by the 1st point adjust within 16 seconds to 1st point 06 second, the 1st point of the 1st point of 06- of the period of foreground video frame and background video frame is aligned for 35 seconds at this time, then is regarded prospect The 1st point of each background video frame in 35 seconds 06 second to the 1st point is overlapped in each foreground video frame and background video in frequency frame.

In embodiments of the present invention, terminal can provide the slidably times such as control adjustment interface in user interface, so that User passes through the starting that slidably times such as control adjustment interface selection foreground video is synthesized with background video on a user interface Time, synthesis end time.It should be noted that the initial time of synthesis or the end time of synthesis are between background video Between initial time and end time.When foreground video frame and background video frame are superimposed by electronic equipment, based on selected It is background video frame that the initial time of synthesis, which starts decoding background video respectively, and based on the background video frame decomposed and prospect Video frame is overlapped frame by frame, until the end time of synthesis.If between the initial time of synthesis and the end time of synthesis Every the duration than foreground video, then it is subject to end time of foreground video.If the initial time of synthesis and the end of synthesis The interval of time is shorter than the foreground video time, then be subject to selection synthesis end time, i.e., do not arrive also foreground video ending Just terminate to synthesize.

In response to the edit operation for the foreground video and background video setting synthetic parameters；By the prospect Video frame covers the background video frame, and overlay area of the foreground video frame in the background video frame meets setting Synthetic parameters.Synthetic parameters include at least one following parameter: position, size etc., are regarded with characterizing foreground video frame in background The superposed positions such as relative position, relative size in frequency frame.

The video processing applications program run in terminal can provide edit page, can display foreground view in edit page Background video frame in the foreground video frame and background video of frequency can show the prospect view with timestamp alignment relation here Frequency frame and background video frame.

It is loaded with editor's interactive interface on editing interface, receives the edit operation of setting synthetic parameters, to set synthesis Parameter.Wherein, edit operation can be the operations such as translation, rotation, scaling.

In practical applications, the editing interface for carrying out edit operation can be as shown in figure 9, provide one in editing interface 901 Rectangle frame 902 identical with the size of the target object in foreground video frame receives user to foreground video based on the rectangle frame Edit operation.

When determine user complete edit operation after, can automatic trigger synthetic operation, may be based on user in the display interface The interactive entrance of operation receives synthetic operation.Prospect is regarded based on the synthetic parameters of edit operation setting in response to synthetic operation Frequency frame covers the background video frame, so that covering of the foreground video in background video frame meets the synthetic parameters of setting.

In some embodiments, synthetic parameters of the determination foreground video frame in the background video frame, packet It includes: construction initial matrix identical with the foreground video frame sign；According to the edit operation in the initial matrix Element is adjusted, and obtains the objective matrix for characterizing the variable quantity of the synthetic parameters.

Here, the matrix identical with the high width of target object in foreground video frame of construction, referred to as initial matrix.According to editor Operation is adjusted initial matrix, obtains the objective matrix of the variable quantity of characterization synthetic parameters.When edit operation is translation, The value of the corresponding element of pixel where the position of translation is then revised as displacement.When edit operation is scaling, then will The value of the corresponding element of pixel where the position of scaling is revised as scaling.It, then will rotation when edit operation is rotation Position where the corresponding element of pixel value be revised as scaling angle function.

Illustratively, when the high width of prospect video frame is 3, initial matrix can be the matrix of 3*3When translation Objective matrix can beObjective matrix when translation can beObjective matrix when rotation can beWherein, t_x、t_yRespectively indicate the displacement along the translation of the direction x, y, s_x、s_yRespectively indicate edge X, the ratio of the direction y scaling, the q in sin (q)/cos (q) indicate the angle of rotation.Wherein, the ratio along the direction x, y scaling is Indicate s_x、s_yIndicate that 2-d spatial coordinate (x, y) scales s centered on (0,0) in the horizontal direction_xTimes, it contracts in vertical direction Put s_yTimes, that is to say, that the horizontal distance of coordinate position distance (0,0) becomes level of the former coordinate from place-centric point after transformation The s of distance_xTimes, vertical range becomes the s of vertical range of the former coordinate from place-centric point_yTimes.Wherein, 1,0 is without practical meaning Justice is will to calculate the default parameters for being expressed as obtaining when math matrix.

In some embodiments, described that the foreground video frame is covered into the background video frame, and the foreground video Overlay area of the frame in the background video frame meets the synthetic parameters of setting, comprising:

The objective matrix is multiplied with the foreground video frame in the foreground video, the foreground video after being adjusted Frame；The foreground video frame adjusted is covered into the background video frame.

Here, objective matrix can be multiplied with the bitmap of foreground video frame, the bitmap of the foreground video frame after being adjusted. Bitmap (Bitmap) is stored in a manner of the two-dimensional array of RGBA pixel.Coordinate position is p0 (x0, y0) in foreground video frame Pixel when being converted, the parameters such as the displacement of transformation, scaling size are inputted into R-matrix, obtain corresponding objective matrix M (x0, y0), then the coordinate position of the pixel is p1 (x1, y1) in the foreground video frame after adjustment, and the calculating of p1 (x1, y1) is public Formula are as follows:

P1 (x1, y1)=p0 (x0, y0) * M (x0, y0)；

Wherein, p0 (x0, y0) is with the transposition [x y] of matrix [x y]^TIt is calculated.

Such as: when a space coordinate p0 (x0, y0) first prolongs the direction x translation t_x, then prolong the direction y translation t_yThe seat finally obtained Mark p1 (x1, y1)=(x0+t_x, y0+t_y), it, can when being indicated with the form of matrix are as follows:

Each pixel can obtain a new coordinate position in foreground video frame, to obtain a new pixel two Dimension group, the bitmap that can be reduced to by this two-dimensional array after new Bitmap, that is, adjustment.

Method for processing video frequency provided in an embodiment of the present invention is capable of providing edit page, and is received and used based on edit page Edit operation of the family to foreground video frame, when adjustment foreground video frame is synthesized with background video frame, relative to background video frame Relative position and imaging size.

Illustratively, by taking electronic equipment is using Android platform as an example, to video encoder involved in the embodiment of the present invention It is illustrated, the encoding and decoding framework of video encoder is as shown in Figure 10:

Codec, which can handle, enters data to generate output data, and codec uses one group of input buffer and defeated Buffer carrys out asynchronous process data out.An empty input block can be created by loader, to send after filling data It is handled to codec.The input data that codec provides client is converted, be then output to one it is empty Output buffer.Last client gets the data of output buffer, consumes the data of the inside, and the output of occupancy is buffered Area is released back into codec.If subsequent, there are also data needs to continue with, and codec will repeat these operations.

The data type that codec can be handled can include: compressed data and original video data.Buffer area can be passed through (ByteBuffers) these data are handled, screen buffer (Surface) is needed to open up original video data at this time Show, can also improve the performance of encoding and decoding in this way.Surface can be used local screen buffer, this buffer area do not map or Copy ByteBuffers to.Such mechanism allows the more efficient of codec.It, can not usually when using Surface Original video data is accessed, but cis (ImageReader) can be used and access decoded original video frame.

In the following, electronic equipment is that terminal is actual application scenarios using target object as portrait, it will illustrate implementation of the present invention Exemplary application of the example in actual application scenarios.

The relevant technologies, Video Composition scheme can be as shown in figure 11, comprising: selection background template 1101.Background template 1101 For a background video.The portrait picture 1102 that picture material includes portrait is chosen, selects portrait figure in such a way that user smears The portrait area of piece 1102, based on user selection portrait area, divided by AI, by portrait picture 1102 be divided into portrait and Background two parts show that portrait 1103 forms editor's image 1104, and right to pluck out portrait 1103 on background template 1101 Show that portrait 1103 and the position of background template 1101 are edited in editor's image 1104, after completing the editing, and based on volume The synthetic parameters collected are merged to by portrait 1103 with background template 1101, obtain the synthetic video that picture material includes portrait 1105。

The synthetic effect of Video Composition scheme shown in Figure 11 is as shown in figure 12, by the people in static portrait picture 1102 As 1103 are synthesized in background template 1101, the display page 1105 of synthetic video 1104 is obtained.Video Composition shown in Figure 11 Scheme is to carry out portrait background segment to static picture, is then synthesized in video, limitation is larger, first is that must have The good template background video of particular production, second is that can only be split for static images, the portrait plucked out is static, forfeiture Many interests.In addition the portrait segmentation of picture is needed to carry out region manually to smear selection, there is multiframe for processing The video efficiency of image is too low.

In the related technology, user is when watching some short-sighted frequency, can initiate video and be in step with function, two videos are carried out Synthesis, is combined into a video with frame.The technic relization scheme that video is in step with is that two videos are directly carried out left and right splicing, Two videos can seem more stiff because scene is different.The effect that video is in step with is as shown in figure 13, wherein picture 1301 is The picture for one of video that video is in step with, picture 1302 are the pictures of another video.

Therefore, the be in step with scheme of function of video is that simply two videos or so are stitched together, and synthesis one is bigger Video, the background of two videos is not handled, there are two scenes for the video of synthesis, it appears loftier.

In order to which the limitation for solving the above-mentioned Video Composition scheme for being only capable of being synthesized to static picture in video is fixed big, or will The lofty technological deficiency of the Video Composition scheme scene that two videos are spliced, the embodiment of the present invention provide a kind of video processing Method, comprising: video selection, video decoding, portrait segmentation, picture editting, Video Composition, as shown in figure 14, comprising:

Video selection is carried out from local video, obtains background video 1401 and portrait video 1402 i.e. target video. Video decoding is carried out to background video 1401, obtains video frame 1403 i.e. background video frame.Video is carried out to portrait video 1402 Decoding, obtains video frame 1404.Video frame 1404 is inputted into neural network model 1405 and carries out portrait segmentation, exports portrait Mask Figure 140 6 obtains portrait image 1407 i.e. prospect by mask Figure 140 6 of portrait and the mask process of video frame 1404 Video frame.

When receiving beginning edit operation, by background video frame 1402 and the display of portrait image 1407 in editing interface On.Portrait image 1407 on editing interface receives the edit operation of user, based on the edit operation of user to portrait image The relative position of 1407 relative video frames 1402 and relative size are adjusted, and obtain relativeness, and be based on relativeness people As image is edited.Wherein, edit operation portrait image 1407 carried out can include: the processing such as translation, scaling and rotation. When receiving preview operation, edited portrait image 1408 and video frame 1402 are rendered, output is used as portrait figure As 1407 and the Overlay 1409 of video frame 1402.When receiving synthetic operation, again by rendering to edited people As image 1408 and video frame 1402 are rendered, synthetic frame 1410 is obtained, again by Multimedia Encoder by synthetic frame 1410 are packaged into synthetic video 1411.Wherein, it after exporting Overlay 1409, can also continue to receive edit operation, to portrait figure As the relative position of 1407 relative video frames 1402 and relative size are adjusted.

Terminal device can show video selection option by system photograph album or the customized photograph album page, based on shown choosing Item selection background video 1401 and portrait video 1402.Terminal device is regarded background video 1401 and portrait by MediaCodec Frequently 1402 it is decoded into multiple single-frame images respectively.Each frame image decoded to portrait video 1402 carries out portrait segmentation, obtains To portrait image 1407.Terminal device passes through each frame portrait image 1407 of the target Matrix to segmentation of characterization synthetic parameters Bitmap carry out matrixing, the Bitmap of the portrait image after being edited, then Bitmap after editor is passed through OpenGL ES API uploads to the texture cell of graphics processing unit (Graphic Process Unit, GPU), terminal device GPU by tinter by the corresponding texture of background video frame 702 and editor after portrait image texture carry out image mix Closing operation obtains final synthetic frame, and synthetic frame is encoded into synthetic video by MediaCodec.

In the following, to being described with the next stage in method for processing video frequency provided in an embodiment of the present invention: portrait segmentation, figure Piece editor, rendering, the decoding of video and synthesis.

1, portrait is divided

In the server, using include multiple portrait class pictures manually marked set as training set, to neural network Model is trained, and is transplanted to terminal by the neural network model after training and preservation, and by the neural network model after training In equipment.

Server collects portrait class picture, and is labeled by portrait class picture of the artificial mode to collection, will The corresponding region of portrait is as prospect in portrait class picture, and using the region other than portrait as background, by foreground and background Each pixel distinguishes.The portrait class picture manually marked can be as shown in fig. 7, carry out portrait 702 in portrait picture 701 Mark.

For dynamic target video, target video is decoded into real time by static state by Video Decoder (MediaCodec) Frame, then the neural network model by static frames input training is handled, and is returned to the picture mask (binary map) of segmentation, is passed through Original image in mask and target video frame carries out transparency blending, in the portrait i.e. foreground video after can cutting out segmentation Foreground video frame.

2, picture editor

User it is divisible go out portrait picture translate, scale, rotate etc. and edit.User can be according to self-demand to volume The position and size collected are controlled.

The portrait picture being partitioned into is stored in a manner of Bitmap in memory, can be by Matrix matrix to storage Bitmap is converted.By constructing a rectangle frame equal with the wide height of portrait picture, then it is supplied on graphical interfaces The interactive entrance of user dragging and rotation, available user edit the Matrix i.e. objective matrix of rectangle frame generation, will be original Portrait picture pixels be multiplied with Matrix, the Bitmap after transformation such as available translation, scaling and rotation.

3, it renders

Portrait can carry out live preview after splitting in target video.In addition in editor's size and location information It later, can also live preview.In live preview, OpenGL ES has been adopted in the terminal and has been rendered, by the image of each frame RGB data uploads to the texture cell of GPU, is then rendered by the rendering pipeline of GPU, and final GPU can be by the figure of output As data render to screen Frame Buffer in, to be shown on screen.

Have efficient parallel processing and rendering framework based on GPU, be very suitable to the processing and rendering of image, therefore, leads to The API of OpenGL ES is crossed to utilize GPU to be rendered, may be implemented to achieve the purpose that real-time rendering special efficacy.

4, video decoding and synthesis

By the video frame being decoded to obtain to video in video, to handled frame by frame video.It is merging most When the video at end, need using Video Composition technology, that is, Video coding.

By taking Android platform as an example, coding and decoding video can be carried out based on the MediaCodec module on Android.

Method for processing video frequency provided in an embodiment of the present invention can be in the video processing applications program of mobile video and live streaming class In, the multiple videos of rapid synthesis are used for, entertaining video editing efficiency is promoted.As shown in figure 15, user when in use, can be successively Select multiple videos, such as: video 151-1, video 151-2 ... video 151-m, wherein video 151-2 be template video (i.e. Background video).User clicks the interactive interface that portrait is scratched for a key that video processing applications program provides, image application program To other than place video 151-2 video 151-1 ... each video carries out portrait segmentation in video 151-m, corresponding to generate multiple portraits Video: portrait video 152-1 ... portrait video 152-m.Wherein, in portrait video, the background area of non-portrait is transparent.User It can be sequentially adjusted in size and relative position of the portrait video relative to template video (may include a background), and live preview melts Close effect.When image, which is applied to figure, receives user's click synthesis button, Video Composition is carried out, by portrait video 152-1 ... Portrait video 152-m is blended into jointly in video 151-2, obtains synthetic video 153, and can protect the video 153 that synthesis is completed It is stored to local.

The video matting interface that video processing applications program provides can be currently in work such as 1601 in Figure 16, window 1602 That makees area needs to be scratched portrait list of videos, in addition there is the button 1603 of one " stingy portrait " and the button of " starting to edit " 1604.Portrait in the video currently selected can be extracted, and shown in real time by " stingy portrait " button 1603 when the user clicks Show in preview area, clicks " starting to edit " button 1604, editing interface 1605 of entering.Editing interface 1605 is for editing portrait In addition there are " replacement background video " button 1608 and one in the relative position and size of video 1606 and background video 1607 " starting to synthesize " button 1609, " replacement background video " button 1608 for replacing the background video currently selected, " start to close At " button 1609 is for starting final Video Composition.

Method for processing video frequency provided in an embodiment of the present invention by carrying out automatic portrait background segment to dynamic video, this The portrait that sample plucks out is movement and fresh and alive image, and user is allowed to select arbitrary background video, and portrait is synthesized to this In video, to realize two sections even fusion of multistage video.For example, the video danced indoors using two sections of users, Two portraits wherein respectively are plucked out, are synthesized in the scene video of another stage, to realize the cooperation of two people strange lands The effect of performance.Due to having carried out portrait background segment, the video scene finally synthesized is unified, therefore this programme introduces more Writing space, the imagination and creativity of user can be sufficiently excited, to promote the playability and interest of software entirety.

Illustrate the exemplary structure of software module below, in some embodiments, as shown in Fig. 2, in video process apparatus Software module may include:

First acquisition unit 2401, for obtaining target video；

Cutting unit 2402, for the cutting operation in response to being directed to target object in the target video, from the mesh It marks in video and obtains using the target object as the foreground video of prospect；The foreground video includes at least one foreground video Frame；

Second acquisition unit 2403, for obtaining background video；The background video includes at least one background video frame；

Synthesis unit 2404, for the synthetic operation in response to being directed to the foreground video and the background video, by institute The foreground video frame stated in foreground video is overlapped with the background video frame in the background video, and the view that superposition is obtained Frequency frame is encapsulated as synthetic video.

In some embodiments, cutting unit 2402 are also used to:

The batch splitting received at least described two target videos operates；It is operated in response to the batch splitting, from It obtains in each target video using the target object as the video clip of prospect, and is determined as corresponding foreground video.

In some embodiments, synthesis unit 2404 are also used to:

Receive the batch synthetic operation for being directed at least two foreground videos and the background video；In response to described batch Measure synthetic operation, the background that the foreground video frame at least two foreground video is added to respectively in the background video In video frame.

In some embodiments, second acquisition unit 2403 are also used to:

Loaded and displayed has the video selection window of alternative background video；Receive the video for being directed to the video selection window Selection operation；Obtain the selected background video of the video selection operation.

In some embodiments, described device further include: preview unit is used for:

In some embodiments, cutting unit 2402 are also used to:

Target area where identifying the target object in the video frame of the target video, and by the video frame Described in areas transparentization processing except target area；Video frame after transparency process is encapsulated as the foreground video.

In some embodiments, cutting unit 2402 are also used to:

In some embodiments, synthesis unit 2403 are also used to:

In some embodiments, the synthesis unit 2403, is also used to:

In some embodiments, synthesis unit 2403 are also used to:

Construct initial matrix identical with the foreground video frame sign；According to the edit operation to the initial matrix In element be adjusted, obtain the objective matrix for characterizing the variable quantity of the synthetic parameters.

In some embodiments, synthesis unit 2403 are also used to:

The example of hardware implementation, method provided by the embodiment of the present invention are used as method provided in an embodiment of the present invention The processor 410 of hardware decoding processor form can be directly used to execute completion, for example, dedicated by one or more application Integrated circuit (ASIC, Application Specific Integrated Circuit), DSP, programmable logic device (PLD, Programmable Logic Device), Complex Programmable Logic Devices (CPLD, Complex Programmable Logic Device), field programmable gate array (FPGA, Field-Programmable Gate Array) or other electronics Element, which executes, realizes method provided in an embodiment of the present invention.

The embodiment of the present invention provides a kind of storage medium for being stored with executable instruction, wherein it is stored with executable instruction, When executable instruction is executed by processor, processor will be caused to execute method provided in an embodiment of the present invention, for example, such as Fig. 3 The method shown.

In some embodiments, executable instruction can use program, software, software module, the form of script or code, By any form of programming language (including compiling or interpretative code, or declaratively or process programming language) write, and its It can be disposed by arbitrary form, including be deployed as independent program or be deployed as module, component, subroutine or be suitble to Calculate other units used in environment.

As an example, executable instruction can with but not necessarily correspond to the file in file system, can be stored in A part of the file of other programs or data is saved, for example, being stored in hypertext markup language (HTML, Hyper Text Markup Language) in one or more scripts in document, it is stored in the single file for being exclusively used in discussed program In, alternatively, being stored in multiple coordinated files (for example, the file for storing one or more modules, subprogram or code section).

As an example, executable instruction can be deployed as executing in a calculating equipment, or it is being located at one place Multiple calculating equipment on execute, or, be distributed in multiple places and by multiple calculating equipment of interconnection of telecommunication network Upper execution.

It will using target object be foreground video from target video in target video in conclusion through the embodiment of the present invention In split, and the foreground video frame of the foreground video to be divided is as prospect and using the background video frame of background video as back Scape carries out the synthesis of video frame, and the video frame of synthesis is encapsulated as synthetic video, thus based on the content of video to dynamic view Frequency is synthesized, and the dynamic video of image content coordination is obtained.A key operation based on display interface, to multiple target videos into The dividing processing of row mass target object.Also, editing interface is provided a user, the edit operation based on user, to prospect Video is edited relative to the position of background video and imaging size.

The above, only the embodiment of the present invention, are not intended to limit the scope of the present invention.It is all in this hair Made any modifications, equivalent replacements, and improvements etc. within bright spirit and scope, be all contained in protection scope of the present invention it It is interior.

Claims

1. a kind of method for processing video frequency characterized by comprising

Obtain target video；

In response to the cutting operation for target object in the target video, obtain from the target video with the target Object is the foreground video of prospect；The foreground video includes at least one foreground video frame；

In response to being directed to the synthetic operation of the foreground video and the background video, by the foreground video in the foreground video Frame is overlapped with the background video frame in the background video, and

The video frame that superposition obtains is encapsulated as synthetic video.

2. the method according to claim 1, wherein described in response to for target object in the target video Cutting operation, from the target video obtain using the target object as the foreground video of prospect, comprising:

The batch splitting received at least two target videos operates；

It operates in response to the batch splitting, obtains from each target video using the target object as the video of prospect Segment, and it is determined as corresponding foreground video.

3. the method according to claim 1, wherein described in response to being directed to the foreground video and the background The synthetic operation of video folds the foreground video frame in the foreground video with the background video frame in the background video Add, comprising:

In response to the batch synthetic operation, the foreground video frame at least two foreground video is added to respectively described In background video frame in background video.

4. the method according to claim 1, wherein the acquisition background video, comprising:

The video selection received for the video selection window operates；

Obtain the selected background video of the video selection operation.

5. the method according to claim 1, wherein the method also includes:

In response to being directed to the preview operation of the foreground video and the background video, the foreground video frame and the back is presented The Overlay of scape video frame.

6. the method according to claim 1, wherein described obtain from the target video with the target pair As the foreground video for prospect, comprising:

Target area where identifying the target object in the video frame of the target video, and by institute in the video frame State the areas transparentization processing except target area；

Video frame after transparency process is encapsulated as the foreground video.

7. according to the method described in claim 6, it is characterized in that, it is described from the video frame of the target video identification described in Target area where target object, and the areas transparentization except target area described in the video frame is handled, comprising:

It identifies the target area where target object described in the video frame of the target video, and is obtained according to the target area To the corresponding image array of video frame of the target video；Element in described image matrix characterizes corresponding video frame respectively Pixel belong to the probability of the target area；

Described image matrix and corresponding video frame are subjected to mask process, will in the video frame except the target area it Outer areas transparent.

8. method according to any one of claims 1 to 7, which is characterized in that the prospect by the foreground video Video frame is overlapped with the background video frame in the background video, comprising:

The background of the timestamp alignment relation will be met in foreground video frame and the background video in the foreground video Video frame is overlapped.

9. method according to any one of claims 1 to 7, which is characterized in that the prospect by the foreground video Video frame is overlapped with the background video frame in the background video, comprising:

In response to the edit operation for the foreground video and background video setting synthetic parameters, by the foreground video Frame covers the background video frame, and overlay area of the foreground video frame in the background video frame meets the conjunction of setting At parameter.

10. according to the method described in claim 9, it is characterized in that, the method also includes:

Construct initial matrix identical with the foreground video frame sign；

The element in the initial matrix is adjusted according to the edit operation, obtains the variation for characterizing the synthetic parameters The objective matrix of amount.

11. according to the method described in claim 10, it is characterized in that, described cover the background view for the foreground video frame Frequency frame, and overlay area of the foreground video frame in the background video frame meets the synthetic parameters of setting, comprising:

The objective matrix is multiplied with the foreground video frame in the foreground video, the foreground video frame after being adjusted；

The foreground video frame adjusted is covered into the background video frame.

12. a kind of video process apparatus characterized by comprising

First acquisition unit, for obtaining target video；

Cutting unit, for the cutting operation in response to being directed to target object in the target video, from the target video It obtains using the target object as the foreground video of prospect；The foreground video includes at least one foreground video frame；

Synthesis unit regards the prospect for the synthetic operation in response to being directed to the foreground video and the background video Foreground video frame in frequency is overlapped with the background video frame in the background video, and the video frame that superposition is obtained encapsulates For synthetic video.

13. a kind of video process apparatus characterized by comprising

Memory, for storing executable instruction；

Processor when for executing the executable instruction stored in the memory, is realized described in any one of claim 1 to 11 Method for processing video frequency.

14. a kind of storage medium, which is characterized in that being stored with executable instruction, when for causing processor to execute, realizing right It is required that 1 to 11 described in any item method for processing video frequency.