CN109640112A - Video processing method, apparatus, device and storage medium - Google Patents
Video processing method, apparatus, device and storage medium
- Publication number
- CN109640112A (application number CN201910037302.9A)
- Authority
- CN
- China
- Prior art keywords
- video
- processed
- audio feature
- feature information
- information
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/233—Processing of audio elementary streams
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/234—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
- H04N21/23418—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/235—Processing of additional data, e.g. scrambling of additional data or processing content descriptors
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/435—Processing of additional data, e.g. decrypting of additional data, reconstructing software from modules extracted from the transport stream
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/439—Processing of audio elementary streams
- H04N21/4394—Processing of audio elementary streams involving operations for analysing the audio stream, e.g. detecting features or characteristics in audio streams
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/44—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
- H04N21/44008—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- User Interface Of Digital Computer (AREA)
- Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
Abstract
Embodiments of the invention disclose a video processing method, apparatus, device and storage medium. The method includes: obtaining audio feature information from a video to be processed, the audio feature information including at least one of channel information, voiceprint information and system voice prompt information; determining a video feature parameter corresponding to the video to be processed according to the audio feature information; and adding a video tag corresponding to the video feature parameter to the video to be processed. This technical solution makes video tags richer and improves how well viewers understand the video content.
Description
Technical field
Embodiments of the present invention relate to video processing technology, and in particular to a video processing method, apparatus, device and storage medium.
Background
As online video develops and video content grows richer, users' expectations for the viewing experience keep rising.
In the prior art, tags are extracted from game videos mainly by recognizing the video picture. The resulting tags are therefore limited to what the picture can show, so the tag content is too narrow, which reduces how well viewers understand a game video while watching it.
Summary of the invention
Embodiments of the present invention provide a video processing method, apparatus, device and storage medium, so as to make video tags richer and improve how well viewers understand video content.
In a first aspect, an embodiment of the present invention provides a video processing method, including:
obtaining audio feature information from a video to be processed, the audio feature information including at least one of channel information, voiceprint information and system voice prompt information;
determining a video feature parameter corresponding to the video to be processed according to the audio feature information; and
adding a video tag corresponding to the video feature parameter to the video to be processed.
In a second aspect, an embodiment of the present invention further provides a video processing apparatus, including:
an information obtaining module, configured to obtain audio feature information from a video to be processed, the audio feature information including at least one of channel information, voiceprint information and system voice prompt information;
a parameter determination module, configured to determine a video feature parameter corresponding to the video to be processed according to the audio feature information; and
a tag adding module, configured to add a video tag corresponding to the video feature parameter to the video to be processed.
In a third aspect, an embodiment of the present invention further provides a computer device, including:
one or more processors; and
a memory for storing one or more programs;
wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the video processing method according to any embodiment of the present invention.
In a fourth aspect, an embodiment of the present invention further provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the video processing method according to any embodiment of the present invention.
In the embodiments of the present invention, audio feature information is obtained from the video to be processed, the audio feature information including at least one of channel information, voiceprint information and system voice prompt information; a video feature parameter corresponding to the video to be processed is determined according to the audio feature information; and the video tag corresponding to the video feature parameter is added to the video to be processed. Using the audio feature information in the video yields richer video tags. This solves the prior-art problem that video tags derived only from the video picture are too narrow and reduce how well the video is understood, and it makes video tags richer and improves how well viewers understand the video content.
Brief description of the drawings
Fig. 1a is a flow diagram of a video processing method provided by Embodiment one of the present invention;
Fig. 1b is a schematic diagram of a video tag display mode applicable to Embodiment one of the present invention;
Fig. 2 is a flow diagram of a video processing method provided by Embodiment two of the present invention;
Fig. 3 is a structural diagram of a video processing apparatus provided by Embodiment three of the present invention;
Fig. 4 is a structural diagram of a computer device provided by Embodiment four of the present invention.
Detailed description
The present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the invention, not to limit it. It should also be noted that, for ease of description, the drawings show only the parts related to the present invention rather than the entire structure.
Embodiment one
Fig. 1a is a flow diagram of a video processing method provided by Embodiment one of the present invention. The method is applicable to tagging video content and can be executed by a video processing apparatus. The apparatus can be implemented in hardware and/or software and can generally be integrated into a server or any computer device with a video processing function. The method specifically includes the following steps:
S110: Obtain audio feature information from the video to be processed, the audio feature information including at least one of channel information, voiceprint information and system voice prompt information.
This embodiment targets tags that currently cannot be obtained through simple image recognition and uses the audio feature information of the video to identify them accurately, so that tags become richer along multiple dimensions while a video is processed or broadcast live. Here, a video tag can be keyword information that marks the highlights of a video.
In this embodiment, the video to be processed can be, for example, a game video; it can be a recorded clip or a live stream, which is not limited here. The audio feature information in the video to be processed can be the sound data of the video, such as the audio track of a game video, and includes at least one of channel information, voiceprint information and system voice prompt information. Channel information can be multidimensional sound information with stereo channels; voiceprint information can be information such as the loudness and acoustic characteristics of a sound; and system voice prompt information can be, for example, the prompt sound issued when a key system event is triggered.
S120: Determine the video feature parameter corresponding to the video to be processed according to the audio feature information.
In this embodiment, a video feature parameter can be a parameter that characterizes key event content in the video, such as data identifying a player's highlight moments in a game video. Because some operation data is not shown directly in the game picture, this part of the video feature parameters can be obtained by analyzing the audio feature information of the video. For example, when an enemy player fires in a game video, the enemy's shooting distance is not shown explicitly in the picture and so cannot be judged directly from it, but it can be determined from the loudness of the gunshot.
Illustratively, a preset algorithm can be used to recognize the audio feature information extracted from the video to be processed, and the video feature parameter is obtained from the recognition result. For example, text recognition can be performed on the extracted audio feature information, and the text or data in the recognition result that matches predefined keywords can be filtered out and used as the video feature parameter of the video to be processed.
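The keyword-filtering step described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes the audio has already been converted to text by some recognizer, and the keyword list and the function name match_keywords are hypothetical.

```python
import re

# Hypothetical predefined keywords; the source does not list any.
PREDEFINED_KEYWORDS = ("headshot", "reloading", "airdrop")


def match_keywords(recognized_text, keywords=PREDEFINED_KEYWORDS):
    """Return the predefined keywords that occur in the recognized text."""
    # Lower-case and collapse whitespace so matching is tolerant of formatting.
    normalized = re.sub(r"\s+", " ", recognized_text.lower())
    return [kw for kw in keywords if kw in normalized]
```

Each returned keyword would then serve as a candidate video feature parameter; an empty list means the recognized text contained nothing of interest.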
The benefit of determining the video feature parameter from audio feature information is that more useful tags can be extracted across more dimensions, including tag dimensions that image recognition cannot reach, which substantially improves how well viewers understand the video content.
S130: Add the video tag corresponding to the video feature parameter to the video to be processed.
In this embodiment, different video feature parameters can correspond to different video tags, and the video feature parameters are converted into tags. For example, if the video feature parameter identified for the video to be processed is an enemy shooting distance of 100 meters, the corresponding video tag can be "100". One video feature parameter can correspond to one video tag, or several video feature parameters can jointly correspond to one video tag, which is not limited here.
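As a rough sketch of this mapping, the function below turns a dictionary of video feature parameters into tags, covering both the one-to-one case (the 100-meter example) and the case where several parameters are combined into one tag. The dictionary keys and the combined tag wording are assumptions made for illustration only.

```python
def tags_for(params):
    """Map video feature parameters (a dict) to a list of tag strings."""
    tags = []
    if "enemy_distance_m" in params:
        # One parameter -> one tag, as in the 100-meter example.
        tags.append(str(params["enemy_distance_m"]))
    if "enemy_direction" in params and "weapon_type" in params:
        # Several parameters combined into a single tag (illustrative wording).
        tags.append(f'{params["weapon_type"]} fire from the {params["enemy_direction"]}')
    return tags
```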
In one optional implementation, the video tag can be shown as a keyword of the video to be processed below the video's display interface, so that viewers can pick videos that interest them according to the tags.
In another optional implementation, adding the video tag corresponding to the video feature parameter to the video to be processed includes: obtaining the video time segment in the video to be processed that corresponds to the video feature parameter; and displaying the video tag corresponding to the video feature parameter in the video display picture of that time segment.
Illustratively, the video playback time at which the video feature parameter was obtained can be recorded, a preset playback duration following that time can be taken, and the resulting period can be used as the video time segment corresponding to the video feature parameter. The corresponding video tag is then added to and shown in the video display picture for that time segment, helping viewers better understand the video content.
As a concrete example, in Fig. 1b, a video feature parameter of an enemy shooting distance of 100 meters is identified at playback time 3 min 05 s; video tag 11 is then shown in video display picture 1 from 3 min 05 s to 3 min 35 s of playback.
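The time-segment computation in this example can be sketched as below. The 30-second default is inferred from the 3:05 to 3:35 span in the Fig. 1b example; the source only says the duration is preset, and the function names are hypothetical.

```python
def tag_display_segment(detected_at_s, preset_duration_s=30):
    """Return (start, end) in seconds for showing a tag after detection."""
    return detected_at_s, detected_at_s + preset_duration_s


def format_mmss(seconds):
    """Format a playback time in seconds as m:ss."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes}:{secs:02d}"
```

For a detection at 185 seconds (3:05), the segment runs from 185 to 215 seconds, which format_mmss renders as 3:05 and 3:35.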
On the basis of the above embodiments, optionally, after the video tag corresponding to the video feature parameter is added to the video to be processed, the method further includes: scoring the video to be processed according to its video tags; and recommending and displaying the video according to its score.
Specifically, each video tag can have a corresponding score value. The score values of the video tags added to the video to be processed are summed, the result is used as the score of the video, and videos with high scores are recommended and displayed first. Alternatively, video tags can be divided into types, with different types carrying different weights. When the score is calculated, the weight of each tag's type is multiplied by the tag's score value, and the products are summed over all video tags added to the video. Both approaches are applicable to this embodiment, which is not limited here.
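Both scoring schemes, plus the ranking step, can be written in a few lines. The sketch below assumes tags are strings and that score values, tag types and type weights are supplied as dictionaries; none of these data structures or names come from the source.

```python
def cumulative_score(tags, tag_scores):
    """Plain sum of the score values of all tags on a video."""
    return sum(tag_scores[tag] for tag in tags)


def weighted_score(tags, tag_scores, tag_types, type_weights):
    """Sum of (weight of the tag's type) x (tag's score value) over all tags."""
    return sum(type_weights[tag_types[tag]] * tag_scores[tag] for tag in tags)


def rank_videos(video_scores):
    """Return video ids ordered from highest to lowest score."""
    return sorted(video_scores, key=video_scores.get, reverse=True)
```

The weighted variant lets a platform favor, say, event tags over distance tags without changing the per-tag score values.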
In the technical solution of this embodiment, audio feature information is obtained from the video to be processed, the audio feature information including at least one of channel information, voiceprint information and system voice prompt information; the video feature parameter corresponding to the video to be processed is determined according to the audio feature information; and the video tag corresponding to the video feature parameter is added to the video to be processed. Using the audio feature information in the video yields richer video tags. This solves the prior-art problem that video tags derived only from the video picture are too narrow and reduce how well the video is understood, and it makes video tags richer and improves how well viewers understand the video content.
Embodiment two
Fig. 2 is a flow diagram of a video processing method provided by Embodiment two of the present invention. This embodiment refines the above embodiment and provides a preferred video processing method. Specifically, determining the video feature parameter corresponding to the video to be processed according to the audio feature information is further refined as: inputting the audio feature information into a pre-trained sound recognition model to obtain the video feature parameter corresponding to the video to be processed.
The video processing method provided by this embodiment specifically includes the following steps:
S210: Obtain audio feature information from the video to be processed, the audio feature information including at least one of channel information, voiceprint information and system voice prompt information.
S220: Input the audio feature information into a pre-trained sound recognition model to obtain the video feature parameter corresponding to the video to be processed.
In this embodiment, the audio feature information can first be vectorized, and the resulting feature vector is then input into the pre-trained sound recognition model. The sound recognition model recognizes the input audio feature information and outputs the corresponding video feature parameter. Specifically, the sound recognition model can be a model trained with a preset machine learning algorithm.
The sound recognition model can work as follows: when audio feature information is input, the model performs sound recognition on it, analyzes the recognized feature information, and judges whether the input contains a corresponding feature parameter. If so, it outputs that feature parameter as the video feature parameter of the video to be processed; if not, it outputs nothing. For example, when a game video containing enemy gunshots is input into the sound recognition model, the model performs sound recognition and feature analysis on the video's audio feature information and can then output the corresponding enemy shooting distance.
The benefit of using a sound recognition model in this embodiment is that it improves the accuracy and real-time performance of sound recognition, and in turn the accuracy of video tag addition.
Optionally, before the audio feature information is input into the pre-trained sound recognition model to obtain the video feature parameter corresponding to the video to be processed, the method further includes: obtaining audio feature information samples labeled with target video feature parameters; and training a preset artificial intelligence model with the audio feature information samples to obtain the sound recognition model.
The audio feature information samples can be extracted from live videos on a live-streaming platform, or downloaded from the Internet through a specific search engine, which is not limited here. Taking extraction from live videos on a live-streaming platform as an example: multiple game live rooms are found on the target live-streaming platform, multiple audio segments with typical sound features are extracted from each of these rooms, and the extracted segments are labeled with the corresponding video feature parameter labels to obtain the audio feature information samples. The samples can be labeled by manual evaluation; that is, a person labels the audio signals with typical sound features obtained from each live room with the corresponding video feature parameter labels, so that they serve as audio feature information samples for the different video feature parameters.
The preset artificial intelligence model in this embodiment can be a training model built on a machine learning algorithm, such as a recurrent neural network (RNN). An RNN is an artificial neural network in which the connections between nodes form a directed cycle, so its internal state can exhibit dynamic temporal behavior. Unlike a feedforward neural network, an RNN can use its internal memory to process arbitrary input sequences, which makes it easier to handle tasks such as unsegmented handwriting recognition and speech recognition. Training the artificial intelligence model is the process of adjusting the neural network parameters: through continued training, optimal neural network parameters are obtained, and the preset artificial intelligence model with those parameters is the model finally sought. Illustratively, after multiple audio feature information samples labeled with target video feature parameters are obtained, the preset artificial intelligence model is trained with these samples, and its neural network parameters are adjusted continually until the model can identify the target video feature parameter from input audio feature information, which yields the sound recognition model.
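To make the notion of an RNN's internal state concrete, here is a minimal forward pass of a simple (Elman-style) recurrent layer in plain Python. It is only a sketch of the recurrence: the source does not specify the network architecture, the weights shown in any usage are hypothetical, and training (adjusting the weights) is not shown.

```python
import math


def rnn_forward(inputs, w_xh, w_hh, b_h):
    """Run a simple recurrent layer over a sequence; return the final hidden state.

    inputs: list of input vectors, one per time step
    w_xh:   hidden_size x input_size input-to-hidden weights
    w_hh:   hidden_size x hidden_size recurrent weights
    b_h:    hidden_size bias vector
    """
    hidden = [0.0] * len(b_h)  # internal state starts at zero
    for x in inputs:
        # The new state depends on the current input and the previous state.
        hidden = [
            math.tanh(
                sum(w * xi for w, xi in zip(w_xh[j], x))
                + sum(w * hi for w, hi in zip(w_hh[j], hidden))
                + b_h[j]
            )
            for j in range(len(b_h))
        ]
    return hidden
```

Because the hidden state is fed back at every step, the final state summarizes the whole sequence, which is what lets such a model work on audio frames of varying length.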
Optionally, the video to be processed includes a shooting game video. Correspondingly, inputting the audio feature information into the pre-trained sound recognition model to obtain the video feature parameter corresponding to the video to be processed includes: inputting the channel information in the audio feature information into the pre-trained sound recognition model to obtain the direction of the enemy in the shooting game video; or inputting the channel information and the voiceprint information in the audio feature information into the pre-trained sound recognition model to obtain the direction of the enemy and the enemy's shooting distance in the shooting game video; or inputting the channel information and the voiceprint information in the audio feature information into the pre-trained sound recognition model to obtain the direction of the enemy, the enemy's shooting distance and the firearm type in the shooting game video.
Illustratively, in a shooting game video, the sound of gunfire can be identified through the channel and the voiceprint. Specific shooting data can therefore be obtained by inputting the channel information and/or the voiceprint information into the pre-trained sound recognition model. Specifically, when the channel information of the enemy's shooting audio is input into the sound recognition model, the model can output the direction of the enemy; when the voiceprint information of the enemy's shooting audio is input, the model can output the enemy's shooting distance and/or the type of firearm the enemy uses; and when the voiceprint information of our own side's shooting audio is input, the model can output the type of firearm our side uses.
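The source obtains the enemy's direction from a trained sound recognition model. As a much simpler stand-in that only illustrates why channel information carries direction, the heuristic below compares the loudness of the left and right stereo channels. The function names, the tolerance value and the three-way left/right/center output are all assumptions, not part of the patent.

```python
import math


def rms(samples):
    """Root-mean-square loudness of a list of audio samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def louder_side(left, right, tolerance=0.1):
    """Return 'left', 'right' or 'center' by comparing per-channel loudness."""
    l, r = rms(left), rms(right)
    total = l + r
    if total == 0.0:
        return "center"
    balance = (r - l) / total  # -1.0 = fully left, +1.0 = fully right
    if balance > tolerance:
        return "right"
    if balance < -tolerance:
        return "left"
    return "center"
```

A trained model could resolve finer directions and cope with background noise, which this two-channel comparison cannot.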
Optionally, the video to be processed includes a multiplayer online battle arena game video. Correspondingly, inputting the audio feature information into the pre-trained sound recognition model to obtain the video feature parameter corresponding to the video to be processed includes: inputting the system voice prompt information in the audio feature information into the pre-trained sound recognition model to obtain the game event keyword corresponding to the multiplayer online battle arena game video.
Illustratively, in a MOBA (Multiplayer Online Battle Arena) game video, the character a player uses may emit specific sounds, and the system issues a voice prompt when a player triggers a specific game event. Specific player operation data can therefore be obtained by inputting the system voice prompt information into the pre-trained sound recognition model. Specifically, when the system voice prompt information of a consecutive-kill voice prompt is input into the pre-trained sound recognition model, the model can output the keyword of the consecutive-kill event, such as the number of consecutive kills by the player.
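Once a system voice prompt has been recognized, turning it into a consecutive-kill keyword is a lookup. The sketch below assumes the prompt is already available as text and uses made-up announcer phrases; real games use their own prompt wording, and the source performs this step with the trained model rather than a table.

```python
# Hypothetical announcer phrases mapped to consecutive kill counts.
KILL_STREAK_PROMPTS = {
    "double kill": 2,
    "triple kill": 3,
    "quadra kill": 4,
    "penta kill": 5,
}


def kill_streak_keyword(prompt_text):
    """Return (keyword, kill_count) for a recognized prompt, or None if no match."""
    normalized = prompt_text.strip().lower()
    for phrase, count in KILL_STREAK_PROMPTS.items():
        if phrase in normalized:
            return phrase, count
    return None
```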
S230: Add the video tag corresponding to the video feature parameter to the video to be processed.
In the technical solution of this embodiment, after the audio feature information in the video to be processed is obtained, it is input into the pre-trained sound recognition model to obtain the video feature parameter corresponding to the video to be processed, and the video tag corresponding to the video feature parameter is added to the video to be processed. Recognizing the video's audio with a sound recognition model yields rich video tags along more dimensions. Besides making video tags richer and improving how well viewers understand the video content, this improves the accuracy and real-time performance of sound recognition and the accuracy of video tag addition.
Embodiment three
Fig. 3 is a structural diagram of a video processing apparatus provided by Embodiment three of the present invention. Referring to Fig. 3, the video processing apparatus includes an information obtaining module 310, a parameter determination module 320 and a tag adding module 330, each of which is described below.
The information obtaining module 310 is configured to obtain audio feature information from the video to be processed, the audio feature information including at least one of channel information, voiceprint information and system voice prompt information;
the parameter determination module 320 is configured to determine the video feature parameter corresponding to the video to be processed according to the audio feature information;
the tag adding module 330 is configured to add the video tag corresponding to the video feature parameter to the video to be processed.
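One way to picture how the three modules fit together is the small pipeline below, in which each method stands for one module. This is a structural sketch under assumed names: the Video container, the dictionary keys for the three kinds of audio feature information, and the injected callables are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Video:
    audio: dict                      # raw audio-derived data keyed by kind
    tags: list = field(default_factory=list)


class VideoProcessor:
    def __init__(self, determine_parameters, tag_for):
        # Both callables are supplied by the caller (e.g. a trained model wrapper).
        self._determine_parameters = determine_parameters
        self._tag_for = tag_for

    def obtain_audio_features(self, video):
        """Role of module 310: keep only the three kinds of audio feature information."""
        allowed = ("channel", "voiceprint", "system_prompt")
        return {k: v for k, v in video.audio.items() if k in allowed}

    def determine_parameters(self, features):
        """Role of module 320: derive video feature parameters from the features."""
        return self._determine_parameters(features)

    def add_tags(self, video, params):
        """Role of module 330: attach one tag per video feature parameter."""
        video.tags.extend(self._tag_for(p) for p in params)
        return video

    def process(self, video):
        features = self.obtain_audio_features(video)
        return self.add_tags(video, self.determine_parameters(features))
```

Passing the parameter logic in as a callable keeps the pipeline independent of whether a keyword filter or a trained sound recognition model does the recognition.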
The video processing apparatus provided in this embodiment obtains the audio feature information in the video to be processed, the audio feature information including at least one of channel information, voiceprint information and system voice prompt information; determines the video feature parameter corresponding to the video to be processed according to the audio feature information; and adds the video label corresponding to the video feature parameter to the video to be processed. By using the audio feature information in the video, richer video labels are obtained. This solves the problem in the prior art that video label content is provided only from the video pictures, which makes the label content too limited and lowers the quality of video understanding. It increases the richness of the video labels and improves the quality of the viewer's understanding of the video content.
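The three modules described above can be pictured as a small pipeline. The following Python sketch is only an illustration under stated assumptions: the `Video` container, the dictionary keys and `toy_recognizer` are hypothetical names that do not appear in the patent, and the recognizer is a placeholder rather than the trained model the embodiments describe.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Video:
    """Hypothetical container for a video to be processed."""
    audio_features: Dict[str, object]
    labels: List[str] = field(default_factory=list)


def obtain_audio_features(video: Video) -> Dict[str, object]:
    """Role of data obtaining module 310: keep only the three kinds of
    audio feature information named in the text (keys are assumed names)."""
    allowed = {"channel", "voiceprint", "system_prompt"}
    return {k: v for k, v in video.audio_features.items() if k in allowed}


def determine_parameters(
    features: Dict[str, object],
    recognizer: Callable[[Dict[str, object]], Dict[str, str]],
) -> Dict[str, str]:
    """Role of parameter determination module 320: map audio feature
    information to video feature parameters via some recognizer."""
    return recognizer(features)


def add_labels(video: Video, params: Dict[str, str]) -> Video:
    """Role of label adding module 330: attach one label per parameter."""
    for name, value in params.items():
        video.labels.append(f"{name}: {value}")
    return video


def toy_recognizer(features: Dict[str, object]) -> Dict[str, str]:
    """Placeholder recognizer (an assumption, not the patent's model):
    compares hypothetical left/right channel loudness values."""
    params: Dict[str, str] = {}
    if "channel" in features:
        left, right = features["channel"]
        params["enemy_direction"] = "left" if left > right else "right"
    return params
```

For example, a video whose hypothetical channel loudness is `(0.9, 0.2)` would receive the single label `enemy_direction: left` after the three steps are run in order.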
Optionally, the parameter determination module 320 may include:
An information input submodule, configured to input the audio feature information into a pre-trained voice recognition model to obtain the video feature parameter corresponding to the video to be processed.
Optionally, the parameter determination module 320 may further include:
A sample acquisition submodule, configured to obtain audio feature information samples carrying target video feature parameter labels before the audio feature information is input into the pre-trained voice recognition model to obtain the video feature parameter corresponding to the video to be processed;
A model training submodule, configured to train a preset artificial intelligence model with the audio feature information samples to obtain the voice recognition model.
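The sample acquisition and model training steps amount to supervised learning on labelled audio feature vectors. As a minimal sketch, the Python code below trains a nearest-centroid classifier; this model choice, the two-dimensional feature vectors and the label strings are assumptions made for illustration, since the text does not specify which artificial intelligence model is used.

```python
from collections import defaultdict
from math import dist
from typing import Dict, List, Sequence, Tuple

# One sample: (audio feature vector, target video feature parameter label).
Sample = Tuple[Sequence[float], str]


def train_nearest_centroid(samples: List[Sample]) -> Dict[str, List[float]]:
    """Average the feature vectors of each label to get one centroid per label."""
    grouped: Dict[str, List[Sequence[float]]] = defaultdict(list)
    for vector, label in samples:
        grouped[label].append(vector)
    return {
        label: [sum(column) / len(vectors) for column in zip(*vectors)]
        for label, vectors in grouped.items()
    }


def predict(model: Dict[str, List[float]], vector: Sequence[float]) -> str:
    """Return the label whose centroid is closest (Euclidean) to the vector."""
    return min(model, key=lambda label: dist(model[label], vector))


# Hypothetical samples: [left loudness, right loudness] with a direction label.
samples: List[Sample] = [
    ([0.9, 0.1], "enemy_left"),
    ([0.8, 0.2], "enemy_left"),
    ([0.1, 0.9], "enemy_right"),
    ([0.2, 0.8], "enemy_right"),
]
model = train_nearest_centroid(samples)
```

A production system would more likely use a neural network, which the patent's classification keywords mention, but the train-then-predict structure stays the same.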
Optionally, the video to be processed includes a shooting game video;
Correspondingly, the information input submodule may specifically be configured to:
Input the channel information in the audio feature information into the pre-trained voice recognition model to obtain the direction of the enemy corresponding to the shooting game video; or,
Input the channel information and the voiceprint information in the audio feature information into the pre-trained voice recognition model to obtain the direction of the enemy and the enemy's shooting distance corresponding to the shooting game video; or,
Input the channel information and the voiceprint information in the audio feature information into the pre-trained voice recognition model to obtain the direction of the enemy, the enemy's shooting distance and the firearm type corresponding to the shooting game video.
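To make the role of channel information concrete, the sketch below estimates a coarse direction from the loudness balance of a stereo gunshot clip. It is a rule-based stand-in and not the trained voice recognition model described above; the function names, the `balance_threshold` value and the direction strings are all assumptions.

```python
from math import sqrt
from typing import Sequence


def rms(samples: Sequence[float]) -> float:
    """Root-mean-square amplitude of one channel (0.0 for an empty clip)."""
    if not samples:
        return 0.0
    return sqrt(sum(s * s for s in samples) / len(samples))


def enemy_direction(
    left: Sequence[float],
    right: Sequence[float],
    balance_threshold: float = 0.2,  # hypothetical tuning value
) -> str:
    """Classify direction from the left/right loudness balance.

    balance is -1 when all energy is in the left channel and +1 when
    all energy is in the right channel.
    """
    l, r = rms(left), rms(right)
    total = l + r
    if total == 0:
        return "unknown"
    balance = (r - l) / total
    if balance > balance_threshold:
        return "right"
    if balance < -balance_threshold:
        return "left"
    return "center"
```

Two channels alone cannot separate front from rear, which is one reason a learned model that also uses voiceprint information, as in the second and third alternatives, may be preferable.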
Optionally, the video to be processed includes a multiplayer online battle arena (MOBA) game video;
Correspondingly, the information input submodule may specifically be configured to:
Input the system voice prompt information in the audio feature information into the pre-trained voice recognition model to obtain the game event keyword corresponding to the MOBA game video.
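The mapping from system voice prompts to game event keywords can be illustrated with a lookup table. The sketch below assumes the prompts have already been transcribed to text, which sidesteps the recognition step the embodiment performs with a trained model; the phrases and keywords in `EVENT_KEYWORDS` are hypothetical examples.

```python
from typing import Dict, List

# Hypothetical mapping from announcer phrases to game event keywords.
EVENT_KEYWORDS: Dict[str, str] = {
    "first blood": "first kill",
    "double kill": "double kill",
    "an ally has been slain": "ally death",
}


def extract_event_keywords(prompt_transcripts: List[str]) -> List[str]:
    """Return each matched event keyword once, in order of first appearance."""
    found: List[str] = []
    for text in prompt_transcripts:
        lowered = text.lower()
        for phrase, keyword in EVENT_KEYWORDS.items():
            if phrase in lowered and keyword not in found:
                found.append(keyword)
    return found
```

The resulting keywords play the role of video feature parameters and can then be turned into video labels.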
Optionally, the label adding module 330 may specifically be configured to:
Obtain the video time period corresponding to the video feature parameter in the video to be processed;
Display the video label corresponding to the video feature parameter in the video display picture corresponding to that video time period.
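Attaching a label to a time period can be modelled as a timeline that the player queries for the current playback position. This is a minimal sketch; the `TimedLabel` type, the half-open interval convention and the example label texts are assumptions, and actual rendering onto the picture is left out.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class TimedLabel:
    """A video label together with the time period (in seconds) it covers."""
    start_s: float
    end_s: float
    text: str


def labels_at(timeline: List[TimedLabel], t: float) -> List[str]:
    """Labels to display at playback time t, using start <= t < end."""
    return [item.text for item in timeline if item.start_s <= t < item.end_s]


# Hypothetical timeline for one video.
timeline = [
    TimedLabel(10.0, 15.0, "enemy: left"),
    TimedLabel(12.0, 20.0, "weapon: rifle"),
]
```

With this timeline, no label is shown at 5 s, both labels are shown at 13 s, and only the second is shown at exactly 15 s.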
Optionally, the video processing apparatus may further include:
A video scoring module, configured to score the video to be processed according to the video label after the video label corresponding to the video feature parameter has been added to the video to be processed;
A video recommendation module, configured to recommend and display the video to be processed according to how high the score is.
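The scoring and recommendation modules can be sketched as a weighted sum over labels followed by a sort. The weights in `LABEL_WEIGHTS`, the default weight and the function names are hypothetical; the text does not state how the score is computed.

```python
from typing import Dict, List, Tuple

# Hypothetical per-label weights; labels not listed get a default weight.
LABEL_WEIGHTS: Dict[str, float] = {"double kill": 2.0, "first kill": 1.0}


def score_video(labels: List[str], default_weight: float = 0.5) -> float:
    """Score a video as the sum of the weights of its labels."""
    return sum(LABEL_WEIGHTS.get(label, default_weight) for label in labels)


def recommend(videos: Dict[str, List[str]], top_n: int = 10) -> List[Tuple[str, float]]:
    """Return (video id, score) pairs, highest score first, at most top_n."""
    scored = [(video_id, score_video(labels)) for video_id, labels in videos.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]
```

Python's sort is stable, so videos with equal scores keep their original relative order.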
The above product can execute the method provided by any embodiment of the present invention, and has the functional modules and beneficial effects corresponding to the executed method.
Embodiment four
Fig. 4 is a schematic structural diagram of a computer device provided by Embodiment four of the present invention. As shown in Fig. 4, the computer device provided by this embodiment includes a processor 41 and a memory 42. The computer device may have one or more processors; one processor 41 is taken as an example in Fig. 4. The processor 41 and the memory 42 in the computer device may be connected by a bus or in other ways; connection by a bus is taken as an example in Fig. 4.
In this embodiment, the video processing apparatus provided by the above embodiment is integrated in the processor 41 of the computer device. In addition, the memory 42 in the computer device, as a computer-readable storage medium, can be used to store one or more programs. The programs may be software programs, computer-executable programs and modules, such as the program instructions/modules corresponding to the video processing method in the embodiments of the present invention (for example, the modules in the video processing apparatus shown in Fig. 3: the data obtaining module 310, the parameter determination module 320 and the label adding module 330). By running the software programs, instructions and modules stored in the memory 42, the processor 41 executes the various functional applications and data processing of the device, that is, implements the video processing method in the above method embodiments.
The memory 42 may include a program storage area and a data storage area, where the program storage area can store an operating system and the application program required by at least one function, and the data storage area can store data created through use of the device. In addition, the memory 42 may include high-speed random access memory, and may also include non-volatile memory, for example at least one magnetic disk storage device, flash memory device or other non-volatile solid-state storage device. In some examples, the memory 42 may further include memory located remotely from the processor 41, and such remote memory may be connected to the device through a network. Examples of such networks include, but are not limited to, the Internet, an intranet, a local area network, a mobile communication network and combinations thereof.
Furthermore, when the one or more programs included in the above computer device are executed by the one or more processors 41, the programs perform the following operations:
Obtaining audio feature information in a video to be processed, the audio feature information including at least one of: channel information, voiceprint information and system voice prompt information; determining, according to the audio feature information, the video feature parameter corresponding to the video to be processed; and adding the video label corresponding to the video feature parameter to the video to be processed.
Embodiment five
Embodiment five of the present invention further provides a computer-readable storage medium on which a computer program is stored. When executed by a video processing apparatus, the program implements the video processing method provided by Embodiment one of the present invention, the method including: obtaining audio feature information in a video to be processed, the audio feature information including at least one of: channel information, voiceprint information and system voice prompt information; determining, according to the audio feature information, the video feature parameter corresponding to the video to be processed; and adding the video label corresponding to the video feature parameter to the video to be processed.
Of course, for the computer-readable storage medium provided by the embodiments of the present invention, the computer program stored thereon is not limited, when executed, to the method operations described above, and can also perform the relevant operations in the video processing method provided by any embodiment of the present invention.
From the above description of the embodiments, those skilled in the art can clearly understand that the present invention can be implemented by software together with the necessary general-purpose hardware, and of course can also be implemented by hardware alone, but in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product can be stored in a computer-readable storage medium, such as a computer floppy disk, read-only memory (ROM), random access memory (RAM), flash memory (FLASH), hard disk or optical disc, and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device or the like) to execute the methods described in the embodiments of the present invention.
It is worth noting that, in the above embodiment of the video processing apparatus, the units and modules included are divided only according to functional logic, but the division is not limited to the above as long as the corresponding functions can be realized. In addition, the specific names of the functional units are only for ease of distinguishing them from one another and are not intended to limit the protection scope of the present invention.
Note that the above are only preferred embodiments of the present invention and the technical principles applied. Those skilled in the art will understand that the present invention is not limited to the specific embodiments described herein, and that various obvious changes, readjustments and substitutions can be made by those skilled in the art without departing from the protection scope of the present invention. Therefore, although the present invention has been described in further detail through the above embodiments, it is not limited to them; it may include other equivalent embodiments without departing from the inventive concept, and the scope of the present invention is determined by the scope of the appended claims.
Claims (10)
1. A video processing method, characterized by comprising:
obtaining audio feature information in a video to be processed, the audio feature information comprising at least one of: channel information, voiceprint information and system voice prompt information;
determining, according to the audio feature information, a video feature parameter corresponding to the video to be processed;
adding a video label corresponding to the video feature parameter to the video to be processed.
2. The method according to claim 1, characterized in that determining, according to the audio feature information, the video feature parameter corresponding to the video to be processed comprises:
inputting the audio feature information into a pre-trained voice recognition model to obtain the video feature parameter corresponding to the video to be processed.
3. The method according to claim 2, characterized in that, before inputting the audio feature information into the pre-trained voice recognition model to obtain the video feature parameter corresponding to the video to be processed, the method further comprises:
obtaining audio feature information samples carrying target video feature parameter labels;
training a preset artificial intelligence model with the audio feature information samples to obtain the voice recognition model.
4. The method according to claim 2, characterized in that the video to be processed comprises a shooting game video;
correspondingly, inputting the audio feature information into the pre-trained voice recognition model to obtain the video feature parameter corresponding to the video to be processed comprises:
inputting the channel information in the audio feature information into the pre-trained voice recognition model to obtain the direction of the enemy corresponding to the shooting game video; or,
inputting the channel information and the voiceprint information in the audio feature information into the pre-trained voice recognition model to obtain the direction of the enemy and the enemy's shooting distance corresponding to the shooting game video; or,
inputting the channel information and the voiceprint information in the audio feature information into the pre-trained voice recognition model to obtain the direction of the enemy, the enemy's shooting distance and the firearm type corresponding to the shooting game video.
5. The method according to claim 2, characterized in that the video to be processed comprises a multiplayer online battle arena (MOBA) game video;
correspondingly, inputting the audio feature information into the pre-trained voice recognition model to obtain the video feature parameter corresponding to the video to be processed comprises:
inputting the system voice prompt information in the audio feature information into the pre-trained voice recognition model to obtain the game event keyword corresponding to the MOBA game video.
6. The method according to claim 1, characterized in that adding the video label corresponding to the video feature parameter to the video to be processed comprises:
obtaining the video time period corresponding to the video feature parameter in the video to be processed;
displaying the video label corresponding to the video feature parameter in the video display picture corresponding to the video time period.
7. The method according to claim 1, characterized in that, after adding the video label corresponding to the video feature parameter to the video to be processed, the method further comprises:
scoring the video to be processed according to the video label;
recommending and displaying the video to be processed according to how high the score is.
8. A video processing apparatus, characterized by comprising:
a data obtaining module, configured to obtain audio feature information in a video to be processed, the audio feature information comprising at least one of: channel information, voiceprint information and system voice prompt information;
a parameter determination module, configured to determine, according to the audio feature information, a video feature parameter corresponding to the video to be processed;
a label adding module, configured to add a video label corresponding to the video feature parameter to the video to be processed.
9. A computer device, characterized in that the device comprises:
one or more processors;
a memory, configured to store one or more programs;
wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the video processing method according to any one of claims 1-7.
10. A computer-readable storage medium on which a computer program is stored, characterized in that, when executed by a processor, the program implements the video processing method according to any one of claims 1-7.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201910037302.9A CN109640112B (en) | 2019-01-15 | 2019-01-15 | Video processing method, device, equipment and storage medium |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN109640112A true CN109640112A (en) | 2019-04-16 |
| CN109640112B CN109640112B (en) | 2021-11-23 |
Family
ID=66061982
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN201910037302.9A Active CN109640112B (en) | 2019-01-15 | 2019-01-15 | Video processing method, device, equipment and storage medium |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN109640112B (en) |
Citations (13)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US6697564B1 (en) * | 2000-03-03 | 2004-02-24 | Siemens Corporate Research, Inc. | Method and system for video browsing and editing by employing audio |
| US20120245932A1 (en) * | 2009-11-06 | 2012-09-27 | Kazushige Ouchi | Voice recognition apparatus |
| CN104978963A (en) * | 2014-04-08 | 2015-10-14 | 富士通株式会社 | Speech recognition apparatus, method and electronic equipment |
| CN107357875A (en) * | 2017-07-04 | 2017-11-17 | 北京奇艺世纪科技有限公司 | A kind of voice search method, device and electronic equipment |
| CN107483879A (en) * | 2016-06-08 | 2017-12-15 | 中兴通讯股份有限公司 | Video marker method, apparatus and video frequency monitoring method and system |
| CN107507625A (en) * | 2016-06-14 | 2017-12-22 | 讯飞智元信息科技有限公司 | Sound source distance determines method and device |
| CN107527617A (en) * | 2017-09-30 | 2017-12-29 | 上海应用技术大学 | Monitoring method, apparatus and system based on voice recognition |
| CN107770614A (en) * | 2016-08-18 | 2018-03-06 | 中国电信股份有限公司 | The label producing method and device of content of multimedia |
| CN108563670A (en) * | 2018-01-12 | 2018-09-21 | 武汉斗鱼网络科技有限公司 | Video recommendation method, device, server and computer readable storage medium |
| CN108806668A (en) * | 2018-06-08 | 2018-11-13 | 国家计算机网络与信息安全管理中心 | A kind of audio and video various dimensions mark and model optimization method |
| CN108962216A (en) * | 2018-06-12 | 2018-12-07 | 北京市商汤科技开发有限公司 | A kind of processing method and processing device, equipment and the storage medium of video of speaking |
| CN109126132A (en) * | 2018-08-02 | 2019-01-04 | Oppo广东移动通信有限公司 | Game role position prompting method and device, storage medium and electronic equipment |
| CN109166586A (en) * | 2018-08-02 | 2019-01-08 | 平安科技(深圳)有限公司 | A kind of method and terminal identifying speaker |
Cited By (9)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN110677722A (en) * | 2019-09-29 | 2020-01-10 | 上海依图网络科技有限公司 | Video processing method, and apparatus, medium, and system thereof |
| CN111031392A (en) * | 2019-12-23 | 2020-04-17 | 广州视源电子科技股份有限公司 | Media file playing method, system, device, storage medium and processor |
| CN111447489A (en) * | 2020-04-02 | 2020-07-24 | 北京字节跳动网络技术有限公司 | Video processing method and device, readable medium and electronic equipment |
| CN111885414A (en) * | 2020-07-24 | 2020-11-03 | 腾讯科技(深圳)有限公司 | Data processing method, device and equipment and readable storage medium |
| WO2022017083A1 (en) * | 2020-07-24 | 2022-01-27 | 腾讯科技(深圳)有限公司 | Data processing method and apparatus, device, and readable storage medium |
| CN114095738A (en) * | 2020-07-30 | 2022-02-25 | 京东方科技集团股份有限公司 | Video and live broadcast processing method, live broadcast system, electronic device, terminal and medium |
| US11956510B2 (en) | 2020-07-30 | 2024-04-09 | Boe Technology Group Co., Ltd. | Video processing method, live streaming processing method, live streaming system, electronic device, terminal, and medium |
| CN111901668A (en) * | 2020-09-07 | 2020-11-06 | 三星电子(中国)研发中心 | Video playback method and device |
| CN113038175A (en) * | 2021-02-26 | 2021-06-25 | 北京百度网讯科技有限公司 | Video processing method and device, electronic equipment and computer readable storage medium |
Also Published As
| Publication number | Publication date |
|---|---|
| CN109640112B (en) | 2021-11-23 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | GR01 | Patent grant | |