
CN109640112A - Video processing method, apparatus, device and storage medium - Google Patents

Video processing method, apparatus, device and storage medium

Info

Publication number
CN109640112A
Authority
CN
China
Prior art keywords
video
processed
audio feature
feature information
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910037302.9A
Other languages
Chinese (zh)
Other versions
CN109640112B (en)
Inventor
乔文彤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Huya Information Technology Co Ltd
Original Assignee
Guangzhou Huya Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Huya Information Technology Co Ltd
Priority to CN201910037302.9A
Publication of CN109640112A
Application granted
Publication of CN109640112B
Active
Anticipated expiration

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20: Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23: Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/233: Processing of audio elementary streams
    • H04N21/234: Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N21/23418: Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics
    • H04N21/235: Processing of additional data, e.g. scrambling of additional data or processing content descriptors
    • H04N21/40: Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43: Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/435: Processing of additional data, e.g. decrypting of additional data, reconstructing software from modules extracted from the transport stream
    • H04N21/439: Processing of audio elementary streams
    • H04N21/4394: Processing of audio elementary streams involving operations for analysing the audio stream, e.g. detecting features or characteristics in audio streams
    • H04N21/44: Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/44008: Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • User Interface Of Digital Computer (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

Embodiments of the invention disclose a video processing method, apparatus, device and storage medium. The method includes: obtaining audio feature information from a video to be processed, the audio feature information including at least one of channel information, voiceprint information and system voice prompt information; determining a video feature parameter corresponding to the video to be processed according to the audio feature information; and adding a video tag corresponding to the video feature parameter to the video to be processed. The technical solution of the invention enriches video tags and improves how well viewers understand the video content.

Description

Video processing method, apparatus, device and storage medium
Technical field
Embodiments of the present invention relate to video processing technology, and in particular to a video processing method, apparatus, device and storage medium.
Background
As online video develops and video content grows richer, users' expectations of the viewing experience keep rising.
In the prior art, tags are extracted from game videos mainly by recognizing the video picture. The resulting video tags are therefore limited to what the picture can show, which makes the tag content too narrow and reduces how well viewers understand a game video while watching it.
Summary of the invention
Embodiments of the present invention provide a video processing method, apparatus, device and storage medium, so as to enrich video tags and improve how well viewers understand the video content.
In a first aspect, an embodiment of the present invention provides a video processing method, including:
obtaining audio feature information from a video to be processed, the audio feature information including at least one of channel information, voiceprint information and system voice prompt information;
determining a video feature parameter corresponding to the video to be processed according to the audio feature information;
adding a video tag corresponding to the video feature parameter to the video to be processed.
In a second aspect, an embodiment of the present invention further provides a video processing apparatus, including:
an information obtaining module, configured to obtain audio feature information from a video to be processed, the audio feature information including at least one of channel information, voiceprint information and system voice prompt information;
a parameter determining module, configured to determine a video feature parameter corresponding to the video to be processed according to the audio feature information;
a tag adding module, configured to add a video tag corresponding to the video feature parameter to the video to be processed.
In a third aspect, an embodiment of the present invention further provides a computer device, including:
one or more processors;
a memory, configured to store one or more programs;
when the one or more programs are executed by the one or more processors, the one or more processors implement the video processing method described in any embodiment of the present invention.
In a fourth aspect, an embodiment of the present invention further provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the video processing method described in any embodiment of the present invention.
The embodiments of the present invention obtain audio feature information from the video to be processed, the audio feature information including at least one of channel information, voiceprint information and system voice prompt information; determine the video feature parameter corresponding to the video to be processed according to the audio feature information; and add the video tag corresponding to the video feature parameter to the video to be processed. Richer video tags are thus obtained from the audio feature information in the video. This solves the prior-art problem that video tag content is derived only from the video picture, which makes the tag content too narrow and reduces how well the video is understood; it enriches the video tags and improves viewers' understanding of the video content.
Brief description of the drawings
Fig. 1a is a flow diagram of a video processing method provided by Embodiment One of the present invention;
Fig. 1b is a schematic diagram of a video tag display mode applicable to Embodiment One of the present invention;
Fig. 2 is a flow diagram of a video processing method provided by Embodiment Two of the present invention;
Fig. 3 is a structural diagram of a video processing apparatus provided by Embodiment Three of the present invention;
Fig. 4 is a structural diagram of a computer device provided by Embodiment Four of the present invention.
Detailed description
The present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the present invention and do not limit it. It should also be noted that, for ease of description, the drawings show only the parts related to the present invention rather than the entire structure.
Embodiment one
Fig. 1a is a flow diagram of a video processing method provided by Embodiment One of the present invention. The method is applicable to tagging video content and can be executed by a video processing apparatus. The apparatus can be implemented in hardware and/or software and can generally be integrated in a server or in any computer device with video processing capability. The method specifically includes the following steps:
S110: Obtain audio feature information from the video to be processed, the audio feature information including at least one of channel information, voiceprint information and system voice prompt information.
This embodiment mainly targets simple tags that currently cannot be recognized from images, and uses the audio feature information of the video to identify them accurately, so that tags are enriched along multiple dimensions while a video is processed or live-streamed. A video tag may be keyword information used to annotate the highlights of a video.
In this embodiment, the video to be processed may be, for example, a game video; it may be a recorded video clip or a live video stream that is currently being broadcast, and no limitation is imposed here. The audio feature information in the video to be processed may be the sound data of the video, for example the audio track of a game video, and includes at least one of channel information, voiceprint information and system voice prompt information. The channel information may be multi-dimensional sound information with stereo channels; the voiceprint information may be information such as the volume and acoustic characteristics of a sound; and the system voice prompt information may be, for example, the prompt sound that the system issues when a key event is triggered.
S120: Determine the video feature parameter corresponding to the video to be processed according to the audio feature information.
In this embodiment, a video feature parameter may be a parameter that characterizes the content of a key event in the video, for example data identifying a player's highlight moment in a game video. Because some operation data is never displayed directly on the game video picture, this part of the video feature parameters can be obtained by analyzing the audio feature information of the video. For example, when an enemy player shoots in a game video, the enemy shooting distance is not shown explicitly on the picture and cannot be judged directly from it, but it can be determined from the loudness of the gunshot.
Illustratively, a preset algorithm can be used to recognize the audio feature information extracted from the video to be processed, and the video feature parameter corresponding to the video to be processed is obtained from the recognition result. For example, the extracted audio feature information can be converted to text by recognition, and the text or data in the recognition result that matches preset keywords is filtered out and used as the video feature parameter of the video to be processed.
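The keyword-filtering step described above can be sketched as follows. This is a minimal illustration only: the source does not list the preset keywords, so the patterns and parameter names below (`KEYWORD_PATTERNS`, `enemy_shooting_distance_m`, `kill_streak`) are hypothetical, and the recognized text is assumed to come from a separate speech-to-text step that is not shown.

```python
import re

# Hypothetical preset keywords; the source does not specify concrete ones.
KEYWORD_PATTERNS = {
    "enemy_shooting_distance_m": re.compile(r"(\d+)\s*(?:m|meters?)\b"),
    "kill_streak": re.compile(r"\b(double|triple|quadra|penta) kill\b"),
}


def extract_feature_parameters(recognized_text):
    """Keep only the parts of the recognized text that match a preset keyword."""
    text = recognized_text.lower()
    parameters = {}
    for name, pattern in KEYWORD_PATTERNS.items():
        match = pattern.search(text)
        if match:
            parameters[name] = match.group(1)
    return parameters
```

Text that matches none of the preset keywords yields an empty result, so no video feature parameter is produced for it.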
The benefit of determining the video feature parameter from the audio feature information is that more helpful tags can be extracted more comprehensively and along more dimensions, including tag dimensions that image recognition cannot reach, which substantially improves how well viewers understand the video content.
S130: Add the video tag corresponding to the video feature parameter to the video to be processed.
In this embodiment, different video feature parameters can correspond to different video tags, and the video feature parameters are converted into tags. For example, if the video feature parameter identified for the video to be processed is an enemy shooting distance of 100 meters, the corresponding video tag can be "100". One video feature parameter can correspond to one video tag, or several video feature parameters can jointly correspond to one video tag; no limitation is imposed here.
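The two correspondences mentioned above (one parameter to one tag, and several parameters combined into one tag) might look like the sketch below. The parameter names and tag wording are assumptions made for illustration; the source does not prescribe a tag format.

```python
from typing import Dict, List


def tags_for_parameters(parameters: Dict[str, str]) -> List[str]:
    """Turn video feature parameters into video tags (hypothetical formats)."""
    remaining = dict(parameters)
    tags = []
    # Several parameters jointly correspond to a single tag.
    if "enemy_direction" in remaining and "enemy_shooting_distance_m" in remaining:
        direction = remaining.pop("enemy_direction")
        distance = remaining.pop("enemy_shooting_distance_m")
        tags.append("enemy {} at {} m".format(direction, distance))
    # Every other parameter corresponds to its own tag.
    for name, value in remaining.items():
        tags.append("{}: {}".format(name, value))
    return tags
```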
In one optional implementation, the video tag can be shown as a keyword of the video to be processed below the display interface of that video, so that viewers can choose videos they are interested in according to the video tags.
In another optional implementation, adding the video tag corresponding to the video feature parameter to the video to be processed includes: obtaining the video time segment in the video to be processed that corresponds to the video feature parameter; and displaying the video tag corresponding to the video feature parameter in the video display picture corresponding to that video time segment.
Illustratively, the video playback time at which the video feature parameter was obtained can be recorded, a preset playback period following that playback time can be derived from it, and this period is used as the video time segment in the video to be processed that corresponds to the video feature parameter. The corresponding video tag is added to and displayed in the video display picture corresponding to the video time segment, to help viewers better understand the video content.
As a concrete example, in Fig. 1b, when the video playback time is 3 minutes 05 seconds, the video feature parameter is identified as an enemy shooting distance of 100 meters; video tag 11 is then displayed in video display picture 1 corresponding to the playback period from 3 minutes 05 seconds to 3 minutes 35 seconds.
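The time-segment calculation in this example can be sketched as below. The 30-second default simply mirrors the Fig. 1b example (3:05 to 3:35); the source only says the period is preset, and the function names are hypothetical.

```python
def tag_display_window(recognized_at_s, hold_s=30.0):
    """Return (start, end) in seconds of the segment in which the tag is shown.

    recognized_at_s: playback time at which the feature parameter was obtained.
    hold_s: preset display period (30 s is an assumption taken from Fig. 1b).
    """
    return (recognized_at_s, recognized_at_s + hold_s)


def tag_is_visible(playback_s, window):
    """True while the current playback time falls inside the tag's segment."""
    start, end = window
    return start <= playback_s <= end


# Fig. 1b example: parameter recognized at 3 min 05 s.
window = tag_display_window(3 * 60 + 5)
```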
On the basis of the above embodiments, optionally, after the video tag corresponding to the video feature parameter is added to the video to be processed, the method further includes: scoring the video to be processed according to its video tags; and recommending and displaying the video to be processed according to its score.
Specifically, each video tag can have a corresponding score value; the score values of the video tags added to the video to be processed are summed, the result is used as the score of that video, and videos with high scores are recommended and displayed preferentially. Alternatively, video tags can be divided into types with different weights; when the score is calculated, the score value of each video tag is multiplied by the weight of the type it belongs to, and the products are summed over all video tags added to the video to be processed. Either approach is applicable to this embodiment, and no limitation is imposed here.
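Both scoring approaches, and the score-ordered recommendation, can be sketched as follows. The tag names, score values, tag types and weights used in the tests are invented for illustration; the source gives no concrete values.

```python
from typing import Dict, List


def plain_score(tags: List[str], tag_scores: Dict[str, int]) -> int:
    """First approach: sum the score value of every tag added to the video."""
    return sum(tag_scores[tag] for tag in tags)


def weighted_score(tags, tag_scores, tag_types, type_weights):
    """Second approach: multiply each tag's score by the weight of its type."""
    return sum(type_weights[tag_types[tag]] * tag_scores[tag] for tag in tags)


def recommend(video_scores: Dict[str, int]) -> List[str]:
    """Order video ids so that higher-scoring videos are recommended first."""
    return sorted(video_scores, key=video_scores.get, reverse=True)
```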
The technical solution of this embodiment obtains audio feature information from the video to be processed, the audio feature information including at least one of channel information, voiceprint information and system voice prompt information; determines the video feature parameter corresponding to the video to be processed according to the audio feature information; and adds the video tag corresponding to the video feature parameter to the video to be processed. Richer video tags are thus obtained from the audio feature information in the video. This solves the prior-art problem that video tag content is derived only from the video picture, which makes the tag content too narrow and reduces how well the video is understood; it enriches the video tags and improves viewers' understanding of the video content.
Embodiment two
Fig. 2 is a flow diagram of a video processing method provided by Embodiment Two of the present invention. This embodiment refines the above embodiment and provides a preferred video processing method. Specifically, determining the video feature parameter corresponding to the video to be processed according to the audio feature information is further refined as: inputting the audio feature information into a pre-trained sound recognition model to obtain the video feature parameter corresponding to the video to be processed.
The video processing method provided by this embodiment specifically includes the following steps:
S210: Obtain audio feature information from the video to be processed, the audio feature information including at least one of channel information, voiceprint information and system voice prompt information.
S220: Input the audio feature information into the pre-trained sound recognition model to obtain the video feature parameter corresponding to the video to be processed.
In this embodiment, the audio feature information can first be vectorized, and the resulting feature vectors are then input into the pre-trained sound recognition model. The sound recognition model recognizes the input audio feature information and outputs the corresponding video feature parameter. Specifically, the sound recognition model can be a model trained with a preset machine learning algorithm.
The sound recognition model can work as follows: when audio feature information is input, the model performs sound recognition on it, analyzes the recognized feature information, and judges whether the input audio feature information contains a corresponding feature parameter. If it does, the feature parameter is output as the video feature parameter of the video to be processed; if not, nothing is output. For example, when a game video containing an enemy gunshot is input into the sound recognition model, the model performs sound recognition and feature analysis on the audio feature information of the video and can then output the corresponding enemy shooting distance.
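A minimal sketch of the vectorize-then-recognize flow is given below. The source does not say how the audio is vectorized, so the per-frame features chosen here (RMS energy and zero-crossing rate) are an assumption, as are the function names; `model` stands for any pre-trained sound recognition model exposed as a callable that returns a feature parameter or `None`.

```python
import math


def frame_features(samples, frame_size=4):
    """Split samples into frames and describe each frame by a small vector.

    Each vector is [rms_energy, zero_crossing_rate]; this feature choice is
    an illustrative assumption, not taken from the source.
    """
    vectors = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        rms = math.sqrt(sum(s * s for s in frame) / frame_size)
        crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
        vectors.append([rms, crossings / (frame_size - 1)])
    return vectors


def recognize(model, samples):
    """Vectorize the audio, then ask the model for a video feature parameter.

    Returns None ("no output") when there is nothing to recognize or the
    model finds no corresponding feature parameter.
    """
    vectors = frame_features(samples)
    if not vectors:
        return None
    return model(vectors)
```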
The benefit of performing sound recognition with a sound recognition model in this embodiment is that it can improve the accuracy and real-time performance of sound recognition, and in turn the accuracy of the video tags added during the tagging process.
Optionally, before the audio feature information is input into the pre-trained sound recognition model to obtain the video feature parameter corresponding to the video to be processed, the method further includes: obtaining audio feature information samples labeled with target video feature parameters; and training a set artificial intelligence model with the audio feature information samples to obtain the sound recognition model.
The audio feature information samples can be extracted from live videos on a live-streaming platform, or downloaded from the Internet through a particular search engine; no limitation is imposed here. Taking extraction from live videos on a live-streaming platform as an example: multiple game live rooms are searched for on the target live-streaming platform, several audio segments with typical sound features are extracted from each of these rooms, and the extracted segments are labeled with the corresponding video feature parameters to obtain the audio feature information samples. The samples can be labeled by manual evaluation, that is, the audio segments with typical sound features obtained from each live room are manually labeled with the corresponding video feature parameter, so that they serve as audio feature information samples for the different video feature parameters.
In this embodiment, the set artificial intelligence model can be a training model built on a machine learning algorithm, such as a recurrent neural network (RNN). An RNN is an artificial neural network whose nodes are connected in a directed cycle, and the internal state of such a network can exhibit dynamic temporal behavior. Unlike a feedforward neural network, an RNN can use its internal memory to process input sequences of arbitrary timing, which makes it better suited to tasks such as unsegmented handwriting recognition and speech recognition. Training the artificial intelligence model can be a process of adjusting the neural network parameters: through repeated training the optimal parameters are obtained, and the set artificial intelligence model with those optimal parameters is the model finally sought. Illustratively, after multiple audio feature information samples labeled with target video feature parameters are obtained, the set artificial intelligence model is trained with these samples, and its neural network parameters are adjusted continuously until the model is able to identify the target video feature parameter from input audio feature information, thereby yielding the sound recognition model.
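The "internal memory" of an RNN can be made concrete with the forward pass of a plain (Elman-style) recurrent layer, sketched below in pure Python. This shows only the recurrence h_t = tanh(W_in x_t + W_rec h_{t-1} + b); training, i.e. adjusting `w_in`, `w_rec` and `bias`, is not shown, and the source does not state which RNN variant or framework is used, so this specific form is an assumption.

```python
import math


def rnn_forward(inputs, w_in, w_rec, bias):
    """Run a sequence of feature vectors through one recurrent layer.

    inputs: list of input vectors x_t
    w_in:   hidden_size x input_size weight matrix
    w_rec:  hidden_size x hidden_size recurrent weight matrix
    bias:   hidden_size bias vector
    Returns the final hidden state, which depends on the whole sequence.
    """
    hidden = [0.0] * len(bias)
    for x in inputs:
        # The new state is computed from the current input and the old state.
        hidden = [
            math.tanh(
                sum(w_in[j][i] * x[i] for i in range(len(x)))
                + sum(w_rec[j][k] * hidden[k] for k in range(len(hidden)))
                + bias[j]
            )
            for j in range(len(bias))
        ]
    return hidden
```

Because the previous hidden state feeds into each step, the same inputs presented in a different order generally produce a different final state, which is what lets the network model temporal behavior in audio.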
Optionally, the video to be processed includes a shooting game video. Correspondingly, inputting the audio feature information into the pre-trained sound recognition model to obtain the video feature parameter corresponding to the video to be processed includes: inputting the channel information in the audio feature information into the pre-trained sound recognition model to obtain the direction of the enemy in the shooting game video; or inputting the channel information and the voiceprint information in the audio feature information into the pre-trained sound recognition model to obtain the direction of the enemy and the enemy shooting distance in the shooting game video; or inputting the channel information and the voiceprint information in the audio feature information into the pre-trained sound recognition model to obtain the direction of the enemy, the enemy shooting distance and the firearm type in the shooting game video.
Illustratively, in a shooting game video, the sound of gunfire can be identified through the channels and the voiceprint, so specific shooting data can be obtained by inputting the channel information and/or the voiceprint information into the pre-trained sound recognition model. Specifically, when the channel information of enemy shooting audio is input into the sound recognition model, the model can output the direction of the enemy; when the voiceprint information of enemy shooting audio is input into the sound recognition model, the model can output the enemy shooting distance and/or the type of firearm used by the enemy; and when the voiceprint information of friendly shooting audio is input into the sound recognition model, the model can output the type of firearm used by the friendly side.
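To illustrate why channel information carries direction, the sketch below compares the energy of the left and right stereo channels. Note that this is a hand-written heuristic used purely as a stand-in: the source obtains the direction from a trained sound recognition model, not from a fixed rule, and the function names and the `tolerance` threshold are hypothetical.

```python
import math


def rms(samples):
    """Root-mean-square energy of one channel (0.0 for an empty channel)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def coarse_direction(left, right, tolerance=0.1):
    """Guess a coarse direction from the left/right energy balance.

    Returns "left", "right", "center", or None when both channels are silent.
    """
    left_energy, right_energy = rms(left), rms(right)
    total = left_energy + right_energy
    if total == 0:
        return None
    balance = (right_energy - left_energy) / total
    if balance > tolerance:
        return "right"
    if balance < -tolerance:
        return "left"
    return "center"
```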
Optionally, the video to be processed includes a multiplayer online battle arena game video. Correspondingly, inputting the audio feature information into the pre-trained sound recognition model to obtain the video feature parameter corresponding to the video to be processed includes: inputting the system voice prompt information in the audio feature information into the pre-trained sound recognition model to obtain the game event keyword corresponding to the multiplayer online battle arena game video.
Illustratively, in a MOBA (Multiplayer Online Battle Arena) game video, the system issues a voice prompt when the character used by a player emits a specific sound or when the player triggers a specific game event. Specific player operation data can therefore be obtained by inputting the system voice prompt information into the pre-trained sound recognition model. Specifically, when system voice prompt information containing a kill-streak voice prompt is input into the pre-trained sound recognition model, the model can output the keyword of the kill-streak event, such as the number of consecutive kills by the player.
S230: Add the video tag corresponding to the video feature parameter to the video to be processed.
In the technical solution of this embodiment, after the audio feature information in the video to be processed is obtained, it is input into the pre-trained sound recognition model to obtain the video feature parameter corresponding to the video to be processed, and the video tag corresponding to the video feature parameter is added to the video to be processed. Using the sound recognition model to recognize the video audio yields rich video tags along more dimensions. This enriches the video tags and improves viewers' understanding of the video content, while also improving the accuracy and real-time performance of sound recognition and the accuracy of the added video tags.
Embodiment three
Fig. 3 is a structural diagram of a video processing apparatus provided by Embodiment Three of the present invention. Referring to Fig. 3, the video processing apparatus includes an information obtaining module 310, a parameter determining module 320 and a tag adding module 330. Each module is described below.
The information obtaining module 310 is configured to obtain audio feature information from the video to be processed, the audio feature information including at least one of channel information, voiceprint information and system voice prompt information;
the parameter determining module 320 is configured to determine the video feature parameter corresponding to the video to be processed according to the audio feature information;
the tag adding module 330 is configured to add the video tag corresponding to the video feature parameter to the video to be processed.
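The way the three modules cooperate can be sketched as a small class that chains them. The class name, the dictionary-based data shapes and the stub modules are all assumptions made for illustration; real modules would extract audio from the video and call the recognition logic.

```python
from typing import Callable, Dict


class VideoProcessingApparatus:
    """Chains the three modules of Fig. 3 (hypothetical class and signatures)."""

    def __init__(
        self,
        information_obtaining: Callable[[Dict], Dict],
        parameter_determining: Callable[[Dict], Dict],
        tag_adding: Callable[[Dict, Dict], Dict],
    ) -> None:
        self.information_obtaining = information_obtaining  # module 310
        self.parameter_determining = parameter_determining  # module 320
        self.tag_adding = tag_adding                        # module 330

    def process(self, video: Dict) -> Dict:
        audio_features = self.information_obtaining(video)
        parameters = self.parameter_determining(audio_features)
        return self.tag_adding(video, parameters)


# Stub modules, for illustration only.
def obtain(video):
    return {"system_voice_prompt": video.get("audio_text", "")}


def determine(audio_features):
    prompt = audio_features["system_voice_prompt"]
    return {"event": prompt} if prompt else {}


def add(video, parameters):
    tagged = dict(video)
    tagged["tags"] = list(parameters.values())
    return tagged


apparatus = VideoProcessingApparatus(obtain, determine, add)
result = apparatus.process({"id": "v1", "audio_text": "triple kill"})
```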
The video processing apparatus provided by this embodiment obtains audio feature information from the video to be processed, the audio feature information including at least one of channel information, voiceprint information and system voice prompt information; determines the video feature parameter corresponding to the video to be processed according to the audio feature information; and adds the video tag corresponding to the video feature parameter to the video to be processed. Richer video tags are thus obtained from the audio feature information in the video. This solves the prior-art problem that video tag content is derived only from the video picture, which makes the tag content too narrow and reduces how well the video is understood; it enriches the video tags and improves viewers' understanding of the video content.
Optionally, parameter determination module 320 may include:
Information input submodule, for the audio feature information to be input in voice recognition model trained in advance, Obtain the corresponding video features parameter of the video to be processed.
Optionally, parameter determination module 320 can also include:
A sample acquisition submodule, configured to obtain audio feature information samples labeled with target video feature parameters before the audio feature information is input into the pre-trained sound recognition model to obtain the video feature parameter corresponding to the video to be processed;
A model training submodule, configured to train a preset artificial intelligence model using the audio feature information samples to obtain the sound recognition model.
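The sample acquisition and model training steps can be illustrated with a deliberately simple stand-in model. The text does not specify which artificial intelligence model is used, so the sketch below assumes a nearest-centroid classifier over numeric feature vectors; the function names, the two-dimensional vectors, and the label strings are all hypothetical.

```python
from collections import defaultdict
from math import dist
from typing import Dict, List, Sequence, Tuple

# One training sample: (audio feature vector, target video feature parameter label).
Sample = Tuple[Sequence[float], str]


def train_nearest_centroid(samples: List[Sample]) -> Dict[str, List[float]]:
    """Average the feature vectors of each label to obtain one centroid per label."""
    grouped: Dict[str, List[Sequence[float]]] = defaultdict(list)
    for vector, label in samples:
        grouped[label].append(vector)
    return {
        label: [sum(column) / len(vectors) for column in zip(*vectors)]
        for label, vectors in grouped.items()
    }


def predict(model: Dict[str, List[float]], vector: Sequence[float]) -> str:
    """Return the label whose centroid is closest to the given feature vector."""
    return min(model, key=lambda label: dist(model[label], vector))


# Hypothetical labeled samples: (left-channel energy, right-channel energy).
samples = [
    ((0.9, 0.1), "enemy on the left"),
    ((0.8, 0.2), "enemy on the left"),
    ((0.1, 0.9), "enemy on the right"),
    ((0.2, 0.8), "enemy on the right"),
]
model = train_nearest_centroid(samples)
```

Calling `predict(model, (0.7, 0.3))` then plays the role of the information input submodule. A production system would more plausibly use a neural network or another learned classifier; the centroid model is chosen here only because it fits in a few lines.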
Optionally, the video to be processed includes a shooting game video;
Correspondingly, the information input submodule may be specifically configured to:
Input the channel information in the audio feature information into the pre-trained sound recognition model to obtain the direction of the enemy in the shooting game video; or,
Input the channel information and the voiceprint information in the audio feature information into the pre-trained sound recognition model to obtain the direction of the enemy and the enemy's shooting distance in the shooting game video; or,
Input the channel information and the voiceprint information in the audio feature information into the pre-trained sound recognition model to obtain the direction of the enemy, the enemy's shooting distance, and the firearm type in the shooting game video.
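To give an intuition for why channel information can indicate the enemy's direction, the sketch below compares the loudness of the left and right stereo channels. Note that this is a hand-written heuristic substituted for the trained recognition model described above, not the method of the embodiment; the function names, the `margin` threshold, and the three direction strings are assumptions made for illustration.

```python
from math import sqrt
from typing import Sequence


def rms(samples: Sequence[float]) -> float:
    """Root-mean-square amplitude of one channel; 0.0 for an empty channel."""
    if not samples:
        return 0.0
    return sqrt(sum(s * s for s in samples) / len(samples))


def estimate_direction(left: Sequence[float], right: Sequence[float], margin: float = 0.2) -> str:
    """Guess where a sound comes from by comparing left and right channel loudness.

    The balance ranges from -1 (sound only in the left channel) to +1 (only in
    the right channel); values within +/- margin are treated as "center".
    """
    left_level, right_level = rms(left), rms(right)
    total = left_level + right_level
    if total == 0:
        return "unknown"
    balance = (right_level - left_level) / total
    if balance > margin:
        return "right"
    if balance < -margin:
        return "left"
    return "center"
```

A gunshot that is much louder in the left channel, such as `estimate_direction([0.8, -0.8], [0.1, -0.1])`, is classified as coming from the left. Estimating shooting distance or firearm type would additionally need the voiceprint information and, as the text states, a trained model.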
Optionally, the video to be processed includes a multiplayer online battle arena (MOBA) game video;
Correspondingly, the information input submodule may be specifically configured to:
Input the system voice prompt information in the audio feature information into the pre-trained sound recognition model to obtain the game event keywords corresponding to the MOBA game video.
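The mapping from system voice prompts to game event keywords can be sketched as follows. The sketch assumes the prompts have already been transcribed to text with timestamps, which is a simplification of the recognition model described above; the prompt phrases, the keyword table, and the function name are hypothetical examples rather than content from the source.

```python
from typing import Dict, List, Tuple

# Hypothetical table: announcer phrase (lowercase) -> game event keyword.
EVENT_KEYWORDS: Dict[str, str] = {
    "first blood": "first kill",
    "double kill": "double kill",
    "triple kill": "triple kill",
    "victory": "match won",
}


def extract_event_keywords(prompts: List[Tuple[float, str]]) -> List[Tuple[float, str]]:
    """Turn (timestamp, transcribed prompt) pairs into (timestamp, keyword) pairs.

    Prompts that match no known phrase are skipped.
    """
    events = []
    for timestamp, text in prompts:
        normalized = text.strip().lower()
        for phrase, keyword in EVENT_KEYWORDS.items():
            if phrase in normalized:
                events.append((timestamp, keyword))
                break  # at most one keyword per prompt
    return events
```

Keeping the timestamp with each keyword makes it straightforward to attach the resulting tag to the matching part of the video later.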
Optionally, the label adding module 330 may be specifically configured to:
Obtain the video time period in the video to be processed that corresponds to the video feature parameter;
Display the video tag corresponding to the video feature parameter in the video display picture corresponding to that video time period.
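Showing a tag only during its corresponding time period amounts to an interval lookup at playback time. The sketch below is one minimal way to do that, assuming times are expressed in seconds; the `TimedTag` class and `tags_to_display` function are hypothetical names, and actual rendering onto the video picture is left out.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class TimedTag:
    """A video tag together with the time period (in seconds) it applies to."""
    start: float
    end: float
    text: str


def tags_to_display(tags: List[TimedTag], playback_time: float) -> List[str]:
    """Return the text of every tag whose period covers the playback time.

    The period is treated as half-open: start is included, end is excluded.
    """
    return [tag.text for tag in tags if tag.start <= playback_time < tag.end]


timeline = [
    TimedTag(10.0, 15.0, "enemy on the left"),
    TimedTag(12.0, 20.0, "triple kill"),
]
```

With this timeline, `tags_to_display(timeline, 13.0)` returns both tags, while `tags_to_display(timeline, 16.0)` returns only `"triple kill"`.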
Optionally, the video processing apparatus may further include:
A video scoring module, configured to score the video to be processed according to the video tag after the video tag corresponding to the video feature parameter is added to the video to be processed;
A video recommendation module, configured to recommend and display the video to be processed according to how high the score is.
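Scoring and recommendation can be sketched as a weighted sum over tags followed by a sort. The text does not say how the score is computed, so the per-tag weights, the function names, and the `top_n` cut-off below are assumptions chosen only to make the idea concrete.

```python
from typing import Dict, List


def score_video(tags: List[str], weights: Dict[str, float], default: float = 0.0) -> float:
    """Sum the weight of each tag; tags without a known weight count as `default`."""
    return sum(weights.get(tag, default) for tag in tags)


def recommend(videos: Dict[str, List[str]], weights: Dict[str, float], top_n: int = 3) -> List[str]:
    """Return the names of the highest-scoring videos, best first."""
    ranked = sorted(
        videos,
        key=lambda name: score_video(videos[name], weights),
        reverse=True,
    )
    return ranked[:top_n]


# Hypothetical weights and tagged videos.
weights = {"triple kill": 3.0, "first kill": 1.0}
videos = {
    "clip_a": ["triple kill", "first kill"],
    "clip_b": ["first kill"],
    "clip_c": [],
}
```

Here `recommend(videos, weights, top_n=2)` ranks `clip_a` (score 4.0) ahead of `clip_b` (score 1.0) and leaves out the untagged `clip_c`.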
The above apparatus can execute the method provided by any embodiment of the present invention, and has the functional modules and beneficial effects corresponding to the executed method.
Embodiment four
Fig. 4 is a schematic structural diagram of a computer device provided by Embodiment four of the present invention. As shown in Fig. 4, the computer device provided by this embodiment includes a processor 41 and a memory 42. The computer device may have one or more processors; Fig. 4 takes one processor 41 as an example. The processor 41 and the memory 42 in the computer device may be connected by a bus or in other ways; Fig. 4 takes a bus connection as an example.
In this embodiment, the video processing apparatus provided by the above embodiment is integrated in the processor 41 of the computer device. In addition, the memory 42 in the computer device, as a computer-readable storage medium, can be used to store one or more programs. The programs may be software programs, computer-executable programs, and modules, such as the program instructions/modules corresponding to the video processing method in the embodiments of the present invention (for example, the modules in the video processing apparatus shown in Fig. 3: data obtaining module 310, parameter determination module 320, and label adding module 330). By running the software programs, instructions, and modules stored in the memory 42, the processor 41 executes the various functional applications and data processing of the device, that is, it implements the video processing method of the above method embodiments.
The memory 42 may include a program storage area and a data storage area. The program storage area can store an operating system and the application programs required by at least one function; the data storage area can store data created through use of the device, and the like. In addition, the memory 42 may include high-speed random access memory and may also include non-volatile memory, for example at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. In some examples, the memory 42 may further include memory located remotely from the processor 41, and such remote memory may be connected to the device through a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
Furthermore, when the one or more programs included in the above computer device are executed by the one or more processors 41, the programs perform the following operations:
Obtaining audio feature information from a video to be processed, the audio feature information including at least one of: channel information, voiceprint information, and system voice prompt information; determining the video feature parameter corresponding to the video to be processed according to the audio feature information; and adding the video tag corresponding to the video feature parameter to the video to be processed.
Embodiment five
Embodiment five of the present invention further provides a computer-readable storage medium on which a computer program is stored. When executed by a video processing apparatus, the program implements the video processing method provided by Embodiment one of the present invention, the method including: obtaining audio feature information from a video to be processed, the audio feature information including at least one of channel information, voiceprint information, and system voice prompt information; determining the video feature parameter corresponding to the video to be processed according to the audio feature information; and adding the video tag corresponding to the video feature parameter to the video to be processed.
Of course, for the computer-readable storage medium provided by the embodiments of the present invention, the computer program stored on it is not limited, when executed, to the method operations described above; it can also implement the relevant operations of the video processing method provided by any embodiment of the present invention.
From the above description of the embodiments, those skilled in the art can clearly understand that the present invention may be implemented by software together with the necessary general-purpose hardware, or of course by hardware alone, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes over the prior art, may be embodied in the form of a software product. The computer software product may be stored in a computer-readable storage medium, such as a computer floppy disk, read-only memory (ROM), random access memory (RAM), flash memory (FLASH), hard disk, or optical disc, and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute the methods described in the embodiments of the present invention.
It is worth noting that the units and modules included in the above embodiment of the video processing apparatus are divided only according to functional logic; the division is not limited to the one described, as long as the corresponding functions can be realized. In addition, the specific names of the functional units are only for ease of distinguishing them from one another and are not intended to limit the protection scope of the present invention.
Note that the above are only preferred embodiments of the present invention and the technical principles applied. Those skilled in the art will understand that the present invention is not limited to the specific embodiments described herein, and that various obvious changes, readjustments, and substitutions can be made without departing from the protection scope of the present invention. Therefore, although the present invention has been described in further detail through the above embodiments, it is not limited to them; it may include other equivalent embodiments without departing from the inventive concept, and its scope is determined by the scope of the appended claims.

Claims (10)

1. A video processing method, characterized by comprising:
Obtaining audio feature information from a video to be processed, the audio feature information including at least one of: channel information, voiceprint information, and system voice prompt information;
Determining the video feature parameter corresponding to the video to be processed according to the audio feature information;
Adding the video tag corresponding to the video feature parameter to the video to be processed.
2. The method according to claim 1, characterized in that determining the video feature parameter corresponding to the video to be processed according to the audio feature information comprises:
Inputting the audio feature information into a pre-trained sound recognition model to obtain the video feature parameter corresponding to the video to be processed.
3. The method according to claim 2, characterized in that, before inputting the audio feature information into the pre-trained sound recognition model to obtain the video feature parameter corresponding to the video to be processed, the method further comprises:
Obtaining audio feature information samples labeled with target video feature parameters;
Training a preset artificial intelligence model using the audio feature information samples to obtain the sound recognition model.
4. The method according to claim 2, characterized in that the video to be processed includes a shooting game video;
Correspondingly, inputting the audio feature information into the pre-trained sound recognition model to obtain the video feature parameter corresponding to the video to be processed comprises:
Inputting the channel information in the audio feature information into the pre-trained sound recognition model to obtain the direction of the enemy in the shooting game video; or,
Inputting the channel information and the voiceprint information in the audio feature information into the pre-trained sound recognition model to obtain the direction of the enemy and the enemy's shooting distance in the shooting game video; or,
Inputting the channel information and the voiceprint information in the audio feature information into the pre-trained sound recognition model to obtain the direction of the enemy, the enemy's shooting distance, and the firearm type in the shooting game video.
5. The method according to claim 2, characterized in that the video to be processed includes a multiplayer online battle arena (MOBA) game video;
Correspondingly, inputting the audio feature information into the pre-trained sound recognition model to obtain the video feature parameter corresponding to the video to be processed comprises:
Inputting the system voice prompt information in the audio feature information into the pre-trained sound recognition model to obtain the game event keywords corresponding to the MOBA game video.
6. The method according to claim 1, characterized in that adding the video tag corresponding to the video feature parameter to the video to be processed comprises:
Obtaining the video time period in the video to be processed that corresponds to the video feature parameter;
Displaying the video tag corresponding to the video feature parameter in the video display picture corresponding to that video time period.
7. The method according to claim 1, characterized in that, after adding the video tag corresponding to the video feature parameter to the video to be processed, the method further comprises:
Scoring the video to be processed according to the video tag;
Recommending and displaying the video to be processed according to how high the score is.
8. A video processing apparatus, characterized by comprising:
A data obtaining module, configured to obtain audio feature information from a video to be processed, the audio feature information including at least one of: channel information, voiceprint information, and system voice prompt information;
A parameter determination module, configured to determine the video feature parameter corresponding to the video to be processed according to the audio feature information;
A label adding module, configured to add the video tag corresponding to the video feature parameter to the video to be processed.
9. A computer device, characterized in that the device comprises:
One or more processors;
A memory, configured to store one or more programs;
When the one or more programs are executed by the one or more processors, the one or more processors are caused to implement the video processing method according to any one of claims 1-7.
10. A computer-readable storage medium on which a computer program is stored, characterized in that the program, when executed by a processor, implements the video processing method according to any one of claims 1-7.
CN201910037302.9A 2019-01-15 2019-01-15 Video processing method, device, equipment and storage medium Active CN109640112B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910037302.9A CN109640112B (en) 2019-01-15 2019-01-15 Video processing method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN109640112A true CN109640112A (en) 2019-04-16
CN109640112B CN109640112B (en) 2021-11-23

Family

ID=66061982

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910037302.9A Active CN109640112B (en) 2019-01-15 2019-01-15 Video processing method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN109640112B (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110677722A (en) * 2019-09-29 2020-01-10 上海依图网络科技有限公司 Video processing method, and apparatus, medium, and system thereof
CN111031392A (en) * 2019-12-23 2020-04-17 广州视源电子科技股份有限公司 Media file playing method, system, device, storage medium and processor
CN111447489A (en) * 2020-04-02 2020-07-24 北京字节跳动网络技术有限公司 Video processing method and device, readable medium and electronic equipment
CN111885414A (en) * 2020-07-24 2020-11-03 腾讯科技(深圳)有限公司 Data processing method, device and equipment and readable storage medium
CN111901668A (en) * 2020-09-07 2020-11-06 三星电子(中国)研发中心 Video playback method and device
CN113038175A (en) * 2021-02-26 2021-06-25 北京百度网讯科技有限公司 Video processing method and device, electronic equipment and computer readable storage medium
CN114095738A (en) * 2020-07-30 2022-02-25 京东方科技集团股份有限公司 Video and live broadcast processing method, live broadcast system, electronic device, terminal and medium

Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6697564B1 (en) * 2000-03-03 2004-02-24 Siemens Corporate Research, Inc. Method and system for video browsing and editing by employing audio
US20120245932A1 (en) * 2009-11-06 2012-09-27 Kazushige Ouchi Voice recognition apparatus
CN104978963A (en) * 2014-04-08 2015-10-14 富士通株式会社 Speech recognition apparatus, method and electronic equipment
CN107357875A (en) * 2017-07-04 2017-11-17 北京奇艺世纪科技有限公司 A kind of voice search method, device and electronic equipment
CN107483879A (en) * 2016-06-08 2017-12-15 中兴通讯股份有限公司 Video marker method, apparatus and video frequency monitoring method and system
CN107507625A (en) * 2016-06-14 2017-12-22 讯飞智元信息科技有限公司 Sound source distance determines method and device
CN107527617A (en) * 2017-09-30 2017-12-29 上海应用技术大学 Monitoring method, apparatus and system based on voice recognition
CN107770614A (en) * 2016-08-18 2018-03-06 中国电信股份有限公司 The label producing method and device of content of multimedia
CN108563670A (en) * 2018-01-12 2018-09-21 武汉斗鱼网络科技有限公司 Video recommendation method, device, server and computer readable storage medium
CN108806668A (en) * 2018-06-08 2018-11-13 国家计算机网络与信息安全管理中心 A kind of audio and video various dimensions mark and model optimization method
CN108962216A (en) * 2018-06-12 2018-12-07 北京市商汤科技开发有限公司 A kind of processing method and processing device, equipment and the storage medium of video of speaking
CN109126132A (en) * 2018-08-02 2019-01-04 Oppo广东移动通信有限公司 Game role position prompting method and device, storage medium and electronic equipment
CN109166586A (en) * 2018-08-02 2019-01-08 平安科技(深圳)有限公司 A kind of method and terminal identifying speaker

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110677722A (en) * 2019-09-29 2020-01-10 上海依图网络科技有限公司 Video processing method, and apparatus, medium, and system thereof
CN111031392A (en) * 2019-12-23 2020-04-17 广州视源电子科技股份有限公司 Media file playing method, system, device, storage medium and processor
CN111447489A (en) * 2020-04-02 2020-07-24 北京字节跳动网络技术有限公司 Video processing method and device, readable medium and electronic equipment
CN111885414A (en) * 2020-07-24 2020-11-03 腾讯科技(深圳)有限公司 Data processing method, device and equipment and readable storage medium
WO2022017083A1 (en) * 2020-07-24 2022-01-27 腾讯科技(深圳)有限公司 Data processing method and apparatus, device, and readable storage medium
CN114095738A (en) * 2020-07-30 2022-02-25 京东方科技集团股份有限公司 Video and live broadcast processing method, live broadcast system, electronic device, terminal and medium
US11956510B2 (en) 2020-07-30 2024-04-09 Boe Technology Group Co., Ltd. Video processing method, live streaming processing method, live streaming system, electronic device, terminal, and medium
CN111901668A (en) * 2020-09-07 2020-11-06 三星电子(中国)研发中心 Video playback method and device
CN113038175A (en) * 2021-02-26 2021-06-25 北京百度网讯科技有限公司 Video processing method and device, electronic equipment and computer readable storage medium

Also Published As

Publication number Publication date
CN109640112B (en) 2021-11-23

Similar Documents

Publication Publication Date Title
CN109640112A (en) Method for processing video frequency, device, equipment and storage medium
CN108769823B (en) Direct broadcasting room display methods, device, equipment
CN108769772B (en) Direct broadcasting room display methods, device, equipment and storage medium
US12370429B2 (en) Computer vision and artificial intelligence applications for performance evaluation and/or skills development
CN108292314B (en) Information processing apparatus, information processing method, and program
US10566009B1 (en) Audio classifier
US9861895B2 (en) Apparatus and methods for multimedia games
EP2933737A1 (en) Search recommendation method and device
CN110769312B (en) Method and device for recommending information in live broadcast application
CN108460122B (en) Video searching method, storage medium, device and system based on deep learning
CN110267116A (en) Video generation method, device, electronic equipment and computer-readable medium
CN106250400A (en) A kind of audio data processing method, device and system
US20240330355A1 (en) Methods and systems generating curated playlists
CN114095742A (en) Video recommendation method and device, computer equipment and storage medium
CN114339285A (en) Knowledge point processing method, video processing method and device and electronic equipment
CN112639759B (en) Contextual digital media processing system and method
CN110427499A (en) Processing method, device and the storage medium and electronic device of multimedia resource
CN113438492B (en) Method, system, computer device and storage medium for generating title in live broadcast
US20230199194A1 (en) Video processing device, video processing method, and recording medium
CN113992972A (en) A subtitle display method, apparatus, electronic device and readable storage medium
CN111031232A (en) A method and electronic device for real-time detection of dictation
US20240062544A1 (en) Information processing device, information processing method, and recording medium
CN117014678A (en) Video recall position determining method and device, storage medium and electronic equipment
CN109684503B (en) Formula questioning method based on learning video and learning equipment
JP7431569B2 (en) Display device and control method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant