CN111833883B - Voice control method, device, electronic device and storage medium - Google Patents

Info

Publication number
CN111833883B
Authority
CN
China
Prior art keywords
voice information
attribute
voice
tone color
tone
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010871490.8A
Other languages
Chinese (zh)
Other versions
CN111833883A (en
Inventor
胡灵超
陈伟雄
王玉年
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Skyworth RGB Electronics Co Ltd
Original Assignee
Shenzhen Skyworth RGB Electronics Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Skyworth RGB Electronics Co Ltd
Priority to CN202010871490.8A
Publication of CN111833883A
Application granted
Publication of CN111833883B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223 Execution procedure of a spoken command

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Telephonic Communication Services (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

Embodiments of the present invention disclose a voice control method, device, electronic device and storage medium. The method comprises: receiving voice information; performing timbre recognition based on the voice information to determine a target timbre attribute corresponding to the voice information; and determining whether to respond to the voice information according to the target timbre attribute. The technical solution of the embodiments improves voice control accuracy and avoids interference from environmental noise with the voice control function.

Description

Voice control method and device, electronic equipment and storage medium
Technical Field
Embodiments of the invention relate to the technical field of voice control, and in particular to a voice control method, a voice control device, an electronic device and a storage medium.
Background
With the popularity of smart products and the continuous progress of technology, voice control is applied to a variety of smart products, such as mobile phones, televisions, refrigerators and air conditioners.
While voice control brings convenience, it also brings some trouble: it is frequently disturbed by environmental sounds, such as chatting, telephone conversations or shouting from the street, which leads to false recognition.
Disclosure of Invention
Embodiments of the invention provide a voice control method, a voice control device, an electronic device and a storage medium that improve voice control accuracy.
In a first aspect, an embodiment of the present invention provides a voice control method, including:
receiving voice information;
performing timbre recognition based on the voice information to determine a target timbre attribute corresponding to the voice information; and
determining whether to respond to the voice information according to the target timbre attribute.
Further, performing timbre recognition based on the voice information to determine the target timbre attribute corresponding to the voice information includes:
extracting the frequency and amplitude of the audio from the voice information; and
identifying the target timbre attribute based on the frequency and amplitude.
Further, determining whether to respond to the voice information according to the target timbre attribute includes:
determining the similarity between the target timbre attribute and a preset timbre attribute; and
if the similarity reaches a set threshold, determining to respond to the voice information.
Further, responding to the voice information includes:
performing semantic recognition on the voice information; and
executing a corresponding control operation according to the semantic recognition result.
Further, executing the corresponding control operation according to the semantic recognition result includes at least one of the following:
sending the voice information to a target client;
displaying target information corresponding to the semantic recognition result;
executing a search task according to the semantic recognition result.
Further, the method also includes switching the preset timbre attribute.
Further, switching the preset timbre attribute includes:
determining the preset timbre attribute from at least two pre-stored candidate timbre attributes;
wherein different candidate timbre attributes correspond to the voices of different users.
In a second aspect, an embodiment of the present invention further provides a voice control apparatus, including:
a receiving module, configured to receive voice information;
a timbre determining module, configured to perform timbre recognition based on the voice information to determine a target timbre attribute corresponding to the voice information; and
a control module, configured to determine whether to respond to the voice information according to the target timbre attribute.
Further, the timbre determining module includes:
an extracting unit, configured to extract the frequency and amplitude of the audio from the voice information; and
an identification unit, configured to identify the target timbre attribute based on the frequency and amplitude.
Further, the control module includes:
a determining unit, configured to determine the similarity between the target timbre attribute and a preset timbre attribute; and
a response unit, configured to determine to respond to the voice information if the similarity reaches a set threshold.
Further, the response unit includes:
a semantic recognition subunit, configured to perform semantic recognition on the voice information; and
a control subunit, configured to execute a corresponding control operation according to the semantic recognition result.
Further, the control operation includes at least one of the following:
sending the voice information to a target client;
displaying target information corresponding to the semantic recognition result;
executing a search task according to the semantic recognition result.
Further, the apparatus also includes a switching module, configured to switch the preset timbre attribute.
Further, the switching module includes:
a determining unit, configured to determine the preset timbre attribute from at least two pre-stored candidate timbre attributes;
wherein different candidate timbre attributes correspond to the voices of different users.
In a third aspect, an embodiment of the present invention further provides an electronic device, including:
one or more processors; and
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the voice control method according to any embodiment of the present invention.
In a fourth aspect, an embodiment of the present invention further provides a storage medium containing computer-executable instructions which, when executed by a computer processor, perform the voice control method according to any embodiment of the present invention.
According to the technical solution of the embodiments of the invention, when voice information is received, timbre recognition is performed on it to determine the corresponding target timbre attribute; the target timbre attribute is matched against the preset timbre attribute, and whether to respond to the voice information is decided according to the matching result. The system thus responds only to the voices of specific users, which improves voice control accuracy and user experience.
Drawings
The above and other features, advantages and aspects of embodiments of the present invention will become more apparent by reference to the following detailed description when taken in conjunction with the accompanying drawings. The same or similar reference numbers will be used throughout the drawings to refer to the same or like elements. It should be understood that the figures are schematic and that elements and components are not necessarily drawn to scale.
Fig. 1 is a schematic flowchart of a voice control method according to a first embodiment of the present invention;
Fig. 2 is a schematic flowchart of a voice control method according to a second embodiment of the present invention;
Fig. 3 is a schematic structural diagram of a voice control device according to a third embodiment of the present invention;
Fig. 4 is a schematic structural diagram of an electronic device according to a fourth embodiment of the present invention.
Detailed Description
Embodiments of the present invention will be described in more detail below with reference to the accompanying drawings. Although certain embodiments of the invention are shown in the drawings, it should be understood that the invention may be embodied in various forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided so that the invention will be understood more thoroughly and completely. It should be understood that the drawings and embodiments of the invention are for illustration only and are not intended to limit the scope of protection of the present invention.
It should be understood that the various steps recited in the method embodiments of the present invention may be performed in a different order and/or performed in parallel. Furthermore, method embodiments may include additional steps and/or omit performing the illustrated steps. The scope of the invention is not limited in this respect.
The term "including" and variations thereof as used herein are open-ended, i.e., "including, but not limited to". The term "based on" means "based at least in part on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; and the term "some embodiments" means "at least some embodiments". Related definitions of other terms will be given in the description below.
It should be noted that the terms "first," "second," and the like herein are merely used for distinguishing between different devices, modules, or units and not for limiting the order or interdependence of the functions performed by such devices, modules, or units.
It should be noted that the modifiers "a" and "a plurality of" in this disclosure are illustrative rather than limiting, and those skilled in the art will appreciate that they should be understood as "one or more" unless the context clearly indicates otherwise.
Example 1
Fig. 1 is a flowchart of a voice control method according to a first embodiment of the present invention. The method is applicable to voice-controlled application scenarios, such as voice control of smart home appliances, video chat, and voice-controlled games (for example, the card game "Dou Dizhu"). The voice control method can be executed by a voice control device, which can be implemented in software and/or hardware and integrated into a smart terminal such as a smartphone or a smart television.
As shown in fig. 1, the voice control method provided in this embodiment includes the following steps:
Step 110, receiving voice information.
The voice information refers to sound captured by a voice acquisition device of the smart terminal. It may be sound of any content produced by any source, such as shouting from the street, conversation between unrelated people, a telephone conversation, or a voice control command spoken to the smart terminal by the relevant user. The voice acquisition device may be, for example, a microphone.
Because the sound captured by the voice acquisition device of the smart terminal is complex and may contain a great deal of interfering noise from the environment, where interfering noise means sound the user does not want captured, the smart terminal may suffer from misrecognition, erroneous operation, low voice control accuracy and poor user experience. For example, the user Zhang San is playing the card game "Dou Dizhu" on a smartphone, with cards played by voice control. A teammate has just led with a 4 when a bystander watching the game shouts "Wang Zha" (the joker bomb). The phone recognizes the shout and plays the corresponding cards, which the user did not intend; Zhang San loses the game and lets down his teammate as well. As another example, Zhang San is video chatting with his girlfriend when his mother sees her and blurts out "her makeup is too heavy". Zhang San's mother did not want the girlfriend to hear this remark, but it is captured by the phone's voice acquisition device and sent to the girlfriend, who hears it.
The voice control method provided by this embodiment can effectively avoid these problems. Specifically, when voice information is received, its timbre attribute is identified first, and whether to continue with subsequent control operations, such as semantic recognition or information transmission, is decided according to the identified timbre attribute.
Step 120, performing timbre recognition based on the voice information to determine a target timbre attribute corresponding to the voice information.
A person's voice is produced by the vocal cords. Physically, a sound is characterized by its amplitude and frequency. Loudness is the subjectively perceived strength of a sound (commonly called volume): the larger the amplitude, the greater the loudness, and the closer the listener is to the sound source, the greater the loudness. Pitch refers to how high or low a sound is and is determined by frequency: the higher the frequency, the higher the pitch.
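The relationship between amplitude and loudness, and between frequency and pitch, can be illustrated with a minimal sketch. The function names, the 8000 Hz sample rate and the test tones below are assumptions for illustration only; the patent does not specify how amplitude or frequency is measured, and zero-crossing counting is only a rough estimate that suits near-sinusoidal signals.

```python
import math

SAMPLE_RATE = 8000  # Hz; an assumed value, not taken from the patent


def rms_amplitude(samples):
    """Root-mean-square amplitude, a rough physical correlate of loudness."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def zero_crossing_frequency(samples, sample_rate):
    """Estimate the frequency (Hz) of a near-sinusoidal signal from sign changes."""
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0)
    )
    duration = (len(samples) - 1) / sample_rate
    # A sine wave crosses zero twice per period.
    return crossings / (2 * duration)


def sine(freq, amp, n=8000):
    """Hypothetical test tone: n samples of a sine wave with a small phase offset."""
    return [
        amp * math.sin(2 * math.pi * freq * i / SAMPLE_RATE + 0.1)
        for i in range(n)
    ]


quiet_low = sine(200, 0.2)   # small amplitude, low frequency
loud_high = sine(400, 0.8)   # large amplitude, high frequency
```

Here `loud_high` yields both a larger RMS amplitude and a higher estimated frequency than `quiet_low`, matching the description of loudness and pitch above.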
As the saying goes, one can "hear the voice and know the person". What makes this possible is timbre, also known as tone quality. The wave produced by a vibrating object is a complex wave, which can be regarded as a superposition of sine waves with different frequencies and amplitudes. This complex wave determines the timbre of the sound. Sounds have different characteristics because the objects that produce them have different material properties. Timbre itself is abstract, but the complex wave is a concrete representation of it: different waveforms correspond to different timbres, so different timbres can be distinguished by their waveforms.
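The superposition described above can be written compactly. The symbols below are notation introduced here for illustration (the patent gives no formula): $A_k$ and $f_k$ are the amplitude and frequency of the $k$-th sine component, and the phase $\varphi_k$ is an added assumption that the text does not mention.

```latex
x(t) = \sum_{k=1}^{K} A_k \sin\left(2\pi f_k t + \varphi_k\right)
```

Under this view, two voices with the same pitch (the same lowest $f_k$) can still differ in timbre because their sets of amplitudes $A_k$ differ.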
Illustratively, performing timbre recognition based on the voice information to determine the target timbre attribute corresponding to the voice information includes:
extracting the frequency and amplitude of the audio from the voice information; and
identifying the target timbre attribute based on the frequency and amplitude.
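One way to picture these two sub-steps is a small spectral sketch. Everything named below (`magnitude_spectrum`, `timbre_attribute`, the harmonic-ratio feature, the sample rate and the test signal) is a hypothetical illustration; the patent does not say how the timbre attribute is computed from frequency and amplitude. The sketch uses a naive discrete Fourier transform and assumes the strongest non-DC bin is the fundamental.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Naive DFT returning the amplitude of each bin from DC up to Nyquist."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(
            x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)
        )
        # Scale so a sine of amplitude A on an interior bin reads as about A.
        spectrum.append(2 * abs(acc) / n)
    return spectrum


def timbre_attribute(frame, sample_rate, num_harmonics=4):
    """Return (fundamental_hz, harmonic amplitudes relative to the fundamental)."""
    spectrum = magnitude_spectrum(frame)
    # Assumption: the strongest non-DC bin is the fundamental.
    f0_bin = max(range(1, len(spectrum)), key=lambda k: spectrum[k])
    f0_amp = spectrum[f0_bin]
    if f0_amp == 0:
        return 0.0, [0.0] * num_harmonics  # silent frame
    harmonics = []
    for h in range(1, num_harmonics + 1):
        k = h * f0_bin
        amp = spectrum[k] if k < len(spectrum) else 0.0
        harmonics.append(amp / f0_amp)
    return f0_bin * sample_rate / len(frame), harmonics


SAMPLE_RATE = 2560  # Hz; chosen so that each DFT bin is 10 Hz wide
frame = [
    math.sin(2 * math.pi * 100 * t / SAMPLE_RATE)
    + 0.5 * math.sin(2 * math.pi * 200 * t / SAMPLE_RATE)
    for t in range(256)
]
fundamental_hz, harmonics = timbre_attribute(frame, SAMPLE_RATE)
```

For this test signal the sketch recovers a 100 Hz fundamental and a second harmonic at about half its amplitude. A practical system would more likely use an FFT library and richer features; the naive DFT is used here only to keep the example self-contained.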
Step 130, determining whether to respond to the voice information according to the target timbre attribute.
Specifically, determining whether to respond to the voice information according to the target timbre attribute includes:
determining the similarity between the target timbre attribute and a preset timbre attribute; and
if the similarity reaches a set threshold, determining to respond to the voice information.
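The similarity test can be sketched as follows. The patent does not name a similarity measure; cosine similarity over feature vectors is an assumption made here, as are the function names, the 0.95 threshold and the example vectors.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def should_respond(target, preset, threshold=0.95):
    """Respond only if the similarity reaches the set threshold."""
    return cosine_similarity(target, preset) >= threshold


# Hypothetical timbre attributes (e.g. relative harmonic amplitudes).
preset = [1.0, 0.5, 0.25, 0.1]
same_speaker = [0.98, 0.52, 0.24, 0.12]
other_speaker = [1.0, 0.05, 0.6, 0.5]
```

With these values, `should_respond(same_speaker, preset)` is true and `should_respond(other_speaker, preset)` is false, so only the first voice would be passed on for further processing.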
The preset timbre attribute is a pre-stored timbre attribute and may be set by the user. For example, three timbre attributes are pre-stored, corresponding respectively to the voice of Zhang San, the voice of Zhang San's mother and the voice of Zhang San's father. Zhang San can set the preset timbre attribute to correspond to his own voice, in which case the system responds only to Zhang San's voice information and not to other voices; for example, while Zhang San is video chatting with his girlfriend, she hears only what Zhang San himself says and not what other people say.
Zhang San can also set the preset timbre attributes to correspond to his own voice, his mother's voice and his father's voice, in which case the system responds to the voice of any of the three. For example, in a voice control application for smart home appliances, the system can respond to the voice information of any member of Zhang San's family.
Further, to improve the recognition accuracy of the timbre attribute, the user may be prompted to record specified speech several times when the timbre attribute template is pre-stored, for example a sentence, a single word, loud speech, fast speech and slow speech. Prompting the user to record specified speech several times allows the user's timbre features to be extracted more comprehensively, and the extracted timbre features are used as the preset timbre attribute. When voice information is received in real time, the timbre attribute corresponding to the voice information is compared with the preset timbre attribute; if the similarity between the two reaches the set threshold, the voice is considered to come from the same person and can be responded to.
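A minimal sketch of building a template from several recordings is shown below. Averaging the per-recording feature vectors is an assumption chosen for simplicity; the patent only says that features are extracted from multiple recordings and used as the preset timbre attribute. The function name and sample values are hypothetical.

```python
def build_preset(enrollment_features):
    """Average several equal-length feature vectors into one template."""
    if not enrollment_features:
        raise ValueError("at least one enrollment recording is required")
    length = len(enrollment_features[0])
    if any(len(v) != length for v in enrollment_features):
        raise ValueError("all feature vectors must have the same length")
    count = len(enrollment_features)
    return [sum(v[i] for v in enrollment_features) / count for i in range(length)]


# Hypothetical features from a sentence, a loud utterance and a slow utterance.
recordings = [
    [1.0, 0.50, 0.20],
    [1.0, 0.56, 0.26],
    [1.0, 0.44, 0.14],
]
preset = build_preset(recordings)
```

Averaging smooths out variation between individual recordings, which is one plausible reason for prompting the user to speak in several different ways.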
The set threshold may also be called the sensitivity: the higher the threshold is set, the higher the sensitivity, the higher the voice control accuracy of the system, and the less likely the system is to be disturbed by noise.
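The effect of the threshold can be seen in a small sketch. The similarity scores and labels below are made-up values for illustration, not measurements from the patent.

```python
def accepted(similarities, threshold):
    """Return the names of the sources whose similarity reaches the threshold."""
    return [name for name, score in similarities.items() if score >= threshold]


# Hypothetical similarity scores against the preset timbre attribute.
scores = {"enrolled_user": 0.97, "family_member": 0.88, "street_noise": 0.35}

loose = accepted(scores, 0.5)    # ['enrolled_user', 'family_member']
strict = accepted(scores, 0.9)   # ['enrolled_user']
too_strict = accepted(scores, 0.99)  # []
```

Raising the threshold from 0.5 to 0.9 screens out the similar-sounding family member as well as the noise. The last case suggests a practical caveat: a threshold set too high may also reject the enrolled user, so the value generally needs tuning.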
Illustratively, responding to the voice information includes:
performing semantic recognition on the voice information; and
executing a corresponding control operation according to the semantic recognition result. The control operation includes at least one of the following:
sending the voice information to a target client;
displaying target information corresponding to the semantic recognition result;
executing a search task according to the semantic recognition result.
For example, during a video call, only voice information that matches the preset timbre attribute is sent. In a smart home control application, only voice information that matches the preset timbre attribute is responded to. For example, when Zhang San says "please turn on", the television identifies the timbre attribute, determines that it matches, and executes the "please turn on" control operation; if Zhang San's son says "please turn on", the control operation is not executed, that is, Zhang San's son cannot turn on the television by voice.
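The response stage can be sketched as a dispatch from a semantic result to one of the three control operations. This is a toy illustration: it assumes the voice information has already been converted to a text transcript, and the keyword matching, intent names and log messages are all hypothetical stand-ins for real semantic recognition.

```python
def recognize_semantics(transcript):
    """Toy stand-in for semantic recognition: map keywords to an intent."""
    text = transcript.lower()
    if text.startswith("search "):
        return ("search", transcript[len("search "):])
    if "turn on" in text:
        return ("power_on", None)
    return ("forward", transcript)


def respond(transcript, log):
    """Execute the control operation that matches the semantic result."""
    intent, arg = recognize_semantics(transcript)
    if intent == "search":
        log.append(f"search task: {arg}")            # execute a search task
    elif intent == "power_on":
        log.append("display: powering on")           # display target information
    else:
        log.append(f"send to target client: {arg}")  # forward the voice information
    return intent


log = []
first = respond("Please turn on", log)
second = respond("search weather", log)
third = respond("hello there", log)
```

In the method described here, `respond` would only be called after the timbre check has passed, so a non-matching speaker's "please turn on" never reaches this stage.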
According to the technical solution of this embodiment, when voice information is received, timbre recognition is performed on it to determine the corresponding target timbre attribute; the target timbre attribute is matched against the preset timbre attribute, and whether to respond to the voice information is decided according to the matching result. The system thus responds only to the voices of specific users, which improves voice control accuracy and user experience.
Example 2
Fig. 2 is a schematic flowchart of a voice control method according to a second embodiment of the present invention. This embodiment further optimizes the scheme of the above embodiment by adding step 240, "switching the preset timbre attribute". The advantage of this optimization is that the user can flexibly choose which users may use the voice control function in different application scenarios.
As shown in fig. 2, the voice control method includes the steps of:
Step 210, receiving voice information.
Step 220, performing tone color recognition based on the voice information to determine a target tone color attribute corresponding to the voice information.
Step 230, determining whether to respond to the voice information according to the target tone color attribute.
Step 240, switching the preset tone attribute.
In particular, step 240 may occur at any time; this embodiment does not restrict it to occurring after step 230, and it may also occur before step 210. For example, if Zhang San wants to make a video call with his girlfriend and wants her to hear only his own voice and not the voices of other people, Zhang San may switch the preset timbre attribute to his own timbre attribute before making the video call. After the switch, the system responds only to Zhang San's voice and not to other voices, which improves voice control accuracy in the specific scenario and prevents noise from interfering with control operations.
Illustratively, switching the preset timbre attribute includes:
determining the preset timbre attribute from at least two pre-stored candidate timbre attributes;
wherein different candidate timbre attributes correspond to the voices of different users.
The candidate timbre attributes are timbre attributes that have been recorded in the system in advance.
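Switching among pre-stored candidates can be sketched with a small registry. The class name, method names and user keys below are hypothetical; the sketch also allows more than one candidate to be active at once, which is an interpretation of the family example in the first embodiment rather than something this step states.

```python
class PresetRegistry:
    """Holds pre-stored candidate timbre attributes and the active selection."""

    def __init__(self):
        self._candidates = {}
        self._active = set()

    def enroll(self, user, attribute):
        """Pre-store a candidate timbre attribute for a user."""
        self._candidates[user] = list(attribute)

    def switch(self, *users):
        """Select which candidates serve as the preset timbre attribute(s)."""
        unknown = [u for u in users if u not in self._candidates]
        if unknown:
            raise KeyError(f"no candidate timbre attribute for: {unknown}")
        self._active = set(users)

    def active_presets(self):
        """Return the currently active presets, keyed by user."""
        return {u: self._candidates[u] for u in sorted(self._active)}


registry = PresetRegistry()
registry.enroll("zhang_san", [1.0, 0.5, 0.25])
registry.enroll("mother", [1.0, 0.3, 0.4])

registry.switch("zhang_san")              # before a video call
during_call = list(registry.active_presets())

registry.switch("zhang_san", "mother")    # smart home use at home
at_home = list(registry.active_presets())
```

Because `switch` only changes which stored candidates are active, it can be called at any point, before or after voice information is received.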
According to the technical solution of this embodiment, the user can switch the preset timbre attribute to suit the specific voice control scenario, so that the system recognizes and responds only to the voice of a specific person and screens out the voices of unrelated people, which improves voice control accuracy and user experience.
Example 3
Fig. 3 is a schematic structural diagram of a voice control device according to a third embodiment of the present invention. The voice control device is configured to execute the voice control method of any of the above embodiments. It can be implemented in software and/or hardware and integrated into a smart terminal such as a smartphone or a smart television, and it is suitable for voice-controlled application scenarios, such as voice control of smart home appliances, video chat, and voice-controlled games (for example, the card game "Dou Dizhu").
As shown in fig. 3, the voice control apparatus includes a receiving module 310, a tone color determining module 320, and a control module 330.
The receiving module 310 is configured to receive voice information; the timbre determining module 320 is configured to perform timbre recognition based on the voice information to determine a target timbre attribute corresponding to the voice information; and the control module 330 is configured to determine whether to respond to the voice information according to the target timbre attribute.
On the basis of the above technical solution, the timbre determining module 320 includes:
an extracting unit, configured to extract the frequency and amplitude of the audio from the voice information; and
an identification unit, configured to identify the target timbre attribute based on the frequency and amplitude.
On the basis of the above technical solution, the control module 330 includes:
a determining unit, configured to determine the similarity between the target timbre attribute and a preset timbre attribute; and
a response unit, configured to determine to respond to the voice information if the similarity reaches a set threshold.
On the basis of the above technical solution, the response unit includes:
a semantic recognition subunit, configured to perform semantic recognition on the voice information; and
a control subunit, configured to execute a corresponding control operation according to the semantic recognition result.
On the basis of the above technical solution, the control operation includes at least one of the following:
sending the voice information to a target client;
displaying target information corresponding to the semantic recognition result;
executing a search task according to the semantic recognition result.
On the basis of the above technical solution, the device further includes a switching module, configured to switch the preset timbre attribute.
On the basis of the above technical solution, the switching module includes:
a determining unit, configured to determine the preset timbre attribute from at least two pre-stored candidate timbre attributes;
wherein different candidate timbre attributes correspond to the voices of different users.
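The module structure described above can be pictured as a composition of three small classes. This is a sketch under stated assumptions: the class and method names are hypothetical, the "voice" is represented by a dictionary that already carries feature values, and the distance-based similarity function is an arbitrary stand-in because the patent does not specify one.

```python
import math


def similarity(a, b):
    """Map Euclidean distance to (0, 1]; 1.0 means identical vectors."""
    return 1.0 / (1.0 + math.dist(a, b))


class ReceivingModule:
    def receive(self, source):
        """Receive voice information (passed through unchanged in this sketch)."""
        return source


class TimbreDeterminingModule:
    def __init__(self, extractor):
        self._extractor = extractor  # turns voice information into a feature vector

    def determine(self, voice):
        """Determine the target timbre attribute of the voice information."""
        return self._extractor(voice)


class ControlModule:
    def __init__(self, preset, threshold):
        self._preset = preset
        self._threshold = threshold

    def should_respond(self, target):
        """Respond only if the similarity reaches the set threshold."""
        return similarity(target, self._preset) >= self._threshold


class VoiceControlApparatus:
    def __init__(self, receiving, timbre, control):
        self._receiving = receiving
        self._timbre = timbre
        self._control = control

    def handle(self, source):
        voice = self._receiving.receive(source)
        target = self._timbre.determine(voice)
        return self._control.should_respond(target)


apparatus = VoiceControlApparatus(
    ReceivingModule(),
    TimbreDeterminingModule(extractor=lambda voice: voice["features"]),
    ControlModule(preset=[1.0, 0.5], threshold=0.9),
)
matching = apparatus.handle({"features": [1.0, 0.52]})
non_matching = apparatus.handle({"features": [0.2, 0.9]})
```

Keeping the extractor and the preset injectable mirrors the point made in this embodiment that the division into modules is functional and can be implemented in software, hardware or both.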
According to the technical solution of this embodiment, when voice information is received, timbre recognition is performed on it to determine the corresponding target timbre attribute; the target timbre attribute is matched against the preset timbre attribute, and whether to respond to the voice information is decided according to the matching result. The system thus responds only to the voices of specific users, which improves voice control accuracy and user experience.
The voice control device provided by the embodiment of the invention can execute the voice control method provided by any embodiment of the invention, and has the corresponding functional modules and beneficial effects of the execution method.
It should be noted that the above-mentioned units and modules included in the apparatus are only divided according to the functional logic, but not limited to the above-mentioned division, so long as the corresponding functions can be implemented, and the specific names of the functional units are only used for distinguishing from each other, and are not used for limiting the protection scope of the embodiments of the present invention.
Example 4
Referring now to fig. 4, a schematic diagram of an electronic device (e.g., a terminal device or server in fig. 4) 400 suitable for use in implementing embodiments of the present invention is shown. The terminal device in the embodiment of the present invention may include, but is not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players), car terminals (e.g., car navigation terminals), and the like, and stationary terminals such as digital TVs, desktop computers, and the like. The electronic device shown in fig. 4 is only an example and should not be construed as limiting the functionality and scope of use of the embodiments of the invention.
As shown in fig. 4, the electronic device 400 may include a processing means (e.g., a central processing unit, a graphics processor, etc.) 401, which may perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 402 or a program loaded from a storage means 406 into a Random Access Memory (RAM) 403. In the RAM 403, various programs and data necessary for the operation of the electronic device 400 are also stored. The processing device 401, the ROM 402, and the RAM 403 are connected to each other by a bus 404. An input/output (I/O) interface 405 is also connected to bus 404.
In general, the following devices may be connected to the I/O interface 405: input devices 406 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer and gyroscope; output devices 407 including, for example, a liquid crystal display (LCD), speaker and vibrator; storage devices 406 including, for example, magnetic tape and hard disk; and communication devices 409. The communication device 409 may allow the electronic device 400 to communicate with other devices wirelessly or by wire to exchange data. While Fig. 4 shows an electronic device 400 having various devices, it is to be understood that not all of the illustrated devices are required to be implemented or provided. More or fewer devices may be implemented or provided instead.
In particular, according to embodiments of the present invention, the processes described above with reference to flowcharts may be implemented as computer software programs. For example, embodiments of the present invention include a computer program product comprising a computer program embodied on a non-transitory computer readable medium, the computer program comprising program code for performing the method shown in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network via communications device 409, or from storage 406, or from ROM 402. The above-described functions defined in the method of the embodiment of the present invention are performed when the computer program is executed by the processing means 401.
The terminal provided by the embodiment of the present invention and the voice control method provided by the above embodiment belong to the same inventive concept, technical details which are not described in detail in the embodiment of the present invention can be seen in the above embodiment, and the embodiment of the present invention has the same beneficial effects as the above embodiment.
Example 5
An embodiment of the present invention provides a computer storage medium having stored thereon a computer program which, when executed by a processor, implements the voice control method provided by the above embodiment.
The computer readable medium of the present invention may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of a computer-readable storage medium may include, but are not limited to, an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. A computer-readable signal medium, by contrast, may include a data signal propagated in baseband or as part of a carrier wave, with computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electromagnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to electrical wire, fiber optic cable, RF (radio frequency), and the like, or any suitable combination of the foregoing.
In some embodiments, clients and servers may communicate using any currently known or future developed network protocol, such as HTTP (HyperText Transfer Protocol), and may be interconnected with any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include a local area network ("LAN"), a wide area network ("WAN"), an internetwork (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future developed networks.
The computer readable medium may be included in the electronic device or may exist alone without being incorporated into the electronic device.
The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to:
receive voice information;
perform tone color recognition based on the voice information to determine a target tone color attribute corresponding to the voice information; and
determine whether to respond to the voice information according to the target tone color attribute.
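The three operations above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the text does not fix how the tone color attribute is represented or compared, so the sketch assumes it is the share of spectral amplitude falling in a few frequency bands (computed with a plain discrete Fourier transform) and that matching uses cosine similarity against a threshold. The names `tone_color_attribute`, `cosine_similarity`, `should_respond`, and `synth`, the band count, and the threshold value 0.9 are all hypothetical.

```python
import cmath
import math
from typing import List, Sequence, Tuple


def tone_color_attribute(samples: Sequence[float], bands: int = 4) -> List[float]:
    """Assumed attribute: fraction of spectral amplitude in each frequency band."""
    n = len(samples)
    half = n // 2
    band_amplitude = [0.0] * bands
    # DFT bin k corresponds to frequency k * sample_rate / n, so the bins
    # 1 .. half-1 cover the range up to the Nyquist frequency.
    for k in range(1, half):
        coeff = sum(s * cmath.exp(-2j * math.pi * k * i / n)
                    for i, s in enumerate(samples))
        band_amplitude[k * bands // half] += 2 * abs(coeff) / n
    total = sum(band_amplitude)
    # Normalizing makes the attribute independent of overall loudness.
    return [a / total for a in band_amplitude] if total > 0 else band_amplitude


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Assumed similarity measure between two attribute vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def should_respond(samples: Sequence[float], preset: Sequence[float],
                   threshold: float = 0.9) -> bool:
    """Receive voice samples, derive the target attribute, decide whether to respond."""
    target = tone_color_attribute(samples, bands=len(preset))
    return cosine_similarity(target, preset) >= threshold


def synth(partials: Sequence[Tuple[float, float]],
          sample_rate: int = 8000, n: int = 400) -> List[float]:
    """Hypothetical test signal: a sum of sine partials given as (frequency, amplitude)."""
    return [sum(a * math.sin(2 * math.pi * f * i / sample_rate) for f, a in partials)
            for i in range(n)]


# Synthetic stand-ins for recorded voices (not real speech).
owner = synth([(200, 1.0), (400, 0.5)])
preset = tone_color_attribute(owner)
same_voice_quieter = [0.3 * s for s in owner]
other_voice = synth([(200, 0.5), (3000, 1.0)])

print(should_respond(same_voice_quieter, preset))
print(should_respond(other_voice, preset))
```

Because the band shares are normalized, speaking more quietly does not change the attribute in this sketch; only the distribution of amplitude over frequency does.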
Computer program code for carrying out operations of the present invention may be written in one or more programming languages, including, but not limited to, object oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units involved in the embodiments of the present invention may be implemented in software or in hardware. In some cases the name of a unit does not constitute a limitation of the unit itself; for example, the editable content display unit may also be described as an "editing unit".
The functions described above herein may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic that may be used include Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems-on-a-Chip (SOCs), Complex Programmable Logic Devices (CPLDs), and the like.
In the context of the present invention, a machine-readable medium may be a tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The above description is only illustrative of the preferred embodiments of the present invention and of the principles of the technology employed. It will be appreciated by persons skilled in the art that the scope of the disclosure referred to in the present invention is not limited to the specific combinations of technical features described above, but also covers other technical solutions formed by any combination of the technical features described above or their equivalents without departing from the spirit of the disclosure. One example is a technical solution formed by replacing the above-mentioned features with technical features having similar functions that are disclosed in (but not limited to) the present invention.
Moreover, although operations are depicted in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order. In certain circumstances, multitasking and parallel processing may be advantageous. Likewise, while several specific implementation details are included in the above discussion, these should not be construed as limiting the scope of the invention. Certain features that are described in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination.
Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are example forms of implementing the claims.

Claims (9)

1. A voice control method, comprising:
receiving voice information;
performing tone color recognition based on the voice information to determine a target tone color attribute corresponding to the voice information;
determining whether to respond to the voice information according to the target tone color attribute;
wherein the determining whether to respond to the voice information according to the target tone color attribute comprises:
matching the target tone color attribute with a preset tone color attribute, and responding only to voice information whose target tone color attribute matches the preset tone color attribute;
wherein, before responding to the voice information, the method further comprises pre-storing the tone color attribute, comprising:
for each user, prompting the user to record a specified voice a plurality of times, extracting tone color features while the user records the voice, and taking the extracted tone color features as the preset tone color attribute of the user;
wherein the determining whether to respond to the voice information according to the target tone color attribute comprises:
determining a similarity between the target tone color attribute and the preset tone color attribute;
if the similarity reaches a set threshold, determining to respond to the voice information;
wherein the user can set the preset tone color attribute to the tone color attribute of any user, so as to respond only to the voice information of the set user.
2. The method of claim 1, wherein performing tone color recognition based on the voice information to determine a target tone color attribute corresponding to the voice information comprises:
extracting the frequency and amplitude of the audio from the voice information;
identifying the target tone color attribute based on the frequency and amplitude.
3. The method of claim 1, wherein said responding to said voice information comprises:
performing semantic recognition on the voice information;
executing a corresponding control operation according to the semantic recognition result.
4. The method of claim 3, wherein the executing a corresponding control operation according to the semantic recognition result comprises at least one of:
sending the voice information to a target client;
displaying target information corresponding to the semantic recognition result;
executing a search task according to the semantic recognition result.
5. The method of any one of claims 1-3, further comprising:
switching the preset tone color attribute.
6. The method of claim 5, wherein said switching the preset tone color attribute comprises:
determining the preset tone color attribute from at least two pre-stored candidate tone color attributes;
wherein different candidate tone color attributes correspond to the voices of different users.
7. A voice control apparatus, comprising:
a receiving module, configured to receive voice information;
a tone color determining module, configured to perform tone color recognition based on the voice information to determine a target tone color attribute corresponding to the voice information;
a control module, configured to determine whether to respond to the voice information according to the target tone color attribute;
wherein the control module is further configured to match the target tone color attribute with a preset tone color attribute and to respond only to voice information whose target tone color attribute matches the preset tone color attribute;
a tone color attribute acquisition module, configured to, for each user, prompt the user to record a specified voice a plurality of times, extract tone color features while the user records the voice, and take the extracted tone color features as the preset tone color attribute of the user;
wherein the control module comprises:
a determining unit, configured to determine a similarity between the target tone color attribute and the preset tone color attribute;
a response unit, configured to determine to respond to the voice information if the similarity reaches a set threshold;
wherein the user can set the preset tone color attribute to the tone color attribute of any user, so as to respond only to the voice information of the set user.
8. An electronic device, the electronic device comprising:
one or more processors;
storage means for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the voice control method of any one of claims 1-6.
9. A storage medium containing computer executable instructions which, when executed by a computer processor, perform the voice control method of any one of claims 1-6.
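The enrollment and preset-switching steps recited in claims 1, 5 and 6 can be illustrated with a short Python sketch. It is hedged throughout: the claims say only that tone color features are extracted from several recordings and taken as the user's preset attribute, so averaging the per-recording feature vectors is an assumption made here, as is the use of cosine similarity. The class `TimbreRegistry`, its methods, the user names, and the threshold value 0.9 are hypothetical, and the feature vectors are made-up numbers rather than real measurements.

```python
import math
from typing import Dict, List, Optional, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Assumed similarity measure between two tone color feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class TimbreRegistry:
    """Stores one candidate attribute per user and tracks the active preset."""

    def __init__(self, threshold: float = 0.9) -> None:
        self.threshold = threshold
        self.candidates: Dict[str, List[float]] = {}
        self.active_user: Optional[str] = None

    def enroll(self, user: str, recordings: Sequence[Sequence[float]]) -> None:
        """Combine features from several recordings (assumed: element-wise mean)."""
        if not recordings:
            raise ValueError("at least one recording is required")
        count = len(recordings)
        self.candidates[user] = [sum(col) / count for col in zip(*recordings)]

    def switch_preset(self, user: str) -> None:
        """Select which enrolled user's attribute acts as the preset."""
        if user not in self.candidates:
            raise KeyError(user)
        self.active_user = user

    def should_respond(self, target_attr: Sequence[float]) -> bool:
        """Respond only if the target attribute is similar enough to the preset."""
        if self.active_user is None:
            return False
        preset = self.candidates[self.active_user]
        return cosine_similarity(target_attr, preset) >= self.threshold


registry = TimbreRegistry()
registry.enroll("alice", [[1.0, 0.0, 0.0], [0.8, 0.2, 0.0]])
registry.enroll("bob", [[0.0, 0.0, 1.0], [0.0, 0.2, 0.8]])

registry.switch_preset("alice")
print(registry.should_respond([0.85, 0.15, 0.0]))  # voice resembling alice
print(registry.should_respond([0.0, 0.1, 0.9]))    # voice resembling bob

registry.switch_preset("bob")
print(registry.should_respond([0.0, 0.1, 0.9]))
```

Keeping every enrolled attribute as a candidate and storing only the name of the active user makes switching the preset a constant-time operation that requires no re-enrollment.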
CN202010871490.8A 2020-08-26 2020-08-26 Voice control method, device, electronic device and storage medium Active CN111833883B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010871490.8A CN111833883B (en) 2020-08-26 2020-08-26 Voice control method, device, electronic device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010871490.8A CN111833883B (en) 2020-08-26 2020-08-26 Voice control method, device, electronic device and storage medium

Publications (2)

Publication Number Publication Date
CN111833883A CN111833883A (en) 2020-10-27
CN111833883B CN111833883B (en) 2025-01-28

Family

ID=72918261

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010871490.8A Active CN111833883B (en) 2020-08-26 2020-08-26 Voice control method, device, electronic device and storage medium

Country Status (1)

Country Link
CN (1) CN111833883B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114283803A (en) * 2021-12-22 2022-04-05 珠海格力电器股份有限公司 Control method, device, storage medium and device for audio processing equipment
CN114155858A (en) * 2021-12-24 2022-03-08 珠海格力电器股份有限公司 Speech processing method, apparatus, electronic device, and computer-readable storage medium
CN118068736A (en) * 2024-02-26 2024-05-24 贝塔智能科技(北京)有限公司 A spray intelligent sensing control method and system

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111312232A (en) * 2018-12-10 2020-06-19 珠海格力电器股份有限公司 Voice control method and device, storage medium and air conditioner

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20120064209A (en) * 2010-12-09 2012-06-19 유비벨록스(주) Method and terminal device for adjusting secret number
CN107591150A (en) * 2017-08-16 2018-01-16 珠海市魅族科技有限公司 Audio recognition method and device, computer installation and computer-readable recording medium
CN108712681A (en) * 2018-05-30 2018-10-26 深圳市零度智控科技有限公司 Smart television sound control method, smart television and readable storage medium storing program for executing
CN109101801B (en) * 2018-07-12 2021-04-27 北京百度网讯科技有限公司 Method, apparatus, device and computer readable storage medium for identity authentication

Also Published As

Publication number Publication date
CN111833883A (en) 2020-10-27

Similar Documents

Publication Publication Date Title
US11240050B2 (en) Online document sharing method and apparatus, electronic device, and storage medium
CN110519539B (en) Methods, systems, and media for rewinding media content based on detected audio events
JP7353497B2 (en) Server-side processing method and server for actively proposing the start of a dialogue, and voice interaction system capable of actively proposing the start of a dialogue
RU2667717C2 (en) Environmentally aware dialog policies and response generation
CN111833883B (en) Voice control method, device, electronic device and storage medium
CN110267113B (en) Video file processing method, system, medium, and electronic device
CN111696553B (en) Voice processing method, device and readable medium
CN107801096A (en) Video playback control method, device, terminal equipment and storage medium
CN105551498A (en) Voice recognition method and device
CN111343410A (en) Mute prompt method and device, electronic equipment and storage medium
CN113779208A (en) Method and apparatus for human-machine dialogue
CN112380362A (en) Music playing method, device and equipment based on user interaction and storage medium
CN111460211A (en) Audio information playing method and device and electronic equipment
CN111161734A (en) Voice interaction method and device based on specified scene
CN105959482A (en) Scene sound effect control method and electronic device
CN110413834B (en) Voice comment modification method, system, medium and electronic device
WO2017215615A1 (en) Sound effect processing method and mobile terminal
CN109951504B (en) Information pushing method and device, terminal and storage medium
WO2019228140A1 (en) Instruction execution method and apparatus, storage medium, and electronic device
CN112259076B (en) Voice interaction method, voice interaction device, electronic equipment and computer readable storage medium
CN113793625A (en) Audio playing method and device
CN105162839A (en) Data processing method, device and system
CN113312928A (en) Text translation method and device, electronic equipment and storage medium
CN112002313B (en) Interaction method and device, sound box, electronic equipment and storage medium
CN113495712A (en) Automatic volume adjustment method, apparatus, medium, and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant