Disclosure of Invention
The invention aims to correct the English vowel sounding of a tested object, with English as the target language, by comparing the tested object's pronunciation with a standard English vowel sounding model.
The invention provides, on one hand, an English vowel sounding error correction method, which comprises the following steps: step 1, pre-storing a standard English vowel sounding acoustic model; step 2, inputting the English voice of the tested object; step 3, identifying the vowels in the English voice of the tested object; step 4, inputting the voice of the tested object reading the recognized vowels; step 5, performing English vowel sounding acoustic analysis on the voice of the tested object reading the recognized vowels; step 6, comparing the English vowel sounding acoustic analysis data of the tested object with the standard English vowel sounding acoustic model to obtain a first deviation degree; and step 7, correcting the English vowel sounding of the tested object according to the first deviation degree.
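The seven steps above can be sketched as a single processing pipeline. This is a schematic outline with our own function names; each stage stands in for the corresponding step and would be backed by the acoustic measurements described later.

```python
def vowel_correction_pipeline(standard_model, subject_speech,
                              identify_vowels, analyze, compare, correct):
    """Steps 2-7 in miniature: identify the vowels in the tested
    object's English speech, analyze each vowel acoustically, compare
    with the pre-stored standard model to obtain the first deviation
    degree, and return correction feedback."""
    vowels = identify_vowels(subject_speech)             # step 3
    analyses = [analyze(v) for v in vowels]              # steps 4-5
    first_deviation = compare(analyses, standard_model)  # step 6
    return correct(first_deviation)                      # step 7
```

With trivial stand-ins for the four stages, the pipeline simply threads the subject's speech through identification, analysis, comparison, and correction.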
The step 1 comprises the following steps: inputting English voices of a plurality of standard English sample objects; identifying vowels in the English speech of the plurality of standard English sample objects; performing English vowel sounding acoustic analysis on vowels of each sample object respectively; and generating the standard English vowel sounding acoustic model according to the English vowel sounding acoustic analysis result.
The step 2 comprises: providing voice material according to the nationality of the tested object, and recording the English voice of the tested object reading the voice material.
The step 3 comprises: identifying the vowels in the English voice of the tested object according to the resonance peak values of the vowels.
Alternatively, the step 3 comprises: identifying the vowels in the English voice of the tested object according to both the resonance peak values and the durations of the vowels.
The resonance peaks include a first resonance peak and a second resonance peak.
The step 7 comprises: adjusting the English vowel sounding of the tested object by means of a visual image, according to the English vowel sounding acoustic analysis data and the standard English vowel sounding acoustic model.
After the step 7, the method further comprises: recording the voice of the tested object reading the vowel again; performing English vowel sounding acoustic analysis on the re-read pronunciation of the tested object; comparing the English vowel sounding acoustic analysis data of the re-read voice with the standard English vowel sounding acoustic model to obtain a second deviation degree; and outputting an English vowel sounding evaluation text of the tested object according to the first deviation degree and the second deviation degree.
The English vowel sounding acoustic analysis comprises: measuring the resonance peak values of the input English vowel sounding; measuring the duration of the input English vowel sounding; and generating the acoustic analysis data of the input English vowel sounding according to the resonance peak values and the duration.
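The analysis data described above, two formant peaks plus a duration, can be represented as a simple record. This is a minimal sketch; the field names are ours, not from the source.

```python
from dataclasses import dataclass

@dataclass
class VowelAnalysis:
    """Acoustic analysis data for a single vowel token, per the text
    above: the first two resonance (formant) peaks plus the vowel's
    sounding duration.  Field names are illustrative."""
    vowel: str          # vowel label, e.g. "i:"
    f1_hz: float        # first resonance peak, Hz
    f2_hz: float        # second resonance peak, Hz
    duration_ms: float  # sounding duration, milliseconds

# One analyzed token (illustrative numbers).
token = VowelAnalysis(vowel="i:", f1_hz=280.0, f2_hz=2250.0, duration_ms=120.0)
```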
The resonance peaks include a first resonance peak and a second resonance peak.
The present invention also provides a memory device having stored therein a plurality of instructions adapted to be loaded and executed by a processor to perform the following: step 1, pre-storing a standard English vowel sounding acoustic model; step 2, inputting the English voice of the tested object; step 3, identifying the vowels in the English voice of the tested object; step 4, inputting the voice of the tested object reading the recognized vowels; step 5, performing English vowel sounding acoustic analysis on the voice of the tested object reading the recognized vowels; step 6, comparing the English vowel sounding acoustic analysis data of the tested object with the standard English vowel sounding acoustic model to obtain a first deviation degree; and step 7, correcting the English vowel sounding of the tested object according to the first deviation degree.
The step 1 comprises the following steps: inputting English voices of a plurality of standard English sample objects; identifying vowels in the English speech of the plurality of standard English sample objects; performing English vowel sounding acoustic analysis on vowels of each sample object respectively; and generating the standard English vowel sounding acoustic model according to the English vowel sounding acoustic analysis result.
The step 2 comprises: providing voice material according to the nationality of the tested object, and recording the English voice of the tested object reading the voice material.
The step 3 comprises: identifying the vowels in the English voice of the tested object according to the resonance peak values of the vowels.
Alternatively, the step 3 comprises: identifying the vowels in the English voice of the tested object according to both the resonance peak values and the durations of the vowels.
The resonance peaks include a first resonance peak and a second resonance peak.
The step 7 comprises: adjusting the English vowel sounding of the tested object by means of a visual image, according to the English vowel sounding acoustic analysis data and the standard English vowel sounding acoustic model.
After the step 7, the method further comprises: recording the voice of the tested object reading the vowel again; performing English vowel sounding acoustic analysis on the re-read pronunciation of the tested object; comparing the English vowel sounding acoustic analysis data of the re-read voice with the standard English vowel sounding acoustic model to obtain a second deviation degree; and outputting an English vowel sounding evaluation text of the tested object according to the first deviation degree and the second deviation degree.
The English vowel sounding acoustic analysis comprises: measuring the resonance peak values of the input English vowel sounding; measuring the duration of the input English vowel sounding; and generating the acoustic analysis data of the input English vowel sounding according to the resonance peak values and the duration.
The resonance peaks include a first resonance peak and a second resonance peak.
The invention also provides an English vowel sounding error correction device, which comprises: a processor adapted to implement instructions; and a storage device adapted to store a plurality of instructions, the instructions adapted to be loaded and executed by the processor to perform the following: step 1, pre-storing a standard English vowel sounding acoustic model; step 2, inputting the English voice of the tested object; step 3, identifying the vowels in the English voice of the tested object; step 4, inputting the voice of the tested object reading the recognized vowels; step 5, performing English vowel sounding acoustic analysis on the voice of the tested object reading the recognized vowels; step 6, comparing the English vowel sounding acoustic analysis data of the tested object with the standard English vowel sounding acoustic model to obtain a first deviation degree; and step 7, correcting the English vowel sounding of the tested object according to the first deviation degree.
The step 1 comprises the following steps: inputting English voices of a plurality of standard English sample objects; identifying vowels in the English speech of the plurality of standard English sample objects; performing English vowel sounding acoustic analysis on vowels of each sample object respectively; and generating the standard English vowel sounding acoustic model according to the English vowel sounding acoustic analysis result.
The step 2 comprises: providing voice material according to the nationality of the tested object, and recording the English voice of the tested object reading the voice material.
The step 3 comprises: identifying the vowels in the English voice of the tested object according to the resonance peak values of the vowels.
Alternatively, the step 3 comprises: identifying the vowels in the English voice of the tested object according to both the resonance peak values and the durations of the vowels.
The resonance peaks include a first resonance peak and a second resonance peak.
The step 7 comprises: adjusting the English vowel sounding of the tested object by means of a visual image, according to the English vowel sounding acoustic analysis data and the standard English vowel sounding acoustic model.
After the step 7, the method further comprises: recording the voice of the tested object reading the vowel again; performing English vowel sounding acoustic analysis on the re-read pronunciation of the tested object; comparing the English vowel sounding acoustic analysis data of the re-read voice with the standard English vowel sounding acoustic model to obtain a second deviation degree; and outputting an English vowel sounding evaluation text of the tested object according to the first deviation degree and the second deviation degree.
The English vowel sounding acoustic analysis comprises: measuring the resonance peak values of the input English vowel sounding; measuring the duration of the input English vowel sounding; and generating the acoustic analysis data of the input English vowel sounding according to the resonance peak values and the duration.
The resonance peaks include a first resonance peak and a second resonance peak.
The method has the advantage that the English vowel sounding of the tested object is analyzed and compared with the pre-stored standard English vowel sounding acoustic model, so that the tested object's English vowel sounding can be corrected and made more accurate.
Detailed Description
The preferred embodiments of the present invention will now be described in detail with reference to the accompanying drawings.
Fig. 1 is a flowchart of an English vowel sounding error correction method 100 according to an embodiment of the present invention.
Step 101, pre-storing a standard English vowel sounding model. For example, a vowel sounding model built from sample objects whose native language is English may be saved as the standard English vowel sounding model. A sample object in the present invention refers to a speaker selected when forming the standard English vowel sounding model.
Step 103, inputting the English voice of the tested object. In a specific implementation, voice material may be provided for the tested object to read, or the tested object may read any other English words or sentences. When reading from provided voice material, the material may be the same as that used when the standard English vowel sounding model was established, or it may be different.
Step 105, identifying the vowels in the recorded English voice of the tested object.
Step 107, recording the voice of the tested object reading the identified vowels.
Step 109, performing English vowel sounding acoustic analysis on the recorded voice of the tested object reading the identified vowels, to obtain the vowel sounding acoustic analysis data.
Step 111, comparing the obtained English vowel sounding acoustic analysis data with the pre-stored standard English vowel sounding model to obtain a first deviation degree.
Step 113, correcting the English vowel sounding of the tested object according to the first deviation degree.
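The first deviation degree of step 111 can, for instance, be computed as a distance in the formant plane. The Euclidean metric below is an illustrative choice of ours; the method itself does not prescribe a particular distance measure.

```python
def deviation_degree(measured, target):
    """One way to compute a deviation degree (step 111): the Euclidean
    distance between the tested object's (F1, F2) point and the
    standard model's target point for the same vowel, both in Bark.
    A larger value means the pronunciation is further from standard."""
    df1 = measured[0] - target[0]
    df2 = measured[1] - target[1]
    return (df1 ** 2 + df2 ** 2) ** 0.5

# A subject's /i:/ versus the model target (illustrative Bark values).
dev = deviation_degree((3.1, 13.0), (2.6, 13.8))  # ~0.94
```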
In a specific embodiment, after the English voice read aloud by a tested object is input, the vowels in the English voice are first identified; the voice of the tested object reading the recognized vowels aloud is then input, and English vowel sounding acoustic analysis is performed on it; the obtained analysis data are compared with the pre-stored standard English vowel sounding model to obtain the difference between the tested object's English vowel pronunciation and standard English vowel pronunciation; and the tested object's English vowel pronunciation is corrected according to this difference.
In one embodiment, standard English sample objects are selected to establish the standard English vowel sounding acoustic model, which can be accomplished by the method of fig. 2.
Step 201, selecting the voice material. With English as the target language, the present invention takes vowels as the main object, and the corpus covers the presentation of all English vowels in word structures and sentence structures. The sentence structures include the 5 simple sentence patterns of English. Semantically, the sentences are divided into predictable and unpredictable statements; the predictable sentences further comprise sentences with high predictability and sentences with low predictability. All words are high-frequency words, but together they include all English vowels. The voice materials can be designed by renowned phoneticians. Predictable and unpredictable sentences may affect the perception of vowels. For example, if the complete sentence is "read sound button on the broken" and only "read sound button on the" is heard, then even though "broken" is not spoken, or "button" is not recognized, it can generally still be inferred as "button"/"broken"; that is, the sentence can be predicted and therefore recognized.
Step 203, selecting standard English sample objects to read the voice material and establishing a voice library. In practical applications, adult native English speakers from California, USA, can be selected as the standard English sample objects; after selection, each reads and records the selected voice material to form the voice library.
Step 205, selecting the most representative standard English sample objects. Listeners in the United States with the same language background as the sample objects can rate the recordings, and the male and female speakers whose pronunciation is perceived as intermediate are selected as the most representative sample objects.
Step 207, performing overall English vowel sounding acoustic analysis on the most representative sample objects selected above to form the standard English vowel sounding acoustic model.
In one embodiment, after the most representative population of standard english utterances is selected by the method shown in fig. 2, a standard english vowel utterance acoustic model can be generated by the method shown in fig. 3.
Step 301, inputting the English voices of a plurality of standard English sample objects. After input, the voices form the voice material, and the basic data are formed from multiple vowels in the natural speech stream.
Step 303, identifying the vowels in the English speech of the plurality of standard English sample objects. In an embodiment, the vowels may be identified according to their resonance peaks (formants), or, to further improve the identification accuracy, according to the resonance peaks combined with the durations of the vowels. In a specific embodiment, the resonance peaks may be the first and second resonance peaks of a vowel: the first resonance peak F1 represents the vertical (high-low) dimension of the articulation, and the second resonance peak F2 represents the front-back dimension of the tongue.
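The identification described in step 303 can be sketched as nearest-centre classification in the (F1, F2) plane, with duration as an optional tie-breaker. The centre values and the 90 ms threshold below are illustrative assumptions of ours, not measurements from the patent.

```python
# Illustrative (F1, F2) centres in Bark for a few vowel categories.
VOWEL_CENTRES = {
    "i:": (2.6, 13.8),  # high front tense
    "I":  (3.6, 12.5),  # high front lax
    "u:": (3.0, 8.0),   # high back tense
}

def identify_vowel(f1_bark, f2_bark, duration_ms=None):
    """Pick the vowel whose (F1, F2) centre is nearest to the measured
    formants; optionally use duration to separate long (tense) from
    short (lax) vowels, as the text suggests for improving accuracy."""
    best = min(VOWEL_CENTRES,
               key=lambda v: (f1_bark - VOWEL_CENTRES[v][0]) ** 2
                           + (f2_bark - VOWEL_CENTRES[v][1]) ** 2)
    if duration_ms is not None and best.endswith(":") and duration_ms < 90:
        # A short token near a tense centre is likely the lax counterpart.
        best = {"i:": "I", "u:": "U"}.get(best, best)
    return best
```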
Step 305, performing acoustic analysis on the vowels in the speech of each standard English sample object.
Step 307, generating the standard English vowel sounding acoustic model from the English vowel sounding acoustic analysis data of the plurality of standard English sample objects obtained in step 305. In a specific embodiment, the individual vowel features of each sample object may differ from those of other speakers, but one speaker's front, back, high and low vowels fall within a certain range; the speaker's vowels are therefore measured as a whole, and the speaker's basic features are determined from this vowel range.
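One simple way to derive the model in step 307 is to average each vowel's formants over the sample speakers. The plain mean below is our illustrative choice; the text only requires that the model be generated from the per-speaker analysis results.

```python
def build_standard_model(speakers):
    """Average each vowel's (F1, F2), in Bark, over all standard-English
    sample speakers to form the standard model."""
    sums, counts = {}, {}
    for per_speaker in speakers:  # one {vowel: (F1, F2)} dict per speaker
        for vowel, (f1, f2) in per_speaker.items():
            s1, s2 = sums.get(vowel, (0.0, 0.0))
            sums[vowel] = (s1 + f1, s2 + f2)
            counts[vowel] = counts.get(vowel, 0) + 1
    return {v: (s1 / counts[v], s2 / counts[v])
            for v, (s1, s2) in sums.items()}

# Two speakers' /i:/ measurements averaged into one model entry.
model = build_standard_model([{"i:": (2.5, 13.6)}, {"i:": (2.7, 14.0)}])
```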
In one embodiment, the voice material can be provided according to the nationality of the tested object, and the English voice of the tested object reading the voice material is recorded. For example, for Chinese speakers the vowels with higher error rates include /eː/, /ε/ and /uː/; the provided voice material can therefore contain more of these vowels, so that the vowels the tested object frequently gets wrong receive more correction, making the correction of English vowel pronunciation more targeted.
In one embodiment, correcting the English vowel sounding of the tested object according to the first deviation degree may include: adjusting the English vowel sounding of the tested object with a visual image, according to the English vowel sounding acoustic analysis data and the standard English vowel sounding acoustic model. Fig. 4 shows the visual English vowel sounding correction chart of the method according to an embodiment of the present invention. For example, when the English vowel sounding of a Chinese speaker is corrected, the provided speech material is "a good speaking mean". The dots in fig. 4 show the coordinate positions of the first and second formants of the Chinese speaker's vowel sounding, and the triangles show the coordinate positions of the first and second formants of the standard sounding; through this visual comparison, the speaker can see more intuitively how to adjust the pronunciation of the English vowel.
Fig. 5 is a flowchart of an English vowel sounding correction method 500 according to another embodiment of the present invention, where steps 501 to 513 are the same as steps 101 to 113 in fig. 1. Step 515, recording again the voice of the tested object reading the vowel; step 517, performing English vowel sounding acoustic analysis on the re-read pronunciation of the tested object; step 519, comparing the English vowel sounding acoustic analysis data obtained in step 517 with the standard English vowel sounding acoustic model to obtain a second deviation degree; and step 521, outputting an English vowel sounding evaluation text of the tested object according to the first deviation degree and the second deviation degree.
In a specific implementation, the English vowel sounding evaluation text can include information such as the tested object's original English vowel sounding acoustic analysis diagram and the corrected diagram, so that the tested object can understand its English pronunciation and the problems that need correction, practice English pronunciation purposefully according to its own characteristics, and thereby improve its English pronunciation.
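The evaluation text of step 521 can be composed from the two deviation degrees. The wording and threshold logic below are illustrative; the patent does not prescribe a text format.

```python
def evaluation_text(first_deviation, second_deviation):
    """Summarize the deviation before correction (first) and after
    re-reading (second), as in step 521."""
    if second_deviation < first_deviation:
        verdict = "improved after correction"
    elif second_deviation > first_deviation:
        verdict = "regressed after correction"
    else:
        verdict = "unchanged"
    return (f"Initial deviation: {first_deviation:.2f} Bark; "
            f"after correction: {second_deviation:.2f} Bark ({verdict}).")
```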
When generating the standard English vowel sounding acoustic model, or when performing English vowel sounding acoustic analysis on a tested object, the English vowel sounding acoustic analysis method shown in fig. 6 may be used.
Step 601, first inputting the English speech. When the standard English vowel sounding acoustic model is generated, the speech of sample objects with standard English pronunciation is input; when English vowel sounding acoustic analysis is performed on a tested object, the speech of the tested object is recorded.
Step 603, identifying the vowels in the English speech. In one embodiment, vowels may be identified based on their formants, or, to further improve the identification accuracy, based on the formants combined with the durations of the vowels. In a specific embodiment, the formants may be the first and second resonance peaks of a vowel.
Step 605, measuring the first resonance peak F1 and the second resonance peak F2 of the vowel. Because formant values in Hertz are not perceptually linear, in an embodiment the Hertz value F of a formant can be converted into a Bark value by the following formula:
Bark = [(26.81 × F)/(1960 + F)] − 0.53.
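The Hertz-to-Bark conversion above can be written directly as a small function (a minimal sketch; the function name is ours):

```python
def hz_to_bark(f_hz: float) -> float:
    """Convert a formant frequency in Hertz to the Bark scale using
    the formula above: Bark = (26.81 * F) / (1960 + F) - 0.53."""
    return (26.81 * f_hz) / (1960.0 + f_hz) - 0.53

# A typical first formant around 500 Hz maps to roughly 4.92 Bark.
f1_bark = hz_to_bark(500.0)
```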
Step 607, measuring the duration of the vowel.
Step 609, generating the English vowel sounding acoustic analysis data from the data measured in steps 605 and 607.
When the method is used for generating the standard English vowel sounding acoustic model, English vowel sounding acoustic analysis is carried out on a plurality of standard English sample objects, and finally the standard English vowel sounding acoustic model is generated according to English vowel sounding acoustic analysis data of the plurality of sample objects.
Fig. 7 shows English vowel sounding acoustic analysis charts for male and female tested objects of different nationalities, generated by the English vowel sounding acoustic analysis method 600 shown in fig. 6. Each chart presents the acoustic features obtained from the vowel sounding analysis of one speaker: the left side is a male and the right side is a female; the abscissa is the second formant value F2 of the vowel and the ordinate is the first formant value F1, with F1 and F2 converted from Hertz values to Bark values.
The top layer in the figure shows the acoustic features of a tested object whose native language is Chinese: the vowel sounding shows no clear tense-lax vowel distinction, reflecting obvious interference from Chinese vowels and a marked Chinese accent. The middle layer shows the acoustic features of a tested object whose native language is Dutch: the vowel sounding does show a tense-lax distinction, but individual vowels clearly exhibit negative transfer from Dutch. The bottom layer shows the acoustic features of a tested object whose native language is American English: the vowel sounding has a clear tense-lax vowel distinction, embodies the acoustic features of native English, and can serve as the standard English vowel sounding acoustic model.
In one embodiment, as shown in fig. 8, a standard English vowel sounding model diagram is generated by the English vowel sounding acoustic analysis method 600 shown in fig. 6, with Americans selected as the standard English sample objects.
It should be understood that the present invention does not limit the execution sequence of each step in the english vowel sound generation correction method, and the execution sequence of each step can be adjusted according to actual requirements, so that the technical solution of the present invention can be implemented.
As will be appreciated by one skilled in the art, each of the steps of the English vowel voicing correction methods of the present invention may be embodied as a system, method, or computer program product. Thus, various aspects of the invention may be embodied in the form of: an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, micro-code, etc.) or an embodiment combining hardware and software aspects.
It will be understood that each block of the flowchart illustrations, and combinations of blocks in the flowchart illustrations, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart block or blocks. Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, C++ or C.
These computer program instructions may also be stored in a computer readable medium that can direct a computer, other programmable data processing apparatus, or other devices to function in a particular manner, such that the instructions stored in the computer readable medium produce an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks. The computer readable storage medium may be a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. A computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
The computer program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide processes for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
It should be understood that the above embodiments are only used for illustrating the technical solutions of the present invention, and not for limiting the same, and those skilled in the art can modify the technical solutions described in the above embodiments, or make equivalent substitutions for some technical features; and all such modifications and alterations are intended to fall within the scope of the appended claims.