
Akhtar et al., 2025 - Google Patents

UrduSER: A comprehensive dataset for speech emotion recognition in Urdu language

Document ID
18042793215681789615
Authors
Akhtar M
Jahangir R
Ain Q
Nauman M
Uddin M
Ullah S
Publication year
2025
Publication venue
Data in Brief

Snippet

Abstract: Speech Emotion Recognition (SER) is a rapidly evolving field of research that aims to identify and categorize emotional states through speech signal analysis. As SER holds considerable socio-cultural and business significance, researchers are increasingly …
Continue reading at www.sciencedirect.com
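
As a purely illustrative aside, not taken from this record or the UrduSER paper: the snippet above describes SER as identifying and categorizing emotional states from speech signals, and a common baseline pipeline for such a task extracts acoustic features (e.g. MFCCs) from labelled clips and trains a classifier on them. The file names, labels, and model choice below are assumptions made for the sketch only.

```python
# Minimal SER baseline sketch (illustrative; not the authors' method).
import numpy as np
import librosa
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

def extract_features(wav_path, sr=16000, n_mfcc=13):
    """Load a clip and summarise it as mean MFCCs, a common SER baseline feature."""
    y, sr = librosa.load(wav_path, sr=sr)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    return mfcc.mean(axis=1)

# Hypothetical file list and labels; in practice these would come from the
# dataset's own metadata (paths and emotion classes as distributed).
wav_files = ["angry_001.wav", "happy_001.wav", "sad_001.wav", "neutral_001.wav"]
labels = ["angry", "happy", "sad", "neutral"]

X = np.stack([extract_features(p) for p in wav_files])
X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.25, random_state=0
)

clf = SVC(kernel="rbf")  # simple baseline classifier
clf.fit(X_train, y_train)
print("accuracy:", accuracy_score(y_test, clf.predict(X_test)))
```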

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/18 Speech classification or search using natural language modelling
    • G10L15/1822 Parsing for meaning understanding
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/21 Text processing
    • G06F17/22 Manipulating or registering by use of codes, e.g. in sequence of text characters
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/27 Automatic analysis, e.g. parsing
    • G06F17/2765 Recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/27 Automatic analysis, e.g. parsing
    • G06F17/2705 Parsing
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification
    • G10L17/26 Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30017 Multimedia data retrieval; Retrieval of more than one type of audiovisual media
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/06 Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids
    • G10L21/10 Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids transforming into visible information
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination
    • G10L25/66 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination for extracting parameters related to health condition
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/003 Changing voice quality, e.g. pitch or formants
    • G10L21/007 Changing voice quality, e.g. pitch or formants characterised by the process used
    • G10L21/013 Adapting to target pitch
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00 Speech synthesis; Text to speech systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16 Sound input; Sound output
    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09B EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
    • G09B5/00 Electrically-operated educational appliances
    • G09B5/06 Electrically-operated educational appliances with both visual and audible presentation of the material to be studied
    • G09B5/065 Combinations of audio and video presentations, e.g. videotapes, videodiscs, television systems

Similar Documents

Publication / Title
Bigi SPPAS-multi-lingual approaches to the automatic annotation of speech
Durand et al. The Oxford handbook of corpus phonology
CN107464555B (en) Method, computing device and medium for enhancing audio data including speech
Cole et al. New methods for prosodic transcription: Capturing variability as a source of information
Douglas-Cowie et al. Emotional speech: Towards a new generation of databases
Schmidt EXMARaLDA and the FOLK tools – two toolsets for transcribing and annotating spoken language
US11282508B2 (en) System and a method for speech analysis
Priego-Valverde et al. Is smiling during humor so obvious? a cross-cultural comparison of smiling behavior in humorous sequences in american english and french interactions
Das et al. BanglaSER: A speech emotion recognition dataset for the Bangla language
Szekrényes Annotation and interpretation of prosodic data in the hucomtech corpus for multimodal user interfaces
Kleinberger et al. Voice at NIME: a Taxonomy of New Interfaces for Vocal Musical Expression.
Ibrahim et al. Development of Hausa dataset a baseline for speech recognition
Liesenfeld et al. Building and curating conversational corpora for diversity-aware language science and technology
Bai et al. Audiosetcaps: An enriched audio-caption dataset using automated generation pipeline with large audio and language models
Bartesaghi Theories and practices of transcription from discourse analysis
Akhtar et al. UrduSER: A comprehensive dataset for speech emotion recognition in Urdu language
Taj et al. Urdu speech emotion recognition: A systematic literature review
Tits et al. Emotional speech datasets for english speech synthesis purpose: A review
Arawjo et al. Typetalker: A speech synthesis-based multi-modal commenting system
Gnevsheva Studying sociophonetics in second languages
Rodrigues et al. Emotion detection throughout the speech
Hazel et al. Enhancing the natural conversation experience through conversation analysis–a design method
Zhao et al. Chipola: A Chinese Podcast Lexical Database for capturing spoken language nuances and predicting behavioral data
Zhao et al. Recognition of Weird Tone in Chinese Communication and Improvement of Language Understanding for AI
Raso et al. Introduction: Spoken corpora and linguistic studies: Problems and perspectives