
CN119047486A - Financial expert language semantic emotion analysis system and method - Google Patents

Info

Publication number
CN119047486A
Authority
CN
China
Prior art keywords
financial
market
sentiment
data
speech data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202411523783.1A
Other languages
Chinese (zh)
Other versions
CN119047486B (en)
Inventor
陈守红
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Gelonghui Information Technology Co ltd
Original Assignee
Shenzhen Gelonghui Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Gelonghui Information Technology Co ltd
Priority to CN202411523783.1A
Publication of CN119047486A
Application granted
Publication of CN119047486B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G06F16/353 Clustering; Classification into predefined classes
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G06N3/0442 Recurrent networks, e.g. Hopfield networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06N3/0455 Auto-encoder networks; Encoder-decoder networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/06 Asset management; Financial planning or analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Business, Economics & Management (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Development Economics (AREA)
  • Databases & Information Systems (AREA)
  • Finance (AREA)
  • Accounting & Taxation (AREA)
  • Strategic Management (AREA)
  • Operations Research (AREA)
  • Game Theory and Decision Science (AREA)
  • Economics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • General Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Human Resources & Organizations (AREA)
  • Technology Law (AREA)
  • Machine Translation (AREA)

Abstract

The present invention relates to the technical field of financial data processing, and in particular to a financial expert language semantic sentiment analysis system and method. The invention forms a word embedding matrix by performing word segmentation, part-of-speech tagging, entity recognition and professional term extraction on financial expert speech data; performs sentiment polarity analysis on each text vector in the word embedding matrix; classifies the financial expert speech data by category; and obtains the sentiment intensity of the word embedding matrix by combining the sentiment polarity and the category classification. A market sentiment evaluation model is constructed from historical financial expert speech data and the corresponding historical market feedback data. The model evaluates the market sentiment of the word embedding matrix, and correlation analysis between the market sentiment and key market indicators yields the influence of the financial expert speech data on those indicators. The invention thereby realizes sentiment analysis of financial expert speech data and analysis of its influence on key market indicators.

Description

Financial expert language semantic emotion analysis system and method
Technical Field
The invention relates to the technical field of financial data processing, in particular to a financial expert language semantic emotion analysis system and a financial expert language semantic emotion analysis method.
Background
In the financial field today, the analysis of statements made by financial experts relies mainly on manual interpretation by financial professionals. As a result, interpretation is inefficient and its conclusions are easily influenced by subjective factors. Existing automated analysis tools often fail to understand financial terminology and complex contexts accurately, which leads to inaccurate emotion analysis. In addition, they lack prediction and risk early-warning functions for market emotion trends, making it difficult for investors and decision makers to grasp market dynamics and potential risks.
Disclosure of Invention
In view of the above-mentioned shortcomings of the prior art, the present invention aims to provide a system and a method for analyzing the semantic emotion of a financial expert language, so as to solve the problem that it is difficult to obtain effective and reliable financial information from the financial expert language.
In order to solve the problems, the invention adopts the following technical scheme:
In one aspect, the invention provides a financial expert language semantic emotion analysis system comprising a natural language processing module, an emotion analysis module and a market emotion assessment module.
The natural language processing module is used for performing word segmentation, part-of-speech tagging, entity recognition and professional term extraction on financial expert speech data to form a word embedding matrix.
The emotion analysis module is used for performing emotion polarity analysis on each text vector in the word embedding matrix, where the emotion polarities are positive, negative and neutral; classifying the financial expert speech data as optimistic, cautious or pessimistic; and combining the emotion polarity and the category classification to obtain the emotion intensity of the word embedding matrix.
The market emotion assessment module is used for constructing a market emotion assessment model from historical financial expert speech data and the corresponding historical market feedback data, assessing the market emotion of the word embedding matrix with that model, and performing correlation analysis between the market emotion and key market indicators to obtain the influence of the financial expert speech data on the key market indicators.
In one implementation, performing word segmentation, part-of-speech tagging, entity recognition and professional term extraction on financial expert speech data to form a word embedding matrix includes:
segmenting the financial expert speech data with a word segmentation model trained on a financial-domain corpus;
tagging the parts of speech of the segmented financial expert speech data;
recognizing financial entities in the part-of-speech-tagged financial expert speech data with a long short-term memory (LSTM) network or a Transformer model;
extracting professional terms with a deep learning model combined with a financial-domain dictionary to form a financial term library, and extracting the professional terms of the financial entities by computing the cosine similarity between each recognized financial entity and the terms in the financial term library;
constructing the word embedding matrix of the financial expert speech data from the extracted professional terms of the financial entities.
In one implementation, performing emotion polarity analysis on each text vector in the word embedding matrix includes:
analyzing each text vector in the word embedding matrix with a trained deep neural network model and calculating an emotion polarity score for each word.
Classifying the financial expert speech data as optimistic, cautious or pessimistic includes:
converting the context information of the financial expert speech data into a feature vector using TF-IDF or word2vec, and combining that feature vector with the emotion polarity scores to obtain a comprehensive feature vector;
classifying the financial expert speech data with a trained softmax regression model.
Combining the emotion polarity and the category classification to obtain the emotion intensity of the word embedding matrix includes:
calculating the emotion intensity of the word embedding matrix from the weights of the different emotion categories and the polarity scores.
In one implementation, constructing the market emotion assessment model from historical financial expert speech data and the corresponding historical market feedback data includes:
collecting historical speech data of financial experts and the related market feedback data, and preprocessing the data;
extracting key features from the preprocessed data to construct key feature vectors, where the key features include the emotion polarity scores and category classifications of the historical speech data of financial experts and the key market indicators of the related market feedback data;
constructing time-series data from the key feature vectors and training a long short-term memory (LSTM) network on it to obtain the market emotion assessment model.
In one implementation, assessing the market emotion of the word embedding matrix with the market emotion assessment model and performing correlation analysis between the market emotion and the key market indicators to obtain the influence of the financial expert speech data on the key market indicators includes:
obtaining the influence of the financial expert speech data on the key market indicators from the assessed market emotion of the word embedding matrix and the change curves of the key market indicators in the market feedback data matched to its emotion polarity scores and category classification.
In another aspect, the present invention provides a financial expert language semantic emotion analysis method, including:
performing word segmentation, part-of-speech tagging, entity recognition and professional term extraction on financial expert speech data to form a word embedding matrix;
performing emotion polarity analysis on each text vector in the word embedding matrix, where the emotion polarities are positive, negative and neutral; classifying the financial expert speech data as optimistic, cautious or pessimistic; and combining the emotion polarity and the category classification to obtain the emotion intensity of the word embedding matrix;
constructing a market emotion assessment model from historical financial expert speech data and the corresponding historical market feedback data, assessing the market emotion of the word embedding matrix with that model, and performing correlation analysis between the market emotion and key market indicators to obtain the influence of the financial expert speech data on the key market indicators.
In one implementation, performing word segmentation, part-of-speech tagging, entity recognition and professional term extraction on financial expert speech data to form a word embedding matrix includes:
segmenting the financial expert speech data with a word segmentation model trained on a financial-domain corpus;
tagging the parts of speech of the segmented financial expert speech data;
recognizing financial entities in the part-of-speech-tagged financial expert speech data with a long short-term memory (LSTM) network or a Transformer model;
extracting professional terms with a deep learning model combined with a financial-domain dictionary to form a financial term library, and extracting the professional terms of the financial entities by computing the cosine similarity between each recognized financial entity and the terms in the financial term library;
constructing the word embedding matrix of the financial expert speech data from the extracted professional terms of the financial entities.
In one implementation, performing emotion polarity analysis on each text vector in the word embedding matrix includes:
analyzing each text vector in the word embedding matrix with a trained deep neural network model and calculating an emotion polarity score for each word.
Classifying the financial expert speech data as optimistic, cautious or pessimistic includes:
converting the context information of the financial expert speech data into a feature vector using TF-IDF or word2vec, and combining that feature vector with the emotion polarity scores to obtain a comprehensive feature vector;
classifying the financial expert speech data with a trained softmax regression model.
Combining the emotion polarity and the category classification to obtain the emotion intensity of the word embedding matrix includes:
calculating the emotion intensity of the word embedding matrix from the weights of the different emotion categories and the polarity scores.
In one implementation, constructing the market emotion assessment model from historical financial expert speech data and the corresponding historical market feedback data includes:
collecting historical speech data of financial experts and the related market feedback data, and preprocessing the data;
extracting key features from the preprocessed data to construct key feature vectors, where the key features include the emotion polarity scores and category classifications of the historical speech data of financial experts and the key market indicators of the related market feedback data;
constructing time-series data from the key feature vectors and training a long short-term memory (LSTM) network on it to obtain the market emotion assessment model.
In one implementation, assessing the market emotion of the word embedding matrix with the market emotion assessment model and performing correlation analysis between the market emotion and the key market indicators to obtain the influence of the financial expert speech data on the key market indicators includes:
obtaining the influence of the financial expert speech data on the key market indicators from the assessed market emotion of the word embedding matrix and the change curves of the key market indicators in the market feedback data matched to its emotion polarity scores and category classification.
The beneficial effects of the financial expert language semantic emotion analysis system and method are as follows. Emotion polarity analysis and category classification are performed on financial expert speech data, which realizes emotion analysis of that data. Correlation analysis against market feedback data then yields the relationship between the financial expert speech data and the key market indicators, which realizes analysis of the influence of the financial expert speech data on those indicators.
Drawings
FIG. 1 is a schematic diagram of a financial expert language semantic emotion analysis system according to the present invention.
FIG. 2 is a flow chart of a method for analyzing the semantic emotion of a financial expert language.
Detailed Description
The present invention will be described in further detail with reference to specific examples.
It should be noted that these examples are only for illustrating the present invention, and not for limiting the present invention, and simple modifications of the method under the premise of the inventive concept are all within the scope of the claimed invention.
Referring to FIG. 1, a financial expert language semantic emotion analysis system includes a natural language processing module 100, an emotion analysis module 200 and a market emotion assessment module 300.
The natural language processing module 100 performs word segmentation, part-of-speech tagging, entity recognition and professional term extraction on financial expert speech data to form a word embedding matrix.
Word segmentation means segmenting the financial expert speech data with a word segmentation model trained on a financial-domain corpus. The model can combine rule-based and statistical methods, so that it recognizes common words and also segments financial terminology accurately.
Part-of-speech tagging means tagging the segmented financial expert speech data, assigning each word a part-of-speech label such as noun, verb or adjective, so as to provide more accurate semantic information for the subsequent emotion analysis.
Entity recognition means recognizing financial entities in the part-of-speech-tagged financial expert speech data with a long short-term memory (LSTM) network or a Transformer model. Financial entities in the text, such as person names, institution names and stock codes, are identified to facilitate the extraction of key information.
Professional term extraction means extracting professional terms with a deep learning model combined with a financial-domain dictionary to form a financial term library, and then extracting the professional terms of the financial entities by computing the cosine similarity between each recognized financial entity and the terms in the library.
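The cosine-similarity matching step can be sketched as follows. This is only an illustration: the three-dimensional vectors, the term names, the `match_term` helper and the 0.8 threshold are all hypothetical and are not taken from the patent, which does not say how entity and term vectors are produced.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction; treat as no similarity
    return dot / (norm_a * norm_b)


def match_term(entity_vec, term_library, threshold=0.8):
    """Return (term, similarity) for the closest library term, or None
    when even the best match falls below the (assumed) threshold."""
    best_term, best_sim = None, -1.0
    for term, vec in term_library.items():
        sim = cosine_similarity(entity_vec, vec)
        if sim > best_sim:
            best_term, best_sim = term, sim
    return (best_term, best_sim) if best_sim >= threshold else None


# Hypothetical term library with made-up vectors.
TERM_LIBRARY = {
    "blue-chip stock": [0.9, 0.1, 0.0],
    "trading volume": [0.0, 0.2, 0.9],
}

result = match_term([0.8, 0.2, 0.1], TERM_LIBRARY)
print(result)
```

A threshold is one possible design choice here; it keeps entities that resemble no library term from being forced onto the nearest one.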
Forming the word embedding matrix means constructing the word embedding matrix of the financial expert speech data from the extracted professional terms of the financial entities:
$E = f_{\mathrm{embed}}(W)$
where $E$ is the word embedding matrix, $W$ is the input word sequence, and $f_{\mathrm{embed}}$ is a word embedding function that maps each word to a dense vector of fixed dimension capturing the semantic information of the word.
For example, consider the financial expert comment "recent market fluctuations are large, but the investment value of high-quality stocks is still significant". Word segmentation first divides the sentence into word units such as "recent", "market", "fluctuation" and "large". Part-of-speech tagging then labels, for example, "recent" as a temporal noun and "large" as an adjective. Entity recognition and term extraction identify financial entities and terms such as "market" and "high-quality stocks". Finally, word embedding converts each word into a vector that expresses its deeper semantics, and these vectors are input into a deep learning model for emotion analysis to judge the emotional tendency expressed by the comment.
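As a minimal sketch of the last step of this example, a word embedding function can be modelled as a table lookup that turns a segmented word sequence into a matrix with one row per word. The table contents, the dimension of 3 and the zero-vector fallback for unknown words are assumptions made for illustration; a real system would use trained embeddings.

```python
def build_embedding_matrix(words, embedding_table, dim):
    """Map each word to its vector; unknown words get a zero vector."""
    zero = [0.0] * dim
    matrix = []
    for word in words:
        vector = embedding_table.get(word, zero)
        if len(vector) != dim:
            raise ValueError(f"vector for {word!r} has the wrong dimension")
        matrix.append(list(vector))  # copy so callers cannot mutate the table
    return matrix


# Hypothetical embeddings for the segmented example sentence.
EMBEDDING_TABLE = {
    "recent": [0.1, 0.0, 0.3],
    "market": [0.7, 0.2, 0.1],
    "fluctuation": [0.4, 0.6, 0.0],
    "large": [0.0, 0.5, 0.5],
}

words = ["recent", "market", "fluctuation", "large", "unseen"]
E = build_embedding_matrix(words, EMBEDDING_TABLE, dim=3)
for word, row in zip(words, E):
    print(word, row)
```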
The emotion analysis module 200 performs emotion polarity analysis on each text vector in the word embedding matrix, where the emotion polarities are positive, negative and neutral; classifies the financial expert speech data as optimistic, cautious or pessimistic; and combines the emotion polarity and the category classification to obtain the emotion intensity of the word embedding matrix.
The process specifically comprises the following steps:
Each text vector in the word embedding matrix is analyzed with a trained deep neural network model, and an emotion polarity score is calculated for each word. The deep neural network may be a convolutional neural network (CNN) or a recurrent neural network (RNN).
$S = f_{\mathrm{polarity}}(E)$
where $S$ is the emotion polarity score, $E$ is the word embedding matrix, and $f_{\mathrm{polarity}}$ is the emotion polarity analysis function.
Classifying the financial expert speech data as optimistic, cautious or pessimistic includes:
Converting the context information of the financial expert speech data into a feature vector using TF-IDF or word2vec, and combining that feature vector with the emotion polarity scores to obtain a comprehensive feature vector. The context information is not divided into fixed levels; it can be obtained and expressed in several ways, of which the following are common:
Lexical features, such as part of speech, word frequency and word embeddings.
Sentence features, such as sentence length, sentence structure and syntactic analysis.
Discourse features, such as document structure, relations between paragraphs and topic consistency.
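Of the two vectorization options named above, TF-IDF is simple enough to sketch directly. The version below uses the plain textbook weighting (term frequency times the logarithm of inverse document frequency, without smoothing); the patent does not specify a variant, so this choice and the toy documents are assumptions.

```python
import math
from collections import Counter


def tf_idf(documents):
    """documents: list of token lists.
    Returns one {token: weight} dictionary per document."""
    n_docs = len(documents)
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))  # count each token once per document
    vectors = []
    for tokens in documents:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            tok: (cnt / total) * math.log(n_docs / doc_freq[tok])
            for tok, cnt in counts.items()
        })
    return vectors


# Two hypothetical, already segmented expert comments.
docs = [["market", "rally", "rally"], ["market", "risk"]]
vectors = tf_idf(docs)
print(vectors)
```

A token that appears in every document, such as "market" here, receives weight zero, which is why TF-IDF tends to emphasize the words that distinguish one comment from another.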
The financial expert speech data is then classified by a trained softmax regression model.
The softmax regression model is trained on comprehensive feature vectors built from the emotion features and context features extracted from financial expert speech data, and it outputs the probabilities that the financial expert speech data is optimistic, cautious or pessimistic.
$P = f_{\mathrm{classify}}(C, S)$
where $P$ is the probability distribution over the text categories, $C$ is the context information, $S$ is the emotion polarity score, and $f_{\mathrm{classify}}$ is the text classification function.
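The inference side of a softmax regression classifier can be sketched as below. The weights, biases and the three-element feature vector (two context features followed by a polarity score) are invented for illustration; in practice they would come from training, which is not shown.

```python
import math

CLASSES = ["optimistic", "cautious", "pessimistic"]


def softmax(logits):
    """Numerically stable softmax: subtract the maximum before exponentiating."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(features, weights, biases):
    """Linear score per class followed by softmax; returns {class: probability}."""
    logits = [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(weights, biases)
    ]
    return dict(zip(CLASSES, softmax(logits)))


# Hypothetical parameters: one weight row and one bias per class.
WEIGHTS = [
    [0.5, 0.2, 2.0],    # optimistic
    [0.1, 0.4, 0.0],    # cautious
    [-0.3, 0.1, -2.0],  # pessimistic
]
BIASES = [0.0, 0.1, 0.0]

# Comprehensive feature vector: [context feature 1, context feature 2, polarity score].
probs = classify([0.4, 0.3, 0.7], WEIGHTS, BIASES)
label = max(probs, key=probs.get)
print(label, probs)
```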
Combining the emotion polarity and the category classification to obtain the emotion intensity of the word embedding matrix includes:
calculating the emotion intensity of the word embedding matrix from the weights of the different emotion categories and the polarity scores.
$I = \sum_{i=1}^{N} w_i S_i$
where $I$ is the emotion intensity value, $w_i$ is the weight of the $i$-th emotion class, $S_i$ is the emotion polarity score, and $N$ is the total number of words.
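One way to read the definitions above is as a weighted sum over the N words, where each word contributes its polarity score multiplied by the weight of its emotion class. The sketch below follows that reading; the class weights and scores are hypothetical, and whether the result is additionally normalized by N is not stated in the text, so no normalization is applied here.

```python
# Hypothetical weight per emotion class.
CLASS_WEIGHTS = {"positive": 1.0, "neutral": 0.2, "negative": 1.0}


def emotion_intensity(word_scores, class_weights):
    """word_scores: list of (emotion_class, polarity_score), one entry per word.
    Returns the weighted sum of polarity scores."""
    return sum(class_weights[cls] * score for cls, score in word_scores)


# Made-up per-word results for a short comment.
word_scores = [("neutral", 0.0), ("positive", 0.8), ("positive", 0.6)]
intensity = emotion_intensity(word_scores, CLASS_WEIGHTS)
print(intensity)
```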
For example, the word "fluctuation" may be judged neutral, while "high-quality stocks" may be judged as positive emotion. In the text classification stage, the whole comment may be assigned to the "optimistic" category. In the emotion intensity assessment, the comment may receive a relatively high emotion intensity value from the model, indicating that it has a larger positive influence on market emotion. Through this series of analyses, the invention can provide investors with a quantitative assessment of market emotion to assist investment decisions.
Through the above steps, the emotional tendency of financial language can be judged and its emotion intensity assessed in combination with contextual factors, providing a solution for market emotion analysis.
The market emotion assessment module 300 constructs a market emotion assessment model from historical financial expert speech data and the corresponding historical market feedback data, assesses the market emotion of the word embedding matrix with that model, and performs correlation analysis between the market emotion and key market indicators to obtain the influence of the financial expert speech data on the key market indicators.
Constructing the market emotion assessment model includes:
Collecting historical speech data of financial experts and the related market feedback data, and preprocessing the data.
The preprocessing comprises the following steps:
Historical speech data of financial experts and market feedback data, including but not limited to stock price fluctuations, trading volume changes and social media sentiment, are collected and consolidated. The data is then cleaned and normalized to eliminate noise and outliers:
$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$
where $x'$ is the normalized data, $x$ is the original data, and $x_{\min}$ and $x_{\max}$ are the minimum and maximum values in the data set, respectively.
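The definitions above describe min-max normalization, which rescales a series to the range 0 to 1. A minimal sketch follows; the sample trading volumes are made up, and mapping a constant series to zeros is an assumption added to avoid division by zero.

```python
def min_max_normalize(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    if not values:
        return []
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant series: avoid dividing by zero
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical daily trading volumes.
volumes = [120.0, 150.0, 90.0, 210.0]
normalized = min_max_normalize(volumes)
print(normalized)
```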
Key features are extracted from the preprocessed data to construct key feature vectors, where the key features include the emotion polarity scores and category classifications of the historical speech data of financial experts and the key market indicators of the related market feedback data.
$F = [f_1, f_2, \ldots, f_n]$
where $F$ is the feature vector and $f_i$ is the $i$-th feature.
Time-series data is constructed from the key feature vectors, and a long short-term memory (LSTM) network is trained on it to obtain the market emotion assessment model.
$M = \mathrm{LSTM}(F; \theta)$
where $M$ is the market emotion assessment result and $\theta$ denotes the model parameters.
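Before a recurrent network such as an LSTM can be trained, the per-period key feature vectors have to be arranged into sequences. The sketch below shows one common way to do this with a sliding window; the window length, the feature layout and the target values are assumptions for illustration, and the LSTM training itself is not shown.

```python
def make_sequences(feature_vectors, targets, window):
    """Slice chronologically ordered feature vectors into overlapping windows.
    Each sample pairs `window` consecutive vectors with the target of the
    last step in that window."""
    if window <= 0:
        raise ValueError("window must be positive")
    if len(feature_vectors) != len(targets):
        raise ValueError("feature_vectors and targets must have equal length")
    samples = []
    for end in range(window, len(feature_vectors) + 1):
        sequence = feature_vectors[end - window:end]
        samples.append((sequence, targets[end - 1]))
    return samples


# Hypothetical daily features: [polarity score, category id, normalized volume].
features = [
    [0.2, 1, 0.3],
    [0.5, 0, 0.4],
    [0.1, 2, 0.2],
    [0.8, 0, 0.9],
    [0.6, 0, 0.7],
]
targets = [0.1, 0.3, 0.2, 0.6, 0.5]  # made-up market emotion labels

samples = make_sequences(features, targets, window=3)
print(len(samples))
```

Each resulting sample has the shape (time steps, features) that sequence models expect, with the label aligned to the final time step.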
Assessing the market emotion of the word embedding matrix with the market emotion assessment model and performing correlation analysis between the market emotion and the key market indicators to obtain the influence of the financial expert speech data on the key market indicators includes:
Obtaining the influence of the financial expert speech data on the key market indicators from the assessed market emotion of the word embedding matrix and the change curves of the key market indicators (such as index rises and falls and changes in trading volume) in the market feedback data matched to its emotion polarity scores and category classification.
For example, the expert's historical speech data and the related market feedback data are first collected and preprocessed. In the feature extraction stage, the emotion polarity scores of the comments, the text classification results and so on are taken as features. The feature vectors are then processed by the trained LSTM-based market emotion assessment model to obtain a market emotion assessment result. If the quantitative analysis finds a significant positive correlation between the expert's statements and market trading volume, this indicates that the statements stimulate market trading activity and/or that the expert's judgments are borne out by the market. Such analysis can give investors insight into market dynamics and help them make more reasonable investment decisions.
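The text does not name a specific correlation measure. As one plausible choice, the Pearson correlation coefficient between a market emotion series and a key market indicator can be computed as sketched below; both series are invented sample data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0.0 or var_y == 0.0:
        return 0.0  # a constant series has no defined correlation; report 0
    return cov / math.sqrt(var_x * var_y)


# Hypothetical assessed market emotion and trading volume over five periods.
market_emotion = [0.2, 0.5, 0.1, 0.8, 0.6]
trading_volume = [1.1, 1.6, 1.0, 2.1, 1.7]

r = pearson(market_emotion, trading_volume)
print(round(r, 3))
```

A coefficient close to 1 would correspond to the positive correlation described in the example, while a value near 0 would suggest no linear relationship. Correlation alone does not establish that the statements caused the change.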
Referring to FIG. 2, a financial expert language semantic emotion analysis method includes:
S100, performing word segmentation, part-of-speech tagging, entity recognition and professional term extraction on financial expert speech data to form a word embedding matrix;
S200, performing emotion polarity analysis on each text vector in the word embedding matrix, where the emotion polarities are positive, negative and neutral; classifying the financial expert speech data as optimistic, cautious or pessimistic; and combining the emotion polarity and the category classification to obtain the emotion intensity of the word embedding matrix;
S300, constructing a market emotion assessment model from historical financial expert speech data and the corresponding historical market feedback data, assessing the market emotion of the word embedding matrix with that model, and performing correlation analysis between the market emotion and key market indicators to obtain the influence of the financial expert speech data on the key market indicators.
Forming the word embedding matrix by performing word segmentation, part-of-speech tagging, entity recognition and professional term extraction on the financial expert speech data includes:
segmenting the financial expert speech data with a word segmentation model trained on a financial-domain corpus;
tagging the parts of speech of the segmented financial expert speech data;
identifying the financial entities in the part-of-speech-tagged financial expert speech data with a long short-term memory (LSTM) network or a Transformer model;
extracting professional terms with a deep learning model combined with a financial-domain dictionary to form a financial term library, and extracting the professional terms of the financial entities by calculating the cosine similarity between each identified financial entity and the professional terms in the financial term library;
constructing the word embedding matrix of the financial expert speech data based on the extracted professional terms of the financial entities.
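The cosine-similarity matching step can be illustrated with a minimal Python sketch. The function names, the 0.8 threshold and the toy three-dimensional vectors are assumptions made for illustration; the text does not specify how the embeddings are produced or what threshold is applied.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def match_terms(entity_vectors, term_library, threshold=0.8):
    """Map each entity to its most similar library term.

    Only matches whose similarity is at or above `threshold` are kept.
    The threshold value is a hypothetical choice, not taken from the source.
    """
    matches = {}
    for entity, e_vec in entity_vectors.items():
        best_term, best_score = None, -1.0
        for term, t_vec in term_library.items():
            score = cosine_similarity(e_vec, t_vec)
            if score > best_score:
                best_term, best_score = term, score
        if best_term is not None and best_score >= threshold:
            matches[entity] = (best_term, best_score)
    return matches

# Toy embeddings (hypothetical values for illustration only).
term_library = {
    "interest rate": [1.0, 0.0, 0.0],
    "liquidity": [0.0, 1.0, 0.0],
}
entity_vectors = {
    "benchmark rate": [0.9, 0.1, 0.0],
    "weather": [0.0, 0.0, 1.0],
}
matches = match_terms(entity_vectors, term_library)

# One possible embedding matrix: one row per entity that matched a library term.
embedding_matrix = [entity_vectors[entity] for entity in matches]
```

In this sketch an entity that resembles no library term is simply dropped, so the resulting matrix contains only rows tied to recognized financial terminology.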
Performing emotion polarity analysis on each text vector in the word embedding matrix includes:
analyzing each text vector in the word embedding matrix with a trained deep neural network model and calculating the emotion polarity score of each word.
Classifying the financial expert speech data as optimistic, cautious or pessimistic includes:
converting the context information of the financial expert speech data into a feature vector using TF-IDF or word2vec, and integrating the feature vector with the emotion polarity scores to obtain a comprehensive feature vector;
classifying the financial expert speech data with a trained softmax regression model.
Obtaining the emotion intensity of the word embedding matrix by combining the emotion polarity and the category classification includes:
calculating the emotion intensity of the word embedding matrix according to the weights of the different emotion categories and the polarity scores.
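The classification and intensity steps above can be sketched in Python as follows. The softmax regression itself is standard, but the weight values, the way the polarity score is appended to the context features, and in particular the intensity formula (mean absolute polarity multiplied by the category-weighted probability) are assumptions made for illustration; the text only states that intensity is computed from category weights and polarity scores.

```python
import math

CATEGORIES = ("optimistic", "cautious", "pessimistic")

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify(features, weights, biases):
    """Softmax regression with one weight row and one bias per category."""
    logits = [sum(w * x for w, x in zip(row, features)) + b
              for row, b in zip(weights, biases)]
    probs = softmax(logits)
    best = max(range(len(probs)), key=lambda i: probs[i])
    return CATEGORIES[best], probs

def emotion_intensity(polarity_scores, probs, category_weights):
    """Assumed formula: mean |polarity| scaled by the category-weighted probability."""
    if not polarity_scores:
        return 0.0
    mean_abs_polarity = sum(abs(s) for s in polarity_scores) / len(polarity_scores)
    weighted = sum(category_weights[c] * p for c, p in zip(CATEGORIES, probs))
    return mean_abs_polarity * weighted

# Hypothetical inputs: per-word polarity scores in [-1, 1] and two context features
# standing in for a TF-IDF or word2vec representation.
polarity_scores = [0.8, 0.4, 0.6]
context_features = [0.2, 0.5]
# Comprehensive feature vector: context features plus the mean polarity score.
features = context_features + [sum(polarity_scores) / len(polarity_scores)]

# Hypothetical trained parameters (one row per category).
weights = [[1.0, 0.5, 2.0], [0.5, 0.5, 0.0], [-1.0, 0.0, -2.0]]
biases = [0.0, 0.0, 0.0]
category_weights = {"optimistic": 1.0, "cautious": 0.5, "pessimistic": 1.0}

label, probs = classify(features, weights, biases)
intensity = emotion_intensity(polarity_scores, probs, category_weights)
```

Giving the cautious category a smaller weight in this sketch means hedged statements yield a lower intensity than clearly optimistic or pessimistic ones with the same polarity scores.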
Constructing the market emotion assessment model based on the historical financial expert speech data and the corresponding historical market feedback data includes:
collecting the historical speech data of financial experts and the related market feedback data, and preprocessing the data;
extracting key features from the preprocessed data to construct key feature vectors, where the key features include the emotion polarity scores and category classification of the historical speech data of financial experts and the market key indicators of the related market feedback data;
constructing sequence data from the key feature vectors and training a long short-term memory (LSTM) network on it to obtain the market emotion assessment model.
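How the key feature vectors might be arranged into sequence data for LSTM training can be sketched as below. The sliding-window scheme, the window length, the choice of the next step's market value as the training target, and all numeric values are assumptions for illustration; the text does not describe how the sequences are built.

```python
def build_sequences(feature_rows, targets, window):
    """Turn time-ordered feature rows into fixed-length training samples.

    Each sample holds `window` consecutive rows and is paired with the
    target observed at the step immediately after the window.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    if len(feature_rows) != len(targets):
        raise ValueError("feature_rows and targets must have the same length")
    samples, labels = [], []
    for end in range(window, len(feature_rows)):
        samples.append(feature_rows[end - window:end])
        labels.append(targets[end])
    return samples, labels

# Hypothetical key feature vectors, one per time step:
# [polarity score, category id (0=optimistic, 1=cautious, 2=pessimistic), market indicator]
feature_rows = [
    [0.6, 0, 100.0],
    [0.1, 1, 102.0],
    [-0.4, 2, 98.0],
    [0.3, 0, 101.0],
    [0.7, 0, 105.0],
]
# Hypothetical market feedback value per time step, used as the training target.
targets = [0.2, 0.1, -0.3, 0.1, 0.4]

samples, labels = build_sequences(feature_rows, targets, window=3)
```

Each sample has shape (window, number of features), which is the layout that LSTM layers in common deep learning frameworks generally expect for a single sequence.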
Assessing the market emotion of the word embedding matrix with the market emotion assessment model and performing correlation analysis between the market emotion and the market key indicators to obtain the influence of the financial expert speech data on the market key indicators includes:
obtaining the influence of the financial expert speech data on the market key indicators by matching the obtained market emotion of the word embedding matrix, together with the emotion polarity scores and the category classification, against the change curves of the market key indicators in the related market feedback data.
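As one way to make the correlation analysis concrete, the sketch below computes a Pearson correlation coefficient between an assessed market emotion series and a market key indicator. The choice of Pearson correlation, the use of transaction volume as the indicator, and the sample values are assumptions for illustration; the text does not name a specific correlation measure.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have equal length of at least 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0.0 or var_y == 0.0:
        return 0.0  # a constant series carries no correlation information
    return cov / math.sqrt(var_x * var_y)

# Hypothetical aligned series: assessed market emotion per period and the
# transaction volume observed in the same period.
market_emotion = [0.1, 0.4, 0.2, 0.7, 0.9]
transaction_volume = [100.0, 135.0, 105.0, 160.0, 185.0]

r = pearson(market_emotion, transaction_volume)
```

A coefficient close to 1 in such a comparison would correspond to the strong positive relationship between expert speech and transaction volume described in the example; correlation alone does not establish that the speech caused the change.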
Finally, it should be noted that the above embodiments illustrate rather than limit the invention, and those skilled in the art will understand that various changes in form and detail may be made without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (10)

1. A financial expert language semantic sentiment analysis system, characterized by comprising a natural language processing module, a sentiment analysis module and a market sentiment assessment module;
the natural language processing module is used to perform word segmentation, part-of-speech tagging, entity recognition and professional term extraction on financial expert speech data to form a word embedding matrix;
the sentiment analysis module is used to perform sentiment polarity analysis on each text vector in the word embedding matrix, where the sentiment polarity includes positive, negative and neutral, to classify the financial expert speech data as optimistic, cautious or pessimistic, and to combine the sentiment polarity and the category classification to obtain the sentiment intensity of the word embedding matrix;
the market sentiment assessment module is used to construct a market sentiment assessment model based on historical financial expert speech data and the corresponding historical market feedback data, to assess the market sentiment of the word embedding matrix with the market sentiment assessment model, and to perform correlation analysis between the market sentiment and market key indicators to obtain the influence of the financial expert speech data on the market key indicators.

2. The financial expert language semantic sentiment analysis system according to claim 1, characterized in that performing word segmentation, part-of-speech tagging, entity recognition and professional term extraction on the financial expert speech data to form the word embedding matrix comprises:
segmenting the financial expert speech data with a word segmentation model trained on a financial-domain corpus;
tagging the parts of speech of the segmented financial expert speech data;
identifying the financial entities in the part-of-speech-tagged financial expert speech data with a long short-term memory network or a Transformer model;
extracting professional terms with a deep learning model combined with a financial-domain dictionary to form a financial term library, and extracting the professional terms of the financial entities by calculating the cosine similarity between the identified financial entities and the professional terms in the financial term library;
constructing the word embedding matrix of the financial expert speech data based on the extracted professional terms of the financial entities.

3. The financial expert language semantic sentiment analysis system according to claim 1, characterized in that performing sentiment polarity analysis on each text vector in the word embedding matrix comprises:
analyzing each text vector in the word embedding matrix with a trained deep neural network model and calculating the sentiment polarity score of each word;
classifying the financial expert speech data as optimistic, cautious or pessimistic comprises:
converting the context information of the financial expert speech data into a feature vector using TF-IDF or word2vec, and integrating the feature vector with the sentiment polarity scores to obtain a comprehensive feature vector;
classifying the financial expert speech data with a trained softmax regression model;
combining the sentiment polarity and the category classification to obtain the sentiment intensity of the word embedding matrix comprises:
calculating the sentiment intensity of the word embedding matrix according to the weights of the different sentiment categories and the polarity scores.

4. The financial expert language semantic sentiment analysis system according to claim 1, characterized in that constructing the market sentiment assessment model based on the historical financial expert speech data and the corresponding historical market feedback data comprises:
collecting the historical speech data of financial experts and the related market feedback data, and preprocessing the data;
extracting key features from the preprocessed data to construct key feature vectors, where the key features include the sentiment polarity scores and category classification of the historical speech data of financial experts and the market key indicators of the related market feedback data;
constructing sequence data from the key feature vectors and training a long short-term memory network on it to obtain the market sentiment assessment model.

5. The financial expert language semantic sentiment analysis system according to claim 4, characterized in that assessing the market sentiment of the word embedding matrix with the market sentiment assessment model and performing correlation analysis between the market sentiment and the market key indicators to obtain the influence of the financial expert speech data on the market key indicators comprises:
obtaining the influence of the financial expert speech data on the market key indicators by matching the obtained market sentiment of the word embedding matrix, together with the sentiment polarity scores and the category classification, against the change curves of the market key indicators in the related market feedback data.

6. A financial expert language semantic sentiment analysis method, characterized by comprising:
performing word segmentation, part-of-speech tagging, entity recognition and professional term extraction on financial expert speech data to form a word embedding matrix;
performing sentiment polarity analysis on each text vector in the word embedding matrix, where the sentiment polarity includes positive, negative and neutral, classifying the financial expert speech data as optimistic, cautious or pessimistic, and combining the sentiment polarity and the category classification to obtain the sentiment intensity of the word embedding matrix;
constructing a market sentiment assessment model based on historical financial expert speech data and the corresponding historical market feedback data, assessing the market sentiment of the word embedding matrix with the market sentiment assessment model, and performing correlation analysis between the market sentiment and market key indicators to obtain the influence of the financial expert speech data on the market key indicators.

7. The financial expert language semantic sentiment analysis method according to claim 6, characterized in that performing word segmentation, part-of-speech tagging, entity recognition and professional term extraction on the financial expert speech data to form the word embedding matrix comprises:
segmenting the financial expert speech data with a word segmentation model trained on a financial-domain corpus;
tagging the parts of speech of the segmented financial expert speech data;
identifying the financial entities in the part-of-speech-tagged financial expert speech data with a long short-term memory network or a Transformer model;
extracting professional terms with a deep learning model combined with a financial-domain dictionary to form a financial term library, and extracting the professional terms of the financial entities by calculating the cosine similarity between the identified financial entities and the professional terms in the financial term library;
constructing the word embedding matrix of the financial expert speech data based on the extracted professional terms of the financial entities.

8. The financial expert language semantic sentiment analysis method according to claim 6, characterized in that performing sentiment polarity analysis on each text vector in the word embedding matrix comprises:
analyzing each text vector in the word embedding matrix with a trained deep neural network model and calculating the sentiment polarity score of each word;
classifying the financial expert speech data as optimistic, cautious or pessimistic comprises:
converting the context information of the financial expert speech data into a feature vector using TF-IDF or word2vec, and integrating the feature vector with the sentiment polarity scores to obtain a comprehensive feature vector;
classifying the financial expert speech data with a trained softmax regression model;
combining the sentiment polarity and the category classification to obtain the sentiment intensity of the word embedding matrix comprises:
calculating the sentiment intensity of the word embedding matrix according to the weights of the different sentiment categories and the polarity scores.

9. The financial expert language semantic sentiment analysis method according to claim 6, characterized in that constructing the market sentiment assessment model based on the historical financial expert speech data and the corresponding historical market feedback data comprises:
collecting the historical speech data of financial experts and the related market feedback data, and preprocessing the data;
extracting key features from the preprocessed data to construct key feature vectors, where the key features include the sentiment polarity scores and category classification of the historical speech data of financial experts and the market key indicators of the related market feedback data;
constructing sequence data from the key feature vectors and training a long short-term memory network on it to obtain the market sentiment assessment model.

10. The financial expert language semantic sentiment analysis method according to claim 9, characterized in that assessing the market sentiment of the word embedding matrix with the market sentiment assessment model and performing correlation analysis between the market sentiment and the market key indicators to obtain the influence of the financial expert speech data on the market key indicators comprises:
obtaining the influence of the financial expert speech data on the market key indicators by matching the obtained market sentiment of the word embedding matrix, together with the sentiment polarity scores and the category classification, against the change curves of the market key indicators in the related market feedback data.

CN202411523783.1A 2024-10-30 2024-10-30 A financial expert language semantic sentiment analysis system and method Active CN119047486B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202411523783.1A CN119047486B (en) 2024-10-30 2024-10-30 A financial expert language semantic sentiment analysis system and method

Publications (2)

Publication Number Publication Date
CN119047486A true CN119047486A (en) 2024-11-29
CN119047486B CN119047486B (en) 2025-02-11

Family

ID=93569113

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202411523783.1A Active CN119047486B (en) 2024-10-30 2024-10-30 A financial expert language semantic sentiment analysis system and method

Country Status (1)

Country Link
CN (1) CN119047486B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN119293251A (en) * 2024-12-09 2025-01-10 江苏苏商银行股份有限公司 A method for analyzing financial user sentiment and psychological portraits based on a large language model

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150286710A1 (en) * 2014-04-03 2015-10-08 Adobe Systems Incorporated Contextualized sentiment text analysis vocabulary generation
CN106919673A (en) * 2017-02-21 2017-07-04 浙江工商大学 Text mood analysis system based on deep learning
CN109933664A (en) * 2019-03-12 2019-06-25 中南大学 An Improved Method for Fine-Grained Sentiment Analysis Based on Sentiment Word Embedding
CN109934503A (en) * 2019-03-19 2019-06-25 合肥工业大学 An early warning method of financial market risk in the Internet environment
WO2024021354A1 (en) * 2022-07-28 2024-02-01 中国科学院深圳先进技术研究院 Model training method, price prediction method, terminal device and storage medium
CN118628113A (en) * 2024-06-24 2024-09-10 天元大数据信用管理有限公司 A financial transaction risk assessment method, device and medium based on machine learning

Also Published As

Publication number Publication date
CN119047486B (en) 2025-02-11

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant