Detailed Description
The technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the drawings in the embodiments of the present disclosure, and it is obvious that the described embodiments are only a part of the embodiments of the present disclosure, and not all of the embodiments. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present specification without any creative effort shall fall within the protection scope of the present specification.
To solve the above technical problem, a method for replying to a user question according to an embodiment of the present specification is introduced first. The method may be executed by reply equipment for the user question, including but not limited to a server, an industrial personal computer, a PC, and the like. As shown in fig. 1, the method for replying to a user question may include the following implementation steps.
S110: receiving a question sentence input by a user.
The question sentence may be a question that the user raises about information or data that the user wishes to acquire. The question sentence may be composed directly of natural language. For example, in a financial question-answering scenario, when a user needs to handle business at a business hall, the question sentence may be "What time does the XX business hall in XX city open?" or "Is the XX outlet open on Saturday?". In practical applications, a user may input a question sentence in any form as required; the question sentence is not limited to the above examples, and details are not described herein again.
S120: determining at least one scene category corresponding to the question sentences; the scene category corresponds to a data table category.
After the question sentence is acquired, the scene category corresponding to the question sentence may be determined first. Because the data in the database is stored in different data tables according to scene category, the data tables corresponding to a scene category can be determined once the scene category of the question sentence is known, and the corresponding data can then be acquired from those data tables.
Specifically, the scene category may be a place category, a time category, a business category, and so on. For example, the question "What time does the XX business hall in XX city open?" may be determined to correspond to a business-hours query category. In practical applications, the way the scene categories are divided may be set as required; it is not limited to the above example, and details are not described herein again.
In some embodiments, a classification model for determining the scene category corresponding to a question sentence may also be trained in advance by using a BERT algorithm. In particular, the classification model may be a mathematical model for classifying unclassified traffic data into known types. The classification model may be a Bayesian classification model, a Support Vector Machine (SVM) classification model, or a Convolutional Neural Network (CNN) classification model.
After the classification model is obtained through training, the classification model can be used for recognizing the scene category corresponding to the question sentence. It should be noted that, in practical applications, when the classification model is used to identify question statements, a situation may occur in which only a single table is identified for a question, and in this situation, the identified data table identifier may be directly displayed, thereby saving corresponding steps.
After one or more scene categories are identified, all data tables corresponding to the scene categories can be acquired as identified data tables.
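The lookup from scene categories to data tables described above can be sketched as follows. This is only an illustrative sketch: the category names, the table identifier "112", and the keyword-based `keyword_classifier` are hypothetical stand-ins, and the keyword rules merely take the place of the trained classification model.

```python
from typing import Callable, Dict, List

# Hypothetical mapping from scene category to data table identifiers.
SCENE_TO_TABLES: Dict[str, List[str]] = {
    "business_hours_query": ["111"],
    "location_query": ["111", "112"],
}


def keyword_classifier(question: str) -> List[str]:
    """Stand-in for the trained classification model (simple keyword rules)."""
    text = question.lower()
    categories: List[str] = []
    if "hours" in text or "open" in text:
        categories.append("business_hours_query")
    if "where" in text or "address" in text:
        categories.append("location_query")
    return categories


def identify_tables(
    question: str,
    classify: Callable[[str], List[str]] = keyword_classifier,
) -> List[str]:
    """Collect every data table linked to the scene categories of the question."""
    tables: List[str] = []
    for category in classify(question):
        for table_id in SCENE_TO_TABLES.get(category, []):
            if table_id not in tables:  # keep order, drop duplicates
                tables.append(table_id)
    return tables
```

When only one table identifier is returned, the single-table shortcut mentioned above applies; when several are returned, all of them are treated as identified data tables.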
S130: analyzing condition items and query items in the question sentences; the condition items comprise data types associated with the question sentences; the query item comprises the data type required to be acquired by the question statement.
After the question sentence is acquired, the condition items and query items in the question sentence that correspond to the data table determined in step S120 may be further analyzed. The condition items comprise data types associated with the question sentence, and the query items comprise data types that the question sentence requires to be acquired. For example, when the question is "I want to know the business hours of the Long Island Road branch in Shanghai", the corresponding condition item may be "Shanghai" or "Long Island Road branch" in the corresponding data table, and the corresponding query item may be "business hours" in the corresponding data table. By analyzing the condition items and the query items in the question sentence, the corresponding data can be queried more accurately from the corresponding data tables and fed back to the user.
In some embodiments, the condition items in the question sentence may be recognized by using a condition model. The condition model is used for analyzing the condition items of question sentences. Before the condition items in the question sentence are analyzed, sample condition sentences may be obtained, each of which corresponds to a query data table identifier, a condition field operation type, and a condition link relation. The condition model can then be trained with the sample condition sentences, so that the trained condition model can identify the query data table identifier, the condition field operation type, and the condition link relation corresponding to a question sentence.
The query data table identifier may be used to identify the corresponding data table, which is the data table identified in step S120. The condition field identifier can locate the specific field corresponding to the condition item in the data table. The condition field operation type may be used to indicate the specific type of operation applied to the condition field, and may include, for example, one of no operation, equal to, greater than, less than, close to, greater than or equal to, and less than or equal to. The condition link relation can be used to represent the relationship among different condition fields, and specifically may include one of no relation, an AND relation, and an OR relation, so that the data results required by the user can be better obtained.
To illustrate with a specific example, a condition data set is constructed in which each row of data is {"query": the question; "table_id": the id of the corresponding data table; "header": [which columns of the current table are used as condition columns, where the value for a condition column is 1 and the others are 0]; "operation": [the operation applied to each column in the condition, where 0 corresponds to no operation, 1 to equal to, 2 to greater than, 3 to less than, 4 to like, 5 to greater than or equal to, and 6 to less than or equal to]; "connection": [the link relationship between the conditions on the columns, where 0 corresponds to none, 1 to or, and 2 to and]}. For the question "I want to know the business hours of the Long Island Road branch in Shanghai", the corresponding condition data row is {"query": "I want to know the business hours of the Long Island Road branch in Shanghai"; "table_id": "111"; "header": [0,1,1,1,0]; "operation": [0,1,1,1,0]; "connection": [0,2,2,2,0]}.
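As a rough illustration of how such a condition data row could be interpreted, the sketch below decodes the "header", "operation", and "connection" vectors into per-column conditions. The column names are taken from the table with id 111 described in this specification; the function name `decode_conditions` and the mapping of the operation codes to SQL-style symbols are assumptions made only for this example.

```python
from typing import Any, Dict, List

# Columns of the example table with id 111, in table order.
COLUMNS = ["Id", "ProvinceName", "CityName", "Name", "WorkTime"]

# Codes as listed in the text; 0 means "no operation" / "no relation".
OPERATORS = {1: "=", 2: ">", 3: "<", 4: "LIKE", 5: ">=", 6: "<="}
CONNECTORS = {1: "OR", 2: "AND"}

condition_row: Dict[str, Any] = {
    "query": "I want to know the business hours of the Long Island Road branch in Shanghai",
    "table_id": "111",
    "header": [0, 1, 1, 1, 0],
    "operation": [0, 1, 1, 1, 0],
    "connection": [0, 2, 2, 2, 0],
}


def decode_conditions(row: Dict[str, Any], columns: List[str]) -> List[Dict[str, str]]:
    """Turn the 0/1 and code vectors of one row into readable conditions."""
    decoded: List[Dict[str, str]] = []
    for i, name in enumerate(columns):
        if row["header"][i] != 1:
            continue  # not a condition column
        operator = OPERATORS.get(row["operation"][i])
        if operator is None:
            continue  # condition column without an operation
        decoded.append({
            "column": name,
            "operator": operator,
            "connector": CONNECTORS.get(row["connection"][i], ""),
        })
    return decoded
```

For the example row, the decoded result names ProvinceName, CityName, and Name as condition columns, each compared with "equal to" and linked by AND.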
Correspondingly, in an example, based on the sample condition statement, the condition model may also be trained by using a BERT algorithm, and a specific training process may be set based on requirements of actual applications, which is not described herein again.
Generally, when data is located according to a condition item, if a specific attribute value can be determined from the question sentence and the operation type of the condition field corresponding to the condition item is "equal to", matching can be performed directly in the corresponding column. For example, when the target data corresponding to a specific time needs to be searched, the data whose time equals the target time may be matched directly in the data table as the searched data.
However, in some embodiments, the column of the data table corresponding to the condition item may record text, and in this case exactly matching data may not be found. For example, when a user wishes to know the business hours of the Long Island Road branch, the question may contain only "Long Island branch". If exact matching is used, "Long Island branch" in the question cannot be matched with "Long Island Road branch" in the data table. In this case, matching may be performed based on the similarity between the text recorded in the column and the text in the question, and the most similar text may be used as the text to be queried. The specific similarity matching method may be adjusted based on the requirements of the actual application, and details are not described herein again.
In other embodiments, the attribute of the obtained condition column may be a numerical value and the operation on the condition column may be a comparison operation, such as greater than or less than, in which case the condition value needs to be identified from the question. For example, when the user asks about branches whose number of employees is greater than twenty, inputting the question into the model yields "number of employees" as the condition column, and the content of that column in the data table is numerical. The question is then segmented into words, and similarity matching is performed between each word and the column name of the condition column. For example, the segmentation result of the above question may be [I, want to know, number of employees, greater than, twenty, branch]. After each word is matched for similarity against the column name "number of employees", the most similar word is "number of employees"; the word representing a numerical value that is closest to this word (searching first to the right of the word) is then taken as the condition value, which is "twenty" in this example. The specific operation mode may be adjusted based on the requirements of the actual application, and details are not described herein again.
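One possible form of the similarity matching mentioned above is sketched below. The specification does not name a particular similarity measure; this sketch assumes the character-level ratio from the Python standard library `difflib.SequenceMatcher`, and the function name, the threshold of 0.6, and the sample branch names are hypothetical.

```python
from difflib import SequenceMatcher
from typing import List, Optional


def most_similar_value(
    mention: str,
    column_values: List[str],
    threshold: float = 0.6,
) -> Optional[str]:
    """Return the column value most similar to the text found in the question.

    Returns None when no value reaches the (assumed) similarity threshold.
    """
    best_value: Optional[str] = None
    best_score = 0.0
    for value in column_values:
        score = SequenceMatcher(None, mention.lower(), value.lower()).ratio()
        if score > best_score:
            best_value, best_score = value, score
    return best_value if best_score >= threshold else None


# Hypothetical contents of the Name column.
branch_names = ["Long Island Road branch", "Nanjing Road branch", "Century Avenue branch"]
matched = most_similar_value("Long Island branch", branch_names)
```

The threshold guards against returning an arbitrary row when the question mentions something that is not in the column at all; whether such a guard is wanted depends on the actual application.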
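The search for a numerical condition value can be sketched as follows, assuming the question has already been segmented into words. The small `NUMBER_WORDS` table, the function names, and the use of `difflib.SequenceMatcher` for word-to-column similarity are assumptions for illustration; the neighbour search looks to the right of the anchor word first at each distance, as described above.

```python
import re
from difflib import SequenceMatcher
from typing import List, Optional

# Hypothetical, deliberately tiny table of number words.
NUMBER_WORDS = {"ten": 10, "twenty": 20, "thirty": 30, "fifty": 50}


def parse_number(token: str) -> Optional[float]:
    """Interpret a token as a number, or return None if it is not one."""
    text = token.strip().lower()
    if text in NUMBER_WORDS:
        return float(NUMBER_WORDS[text])
    if re.fullmatch(r"\d+(\.\d+)?", text):
        return float(text)
    return None


def extract_condition_value(tokens: List[str], column_name: str) -> Optional[float]:
    """Find the numeric token nearest to the token that best matches the column name."""
    if not tokens:
        return None
    scores = [
        SequenceMatcher(None, token.lower(), column_name.lower()).ratio()
        for token in tokens
    ]
    anchor = scores.index(max(scores))
    for distance in range(1, len(tokens)):
        # At each distance, look to the right of the anchor before the left.
        for index in (anchor + distance, anchor - distance):
            if 0 <= index < len(tokens):
                value = parse_number(tokens[index])
                if value is not None:
                    return value
    return None


tokens = ["I", "want to know", "number of employees", "greater than", "twenty", "branch"]
condition_value = extract_condition_value(tokens, "number of employees")
```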
Accordingly, in some embodiments, the query item may also be identified and extracted by using a query model, which may be used to analyze the query items in a question sentence. Before the query items in the question sentence are analyzed, sample query sentences may be obtained, each of which corresponds to a query data table identifier, a query field operation type, and a field grouping condition. The query model can then be trained with the sample query sentences, so that the trained query model can identify the query data table identifier, the query field operation type, and the field grouping condition corresponding to a question sentence.
The query data table identifier may be used to identify the data table to be queried. The query field identifier may locate the specific field corresponding to the query item in the data table. The query field operation type may be used to indicate the specific type of operation applied to the query field, and may include, for example, at least one of no operation, a count operation, a sum operation, an average operation, a minimum operation, and a maximum operation. The field grouping condition can be used to indicate that the question groups by a specific data column in the data table, and the grouping applied to the corresponding field can be identified through the field grouping condition.
To illustrate with a specific example, a query data set is established from the log for the single-table query case, in which each row of data is {"query": the question; "table_id": the id of the data table corresponding to the question; "select": [which columns of the current table are queried by the question, where the value for a queried column is 1 and the others are 0]; "agg": [the aggregation operation that the question applies to each queried column, where 0 corresponds to no operation, 1 to count, 2 to sum, 3 to avg, 4 to min, and 5 to max]; "group_by": [which column in the table the question groups by, where the value for a grouped column is 1 and the others are 0]}. For example, assume that the id of the node attribute table is 111 and that the column names of the table are Id (node Id), ProvinceName (province), CityName (city), Name (node name), and WorkTime (business hours). If the question is "I want to know the business hours of the Long Island Road branch in Shanghai", the corresponding data row is {"query": "I want to know the business hours of the Long Island Road branch in Shanghai"; "table_id": "111"; "select": [0,0,0,0,1]; "agg": [0,0,0,0,0]; "group_by": [0,0,0,0,0]}.
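A query data row of this shape could be decoded into select expressions and grouping columns roughly as follows. This is a sketch under assumptions: the function name `decode_query` and the rendering of aggregations as SQL-style function calls are choices made for this example and are not prescribed by the specification.

```python
from typing import Any, Dict, List, Tuple

# Columns of the example table with id 111, in table order.
COLUMNS = ["Id", "ProvinceName", "CityName", "Name", "WorkTime"]

# Aggregation codes as listed in the text; 0 means no operation.
AGGREGATES = {1: "COUNT", 2: "SUM", 3: "AVG", 4: "MIN", 5: "MAX"}

query_row: Dict[str, Any] = {
    "query": "I want to know the business hours of the Long Island Road branch in Shanghai",
    "table_id": "111",
    "select": [0, 0, 0, 0, 1],
    "agg": [0, 0, 0, 0, 0],
    "group_by": [0, 0, 0, 0, 0],
}


def decode_query(row: Dict[str, Any], columns: List[str]) -> Tuple[List[str], List[str]]:
    """Return (select expressions, group-by columns) for one query data row."""
    select_exprs: List[str] = []
    for i, name in enumerate(columns):
        if row["select"][i] != 1:
            continue  # column is not queried
        aggregate = AGGREGATES.get(row["agg"][i])
        select_exprs.append(f"{aggregate}({name})" if aggregate else name)
    group_by = [name for i, name in enumerate(columns) if row["group_by"][i] == 1]
    return select_exprs, group_by


select_exprs, group_by = decode_query(query_row, COLUMNS)
```

For the example row, only WorkTime is selected, with no aggregation and no grouping.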
Correspondingly, in an example, based on the sample query statement, the query model may also be trained by using a BERT algorithm, and a specific training process may be set based on requirements of actual applications, which is not described herein again.
In some embodiments, due to factors such as the format of the question sentence itself or defects in the system's own recognition method, the generated query field identifier or condition field identifier may be empty, in which case the query item or condition item cannot be acquired normally. In this case, the question sentence may be recorded in a conversion error log, that is, a log for recording sentences that could not be answered. The questions in the conversion error log can be used as sample data for incremental model training at intervals, so that the recognition process for such questions is optimized and the effectiveness of question recognition is improved.
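The handling of empty field identifiers can be sketched as below. The class `ConversionErrorLog`, its in-memory list storage, and the function `check_parsed_fields` are hypothetical names introduced for illustration; a real system would presumably persist the log rather than keep it in memory.

```python
from typing import List


class ConversionErrorLog:
    """In-memory stand-in for the conversion error log."""

    def __init__(self) -> None:
        self._questions: List[str] = []

    def record(self, question: str) -> None:
        """Store a question that could not be converted."""
        self._questions.append(question)

    def drain(self) -> List[str]:
        """Hand over the logged questions, e.g. as samples for incremental training."""
        batch, self._questions = self._questions, []
        return batch


def check_parsed_fields(
    question: str,
    query_fields: List[str],
    condition_fields: List[str],
    log: ConversionErrorLog,
) -> bool:
    """Log the question and return False when either identifier list is empty."""
    if not query_fields or not condition_fields:
        log.record(question)
        return False
    return True
```

Calling `drain()` at intervals yields the batch of failed questions that could serve as additional sample data.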
S140: determining target data in a target data table according to the scene category, the condition items and the query items; the target data table includes a data table corresponding to the data table category.
After the scene category, the condition items, and the query items are obtained, the data that matches the question sentence can be looked up. Specifically, the target data table can be located through the scene category, and the target data in the target data table can then be located by using the condition items and the query items. The target data table may be a data table corresponding to the data table category. Because the condition items and the query items define the query conditions and the required query results, the final target data can be obtained from the target data table accordingly.
The target data may be data directly obtained by searching in the target data table, or may be data obtained by performing calculation using data in the target data table. The specific obtaining manner may be determined according to the corresponding relationship between the condition item and the query item, and is not described herein again.
In some embodiments, when the target data is obtained, an SQL query statement or an HQL query statement may be generated according to the scene category, the condition items, and the query items, so that the target data in the target data table is queried by using the SQL query statement or the HQL query statement.
In practical applications, due to limitations of the format of the question sentence or of the functions of the system itself, the SQL query statement or the HQL query statement may fail to be generated. In this case, the question sentence may be recorded in the conversion error log. For a detailed description of the conversion error log and of the operations applied to the records in it, reference may be made to the description in step S130, and details are not repeated here.
Correspondingly, when the number of the target data tables is more than one, a plurality of different query statements can be generated for different tables to query respectively. For a specific query method, reference may be made to the above example, which is not described herein again.
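A minimal sketch of generating and running such a query statement is given below, using SQL with the Python standard library `sqlite3` module. The table name `node_attribute`, the sample row, and the function names are hypothetical; the column names follow the example table with id 111. Condition values are passed as bound parameters, while table and column names are assumed to come from the trusted table schema rather than from the user.

```python
import sqlite3
from typing import Any, List, Sequence, Tuple

ALLOWED_OPERATORS = {"=", ">", "<", "LIKE", ">=", "<="}
ALLOWED_CONNECTORS = {"AND", "OR"}

# (column, operator, value, connector linking this condition to the previous one)
Condition = Tuple[str, str, Any, str]


def build_query(
    table: str,
    select_exprs: Sequence[str],
    conditions: Sequence[Condition],
) -> Tuple[str, List[Any]]:
    """Assemble a parameterized SELECT statement from query and condition items."""
    sql = f"SELECT {', '.join(select_exprs)} FROM {table}"
    params: List[Any] = []
    where = ""
    for column, operator, value, connector in conditions:
        if operator not in ALLOWED_OPERATORS or connector not in ALLOWED_CONNECTORS:
            raise ValueError("unsupported operator or connector")
        if where:
            where += f" {connector} "
        where += f"{column} {operator} ?"
        params.append(value)
    if where:
        sql += f" WHERE {where}"
    return sql, params


def run_demo() -> List[Tuple[Any, ...]]:
    """Query a throwaway in-memory table holding one hypothetical row."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE node_attribute "
        "(Id INTEGER, ProvinceName TEXT, CityName TEXT, Name TEXT, WorkTime TEXT)"
    )
    conn.execute(
        "INSERT INTO node_attribute VALUES "
        "(1, 'Shanghai', 'Shanghai', 'Long Island Road branch', '8:00-17:00')"
    )
    sql, params = build_query(
        "node_attribute",
        ["WorkTime"],
        [
            ("CityName", "=", "Shanghai", "AND"),
            ("Name", "=", "Long Island Road branch", "AND"),
        ],
    )
    rows = conn.execute(sql, params).fetchall()
    conn.close()
    return rows
```

When more than one target data table is identified, `build_query` could simply be called once per table, giving one statement for each table.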
S150: feeding back a query result corresponding to the question sentence to the user based on the target data.
After the target data is obtained, a corresponding query result can be generated from the target data and fed back to the user to complete the answer to the question sentence. For example, for the user's question "I want to know the business hours of the Long Island Road branch in Shanghai", the business hours that were found may be fed back to the user in the form of a query result, for example with the feedback sentence "The business hours of the Long Island Road branch in Shanghai are 8:00 to 17:00". The specific manner of feeding back the result may be adjusted according to the actual application; it is not limited to the above example, and details are not described herein again.
In some embodiments, if the query result is empty, or the query cannot be performed normally in step S130 or step S140, an error message may be fed back to the user to indicate that the question sentence could not be answered. For example, "Sorry, no relevant result found" may be fed back to inform the user that the query was not completed successfully, which ensures that the user always receives a reply and improves the user experience.
Based on the description of the above embodiments, it can be seen that, after a question sentence input by a user is received, the method first determines the scene category corresponding to the question sentence, so as to determine at least one data table corresponding to the question sentence. Then, by analyzing the condition items and the query items in the question sentence, the specific data corresponding to the condition items and the query items can be searched for in the data table, and a corresponding query result is fed back to the user based on the target data found. With this method, when a user question is answered, the data tables in the database do not need to be processed in advance; instead, the relevant data is searched for and obtained by directly analyzing the user question. This ensures that the data involved in the answering process is comprehensive, enables accurate and effective answers to user questions, and safeguards the user experience.
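The feedback step, including the fallback message for an empty result, can be sketched as follows. The function name `format_reply` and the sentence template are assumptions; only the fallback wording follows the example in the text.

```python
from typing import Any, Sequence, Tuple

NO_RESULT_MESSAGE = "Sorry, no relevant result found."


def format_reply(subject: str, attribute: str, rows: Sequence[Tuple[Any, ...]]) -> str:
    """Build the sentence fed back to the user from the rows that were found."""
    if not rows:
        # Empty result or failed query: still give the user a reply.
        return NO_RESULT_MESSAGE
    values = ", ".join(str(row[0]) for row in rows)
    return f"The {attribute} of {subject}: {values}."


reply = format_reply(
    "the Long Island Road branch in Shanghai",
    "business hours",
    [("8:00 to 17:00",)],
)
```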
Based on the above method for replying to a user question, a reply apparatus for a user question according to an embodiment of the present specification is introduced. The reply apparatus may be arranged on the reply equipment for the user question. As shown in fig. 2, the reply apparatus for the user question includes the following modules.
A question sentence receiving module 210, configured to receive a question sentence input by a user.
A scene category determining module 220, configured to determine at least one scene category corresponding to the question sentence; the scene category corresponds to a data table category.
An item parsing module 230, configured to parse the condition items and query items in the question statement; the condition items comprise data types associated with the question sentences; the query item comprises the data type required to be acquired by the question statement.
A target data positioning module 240, configured to determine target data in a target data table according to the scene category, the condition items, and the query items.
A query result feedback module 250, configured to feed back a query result corresponding to the question sentence to the user based on the target data.
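How the five modules could be chained is sketched below. This is only an illustrative arrangement: the class name `ReplyApparatus`, the callable signatures, and the stub functions in the usage example are hypothetical, and each callable stands in for the corresponding module 210 to 250.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Tuple


@dataclass
class ReplyApparatus:
    receive: Callable[[str], str]                                # module 210
    determine_scene: Callable[[str], List[str]]                  # module 220
    parse_items: Callable[[str], Tuple[List[Any], List[Any]]]    # module 230
    locate_data: Callable[[List[str], List[Any], List[Any]], List[Any]]  # module 240
    feed_back: Callable[[List[Any]], str]                        # module 250

    def answer(self, raw_question: str) -> str:
        """Run the modules in the order of steps S110 to S150."""
        question = self.receive(raw_question)
        scenes = self.determine_scene(question)
        conditions, queries = self.parse_items(question)
        data = self.locate_data(scenes, conditions, queries)
        return self.feed_back(data)


# Usage with stub modules that return fixed, hypothetical values.
apparatus = ReplyApparatus(
    receive=str.strip,
    determine_scene=lambda question: ["business_hours_query"],
    parse_items=lambda question: (["Name"], ["WorkTime"]),
    locate_data=lambda scenes, conditions, queries: ["8:00-17:00"],
    feed_back=lambda data: "; ".join(data) if data else "Sorry, no relevant result found.",
)
answer = apparatus.answer("  What are the business hours?  ")
```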
Based on the above method for replying to a user question, an embodiment of the present specification further provides reply equipment for a user question. As shown in fig. 3, the reply equipment for the user question may include a memory and a processor.
In this embodiment, the memory may be implemented in any suitable manner. For example, the memory may be a read-only memory, a mechanical hard disk, a solid state disk, a U disk, or the like. The memory may be used to store computer program instructions.
In this embodiment, the processor may be implemented in any suitable manner. For example, the processor may take the form of a microprocessor or processor together with a computer-readable medium that stores computer-readable program code (e.g., software or firmware) executable by the (micro)processor, logic gates, switches, an Application Specific Integrated Circuit (ASIC), a programmable logic controller, an embedded microcontroller, and so forth. The processor may execute the computer program instructions to perform the following steps: receiving a question sentence input by a user; determining at least one scene category corresponding to the question sentence, the scene category corresponding to a data table category; parsing condition items and query items in the question sentence, the condition items comprising data types associated with the question sentence and the query items comprising data types that the question sentence requires to be acquired; determining target data in a target data table according to the scene category, the condition items, and the query items, the target data table comprising a data table corresponding to the data table category; and feeding back a query result corresponding to the question sentence to the user based on the target data.
In the 1990s, an improvement to a technology could be clearly distinguished as either an improvement in hardware (e.g., an improvement to a circuit structure such as a diode, a transistor, or a switch) or an improvement in software (an improvement to a method flow). However, as technology has advanced, many of today's improvements to method flows can be regarded as direct improvements to hardware circuit structures. Designers almost always obtain the corresponding hardware circuit structure by programming an improved method flow into a hardware circuit. Thus, it cannot be said that an improvement to a method flow cannot be realized by hardware physical modules. For example, a Programmable Logic Device (PLD), such as a Field Programmable Gate Array (FPGA), is an integrated circuit whose logic functions are determined by the user's programming of the device. A designer "integrates" a digital system onto a single PLD by programming it, without requiring a chip manufacturer to design and fabricate a dedicated integrated circuit chip. Furthermore, instead of manually fabricating integrated circuit chips, such programming is nowadays mostly implemented with "logic compiler" software, which is similar to the software compiler used in program development; the source code to be compiled must be written in a specific programming language called a Hardware Description Language (HDL). There is not just one HDL but many, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM, and RHDL (Ruby Hardware Description Language); VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog are currently the most commonly used.
It will also be apparent to those skilled in the art that a hardware circuit implementing a logical method flow can be readily obtained simply by describing the method flow in one of the above hardware description languages and programming it into an integrated circuit.
The systems, devices, modules or units illustrated in the above embodiments may be implemented by a computer chip or an entity, or by a product with certain functions. One typical implementation device is a computer. In particular, the computer may be, for example, a personal computer, a laptop computer, a cellular telephone, a camera phone, a smartphone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
From the above description of the embodiments, it is clear to those skilled in the art that the present specification can be implemented by software plus a necessary general-purpose hardware platform. Based on this understanding, the technical solutions of the present specification may be embodied, in essence or in part, in the form of a software product, which may be stored in a storage medium, such as a ROM/RAM, a magnetic disk, or an optical disk, and which includes instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute the methods described in the embodiments, or in some parts of the embodiments, of the present specification.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the system embodiment, since it is substantially similar to the method embodiment, the description is simple, and for the relevant points, reference may be made to the partial description of the method embodiment.
The present specification is operational with numerous general-purpose or special-purpose computing system environments or configurations, for example: personal computers, server computers, hand-held or portable devices, tablet-type devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like.
This description may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The specification may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
While the specification has been described with examples, those skilled in the art will appreciate that there are numerous variations and permutations of the specification that do not depart from the spirit of the specification, and it is intended that the appended claims include such variations and modifications that do not depart from the spirit of the specification.