Disclosure of Invention
To address the deficiencies of the prior art, the invention provides a user-oriented visual data model design method. It solves the problem that, given the wide variety of visualization types, it is difficult to quickly and accurately select the most suitable standard type according to the data characteristics and the visualization requirements, which results in a poor visual effect and ineffective communication of the information in the data.
The user-oriented visual data model design method comprises the following steps:
obtaining visual data, preprocessing the visual data to obtain preprocessed data, and processing the preprocessed data according to the visualization requirements to obtain processed visual data;
obtaining the data characteristics of the processed visual data, screening the visualization types according to the visualization requirements to obtain preselected types, and screening the preselected types in combination with the data characteristics to obtain standard types;
obtaining historical data, segmenting time to obtain time segment intervals, calculating the average of the differences in the number of uses between adjacent time segment intervals and normalizing it to obtain a fluctuation index, analyzing the number of occurrences to obtain a rare index, summing the two to obtain an analysis index, and comparing the analysis index with a threshold to screen out the focus of attention;
determining layout defects according to the user feedback data for the standard type, obtaining the corresponding optimization schemes, calculating a feedback index and a matching index for each optimization scheme, summing them to obtain the selection value of the optimization scheme, and generating an optimization standard;
and performing layout analysis on the focus of attention using the design pattern of the optimization standard as the reference, and generating design information.
As a further aspect of the present invention, the specific means for obtaining the standard type is:
analyzing the processed visual data to obtain its data characteristics, the data characteristics being the data type and the data scale; obtaining all visualization types, screening the visualization types according to the visualization requirements to obtain preselected types, and screening the preselected types according to the data characteristics to obtain standard types;
wherein the data characteristics corresponding to each preselected type are obtained and recorded as characteristics to be analyzed, the characteristics to be analyzed are matched against the data characteristics, and the preselected types whose characteristics to be analyzed match the data characteristics are recorded as standard types.
As a further scheme of the invention, the specific mode for obtaining the fluctuation index is as follows:
obtaining all historical data within a time T, the specific value of T being set by an operator; obtaining the processed visual data in the historical data and labeling each item with an index i, where i = 1, 2, ..., j and j represents the number of types of processed visual data;
segmenting the time T using a time t as the segment length to obtain time segment intervals, the specific value of t being set by an operator, and then obtaining the number of uses $C_i$ of processed visual data i within a time segment interval;
and similarly obtaining the number of uses in every time segment interval within the time T, calculating the difference between the numbers of uses in each pair of adjacent time segment intervals, calculating the average of these differences, normalizing the average, and recording the normalized result as the fluctuation index.
As a further scheme of the invention, the specific way of obtaining the rare index is as follows:
segmenting the time T to obtain time segment intervals, obtaining the number of occurrences of processed visual data i in the time segment intervals within the time T, normalizing the number of occurrences, and recording the normalized number of occurrences as the rare index.
As a further aspect of the present invention, the specific manner of screening the focus of attention is:
summing the fluctuation index and the rare index to obtain the analysis index of the processed visual data; performing the same analysis on all processed visual data to obtain their analysis indexes; sorting the analysis indexes from largest to smallest and comparing them with a threshold whose specific value is set by an operator; and marking the processed visual data whose analysis index is greater than the threshold as the focus of attention.
As a further scheme of the present invention, the specific way of generating the optimization criteria is as follows:
obtaining the standard type together with the user feedback data corresponding to the standard type; obtaining the layout defects of the standard type from the user feedback data; obtaining the corresponding optimization schemes according to the layout defects, an optimization scheme being one drawn from models of the same type and labeled n, where n = 1, 2, ..., m and m represents the number of optimization schemes; obtaining the user feedback data corresponding to each optimization scheme; and calculating the feedback index and the matching index of each optimization scheme according to the user feedback data.
As a further aspect of the present invention, the feedback index of the optimization scheme is calculated as follows:
obtaining all user feedback data, obtaining the feedback values corresponding to optimization scheme n in the user feedback data, calculating the average of all the feedback values, normalizing the average, and recording the result as the feedback index.
As a further aspect of the present invention, the means for calculating the matching index of the optimization scheme is as follows:
obtaining an optimization scheme and performing matching analysis between the optimization scheme and the standard type in terms of layout; obtaining the space layout ratio of the optimization scheme to the standard type and normalizing it; analyzing the element coverage rate of the optimization scheme relative to the standard type and normalizing it; and summing the normalized space layout ratio and the normalized element coverage rate to obtain the corresponding matching index.
As a further scheme of the present invention, the specific way of generating the optimization criteria is as follows:
summing the feedback index and the matching index to obtain the selection value of each optimization scheme, and generating the optimization standard from the optimization scheme with the largest selection value.
The invention provides a user-oriented visual data model design method. Compared with the prior art, it has the following beneficial effects:
The invention analyzes the number of uses of the processed visual data in the historical data across different time segment intervals and calculates a fluctuation index to measure how much the use of the data fluctuates. It also analyzes the number of occurrences of the processed visual data in the time segment intervals and calculates a rare index to measure how scarce the data are. The two are combined into an analysis index, so that the user's focus of attention is determined accurately and a direction is provided for targeted optimization.
The invention establishes a layout optimization system based on user feedback data. User feedback is collected, layout defects are analyzed, and optimization schemes are formulated with reference to the successful experience of models of the same type. The feedback index and the matching index are calculated and combined into a selection value to determine the most advantageous optimization scheme, which improves user satisfaction and operating efficiency; on an e-commerce platform, this can increase user dwell time and purchase conversion rate.
Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art on the basis of these embodiments without inventive effort fall within the scope of the invention.
Example 1
Referring to Fig. 1, the application provides a user-oriented visual data model design method, which specifically comprises the following steps:
Step1, obtain visual data. Visual data can come from many sources, such as an enterprise database, public data sets on the Internet, or data collected by sensors. Taking an e-commerce enterprise as an example, the data may come from order data recorded by the transaction system or from browsing and purchasing behavior data collected by a user behavior monitoring system. The data are preprocessed to obtain preprocessed data. Preprocessing includes data cleaning, denoising, deletion of duplicate data, and handling of missing values: duplicate records are identified and deleted by a data deduplication algorithm, and missing values are handled by methods such as mean filling, median filling, or machine-learning-based predictive filling. The preprocessed data are then converted as required by the visualization requirements and grouped and aggregated to obtain the processed visual data. Continuing with the e-commerce order data, if the total sales of each commodity category in each month are to be displayed, the order data can be grouped by month and commodity category and the order amounts in each group summed, yielding the total sales of each commodity category in each month.
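The preprocessing and aggregation described in Step1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the method's prescribed implementation: the record layout, the sample orders, and the function names `preprocess` and `total_sales` are hypothetical, and mean filling is only one of the filling methods the text mentions.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical raw order records: (order_id, month, category, amount).
# An amount of None stands for a missing value.
raw_orders = [
    (1, "2024-01", "books", 20.0),
    (2, "2024-01", "books", None),
    (2, "2024-01", "books", None),   # exact duplicate of the previous record
    (3, "2024-01", "toys", 15.0),
    (4, "2024-02", "books", 40.0),
]

def preprocess(orders):
    """Delete exact duplicate records, then fill missing amounts with the mean."""
    deduped = list(dict.fromkeys(orders))  # keeps the first occurrence, in order
    known = [amount for *_, amount in deduped if amount is not None]
    fill = mean(known)
    return [(oid, m, c, fill if a is None else a) for oid, m, c, a in deduped]

def total_sales(orders):
    """Group by (month, category) and sum the order amounts in each group."""
    totals = defaultdict(float)
    for _, month, category, amount in orders:
        totals[(month, category)] += amount
    return dict(totals)

processed = total_sales(preprocess(raw_orders))
```

Here `processed` maps each (month, category) pair to its total sales, which is the grouped and aggregated form the later steps work with.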
Step2, after the raw data have been obtained, preprocessed, and grouped, the processed visual data are available. Their key characteristics are then analyzed, mainly the data type and the data scale. Data types can be classified as numeric (integers and floating-point numbers, such as product sales volume and price), character (strings, such as product names and customer names), date-time (such as order dates and user registration times), and so on. Data scale is mainly reflected in data volume: a database table containing millions of transaction records has a relatively large volume, whereas a data set with only a few hundred records has a small one.
Meanwhile, all types that may be applicable to data visualization are obtained; common visualization types include the histogram, line chart, pie chart, scatter chart, map, and so on. These visualization types are then preliminarily screened according to the specific visualization requirements to obtain the preselected types. For example, if the requirement is to show the sales proportion of different product categories, pie charts, bar charts, and similar types can be preselected because they are good at showing proportion and contrast; if the requirement is to show the sales trend of a product over time, line charts, area charts, and similar types are preselected.
The preselected types are then further screened against the data characteristics to determine the final standard type. Specifically, for each preselected type, the data characteristics it is suited to are obtained and recorded as the characteristics to be analyzed. The characteristics to be analyzed are then matched against the actual data characteristics of the processed visual data, and the preselected types whose characteristics to be analyzed match the actual data characteristics are marked as standard types.
For example, assume that the processed visual data are the annual sales data of each region of a company. The data types include region (character type) and sales (numeric type), and the data scale is the sales records of 50 regions, a moderate data volume. The visualization requirement is to compare sales differences among the regions. The preselected types obtained by the preliminary screening are a column chart (vertical bars) and a bar chart (horizontal bars). For the column chart, the characteristic to be analyzed is that it suits the comparison of numeric data and clearly shows numeric differences among categories (regions). The bar chart also suits the comparison of numeric data, and when there are many categories its horizontal arrangement is easier to read. Since the data characteristics of the processed visual data match the characteristics to be analyzed of both chart types, both can be used as standard types for visualizing the data.
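The two-stage screening in Step2 can be sketched as a lookup against a catalogue of visualization types. Everything in the catalogue below (the type names, requirement labels, data-type labels, and record limits) is a hypothetical assumption chosen to reproduce the regional sales example; the text does not specify concrete matching rules.

```python
# Hypothetical catalogue: for each visualization type, the requirements it
# serves and the data characteristics it suits ("characteristics to be analyzed").
CATALOGUE = {
    "column chart": {"needs": {"comparison", "proportion"},
                     "types": {"character", "numeric"}, "max_records": 100},
    "bar chart":    {"needs": {"comparison", "proportion"},
                     "types": {"character", "numeric"}, "max_records": 200},
    "pie chart":    {"needs": {"proportion"},
                     "types": {"character", "numeric"}, "max_records": 10},
    "line chart":   {"needs": {"trend"},
                     "types": {"datetime", "numeric"}, "max_records": 100_000},
}

def preselect(requirement):
    """Stage 1: keep the types that serve the visualization requirement."""
    return [name for name, spec in CATALOGUE.items() if requirement in spec["needs"]]

def screen(preselected, data_types, n_records):
    """Stage 2: keep the preselected types whose expected characteristics
    match the actual data types and data scale."""
    return [
        name for name in preselected
        if CATALOGUE[name]["types"] <= data_types
        and n_records <= CATALOGUE[name]["max_records"]
    ]

# Regional sales example: region (character) and sales (numeric), 50 records.
standard_types = screen(preselect("comparison"), {"character", "numeric"}, 50)
```

With these assumed rules, `standard_types` contains both the column chart and the bar chart, matching the outcome described in the example.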
Step3, obtain the standard type and the processed visual data, and obtain all historical data within a time T, whose specific value is set by an operator. The historical data are analyzed to determine the user's focus of attention; the analysis covers two aspects, data volatility and data scarcity.
To analyze how the use of the processed visual data changes across time periods, the processed visual data in the historical data are first obtained and labeled i, where i = 1, 2, ..., j and j represents the total number of types of processed visual data.
For example, in an e-commerce sales data visualization project, the processed visual data may include sales charts by product category, distribution maps of order amounts by region, user purchase frequency charts, and so on, and j is the number of these different types of visual data.
The total time range T is then segmented using a time t as the segment length, yielding a series of time segment intervals. The specific value of t is set by an operator according to the actual analysis requirements.
For example, if the time range T of interest is one year (January 1 to December 31) and the operator wants to analyze at monthly granularity, t is set to one month, and the time segment intervals are January 1 to January 31, February 1 to February 28, and so on.
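As an illustration of the segmentation, the sketch below splits one calendar year into monthly time segment intervals. The function name `monthly_intervals` and the choice of 2023 (a non-leap year, so that February ends on the 28th as in the example) are assumptions made for the illustration.

```python
import calendar
from datetime import date

def monthly_intervals(year):
    """Split one calendar year (T) into month-long intervals (t = one month).

    Returns a list of (first_day, last_day) pairs.
    """
    return [
        (date(year, month, 1),
         date(year, month, calendar.monthrange(year, month)[1]))
        for month in range(1, 13)
    ]

intervals = monthly_intervals(2023)
```

Each pair in `intervals` bounds one time segment interval, and usage counts are then gathered per interval.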
Then, for each item of processed visual data labeled i, the number of uses $C_i$ in each time segment interval is counted. For example, if the sales chart by product category (labeled i = 1) is viewed 15 times by market analysts in the interval January 1 to January 31, then $C_1 = 15$ in that interval.
In the same way, the number of uses of each item of processed visual data is obtained for all time segment intervals within the time T.
After the number of uses in each time segment interval has been obtained, the difference between the numbers of uses in each pair of adjacent time segment intervals is calculated. For the processed visual data labeled i, let the number of uses in the k-th time segment interval be $C_{i,k}$ and that in the (k+1)-th be $C_{i,k+1}$; the difference between the two adjacent intervals is then $C_{i,k+1} - C_{i,k}$.
For example, for the sales chart (i = 1), if the number of uses from January 1 to January 31 is $C_{1,1} = 15$ and the number of uses from February 1 to February 28 is $C_{1,2} = 20$, the difference between the adjacent intervals is 20 - 15 = 5.
The average of the differences in the number of uses over all adjacent time segment intervals is then calculated. For the processed visual data labeled i, assuming there are n time segment intervals in total (with n ≥ 2, so that at least one adjacent pair exists), the average of the differences is $\bar{D}_i = \frac{1}{n-1}\sum_{k=1}^{n-1}\left(C_{i,k+1} - C_{i,k}\right)$.
Continuing with the sales chart example, a year has 12 months, so n = 12 and there are 11 differences between adjacent months: $C_{1,2} - C_{1,1}$, $C_{1,3} - C_{1,2}$, ..., $C_{1,12} - C_{1,11}$. Their sum divided by 11 gives the average of the differences, $\bar{D}_1$.
Finally, the calculated average is normalized. Normalization allows the fluctuation of different types of processed visual data to be compared on the same scale. Various normalization methods are in common use, such as max-min normalization.
Assume that, over all processed visual data, the maximum of the averages of the differences in the number of uses is $\bar{D}_{\max}$ and the minimum is $\bar{D}_{\min}$. For the processed visual data labeled i, whose average difference is $\bar{D}_i$, the normalized result is the fluctuation index $FI_i = \frac{\bar{D}_i - \bar{D}_{\min}}{\bar{D}_{\max} - \bar{D}_{\min}}$. The fluctuation index $FI_i$ lies in the range 0 to 1: the larger the value, the more sharply the number of uses of that type of processed visual data fluctuates within the time T; the smaller the value, the more stable the number of uses.
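The fluctuation index computation can be sketched as below. The usage counts and item names are hypothetical. The sketch takes the signed difference between adjacent intervals, as in the 20 - 15 = 5 example in the text, and returns 0 for every item when all averages are equal; that fallback is an assumption made here to avoid division by zero, not something the text specifies.

```python
def mean_adjacent_difference(counts):
    """Average of the differences between adjacent per-interval usage counts."""
    if len(counts) < 2:
        raise ValueError("at least two time segment intervals are required")
    diffs = [later - earlier for earlier, later in zip(counts, counts[1:])]
    return sum(diffs) / len(diffs)

def min_max_normalize(values):
    """Max-min normalization of a list of numbers into the range 0 to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:                      # assumed fallback: no spread, index 0
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical usage counts per time segment interval for three items.
usage = {
    "sales chart by category": [15, 20, 18, 30],
    "order amount map": [10, 10, 11, 10],
    "purchase frequency chart": [5, 9, 14, 11],
}

averages = [mean_adjacent_difference(c) for c in usage.values()]
fluctuation_index = dict(zip(usage, min_max_normalize(averages)))
```

With these sample counts the averages are 5.0, 0.0, and 2.0, so the sales chart receives the highest fluctuation index and the order amount map the lowest.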
Data scarcity is analyzed as follows: the time T is segmented to obtain time segment intervals, and the number of occurrences of processed visual data i in the time segment intervals within the time T is obtained. The number of occurrences is then normalized in the same way as for data volatility, and the normalized number of occurrences is recorded as the rare index.
The fluctuation index and the rare index are summed to obtain the analysis index of the processed visual data, and the same analysis is performed on all processed visual data to obtain their analysis indexes. The analysis indexes are sorted from largest to smallest and compared with a threshold whose specific value is set by an operator, and the processed visual data whose analysis index is greater than the threshold are marked as the focus of attention.
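The screening of the focus of attention can be sketched as follows. The item labels, fluctuation indexes, occurrence counts, and the threshold are all hypothetical values. As the text describes, the rare index is simply the max-min normalized number of occurrences.

```python
def min_max_normalize(values):
    """Max-min normalize a dict of label -> number into the range 0 to 1."""
    lo, hi = min(values.values()), max(values.values())
    if hi == lo:                      # assumed fallback when all values are equal
        return {key: 0.0 for key in values}
    return {key: (v - lo) / (hi - lo) for key, v in values.items()}

# Hypothetical inputs for three items of processed visual data.
fluctuation_index = {"A": 1.0, "B": 0.0, "C": 0.4}
occurrences = {"A": 30, "B": 10, "C": 40}
THRESHOLD = 1.0                       # set by the operator

rare_index = min_max_normalize(occurrences)
analysis_index = {k: fluctuation_index[k] + rare_index[k] for k in fluctuation_index}

# Sort from largest to smallest, then keep the items above the threshold.
ranked = sorted(analysis_index.items(), key=lambda item: item[1], reverse=True)
focus_of_attention = [label for label, value in ranked if value > THRESHOLD]
```

In this sample, items A and C exceed the threshold and become the focus of attention, while B does not.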
Step4, first determine the standard type of the layout. Standard types usually derive from widely accepted industry best practices, established design specifications, or mature design patterns. For example, in the layout design of an e-commerce commodity display page, standard types may include a waterfall flow layout (commodities are displayed in a continuous vertical arrangement without paging) and a grid layout (commodities are displayed in regular rows). After the standard type is determined, the corresponding user feedback data are collected. This can be done in several ways, such as placing a user satisfaction questionnaire on the platform that asks users to rate the usability, visual effect, and ease of finding commodities in the current layout, or recording users' operation paths, dwell time, and similar data on the page with a user behavior analysis tool, so as to capture their actual experience of the layout. The shortcomings of the standard-type layout are then analyzed on the basis of the collected user feedback data. For example, for the waterfall flow layout, the feedback data may show that some users spend a long time finding a specific commodity, indicating that the layout is deficient in commodity positioning; for the grid layout, users may frequently experience visual fatigue while browsing, indicating that the visual design or the arrangement of elements needs improvement.
Corresponding optimization schemes are then sought for the identified layout defects. These optimization schemes usually draw on the successful experience of models of the same type. Taking the e-commerce platform as an example, if the commodity positioning problem of the waterfall flow layout is prominent, the optimization scheme may be to add commodity classification navigation to the page sidebar or to introduce an intelligent search and filtering function; for the visual fatigue problem of the grid layout, the optimization scheme may be to adjust the display style of commodity pictures and increase the white space between elements to improve visual comfort. The optimization schemes are labeled n, where n = 1, 2, ..., m and m represents the number of optimization schemes.
User feedback data associated therewith is also collected for each optimization scheme. On this basis, two key indexes, namely a feedback index and a matching index, are calculated.
Calculating the feedback index: all user feedback data related to optimization scheme n are collected and the feedback values are extracted. A feedback value may be the user's rating in a questionnaire (for example, 1 to 5 points) or a positive or negative sentiment score extracted from user comments by sentiment analysis. The average of the feedback values is calculated and then normalized. Assume that optimization scheme 1 (adding sidebar navigation to the waterfall flow layout) receives user scores of 3, 4, 5, 4, and 3, giving an average of (3 + 4 + 5 + 4 + 3) / 5 = 3.8. If the maximum of the average feedback values over all optimization schemes is 4.5 and the minimum is 2.5, the feedback index of optimization scheme 1 is (3.8 - 2.5) / (4.5 - 2.5) = 0.65.
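The feedback index arithmetic can be reproduced with a short sketch. The scores for scheme 1 come from the example above; the scores for schemes 2 and 3 are hypothetical, chosen only so that the maximum and minimum averages are 4.5 and 2.5 as in the example.

```python
from statistics import mean

# Feedback values per optimization scheme n. Scheme 1 uses the scores from the
# example; schemes 2 and 3 are hypothetical and give averages of 4.5 and 2.5.
feedback_values = {
    1: [3, 4, 5, 4, 3],
    2: [5, 4, 5, 4],
    3: [2, 3, 3, 2],
}

averages = {n: mean(scores) for n, scores in feedback_values.items()}
lowest, highest = min(averages.values()), max(averages.values())

# Max-min normalization of each scheme's average feedback value.
feedback_index = {n: (avg - lowest) / (highest - lowest) for n, avg in averages.items()}
```

For scheme 1 this gives (3.8 - 2.5) / (4.5 - 2.5), that is, a feedback index of approximately 0.65.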
Calculating the matching index: taking optimization scheme 1 as an example, matching analysis is first performed between the optimization scheme and the standard type (waterfall flow layout) in terms of layout, to determine how similar the two are in spatial layout, and the space layout ratio is calculated. Assuming that the commodity display area occupies 70% of the total page area in the standard type and 68% in optimization scheme 1, the space layout ratio is 68% / 70% ≈ 0.97. All the space layout ratios involved are then normalized; in this example each ratio is divided by the maximum, so if the maximum is 1.1 and the minimum is 0.8, the normalized space layout ratio of optimization scheme 1 is 0.97 / 1.1 ≈ 0.88. Meanwhile, the element coverage rate of the optimization scheme relative to the standard type is analyzed. For example, if the commodity picture elements in the standard type should cover 40% of the total page area and the commodity pictures in optimization scheme 1 actually cover 38% of the total page area, the element coverage rate is 38% / 40% = 0.95. The element coverage rate is normalized in the same way: if the maximum of all element coverage rates is 1.05 and the minimum is 0.9, the normalized element coverage rate of optimization scheme 1 is 0.95 / 1.05 ≈ 0.90. Adding the normalized space layout ratio and the normalized element coverage rate gives the matching index of optimization scheme 1: 0.88 + 0.90 = 1.78.
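The matching index arithmetic can be sketched as below, assuming the normalization used in the worked example (division by the maximum value). The function name `matching_index` and its parameter names are hypothetical. Because the sketch does not round intermediate values, it yields about 1.79 rather than the 1.78 obtained in the text from rounded intermediates.

```python
def matching_index(scheme_area, standard_area, scheme_coverage, standard_coverage,
                   max_layout_ratio, max_coverage_rate):
    """Sum of the normalized space layout ratio and normalized element coverage rate.

    Both ratios compare the optimization scheme with the standard type and are
    normalized by dividing by the maximum value, as in the worked example.
    """
    layout_ratio = scheme_area / standard_area            # 0.68 / 0.70 in the example
    coverage_rate = scheme_coverage / standard_coverage   # 0.38 / 0.40 in the example
    return layout_ratio / max_layout_ratio + coverage_rate / max_coverage_rate

# Values from the example for optimization scheme 1.
index_scheme_1 = matching_index(
    scheme_area=0.68, standard_area=0.70,
    scheme_coverage=0.38, standard_coverage=0.40,
    max_layout_ratio=1.1, max_coverage_rate=1.05,
)
```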
The feedback index of each optimization scheme is added to its matching index to obtain the corresponding selection value. Continuing with optimization scheme 1, its selection value is 0.65 + 1.78 = 2.43. The selection values of all optimization schemes are compared, and the optimization scheme with the largest selection value is the most advantageous. In this e-commerce case, if the selection values of the other optimization schemes are all smaller than 2.43, optimization scheme 1 (adding sidebar navigation to the waterfall flow layout) is determined as the final optimization standard.
Because the optimization schemes are generated from user feedback data, the pain points users encounter when using the layout can be addressed precisely, improving user satisfaction and operating efficiency. For example, the optimized layout on an e-commerce platform lets users find the commodities they need faster and reduces browsing fatigue, which increases user dwell time on the platform and the purchase conversion rate.
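Selecting the optimization standard then reduces to taking the scheme with the largest selection value. In the sketch below, the indexes for scheme 1 come from the example, while those for schemes 2 and 3 are hypothetical.

```python
# Feedback and matching indexes per optimization scheme. Scheme 1 uses the
# values from the example; schemes 2 and 3 are hypothetical.
schemes = {
    1: {"feedback": 0.65, "matching": 1.78},
    2: {"feedback": 1.00, "matching": 1.20},
    3: {"feedback": 0.00, "matching": 1.55},
}

# Selection value = feedback index + matching index.
selection_value = {n: s["feedback"] + s["matching"] for n, s in schemes.items()}

# The scheme with the largest selection value becomes the optimization standard.
optimization_standard = max(selection_value, key=selection_value.get)
```

With these values, scheme 1 has the largest selection value (2.43) and is chosen as the optimization standard.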
Step5, obtain the optimization standard and the focus of attention, and design the focus of attention using the design pattern of the optimization standard as the reference to generate the corresponding design information.
The data in the above formulas are calculated using their numerical values only, without units. Contents not described in detail in this specification belong to the prior art known to those skilled in the art.
The above embodiments only illustrate the technical solution of the present invention and do not limit it. Those skilled in the art should understand that the technical solution of the present invention may be modified or equivalently substituted without departing from its spirit and scope.