
CN110807060A - Education big data analysis system - Google Patents

Publication number
CN110807060A
Authority
CN
China
Prior art keywords
data
educational
module
analysis
big data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911048339.8A
Other languages
Chinese (zh)
Inventor
何罡
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Prehua International Education Technology Co Ltd
Original Assignee
Beijing Prehua International Education Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Prehua International Education Technology Co Ltd
Priority to CN201911048339.8A
Publication of CN110807060A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/26 Visual data mining; Browsing structured data
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G06F16/285 Clustering or classification
    • G06F16/287 Visualization; Browsing
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/20 Education
    • G06Q50/205 Education administration or guidance

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Human Resources & Organizations (AREA)
  • Physics & Mathematics (AREA)
  • Educational Administration (AREA)
  • Strategic Management (AREA)
  • General Physics & Mathematics (AREA)
  • Tourism & Hospitality (AREA)
  • Economics (AREA)
  • General Engineering & Computer Science (AREA)
  • Educational Technology (AREA)
  • General Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Development Economics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Game Theory and Decision Science (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Electrically Operated Instructional Devices (AREA)

Abstract

The invention relates to the field of educational big data analysis, and in particular to an educational big data analysis system comprising an educational data acquisition module, an educational data sorting module, an educational data mining module and an educational data analysis module. The invention takes students' academic results as its data source and uses data mining technology to provide an efficient way of processing educational big data so that it can be used rationally and effectively. It draws on artificial intelligence and statistics as well as database management and use. Applied to the analysis of student grades, it makes the data more timely, concise and clear. Multiple courses are selected from the grade database as research objects to find out whether a given course influences the offering of other courses, which provides a reference for teaching staff when scheduling courses and a basis for students when selecting courses. The system can be deployed in the management centers of colleges and universities and has the advantages of complete functions and reliable performance.

Description

An educational big data analysis system

Technical field

The invention relates to the technical field of educational big data analysis, and in particular to an educational big data analysis system.

Background

Colleges and universities treat students' examination results in each subject as an important indicator when evaluating academic performance and overall ability. Over long periods of operation they have accumulated and stored large amounts of student grade information, yet they pay little attention to it. Analysis of grades generally remains at the stage of simple queries and statistics, such as counting the number of excellent, good, passing and failing grades, calculating the average score and standard deviation, and calculating and tallying grade points. The relationship between the grades students obtain and the courses themselves has not been examined in depth, and it has not been recognized that the stored grades are an important basis for course scheduling. Courses are still scheduled manually by the dean of teaching or the department heads, who rely on years of teaching experience and the relevant regulations to decide which courses to offer to students and in what order. This inevitably involves a degree of subjectivity and ignores the valuable resource of student grades accumulated over the years.

Summary of the invention

The purpose of the present invention is to provide an educational big data analysis system that addresses one or more of the defects described in the background above.

To achieve this purpose, the present invention provides the following technical solution:

An educational big data analysis system comprises an educational data acquisition module, an educational data sorting module, an educational data mining module and an educational data analysis module, with communication connections between the modules. The educational data acquisition module collects educational big data and sends it to the educational data sorting module. The educational data sorting module preprocesses the educational big data: it cleans the acquired data according to a preset standard format, filters out redundant information, stores educational big data of different attributes and formats, classified by attribute, as stored data in the corresponding template format, and labels the identified types with classification labels to obtain classified data. The educational data mining module mines the labeled data in the database, retrieves all frequent itemsets in the educational database, and uses the frequent itemsets to construct rules that satisfy the minimum confidence. The educational data analysis module accepts analysis requests of different dimensions submitted by users, extracts the stored data from the educational data fusion storage system for data analysis, and displays the analysis results visually.
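The cleaning and labeling step performed by the sorting module can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration and is not taken from the patent: the `Record` fields, the 0 to 100 score range, the rule that a repeated (student, course) pair counts as redundant, and the label value `"grade"`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    # Hypothetical "preset standard format" for one grade entry.
    student_id: str
    course: str
    score: float
    label: str = ""


def clean_and_label(raw_rows):
    """Normalise raw rows, drop invalid or duplicate ones, attach a label."""
    seen = set()
    cleaned = []
    for row in raw_rows:
        student = str(row.get("student_id", "")).strip()
        course = str(row.get("course", "")).strip().lower()
        try:
            score = float(row.get("score"))
        except (TypeError, ValueError):
            continue  # unparseable score: treated as noise
        # Assumed validity rule: non-empty keys and a score in [0, 100].
        if not student or not course or not 0 <= score <= 100:
            continue
        key = (student, course)
        if key in seen:
            continue  # redundant duplicate of an entry already kept
        seen.add(key)
        cleaned.append(Record(student, course, score, label="grade"))
    return cleaned
```

In a fuller system the label would presumably be chosen per data type (audio, image, text, grades); a single fixed label keeps the sketch short.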

Preferably, the educational data acquisition module includes a network port, a campus port and a manual port. The network port is connected to the cloud server and collects educational big data from network platforms; the campus port is connected to campus management equipment and collects the educational big data recorded and generated by that equipment; the manual port is used to enter missing data manually.

Preferably, the collection methods of the educational data acquisition module include network collection, online campus data extraction and manual entry.

Preferably, when the classified data is audio data, image data and/or text data, an extraction submodule is further used to extract from it first data characterizing student behavior during class, the first data including: learning status, interest questionnaires, access records, reply/question content, number of questions answered and/or homework completion; and to extract second data characterizing teacher behavior, the second data including: lecturing style, interest questionnaires, access records, reply/question content, number of questions asked, teaching progress and/or homework assignment.

Preferably, the educational data mining module uses association rules and the Apriori algorithm to preprocess the classified data.

Preferably, the association rules reflect the interdependence and correlation between one data item and other data items, specifically:

Let $I=\{i_1,i_2,i_3,\ldots,i_n\}$ be the set of data items, where $i_n$ is a data item; let $D$ be the set of database records $T$, where $T$ is the unique data number of each record; and let $X$ and $Y$ be sets of items in $I$ with $X\cap Y=\varnothing$. An association rule is a logical implication of the form

$$X \Rightarrow Y.$$

The support of the rule $X \Rightarrow Y$ in the data set $D$ is the ratio of the number of records that contain both $X$ and $Y$ to the total number of records. It reflects the reliability of the rule and is written $\mathrm{support}(X \Rightarrow Y)$, with

$$\mathrm{support}(X \Rightarrow Y) = P(X \cup Y).$$

If the support of a data set exceeds the minimum support threshold given by the user, that data set is a frequent data set (frequent itemset). The degree of certainty of the rule is written $\mathrm{confidence}(X \Rightarrow Y)$, with

$$\mathrm{confidence}(X \Rightarrow Y) = P(Y \mid X).$$

A rule that satisfies both the minimum support threshold and the minimum confidence threshold is called a strong rule. Given a data set D, the association rule mining problem is to find the association rules whose support and confidence are greater than the respective minimum thresholds given by the user.
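The two measures defined above can be computed directly from a list of records. The following is a minimal Python sketch, assuming each record is represented as a set of item names; the function names and the course names in the usage note are illustrative and not part of the patent.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, x, y):
    """Estimate P(Y | X) as support(X and Y together) / support(X)."""
    support_x = support(transactions, x)
    if support_x == 0:
        return 0.0  # X never occurs, so the rule has no evidence
    return support(transactions, set(x) | set(y)) / support_x
```

For four records `{calculus, physics}`, `{calculus, physics, circuits}`, `{calculus}` and `{physics}`, the rule calculus ⇒ physics has support 2/4 = 0.5 and confidence 0.5 / 0.75 = 2/3.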

Preferably, the Apriori algorithm uses a breadth-first iterative search. It first finds the frequent 1-itemsets L1, then uses L1 to find the frequent 2-itemsets L2, and so on until all frequent itemsets have been found. When the number of frequent itemsets found at some level is zero, the computation stops and all frequent itemsets are output.
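The level-by-level search can be sketched in Python as follows. This is a compact textbook-style version and not the patent's implementation: candidates for level k are built by joining frequent (k-1)-itemsets and pruned using the property that every subset of a frequent itemset must itself be frequent. The `>=` comparison is the usual convention and is an assumption here, since the text speaks of support "exceeding" the threshold.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return a dict mapping each frequent itemset (frozenset) to its support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    if n == 0:
        return {}

    def frequent(candidates):
        # Keep only candidates whose support reaches the threshold.
        kept = {}
        for c in candidates:
            s = sum(1 for t in transactions if c <= t) / n
            if s >= min_support:  # assumed convention, see lead-in
                kept[c] = s
        return kept

    items = {item for t in transactions for item in t}
    level = frequent({frozenset([i]) for i in items})  # L1
    all_frequent = dict(level)
    k = 2
    while level:  # stop when a level yields no frequent itemsets
        prev = set(level)
        # Join step: unions of two (k-1)-itemsets that have exactly k items.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset must already be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        level = frequent(candidates)  # Lk
        all_frequent.update(level)
        k += 1
    return all_frequent
```

The loop ends exactly as the text describes: as soon as a level produces no frequent itemsets, nothing larger can be frequent, so the accumulated result is returned.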

Preferably, the educational data analysis module includes a template library, a warehouse and a modeling platform, all of which are connected to the cloud server.

Preferably, the template library includes a recognition unit and a matching unit. The recognition unit has a fuzzy classifier for fuzzy classification of analysis requests and is connected to the matching unit. The matching unit contains several matching templates and is connected to the modeling platform, to which it sends the processing information. The modeling platform includes modeling templates and a data processing unit; the modeling templates store several standard modeling models, and the data processing unit is connected to both the modeling templates and the educational data fusion storage system. The warehouse is connected to the data processing unit and stores the generated analysis results.

Preferably, the method by which the educational data analysis module generates analysis results includes:

S1: Input an analysis request;

S2: The template library generates processing information through the recognition unit and the matching unit, and sends the processing information to the modeling platform;

S3: The modeling platform retrieves a modeling model and generates an analysis report for the request;

S4: Send the analysis results to the warehouse.
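Steps S1 to S4 can be traced in a small Python sketch. The patent does not specify the fuzzy classifier, the matching templates or the standard models, so everything concrete here is a hypothetical stand-in: a keyword-overlap score replaces the fuzzy classifier, `TEMPLATES` and `MODELS` hold two invented entries, and the pass mark of 60 is an assumption.

```python
from dataclasses import dataclass, field
from statistics import mean

# Hypothetical matching templates: model name -> keywords that suggest it.
TEMPLATES = {
    "average": {"average", "mean", "avg"},
    "pass_rate": {"pass", "rate", "fail"},
}

# Hypothetical standard models held by the modeling platform.
MODELS = {
    "average": lambda scores: mean(scores),
    "pass_rate": lambda scores: sum(s >= 60 for s in scores) / len(scores),
}


@dataclass
class Warehouse:
    """Stores generated analysis results (S4)."""
    results: list = field(default_factory=list)

    def store(self, report):
        self.results.append(report)


def recognise(request):
    """Stand-in for the fuzzy classifier: score templates by keyword overlap."""
    words = set(request.lower().split())
    scores = {name: len(words & kw) / len(kw) for name, kw in TEMPLATES.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


def handle_request(request, scores, warehouse):
    """S1: take a request; S2: match a template; S3: run a model; S4: store."""
    model_name = recognise(request)                 # S2
    if model_name is None:
        raise ValueError("no template matches the request")
    value = MODELS[model_name](scores)              # S3
    report = {"request": request, "model": model_name, "value": value}
    warehouse.store(report)                         # S4
    return report
```

Calling `handle_request("average score for calculus", [50, 70, 90], Warehouse())` selects the `average` model and stores a report whose value is 70.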

Compared with the prior art, the beneficial effects of the present invention are:

1. The educational big data analysis system takes students' academic results as its data source, selects multiple courses from the grade database as research objects, and finds out whether a given course influences the offering of other courses. This provides a reference for teaching staff when scheduling courses and a basis for students when selecting courses.

2. The educational big data analysis system uses data mining technology to provide an efficient way of processing educational big data so that it can be used rationally and effectively. It draws on artificial intelligence and statistics as well as database management and use, and when applied to the analysis of student grades it makes the data more timely, concise and clear. In addition, data from the network port, the campus port and the manual port are integrated and analyzed together to obtain educational big data. The acquired data are sorted and classified into classified data that meet the preset requirements, and the classified data are stored in the corresponding database for unified storage and management so that they can be retrieved later. When the data need to be analyzed and counted, the data stored in the database are retrieved and analyzed in depth, and targeted services are provided according to the analysis results, which improves the quality and efficiency of student learning and teacher instruction and achieves twice the result with half the effort.

3. As a system that can provide services tailored to the needs of students and/or teachers, the educational big data analysis system can be deployed in the management centers of colleges and universities. There it can carry out in-depth processing, including classification, storage, analysis and service provision, of the data collected by sensors in an Internet-based architecture as well as manually imported data, and it has the advantages of complete functions and reliable performance.

Description of drawings

FIG. 1 is a schematic diagram of the module flow of the present invention;

FIG. 2 is a schematic diagram of the composition of the educational data analysis module of the present invention.

Detailed description

The technical solutions in the embodiments of the present invention are described clearly and completely below. The described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by those of ordinary skill in the art on the basis of these embodiments without creative effort fall within the protection scope of the present invention.

Embodiment 1

An educational big data analysis system, as shown in FIG. 1, comprises an educational data acquisition module 1, an educational data sorting module 2, an educational data mining module 3 and an educational data analysis module 4, with communication connections between the modules. The educational data acquisition module 1 collects educational big data and sends it to the educational data sorting module 2. The educational data sorting module 2 preprocesses the educational big data: it cleans the acquired data according to a preset standard format, filters out redundant information, stores educational big data of different attributes and formats, classified by attribute, as stored data in the corresponding template format, and labels the identified types with classification labels to obtain classified data. The educational data mining module 3 mines the labeled data in the database, retrieves all frequent itemsets in the educational database, and uses the frequent itemsets to construct rules that satisfy the minimum confidence. The educational data analysis module 4 accepts analysis requests of different dimensions submitted by users, extracts the stored data from the educational data fusion storage system for data analysis, and displays the analysis results visually.

Further, the educational data acquisition module 1 includes a network port, a campus port and a manual port. The network port is connected to the cloud server and collects educational big data from network platforms; the campus port is connected to campus management equipment and collects the educational big data recorded and generated by that equipment; the manual port is used to enter missing data manually. The collection methods of the educational data acquisition module 1 include network collection, online campus data extraction and manual entry. Data from the network port, the campus port and the manual port can therefore be integrated and analyzed together, so that the data are more comprehensive and the analysis results more accurate.
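How the three ports might be combined can be shown with a short Python sketch. The data layout (one dict of fields per record key, with `None` marking a missing field) and the merge policy (automatic sources first, manual entries only filling what is still missing) are assumptions chosen to mirror the description of the manual port, not details given in the patent.

```python
def merge_sources(network, campus, manual):
    """Merge per-record field dicts from the three ports into one dict."""
    merged = {}
    # Automatic sources: later sources overwrite earlier ones, but a
    # missing value (None) never overwrites a known value.
    for source in (network, campus):
        for key, fields in source.items():
            known = {k: v for k, v in fields.items() if v is not None}
            merged.setdefault(key, {}).update(known)
    # Manual port: only fills fields that are still missing.
    for key, fields in manual.items():
        target = merged.setdefault(key, {})
        for name, value in fields.items():
            if target.get(name) is None:
                target[name] = value
    return merged
```

With this policy a manually entered value never replaces a value that the network or campus port already supplied, which matches the description of the manual port as a way to supplement missing data.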

Specifically, when the classified data is audio data, image data and/or text data, an extraction submodule is further used to extract from it first data characterizing student behavior during class, the first data including: learning status, interest questionnaires, access records, reply/question content, number of questions answered and/or homework completion; and to extract second data characterizing teacher behavior, the second data including: lecturing style, interest questionnaires, access records, reply/question content, number of questions asked, teaching progress and/or homework assignment.

In addition, as shown in FIG. 2, the educational data analysis module includes a template library, a warehouse and a modeling platform, all of which are connected to the cloud server. The template library includes a recognition unit and a matching unit. The recognition unit has a fuzzy classifier for fuzzy classification of analysis requests and is connected to the matching unit. The matching unit contains several matching templates and is connected to the modeling platform, to which it sends the processing information. The modeling platform includes modeling templates and a data processing unit; the modeling templates store several standard modeling models, and the data processing unit is connected to both the modeling templates and the educational data fusion storage system. The warehouse is connected to the data processing unit and stores the generated analysis results.

The method by which the educational data analysis module generates analysis results includes:

S1: Input an analysis request;

S2: The template library generates processing information through the recognition unit and the matching unit, and sends the processing information to the modeling platform;

S3: The modeling platform retrieves a modeling model and generates an analysis report for the request;

S4: Send the analysis results to the warehouse.

The educational big data analysis system of this embodiment uses data mining technology to provide an efficient way of processing educational big data so that it can be used rationally and effectively. It draws on artificial intelligence and statistics as well as database management and use, and when applied to the analysis of student grades it makes the data more timely, concise and clear. Data from the network port, the campus port and the manual port are integrated and analyzed together to obtain educational big data. The acquired data are sorted and classified into classified data that meet the preset requirements, and the classified data are stored in the corresponding database for unified storage and management so that they can be retrieved later. When the data need to be analyzed and counted, the data stored in the database are retrieved and analyzed in depth, and targeted services are provided according to the analysis results, which improves the quality and efficiency of student learning and teacher instruction and achieves twice the result with half the effort. As a system that can provide services tailored to the needs of students and/or teachers, the educational big data analysis system of this embodiment can be deployed in the management centers of colleges and universities. There it can carry out in-depth processing, including classification, storage, analysis and service provision, of the data collected by sensors in an Internet-based architecture as well as manually imported data, and it has the advantages of complete functions and reliable performance.

Embodiment 2

As a second embodiment of the present invention, the educational data mining module 3 uses association rules and the Apriori algorithm to preprocess the classified data.

Specifically, the association rules reflect the interdependence and correlation between one data item and other data items:

Let $I=\{i_1,i_2,i_3,\ldots,i_n\}$ be the set of data items, where $i_n$ is a data item; let $D$ be the set of database records $T$, where $T$ is the unique data number of each record; and let $X$ and $Y$ be sets of items in $I$ with $X\cap Y=\varnothing$. An association rule is a logical implication of the form

$$X \Rightarrow Y.$$

The support of the rule $X \Rightarrow Y$ in the data set $D$ is the ratio of the number of records that contain both $X$ and $Y$ to the total number of records. It reflects the reliability of the rule and is written $\mathrm{support}(X \Rightarrow Y)$, with

$$\mathrm{support}(X \Rightarrow Y) = P(X \cup Y).$$

If the support of a data set exceeds the minimum support threshold given by the user, that data set is a frequent data set (frequent itemset). The degree of certainty of the rule is written $\mathrm{confidence}(X \Rightarrow Y)$, with

$$\mathrm{confidence}(X \Rightarrow Y) = P(Y \mid X).$$

A rule that satisfies both the minimum support threshold and the minimum confidence threshold is called a strong rule. Given a data set D, the association rule mining problem is to find the association rules whose support and confidence are greater than the respective minimum thresholds given by the user.

The Apriori algorithm uses a breadth-first iterative search. It first finds the frequent 1-itemsets L1, then uses L1 to find the frequent 2-itemsets L2, and so on until all frequent itemsets have been found. When the number of frequent itemsets found at some level is zero, the computation stops and all frequent itemsets are output. The Apriori algorithm is thereby introduced into database mining: the data, items, base sets and association rules in database association mining are defined, and similar or related knowledge points are extracted taking into account the support and confidence of the association rules.
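Once the frequent itemsets and their supports are known, rules that meet the minimum confidence can be derived from them. The Python sketch below assumes the input is a dict mapping each frequent itemset (a `frozenset`) to its support and that every subset of a frequent itemset is also present in the dict; the function name and the `>=` comparison are illustrative assumptions.

```python
from itertools import combinations


def strong_rules(frequent, min_confidence):
    """Return (X, Y, support, confidence) for rules X => Y that pass."""
    rules = []
    for itemset, sup in frequent.items():
        if len(itemset) < 2:
            continue  # a rule needs a non-empty X and a non-empty Y
        for r in range(1, len(itemset)):
            for combo in combinations(sorted(itemset), r):
                x = frozenset(combo)
                y = itemset - x
                sup_x = frequent.get(x)
                if not sup_x:
                    continue  # antecedent support unknown or zero
                conf = sup / sup_x  # P(Y | X)
                if conf >= min_confidence:
                    rules.append((x, y, sup, conf))
    return rules
```

For example, with supports 0.8 for {A}, 0.5 for {B} and 0.4 for {A, B}, and a minimum confidence of 0.7, only B ⇒ A (confidence 0.8) is kept; A ⇒ B has confidence 0.5 and is dropped.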

The educational big data analysis system of this embodiment takes students' academic results as its data source, selects multiple courses from the grade database as research objects, and finds out whether a given course influences the offering of other courses. This provides a reference for teaching staff when scheduling courses and a basis for students when selecting courses.
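Before grades can be mined for associations between courses, each student's results have to be turned into a set of items. The Python sketch below shows one possible encoding, in which every score becomes an item such as `calculus=excellent`. The four bands follow the categories named in the background (excellent, good, pass, fail), but the cut-off scores of 85, 70 and 60 are assumptions, as are the function names.

```python
def grade_band(score):
    """Map a numeric score to a band; the cut-offs are assumed values."""
    if score >= 85:
        return "excellent"
    if score >= 70:
        return "good"
    if score >= 60:
        return "pass"
    return "fail"


def to_transactions(grades):
    """Group (student_id, course, score) rows into one item set per student."""
    by_student = {}
    for student, course, score in grades:
        item = f"{course}={grade_band(score)}"
        by_student.setdefault(student, set()).add(item)
    return by_student
```

The values of the returned dict can then serve as the records of the data set D: a rule such as `calculus=excellent` ⇒ `physics=excellent` with high support and confidence would suggest that results in one course are associated with results in the other.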

The basic principles, main features and advantages of the present invention have been shown and described above. Those skilled in the art should understand that the present invention is not limited by the above embodiments. The embodiments and the description only set out preferred examples of the invention and are not intended to limit it. Various changes and improvements may be made without departing from the spirit and scope of the invention, and all such changes and improvements fall within the scope of the claimed invention. The scope of protection claimed is defined by the appended claims and their equivalents.

Claims (10)

1. An educational big data analysis system, characterized by comprising an educational data collection module (1), an educational data sorting module (2), an educational data mining module (3) and an educational data analysis module (4), the modules being communicatively connected to one another, wherein: the educational data collection module (1) collects educational big data and sends it to the educational data sorting module (2); the educational data sorting module (2) preprocesses the educational big data, cleans the acquired data according to a preset standard format, filters out redundant information, classifies educational big data of different attributes and formats by attribute and stores it as stored data in the corresponding template format, and labels each identified type with a classification label to obtain classified data; the educational data mining module (3) mines the labeled data of each type in the database, retrieves all frequent itemsets in the educational database, and uses the frequent itemsets to construct rules that satisfy the minimum confidence; and the educational data analysis module (4) accepts analysis requests of different dimensions submitted by users, extracts the stored data from the educational data fusion storage system for data analysis, and displays the analysis results visually.

2. The educational big data analysis system according to claim 1, wherein the educational data collection module (1) comprises a network port, a campus port and a manual port; the network port is connected to a cloud server and is used to collect educational big data from network platforms; the campus port is connected to campus management equipment and collects the educational big data recorded and generated by that equipment; and the manual port is used to enter missing data manually.

3. The educational big data analysis system according to claim 2, wherein the collection methods of the educational data collection module (1) include network collection, online campus data extraction and manual entry.

4. The educational big data analysis system according to claim 1, wherein, when the classified data is audio data, image data and/or text data, an extraction submodule is further configured to extract, from the audio data, image data and/or text data, first data characterizing student behavior during class, the first data including: learning status, interest questionnaires, access records, reply/question content, number of questions answered and/or homework completion; and to extract second data characterizing teacher behavior, the second data including: lecturing style, interest questionnaires, access records, reply/question content, number of questions asked, teaching progress and/or homework assignments.

5. The educational big data analysis system according to claim 1, wherein the educational data mining module (3) uses association rules and the Apriori algorithm to preprocess the classified data.

6. The educational big data analysis system according to claim 5, wherein the association rules reflect the interdependence and correlation between one item of data and other data, specifically: let I = {i1, i2, i3, ..., in} be the set of data items, where each in is a data item; let D be the database, a set of records T, each identified by a unique data number; and let X and Y each be a set of data items in I, with X ∩ Y = ∅. An association rule is a logical implication of the form X ⇒ Y (formula image FDA0002254671140000021). The support of the rule X ⇒ Y (formula image FDA0002254671140000022) in the data set D is the ratio of the number of records that contain both X and Y to the total number of records; it reflects the reliability of the rule. A set of data items whose support exceeds the user-specified minimum support threshold is a frequent itemset. The confidence, which reflects the degree of certainty of the rule, is recorded in formula images FDA0002254671140000025 and FDA0002254671140000026. A rule that satisfies both the minimum support threshold and the minimum confidence threshold is called a strong rule. Given a data set D, the association rule mining problem is to find the association rules whose support and confidence are each greater than the user-specified minimum thresholds.

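The verbal definitions in claim 6 correspond to the standard textbook notions of support and confidence. The block below is a sketch in the usual notation, not a reproduction of the patent's formula images: the names support, confidence, min_sup and min_conf are assumptions introduced here for illustration, and |·| denotes the number of records.

```latex
\[
\operatorname{support}(Z) = \frac{\lvert \{\, T \in D : Z \subseteq T \,\} \rvert}{\lvert D \rvert},
\qquad
\operatorname{support}(X \Rightarrow Y) = \operatorname{support}(X \cup Y)
\]
\[
\operatorname{confidence}(X \Rightarrow Y)
  = \frac{\operatorname{support}(X \cup Y)}{\operatorname{support}(X)}
\]
\[
X \Rightarrow Y \text{ is a strong rule if }
\operatorname{support}(X \Rightarrow Y) \ge \mathit{min\_sup}
\text{ and }
\operatorname{confidence}(X \Rightarrow Y) \ge \mathit{min\_conf}
\]
```

For example, if D holds 10 records, 4 of which contain X and 3 of which contain both X and Y, then the support of X ⇒ Y is 3/10 = 0.3 and its confidence is 0.3/0.4 = 0.75. Claim 6 words the thresholds as "greater than"; whether equality also qualifies is left open here.
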
7. The educational big data analysis system according to claim 5, wherein the Apriori algorithm uses a breadth-first iterative search: it first finds the frequent 1-itemsets L1, uses L1 to find the frequent 2-itemsets L2, and so on, until all frequent itemsets have been found; when the number of frequent itemsets found at some level is zero, the computation stops, and the frequent itemsets of all items are output.

8. The educational big data analysis system according to claim 1, wherein the educational data analysis module (4) comprises a template library, a warehouse and a modeling platform, each of which is connected to the cloud server.

9. The educational big data analysis system according to claim 8, wherein the template library comprises an identification unit and a matching unit; the identification unit is provided with a fuzzy classifier for performing fuzzy classification on analysis requests and is connected to the matching unit; the matching unit comprises several matching templates, is connected to the modeling platform, and sends processing information to the modeling platform; the modeling platform comprises modeling templates and a data processing unit, the modeling templates storing several standard modeling models; the data processing unit is connected to both the modeling templates and the educational data fusion storage system; and the warehouse is connected to the data processing unit and stores the generated analysis results.

10. The educational big data analysis system according to claim 9, wherein the method by which the educational data analysis module generates analysis results comprises:
S1: inputting an analysis request;
S2: the template library generating processing information through the identification unit and the matching unit, and sending the processing information to the modeling platform;
S3: the modeling platform extracting a modeling model and generating an analysis request report;
S4: sending the analysis results to the warehouse.
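
The level-wise search described in claim 7 and the strong-rule test of claim 6 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names `support`, `apriori` and `strong_rules`, the representation of records as sets of label strings, and the use of `>=` for the thresholds are all choices made here for illustration.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of records that contain every item in `itemset`."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def apriori(transactions, min_support):
    """Breadth-first search: find L1, build L2 from L1, and so on."""
    transactions = [frozenset(t) for t in transactions]
    items = {item for t in transactions for item in t}
    # L1: frequent 1-itemsets (">=" is an assumption; the claim says "exceeds")
    level = {frozenset([i]) for i in items
             if support(frozenset([i]), transactions) >= min_support}
    frequent = {}
    k = 1
    while level:  # stop as soon as a level contains no frequent itemsets
        for s in level:
            frequent[s] = support(s, transactions)
        k += 1
        # Join step: unions of two frequent (k-1)-itemsets that have size k
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent
        candidates = {c for c in candidates
                      if all(frozenset(s) in level
                             for s in combinations(c, k - 1))}
        level = {c for c in candidates
                 if support(c, transactions) >= min_support}
    return frequent


def strong_rules(frequent, min_confidence):
    """Build rules X => Y (X and Y disjoint) from the frequent itemsets."""
    rules = []
    for itemset, sup in frequent.items():
        for r in range(1, len(itemset)):
            for lhs in combinations(itemset, r):
                x = frozenset(lhs)
                # Every subset of a frequent itemset is frequent, so x is stored
                conf = sup / frequent[x]
                if conf >= min_confidence:
                    rules.append((x, itemset - x, sup, conf))
    return rules


# Hypothetical labeled records, invented purely for illustration.
records = [
    {"attended", "homework_done", "passed"},
    {"attended", "homework_done", "passed"},
    {"attended", "passed"},
    {"homework_done"},
    {"attended", "homework_done", "passed"},
]
for x, y, sup, conf in strong_rules(apriori(records, 0.6), 0.8):
    print(sorted(x), "=>", sorted(y), "support", sup, "confidence", conf)
```

Recomputing support by scanning all records for each candidate keeps the sketch short; a production system on large educational data sets would more likely count candidates in a single pass per level.
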
CN201911048339.8A 2019-10-30 2019-10-30 Education big data analysis system Pending CN110807060A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911048339.8A CN110807060A (en) 2019-10-30 2019-10-30 Education big data analysis system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911048339.8A CN110807060A (en) 2019-10-30 2019-10-30 Education big data analysis system

Publications (1)

Publication Number Publication Date
CN110807060A true CN110807060A (en) 2020-02-18

Family

ID=69489648

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911048339.8A Pending CN110807060A (en) 2019-10-30 2019-10-30 Education big data analysis system

Country Status (1)

Country Link
CN (1) CN110807060A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112541126A (en) * 2020-12-26 2021-03-23 贵州树精英教育科技有限责任公司 Accurate teaching data mining
CN113034319A (en) * 2020-12-24 2021-06-25 广东国粒教育技术有限公司 User behavior data processing method and device in teaching management, electronic equipment and storage medium

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104239430A (en) * 2014-08-27 2014-12-24 广西教育学院 Item weight change based method and system for mining education data association rules
CN104573124A (en) * 2015-02-09 2015-04-29 山东大学 Education cloud application statistics method based on parallelized association rule algorithm
CN105046362A (en) * 2015-07-24 2015-11-11 河南科技大学 Real-time prediction method of food safety on the basis of association rule mining
CN107967572A (en) * 2017-12-15 2018-04-27 华中师范大学 A kind of intelligent server based on education big data
CN108121785A (en) * 2017-12-15 2018-06-05 华中师范大学 A kind of analysis method based on education big data
CN108132989A (en) * 2017-12-15 2018-06-08 华中师范大学 A kind of distributed system based on education big data
CN108595617A (en) * 2018-04-23 2018-09-28 温州市鹿城区中津先进科技研究院 A kind of education big data overall analysis system

Similar Documents

Publication Publication Date Title
CN107967572A (en) A kind of intelligent server based on education big data
CN108121785A (en) A kind of analysis method based on education big data
CN109272789A (en) Learning effect assessment system and appraisal procedure based on data analysis
CN109214664B (en) Emotional behavior comprehensive analysis system based on artificial intelligence
CN108920544A (en) A kind of personalized position recommended method of knowledge based map
Sundar A comparative study for predicting students academic performance using Bayesian network classifiers
LU100314B1 (en) Method and system for predicting academic achievements of students based on naive bayesian model
CN113918588B (en) Knowledge point-based wrong question dynamic intelligent management system
CN111882247A (en) Online learning system evaluation method based on comprehensive fuzzy evaluation model
CN108520662B (en) Teaching feedback system based on knowledge point analysis
Shaziya et al. Prediction of students performance in semester exams using a naïve bayes classifier
CN108132989A (en) A kind of distributed system based on education big data
CN118585502A (en) A network education resource sharing system based on big data
CN110807060A (en) Education big data analysis system
CN111444244B (en) A big data information management system
Rahutomo et al. Building Datawarehouse for Educational Institutions in 9 Steps
CN112348721A (en) Primary and secondary school student comprehensive quality evaluation platform and method based on big data acquisition
CN114983416A (en) System for pre-warning and caring psychological crisis risk of students and operation method thereof
CN117952796A (en) Reading teaching quality assessment method and system based on data analysis
Wu [Retracted] Higher Education Environment Monitoring and Quality Assessment Model Using Big Data Analysis and Deep Learning
Liu The Application of K-Means Clustering Algorithm in the Quality Analysis of College English Teaching
CN115409329A (en) Teacher micro-capability evaluation method based on deep learning model
CN111626902B (en) A blockchain-based online education management system and method
CN114819620A (en) A Decision Tree-Based Learning Situation Analysis Method
CN111797124A (en) Examination situation analysis method, examination situation analysis device, storage medium and examination situation analysis system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20200218
