CN108427600A - Data task processing method, application server and computer readable storage medium - Google Patents
- Publication number: CN108427600A
- Application number: CN201810066359.7A
- Authority: CN (China)
- Prior art keywords: task, data, dependence, application server, completed
- Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
- G06F9/5038—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/48—Program initiating; Program switching, e.g. by interrupt
- G06F9/4806—Task transfer initiation or dispatching
- G06F9/4843—Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
- G06F9/4881—Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/3051—Monitoring arrangements for monitoring the configuration of the computing system or of the computing system component, e.g. monitoring the presence of processing resources, peripherals, I/O links, software programs
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/445—Program loading or initiating
- G06F9/44505—Configuring for program initiating, e.g. using registry, configuration files
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/448—Execution paradigms, e.g. implementations of programming paradigms
- G06F9/4494—Execution paradigms, e.g. implementations of programming paradigms data driven
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/455—Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
- G06F9/45504—Abstract machines for programme code execution, e.g. Java virtual machine [JVM], interpreters, emulators
- G06F9/45508—Runtime interpretation or emulation, e g. emulator loops, bytecode interpretation
- G06F9/45512—Command shells
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2209/00—Indexing scheme relating to G06F9/00
- G06F2209/48—Indexing scheme relating to G06F9/48
- G06F2209/484—Precedence
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computing Systems (AREA)
- Quality & Reliability (AREA)
- Computer And Data Communications (AREA)
- Debugging And Monitoring (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a data task processing method. The method includes: obtaining a task list from a terminal device; configuring the dependencies (reliers) of the tasks in the task list in order to analyze the dependency between data and tasks; recording the execution process of data synchronization; judging, according to the recorded execution process of data synchronization and the dependency between data and tasks, whether the data have been synchronized; if the data have been synchronized, executing the tasks whose data synchronization is complete; and if the data have not been synchronized, issuing warning information. The invention also provides an application server and a computer-readable storage medium. With the data task processing method, application server and computer-readable storage medium provided by the invention, whether data have been synchronized can be judged from the execution process of data synchronization and the dependency between data and tasks, so that a task is executed only after its data have been synchronized.
Description
Technical field
The present invention relates to the field of data analysis, and in particular to a data task processing method, an application server and a computer-readable storage medium.
Background technology
Hadoop is an open-source distributed infrastructure framework with which users can develop distributed programs without understanding the low-level details of distribution. Hadoop implements a distributed file system that provides high-throughput access to application data and suits applications with very large data sets. Task scheduling is usually implemented with Oozie, but task scheduling based on Oozie alone cannot analyze the dependency between data and tasks. For example, a model task may have to wait until the data of a certain period have been fully synchronized before it can execute; when the dependency between data and tasks is poorly controlled, problems are hard to detect after they occur.
Summary of the invention
In view of this, the present invention proposes a data task processing method, an application server and a computer-readable storage medium that can judge, from the execution process of data synchronization and the dependency between data and tasks, whether the data have been synchronized, so that a task is executed only after its data have been synchronized.
First, to achieve the above object, the present invention proposes a data task processing method applied to an application server. The method includes:
Obtaining a task list from a terminal device;
Configuring the dependencies (reliers) of the tasks in the task list, so as to analyze the dependency between data and tasks;
Recording the execution process of data synchronization;
Judging, according to the execution process of data synchronization and the dependency between data and tasks, whether the data have been synchronized;
If the data have been synchronized, executing the tasks whose data synchronization is complete;
If the data have not been synchronized, issuing warning information.
Optionally, the step of configuring the dependencies of the tasks in the task list so as to analyze the dependency between data and tasks specifically includes the following steps:
Obtaining the effective dependency (relier) configuration of the flow nodes of the tasks;
Executing the dependency status query statements and outputting the raw dependency results;
Merging multiple task nodes, completing the dependency status, and deduplicating the dependency results;
Marking the deduplicated dependency results with dependency configuration slice labels, which completes the dependency analysis of all tasks.
Optionally, the step of executing, if the data have been synchronized, the tasks whose data synchronization is complete specifically includes the following steps:
Obtaining the pending round-run (cyclic) tasks and rerun tasks;
Executing the round-run tasks and the rerun tasks.
Optionally, before the step of executing the round-run tasks and the rerun tasks, the method further includes the following steps:
Sorting the round-run tasks and the rerun tasks by priority;
Executing the higher-priority tasks first.
Optionally, the method further includes the following steps:
Monitoring the task currently being executed;
Issuing an early warning when an exception occurs during task execution.
In addition, to achieve the above object, the present invention also provides an application server. The application server includes a memory and a processor; the memory stores a data task processing system that can run on the processor, and the data task processing system implements the following steps when executed by the processor:
Obtaining a task list from a terminal device;
Configuring the dependencies (reliers) of the tasks in the task list, so as to analyze the dependency between data and tasks;
Recording the execution process of data synchronization;
Judging, according to the execution process of data synchronization and the dependency between data and tasks, whether the data have been synchronized;
If the data have been synchronized, executing the tasks whose data synchronization is complete;
If the data have not been synchronized, issuing warning information.
Optionally, the step of configuring the dependencies of the tasks in the task list so as to analyze the dependency between data and tasks specifically includes the following steps:
Obtaining the effective dependency (relier) configuration of the flow nodes of the tasks;
Executing the dependency status query statements and outputting the raw dependency results;
Merging multiple task nodes, completing the dependency status, and deduplicating the dependency results;
Marking the deduplicated dependency results with dependency configuration slice labels, which completes the scheduling dependencies of all tasks.
Optionally, the step of executing, if the data have been synchronized, the tasks whose data synchronization is complete specifically includes the following steps:
Obtaining the pending round-run (cyclic) tasks and rerun tasks;
Sorting the round-run tasks and the rerun tasks by priority;
Executing the higher-priority tasks first.
Optionally, when executed by the processor, the data task processing system further implements the following steps:
Monitoring the task currently being executed;
Issuing an early warning when an exception occurs during task execution.
Further, to achieve the above object, the present invention also provides a computer-readable storage medium. The computer-readable storage medium stores a data task processing system that can be executed by at least one processor, so that the at least one processor performs the steps of the data task processing method described above.
Compared with the prior art, the application server, data task processing method and computer-readable storage medium proposed by the present invention first obtain a task list from a terminal device; then configure the dependencies (reliers) of the tasks in order to analyze the dependency between data and tasks; then record the execution process of data synchronization; further, judge whether the data have been synchronized according to the execution process of data synchronization and the dependency between data and tasks; and finally execute the tasks whose data synchronization is complete if the data have been synchronized, or issue warning information if they have not. This avoids the prior-art defect of poorly controlled dependencies between data and tasks, and ensures that a task is executed only after its data have been synchronized.
Description of the drawings
Fig. 1 is a schematic diagram of an optional application environment of the embodiments of the present invention;
Fig. 2 is a schematic diagram of an optional hardware architecture of the application server in Fig. 1;
Fig. 3 is a program module diagram of the first embodiment of the data task processing system of the present invention;
Fig. 4 is a program module diagram of the second embodiment of the data task processing system of the present invention;
Fig. 5 is a flowchart of the first embodiment of the data task processing method of the present invention;
Fig. 6 is a flowchart of the second embodiment of the data task processing method of the present invention.
Reference numerals:
The realization of the object, the functional characteristics and the advantages of the present invention will be further described in connection with the embodiments and with reference to the accompanying drawings.
Detailed description of the embodiments
In order to make the objects, technical solutions and advantages of the present invention clearer, the present invention is further described below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the present invention and are not intended to limit it. All other embodiments obtained by those of ordinary skill in the art on the basis of the embodiments of the present invention without creative work fall within the scope of protection of the present invention.
It should be noted that descriptions involving "first", "second" and the like in the present invention are for descriptive purposes only and are not to be understood as indicating or implying relative importance or implicitly indicating the number of technical features referred to. A feature qualified by "first" or "second" may therefore explicitly or implicitly include at least one such feature. In addition, the technical solutions of the embodiments may be combined with one another, but only on the basis that those of ordinary skill in the art can implement the combination; when a combination of technical solutions is contradictory or cannot be implemented, the combination should be regarded as nonexistent and as falling outside the scope of protection claimed by the present invention.
Fig. 1 is a schematic diagram of an optional application environment of the embodiments of the present invention.
In this embodiment, the present invention can be applied in an application environment that includes, but is not limited to, a terminal device 1, an application server 2 and a network 3. The terminal device 1 may be a mobile device such as a mobile phone, smartphone, laptop, digital broadcast receiver, PDA (personal digital assistant), PAD (tablet computer), PMP (portable media player), navigation device or vehicle-mounted device, or a fixed terminal such as a digital TV, desktop computer, notebook or server. The application server 2 may be a computing device such as a rack server, blade server, tower server or cabinet server, and may be an independent server or a server cluster composed of multiple servers. The network 3 may be a wireless or wired network such as an intranet, the Internet, the Global System for Mobile Communications (GSM), Wideband Code Division Multiple Access (WCDMA), a 4G network, a 5G network, Bluetooth, Wi-Fi or a voice call network.
The application server 2 is communicatively connected to one or more terminal devices 1 through the network 3 for data transmission and interaction.
Fig. 2 is a schematic diagram of an optional hardware architecture of the application server 2 in Fig. 1.
In this embodiment, the application server 2 may include, but is not limited to, a memory 11, a processor 12 and a network interface 13 that are communicatively connected to one another through a system bus. It should be noted that Fig. 2 shows only the application server 2 with components 11-13; not all of the illustrated components are required, and more or fewer components may be implemented instead.
The memory 11 includes at least one type of readable storage medium, including flash memory, hard disk, multimedia card, card-type memory (for example, SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disk, optical disc and the like. In some embodiments, the memory 11 may be an internal storage unit of the application server 2, such as its hard disk or internal memory. In other embodiments, the memory 11 may be an external storage device of the application server 2, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card or a flash card fitted to the application server 2. The memory 11 may of course include both the internal storage unit and the external storage device of the application server 2. In this embodiment, the memory 11 is generally used to store the operating system and the application software installed on the application server 2, such as the program code of the data task processing system 200. The memory 11 may also be used to temporarily store data that has been output or is to be output.
In some embodiments, the processor 12 may be a central processing unit (CPU), a controller, a microcontroller, a microprocessor or another data processing chip. The processor 12 is generally used to control the overall operation of the application server 2. In this embodiment, the processor 12 is used to run the program code stored in the memory 11 or to process data, for example to run the data task processing system 200.
The network interface 13 may include a wireless network interface or a wired network interface and is generally used to establish communication connections between the application server 2 and other electronic devices. In this embodiment, the network interface 13 is mainly used to connect the application server 2 to one or more terminal devices 1 through the network 3, establishing data transmission channels and communication connections between the application server 2 and the one or more terminal devices 1.
So far, the hardware structure and functions of the application environment and the related devices of the embodiments of the present invention have been described in detail. The embodiments of the present invention are proposed below on the basis of this application environment and these devices.
First, the present invention proposes a data task processing system 200.
Fig. 3 is a program module diagram of the first embodiment of the data task processing system 200 of the present invention.
In this embodiment, the data task processing system 200 includes a series of computer program instructions stored in the memory 11. When these instructions are executed by the processor 12, the data task processing operations of the embodiments of the present invention can be implemented. In some embodiments, the data task processing system 200 may be divided into one or more modules according to the specific operations implemented by the respective parts of the computer program instructions. For example, in Fig. 3 the data task processing system 200 is divided into an acquisition module 201, a configuration module 202, a recording module 203, a judgment module 204, an execution module 205 and a warning module 206, wherein:
The acquisition module 201 is configured to obtain a task list from the terminal device 1.
Specifically, the application server 2 hosts a Hadoop data platform center, which obtains data from the external terminal device 1. When the application server 2 processes the data obtained by the Hadoop data platform center, it needs to perform operations such as data acquisition, data cleansing and data analysis. Each of these processes may involve multiple tasks, some of which must be executed in sequence while others can be executed in parallel.
In this embodiment, the application server 2 obtains the task list from the terminal device 1 through the acquisition module 201 and manages the execution and ordering of these tasks through Oozie. Oozie is a Hadoop-based scheduler whose scheduling flows are written in XML; it can schedule MapReduce (mr), Pig, Hive, shell, jar and other jobs. The application server 2 executes the task flow nodes in sequence through Oozie, which supports fork (branching into multiple nodes) and join (merging multiple nodes into one).
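The fork and join semantics described above can be illustrated with a minimal Python sketch. It does not use Oozie; the functions `clean_data`, `analyze_data` and `run_fork_join` are hypothetical names chosen for illustration, and a thread pool stands in for the scheduler.

```python
from concurrent.futures import ThreadPoolExecutor


def clean_data(rows):
    # Hypothetical branch 1: drop empty records.
    return [r for r in rows if r]


def analyze_data(rows):
    # Hypothetical branch 2: count the records received.
    return len(rows)


def run_fork_join(rows):
    # fork: start both branches in parallel
    with ThreadPoolExecutor(max_workers=2) as pool:
        cleaned = pool.submit(clean_data, rows)
        counted = pool.submit(analyze_data, rows)
        # join: wait for both branches before the flow continues
        return {"cleaned": cleaned.result(), "count": counted.result()}
```

Calling `result()` on both futures plays the role of the join node: the flow continues only once every branch started at the fork has finished.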
The configuration module 202 is configured to configure the dependencies (reliers) of the tasks in the task list, so as to configure the dependency between data and tasks.
Specifically, configuring a task's reliers configures the dependency between data and the task: the task can execute only when its data are complete. In this embodiment, the application server 2 obtains the effective relier (dependency) configuration of the fork nodes (nodes that branch into multiple nodes) of the task flow, executes the relier status query statements and outputs the raw results. It then merges the multiple task nodes, completes the dependency status and deduplicates the dependency results. Finally, it marks the deduplicated dependency results with dependency configuration slice labels, which completes the scheduling dependencies of all tasks.
Please refer to Table 1 for the dependency configuration format required in this embodiment.
Table 1
In this embodiment, the application server 2 loads the configuration file with hive load and overwrites the configuration table. Obtain the latest configuration file from the production environment, modify it, and then run the deployment. The deployment commands that implement the relier configuration are:
Step 1: Upload the script to the /tmp directory, format it and grant permissions (so that private users can operate on it)
chmod 777 /tmp/relier_config_all.txt
Step 2: Switch user (not needed if your private user is allowed to execute hive commands)
sudo su - hduser0006
Step 3: Execute the commands
hive -e "use aml_awbs; set mapred.job.queue.name=queue_0006_02;
truncate table fm_relier_check_script;
load data local inpath '/tmp/relier_config_all.txt' into table aml_awbs.fm_relier_check_script;"
In another embodiment of the present invention, the relier configuration can be deployed by modifying the configuration with the following commands:
Step 1: Switch user (not needed if your private user is allowed to execute hive commands)
sudo su - hduser0006
Step 2: Execute the commands
hive -e "set mapred.job.queue.name=queue_0006_02;
insert overwrite table aml_awbs.fm_relier_check_script
select relier_name,
src_job_name,
if(relier_name='i_jt-aml-999-cd',
'select concat(y,m,d) datestr, \'Y\' state from aml_awbs.JOB_STATE where JOB_NAME=\'jt-aml-999-cd\'',
relier_name) script_string,
fork
from aml_awbs.fm_relier_check_script"
The recording module 203 is configured to record the execution process of the data synchronization of the tasks.
Specifically, as described above, a task can be executed only when its data are complete. To ensure that the Hadoop data platform center obtains complete data from the external terminal device 1, the application server 2 records the execution process of data synchronization through the recording module 203 whenever data are updated or modified. In this embodiment, the recording module 203 uses shell to create a log and a state table, which record the execution process and the execution time of data synchronization.
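As an illustration of such a state table, the following Python sketch records one synchronization result per job and date. The embodiment uses shell and Hive; here an in-memory SQLite table stands in for the state table, and the table name `job_state`, its columns and the helper names are assumptions made for the example.

```python
import sqlite3
from datetime import datetime


def open_state_table():
    # In-memory stand-in for the state table (assumed schema).
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE job_state "
        "(job_name TEXT, datestr TEXT, state TEXT, recorded_at TEXT)"
    )
    return conn


def record_sync(conn, job_name, datestr, state, now=None):
    # Store the synchronization state together with the time it was recorded.
    recorded_at = (now or datetime.now()).isoformat(timespec="seconds")
    conn.execute(
        "INSERT INTO job_state VALUES (?, ?, ?, ?)",
        (job_name, datestr, state, recorded_at),
    )
    conn.commit()
    return recorded_at
```

A later check can then read the rows for a given date to see which synchronization jobs finished and when.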
The judgment module 204 is configured to judge whether the data have been synchronized according to the execution process of data synchronization and the dependency between data and tasks.
Specifically, before a task is executed, the application server 2 first uses the judgment module 204 to judge whether the data have been synchronized. It makes this judgment from the execution process and execution time of data synchronization recorded in the log and state table created with shell, together with the dependency between data and tasks.
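A minimal Python sketch of this judgment is given below. It assumes that each state table row is a tuple of job name, date string and state, and that the state 'Y' means the synchronization finished; the function name `is_sync_complete` is hypothetical.

```python
def is_sync_complete(task_reliers, state_rows, datestr):
    """task_reliers: job names the task depends on.
    state_rows: iterable of (job_name, datestr, state) tuples."""
    # Jobs whose synchronization finished on the requested date.
    done = {job for job, day, state in state_rows
            if day == datestr and state == "Y"}
    # Reliers with no finished synchronization block the task.
    missing = [job for job in task_reliers if job not in done]
    return (not missing, missing)
```

Returning the list of missing jobs alongside the verdict lets the caller report exactly which data are still unsynchronized.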
The execution module 205 is configured to execute, when the data have been synchronized, the tasks whose data synchronization is complete.
The warning module 206 is configured to issue warning information if the data have not been synchronized.
Specifically, the execution module 205 executes a task only when the data have been synchronized, that is, when the data are complete. When the data have not been synchronized, the warning module 206 issues warning information. In this embodiment, the warning information includes, but is not limited to, information on the data that have not been synchronized and the time of the last synchronization, so that staff are notified to intervene manually.
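The content of such a warning can be sketched in Python as follows. The message layout and the function name `build_warning` are assumptions; the text only states that the warning names the unsynchronized data and the time of the last synchronization.

```python
def build_warning(missing_jobs, last_sync_times):
    """missing_jobs: jobs whose data are not synchronized.
    last_sync_times: maps a job name to its last synchronization time."""
    lines = []
    for job in missing_jobs:
        # Fall back to "never" when no earlier synchronization is recorded.
        last = last_sync_times.get(job, "never")
        lines.append(f"{job}: not synchronized, last sync {last}")
    return "\n".join(lines)
```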
Through the program modules 201-206 above, the data task processing system 200 proposed by the present invention first obtains a task list from the terminal device 1; then configures the dependencies (reliers) of the tasks in order to analyze the dependency between data and tasks; then records the execution process of data synchronization; further, judges whether the data have been synchronized according to the execution process of data synchronization and the dependency between data and tasks; and finally executes the tasks whose data synchronization is complete if the data have been synchronized, or issues warning information if they have not. This avoids the prior-art defect of poorly controlled dependencies between data and tasks, and ensures that a task is executed only after its data have been synchronized.
Further, a second embodiment of the present invention (shown in Fig. 4) is proposed on the basis of the first embodiment of the data task processing system 200 described above. In this embodiment, the data task processing system 200 further includes a sorting module 207, wherein:
The acquisition module 201 is further configured to obtain the pending round-run tasks;
As described above for the first embodiment, a task is executed only after its data have been synchronized. In this embodiment, the tasks include, but are not limited to, round-run tasks and rerun tasks. A round-run task is a task that is executed cyclically within its effective dates; a rerun task is a task that must be re-executed after an execution has failed.
Specifically, the application server 2 obtains the pending round-run tasks through the acquisition module 201 and judges whether they satisfy the relier configuration. If the relier configuration is satisfied, it analyzes the series of effective round-run dates of each task. In this embodiment, the round-run task sequence is based on a series of working days or calendar days, and by default the sequence reaches back at most 730 days.
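The date series described above can be sketched in Python. The sketch makes several assumptions that the text does not confirm: a working day is taken to be Monday to Friday with public holidays ignored, the 730-day limit is read as a window of 730 calendar days ending today, and the function name `run_dates` and its parameters are hypothetical.

```python
from datetime import date, timedelta


def run_dates(date_start, date_end, today, workdays_only=True, max_days=730):
    # Clip the effective period to a window of max_days ending today.
    first = max(date_start, today - timedelta(days=max_days - 1))
    last = min(date_end, today)
    days = []
    current = first
    while current <= last:
        # weekday() is 0-4 for Monday to Friday (assumed working days).
        if not workdays_only or current.weekday() < 5:
            days.append(current)
        current += timedelta(days=1)
    return days
```

For example, with an effective period starting on Friday 5 January 2018 and today being Monday 8 January 2018, the working-day series contains only those two dates.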
Please refer to Table 2 for the configuration requirements of round-run tasks:
Table 2
In this embodiment, the code that implements a round-run task is:
Insert into the configuration table:
Step 1: Switch user (not needed if your private user is allowed to execute hive commands)
sudo su - hduser0006
Step 2: Execute the commands
hive -e "use aml_awbs; set mapred.job.queue.name=queue_0006_02;
insert overwrite table aml_awbs.fm_model_config
select
board, series, model, date_start, date_end, ask_execute, ask_export, desc_interval, desc_relier
from aml_awbs.fm_model_config
where regexp_replace(upper(concat(board,series,model)),″,″) <>
regexp_replace(upper(concat('ky','zq','1214-13')),″,″)
union all
select 'ky' board, 'zq' series, '1214-13' model,
'20150101' date_start, '20990101' date_end,
'Y' ask_execute, 'N' ask_export,
'w:d:1' desc_interval,
'i_jt-aml-investzq-import-cd:15' desc_relier
from default.dual"
The acquisition module 201 is further configured to obtain the pending rerun tasks;
Specifically, please refer to Table 3 for the configuration requirements of rerun tasks in an embodiment of the present invention:
Table 3
In this embodiment, the code that implements a rerun task is:
Insert into the configuration table:
Step 1: Switch user (not needed if your private user is allowed to execute hive commands)
sudo su - hduser0006
Step 2: Execute the commands
hive -e "set mapred.job.queue.name=queue_0006_02;
insert into table aml_awbs.fm_model_task_rerun_set
select 'ky', 'zq', '1214-25', '20141202', 'y', 'y', '1.0' from default.dual"
The sorting module 207 is configured to sort the round-run tasks and the rerun tasks by priority.
The execution module 205 is further configured to execute the higher-priority tasks first.
Specifically, in this embodiment, the sorting module 207 ranks the round-run tasks and the rerun tasks by priority according to the chronological order in which the tasks were obtained. It can be understood that in other embodiments of the present invention the priority rules can be set according to actual needs.
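The ordering rule above can be sketched in Python. The task representation (a dict with `name`, `kind` and `obtained_at` keys) and the function name `order_tasks` are assumptions; the optional `priority_key` argument reflects the remark that other embodiments may define priority differently.

```python
def order_tasks(tasks, priority_key=None):
    """tasks: list of dicts with 'name', 'kind' and 'obtained_at'.
    By default, tasks obtained earlier are executed first."""
    key = priority_key or (lambda task: task["obtained_at"])
    # sorted() is stable, so tasks with equal keys keep their input order.
    return sorted(tasks, key=key)
```

A caller wanting rerun tasks to go first could, for instance, pass `priority_key=lambda t: (t["kind"] != "rerun", t["obtained_at"])`.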
The warning module 206 is further configured to monitor the task currently being executed and to issue an early warning when an exception occurs during task execution.
Specifically, the application server 2 monitors the currently executing task through the warning module 206 and issues an early warning when an exception occurs during execution, so that staff are notified to handle it promptly.
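Monitoring of this kind can be sketched in Python as follows. The function name `run_with_monitoring`, the `alert` callback and the message text are assumptions made for illustration; the sketch treats any raised exception as the abnormal condition.

```python
def run_with_monitoring(tasks, alert):
    """tasks: list of (name, callable) pairs.
    alert: callable that receives the warning text."""
    failed = []
    for name, func in tasks:
        try:
            func()
        except Exception as exc:
            # Issue the early warning and remember the failed task.
            alert(f"task {name} failed: {exc}")
            failed.append(name)
    return failed
```

The returned list of failed task names could then be used to register rerun tasks, since a rerun task is one that must be re-executed after a failure.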
Through the program module 207 above, the data task processing system 200 proposed by the present invention can also sort the obtained round-run tasks and rerun tasks by priority and execute the higher-priority tasks first, while monitoring the currently executing task and issuing an early warning when an exception occurs during execution, thereby supervising the tasks.
In addition, the present invention also proposes a data task processing method.
Refer to Fig. 5, which is a flow diagram of the first embodiment of the data task processing method of the present invention. In the present embodiment, the execution order of the steps in the flowchart shown in Fig. 5 may be changed according to different requirements, and certain steps may be omitted.
Step S301: obtain a task list from the terminal device 1.
Specifically, a Hadoop data platform center is deployed in the application server 2, and the Hadoop data platform center obtains data from the external terminal device 1. When the application server 2 processes the data obtained by the Hadoop data platform center, it needs to perform operations such as data acquisition, data cleansing, and data analysis. Each process may involve multiple tasks, some of which must be executed sequentially while others can be executed in parallel.
In the present embodiment, the application server 2 obtains the task list from the terminal device 1. The application server 2 manages the execution and ordering of these tasks through Oozie. Oozie is a Hadoop-based scheduler whose scheduling workflows are written in XML and which can schedule mr, pig, hive, shell, jar, and other jobs. The application server 2 executes the task flow nodes in sequence through Oozie, which supports fork (branching into multiple nodes) and join (merging multiple nodes into one).
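The fork and join behavior described above can be illustrated with a minimal Python sketch. It is not Oozie code and does not use any Oozie API: the `run_fork_join` helper and the three branch functions are hypothetical, and a thread pool merely stands in for the scheduler.

```python
from concurrent.futures import ThreadPoolExecutor


def run_fork_join(branches, join_step):
    """Run every branch in parallel (fork), then hand all results to join_step (join)."""
    with ThreadPoolExecutor(max_workers=len(branches)) as pool:
        # map() preserves the order of `branches`, so results line up with inputs.
        results = list(pool.map(lambda task: task(), branches))
    return join_step(results)


# Hypothetical branch tasks standing in for acquisition, cleansing and analysis.
def acquire():
    return "acquired"


def cleanse():
    return "cleansed"


def analyze():
    return "analyzed"


summary = run_fork_join([acquire, cleanse, analyze], lambda parts: ",".join(parts))
```

The join step runs only after every forked branch has returned, which mirrors the statement that join merges multiple nodes into one.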
Step S302: configure the dependents of the tasks in the task list, so as to configure the dependency between data and tasks.
Specifically, configuring the task dependents means configuring the dependency between data and tasks: a task can be executed only when its data is complete. In the present embodiment, the application server 2 obtains the valid relier (dependent) configuration of the task flow node fork (branching into multiple nodes), executes the relier status query statements, and outputs the raw results. It then merges the multiple task nodes, completes the dependent statuses, and deduplicates the dependency results. Finally, it marks the deduplicated dependency results with dependency-configuration slice labels, which completes the scheduling dependencies of all tasks.
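The merge, completion, deduplication, and labeling steps can be sketched in Python as follows. This is only an illustration under stated assumptions: the row layout `(relier_name, datestr, state)`, the rule that a missing state counts as 'N', the `fork-<n>` label format, and the name `i_other` are all hypothetical and are not specified by the description.

```python
def merge_dependency_results(raw_by_fork):
    """Merge per-fork query results, fill in missing states, deduplicate, and label.

    raw_by_fork maps a fork number to a list of (relier_name, datestr, state)
    tuples; state may be None when the status query returned nothing.
    """
    merged = {}
    for fork, rows in sorted(raw_by_fork.items()):
        for relier_name, datestr, state in rows:
            key = (relier_name, datestr)
            # Complete the status: a missing state is treated as not ready ('N').
            state = state or "N"
            if key not in merged:
                # Deduplicate: keep the first fork that reported this dependency.
                merged[key] = {"state": state, "slice": f"fork-{fork}"}
            elif state == "Y":
                # A later duplicate may still confirm that the data is ready.
                merged[key]["state"] = "Y"
    return merged


raw = {
    1: [("i_jt-aml-999-cd", "20150101", "Y"), ("i_other", "20150101", None)],
    2: [("i_jt-aml-999-cd", "20150101", "Y")],
}
result = merge_dependency_results(raw)
```

Keying the merged results on the dependent name and date is one simple way to guarantee that each dependency appears once, whichever fork reported it.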
Refer to Table 1 for the dependency configuration format requirements in the present embodiment.
Table 1
In the present embodiment, the application server 2 loads the configuration file with hive load and overwrites the configuration table. Please obtain the latest configuration file from the production environment, modify it, and then run the deployment. The deployment commands that configure the dependents are:
Step 1: Upload the script to the /tmp directory, grant permissions, and format it (so that the private user is allowed to operate on it)
chmod 777 /tmp/relier_config_all.txt
Step 2: Switch user (if your private user is allowed to execute hive commands, no switch is needed)
sudo su - hduser0006
Step 3: Execute the command
hive -e "use aml_awbs; set mapred.job.queue.name=queue_0006_02;
truncate table fm_relier_check_script;
load data local inpath '/tmp/relier_config_all.txt' into table aml_awbs.fm_relier_check_script;"
In another embodiment of the invention, the dependents can be configured by modifying the configuration with the following deployment commands:
Step 1: Switch user (if your private user is allowed to execute hive commands, no switch is needed)
sudo su - hduser0006
Step 2: Execute the command
hive -e "set mapred.job.queue.name=queue_0006_02;
insert overwrite table aml_awbs.fm_relier_check_script
select relier_name,
src_job_name,
if(relier_name='i_jt-aml-999-cd',
'select concat(y,m,d) datestr, \'Y\' state from aml_awbs.JOB_STATE where JOB_NAME=\'jt-aml-999-cd\'',
relier_name) script_string,
fork
from aml_awbs.fm_relier_check_script"
Step S303: record the execution process of data synchronization for the tasks.
Specifically, as described above, only tasks whose data is complete can be executed. Therefore, to ensure that the Hadoop data platform center obtains complete data from the external terminal device 1, the application server 2 records the execution process of data synchronization whenever data is updated or modified. In the present embodiment, the application server 2 uses shell to create a log and a status table, which record the execution process of data synchronization and the execution time of the synchronization.
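The log and status table can be pictured with the following Python sketch. The description uses shell for this step, so the sketch is only an analogy: the in-memory SQLite database, the table and column names, and the `record_sync` helper are assumptions made for illustration.

```python
import sqlite3
from datetime import datetime


def record_sync(conn, job_name, state, when=None):
    """Append one synchronization event to the log and update the status table."""
    when = when or datetime.now().isoformat(timespec="seconds")
    conn.execute(
        "INSERT INTO sync_log (job_name, state, logged_at) VALUES (?, ?, ?)",
        (job_name, state, when),
    )
    # The status table keeps only the latest state and time for each job.
    conn.execute(
        "INSERT OR REPLACE INTO job_state (job_name, state, synced_at) VALUES (?, ?, ?)",
        (job_name, state, when),
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sync_log (job_name TEXT, state TEXT, logged_at TEXT)")
conn.execute("CREATE TABLE job_state (job_name TEXT PRIMARY KEY, state TEXT, synced_at TEXT)")

record_sync(conn, "jt-aml-999-cd", "N", "2015-01-01T01:00:00")
record_sync(conn, "jt-aml-999-cd", "Y", "2015-01-01T02:30:00")
```

The log keeps every event so that the synchronization history can be inspected, while the status table answers the simpler question of whether a job's data is currently complete.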
Step S304: judge whether data synchronization has completed according to the execution process of the data synchronization and the dependency between data and tasks.
Specifically, before executing a task, the application server 2 first judges whether the data has been synchronized. The application server 2 makes this judgment according to the execution process of data synchronization and the execution time of the synchronization recorded in the created shell log and status table, together with the dependency between data and tasks.
Step S305: when the data has been synchronized, execute the tasks whose data synchronization has completed.
Step S306: if the data has not been synchronized, send out warning information.
Specifically, the application server 2 executes a task only when data synchronization has completed, that is, when the data is complete. When data synchronization has not completed, the application server 2 sends out warning information. In the present embodiment, the warning information includes, but is not limited to, information about the data that has not been synchronized and the time of the last synchronization, so as to notify staff to intervene manually.
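Steps S304 to S306 amount to a gate in front of task execution, which can be sketched in Python as follows. The `run_or_alert` helper, the dataset names, and the shape of the status mapping are hypothetical; only the rule itself (execute when synchronized, otherwise warn with the unsynchronized data and the last synchronization time) comes from the description.

```python
def run_or_alert(task_name, required_data, sync_status, execute, alert):
    """Execute the task only if every dataset it depends on has finished syncing.

    sync_status maps a dataset name to (state, last_sync_time); state 'Y'
    means synchronized. Returns True when the task was executed.
    """
    pending = {
        name: sync_status.get(name, ("N", None))[1]
        for name in required_data
        if sync_status.get(name, ("N", None))[0] != "Y"
    }
    if pending:
        # The warning carries the unsynchronized datasets and their last sync times.
        alert({"task": task_name, "unsynced": pending})
        return False
    execute(task_name)
    return True


status = {
    "trades": ("Y", "2015-01-01T02:30:00"),
    "accounts": ("N", "2014-12-31T02:10:00"),
}
executed, alerts = [], []
ok_ready = run_or_alert("report_a", ["trades"], status, executed.append, alerts.append)
ok_blocked = run_or_alert("report_b", ["trades", "accounts"], status, executed.append, alerts.append)
```

Here `report_a` runs because its only input is synchronized, while `report_b` is held back and produces a warning that names `accounts` and its last synchronization time.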
Through the above steps S301-S306, the data task processing method proposed by the present invention first obtains a task list from the terminal device 1; then configures the task dependents, so as to analyze the dependency between data and tasks; then records the execution process of data synchronization; further, judges whether the data has been synchronized according to the execution process of the data synchronization and the dependency between data and tasks; finally, if the data has been synchronized, executes the tasks whose data synchronization has completed, and if the data has not been synchronized, sends out warning information. In this way, the defect of chaotic dependency control between data and tasks in the prior art can be avoided, and whether the data is synchronized can be judged from the execution process of the data synchronization and the dependency between data and tasks, so that a task is executed only after its data synchronization has completed.
Further, based on the above first embodiment of the data task processing method of the present invention, a second embodiment of the data task processing method of the present invention is proposed.
Refer to Fig. 6, which is a flow diagram of the second embodiment of the data task processing method of the present invention. In the present embodiment, the method further includes the following steps:
Step S401: obtain the round-run tasks awaiting execution.
As described above, in the first embodiment a task can be executed only when data synchronization has completed. In the present embodiment, the tasks include, but are not limited to, round-run tasks and rerun tasks. A round-run task is a task that is executed cyclically within its validity period; a rerun task is a task that needs to be re-executed after a failed execution.
Specifically, the application server 2 obtains the round-run tasks awaiting execution and judges whether the dependent configuration is satisfied. Provided that the dependent configuration is satisfied, it analyzes the valid round-run date series of the task. In the present embodiment, the round-run task sequence is based on a working-day or natural-day series, and by default the round-run sequence covers at most the past 730 days.
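The round-run date series can be sketched in Python as follows. The `round_run_dates` helper is hypothetical, and two points are assumptions rather than statements from the description: working days are taken to be Monday to Friday with no holiday calendar, and the 730-day limit is counted backwards from the current date.

```python
from datetime import date, timedelta

MAX_LOOKBACK_DAYS = 730  # default upper bound on how far back the series reaches


def round_run_dates(date_start, date_end, today, working_days_only=True):
    """Return the dates within [date_start, date_end] on which the task should run.

    The series never extends beyond `today` and never reaches back more than
    MAX_LOOKBACK_DAYS. Working days are assumed to be Monday to Friday.
    """
    first = max(date_start, today - timedelta(days=MAX_LOOKBACK_DAYS))
    last = min(date_end, today)
    series = []
    current = first
    while current <= last:
        if not working_days_only or current.weekday() < 5:
            series.append(current)
        current += timedelta(days=1)
    return series


week = round_run_dates(date(2015, 1, 1), date(2099, 1, 1), today=date(2015, 1, 7))
```

With a validity period starting on 1 January 2015 and a current date of 7 January 2015, the working-day series skips the weekend of 3 and 4 January.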
Refer to Table 2 for the round-run task configuration requirements:
Table 2
In the present embodiment, the code that implements a round-run task is as follows.
Insert into the configuration table:
Step 1: Switch user (if your private user is allowed to execute hive commands, no switch is needed)
sudo su - hduser0006
Step 2: Execute the command
hive -e "use aml_awbs; set mapred.job.queue.name=queue_0006_02;
insert overwrite table aml_awbs.fm_model_config
select
board,series,model,date_start,date_end,ask_execute,ask_export,desc_interval,desc_relier
from aml_awbs.fm_model_config
where regexp_replace(upper(concat(board,series,model)),'','')<>
regexp_replace(upper(concat('ky','zq','1214-13')),'','')
union all
select 'ky' board,'zq' series,'1214-13' model,
'20150101' date_start,'20990101' date_end,
'Y' ask_execute,'N' ask_export,
'w:d:1' desc_interval,
'i_jt-aml-investzq-import-cd:15' desc_relier
from default.dual"
Step S402: obtain the rerun tasks awaiting execution.
Specifically, refer to Table 3, which gives the configuration requirements for rerun tasks in an embodiment of the present invention:
Table 3
In the present embodiment, the code that implements a rerun task is as follows.
Insert into the configuration table:
Step 1: Switch user (if your private user is allowed to execute hive commands, no switch is needed)
sudo su - hduser0006
Step 2: Execute the command
hive -e "set mapred.job.queue.name=queue_0006_02;
insert into table aml_awbs.fm_model_task_rerun_set
select 'ky','zq','1214-25','20141202','y','y','1.0' from default.dual"
Step S403: sort the round-run tasks and the rerun tasks by priority level.
Step S404: execute higher-priority tasks first.
Specifically, in the present embodiment, the application server 2 ranks the round-run tasks and the rerun tasks by priority according to the chronological order in which the tasks were obtained. It can be understood that, in other embodiments of the present invention, the priority requirements can be set according to actual needs.
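Ordering by acquisition time can be sketched with a small priority queue in Python. The `TaskQueue` class is hypothetical, the task identifier `1214-26` is invented for the example, and the sketch assumes that a task obtained earlier has higher priority, which the description implies but does not state outright.

```python
import heapq
import itertools


class TaskQueue:
    """Priority queue in which tasks obtained earlier are executed first."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker that preserves insertion order

    def add(self, obtained_at, kind, name):
        # An earlier acquisition time sorts first, i.e. has higher priority.
        heapq.heappush(self._heap, (obtained_at, next(self._counter), kind, name))

    def pop(self):
        _, _, kind, name = heapq.heappop(self._heap)
        return kind, name

    def __len__(self):
        return len(self._heap)


queue = TaskQueue()
queue.add("2015-01-02T08:00", "round-run", "1214-13")
queue.add("2015-01-01T09:00", "rerun", "1214-25")
queue.add("2015-01-02T08:00", "rerun", "1214-26")
order = [queue.pop() for _ in range(len(queue))]
```

A different priority rule, as allowed for other embodiments, would only require changing the first element of the tuple pushed onto the heap.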
Step S405: monitor the currently executing task and send out an early warning when an exception occurs during task execution.
Specifically, the application server 2 monitors the currently executing task and, when an exception occurs during task execution, sends out an early warning to notify staff to handle it promptly.
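The monitoring step can be sketched in Python as a wrapper around task execution. The `run_with_monitoring` helper and the failing task are hypothetical, and the sketch assumes that an exception is any Python exception raised by the task and that the remaining tasks continue after a warning; the description does not say what happens to later tasks.

```python
def run_with_monitoring(tasks, alert):
    """Run tasks in order; on an exception, send a warning and continue with the rest."""
    finished = []
    for name, action in tasks:
        try:
            action()
        except Exception as exc:  # any failure during execution counts as an anomaly
            alert(f"task {name} failed: {exc}")
        else:
            finished.append(name)
    return finished


def failing_task():
    raise RuntimeError("source table missing")


sent_warnings = []
done = run_with_monitoring(
    [("1214-13", lambda: None), ("1214-25", failing_task)],
    sent_warnings.append,
)
```

In a real deployment the `alert` callback would notify staff, for example by message or e-mail; here it simply collects the warning text.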
Through the above steps S401-S405, the data task processing method proposed by the present invention can also sort the obtained round-run tasks and rerun tasks by priority level, execute higher-priority tasks first, and at the same time monitor the currently executing task and send out an early warning when an exception occurs during task execution, thereby supervising the tasks.
The above embodiments of the present invention are for description only and do not indicate the relative merits of the embodiments.
Through the above description of the embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, and of course also by hardware, but in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc) and includes several instructions that cause a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, a network device, or the like) to execute the methods described in the embodiments of the present invention.
The above are only preferred embodiments of the present invention and do not limit the patent scope of the invention. Any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of the present invention, whether applied directly or indirectly in other related technical fields, is likewise included in the scope of patent protection of the present invention.
Claims (10)
1. A data task processing method, applied to an application server, characterized in that the method comprises:
obtaining a task list from a terminal device;
configuring the dependents of the tasks in the task list, so as to analyze the dependency between data and tasks;
recording the execution process of data synchronization for the tasks;
judging whether the data has been synchronized according to the execution process of the data synchronization and the dependency between data and tasks;
if the data has been synchronized, executing the tasks whose data synchronization has completed;
if the data has not been synchronized, sending out warning information.
2. The data task processing method as claimed in claim 1, characterized in that the step of configuring the dependents of the tasks in the task list, so as to analyze the dependency between data and tasks, specifically comprises the following steps:
obtaining the valid dependency configuration of the flow nodes of the tasks;
executing the dependent status query statements and outputting the original dependency results;
merging multiple task nodes, completing the dependent statuses, and deduplicating the dependency results;
marking the deduplicated dependency results with dependency-configuration slice labels, completing the analysis of the dependencies of all tasks.
3. The data task processing method as claimed in claim 1, characterized in that the step of executing, if the data has been synchronized, the tasks whose data synchronization has completed specifically comprises the following steps:
obtaining the round-run tasks and rerun tasks awaiting execution;
executing the round-run tasks and the rerun tasks.
4. The data task processing method as claimed in claim 3, characterized in that, before the step of executing the round-run tasks and the rerun tasks, the method further comprises the following steps:
sorting the round-run tasks and the rerun tasks by priority level;
executing higher-priority tasks first.
5. The data task processing method as claimed in claim 4, characterized in that the method further comprises the following steps:
monitoring the currently executing task;
sending out an early warning when an exception occurs during task execution.
6. An application server, characterized in that the application server comprises a memory and a processor, the memory storing a data task processing system operable on the processor, the data task processing system implementing the following steps when executed by the processor:
obtaining a task list from a terminal device;
configuring the task dependents, so as to analyze the dependency between data and tasks;
recording the execution process of data synchronization;
judging whether the data has been synchronized according to the execution process of the data synchronization and the dependency between data and tasks;
if the data has been synchronized, executing the tasks whose data synchronization has completed;
if the data has not been synchronized, sending out warning information.
7. The application server as claimed in claim 6, characterized in that the step of configuring the dependents of the tasks in the task list, so as to analyze the dependency between data and tasks, specifically comprises the following steps:
obtaining the valid dependency configuration of the flow nodes of the tasks;
executing the dependent status query statements and outputting the original dependency results;
merging multiple task nodes, completing the dependent statuses, and deduplicating the dependency results;
marking the deduplicated dependency results with dependency-configuration slice labels, completing the analysis of the dependencies of all tasks.
8. The application server as claimed in claim 7, characterized in that the step of executing, if the data has been synchronized, the tasks whose data synchronization has completed specifically comprises the following steps:
obtaining the round-run tasks and rerun tasks awaiting execution;
sorting the round-run tasks and the rerun tasks by priority level;
executing higher-priority tasks first.
9. The application server as claimed in claim 8, characterized in that the data task processing system, when executed by the processor, further implements the following steps:
monitoring the currently executing task;
sending out an early warning when an exception occurs during task execution.
10. A computer-readable storage medium, the computer-readable storage medium storing a data task processing system, the data task processing system being executable by at least one processor so that the at least one processor performs the steps of the data task processing method as claimed in any one of claims 1-5.
Priority Applications (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201810066359.7A CN108427600B (en) | 2018-01-24 | 2018-01-24 | Data task processing method, application server and computer readable storage medium |
| PCT/CN2018/089192 WO2019144552A1 (en) | 2018-01-24 | 2018-05-31 | Data task processing method, application server and computer-readable storage medium |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201810066359.7A CN108427600B (en) | 2018-01-24 | 2018-01-24 | Data task processing method, application server and computer readable storage medium |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN108427600A (en) | 2018-08-21 |
| CN108427600B (en) | 2021-03-16 |
Family
ID=63156041
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN201810066359.7A Active CN108427600B (en) | 2018-01-24 | 2018-01-24 | Data task processing method, application server and computer readable storage medium |
Country Status (2)
| Country | Link |
|---|---|
| CN (1) | CN108427600B (en) |
| WO (1) | WO2019144552A1 (en) |
Cited By (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN113946626A (en) * | 2021-10-26 | 2022-01-18 | 中国平安人寿保险股份有限公司 | Data synchronization detection method, device, computer equipment and storage medium |
Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN102129390A (en) * | 2011-03-10 | 2011-07-20 | 中国科学技术大学苏州研究院 | Task scheduling system of on-chip multi-core computing platform and method for task parallelization |
| CN102750179A (en) * | 2011-04-22 | 2012-10-24 | 中国移动通信集团河北有限公司 | Method and device for scheduling tasks between cloud computing platform and data warehouse |
| CN103873567A (en) * | 2014-03-03 | 2014-06-18 | 北京智谷睿拓技术服务有限公司 | Task-based data transmission method and data transmission device |
| CN104092591A (en) * | 2014-08-04 | 2014-10-08 | 飞狐信息技术(天津)有限公司 | Task monitoring method and system |
| CN106980543A (en) * | 2017-04-05 | 2017-07-25 | 福建智恒软件科技有限公司 | The distributed task dispatching method and device triggered based on event |
Family Cites Families (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN103116525A (en) * | 2013-01-24 | 2013-05-22 | 贺海武 | Map reduce computing method under internet environment |
| CN104615486B (en) * | 2014-12-26 | 2019-07-02 | 北京京东尚科信息技术有限公司 | For searching for the multi-task scheduling of Extension Software Platform and executing methods, devices and systems |
| CN106294496B (en) * | 2015-06-09 | 2020-02-07 | 北京京东尚科信息技术有限公司 | Data migration method and tool based on hadoop cluster |
| CN105184470A (en) * | 2015-08-28 | 2015-12-23 | 浪潮软件股份有限公司 | Message mode-based method for integrating task lists of multiple business systems |
-
2018
- 2018-01-24 CN CN201810066359.7A patent/CN108427600B/en active Active
- 2018-05-31 WO PCT/CN2018/089192 patent/WO2019144552A1/en not_active Ceased
Also Published As
| Publication number | Publication date |
|---|---|
| CN108427600B (en) | 2021-03-16 |
| WO2019144552A1 (en) | 2019-08-01 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| PB01 | Publication | ||
| SE01 | Entry into force of request for substantive examination | ||
| GR01 | Patent grant | ||