CN120104569A - A data archiving processing method and system based on a shutdown system

Info

Publication number
CN120104569A
CN120104569A
Authority
CN
China
Prior art keywords
data
storage
technology
preliminary
algorithm
Prior art date
Legal status
Granted
Application number
CN202510584865.5A
Other languages
Chinese (zh)
Other versions
CN120104569B (en)
Inventor
阙兢兢
张雷
刘兆攀
赖家先
Current Assignee
Hangzhou Yikangxin Technology Co ltd
Original Assignee
Hangzhou Yikangxin Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Hangzhou Yikangxin Technology Co ltd filed Critical Hangzhou Yikangxin Technology Co ltd
Priority to CN202510584865.5A
Publication of CN120104569A
Application granted
Publication of CN120104569B
Legal status: Active
Anticipated expiration

Classifications

    • G06F 16/134 Distributed indices (file access structures)
    • G06F 16/16 File or folder operations, e.g. details of user interfaces specifically adapted to file systems
    • G06F 16/182 Distributed file systems
    • G06F 21/602 Providing cryptographic facilities or services (protecting data)
    • G06F 21/64 Protecting data integrity, e.g. using checksums, certificates or signatures
    • H04L 9/008 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications involving homomorphic encryption

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Computer Hardware Design (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Human Computer Interaction (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract


The present invention discloses a data archiving processing method and system based on a shutdown system. The method comprises: classifying data according to the data type, access frequency and business importance of the shutdown system using a deep-learning-based multi-dimensional classification model, to obtain a classified data set and its priority labels; inputting the classified data set into a neural-network-based joint compression and encryption processing model, to obtain compressed and encrypted data packets; distributing the compressed and encrypted data packets to a distributed storage system and generating a data storage path index with a graph-database-based index construction algorithm, to obtain a distributed storage index table; and, according to the distributed storage index table, verifying data integrity with a blockchain-based archiving verification technique to generate an archiving verification report. The embodiments of the present invention enable intelligent classification, effective compression and secure storage of data in the shutdown system, ultimately ensuring the integrity and traceability of the data.

Description

Data archiving processing method and system based on a shutdown system
Technical Field
The invention belongs to the technical field of data archiving, and particularly relates to a data archiving processing method and system based on a shutdown system.
Background
With the rapid development of information technology, large amounts of data are generated and used across industries. Many businesses accumulate vast amounts of historical data during their operation, and managing and storing such data is becoming a significant challenge. In industries such as finance, healthcare and energy, the data is not only huge in volume but also contains sensitive information that bears on enterprise security and business compliance. How to archive the data of a shutdown system efficiently and securely has therefore become an important problem to be solved. Traditional data archiving methods generally adopt static classification and simple storage schemes, and often cannot meet modern enterprises' requirements for efficiency, security and flexibility of data storage. In these methods, data classification often relies on manual intervention, resulting in inefficient archiving and difficulty adapting to rapidly changing business requirements.
Disclosure of Invention
The invention aims to provide a data archiving processing method and system based on a shutdown system, which address the deficiencies of the prior art, enable intelligent classification, effective compression and secure storage of data in the shutdown system, and ultimately ensure the integrity and traceability of the data.
One embodiment of the application provides a data archiving processing method based on a shutdown system, which comprises the following steps:
Classifying data by adopting a deep-learning-based multidimensional classification model according to the data type, access frequency and business importance of a shutdown system, wherein the multidimensional classification model dynamically divides the archiving priority of the data through an attention mechanism and an adaptive weight distribution algorithm, to obtain a classified data set and its priority labels;
Inputting the classified data set into a neural network-based compression and encryption combined processing model for data compression and encryption, wherein the combined processing model ensures the data security and simultaneously realizes high-efficiency compression by combining a quantization compression algorithm and homomorphic encryption technology to obtain a compressed and encrypted data packet;
Distributing the compressed and encrypted data packet to a distributed storage system, and generating a data storage path index by adopting an index construction algorithm based on a graph database, wherein the index construction algorithm optimizes data retrieval efficiency and storage load balance through dynamic hash mapping and a distributed consistency protocol to obtain a distributed storage index table;
And verifying the data integrity by using a blockchain-based archiving verification technology according to the distributed storage index table, wherein the archiving verification technology monitors the data storage state and the access records in real time by combining smart contracts and distributed ledger technology to generate an archiving verification report, so that the reliability and traceability of data archiving are ensured.
Optionally, the classifying of the data according to the data type, access frequency and business importance of the shutdown system by adopting a deep-learning-based multidimensional classification model, wherein the multidimensional classification model dynamically divides the archiving priority of the data through an attention mechanism and an adaptive weight distribution algorithm to obtain a classified data set and its priority labels, comprises:
According to the data types in the shutdown system, acquiring multi-source data in real time by adopting an edge-computing-based data acquisition framework, and carrying out noise filtering and missing-value filling on the data through an adaptive data cleaning algorithm, to generate a preliminary standardized data set;
For the preliminary standardized data set, a deep-learning-based multidimensional classification model is adopted; multidimensional features of the data are extracted by combining data type, access frequency and business importance, and the associations between features of different dimensions are captured through a multi-head attention mechanism to generate a preliminary feature representation;
For the preliminary feature representation, a priority classification method based on an adaptive weight distribution algorithm is adopted; combining the data access frequency and the business importance, the archiving priority of the data is dynamically assigned, and the accuracy and rationality of the weight distribution are optimized through an attention mechanism, generating preliminary priority labels;
For the preliminary priority labels, a data classification method based on a clustering algorithm is adopted; the data are classified into different categories by combining the data types and the priority labels, classification accuracy and consistency are ensured through a dynamic threshold adjustment technique, and a classified data set and its priority labels are generated.
Optionally, the data set after classification is input into a neural network-based compression and encryption combined processing model for data compression and encryption, where the combined processing model combines a quantization compression algorithm and a homomorphic encryption technology to ensure data security and realize efficient compression, so as to obtain a compressed and encrypted data packet, and the method includes:
The classified data set is converted into a low-precision representation by adopting a neural-network-based quantization compression algorithm, and the quantization precision is dynamically adjusted through an adaptive quantization threshold adjustment technique, so that preliminary compressed data is generated while keeping the loss of data information to a minimum;
An encryption method based on homomorphic encryption technology is adopted for the preliminary compressed data; the data is encrypted in combination with its priority labels and security requirements, and the efficiency and security of the encryption process are ensured through a lightweight key management technique, generating preliminary encrypted data;
for the preliminary encrypted data, adopting a neural network-based combined optimization method, combining a quantization compression algorithm and a homomorphic encryption technology, dynamically adjusting compression and encryption parameters, and generating a preliminary compressed encrypted data packet by balancing compression efficiency and encryption security through a multi-objective optimization algorithm;
For the preliminary compressed and encrypted data packet, a hash-based data integrity verification method is adopted to ensure that the data is not damaged during compression and encryption, and compression and encryption parameters are dynamically adjusted through a feedback correction technique to generate the final compressed and encrypted data packet.
Optionally, the distributing the compressed and encrypted data packet to a distributed storage system, generating a data storage path index by adopting an index construction algorithm based on a graph database, where the index construction algorithm optimizes data retrieval efficiency and storage load balancing through dynamic hash mapping and a distributed consistency protocol, and obtains a distributed storage index table, and the method includes:
For the compressed and encrypted data packet, a distribution method based on a distributed storage system is adopted, a data storage path is planned by combining a priority label of data and a load state of a storage node, and the balance of data distribution is ensured through a dynamic hash mapping algorithm, so that a preliminary storage path plan is generated;
For preliminary storage path planning, an index construction method based on a graph database is adopted, a data storage path is abstracted into nodes and edges in a graph structure, and consistency and reliability of indexes are ensured through a distributed consistency protocol, so that a preliminary index structure is generated;
For the preliminary index structure, an index optimization method based on dynamic load balancing is adopted, the index structure is dynamically adjusted by combining the real-time load state and the data access frequency of the storage nodes, and the data retrieval efficiency and the storage load balancing are optimized through a self-adaptive hash mapping technology, so that an optimized index structure is generated;
And mapping the optimized index structure into a distributed storage index table by adopting an index table generation method based on a visualization technology, ensuring the accuracy and consistency of the index table by adopting a real-time monitoring technology, and generating a final distributed storage index table.
Optionally, the data integrity is verified according to the distributed storage index table by using a blockchain-based archiving verification technology, where the archiving verification technology monitors the data storage status and the access records in real time by combining smart contracts and distributed ledger technology, generates an archiving verification report, and ensures the reliability and traceability of data archiving, and includes:
According to the distributed storage index table, a blockchain-based archive verification technology is adopted, the data storage state is verified in combination with smart contracts, a hash verification algorithm ensures that the data is not damaged or tampered with during storage, and a preliminary integrity verification result is generated;
For the preliminary integrity verification result, a recording method based on distributed ledger technology is adopted; the data storage state and access records are written into the blockchain, the tamper-resistance and traceability of the records are ensured through a consensus mechanism, and a preliminary distributed ledger record is generated;
For the distributed ledger record, a real-time monitoring method based on smart contracts is adopted; combining the data access frequency and the storage state, abnormal behaviors are detected, and a preliminary monitoring report is generated through an anomaly detection algorithm;
And for the preliminary monitoring report, integrating the integrity verification result, the distributed ledger record and the monitoring report into an archiving verification report by adopting a report generation method based on natural language generation, and generating the final archiving verification report with a visualization technique.
Yet another embodiment of the present application provides a data archiving and processing system based on a shutdown system, the system comprising:
The classifying module is used for classifying the data by adopting a deep-learning-based multidimensional classification model according to the data type, access frequency and business importance of the shutdown system, wherein the multidimensional classification model dynamically divides the archiving priority of the data through an attention mechanism and an adaptive weight distribution algorithm, to obtain a classified data set and its priority labels;
the processing module is used for inputting the classified data set into a compression and encryption combined processing model based on a neural network to compress and encrypt the data, wherein the combined processing model ensures the data security and simultaneously realizes high-efficiency compression by combining a quantization compression algorithm and a homomorphic encryption technology to obtain a compressed and encrypted data packet;
The index module is used for distributing the compressed and encrypted data packet to a distributed storage system, and generating a data storage path index by adopting an index construction algorithm based on a graph database, wherein the index construction algorithm optimizes data retrieval efficiency and storage load balance through dynamic hash mapping and a distributed consistency protocol to obtain a distributed storage index table;
And the archiving module is used for verifying the data integrity by utilizing a blockchain-based archiving verification technology according to the distributed storage index table, wherein the archiving verification technology monitors the data storage state and the access records in real time by combining smart contracts and distributed ledger technology to generate an archiving verification report, so that the reliability and traceability of data archiving are ensured.
A further embodiment of the application provides a storage medium having a computer program stored therein, wherein the computer program is arranged to perform the method of any of the preceding claims when run.
Yet another embodiment of the application provides an electronic device comprising a memory having a computer program stored therein and a processor configured to run the computer program to perform the method recited in any of the preceding claims.
Compared with the prior art, the data archiving processing method based on a shutdown system provided by the invention classifies data by adopting a deep-learning-based multidimensional classification model according to the data type, access frequency and business importance of the shutdown system, to obtain a classified data set and its priority labels; inputs the classified data set into a neural-network-based joint compression and encryption processing model, to obtain compressed and encrypted data packets; distributes the compressed and encrypted data packets to a distributed storage system and generates a data storage path index by adopting a graph-database-based index construction algorithm, to obtain a distributed storage index table; and, according to the distributed storage index table, verifies the data integrity by utilizing a blockchain-based archiving verification technology to generate an archiving verification report. It can thereby realize intelligent classification, effective compression and secure storage of data in the shutdown system, and ultimately ensure the integrity and traceability of the data.
Drawings
Fig. 1 is a hardware block diagram of a computer terminal for the data archiving method based on a shutdown system according to an embodiment of the present invention;
Fig. 2 is a schematic flow chart of the data archiving method based on a shutdown system according to an embodiment of the present invention;
fig. 3 is a schematic structural diagram of the data archiving and processing system based on a shutdown system according to an embodiment of the present invention.
Detailed Description
The embodiments described below by referring to the drawings are illustrative only and are not to be construed as limiting the invention.
An embodiment of the invention first provides a data archiving processing method based on a shutdown system, which can be applied to electronic equipment such as a computer terminal, in particular an ordinary computer.
The following takes a computer terminal as an example to describe its operation in detail. Fig. 1 is a hardware block diagram of a computer terminal for the data archiving method based on a shutdown system according to an embodiment of the present invention. As shown in fig. 1, the computer device includes a processor, a memory, and a network interface connected by a system bus, wherein the memory may include a non-volatile storage medium and an internal memory.
The non-volatile storage medium may store an operating system and a computer program. The computer program comprises program instructions that, when executed, cause the processor to perform any of the data archiving methods based on a shutdown system described herein.
The processor is used to provide computing and control capabilities to support the operation of the entire computer device.
The internal memory provides an environment for the execution of the computer program in the non-volatile storage medium; when executed by the processor, the computer program causes the processor to perform any of the data archiving methods based on a shutdown system described herein.
The network interface is used for network communication, such as transmitting assigned tasks. It will be appreciated by those skilled in the art that the architecture shown in fig. 1 is merely a block diagram of part of the architecture relevant to the present arrangements and does not limit the computer device on which the present arrangements may be implemented; a particular computer device may include more or fewer components than those shown, combine certain components, or arrange the components differently.
It should be appreciated that the processor may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or discrete hardware components. The general-purpose processor may be a microprocessor, or any conventional processor.
Referring to fig. 2, an embodiment of the present invention provides a data archiving processing method based on a shutdown system, which may include the following steps:
S201, classifying data by adopting a deep-learning-based multidimensional classification model according to the data type, access frequency and business importance of a shutdown system, wherein the multidimensional classification model dynamically divides the archiving priority of the data through an attention mechanism and an adaptive weight distribution algorithm, to obtain a classified data set and its priority labels;
In this process, the system uses a deep learning algorithm to identify different types of data by analyzing characteristics of the data, including its structure, content and frequency of use. The multidimensional classification model focuses, through attention mechanisms, on key characteristics of the data, such as how frequently it was accessed in historical use and its importance to business processes. Meanwhile, the adaptive weight distribution algorithm dynamically adjusts the feature weights according to real-time feedback, so that the model gives more attention to high-value or frequently accessed data and the archiving priority of the data is formed automatically.
The classification method based on deep learning can remarkably improve the efficiency and accuracy of data archiving. By dynamically prioritizing the archiving of the data, the system ensures that important and high frequency data can be prioritized and securely stored, avoiding loss of important information. In addition, the intelligent classification is beneficial to reasonable utilization of storage resources, unnecessary storage overhead is reduced, and flexibility and response capability of whole data management are improved. By combining the application of the attention mechanism and the self-adaptive algorithm, the system can continuously learn and adapt to new data characteristics and service requirements, and long-term optimization and promotion are realized.
Specifically, according to the data types in the shutdown system, an edge-computing-based data acquisition framework is adopted to acquire multi-source data in real time, and noise filtering and missing-value filling are carried out on the data through an adaptive data cleaning algorithm to generate a preliminary standardized data set;
In this step, the system employs an edge-computing-based data acquisition framework to capture various types of data in the shutdown system in real time. The introduction of edge computing brings data processing closer to the data source, reducing latency and improving acquisition efficiency. Through the adaptive data cleaning algorithm, the system automatically identifies and filters noise in the data, such as outliers and erroneous inputs, and at the same time applies a missing-value filling strategy, filling gaps with the mean or median of surrounding data, thereby ensuring that the generated data set has high integrity and accuracy.
This mode of acquisition and cleaning maximizes data quality and provides a solid foundation for subsequent analysis and classification. By effectively filtering and cleaning the data already at the acquisition stage, the system reduces data redundancy, improves the efficiency of subsequent processing, and effectively shortens the overall data processing cycle. A high-quality standardized data set also benefits the training of the deep learning model, improving the classification accuracy of the multidimensional classification model.
In this step, the system first deploys edge computing nodes to acquire multi-source data in the shutdown system. The advantage of edge computing is that data processing tasks stay as close as possible to where the data is generated, reducing network latency and improving the real-time performance of acquisition. The system designs a data acquisition framework that supports access to different data sources, including sensor data, user input, and historical records. For example, as the system collects real-time monitoring data (e.g., temperature, humidity, flow) from multiple sensors, it can stream the data to an edge node for aggregation and preliminary processing.
After the data is acquired, the system processes the acquired data by using an adaptive data cleaning algorithm. The algorithm automatically identifies and filters noise by analyzing features of the dataset. For example, during acquisition, if the system finds that some sensor data is abnormal (e.g., above or below a preset reasonable range), the data will be marked as noise and filtered out. Meanwhile, aiming at the missing value, the algorithm fills the missing data through an interpolation method or a statistical method (such as mean filling, median filling and the like) so as to ensure the integrity and reliability of the generated primary standardized data set.
After the data cleansing is completed, the resulting preliminary standardized data set can be formatted into a consistent structure for subsequent processing. This structure typically adopts a common data format (e.g., CSV or JSON) and ensures that each record contains the necessary fields, such as data type, timestamp, and acquisition source. The normalization process not only improves the consistency of the data but also lays a good foundation for subsequent analysis; a high-quality data set is key to the effectiveness of the subsequent deep learning model.
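The patent describes this step only in prose. As a rough illustrative sketch (not the patented implementation), the Python fragment below drops out-of-range sensor readings and median-fills missing values; the record layout, the range bounds and the median-fill policy are assumptions made for the example.

```python
import statistics

# Hypothetical sensor records arriving from edge nodes; None marks a
# missing reading, 99.9 is an out-of-range outlier to be filtered.
records = [
    {"sensor": "temp", "value": 23.4},
    {"sensor": "temp", "value": None},
    {"sensor": "temp", "value": 99.9},
    {"sensor": "temp", "value": 22.8},
]

LOW, HIGH = -10.0, 60.0  # assumed "preset reasonable range" for this sensor

def clean(records, low, high):
    # Noise filtering: drop values outside the preset reasonable range.
    kept = [r for r in records if r["value"] is None or low <= r["value"] <= high]
    # Missing-value filling: replace None with the median of surviving values.
    observed = [r["value"] for r in kept if r["value"] is not None]
    fill = statistics.median(observed)
    for r in kept:
        if r["value"] is None:
            r["value"] = fill
    return kept

print(clean(records, LOW, HIGH))  # the 99.9 outlier is gone, the gap is filled
```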
For the preliminary standardized data set, a deep-learning-based multidimensional classification model is adopted; multidimensional features of the data are extracted by combining data type, access frequency and business importance, and the associations between features of different dimensions are captured through a multi-head attention mechanism to generate a preliminary feature representation;
In this step, the system will apply a multi-dimensional classification model based on deep learning for the preliminary standardized dataset for feature extraction. In combination with data type, access frequency and business importance, the model will analyze the multidimensional features of each piece of data and capture complex relationships between features through a multi-headed attention mechanism. For example, some data may be important in high frequency access situations, while other data has its unique value in business critical links, and through multi-head attention, it can be ensured that the model fully considers this information, generating a more expressive feature representation.
The main significance of this step is to promote the understanding ability of the model to the data features. By utilizing a deep learning model and a multi-head attention mechanism, the system can better identify potential rules and relationships in the data, which is critical to subsequent classification decisions. The generated preliminary feature representation reflects the real value of the data, so that the model can carry out more reasonable filing priority division on the basis of multiple dimensions, and the accuracy and reliability of overall classification are improved.
After the preliminary standardized dataset is input into the deep learning based multidimensional classification model, the system first embeds the data, converting it into tensor form suitable for model processing. At this point, the model will analyze and extract potential multidimensional features for data type, access frequency, and business importance. For example, the system may analyze the user's access behavior data to identify usage habits of different user groups for particular data, thereby extracting features related to the user's behavior.
The system then uses a multi-headed attention mechanism to process the extracted multidimensional features. The multi-head attention mechanism can pay attention to the relation between the data features from different angles, and the expression capability of the model is improved. For example, for a set of features, some of which are directly related to business importance, while others may better reflect the frequency of use of the data, this mechanism allows the model to process this information in parallel in multiple "heads", progressively learning the importance of each feature in different contexts.
Finally, through the processing of multi-head attention, the model generates a preliminary feature representation. These feature representations can be considered as multi-dimensional comprehensive descriptions of each piece of data, including representation information of the data and an assessment of its importance. These feature descriptions will provide the necessary information basis for dynamic partitioning of subsequent priorities, ensuring the rationality and accuracy of the final classification.
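The patent does not disclose its network architecture; the PyTorch sketch below only illustrates the multi-head self-attention idea, treating data type, access frequency and business importance as three embedded tokens per record. Every dimension, the head count and the mean-pooling step are assumptions for the example.

```python
import torch
import torch.nn as nn

# Toy batch: 4 data records, each described by 3 "dimension tokens"
# (data type, access frequency, business importance), embedded in 16 dims.
batch, tokens, dim = 4, 3, 16
x = torch.randn(batch, tokens, dim)

# Multi-head self-attention lets each dimension token attend to the others,
# capturing associations such as "frequently accessed and business-critical".
attn = nn.MultiheadAttention(embed_dim=dim, num_heads=4, batch_first=True)
features, weights = attn(x, x, x)  # self-attention: query = key = value = x

# Pool the attended tokens into one preliminary feature vector per record.
preliminary = features.mean(dim=1)
print(preliminary.shape)  # torch.Size([4, 16])
```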
For the preliminary feature representation, a priority classification method based on an adaptive weight distribution algorithm is adopted; combining the data access frequency and the business importance, the archiving priority of the data is dynamically assigned, and the accuracy and rationality of the weight distribution are optimized through an attention mechanism, generating preliminary priority labels;
In this step, the system will apply a method based on an adaptive weight distribution algorithm to the initially generated feature representation to prioritize the data. The system dynamically adjusts the weight of each data by analyzing the access frequency and the service importance of the data, and prioritizes the data according to the weights. The application of attention technology enables the system to focus on features most relevant to data archiving, thereby enabling more reasonable archiving prioritization, ensuring fast processing of high value data.
By means of the dynamic prioritization, the system remarkably improves decision efficiency of data archiving, ensures that important information can be processed and stored in time, and reduces business loss possibly caused by information delay. Meanwhile, accurate priority allocation can not only optimize the use of storage resources, but also improve the response capability of the system to the demands of users and increase the flexibility and adaptability of the service.
Based on the preliminary characterization, the system will employ an adaptive weight distribution algorithm to dynamically calculate the archive priority of each item of data. This process first combines the access frequency of the data with the business importance, passing their respective weights as inputs to the algorithm. For example, some data may take higher priority because of frequent access, while other data may be given higher weight because of its importance in critical decisions.
In the process of dynamically calculating weights, the system also uses attention techniques to optimize the weight assignment of features. By focusing on the impact of different features in data archiving, the model can assign priorities more accurately. The system will traverse all features, identify features directly related to archival decisions, and assign their corresponding attention scores, and finally calculate the comprehensive priority label for each piece of data. For example, if the access frequency of a piece of data increases rapidly over a particular period, the model will automatically detect this change and dynamically increase its priority.
Finally, the generated primary priority label is used as a guide for subsequent classification, so that timely processing of the data with high importance and high frequency is ensured. The process not only ensures the flexibility of the algorithm, but also ensures the scientificity of priority allocation, and lays a guiding foundation for the effectiveness of the whole data management system.
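A minimal sketch of the adaptive weighting idea follows, assuming softmax-normalized attention scores over just two criteria and a made-up priority threshold; the patent gives no concrete formula, so all numbers here are illustrative.

```python
import numpy as np

# Each row: [normalized access frequency, normalized business importance]
# for one data item; the values are invented for the example.
feats = np.array([[0.90, 0.80],
                  [0.10, 0.95],
                  [0.30, 0.20]])

# Attention-style scores over the two criteria. In the described system these
# would be learned and updated from real-time feedback; here they are fixed.
logits = np.array([1.0, 1.5])
weights = np.exp(logits) / np.exp(logits).sum()    # softmax: adaptive weights

scores = feats @ weights                           # per-item archiving score
labels = np.where(scores > 0.6, "high", "normal")  # threshold is an assumption
for s, l in zip(scores, labels):
    print(f"score={s:.2f} priority={l}")
```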
For the preliminary priority labels, a data classification method based on a clustering algorithm is adopted; the data are classified into different categories by combining the data types and the priority labels, classification accuracy and consistency are ensured through a dynamic threshold adjustment technique, and a classified data set and its priority labels are generated.
In this step, the system will apply a clustering algorithm based data classification method to classify the data into different categories according to the preliminary priority tags and data types. The clustering algorithm can automatically group similar data according to the data characteristics and the priorities, and the classification not only considers the service importance of the data, but also influences the subsequent data management strategy. The dynamic threshold adjustment technique ensures that in the classification process, appropriate thresholds can be adaptively set for different categories to optimize the classification effect.
The key of this step is to ensure the accuracy and consistency of data classification, thereby laying a good foundation for the archiving process of the subsequent data. By reasonably classifying the data, the system can purposefully archive and store the data, thereby avoiding complicated data management and processing flow and improving the efficiency of data processing. The classification method also provides convenience for subsequent analysis and retrieval, so that the data is more flexible and efficient to use.
In this step, the system combines the preliminary priority labels with the data types and performs automatic classification using a clustering-based method. The system first selects a clustering algorithm (e.g., K-Means or hierarchical clustering) suitable for the current data set and applies it to find the inherent links and similarities between the data. For example, the system may cluster all data marked as high priority and identify commonalities among them, thereby dividing the data into different groups.
As clustering progresses, the system also adopts a dynamic threshold adjustment technique to ensure the accuracy and consistency of the classification results. The dynamic threshold is set based on a real-time evaluation of the classification effect: the system continuously monitors the clustering results and dynamically adjusts the threshold according to the amount of data and the feature similarity within each category. For example, if the amount of data within a category is too small, the system may lower the clustering threshold for that category, thereby incorporating more data and ensuring that important information is not fragmented.
Finally, the system generates a classified data set and a priority label thereof through a clustering algorithm and a dynamic threshold adjustment process. The result not only provides basis for subsequent data archiving and storage, but also provides a good structure for data retrieval and analysis in the future, so that the overall data management becomes more efficient and systematic.
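As a sketch only, the clustering step might be approximated with scikit-learn's K-Means plus a crude small-cluster merge standing in for the dynamic threshold adjustment described above; the feature encoding, cluster count and minimum-size value are assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans

# Columns: [priority score, encoded data type]; toy values only.
X = np.array([[0.90, 0], [0.85, 0], [0.20, 1], [0.25, 1], [0.80, 1]])

km = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = km.fit_predict(X)

# Stand-in for "dynamic threshold adjustment": if a cluster holds too few
# items, merge it into the largest cluster so information is not fragmented.
MIN_SIZE = 2  # assumed minimum category size
counts = np.bincount(labels)
for c in np.where(counts < MIN_SIZE)[0]:
    labels[labels == c] = counts.argmax()

print(labels)  # classified categories for the five records
```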
S202, inputting the classified data set into a neural network-based compression and encryption combined processing model for data compression and encryption, wherein the combined processing model ensures the data security and simultaneously realizes high-efficiency compression by combining a quantization compression algorithm and a homomorphic encryption technology to obtain a compressed and encrypted data packet;
In this step, the classified data set is input to a neural network-based compression and encryption joint processing model for data compression and encryption. The joint processing model will utilize a combination of quantized compression algorithms and homomorphic encryption techniques to ensure that the data volume is compressed without losing the integrity and security of the data. The quantization compression algorithm converts high-precision data into low-precision representation by performing precision reduction processing on the data, so that the storage space occupied by the data is reduced. The homomorphic encryption technology ensures that the encrypted data can still keep operability during processing, namely, the encrypted data can still be calculated in an encrypted state, thereby avoiding the risk of exposing the original data in the sensitive data exchange process. This allows the data to be compressed efficiently without loss, speeding up the efficiency of data transfer and storage.
This combined compression-and-encryption mechanism has important practical significance. On the one hand, efficient data compression significantly reduces storage costs and bandwidth requirements; especially when large-scale data is processed, it rapidly improves transmission efficiency and storage capacity. On the other hand, homomorphic encryption safeguards data privacy, effectively preventing the data from being stolen or tampered with during processing and ensuring its security in storage and transmission. Through this mechanism, an enterprise processing the data of a shutdown system secures the data while improving processing efficiency, enhancing the overall performance and usability of the system.
Specifically, a neural-network-based quantization compression algorithm can be adopted for the classified data set to convert high-precision data into a low-precision representation, with the quantization precision dynamically adjusted through an adaptive quantization threshold adjustment technique, so that preliminary compressed data is generated while keeping the loss of data information to a minimum;
In this step, the system performs quantization compression on the classified data set. The quantization compression algorithm is characterized in that high-precision data (such as floating point numbers) are converted into low-precision representations (such as fixed point numbers), so that the storage space of the data is effectively reduced. For example, a high precision measurement may require the use of 32-bit floating point number storage, which may be compressed to 16-bit or 8-bit fixed point numbers after quantization, thereby saving storage overhead. The compression method is not only limited to numerical data, but also can be applied to data in multiple fields such as images, audios and the like, and ensures that the processing of the data is not limited due to the excessively high storage cost.
In the process, the system dynamically adjusts the quantization precision according to the characteristics of each data characteristic and the distribution condition of each data characteristic in the data set by using an adaptive quantization threshold adjustment technology. For example, for some important features, the system may choose a higher quantization accuracy, while for features that have less impact on traffic, a lower quantization accuracy is employed. Therefore, the system can effectively find the optimal balance point between the compression rate and the information loss, and the information integrity of the data is reserved to the maximum extent.
By means of quantization compression, the system can significantly reduce the storage space occupied by data, improving processing efficiency when facing massive data. Particularly in scenarios that require long-term data retention or real-time analysis, quantization reduces the storage burden and improves the system's response speed. Meanwhile, the strategy of dynamically adjusting the quantization precision ensures that compression does not affect the basic validity of the data, which is important for subsequent analysis and processing: it preserves the usability and accuracy of the data and keeps the whole system flexible and efficient in data management.
In this step, the system first receives the classified data set, applying a neural network-based quantization compression algorithm for each data feature. The algorithm aims to convert high-precision data (such as floating point numbers) into low-precision representations (such as fixed point numbers) so as to reduce the storage space occupied by the data. For example, assuming that temperature data recorded by a sensor is in the form of floating point numbers (e.g., 23.456 ℃), after quantization and compression, the data can be converted into a fixed point number form with lower precision (e.g., 23 ℃), so that not only is the storage requirement reduced, but also the efficiency of subsequent data processing is improved.
In order to ensure that the information loss in the compression process is reduced to the minimum, the system adopts an adaptive quantization threshold adjustment technology, and the technology can monitor the change of the data characteristics in real time and automatically adjust the quantization precision according to the characteristics of different data. For example, for some feature data with larger variation, the system may choose to preserve higher quantization accuracy, while for data with smaller variation, the quantization accuracy may be reduced. The dynamic adjustment mechanism enables the system to flexibly optimize the compression strategy according to actual conditions, and ensures that the generated compressed data is stored efficiently while key information is reserved.
Finally, the quantized and compressed data are summarized to form a preliminary compressed data set. These compressed data not only perform well in terms of saving storage space, but also lay the foundation for subsequent encryption steps. The low precision representation of the compressed data ensures high efficiency of storage and transmission, so that subsequent processing steps can be performed more rapidly, and the response speed and processing capacity of the system are improved as a whole.
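The fragment below is a toy illustration of range-based quantization to 8-bit integers, with the scale derived from the observed data as a simple stand-in for the adaptive quantization-threshold adjustment described above; the bit width and sample values are assumptions.

```python
import numpy as np

values = np.array([23.456, 22.981, 24.103, 23.750], dtype=np.float32)

def quantize(x, bits=8):
    # The scale adapts to the observed value range, so quantization precision
    # follows the data distribution rather than a fixed grid.
    lo, hi = float(x.min()), float(x.max())
    levels = 2 ** bits - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    q = np.round((x - lo) / scale).astype(np.uint8)
    return q, lo, scale

def dequantize(q, lo, scale):
    return q.astype(np.float32) * scale + lo

q, lo, scale = quantize(values)
print(q)                         # 1 byte per value instead of 4
print(dequantize(q, lo, scale))  # reconstruction with small, bounded error
```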
An encryption method based on homomorphic encryption technology is adopted for the preliminary compressed data; the data is encrypted in combination with its priority labels and security requirements, and the efficiency and security of the encryption process are ensured through a lightweight key management technique, generating preliminary encrypted data;
In this step, the system will perform homomorphic encryption processing on the preliminary data after quantization compression. Unlike conventional encryption methods, homomorphic encryption allows the operation on encrypted data without first decrypting, a feature that keeps the data secure during processing. In the implementation process, the system applies different encryption strategies to different types of data according to the priority labels and security requirements of the data. For example, for highly sensitive data, the system may choose a stronger encryption algorithm, while for less sensitive data, a more basic encryption method is used to optimize processing efficiency while meeting security requirements.
In addition, lightweight key management techniques play an important role in this process. The system can generate and manage the key required in the encryption process, ensure that the distribution and storage of the key meet the security standard, and simultaneously can provide efficient key operation. For example, a hash algorithm may be employed to generate a corresponding hash value for a key, thereby verifying its validity without exposing the key. The process can not only protect the security of the data, but also improve the flexibility and response speed of the system in the data encryption processing.
Homomorphic encryption bridges the data security and processing power, allowing users to perform the necessary computations while maintaining data privacy. The technical application is particularly suitable for being used in cloud computing or multi-party cooperation environments, and can effectively avoid security risks caused by data leakage. In addition, through lightweight key management, the encryption process of the system can be performed efficiently, and a user can obtain smooth operation experience while enjoying security protection. Finally, the generated primary encrypted data provides safety guarantee for subsequent data storage and retrieval, and ensures the integrity and reliability in the data archiving process.
In this step, the system will apply homomorphic encryption techniques to the preliminary compressed data for encryption. The core advantage of this technique is that it allows the calculation of encrypted data without decryption, thus protecting the privacy of the data. For example, for a set of data containing user sensitive information, the use of homomorphic encryption ensures that the data does not reveal its original content when processed, yet still enables necessary calculations, such as statistics or analysis. This makes the technique particularly suitable for scenarios where sensitive data needs to be handled securely in a cloud computing environment.
In the encryption process, the system dynamically selects a proper homomorphic encryption strategy by combining the priority label and the security requirement of each piece of data. For example, for highly sensitive financial transaction data, the system may choose to employ a complex encryption process based on an encryption algorithm (e.g., paillier or RGH) to ensure the security of the data after encryption. Meanwhile, the lightweight key management technology can improve the efficiency of the encryption process and ensure the high efficiency of key generation, distribution and use. For example, the system may use a random number generator to generate the encryption key and employ hashing techniques to verify the validity and security of the key, thereby greatly reducing the operational complexity.
Finally, the homomorphic encrypted data form a preliminary encrypted data packet, so that the safety and privacy in the preservation and transmission processes are ensured. The encryption processing in the stage provides a safe and reliable data base for the subsequent joint optimization stage, and ensures that the data is not threatened due to damage or leakage in the processing process, thereby improving the overall security architecture of the system.
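The text above names Paillier as one suitable scheme. A minimal demonstration of the additive homomorphism it provides can be written with the third-party python-paillier package (`pip install phe`); the values and key length below are illustrative only.

```python
from phe import paillier  # third-party python-paillier package

public_key, private_key = paillier.generate_paillier_keypair(n_length=1024)

# Encrypt two archived measurements; the plaintexts are invented examples.
a = public_key.encrypt(23.4)
b = public_key.encrypt(19.1)

# Additive homomorphism: the sum is computed directly on ciphertexts, so an
# archive service can aggregate data without ever seeing the plaintexts.
total = a + b
print(private_key.decrypt(total))  # approximately 42.5
```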
For the preliminary encrypted data, adopting a neural network-based combined optimization method, combining a quantization compression algorithm and a homomorphic encryption technology, dynamically adjusting compression and encryption parameters, and generating a preliminary compressed encrypted data packet by balancing compression efficiency and encryption security through a multi-objective optimization algorithm;
In this step, the system will perform joint optimization on the preliminary encrypted data with the goal of balancing the compression efficiency of the data with the security of the encryption. The neural network-based joint optimization method can dynamically adjust parameters of the quantization compression and encryption algorithm by learning characteristics of historical data. For example, the system may identify which data should employ higher compression rates to save storage by analyzing the characteristics of the current data and its access patterns, and which data requires finer encryption to ensure security.
By introducing a multi-objective optimization algorithm, the system can trade-off between compression efficiency and encryption strength, and select the parameter setting most suitable for the current data characteristics. This algorithm will typically consider multiple target outputs simultaneously, e.g. to achieve an optimal compression rate while ensuring that the data is not distorted, and to dynamically enhance the encryption strength if necessary to accommodate variations in external risk. The intelligent optimization strategy ensures the flexibility of the data in the processing process and can adapt to various operation conditions in real time.
Through the joint optimization, the system can ensure the safety of information and reduce the potential risk of data leakage while ensuring the efficient compression of data. This flexible adjustment capability not only adapts to various types of data requirements, but also provides scalability for future data processing. Therefore, the generated primary compressed encrypted data packet not only realizes the saving of storage space, but also improves the safety and reliability of data in the transmission and storage processes, and plays an important role in promoting the efficiency of the whole data processing system.
In this step, the system will apply a neural network based joint optimization method for the preliminary encrypted data. The method aims at dynamically adjusting quantization compression and homomorphic encryption parameters by learning and analyzing data characteristics so as to realize the balance between compression efficiency and encryption security. Specifically, the system may use the trained neural network model to evaluate characteristics of the current data and determine the most appropriate compression and encryption parameters based thereon.
For example, the system may analyze the access frequency, sensitivity, and compressed effects of different data types, and for frequently accessed low sensitive data, the system may select a higher compression rate to increase storage efficiency, while for sensitive data, the system may prioritize the increase in encryption strength. In this process, in conjunction with the application of the multi-objective optimization algorithm, the system is able to process multiple optimization objectives, such as compression rate, processing speed, and encryption strength, simultaneously. Through the multi-level optimization, the system can find the optimal balance among different processing demands, so that the processing process is more efficient and safer.
Finally, the generated primary compressed encrypted data packet realizes excellent storage performance on the basis of ensuring data integrity and safety. The data packet provides an effective basis for the subsequent integrity verification step, ensures that the whole data processing flow achieves an optimal solution between efficiency and safety, and improves the adaptability of the system in terms of data management.
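The patent does not disclose its multi-objective optimization algorithm. The fragment below is only a schematic grid search with hand-picked weights, showing how compression, security and speed objectives could be traded off; every candidate value and weight is an assumption.

```python
# Toy search space: (quantization bits, encryption key length). Fewer bits
# mean better compression; longer keys mean stronger but slower encryption.
candidates = [(bits, key) for bits in (4, 8, 16) for key in (1024, 2048)]

def score(bits, key, w_compress=0.5, w_security=0.3, w_speed=0.2):
    compression = 1 - bits / 32            # higher is better
    security = key / 2048                  # crude normalized strength proxy
    speed = 1 - (key / 2048) * 0.5 - (bits / 32) * 0.2
    return w_compress * compression + w_security * security + w_speed * speed

best = max(candidates, key=lambda c: score(*c))
print("chosen (bits, key_length):", best)
```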
For the preliminary compressed and encrypted data packet, a hash-based data integrity verification method is adopted to ensure that the data is not damaged during compression and encryption, and compression and encryption parameters are dynamically adjusted through a feedback correction technique to generate the final compressed and encrypted data packet.
In this step, the system performs data integrity verification on the preliminary compressed and encrypted data packet to ensure that the data is not damaged or tampered with during the entire compression and encryption process. To this end, the system adopts a hash-based verification method: it first applies a hash operation to the preliminary data packet to generate a fixed-length hash value representing the integrity of the data. When the packet is created, the system compares this hash value with that of the original, uncompressed and unencrypted data to verify that the data has remained intact and secure during processing. This process is critical, particularly when handling sensitive data, and can effectively prevent the risk of data loss or tampering.
Meanwhile, the system also implements a feedback correction technology to dynamically adjust compression and encryption parameters. By analyzing the results of the hash check, the system can identify potential problems in the data processing. For example, if the hash values do not match, the system will trigger a feedback mechanism to automatically adjust the rate of quantization compression or encryption strength to seek a solution. The mechanism enables the system to perform self-error correction and optimization in the data processing process, thereby ensuring that the finally generated data packet is safe and efficient.
The integrity verification process provides a powerful safety guarantee for data processing, ensures that the data cannot be damaged or tampered in the compression and encryption process, and lays a solid foundation for subsequent data transmission and storage. By implementing the hash check and dynamic feedback correction technology, the system can realize efficient self-adjustment, so that the data processing process is more flexible and adaptive. The method not only enhances the security of the data, but also improves the reliability of the whole data archiving process, and provides powerful support for the subsequent management and use of the data.
In this step, the system uses a hash-based data integrity verification method to ensure that the preliminary compressed and encrypted data packets are not corrupted during processing. The hash check works by generating a fixed-length hash value of the data content, which characterizes the data for all practical purposes. After the data packet is created, the system calculates its hash value and, in subsequent processing, compares it with the hash value of the original data to confirm integrity and consistency. This effectively prevents data from being corrupted or tampered with through operational errors or external attacks.
In addition, if the hash check shows that the data is abnormal or inconsistent, the system immediately starts the feedback correction technique, which analyzes the likely cause of the mismatch and dynamically corrects the compression and encryption parameters accordingly. For example, if the hash value of a data packet does not match the original value, the system may reduce the compression ratio to preserve more information, or increase the encryption strength to enhance the security of the data. This feedback mechanism ensures that data integrity and security can be maintained under all conditions, improving the robustness of the system.
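The feedback-correction loop can be sketched as follows. The `pipeline` callable, the retry limit, and the rate-adjustment step are hypothetical placeholders for the quantization compression and homomorphic encryption stages described above.

```python
import hashlib

def sha256(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def process_with_feedback(original: bytes, pipeline, max_retries: int = 3):
    """Verify that the compress-encrypt pipeline round-trips intact; on a
    hash mismatch, relax the (assumed) compression rate and retry, as in
    the feedback correction described above.

    `pipeline(data, rate)` is assumed to return (packet, recovered_bytes),
    where recovered_bytes is the decrypt+decompress result used for the check.
    """
    expected = sha256(original)
    rate = 0.3                                   # aggressive compression first
    for _ in range(max_retries):
        packet, recovered = pipeline(original, rate)
        if sha256(recovered) == expected:        # integrity confirmed
            return packet
        rate = min(1.0, rate + 0.2)              # keep more information, retry
    raise RuntimeError("integrity check failed after feedback correction")
```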
Eventually, after integrity verification and any necessary adjustments, the system generates the final compressed and encrypted data packet. This packet guarantees stability and security during storage and transmission and lays a solid foundation for subsequent data management and use. The step not only strengthens the security guarantee of the data but also improves the reliability of the whole data processing flow, ensuring efficient hand-off between all stages so that users can operate on the data more reliably.
S203, distributing the compressed and encrypted data packet to a distributed storage system, and generating a data storage path index by adopting an index construction algorithm based on a graph database, wherein the index construction algorithm optimizes data retrieval efficiency and storage load balance through dynamic hash mapping and a distributed consistency protocol to obtain a distributed storage index table;
In the method, the compressed and encrypted data packet is first transmitted to a distributed storage system to ensure high availability and reliability of the data. The system then generates a data storage path index through an index construction algorithm based on a graph database. Specifically, the algorithm applies a dynamic hash mapping technique to distribute data across different storage nodes, preventing any single node from becoming a bottleneck due to overload. Meanwhile, by using a distributed consistency protocol, the system keeps the data consistent across all storage nodes, avoiding data redundancy or inconsistency and ensuring efficient, stable access. The resulting storage path index not only records the location of stored data but also provides efficient support for subsequent data retrieval.
Distributing the compressed and encrypted data packet to the distributed storage system and generating the corresponding storage path index greatly improves data access efficiency and availability. In summary, the graph-database-based index construction algorithm optimizes retrieval efficiency, enables fast and accurate location of frequently read data, and improves the user experience. In addition, the combination of dynamic hash mapping and the distributed consistency protocol guarantees load balance across the whole system, ensuring the security and effectiveness of the data during storage. The method is suitable for large-scale data storage and management, especially for enterprise information systems involving sensitive information.
Specifically, for the compressed and encrypted data packet, a distribution method based on the distributed storage system can be adopted: a data storage path is planned by combining the data's priority label with the load state of the storage nodes, and balanced data distribution is ensured through a dynamic hash mapping algorithm, thereby generating a preliminary storage path plan;
In this step, the system plans the data storage path through a distribution method based on distributed storage, combining the priority label of the data with the load state of the storage nodes. First, the system analyzes each compressed and encrypted data packet and evaluates its priority label to ensure that important data is stored preferentially on the more capable storage nodes. When selecting storage nodes, the system also monitors the load state of each node in real time to avoid performance degradation caused by overloading any single node.
This dynamic planning of storage paths ensures balanced distribution of data across different storage nodes, thereby optimizing data access performance. Because the priority label is taken into account, key business data obtains a faster response, improving the overall performance and reliability of the system. In this way, enterprises can effectively manage and store large-scale data, keep important data available at all times, and improve the processing efficiency and scalability of the data.
In this step, the system first analyzes the received compressed and encrypted data packet and extracts its priority label. Priority labels are typically based on the importance of the data, its access frequency, and its security requirements. For example, financial transaction data may receive a higher priority while log files receive a lower one. The system then allocates a storage node to each data packet, choosing the node by weighing the data's priority against the current load of each node. In this way, the system can send important data packets preferentially to nodes with lower loads to ensure fast access and processing of the data.
To achieve balanced distribution of data storage paths, the system may use a dynamic hash mapping algorithm. The algorithm can dynamically calculate the hash value according to the characteristics of the data packet and distribute the data packet to the corresponding storage node. For example, assume that there are three storage nodes A, B and C, and data packets X, Y and Z to be stored. The system calculates hash values of the data packets and selects the most suitable node for storage according to the load condition. This dynamic computation and selection mechanism ensures a uniform distribution of data in the storage system, avoiding the performance bottlenecks of some nodes due to overload.
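One possible form of such a dynamic hash mapping is rendezvous-style hashing weighted by node load, sketched below; the node names, load values, and scoring rule are illustrative assumptions rather than the patent's exact algorithm.

```python
import hashlib

NODES = {"A": 0.82, "B": 0.35, "C": 0.51}   # node -> current load in [0, 1], illustrative

def assign_node(packet_id: str, nodes: dict[str, float]) -> str:
    """Dynamic hash mapping: score each node by a stable per-(packet, node)
    hash discounted by its current load, then pick the best. Hashing keeps
    placement deterministic; the load term steers packets toward idle nodes."""
    def score(node: str) -> float:
        h = hashlib.sha256(f"{packet_id}:{node}".encode()).hexdigest()
        stable = int(h[:8], 16) / 0xFFFFFFFF    # stable pseudo-random value in [0, 1]
        return stable * (1.0 - nodes[node])     # penalize heavily loaded nodes
    return max(nodes, key=score)

for pkt in ("X", "Y", "Z"):
    print(pkt, "->", assign_node(pkt, NODES))
```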
Finally, the generated preliminary storage path plan will include the target storage node for each data packet and its storage path information. This information will be recorded for subsequent data retrieval and management. By the method, the system not only realizes high efficiency of data storage, but also improves the overall system performance, and ensures the expandability and reliability in a large-scale data processing scene.
For preliminary storage path planning, an index construction method based on a graph database is adopted, a data storage path is abstracted into nodes and edges in a graph structure, and consistency and reliability of indexes are ensured through a distributed consistency protocol, so that a preliminary index structure is generated;
In this step, the system builds the graph-database index for the preliminary storage path. In particular, the system treats the storage path as a graph structure containing multiple nodes and edges so as to facilitate real-time access and querying of data. Each node represents a storage location, and the edges represent the connections between storage nodes. With this structure, the system can effectively organize and manage storage paths and improve data retrieval speed. Meanwhile, the reliability of the index is guaranteed by the distributed consistency protocol, avoiding the data inconsistency that may otherwise arise in a distributed environment.
Through this graph-database index construction method, management of data storage paths becomes more efficient and flexible. The graph structure allows data queries to be executed via efficient graph traversal algorithms, significantly improving retrieval speed. In addition, the distributed consistency protocol guarantees the reliability of the index structure, prevents index inconsistency caused by node failures, and improves the fault tolerance and availability of the system. The method provides an innovative solution for data management and retrieval in a dynamic storage environment.
In this step, the system translates the preliminary storage path plan into an index structure in the graph database. Specifically, each storage node is considered a node in the graph, and the connection relationship (i.e., data storage path) between nodes is considered an edge in the graph. The advantage of this graph structure is that it enables intuitive presentation of the association between the storage path and the data, making storage management clearer and more efficient.
In the implementation process, the system extracts information about the storage nodes and constructs the connections among them. For example, if packet X is stored at node A and packet Y needs to be read from node B, the system forms an edge from A to B in the graph. With this structure, the storage path of the data can be found quickly at query time, significantly reducing retrieval latency. In addition, the system manages the index using a distributed consistency protocol (e.g., Paxos or Raft) to keep the index information consistent across nodes. The protocol coordinates every index update so that node failures or network delays do not leave the index in an inconsistent state.
As a result, the generated preliminary index structure can ensure that the entire index information is reliable and consistent. The implementation of the graph structure not only improves the visualization degree of the data storage path, but also provides important support for subsequent data access, so that the efficiency and accuracy of data retrieval are greatly improved.
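A minimal sketch of such a graph-structured index follows; replication under Paxos or Raft is omitted, and the class and method names are assumptions for illustration.

```python
from collections import defaultdict, deque

class StoragePathIndex:
    """Minimal graph-structured index: storage nodes are vertices, and
    data storage/lookup paths are directed edges, as described above."""
    def __init__(self):
        self.edges = defaultdict(set)            # node -> directly reachable nodes
        self.placement = {}                      # packet id -> storage node

    def put(self, packet_id: str, node: str):
        self.placement[packet_id] = node

    def link(self, src: str, dst: str):
        self.edges[src].add(dst)

    def path(self, start: str, packet_id: str):
        """BFS from the querying node to the node holding the packet."""
        target = self.placement[packet_id]
        queue, seen = deque([[start]]), {start}
        while queue:
            route = queue.popleft()
            if route[-1] == target:
                return route
            for nxt in self.edges[route[-1]] - seen:
                seen.add(nxt)
                queue.append(route + [nxt])
        return None                              # target unreachable

idx = StoragePathIndex()
idx.put("X", "A"); idx.put("Y", "B")
idx.link("A", "B"); idx.link("B", "C")
print(idx.path("A", "Y"))    # ['A', 'B']
```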
For the preliminary index structure, an index optimization method based on dynamic load balancing is adopted, the index structure is dynamically adjusted by combining the real-time load state and the data access frequency of the storage nodes, and the data retrieval efficiency and the storage load balancing are optimized through a self-adaptive hash mapping technology, so that an optimized index structure is generated;
In this step, the system adopts an index optimization method based on dynamic load balancing to adjust the preliminary index structure. First, the system monitors the real-time load state of each storage node and the access frequency of each piece of data, determining which nodes are heavily loaded and which data is accessed frequently. Based on this information, the system dynamically optimizes the index structure so that frequently accessed data is preferentially placed on nodes with lower load, balancing the storage load while improving data retrieval efficiency.
This dynamic adjustment of the index structure effectively spreads storage pressure and prevents overloaded nodes from dragging down overall system performance. In addition, through the adaptive hash mapping technique, data retrieval becomes faster and more efficient, especially for frequently accessed large data sets, greatly enhancing the user experience. In summary, the optimization not only improves storage performance but also ensures reasonable resource utilization, providing strong support for sustainable operation of the system.
In this step, the system implements an optimization method for dynamic load balancing on the preliminary index structure. The system first monitors the real-time load state of each storage node, including CPU utilization, memory occupancy, and current data access requests. For example, when node A carries a higher CPU load and node B is more idle, the system can recognize that node B should receive more data storage requests. Through such dynamic adjustment, the system effectively avoids performance degradation caused by overloading certain nodes.
At the same time, the system optimizes the index structure in combination with the data access frequency. For frequently accessed data, the system adjusts its storage path so that it is stored on a node with a lighter load, thereby improving access speed. For example, if a particular data file has recently been read many times, the system may migrate it from a higher-load node to a lower-load one, balancing the overall load while preserving access speed.
Finally, the index structure is refined through an adaptive hash mapping technique. This ensures fast retrieval and consistency of the data and markedly improves storage load balance. The optimized index structure allows data to be located quickly at access time, improving the user experience; the benefit is especially pronounced under highly concurrent access.
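One possible realization of this rebalancing pass is sketched below; the hot_threshold and overload values are illustrative assumptions, since the patent only specifies that load state and access frequency are combined.

```python
def rebalance(index: dict[str, str], loads: dict[str, float],
              access_freq: dict[str, int], hot_threshold: int = 100,
              overload: float = 0.8) -> dict[str, str]:
    """Dynamic load-balancing pass over the index: migrate frequently
    accessed packets off overloaded nodes onto the most idle node."""
    target = min(loads, key=loads.get)           # least-loaded node
    for packet, node in list(index.items()):
        if access_freq.get(packet, 0) >= hot_threshold and loads[node] >= overload:
            index[packet] = target               # update the storage path index
    return index

index = {"X": "A", "Y": "A", "Z": "B"}
loads = {"A": 0.91, "B": 0.40, "C": 0.22}
freq = {"X": 350, "Y": 12, "Z": 80}
print(rebalance(index, loads, freq))             # hot packet X migrates A -> C
```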
And mapping the optimized index structure into a distributed storage index table by adopting an index table generation method based on a visualization technology, ensuring the accuracy and consistency of the index table by adopting a real-time monitoring technology, and generating a final distributed storage index table.
In this step, the system maps the optimized index structure into a distributed storage index table to facilitate further data management and retrieval. Through the visualization technique, a user can view and manage the index structure graphically, making storage paths and data distribution easy to understand. The system presents the result as readable tables or graphs showing the state of each storage node, the data items it stores, their access frequency, and other information. Meanwhile, through a real-time monitoring technique, the system continuously checks the accuracy and consistency of the index table, ensuring that errors or omissions do not occur during data access.
Through the visual display of the index structure, a user can intuitively grasp the state of the storage system and quickly identify potential problems, such as an overloaded storage node or abnormal data access. In addition, real-time monitoring keeps the index table continuously updated and accurate, improving the efficiency and reliability of data retrieval. The method improves the manageability and usability of the storage system, promotes safe and efficient data management, and makes storage resources easier for users to work with.
In this step, the system converts the optimized index structure into a visualized distributed storage index table. The visualization displays the states of the storage nodes and the data storage paths graphically, so that an administrator can easily understand and monitor the overall operation of the storage system. In particular, the system may present information such as node load, stored data, and priority as clear charts or graphs through visualization tools, supporting convenient decision-making and operation.
During implementation, the system integrates a real-time monitoring technique and continuously tracks the state of every storage node. This covers not only node load but also the access frequency and storage status of the data. For example, if a node's load remains too high, the system quickly raises an alert and highlights the node in the visual interface for the administrator's attention. Such real-time monitoring lets administrators adjust the storage strategy in time, improving the stability and effectiveness of data storage.
Eventually, the generated distributed storage index table will provide a solid foundation for management of the system and efficient access of data. The accuracy and consistency of the method ensure that no delay or error exists when the data retrieval operation is executed, and the user experience is improved. By the method, a user can conveniently monitor the use condition of the storage resource and make a timely decision, so that the manageability and the high efficiency of the system are greatly enhanced.
S204, verifying the data integrity by using a blockchain-based archiving verification technology according to the distributed storage index table, wherein the archiving verification technology monitors the data storage state and the access record in real time by combining an intelligent contract and a distributed account book technology to generate an archiving verification report, and the reliability and traceability of data archiving are ensured.
In the method, based on the distributed storage index table, the system verifies the integrity of the data using a blockchain-based archive verification technique. Specifically, the system transmits the information in the distributed storage index table to the blockchain network and verifies the storage state of the data in real time through smart contracts. A smart contract is contract code that executes automatically when specific conditions are met; through smart contracts, the system can check the integrity of each data item during storage and ensure that the data is not tampered with or damaged. Meanwhile, using the blockchain's distributed ledger technology, the system records all data storage states and access records, guarantees that these records cannot be tampered with, and provides a reliable basis for subsequent audit and tracing.
Using blockchain technology for data integrity verification secures the data throughout storage and access. The method greatly enhances the traceability of the data: when a data problem occurs, the true state of the data can be traced back quickly, providing strong support for handling the problem in time. In addition, the architecture combining smart contracts with a distributed ledger keeps the verification process efficient, transparent, and automated, reduces the need for manual operation, and improves the security and management efficiency of the whole system. This mechanism is particularly applicable to industries holding large amounts of sensitive data, such as finance and medicine.
Specifically, according to the distributed storage index table, an archive verification technology based on a blockchain is adopted, the data storage state is verified by combining an intelligent contract, the data is ensured not to be damaged or tampered in the storage process through a hash verification algorithm, and a preliminary integrity verification result is generated;
In this step, the system first extracts the relevant data storage information from the distributed storage index table, generating a unique hash value for each piece of data. The hash check algorithm converts the content of the data into a hash value with a fixed length, and any data change can cause the change of the hash value, so that the consistency and the integrity of the data can be effectively detected in the way. The system will compare these hash values to the storage state using the smart contract to determine if the data was tampered with or corrupted during storage.
The implementation of this step guarantees the integrity and security of the data. Once a hash mismatch is found, the system immediately raises an alarm and records the specific abnormal event, providing clues for subsequent tracing and handling. The approach makes data storage management transparent and reliable and provides a strong guarantee for data protection, especially in business scenarios requiring high security levels, such as financial transactions and the storage of personal privacy data.
In this step, the system first extracts from the distributed storage index table the detailed information of each piece of data, including its identity, storage location, and current hash value. This process is typically performed as data is uploaded, with the system recording the hash value of each data object for later integrity checking. A hash generation algorithm such as SHA-256 computes a fixed-length digest of the data content, ensuring that even a minor modification produces a significantly different hash value. This enables the system to monitor data integrity in real time during storage: if the data is tampered with, the inconsistency can be identified immediately.
Next, the system compares the hash value of each piece of data with its stored record. This verification can be automated seamlessly through smart contracts following predefined rules. For example, a smart contract may specify that on every data access the system automatically recomputes the current hash value of the data and compares it with the original hash stored on the blockchain. If the two are consistent, the system records the verification outcome and produces a preliminary integrity verification result; if they are inconsistent, an alarm is triggered, the data is flagged as a potential risk, and further examination is required.
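The verification rule can be sketched as follows, with a plain Python dict standing in for the on-chain record; the function names are assumptions for illustration.

```python
import hashlib

ledger = {}   # packet id -> hash recorded at archive time (stand-in for the chain)

def archive(packet_id: str, content: bytes):
    ledger[packet_id] = hashlib.sha256(content).hexdigest()

def verify_on_access(packet_id: str, content: bytes) -> bool:
    """Re-hash the data on every access and compare it with the value
    recorded at archive time, mirroring the smart-contract rule above."""
    current = hashlib.sha256(content).hexdigest()
    if current != ledger[packet_id]:
        print(f"ALERT: {packet_id} flagged as potentially tampered")
        return False
    return True

archive("X", b"patient record 42")
print(verify_on_access("X", b"patient record 42"))   # True: intact
print(verify_on_access("X", b"patient record 43"))   # alert, False
```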
Through this flow, the system guarantees the integrity and security of the data from the moment it is created. This not only improves the transparency of data management but also strengthens users' trust in the security of data storage. In particular, when sensitive information (such as medical data or financial records) is processed, verification on the blockchain effectively prevents the data from being tampered with and ensures a high level of security.
For the preliminary integrity verification result, a recording method based on a distributed account book technology is adopted, the data storage state and the access record are written into a blockchain, the non-tamper property and traceability of the record are ensured through a consensus technology, and a preliminary distributed account book record is generated;
In this step, the system stores the preliminary integrity verification results, together with the associated data storage status and access records, on the blockchain. In particular, the system invokes distributed ledger technology to record each data access and its state on the chain. A consensus mechanism (such as PoW, PoS, and the like) guarantees the validity and consistency of every record, preventing data inconsistency caused by malicious tampering. In this way, the storage status and access records of all data form a complete, tamper-proof historical trail.
This recording method gives the system strong traceability: any data access or change can later be traced back and audited. This is particularly important for industries with stringent legal, compliance, and audit requirements (e.g., banking and insurance). At the same time, the decentralized nature of the blockchain makes data storage and management more transparent and increases users' trust in the system, while also supporting efficient data management and security compliance.
In this step, the system integrates the information related to the preliminary integrity verification results into the blockchain to record the data storage status and access behavior. Specifically, using distributed ledger technology, the system writes each data verification and storage state, together with its corresponding hash value, timestamp, and visitor information, to the blockchain. Through such logging, all access and verification information is retained on the blockchain in a transparent, tamper-proof form, so that future audit and traceability work can rely on solid evidence.
The validity and integrity of each record is agreed upon among the network nodes through a consensus mechanism (e.g., Proof of Work or Proof of Stake). When a data access request occurs, the smart contract automatically triggers the recording mechanism and adds the newly generated state data to the pool of transactions awaiting validation. Only after verification and confirmation by multiple nodes are these records appended to the blockchain. The system thus guarantees the authenticity and credibility of the data records: any administrator or other authorized personnel can inspect them at any time and verify the data access and storage states of any period, ensuring the transparency of the whole system.
For example, consider a financial transaction system processing user transaction data: as each transaction is stored, the system records its specific contents (e.g., amount, sender and receiver accounts), computes the transaction's hash value with a hash algorithm, and then writes this information into the blockchain. In this way, the record of the transaction is accurate and cannot be tampered with, providing a complete chain of evidence for subsequent audits so that any dispute can be verified quickly.
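A minimal hash-chained ledger append, with consensus among nodes omitted, might look like the following sketch; the block fields are illustrative assumptions.

```python
import hashlib, json, time

chain = [{"index": 0, "prev": "0" * 64, "payload": "genesis"}]

def block_hash(block: dict) -> str:
    return hashlib.sha256(json.dumps(block, sort_keys=True).encode()).hexdigest()

def append_record(payload: dict) -> dict:
    """Append an access/storage record so that each block commits to its
    predecessor's hash; rewriting history invalidates every later block.
    Multi-node consensus (PoW/PoS) is omitted from this sketch."""
    block = {
        "index": len(chain),
        "prev": block_hash(chain[-1]),
        "time": time.time(),
        "payload": payload,
    }
    chain.append(block)
    return block

append_record({"tx": "transfer", "amount": 100, "from": "acct1", "to": "acct2"})
print(block_hash(chain[-1])[:16], len(chain))
```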
For the distributed account book record, a real-time monitoring method based on intelligent contracts is adopted, the data access frequency and the storage state are combined, abnormal behaviors are detected, and a preliminary monitoring report is generated through an abnormal detection algorithm;
In the step, the system utilizes the intelligent contract to realize real-time monitoring, and combines the data access frequency and the storage state to dynamically detect abnormal behaviors. The intelligent contract can set a condition triggering mechanism, for example, if the access frequency of certain data suddenly increases or the load state of certain storage nodes is abnormal, the system can automatically execute a preset detection task to carry out deep analysis. In this way, the system can timely capture potential security threats or abnormal operations.
Implementing this process increases the security of the system, enabling it to respond quickly to possible security risks. By detecting and handling abnormal behavior in time, the risk of data leakage or tampering can be effectively reduced. In addition, the monitoring report gives the system administrator valuable information for subsequent system optimization and security management decisions. This intelligent monitoring not only materially improves data security but also provides enterprises with a comprehensive data protection scheme.
In this step, the system analyzes the records in the distributed ledger using the real-time monitoring capabilities of smart contracts to identify potentially abnormal behavior. This involves monitoring data access frequency and checking storage status in real time, for example whether the number of accesses to a particular data item within a given period exceeds a preset threshold, or whether the load on a particular storage node is abnormal. The monitoring metrics can be established dynamically from historical data, forming a comprehensive set of monitoring standards.
The smart contracts run automatically on the blockchain network, and predefined conditions trigger the corresponding monitoring actions. For example, if the access frequency of a certain storage node is observed to rise sharply within a short time, the smart contract immediately invokes the predefined anomaly detection algorithm and analyzes the node in depth to determine whether such access is normal. If suspicious behavior is detected, the system generates a preliminary monitoring report recording in detail the nature, timing, and possible impact of the anomaly.
This monitoring mechanism not only improves the security of the system but also provides a real-time basis for management decisions. For example, when a storage node is accessed many times in quick succession and the access pattern differs markedly from its history, the system can immediately feed this information back to the administrator, who can then quickly take countermeasures based on the preliminary monitoring report, such as investigating further or temporarily freezing the relevant data, preventing a potential security risk from spreading.
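One simple anomaly detection rule of the kind described is a deviation test against the historical access pattern, sketched below; the 3-sigma threshold is an illustrative assumption, since the patent only requires "an anomaly detection algorithm".

```python
from statistics import mean, stdev

def is_anomalous(history: list[int], current: int, k: float = 3.0) -> bool:
    """Flag the current per-interval access count if it deviates more than
    k standard deviations from the node's historical access pattern."""
    if len(history) < 2:
        return False                             # not enough history to judge
    mu, sigma = mean(history), stdev(history)
    return sigma > 0 and abs(current - mu) > k * sigma

hourly_accesses = [40, 38, 45, 41, 39, 44]       # typical pattern for a node
print(is_anomalous(hourly_accesses, 43))         # False: within normal range
print(is_anomalous(hourly_accesses, 400))        # True: triggers a report
```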
And integrating the integrity verification result, the distributed account book record and the monitoring report into an archiving verification report by adopting a report generation method based on a natural language generation technology for the preliminary monitoring report, and generating a final archiving verification report by adopting a visualization technology.
In this step, the system integrates the preliminary monitoring report, the integrity verification result, and the distributed ledger records to generate a comprehensive archive verification report. Report generation relies on Natural Language Generation (NLG) technology: the system automatically composes logically organized, readable report content from the available data and monitoring results. During generation, the NLG algorithm translates complex technical information into accessible language, so that non-technical staff can clearly understand the storage state and security of the data. Meanwhile, through visualization techniques, the system presents key data and analysis results in the report as charts or other visual forms, enhancing the report's readability and its effectiveness in conveying information.
The automatic report generation mechanism greatly improves the efficiency of data management and reduces the workload of manually writing reports. By integrating the information from multiple sources into a single well-defined archive verification report, a decision maker can quickly grasp the state, integrity, and security of the data store, thereby making timely responses and adjustments. The method not only improves the transparency of the report, but also promotes the information sharing in the organization, and helps the departments to better cooperate and communicate with each other. In addition, the report with the visual element is more attractive, can intuitively display data abnormality or potential risk, and provides intuitive decision support for a manager.
In this step, the system combines the preliminary monitoring report, the integrity verification result, and the record of the distributed ledger, using Natural Language Generation (NLG) techniques, to create a comprehensive archive verification report. This process involves extracting relevant information from different data sources, integrating it and converting it into readable natural language text. The NLG technology can analyze data and automatically generate structured contents, such as storage states of the description data, integrity verification results and anomalies found in monitoring, so as to provide comprehensive condition report for a management layer.
The system may also use visualization techniques to enhance the effectiveness of information delivery in generating reports. The data may be presented in the form of a chart, image or dashboard, for example using a bar chart to show the change in data access frequency over the past month, or using a pie chart to show the status distribution of different data items. The visual presentation mode not only makes report content richer and easy to understand, but also helps decision makers to quickly acquire key data.
The resulting archival verification report is output in the form of a PDF or web page and stored in a secure storage location for review by the relevant authorities. For example, in a hospital management system, when the relevant departments need to view the storage and access records of patient data, they can quickly access the report, finding the results of integrity verification of the patient data, the storage status, and any abnormal behavior found in the monitoring. The automatic report generation mode greatly improves the working efficiency and transparency, so that the data management is more scientific and efficient.
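A template-based stand-in for the NLG step might look like the following sketch; the report fields and the render_report function are assumptions for illustration, not the claimed generation model.

```python
def render_report(integrity: dict, anomalies: list[dict]) -> str:
    """Fold the verification results and monitoring findings into readable
    prose, as a minimal substitute for the NLG step described above."""
    lines = [
        "Archive Verification Report",
        f"Packets verified: {integrity['checked']}, "
        f"passed: {integrity['passed']}, failed: {integrity['failed']}.",
    ]
    if anomalies:
        lines.append(f"{len(anomalies)} anomalous event(s) detected:")
        lines += [f"  - node {a['node']} at {a['time']}: {a['detail']}"
                  for a in anomalies]
    else:
        lines.append("No anomalous access behavior was observed.")
    return "\n".join(lines)

print(render_report({"checked": 120, "passed": 119, "failed": 1},
                    [{"node": "B", "time": "2025-05-01T10:00",
                      "detail": "access rate 10x above baseline"}]))
```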
It can be seen that the data is classified by adopting a multidimensional classification model based on deep learning according to the data type, access frequency and service importance of the shutdown system to obtain a classified data set and a priority label thereof, the classified data set is input into a compression and encryption combined processing model based on a neural network to obtain a compressed and encrypted data packet, the compressed and encrypted data packet is distributed to a distributed storage system, a data storage path index is generated by adopting an index construction algorithm based on a graph database to obtain a distributed storage index table, and the data integrity is verified by utilizing a blockchain-based archiving verification technology according to the distributed storage index table to generate an archiving verification report, so that the intelligent classification, effective compression and safe storage of the data in the shutdown system can be realized, and the integrity and traceability of the data are finally ensured.
Yet another embodiment of the present invention provides a data archiving and processing system based on a shutdown system, see fig. 3, which may include:
The classification module 301 is configured to classify data according to a data type, an access frequency and a service importance of the shutdown system by adopting a multidimensional classification model based on deep learning, wherein the multidimensional classification model dynamically divides an archiving priority of the data through an attention technology and an adaptive weight distribution algorithm to obtain a classified data set and a priority label thereof;
the processing module 302 is configured to input the classified data set into a neural network-based compression and encryption combined processing model for data compression and encryption, where the combined processing model combines a quantization compression algorithm and a homomorphic encryption technology to ensure data security and realize efficient compression, so as to obtain a compressed and encrypted data packet;
The indexing module 303 is configured to distribute the compressed and encrypted data packet to a distributed storage system, and generate a data storage path index by adopting an index construction algorithm based on a graph database, where the index construction algorithm optimizes data retrieval efficiency and storage load balance through dynamic hash mapping and a distributed consistency protocol, so as to obtain a distributed storage index table;
And the archiving module 304 is configured to verify the integrity of the data by using a blockchain-based archiving verification technique according to the distributed storage index table, where the archiving verification technique monitors the data storage status and the access record in real time by combining an intelligent contract and a distributed account book technology, and generates an archiving verification report, so as to ensure the reliability and traceability of data archiving.
It can be seen that the data is classified by adopting a multidimensional classification model based on deep learning according to the data type, access frequency and service importance of the shutdown system to obtain a classified data set and a priority label thereof, the classified data set is input into a compression and encryption combined processing model based on a neural network to obtain a compressed and encrypted data packet, the compressed and encrypted data packet is distributed to a distributed storage system, a data storage path index is generated by adopting an index construction algorithm based on a graph database to obtain a distributed storage index table, and the data integrity is verified by utilizing a blockchain-based archiving verification technology according to the distributed storage index table to generate an archiving verification report, so that the intelligent classification, effective compression and safe storage of the data in the shutdown system can be realized, and the integrity and traceability of the data are finally ensured.
The embodiment of the invention also provides a storage medium, in which a computer program is stored, wherein the computer program is configured to perform the steps of any of the method embodiments described above when run.
Specifically, in the present embodiment, the above-described storage medium may be configured to store a computer program for executing the steps of:
S201, classifying data by adopting a multidimensional classification model based on deep learning according to the data type, access frequency and service importance of a shutdown system, wherein the multidimensional classification model dynamically divides the archiving priority of the data through an attention technology and a self-adaptive weight distribution algorithm to obtain a classified data set and a priority label thereof;
S202, inputting the classified data set into a neural network-based compression and encryption combined processing model for data compression and encryption, wherein the combined processing model ensures the data security and simultaneously realizes high-efficiency compression by combining a quantization compression algorithm and a homomorphic encryption technology to obtain a compressed and encrypted data packet;
S203, distributing the compressed and encrypted data packet to a distributed storage system, and generating a data storage path index by adopting an index construction algorithm based on a graph database, wherein the index construction algorithm optimizes data retrieval efficiency and storage load balance through dynamic hash mapping and a distributed consistency protocol to obtain a distributed storage index table;
s204, verifying the data integrity by using a blockchain-based archiving verification technology according to the distributed storage index table, wherein the archiving verification technology monitors the data storage state and the access record in real time by combining an intelligent contract and a distributed account book technology to generate an archiving verification report, and the reliability and traceability of data archiving are ensured.
It can be seen that the data is classified by adopting a multidimensional classification model based on deep learning according to the data type, access frequency and service importance of the shutdown system to obtain a classified data set and a priority label thereof, the classified data set is input into a compression and encryption combined processing model based on a neural network to obtain a compressed and encrypted data packet, the compressed and encrypted data packet is distributed to a distributed storage system, a data storage path index is generated by adopting an index construction algorithm based on a graph database to obtain a distributed storage index table, and the data integrity is verified by utilizing a blockchain-based archiving verification technology according to the distributed storage index table to generate an archiving verification report, so that the intelligent classification, effective compression and safe storage of the data in the shutdown system can be realized, and the integrity and traceability of the data are finally ensured.
The present invention also provides an electronic device comprising a memory having a computer program stored therein and a processor arranged to run the computer program to perform the steps of any of the method embodiments described above.
Specifically, the electronic apparatus may further include a transmission device and an input/output device, where the transmission device is connected to the processor, and the input/output device is connected to the processor.
Specifically, in the present embodiment, the above-described processor may be configured to execute the following steps by a computer program:
S201, classifying data by adopting a multidimensional classification model based on deep learning according to the data type, access frequency and service importance of a shutdown system, wherein the multidimensional classification model dynamically divides the archiving priority of the data through an attention technology and a self-adaptive weight distribution algorithm to obtain a classified data set and a priority label thereof;
S202, inputting the classified data set into a neural network-based compression and encryption combined processing model for data compression and encryption, wherein the combined processing model ensures the data security and simultaneously realizes high-efficiency compression by combining a quantization compression algorithm and a homomorphic encryption technology to obtain a compressed and encrypted data packet;
S203, distributing the compressed and encrypted data packet to a distributed storage system, and generating a data storage path index by adopting an index construction algorithm based on a graph database, wherein the index construction algorithm optimizes data retrieval efficiency and storage load balance through dynamic hash mapping and a distributed consistency protocol to obtain a distributed storage index table;
s204, verifying the data integrity by using a blockchain-based archiving verification technology according to the distributed storage index table, wherein the archiving verification technology monitors the data storage state and the access record in real time by combining an intelligent contract and a distributed account book technology to generate an archiving verification report, and the reliability and traceability of data archiving are ensured.
It can be seen that the data is classified by adopting a multidimensional classification model based on deep learning according to the data type, access frequency and service importance of the shutdown system to obtain a classified data set and a priority label thereof, the classified data set is input into a compression and encryption combined processing model based on a neural network to obtain a compressed and encrypted data packet, the compressed and encrypted data packet is distributed to a distributed storage system, a data storage path index is generated by adopting an index construction algorithm based on a graph database to obtain a distributed storage index table, and the data integrity is verified by utilizing a blockchain-based archiving verification technology according to the distributed storage index table to generate an archiving verification report, so that the intelligent classification, effective compression and safe storage of the data in the shutdown system can be realized, and the integrity and traceability of the data are finally ensured.
The construction, features, and effects of the present invention have been described in detail with reference to the embodiments shown in the drawings. The above description, however, is only a preferred embodiment of the present invention, and the invention is not limited to the embodiments shown in the drawings; all changes or modifications to the teachings of the invention that fall within the meaning and range of equivalents are intended to be embraced therein.

Claims (10)

1. A method for data archiving and processing based on a shutdown system, the method comprising:
Classifying data by adopting a multidimensional classification model based on deep learning according to the data type, access frequency and service importance of a shutdown system, wherein the multidimensional classification model dynamically divides the archiving priority of the data through an attention technology and a self-adaptive weight distribution algorithm to obtain a classified data set and a priority label thereof;
Inputting the classified data set into a neural network-based compression and encryption combined processing model for data compression and encryption, wherein the combined processing model ensures the data security and simultaneously realizes high-efficiency compression by combining a quantization compression algorithm and homomorphic encryption technology to obtain a compressed and encrypted data packet;
Distributing the compressed and encrypted data packet to a distributed storage system, and generating a data storage path index by adopting an index construction algorithm based on a graph database, wherein the index construction algorithm optimizes data retrieval efficiency and storage load balance through dynamic hash mapping and a distributed consistency protocol to obtain a distributed storage index table;
and verifying the data integrity by using a blockchain-based archiving verification technology according to the distributed storage index table, wherein the archiving verification technology monitors the data storage state and the access record in real time by combining an intelligent contract and a distributed account book technology to generate an archiving verification report, so that the reliability and traceability of data archiving are ensured.
2. The method according to claim 1, wherein the classifying the data according to the data type, the access frequency and the service importance of the shutdown system by using a multidimensional classification model based on deep learning, wherein the multidimensional classification model dynamically classifies the archiving priority of the data by using an attention technology and an adaptive weight distribution algorithm to obtain a classified data set and a priority label thereof, and the method comprises:
according to the data type in the shut-down system, acquiring multi-source data in real time by adopting a data acquisition frame based on edge calculation, and carrying out noise filtering and missing value filling on the data by a self-adaptive data cleaning algorithm to generate a preliminary standardized data set;
For the preliminary standardized dataset, a multidimensional classification model based on deep learning is adopted, multidimensional features of data are extracted by combining data types, access frequencies and business importance, and association relations among features of different dimensions are captured through a multi-head attention technology to generate preliminary feature representation;
for the preliminary feature representation, a priority classification method based on a self-adaptive weight distribution algorithm is adopted, the data access frequency and the service importance are combined, the archiving priority of the data is dynamically distributed, and the accuracy and the rationality of weight distribution are optimized through an attention technology, so that a preliminary priority label is generated;
And for the preliminary priority labels, a data classification method based on a clustering algorithm is adopted, the data are classified into different categories by combining the types of the data and the priority labels, the classification accuracy and consistency are ensured by a dynamic threshold adjustment technology, and a classified data set and the priority labels thereof are generated.
3. The method according to claim 2, wherein the inputting the classified data set into a neural network-based compression and encryption combined processing model for data compression and encryption, wherein the combined processing model implements efficient compression while ensuring data security by combining a quantization compression algorithm and a homomorphic encryption technology, and obtains a compressed and encrypted data packet, and the method comprises:
The classified data set is converted into low-precision representation by adopting a quantization compression algorithm based on a neural network, and the quantization precision is dynamically adjusted by adopting a self-adaptive quantization threshold adjustment technology, so that preliminary compressed data is generated on the premise of ensuring the minimum data information loss;
the encryption method based on homomorphic encryption technology is adopted for the primary compressed data, the data is encrypted by combining the priority label and the security requirement of the data, and the high efficiency and the security of the encryption process are ensured by the lightweight key management technology, so that the primary encrypted data is generated;
for the preliminary encrypted data, adopting a neural network-based combined optimization method, combining a quantization compression algorithm and a homomorphic encryption technology, dynamically adjusting compression and encryption parameters, and generating a preliminary compressed encrypted data packet by balancing compression efficiency and encryption security through a multi-objective optimization algorithm;
And for the preliminary compressed and encrypted data packet, adopting a data integrity verification method based on hash verification to ensure that the data is not damaged in the compression and encryption processes, and dynamically adjusting compression and encryption parameters through a feedback correction technology to generate a final compressed and encrypted data packet.
4. The method of claim 3, wherein distributing the compressed and encrypted data packets to a distributed storage system generates a data storage path index using an index building algorithm based on a graph database, wherein the index building algorithm optimizes data retrieval efficiency and storage load balancing through dynamic hash mapping and a distributed consistency protocol, and obtains a distributed storage index table, and the method comprises:
For the compressed and encrypted data packet, a distribution method based on a distributed storage system is adopted, a data storage path is planned by combining a priority label of data and a load state of a storage node, and the balance of data distribution is ensured through a dynamic hash mapping algorithm, so that a preliminary storage path plan is generated;
For preliminary storage path planning, an index construction method based on a graph database is adopted, a data storage path is abstracted into nodes and edges in a graph structure, and consistency and reliability of indexes are ensured through a distributed consistency protocol, so that a preliminary index structure is generated;
For the preliminary index structure, an index optimization method based on dynamic load balancing is adopted, the index structure is dynamically adjusted by combining the real-time load state and the data access frequency of the storage nodes, and the data retrieval efficiency and the storage load balancing are optimized through a self-adaptive hash mapping technology, so that an optimized index structure is generated;
And mapping the optimized index structure into a distributed storage index table by adopting an index table generation method based on a visualization technology, ensuring the accuracy and consistency of the index table by adopting a real-time monitoring technology, and generating a final distributed storage index table.
5. The method of claim 4, wherein verifying the integrity of the data according to the distributed storage index table using a blockchain-based archive verification technique, wherein the archive verification technique monitors the data storage status and access records in real time by combining smart contract and distributed ledger techniques, generates an archive verification report, ensures reliability and traceability of the data archive, comprising:
According to the distributed storage index table, an archive verification technology based on a blockchain is adopted, the data storage state is verified by combining an intelligent contract, the data is ensured not to be damaged or tampered in the storage process through a hash verification algorithm, and a preliminary integrity verification result is generated;
For the preliminary integrity verification result, a recording method based on a distributed account book technology is adopted, the data storage state and the access record are written into a blockchain, the non-tamper property and traceability of the record are ensured through a consensus technology, and a preliminary distributed account book record is generated;
For the distributed account book record, a real-time monitoring method based on intelligent contracts is adopted, the data access frequency and the storage state are combined, abnormal behaviors are detected, and a preliminary monitoring report is generated through an abnormal detection algorithm;
and integrating the integrity verification result, the distributed account book record and the monitoring report into an archiving verification report by adopting a report generation method based on a natural language generation technology for the preliminary monitoring report, and generating a final archiving verification report by adopting a visualization technology.
6. A data archiving and processing system based on a shutdown system, the system comprising:
The classifying module is used for classifying the data by adopting a multidimensional classifying model based on deep learning according to the data type, the access frequency and the service importance of the shutdown system, wherein the multidimensional classifying model dynamically divides the archiving priority of the data through an attention technology and a self-adaptive weight distribution algorithm to obtain a classified data set and a priority label thereof;
the processing module is used for inputting the classified data set into a compression and encryption combined processing model based on a neural network to compress and encrypt the data, wherein the combined processing model ensures the data security and simultaneously realizes high-efficiency compression by combining a quantization compression algorithm and a homomorphic encryption technology to obtain a compressed and encrypted data packet;
The index module is used for distributing the compressed and encrypted data packet to a distributed storage system, and generating a data storage path index by adopting an index construction algorithm based on a graph database, wherein the index construction algorithm optimizes data retrieval efficiency and storage load balance through dynamic hash mapping and a distributed consistency protocol to obtain a distributed storage index table;
And the archiving module is used for verifying the data integrity by utilizing a blockchain-based archiving verification technology according to the distributed storage index table, wherein the archiving verification technology monitors the data storage state and the access record in real time by combining an intelligent contract and a distributed account book technology to generate an archiving verification report, so that the reliability and traceability of data archiving are ensured.
7. The system according to claim 6, wherein the classification module is specifically configured to:
acquire multi-source data in real time according to the data types in the shutdown system by adopting an edge-computing-based data acquisition framework, and perform noise filtering and missing-value filling on the data through an adaptive data cleaning algorithm, to generate a preliminary standardized data set;
for the preliminary standardized data set, adopt a deep-learning-based multidimensional classification model, extract multidimensional features of the data by combining data type, access frequency and service importance, and capture the association relations among features of different dimensions through a multi-head attention mechanism, to generate a preliminary feature representation;
for the preliminary feature representation, adopt a priority classification method based on an adaptive weight allocation algorithm, dynamically assign the archiving priority of the data by combining data access frequency and service importance, and optimize the accuracy and rationality of the weight allocation through an attention mechanism, to generate preliminary priority labels (see the adaptive-weight scoring sketch after this claim); and
for the preliminary priority labels, adopt a clustering-based data classification method, classify the data into different categories by combining the data types and the priority labels, ensure classification accuracy and consistency through a dynamic threshold adjustment technique, and generate the classified data set and its priority labels.
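Purely as an illustrative stand-in for the claimed adaptive weight allocation (the actual model is attention-based and learned), the Python sketch below blends normalized access frequency and service importance using softmax-derived weights and buckets the score into priority tiers; the logits and thresholds are invented numbers.

```python
import math

def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def priority_label(access_freq: float, importance: float,
                   logits=(0.6, 1.0)) -> str:
    """Blend two normalized signals (both in [0, 1]) with softmax-derived
    weights, then bucket the score into archiving-priority tiers."""
    w_freq, w_imp = softmax(list(logits))
    score = w_freq * access_freq + w_imp * importance
    if score >= 0.66:
        return "high"
    if score >= 0.33:
        return "medium"
    return "low"

print(priority_label(access_freq=0.9, importance=0.8))  # -> 'high'
```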
8. The system according to claim 7, wherein the processing module is specifically configured to:
convert the classified data set into a low-precision representation by adopting a neural-network-based quantization compression algorithm, and dynamically adjust the quantization precision through an adaptive quantization threshold adjustment technique, so as to generate preliminary compressed data while minimizing the loss of data information (see the quantization sketch after this claim);
for the preliminary compressed data, adopt an encryption method based on homomorphic encryption technology, encrypt the data in combination with the priority labels and the security requirements of the data, and ensure the efficiency and security of the encryption process through a lightweight key management technique, to generate preliminary encrypted data;
for the preliminary encrypted data, adopt a neural-network-based joint optimization method, combine the quantization compression algorithm and the homomorphic encryption technology to dynamically adjust the compression and encryption parameters, and balance compression efficiency against encryption security through a multi-objective optimization algorithm, to generate a preliminary compressed and encrypted data packet; and
for the preliminary compressed and encrypted data packet, adopt a hash-verification-based data integrity verification method to ensure that the data are not damaged during compression and encryption, and dynamically adjust the compression and encryption parameters through a feedback correction technique, to generate the final compressed and encrypted data packet.
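As a minimal sketch of the quantization step only (assuming NumPy; symmetric per-tensor int8 quantization stands in for the claimed adaptive quantization threshold adjustment), the scale below adapts to the observed dynamic range of the data, which bounds the per-element information loss:

```python
import numpy as np

def quantize_int8(x: np.ndarray):
    """Symmetric per-tensor int8 quantization; the scale adapts to the
    observed dynamic range so information loss stays small."""
    scale = float(np.max(np.abs(x))) / 127.0 or 1.0  # avoid scale == 0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

x = np.random.randn(1000).astype(np.float32)
q, scale = quantize_int8(x)
x_hat = dequantize_int8(q, scale)
print("max abs error:", float(np.max(np.abs(x - x_hat))))  # about scale / 2
```

A full pipeline would feed the quantized bytes into the encryption stage; homomorphic encryption of the packet is deliberately out of scope in this sketch.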
9. A storage medium having a computer program stored therein, wherein the computer program is arranged to perform the method of any of claims 1-5 when run.
10. An electronic device comprising a memory and a processor, wherein the memory has stored therein a computer program, the processor being arranged to run the computer program to perform the method of any of claims 1-5.
CN202510584865.5A 2025-05-08 2025-05-08 A data archiving processing method and system based on shutdown system Active CN120104569B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202510584865.5A CN120104569B (en) 2025-05-08 2025-05-08 A data archiving processing method and system based on shutdown system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202510584865.5A CN120104569B (en) 2025-05-08 2025-05-08 A data archiving processing method and system based on shutdown system

Publications (2)

Publication Number Publication Date
CN120104569A (en) 2025-06-06
CN120104569B (en) 2025-07-18

Family

ID=95876379

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202510584865.5A Active CN120104569B (en) 2025-05-08 2025-05-08 A data archiving processing method and system based on shutdown system

Country Status (1)

Country Link
CN (1) CN120104569B (en)


Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060010217A1 (en) * 2004-06-04 2006-01-12 Business Instruments Corp. System and method for dynamic adaptive user-based prioritization and display of electronic messages
CN119517325A (en) * 2024-11-12 2025-02-25 贵州医科大学附属医院 A method and system for secure and rapid storage of imaging data in imaging department
CN119920488A (en) * 2025-04-01 2025-05-02 深圳市睿法生物科技有限公司 Method and system for building a tumor early screening data sharing platform based on cloud computing

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
BAO, Min: "Research and Implementation of a Massive Data Encrypted Storage and Retrieval System", China Master's Theses Full-text Database, Information Science and Technology Series, no. 2, 15 December 2013 (2013-12-15), pages 138-72 *

Also Published As

Publication number Publication date
CN120104569B (en) 2025-07-18

Similar Documents

Publication Publication Date Title
US11902316B2 (en) Real-time cybersecurity status system with event ticker
CN114818011B (en) A federated learning method, system and electronic device suitable for carbon credit evaluation
US20210112101A1 (en) Data set and algorithm validation, bias characterization, and valuation
CN108228830A (en) A kind of data processing system
CN118838783B (en) A data management method and system for an Internet of Things API docking platform
US20250086660A1 (en) Greenhouse gas emissions management system
CN120013262B (en) Data circulation method and equipment of integrated platform and financial tax integrated platform
CN118967202A (en) A multi-source data processing method and system for supplier evaluation
CN118885538B (en) A data synchronization and update optimization method and system for universal cards
CN115189963A (en) Abnormal behavior detection method and device, computer equipment and readable storage medium
CN111092857A (en) Information security early warning method and device, computer equipment and storage medium
CN114880684A (en) A computer security processing system and method based on blockchain and big data
CN120104569B (en) A data archiving processing method and system based on shutdown system
Liu Development and Evaluation of a Blockchain-Based Financial Audit Tracking System for Mobile Platforms.
CN118012945A (en) Heterogeneous blockchain data collaborative access and storage method and system
CN117971474B (en) Data center talent training system with self-adaptive energy efficiency and dynamic resource configuration
Malik et al. AI-Enabled Compliance: Strengthening SOC Audits and Smart City Security with Blockchain
CN120455159B (en) Method and system for encrypting sensitive data based on large model
CN119203245B (en) Response method, device, terminal equipment and storage medium of RPA service request
CN114139189B (en) Data security processing method and device based on mutual simulation equivalence
NN et al. Blockchain based Secure Log Storage
US20250294048A1 (en) Systems and methods of facilitating security technology rationalization
CN120653636A (en) Data quality control method and device, computer equipment and storage medium
CN120768586A (en) Enterprise data encryption method and system based on artificial intelligence
CN120257330A (en) A data processing method, system and storage medium based on AI technology

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant