CN114356838B

CN114356838B - Shared memory point-to-point blocking communication modeling method and system based on MPI model

Info

Publication number: CN114356838B
Application number: CN202210013518.3A
Authority: CN
Inventors: 陈衡; 陈家诚; 侯畅; 任柏汀; 张兴军; 蔡玮林; 王子衡
Original assignee: Xian Jiaotong University
Current assignee: Xian Jiaotong University
Priority date: 2022-01-06
Filing date: 2022-01-06
Publication date: 2024-08-13
Anticipated expiration: 2042-01-06
Also published as: CN114356838A

Abstract

The invention discloses a shared memory point-to-point blocking communication modeling method and system based on an MPI model. The method first determines the communication channel on the shared memory and the communication protocol used, then models fully asynchronous point-to-point communication, then models point-to-point communication in the synchronous start case, and finally represents the idle time in the communication overhead. Compared with blocking point-to-point communication modeling based on an MPI hardware communication performance model, and with conventional point-to-point communication modeling based on an MPI software communication performance model, the method models more channels, better captures middleware-related characteristics, models point-to-point communication on shared memory well, and remarkably improves precision.

Description

Shared memory point-to-point blocking communication modeling method and system based on MPI model

Technical Field

The invention belongs to the technical field of high-performance computing and network models, and particularly relates to a shared memory point-to-point blocking communication modeling method and system based on an MPI model.

Background

High performance computing is a discipline that explores how to design computers and fully exploit their performance. Most high performance applications rely on multi-core technology for acceleration. Multi-core technology typically uses inter-process communication to coordinate work between cores, and the Message Passing Interface (MPI) is the de facto industry standard for inter-process communication.

Inter-process communication accounts for a relatively large share of the running time of most high performance applications, and that share tends to grow as the number of processes increases. To improve the overall running efficiency and scalability of high-performance applications, inter-process communication must be modeled mathematically, so that communication problems can be analyzed more accurately and corresponding solutions can be provided. An MPI communication performance model analyzes communication cost using system parameters, and captures and studies characteristics such as the RDMA transmission mechanism, operating system bypass, middleware overhead, and network topology. It is mainly applied in two ways: improving the performance of MPI collective operations, and optimizing communication scheduling. There are also potential applications, such as solving load balancing problems in combination with performance models.

MPI communication performance models fall into two types according to whether hardware-related parameters are used: hardware models and software models. A hardware model represents communication overhead using hardware parameters such as bandwidth and delay, and is a network model in the conventional sense. However, a hardware model has difficulty handling the communication overhead caused by middleware, and it produces large prediction errors under concurrent communication because memory access contention is hard to represent. Compared with a hardware model, a software model can handle the communication overhead caused by middleware and can represent the memory access contention caused by concurrent communication well. However, a software model tends to focus on the communication overhead of the system as a whole and does not model the communication primitives. This makes it difficult to apply a software model to a system in which processes have different communication overheads, which hinders the development of communication models. Extending the MPI software communication performance model to address this difficulty is therefore a problem to be solved.

Disclosure of Invention

Aiming at the defects in the prior art, the invention provides a shared memory point-to-point blocking communication modeling method and system based on an MPI model, which model the sending and receiving behaviors of communication separately and represent the idle time in the model overhead. This solves the problem that a software model only models the communication overhead of the system as a whole and is therefore difficult to apply to a system in which processes have different communication overheads.

The invention adopts the following technical scheme:

A shared memory point-to-point blocking communication modeling method based on an MPI model comprises the following steps:

S1, determining a communication channel position and a communication protocol of communication on a shared memory according to communication information;

S2, modeling the completely asynchronous point-to-point communication according to the communication channel position and the communication protocol determined in the step S1 to obtain a completely asynchronous point-to-point communication model;

S3, modeling the point-to-point communication under the synchronous starting condition according to the communication channel position and the communication protocol determined in the step S1, and obtaining a synchronous starting point-to-point communication model;

S4, judging the idle time according to the communication protocol determined in step S1, and representing the idle time in the fully asynchronous point-to-point communication model obtained in step S2 and in the synchronous start point-to-point communication model obtained in step S3, respectively, to obtain the shared memory point-to-point blocking communication modeling representation under the various conditions and complete the blocking communication modeling.

Specifically, step S1 specifically includes:

S101, determining CPU core numbers corresponding to different NUMA nodes according to a processor architecture, determining numbers of cores where two communication processes are located, and judging a channel of point-to-point communication;

S102, if the two communicating processes are located on the same NUMA node, the communication channel is c=0; if they are located on different NUMA nodes, the communication channel is c=noc. The channel is represented in the parameters o^c(m) and L^c(m, τ), where o^c(m) is the overhead required to invoke the message transfer operation and start injecting the message into channel c, and L^c(m, τ) is the time required to concurrently transfer τ messages of length m between two buffers;

S103, judging the communication protocol according to the communication interface called by MPI and the message length, the communication protocol being either the eager protocol or the rendezvous protocol.
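Steps S101 to S103 can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the core layout `NUMA_CORES`, the threshold `EAGER_LIMIT`, the set `RENDEZVOUS_ONLY`, and all function names are hypothetical, and the real eager/rendezvous switch point depends on the MPI library and its configuration.

```python
# Hypothetical machine layout: NUMA node id -> core numbers on that node.
NUMA_CORES = {0: range(0, 20), 1: range(20, 40)}

# Hypothetical eager limit in bytes (an assumption, not a value from the text).
EAGER_LIMIT = 64 * 1024

# Hypothetical placeholder set of interfaces treated as always rendezvous;
# it is not a statement about how any particular MPI library behaves.
RENDEZVOUS_ONLY = {"MPI_Ssend"}


def numa_node_of(core):
    """Return the NUMA node that owns the given core number."""
    for node, cores in NUMA_CORES.items():
        if core in cores:
            return node
    raise ValueError(f"core {core} is not in the layout")


def channel(core_a, core_b):
    """S101/S102: channel is 0 within a NUMA node and 'noc' across nodes."""
    return 0 if numa_node_of(core_a) == numa_node_of(core_b) else "noc"


def protocol(interface, m):
    """S103: pick eager or rendezvous from the interface and message length m."""
    if interface in RENDEZVOUS_ONLY or m > EAGER_LIMIT:
        return "rendezvous"
    return "eager"
```

For example, `channel(3, 25)` places the two processes on different nodes of this assumed layout and so selects the `noc` channel, whose parameters o^c(m) and L^c(m, τ) would then be used by the later steps.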

Specifically, in step S2, the fully asynchronous case assumes that sending starts before receiving and that communication is performed under the eager protocol. For the sending behavior, the sending end only sends the data from the send buffer to the intermediate buffer to finish its communication; for the receiving behavior, the receiving end only fetches the data already in the intermediate buffer into the receive buffer.

Further, the sender and receiver overheads of fully asynchronous point-to-point communication are modeled as:

[formula not reproduced in this text]

Wherein, on channel c, the two symbols lost with the formula denote, respectively, the total overhead of the sending end (s) or the receiving end (r) and the communication call overhead at the sending end (s) or the receiving end (r); L^c(m, 1) is the time required to transfer 1 message of length m between two buffers.
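The formula itself is not reproduced in this text. One reading consistent with the listed terms is that the total overhead of an end is its communication call overhead plus the single-message transfer time L^c(m, 1). The sketch below encodes that additive form as an explicit assumption; the function names and the integer nanosecond parameter values are hypothetical and are not measurements.

```python
def async_overhead(call_overhead, transfer_time, m):
    """Assumed additive form: call overhead of one end plus L^c(m, 1)."""
    return call_overhead(m) + transfer_time(m, 1)


# Hypothetical parameter functions for one channel, in nanoseconds.
def o_s(m):
    """Assumed sender call overhead (constant here for simplicity)."""
    return 500


def o_r(m):
    """Assumed receiver call overhead in the fully asynchronous case."""
    return 250


def L(m, tau):
    """Assumed transfer time: 1 ns per byte per concurrent message."""
    return m * tau


sender_total = async_overhead(o_s, L, 1000)    # 500 + 1000
receiver_total = async_overhead(o_r, L, 1000)  # 250 + 1000
```

Passing the parameters as functions keeps the sketch independent of how o and L are actually obtained, for example from benchmark tables per channel.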

Specifically, step S3 includes:

S301, when the size of the communication message is smaller than the segmented transmission threshold, synchronous modeling mode 1 applies. The message is not segmented and no segmented concurrent transmission occurs. The sending end ends its communication once it has sent all the data to the intermediate buffer; the receiving end waits until the sending end has sent the data to the intermediate buffer, then starts communication and fetches the data from the intermediate buffer into the receive buffer;

S302, when the size of the communication message is larger than the segmented transmission threshold, synchronous modeling mode 2 applies. The sender overhead comprises the part of the message sent without segmentation and the part sent in segments. For the non-segmented part, the sending end only sends the data to the intermediate buffer, and the receiving end processes it after waiting for data to be present in the intermediate buffer.

Further, in step S301, the sender overhead is:

[formula not reproduced in this text]

and the receiver overhead is:

[formula not reproduced in this text]

Wherein, on channel c, the symbol lost with the formulas denotes the communication call overhead of the sending end (s), L^c(m, 1) is the time required to transfer 1 message of length m between two buffers, and o^c(m) is the overall communication call overhead.

Further, in step S302, the sender overhead is:

[formula not reproduced in this text]

and the receiver overhead is:

[formula not reproduced in this text]

Wherein L^0(M, 1) is the time required to transfer 1 message of length M between two buffers, M is the amount of data transmitted non-concurrently, k is the number of segments of concurrently transmitted data, L^0(S, 2) is the time required to concurrently transfer 2 messages of length S between two buffers, S is the size of one message segment, and o^0(M) is the overall communication call overhead, including communication delay.

Specifically, step S4 specifically includes:

S401, judging the order in which sending and receiving start: if receiving starts before sending, the receiving behavior has idle time and the sending behavior does not;

S402, judging the order in which sending and receiving start: if sending starts before receiving, then under the eager protocol neither sending nor receiving has idle time, and under the rendezvous protocol the sending behavior has idle time and the receiving behavior does not;

S403, judging the order in which sending and receiving start: if they start at the same time, neither communicating party has idle time;

S404, judging the length of the idle time, which is determined by the time difference between the start of sending and the start of receiving;

S405, representing the idle time in the communication overhead: if sending and receiving are not synchronized, the idle time is included in the overhead.
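The decision logic of steps S401 to S404 can be sketched as a small Python function. It follows the rule as stated in the detailed description, where under the rendezvous protocol a sender that starts first is the side that waits. The function name, argument names, and time units are hypothetical.

```python
def idle_times(t_send, t_recv, protocol):
    """Return (sender_idle, receiver_idle) for one blocking transfer.

    t_send and t_recv are the start times of the send and receive calls,
    in any consistent unit; protocol is "eager" or "rendezvous".
    """
    if protocol not in ("eager", "rendezvous"):
        raise ValueError(f"unknown protocol: {protocol}")
    if t_recv < t_send:
        # S401: the receiver started first, so only the receiver waits.
        return 0, t_send - t_recv
    if t_send < t_recv and protocol == "rendezvous":
        # S402: the sender started first under rendezvous, so it waits.
        return t_recv - t_send, 0
    # S402 under eager, or S403 (simultaneous start): nobody waits.
    return 0, 0
```

Per S405, the returned values would then be added to the overhead of the corresponding end whenever sending and receiving are not synchronized.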

The invention also provides a shared memory point-to-point blocking communication modeling system based on an MPI model, which comprises:

The communication information judging module determines the communication channel position and the communication protocol of the communication on the shared memory according to the communication information;

The asynchronous transmission modeling module models the fully asynchronous point-to-point communication according to the communication channel position and the communication protocol determined by the communication information judging module to obtain a fully asynchronous point-to-point communication model;

The synchronous transmission modeling module models the point-to-point communication under the synchronous starting condition according to the communication channel position and the communication protocol determined by the communication information judging module to obtain a synchronous starting point-to-point communication model;

The idle time judging module judges the idle time according to the communication protocol determined by the communication information judging module, and represents the idle time in the fully asynchronous point-to-point communication model obtained by the asynchronous transmission modeling module and in the synchronous start point-to-point communication model obtained by the synchronous transmission modeling module, respectively, to obtain the shared memory point-to-point blocking communication modeling representation under the various conditions and complete the blocking communication modeling.

Compared with the prior art, the invention has at least the following beneficial effects:

The shared memory point-to-point blocking communication modeling method based on an MPI model first determines the communication channel on the shared memory, then models fully asynchronous point-to-point communication, then models point-to-point communication in the synchronous start case, and finally represents the idle time in the communication overhead in combination with the communication protocol. By considering the factors that affect communication overhead as a whole, including the communication channel and the communication protocol, a more comprehensive communication modeling representation is derived.

Furthermore, the communication channel is determined from the CPU core positions and is classified as intra-NUMA-node or inter-NUMA-node; this classification distinguishes the communication channels and the communication modeling that corresponds to each.

Further, communication under fully asynchronous transmission is modeled first. This is the case in which sending starts before receiving and communication is performed under the eager protocol, and it is one case of the communication modeling.

Further, following that analysis, fully asynchronous communication at the sending end and the receiving end is modeled. The communication channel is represented in the model, and the communication overhead is expressed by the call overhead and the message transfer time. The two ends of the communication do not affect each other, so the communication overhead is minimal.

Further, synchronous modeling is performed for the communication transmission, which is the other case of the communication modeling.

Further, synchronous communication below the segmented transmission threshold is modeled. The communication channel is represented in the model, and the communication of the sending end and of the receiving end is modeled separately; the influence of concurrent transmission need not be considered, but the influence of segmented transmission must be.

Further, synchronous communication above the segmented transmission threshold is modeled. The communication channel is represented in the model, and the influences of segmented transmission and concurrent transmission are both considered, completing the separate modeling of the sending end and the receiving end.

Further, more kinds of communication overhead are covered. Because of the constraints of the communication protocol, the communication overhead typically includes idle time. Whether idle time exists, and how long it lasts, is determined from the start times of the sending end and the receiving end and from the protocol used for the communication.

In summary, the invention provides a method for modeling point-to-point blocking communication on shared memory based on an MPI model. The modeling factors include the communication channel, the communication protocol, the communication end, segmented transmission, concurrent transmission, the synchronous/asynchronous modeling scheme, and the idle time. A specific modeling scheme is given comprehensively and accurately for each case, so that optimization based on the model can be more accurate and effective.

The technical scheme of the invention is further described in detail through the drawings and the embodiments.

Drawings

FIG. 1 is a flow chart of the present invention;

FIG. 2 is a graph of the measured communication time and the communication time predicted by the model of the invention for short messages;

FIG. 3 is a graph of the measured communication time and the communication time predicted by the model of the invention for long messages.

Detailed Description

The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

In the description of the present invention, it will be understood that the terms "comprises" and "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.

It is also to be understood that the terminology used in the description of the invention is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in this specification and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.

It should be further understood that the term "and/or" as used in the present specification and the appended claims refers to any and all possible combinations of one or more of the associated listed items, and includes such combinations.

Various structural schematic diagrams according to the disclosed embodiments of the present invention are shown in the accompanying drawings. The figures are not drawn to scale, wherein certain details are exaggerated for clarity of presentation and may have been omitted. The shapes of the various regions, layers and their relative sizes, positional relationships shown in the drawings are merely exemplary, may in practice deviate due to manufacturing tolerances or technical limitations, and one skilled in the art may additionally design regions/layers having different shapes, sizes, relative positions as actually required.

The invention provides a shared memory point-to-point blocking communication modeling method based on an MPI model. The method first determines the communication channel on the shared memory, then models fully asynchronous point-to-point communication, then models point-to-point communication in the synchronous start case, and finally represents the idle time in the communication overhead in combination with the communication protocol. Compared with blocking point-to-point communication modeling based on an MPI hardware communication performance model, and with conventional point-to-point communication modeling based on an MPI software communication performance model, the method models more channels, better captures middleware-related characteristics, models point-to-point communication on shared memory well, and remarkably improves precision.

Referring to fig. 1, the method for modeling the point-to-point blocking communication of the shared memory based on the MPI model includes the following steps:

S1, determining different communication channels of a shared memory according to processor architecture information;

S101, determining the CPU core numbers corresponding to the different NUMA nodes according to the processor architecture, and determining the numbers of the cores on which the two communicating processes run, so as to judge the channel of the point-to-point communication;

S102, representing the channel in the parameters;

Specifically, if the two communicating processes are located on the same NUMA node, the communication channel is c=0; if they are located on different NUMA nodes, the communication channel is c=noc. The communication channel is represented in the parameters o^c(m) and L^c(m, τ).

Let the parameter L denote the message transfer overhead, that is, the time required to transfer a message between two buffers, written L^c(m, τ), where c denotes the channel (c=0 denotes the shared memory channel, c=noc denotes the network-on-chip channel), m denotes the message size, and τ denotes the number of concurrent processes; s denotes the sending behavior and r denotes the receiving behavior. The parameter o^c(m) represents the overhead required to invoke the message transfer operation and start injecting the message into channel c. It is influenced by the message size and by the communication end, and it differs between channels. The invention distinguishes three parameters from o^c(m) (their symbols are not reproduced in this text), corresponding respectively to the overhead of the sending end, of the receiving end in the fully asynchronous case, and of the receiving end in the synchronous start case; the receiver overhead in the synchronous start case is higher than the receiver overhead in the fully asynchronous case.

S2, modeling completely asynchronous point-to-point communication according to the communication channel position and the communication protocol determined in the step S1;

In the fully asynchronous case, the preconditions are that sending starts before receiving and that communication takes place under the eager protocol. For the sending behavior, the sending end only needs to send the data from the send buffer to the intermediate buffer to finish its communication. For the receiving behavior, the receiving end only needs to fetch the data already in the intermediate buffer into the receive buffer. The sender and receiver overheads of fully asynchronous point-to-point communication are modeled as:

[formula not reproduced in this text]

S3, modeling point-to-point communication under the synchronous starting condition according to the communication channel position and the communication protocol determined in the step S1;

S301, when the size of the communication message is smaller than the segmented transmission threshold, synchronous modeling mode 1 applies. The segmented transmission threshold has the following meaning: a message smaller than the threshold is not transmitted in segments, and a message not smaller than the threshold is transmitted in segments. Here the message is not segmented, so the receiving end can fetch the data from the intermediate buffer into the receive buffer only after the sending end has sent the data to the intermediate buffer. The sending end can end its communication once it has sent the data to the intermediate buffer, and no segmented concurrent transmission occurs during this period. The sender overhead is expressed as:

[formula not reproduced in this text]

The receiving end can start communication only after waiting for the sending end to send the data to the intermediate buffer, and the receiver overhead is expressed as:

[formula not reproduced in this text]

S302, when the size of the communication message is larger than the segmented transmission threshold, synchronous modeling mode 2 applies. The sender overhead then consists of two parts: the part of the message that needs no segmented transmission and the part that requires segmented transmission. Let M denote the size of the part transmitted without segmentation, and let k denote the number of segments in the segmented concurrent transmission. For the non-segmented part, the sending end only needs to send the data to the intermediate buffer, and the sender overhead is expressed as:

[formula not reproduced in this text]

The receiving end processes the non-segmented part and can start communication only after waiting for data to be present in the intermediate buffer; the receiver overhead is expressed as:

[formula not reproduced in this text]

S401, judging the order in which sending and receiving start: if receiving starts before sending, the receiving behavior always has idle time and the sending behavior does not;

S402, judging the order in which sending and receiving start: if sending starts before receiving, then under the eager protocol neither sending nor receiving has idle time, and under the rendezvous protocol the sending behavior has idle time and the receiving behavior does not;

S405, representing the idle time in the communication overhead: if sending and receiving are not synchronized, the idle time is included in the overhead as appropriate.

By considering a series of factors that influence communication time, such as the communication channel, the communication protocol, and the idle time, the shared memory point-to-point blocking communication model is finally obtained. The communication model can be applied in several areas, such as tuning MPI collective algorithms, providing theoretical support for automatic tuning, guiding the optimization of communication scheduling, and addressing load balancing in combination with a performance model.

In still another embodiment of the present invention, a shared memory point-to-point blocking communication modeling system based on an MPI model is provided, which can be used to implement the above shared memory point-to-point blocking communication modeling method based on an MPI model. Specifically, the system includes a communication information judging module, an asynchronous transmission modeling module, a synchronous transmission modeling module, and an idle time judging module.

Referring to FIG. 2 and FIG. 3, experiments were performed on an Intel Xeon Gold 5218R using MPICH 3.4.1. The prediction errors at the sending end and the receiving end are relatively small for both short and long messages. When the message is small, the transmission time does not increase linearly with message size because of segmented transmission, and the two are related by a piecewise function; when the message is large, the transmission time is almost linear in the message size. The model captures both characteristics, which shows that the modeling method of the invention is effective.

In summary, the shared memory point-to-point blocking communication modeling method and system based on an MPI model bring asynchronous communication into the software model, whereas earlier work considered only the overhead of synchronous communication. The concept of waiting time is introduced to describe asynchronous communication, which traditional software models do not cover. Sending and receiving overheads are distinguished, and different communication protocols are further distinguished in combination with the waiting time, so that MPI blocking primitives are represented more accurately. The channels of shared memory are distinguished and divided into intra-NUMA-node and inter-NUMA-node communication, which previous communication performance models have not captured.

It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

The above is only for illustrating the technical idea of the present invention, and the protection scope of the present invention is not limited by this, and any modification made on the basis of the technical scheme according to the technical idea of the present invention falls within the protection scope of the claims of the present invention.

Claims

1. A shared memory point-to-point blocking communication modeling method based on the MPI model, characterized by comprising the following steps:

S2, modeling fully asynchronous point-to-point communication according to the communication channel position and the communication protocol determined in step S1 to obtain a fully asynchronous point-to-point communication model, wherein in the fully asynchronous case the send starts before the receive and the communication takes place under the eager protocol; for the sending behavior, the sending end completes its communication merely by sending the data from the send buffer to the intermediate buffer; for the receiving behavior, the receiving end merely takes the data from the intermediate buffer into its own receive buffer;

S3, modeling point-to-point communication in the synchronous-start case according to the communication channel position and the communication protocol determined in step S1 to obtain a synchronous-start point-to-point communication model, wherein step S3 specifically comprises:

S302, when the size of the communication message is larger than the segmented-transmission threshold, corresponding to synchronous modeling mode 2, the sending-end overhead covers both the message that is transmitted without segmentation and the message that requires segmented transmission; for the message transmitted without segmentation, the sending end only sends the data to the intermediate buffer, and the receiving end processes that message after waiting for the data to arrive in the intermediate buffer;

2. The method for modeling the point-to-point blocking communication of the shared memory based on the MPI model according to claim 1, wherein the step S1 is specifically:

3. The method for modeling a point-to-point blocking communication of a shared memory based on an MPI model according to claim 1, wherein in step S2, the sending-end and receiving-end overheads of the fully asynchronous point-to-point communication model are:

4. The method for modeling point-to-point blocking communication of shared memory based on MPI model as claimed in claim 1, wherein in step S301, the sending-end overhead is:

and the receiving-end overhead is:

5. The method for modeling point-to-point blocking communication of shared memory based on MPI model as claimed in claim 1, wherein in step S302, the sending-end overhead is:

and the receiving-end overhead is:

6. The method for modeling the point-to-point blocking communication of the shared memory based on the MPI model according to claim 1, wherein step S4 specifically comprises:

7. A shared memory point-to-point blocking communication modeling system based on an MPI model, comprising:

The asynchronous transmission modeling module models fully asynchronous point-to-point communication according to the communication channel position and the communication protocol determined by the communication information judging module to obtain a fully asynchronous point-to-point communication model, wherein in the fully asynchronous case the send starts before the receive and the communication takes place under the eager protocol; for the sending behavior, the sending end completes its communication merely by sending the data from the send buffer to the intermediate buffer; for the receiving behavior, the receiving end merely takes the data from the intermediate buffer into its own receive buffer;

The synchronous transmission modeling module models point-to-point communication in the synchronous-start case according to the communication channel position and the communication protocol determined by the communication information judging module to obtain a synchronous-start point-to-point communication model, specifically as follows:

When the size of the communication message is smaller than the segmented-transmission threshold, corresponding to synchronous modeling mode 1, the sending end sends the data to the intermediate buffer and its communication then ends, with no segmented concurrent transmission taking place in the meantime; the receiving end starts its communication after waiting for the sending end to send the data to the intermediate buffer, and then takes the data from the intermediate buffer into the receive buffer;

When the size of the communication message is larger than the segmented-transmission threshold, corresponding to synchronous modeling mode 2, the sending-end overhead covers both the message that is transmitted without segmentation and the message that requires segmented transmission; for the message transmitted without segmentation, the sending end only sends the data to the intermediate buffer, and the receiving end processes that message after waiting for the data to arrive in the intermediate buffer;