CN116881188B

CN116881188B - Method, equipment and medium for chip-to-chip interface interconnection

Info

Publication number: CN116881188B
Application number: CN202310976442.9A
Authority: CN
Inventors: 李剑峰
Original assignee: Xinyaohui Technology Co ltd
Current assignee: Xinyaohui Technology Co ltd
Priority date: 2023-05-15
Filing date: 2023-05-15
Publication date: 2024-01-09
Anticipated expiration: 2043-05-15
Also published as: CN116303191A; CN116303191B; CN116881188A

Abstract

This application provides a method, device and medium for die-to-die interface interconnection. The approach can meet the requirements for high data transmission rates and high data transmission reliability, and can flexibly adapt to the requirements that various protocols, rules and policies place on data transmission, flow control, scheduling functions, bandwidth, data channels and the like.

Description

Method, device and medium for die-to-die interface interconnection

Technical field

The present application relates to the field of chip design technology, and in particular to a method, device and medium for die-to-die interface interconnection.

Background

With advances in semiconductor processes and the growth of chip design scale, chip area keeps increasing. Improving the performance of a system on chip (SOC) has become more difficult, and designs also face increased leakage current, more difficult heat dissipation, and slow growth of the chip clock frequency. Even breakthroughs in semiconductor process technology can hardly give an SOC optimal performance together with very low power consumption, and they come with rising manufacturing costs and declining manufacturing yield. To manufacture chips whose performance and power consumption meet requirements using existing process technology, the SOC is split into multiple dies, and packaging and interconnection technologies are used to build a chipset or multi-chip module. For example, with chiplet technology, the functional blocks originally integrated in a single-die system are split apart, manufactured separately, and then integrated and packaged into one system chipset through packaging and interconnection technologies. The data interconnection between the packaged dies must be correct, and the die-to-die interface interconnection must also provide high data bandwidth, low latency and high reliability in order to meet application needs in fields such as networking, large-scale data centers and artificial intelligence. However, die-to-die interface interconnection in the prior art struggles to meet the requirements for high data transmission rates and high data transmission reliability.

To this end, the present application provides a method, device and medium for die-to-die interface interconnection to solve the technical problems in the prior art.

Summary of the invention

In a first aspect, the present application provides a method for die-to-die interface interconnection. Each die of a plurality of dies includes a die-to-die interface used for data interconnection between that die and another die of the plurality of dies. The method is applied to a first die, the first die being any one of the plurality of dies, and the die-to-die interface of the first die includes an interface cache, a protocol processing unit and a transmission interface. The method includes: in response to data transmission by the first die, inputting first data to be sent into the interface cache; by the protocol processing unit, cutting the first data to be sent to obtain cut data, performing a cyclic redundancy check (CRC) calculation on the cut data to generate a CRC result, adding the CRC result to the cut data and then assembling and stripe-distributing it to obtain distributed data, encoding the distributed data to obtain encoded data, and scrambling and framing the encoded data to obtain second data to be sent; sending the second data to be sent through the transmission interface; in response to data reception by the first die, obtaining first data to be received through the transmission interface; by the protocol processing unit, performing frame alignment and descrambling on the first data to be received and then decoding it to obtain decoded data, aggregating the decoded data to obtain aggregated data, and performing a cyclic redundancy check and data combination on the aggregated data to obtain second data to be received; and inputting the second data to be received into the interface cache.
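The cutting and cyclic redundancy check steps of the send and receive paths can be sketched as follows. This is a minimal illustration only: the use of `zlib.crc32` as the checksum, the 4-byte CRC placement at the end of each segment, and the function names `cut`, `attach_crc`, `check_crc`, `transmit` and `receive` are all assumptions made for the example, and the striping, encoding, scrambling and framing steps are omitted.

```python
import zlib

CRC_BYTES = 4  # zlib.crc32 yields a 32-bit value (assumed stand-in for the CRC)


def cut(data: bytes, segment_size: int) -> list[bytes]:
    """Cut the data to be sent into segments of at most segment_size bytes."""
    if segment_size <= 0:
        raise ValueError("segment_size must be positive")
    return [data[i:i + segment_size] for i in range(0, len(data), segment_size)]


def attach_crc(segment: bytes) -> bytes:
    """Transmit side: append the CRC computed over the segment."""
    return segment + zlib.crc32(segment).to_bytes(CRC_BYTES, "big")


def check_crc(protected: bytes) -> bytes:
    """Receive side: verify and strip the CRC, raising on a mismatch."""
    segment, received = protected[:-CRC_BYTES], protected[-CRC_BYTES:]
    if zlib.crc32(segment).to_bytes(CRC_BYTES, "big") != received:
        raise ValueError("CRC mismatch")
    return segment


def transmit(data: bytes, segment_size: int) -> list[bytes]:
    """Cut the data and protect every segment with its own CRC."""
    return [attach_crc(segment) for segment in cut(data, segment_size)]


def receive(protected_segments: list[bytes]) -> bytes:
    """Check every segment and combine the payloads in order."""
    return b"".join(check_crc(p) for p in protected_segments)
```

A corrupted segment is detected on the receive side before the data is combined, which is the role the cyclic redundancy check plays in the method.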

The first aspect of the present application can meet the requirements for high data transmission rates and high data transmission reliability, and can flexibly adapt to the requirements that various protocols, rules and policies place on data transmission, flow control, scheduling functions, bandwidth, data channels and the like.

In a possible implementation of the first aspect of the present application, the encoding operation is based on a first encoding scheme, the decoding operation is based on a first decoding scheme, and the first encoding scheme corresponds to the first decoding scheme.

In a possible implementation of the first aspect of the present application, the first encoding scheme is 64/67 encoding, and the first decoding scheme is 64/67 decoding.
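To make the 64/67 ratio concrete, the sketch below wraps a 64-bit payload in a 67-bit word. The bit layout is an assumption modelled loosely on Interlaken-style 64B/67B framing (two sync bits distinguishing data from control words, plus one inversion bit that limits running disparity); the text above does not specify the layout, and the function names are hypothetical.

```python
MASK64 = (1 << 64) - 1
SYNC_DATA = 0b01     # assumed sync pattern for a data word
SYNC_CONTROL = 0b10  # assumed sync pattern for a control word


def word_disparity(payload: int) -> int:
    """Number of ones minus number of zeros in the 64 payload bits."""
    ones = bin(payload).count("1")
    return ones - (64 - ones)


def encode_64_67(payload: int, is_control: bool, running_disparity: int):
    """Return (67-bit word, updated running disparity)."""
    if not 0 <= payload <= MASK64:
        raise ValueError("payload must fit in 64 bits")
    disparity = word_disparity(payload)
    # Invert the payload when it would push the running disparity further
    # away from zero, and record that in bit 66.
    invert = (disparity > 0 and running_disparity > 0) or (
        disparity < 0 and running_disparity < 0
    )
    if invert:
        payload ^= MASK64
        disparity = -disparity
    sync = SYNC_CONTROL if is_control else SYNC_DATA
    word = (int(invert) << 66) | (sync << 64) | payload
    return word, running_disparity + disparity


def decode_64_67(word: int):
    """Return (64-bit payload, is_control) recovered from a 67-bit word."""
    sync = (word >> 64) & 0b11
    if sync not in (SYNC_DATA, SYNC_CONTROL):
        raise ValueError("invalid sync bits")
    payload = word & MASK64
    if (word >> 66) & 1:
        payload ^= MASK64
    return payload, sync == SYNC_CONTROL
```

Under this layout each 67-bit word carries 64 payload bits, so the coding efficiency is 64/67, or roughly 95.5%.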

In a possible implementation of the first aspect of the present application, the first data to be sent comes from a user data interface associated with the first die, and the second data to be received is sent to the user data interface.

In a possible implementation of the first aspect of the present application, rate adaptation processing is performed on the second data to be received before it is sent to the user data interface.

In a possible implementation of the first aspect of the present application, the second data to be sent is sent to the transmission interface of the die-to-die interface of a second die, the second die being the peer of the first die.

In a possible implementation of the first aspect of the present application, adding, by the protocol processing unit, the CRC result to the cut data and then assembling and stripe-distributing it to obtain the distributed data includes: adding, by the protocol processing unit, the CRC result to the cut data and then assembling and stripe-distributing it according to a burst length setting to obtain the distributed data.
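Assembly by burst length followed by striped distribution can be illustrated with the sketch below. It assumes, purely for the example, that whole bursts are the unit distributed round-robin across lanes; the text above does not fix the striping granularity, and the names `assemble`, `stripe` and `destripe` are hypothetical.

```python
def assemble(data: bytes, burst_length: int) -> list[bytes]:
    """Group the data into bursts of at most burst_length bytes."""
    if burst_length <= 0:
        raise ValueError("burst_length must be positive")
    return [data[i:i + burst_length] for i in range(0, len(data), burst_length)]


def stripe(bursts: list[bytes], lanes: int) -> list[list[bytes]]:
    """Distribute bursts round-robin: burst i goes to lane i % lanes."""
    if lanes <= 0:
        raise ValueError("lanes must be positive")
    lane_data: list[list[bytes]] = [[] for _ in range(lanes)]
    for index, burst in enumerate(bursts):
        lane_data[index % lanes].append(burst)
    return lane_data


def destripe(lane_data: list[list[bytes]]) -> list[bytes]:
    """Receive side: read the lanes round-robin to restore the burst order."""
    lanes = len(lane_data)
    total = sum(len(lane) for lane in lane_data)
    return [lane_data[i % lanes][i // lanes] for i in range(total)]
```

Because the receive side reads the lanes in the same round-robin order, the original burst sequence is recovered even when the number of bursts is not a multiple of the lane count.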

In a possible implementation of the first aspect of the present application, the transmission interface is a serializer/deserializer (SERDES) interface.

In a possible implementation of the first aspect of the present application, the protocol processing unit is an Interlaken protocol processing unit.

In a possible implementation of the first aspect of the present application, the protocol processing unit performs rate adaptation processing on the first data to be sent in the interface cache at least before cutting the first data to be sent to obtain the cut data.

In a possible implementation of the first aspect of the present application, the protocol processing unit, while performing the CRC calculation on the cut data to generate the CRC result, simultaneously generates a control field that records description information of the cut data.

In a possible implementation of the first aspect of the present application, the plurality of dies are homogeneous dies or heterogeneous dies.

In a possible implementation of the first aspect of the present application, the plurality of dies correspond to the functional blocks of one single-die system, and the plurality of dies are packaged together through their respective die-to-die interfaces to form a system chipset corresponding to the single-die system.

In a possible implementation of the first aspect of the present application, the plurality of dies are packaged together using chiplet technology.

In a possible implementation of the first aspect of the present application, the encoding operation is based on a first encoding scheme, the encoded data includes a plurality of pre-compression data items, each of which has a size of a first number of bits, and the first number is based on the first encoding scheme. Scrambling and framing the encoded data to obtain the second data to be sent includes: compressing and transcoding each of the pre-compression data items to obtain a plurality of compressed data items in one-to-one correspondence with the pre-compression data items; performing a forward error correction (FEC) calculation on the pre-compression data items to generate a redundant error correction code; adding the redundant error correction code to the compressed data items to update the encoded data; and scrambling and framing the updated encoded data to obtain the second data to be sent.

In a possible implementation of the first aspect of the present application, the first encoding scheme is 64/67 encoding, and the first number is 67.

In a possible implementation of the first aspect of the present application, the size of the encoded data before the update is the same as the size of the encoded data after the update, the compressed data items correspond to the data fields of the pre-compression data items, and the redundant error correction code generated by the FEC calculation on the pre-compression data items corresponds to the bit fields used for synchronization in the pre-compression data items.

In a possible implementation of the first aspect of the present application, the decoding operation is based on a first decoding scheme, the first encoding scheme corresponds to the first decoding scheme, and aggregating the decoded data to obtain the aggregated data includes: decompressing and reverse-transcoding the decoded data, then performing a forward error correction check, and then performing data aggregation to obtain the aggregated data.

In a possible implementation of the first aspect of the present application, compressing and transcoding each of the pre-compression data items to obtain the compressed data items in one-to-one correspondence with the pre-compression data items includes: compressing the bit fields used for synchronization in the pre-compression data items so that they can be used to transmit the redundant error correction code, and keeping the data fields of the pre-compression data items unchanged.
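The size-preserving property described in the implementations above can be checked with simple bit accounting. The sketch below is only a bookkeeping aid: the choice to compress the 3 synchronization bits of each 67-bit item down to 1 bit, and the group size of 20 items used in the usage note, are hypothetical values picked for illustration, and no actual error correction code is computed.

```python
def fec_bit_budget(
    num_items: int,
    item_bits: int = 67,            # first number for 64/67 encoding
    data_bits: int = 64,            # data field kept unchanged
    compressed_sync_bits: int = 1,  # hypothetical size of the compressed sync field
) -> dict:
    """Count the bits freed for the redundant error correction code."""
    sync_bits = item_bits - data_bits
    if not 0 <= compressed_sync_bits <= sync_bits:
        raise ValueError("compressed sync field cannot exceed the original one")
    parity_bits = (sync_bits - compressed_sync_bits) * num_items
    size_before = num_items * item_bits
    size_after = num_items * (data_bits + compressed_sync_bits) + parity_bits
    return {
        "parity_bits": parity_bits,
        "size_before": size_before,
        "size_after": size_after,
    }
```

For example, `fec_bit_budget(20)` reports 40 bits available for the redundant error correction code while the total stays at 1340 bits before and after the update, which is why the link transmission bandwidth does not need to change.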

In a possible implementation of the first aspect of the present application, the link transmission bandwidth of the transmission interface associated with the pre-compression data items is equal to the link transmission bandwidth of the transmission interface associated with the compressed data items.

In a possible implementation of the first aspect of the present application, compressing and transcoding each of the pre-compression data items to obtain the compressed data items in one-to-one correspondence with the pre-compression data items is based on the first encoding scheme, and the first encoding scheme is based on a die-to-die interface interconnection protocol associated with the protocol processing unit.

In a second aspect, an embodiment of the present application further provides a computer device. The computer device includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and the processor, when executing the computer program, implements the method according to any implementation of any of the above aspects.

In a third aspect, an embodiment of the present application further provides a computer-readable storage medium. The computer-readable storage medium stores computer instructions which, when run on a computer device, cause the computer device to execute the method according to any implementation of any of the above aspects.

In a fourth aspect, an embodiment of the present application further provides a computer program product. The computer program product includes instructions stored on a computer-readable storage medium which, when run on a computer device, cause the computer device to execute the method according to any implementation of any of the above aspects.

Description of the drawings

To describe the technical solutions of the embodiments of the present application more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show some embodiments of the present application; a person of ordinary skill in the art can derive other drawings from them without creative effort.

FIG. 1 is a schematic diagram of a system chipset obtained by integrating multiple interconnected dies according to an embodiment of the present application;

FIG. 2 is a schematic diagram of multiple dies interconnected through die-to-die interfaces according to an embodiment of the present application;

FIG. 3 is another schematic diagram of multiple dies interconnected through die-to-die interfaces according to an embodiment of the present application;

FIG. 4 is a schematic flowchart of a method for die-to-die interface interconnection according to an embodiment of the present application;

FIG. 5 is a schematic flowchart of a data sending process performed through a die-to-die interface and based on a first encoding scheme according to an embodiment of the present application;

FIG. 6 is a schematic flowchart of a data receiving process performed through a die-to-die interface and based on a first decoding scheme according to an embodiment of the present application.

Detailed description

The embodiments of the present application are described in further detail below with reference to the accompanying drawings.

It should be understood that, in the description of this application, "at least one" means one or more, and "plurality" means two or more. In addition, unless otherwise stated, words such as "first" and "second" are used only to distinguish between the items described and should not be understood as indicating or implying relative importance or order.

FIG. 1 is a schematic diagram of a system chipset obtained by integrating multiple interconnected dies according to an embodiment of the present application. As shown in FIG. 1, the system chipset 110 includes four dies: die A 102, die B 104, die C 106 and die D 108. A die, also called a bare die, is the unpackaged body of a small integrated circuit made of semiconductor material. A die can be understood as the piece of a chip before packaging: a small piece cut by laser from a silicon wafer, with each die being an independent functional chip. A die packaged as a unit becomes a chip. A die cannot be used directly, because an unpackaged die has neither pins nor a heat sink. A single-die system implements the functions of an entire system on one die and is also called a system on chip (SOC). With the rapid growth of chip design scale, system functions have become richer and more complex, and the area of a single die has grown rapidly. Implementing the functions of an entire system on a single die, that is, producing a single-die system, faces increasing challenges: performance improvement becomes harder, heat dissipation becomes more difficult, leakage current increases, and the chip clock frequency grows more slowly. The growth in die area and in chip design complexity also lowers manufacturing yield. To improve performance while controlling cost and power consumption, the functions of the entire system that were originally implemented on a single die are split and implemented by multiple dies, and packaging and interconnection technologies are used to build a chipset or multi-chip module. For example, with chiplet technology, the functional blocks originally integrated in a single-die system are split apart, manufactured separately, and then integrated and packaged into one system chipset through packaging and interconnection technologies.

Continuing with FIG. 1, the system chipset 110 can correspond to the functions of an entire system. The system functions implemented by the system chipset 110 could originally have been implemented on a single die, that is, by a single-die system; in FIG. 1 these functions are split and implemented by four dies, namely die A 102, die B 104, die C 106 and die D 108. Die A 102, die B 104, die C 106 and die D 108 are connected to one another through die-to-die interface interconnections to achieve data interconnection. As FIG. 1 shows, each of the four dies has a separate die-to-die interface interconnection with each of the other three dies (each bidirectional arrow in FIG. 1 represents one die-to-die interface interconnection). The interconnection interface between dies carries the data interconnection between dies, which corresponds to the die-to-die interface interconnection. It must provide high data bandwidth, low latency and high reliability before it can be used to integrate and package multiple dies and thereby serve chip applications in fields such as networking, hyperscale data centers and artificial intelligence. Here, die-to-die interface interconnection means interconnecting one die with another die so that they can be packaged together, with each die including at least one module that has a physical interface. A die with a common interface can communicate with another die over short wires. Die-to-die interface interconnection covers two cases: homogeneous dies and heterogeneous dies. The homogeneous case mainly concerns die splitting: two or more identical or homogeneous dies are data-interconnected so that the multiple smaller dies behave like a single die. For example, the four dies in FIG. 1, namely die A 102, die B 104, die C 106 and die D 108, may all be central processing units (CPUs). The heterogeneous case mainly concerns package integration: different functions are integrated into one packaged chipset. For example, die A 102 and die B 104 in FIG. 1 may be CPUs, while die C 106 and die D 108 may be graphics processing units (GPUs), neural-network processing units (NPUs), tensor processing units (TPUs) or data processing units (DPUs). Different semiconductor manufacturing processes and different integration and packaging methods may therefore affect the details of the die-to-die interface interconnection. For example, a die-to-die interface interconnection may be used for data interconnection between one CPU die and another CPU die, or between a CPU die and an NPU die. In addition, die-to-die interface interconnection generally adopts specific protocols, rules and policies that govern the details of high-speed data transmission between dies. For example, optimizations or requirements may be specified for flow control, scheduling functions, bandwidth, data channels and related aspects, to better serve scenarios such as networking, hyperscale data centers and artificial intelligence. For example, a die-to-die interface interconnection may use a protocol that supports multi-channel transmission, and its data transmission bandwidth may need to reach 10 gigabits per second (Gbps) to 300 Gbps or higher. A die-to-die interface interconnection may also need to support a serializer/deserializer (SERDES). SERDES means that, at the transmitting end, multiple low-speed parallel signals are converted into a high-speed serial signal and transmitted differentially, and at the receiving end the high-speed serial signal is converted back into low-speed parallel signals. To support SERDES, the receiving end needs an integrated clock data recovery (CDR) circuit, which recovers the clock signal from data edge information and samples with it to recover the data signal. Because multiple dies are packaged and integrated together, the die-to-die interface interconnection is also integrated inside the chipset; for example, the data interconnections among die A 102, die B 104, die C 106 and die D 108 shown in FIG. 1 are integrated inside the system chipset 110. This means that when a problem occurs in the data interconnection between dies, such as a data transmission error or data loss, the problem is hard to locate and hard to correct. At the same time, ever higher requirements on the data transmission rate between dies make it more challenging to ensure the reliability and correctness of data transmission between dies. To this end, the embodiments of the present application provide a method, device and medium for die-to-die interface interconnection, so that die-to-die interface interconnection can meet the requirements for high data transmission rates and high data transmission reliability.
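The parallel-to-serial and serial-to-parallel conversion that SERDES performs can be modelled in software as below. This is a behavioural sketch only: the most-significant-bit-first ordering and the function names are assumptions for the example, and differential signalling and clock data recovery are not modelled.

```python
def serialize(words: list[int], width: int) -> list[int]:
    """Transmit side: flatten parallel words into a serial bit stream (MSB first)."""
    bits: list[int] = []
    for word in words:
        if not 0 <= word < (1 << width):
            raise ValueError("word does not fit in the given width")
        bits.extend((word >> shift) & 1 for shift in range(width - 1, -1, -1))
    return bits


def deserialize(bits: list[int], width: int) -> list[int]:
    """Receive side: regroup the serial bit stream into parallel words."""
    if len(bits) % width:
        raise ValueError("bit stream length is not a multiple of the word width")
    words: list[int] = []
    for start in range(0, len(bits), width):
        word = 0
        for bit in bits[start:start + width]:
            word = (word << 1) | bit
        words.append(word)
    return words
```

In hardware the receive side must first recover the bit clock and the word boundaries; the sketch assumes both are already known.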

FIG. 2 is a schematic diagram of multiple dies interconnected through die-to-die interfaces according to an embodiment of the present application. As shown in FIG. 2, die E 210 and die F 220 achieve data interconnection through their respective die-to-die interfaces. Specifically, die E 210 includes a die-to-die interface E 212, a peripheral component interconnect express (PCIE) interface E 214, a high-bandwidth memory interface E 216 and an Ethernet interface E 218. Die F 220 includes a die-to-die interface F 222, a PCIE interface F 224, a high-bandwidth memory interface F 226 and an Ethernet interface F 228. The die-to-die interface E 212 of die E 210 is physically connected to the die-to-die interface F 222 of die F 220, thereby achieving data interconnection between die E 210 and die F 220. The PCIE interface E 214 of die E 210 and the PCIE interface F 224 of die F 220 both support the PCIE standard, for example for connecting PCIE-standard devices, and can therefore be used to connect servers and high-performance computing centers. The high-bandwidth memory interface E 216 of die E 210 and the high-bandwidth memory interface F 226 of die F 220 are both used to connect high-bandwidth memory (HBM) and can therefore be used in scenarios such as network switching, network packet forwarding and graphics processing. The Ethernet interface E 218 of die E 210 and the Ethernet interface F 228 of die F 220 both provide Ethernet-related functions, such as attaching an Ethernet network card. Die E 210 and die F 220 shown in FIG. 2 each include several interfaces with different functions and can therefore be used in various scenarios; for example, the PCIE bus can be accessed through the PCIE interface E 214 of die E 210, or HBM can be accessed through the high-bandwidth memory interface F 226 of die F 220. The physical connection between the die-to-die interface E 212 of die E 210 and the die-to-die interface F 222 of die F 220 achieves data interconnection between the dies, so that die E 210 and die F 220 behave externally like a single die, improving performance while controlling cost and power consumption. The die-to-die interface interconnection between die E 210 and die F 220, that is, the data interconnection between the die-to-die interface E 212 of die E 210 and the die-to-die interface F 222 of die F 220, needs to meet the transmission requirements of specific protocols, rules and policies, and may also need to meet requirements on flow control, scheduling functions, bandwidth, data channels and the like.

FIG. 3 is a schematic diagram of another set of dies interconnected through die-to-die interfaces according to an embodiment of the present application. As shown in FIG. 3, die G 310 includes a first die-to-die interface G 312, a second die-to-die interface G 313, a high-bandwidth memory interface G 316, and an Ethernet interface G 318. Die H 320 includes a first die-to-die interface H 322, a second die-to-die interface H 323, a high-bandwidth memory interface H 326, and an Ethernet interface H 328. Die G 310 and die H 320 are physically connected through the first die-to-die interface G 312 of die G 310 and the first die-to-die interface H 322 of die H 320, which realizes data interconnection between the dies, so that die G 310 and die H 320 behave externally like a single die, improving performance while keeping cost and power consumption under control. In addition, die G 310 can connect to another die through the second die-to-die interface G 313, and die H 320 can connect to another die through the second die-to-die interface H 323. The die-to-die interface interconnection between die G 310 and die H 320 needs to meet the transmission requirements imposed by specific protocols, rules, policies, and the like, and may also need to meet requirements on flow control, scheduling functions, bandwidth, data channels, and so on.

Referring to FIG. 1, FIG. 2, and FIG. 3, die-to-die interface interconnection means interconnecting one die with another so that they can be packaged together, where each die includes at least one module with a physical interface. A die with a common interface can communicate with another die over short-distance wires. Depending on how the dies are integrated and packaged into a chipset, and on the functional partitioning and interface layout of the dies, one die may have one or more die-to-die interfaces, and each die-to-die interface may adopt specific protocols, rules, policies, and the like to meet requirements on data transmission, flow control, scheduling functions, bandwidth, data channels, and so on. These protocols, rules, and policies may in turn impose requirements on specific details of the die-to-die interface interconnection, such as the number of communication channels, the control word structure, data encoding and decoding, and flow control options. It is therefore necessary to provide a die-to-die interface interconnection method that meets the needs of a high data transmission rate and high data transmission reliability, and that can also flexibly adapt to the requirements of various protocols, rules, and policies on data transmission, flow control, scheduling functions, bandwidth, data channels, and so on. This is described in detail below with reference to FIG. 4.

FIG. 4 is a schematic flowchart of a die-to-die interface interconnection method provided by an embodiment of the present application. The method shown in FIG. 4 applies to a plurality of dies, each of which includes a die-to-die interface for data interconnection between that die and another die of the plurality of dies. The method is applied to a first die, which is any one of the plurality of dies, and the die-to-die interface of the first die includes an interface buffer, a protocol processing unit, and a transmission interface. As shown in FIG. 4, the method includes the following steps.

Step S402: In response to data transmission by the first die, input the first data to be sent into the interface buffer.

Step S404: Through the protocol processing unit, segment the first data to be sent to obtain segmented data; perform a cyclic redundancy check calculation on the segmented data to generate a cyclic redundancy calculation result; add the cyclic redundancy calculation result to the segmented data and then perform assembly and striped distribution to obtain distributed data; encode the distributed data to obtain encoded data; and scramble and frame the encoded data to obtain the second data to be sent.

Step S406: Send the second data to be sent through the transmission interface.

Step S408: In response to data reception by the first die, obtain the first data to be received through the transmission interface.

Step S410: Through the protocol processing unit, perform frame alignment and descrambling on the first data to be received and then decode it to obtain decoded data; aggregate the decoded data to obtain aggregated data; and perform a cyclic redundancy check and data combination on the aggregated data to obtain the second data to be received.

Step S412: Input the second data to be received into the interface buffer.
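Steps S404 and S410 pair scrambling on the transmit side with descrambling on the receive side. The Python sketch below illustrates why the two operations can be symmetric, using a synchronous additive scrambler in which both ends XOR the data with the same keystream. The polynomial x^58 + x^39 + 1, the 58-bit state, the seed handling, and the function names are assumptions chosen for illustration; the description does not specify the scrambler.

```python
def lfsr_keystream(state: int, nbits: int) -> tuple[int, int]:
    """Return (keystream, next_state) for nbits of a 58-bit LFSR.

    Assumed polynomial: x^58 + x^39 + 1 (illustrative, not from the source).
    """
    mask = (1 << 58) - 1
    out = 0
    for _ in range(nbits):
        # Feedback bit is the XOR of the taps for x^58 and x^39.
        bit = ((state >> 57) ^ (state >> 38)) & 1
        state = ((state << 1) | bit) & mask
        out = (out << 1) | bit
    return out, state


def scramble(words: list[int], seed: int) -> list[int]:
    """XOR each 64-bit word with keystream; running it twice restores the input."""
    state = seed
    result = []
    for word in words:
        keystream, state = lfsr_keystream(state, 64)
        result.append(word ^ keystream)
    return result
```

Because the operation is a plain XOR with a keystream that depends only on the seed, a receiver that starts from the same seed recovers the original words by calling `scramble` again.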

The die-to-die interface interconnection method shown in FIG. 4 applies to a plurality of dies, each of which includes a die-to-die interface for data interconnection between that die and another die of the plurality of dies. The method is applied to a first die, which is any one of the plurality of dies, and the die-to-die interface of the first die includes an interface buffer, a protocol processing unit, and a transmission interface. The plurality of dies corresponds to multiple functional blocks combined to form one chip, and the interfaces between the functional blocks are die-to-die interfaces. The die-to-die interface of the first die provides data interconnection between the first die and another die. Referring to the steps above, in step S402, in response to data transmission by the first die, the first data to be sent is input into the interface buffer. Here, the first die may receive data from the user side and input it into the interface buffer. In some embodiments, rate adaptation may be performed through the interface buffer, which contributes to the overall stability and reliability of data transmission. Next, in step S404, the protocol processing unit segments the first data to be sent to obtain segmented data. The die-to-die interface of the first die may adopt specific protocols, rules, policies, and the like to meet requirements on data transmission, flow control, scheduling functions, bandwidth, data channels, and so on. These protocols, rules, and policies may in turn impose requirements on specific details of the die-to-die interface interconnection, such as the number of communication channels, the control word structure, data encoding and decoding, and flow control options. The protocol processing unit mainly provides the processing functions needed to meet the requirements that the protocol, rule, or policy adopted by the die-to-die interface of the first die imposes on data transmission, flow control, scheduling functions, bandwidth, data channels, and so on.
Segmenting the first data to be sent in step S404 means that the data is segmented at the protocol layer, in view of the burst length setting and the control word assembly needed later in the flow; the length of data transferred at one time is called the burst length. Generally, data is segmented in units of a fixed number of bits (for example, 64 bits). A cyclic redundancy check calculation is then performed on the segmented data to generate a cyclic redundancy calculation result, which serves the cyclic redundancy check (CRC) later in the flow. CRC uses a hash function that produces a short, fixed-length check code from data such as a packet or file, and is used to detect errors that may arise after data is transmitted or stored. The check code can be calculated before the data is transmitted or stored and appended to the data, so that the receiver can verify whether the data has changed. Also in step S404, the cyclic redundancy calculation result is added to the segmented data, which is then assembled and stripe-distributed to obtain distributed data. Here, assembly refers to data segmentation and control word assembly. In some embodiments, on the basis of data segmentation, protocol control words are assembled together with the data words that carry the data to be transmitted, according to the burst length setting. Striping means that, in line with the requirements on data channels and channelized data transmission, the distributed data can be sent over multiple channels; for example, each channel may correspond to one SERDES physical link.
Thus, in response to data transmission by the first die, the first data to be sent is input into the interface buffer, segmented to obtain segmented data, and then, once the cyclic redundancy calculation result has been added, assembled and stripe-distributed to obtain distributed data. This completes the conversion from the first data to be sent to the distributed data and reflects the control exercised at the protocol layer, which can be used to meet requirements such as the number of communication channels and the control word structure. Still in step S404, the distributed data is encoded to obtain encoded data, and the encoded data is scrambled and framed to obtain the second data to be sent. The encoding operation involves assembling frame-layer control words, including adding a synchronization word as the metaframe synchronization header used to locate the metaframe. It should be understood that the encoding operation is generally based on the specific protocol, rule, or policy adopted by the die-to-die interface of the first die; for example, the protocol may prescribe a specific encoding format and require the original data to be rewritten into that format according to certain rules. In some embodiments, the encoding operation encodes 64-bit data into 67-bit data. After the encoded data is obtained, it is scrambled and framed to obtain the second data to be sent, which helps enhance data integrity and link stability. Finally, in step S406, the second data to be sent is sent through the transmission interface. In some embodiments, the transmission interface is connected to multiple SERDES channels, and the second data to be sent can be aligned and then sent out over those SERDES channels.
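The protocol-layer part of the transmit path (segmentation, CRC calculation, and striped distribution) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: `zlib.crc32` stands in for whatever CRC polynomial the chosen protocol prescribes, the CRC is carried in one extra 64-bit word, control word assembly is omitted, and the function names are hypothetical. Each list in the result plays the role of one channel (for example one SERDES lane).

```python
import zlib

WORD_BYTES = 8  # 64-bit segmentation unit, as in the example given in the text


def segment(payload: bytes) -> list[bytes]:
    """Cut the payload into 64-bit words, zero-padding the last one."""
    padded = payload + b"\x00" * (-len(payload) % WORD_BYTES)
    return [padded[i:i + WORD_BYTES] for i in range(0, len(padded), WORD_BYTES)]


def append_crc(words: list[bytes]) -> list[bytes]:
    """Compute a CRC over all words and append it as one extra 64-bit word.

    zlib.crc32 is a stand-in; the real polynomial depends on the protocol.
    """
    crc = zlib.crc32(b"".join(words))
    return words + [crc.to_bytes(WORD_BYTES, "big")]


def stripe(words: list[bytes], lanes: int) -> list[list[bytes]]:
    """Distribute words round-robin: lane k gets words k, k + lanes, ..."""
    return [words[k::lanes] for k in range(lanes)]


lanes = stripe(append_crc(segment(b"hello die-to-die link")), lanes=4)
```

In this usage line the 21-byte payload is padded to three 64-bit words, the CRC adds a fourth, and each of the four lanes therefore carries exactly one word.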

Continuing to refer to FIG. 4 and the steps above, in step S408, in response to data reception by the first die, the first data to be received is obtained through the transmission interface. Next, in step S410, the protocol processing unit performs frame alignment and descrambling on the first data to be received and then decodes it to obtain decoded data, aggregates the decoded data to obtain aggregated data, and performs a cyclic redundancy check and data combination on the aggregated data to obtain the second data to be received. In this way, the first data to be received from the transmission interface, for example data received through a SERDES interface, first undergoes frame alignment and descrambling. It is then decoded, the decoded data is aggregated, and the cyclic redundancy check and data combination recover the user data, which is finally written into the data buffer. The data can then be sent to the user-side interface after rate adaptation.
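The aggregation, cyclic redundancy check, and data combination of step S410 can be sketched in Python as the inverse of a round-robin striping. The sketch assumes 64-bit words, a trailing CRC word computed with `zlib.crc32`, and hypothetical function names; none of these details is fixed by the description.

```python
import zlib
from itertools import zip_longest

WORD_BYTES = 8  # 64-bit words, an assumption matching the segmentation example


def aggregate(lanes: list[list[bytes]]) -> list[bytes]:
    """Undo round-robin striping by interleaving the lanes into one stream."""
    merged = []
    for group in zip_longest(*lanes):
        # Shorter lanes are padded with None by zip_longest; skip those slots.
        merged.extend(word for word in group if word is not None)
    return merged


def check_and_combine(words: list[bytes]) -> bytes:
    """Verify the trailing CRC word, then return the concatenated data words."""
    *data, crc_word = words
    payload = b"".join(data)
    if zlib.crc32(payload) != int.from_bytes(crc_word, "big"):
        raise ValueError("CRC mismatch: data corrupted in transit")
    return payload
```

Raising an exception on a mismatch is only one possible policy; a real interface might instead flag the burst and leave recovery to a higher layer.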

In this way, the die-to-die interface interconnection method shown in FIG. 4 meets the needs of a high data transmission rate and high data transmission reliability, and can also flexibly adapt to the requirements of various protocols, rules, and policies on data transmission, flow control, scheduling functions, bandwidth, data channels, and so on. In the method shown in FIG. 4, the encoding and decoding operations may be based on the specific protocol adopted by the die-to-die interface of the first die, for example following the encoding format and rules it prescribes. In general, an interface between dies requires a large data transmission bandwidth and also has flow control and channelized transmission characteristics. In network chip scenarios, to achieve high-speed digital communication, the interface between dies may combine an interface of a specific communication protocol with SERDES.

In a possible implementation, the encoding operation is based on a first encoding scheme, the decoding operation is based on a first decoding scheme, and the first encoding scheme corresponds to the first decoding scheme. In some embodiments, the first encoding scheme is 64/67 encoding and the first decoding scheme is 64/67 decoding. With encoding and decoding schemes that correspond to each other, the die-to-die interface of the first die can both send and receive data. Here, 64/67 encoding means rewriting the original data into a specific encoding format according to certain rules, namely encoding 64-bit data into 67-bit data; 64/67 decoding means decoding 67-bit data back into 64-bit data.
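To make the 64-to-67-bit mapping concrete, the Python sketch below wraps a 64-bit payload with three extra bits: two framing bits that mark the word as data or control, and one inversion bit used to limit the running disparity on the link. The bit positions, the framing codes, and the disparity rule (which ignores the three header bits themselves) are simplifying assumptions loosely modeled on Interlaken-style 64B/67B encoding; the description only states that 64 bits become 67 bits.

```python
MASK64 = (1 << 64) - 1
DATA, CONTROL = 0b01, 0b10  # assumed framing codes for data and control words


def encode_64_67(payload: int, kind: int, disparity: int) -> tuple[int, int]:
    """Encode a 64-bit payload into a 67-bit word; return (word, new_disparity)."""
    ones = bin(payload).count("1")
    word_disparity = ones - (64 - ones)  # surplus of ones over zeros
    invert = 0
    # Invert the payload when it would push the running disparity further from 0.
    if disparity * word_disparity > 0:
        payload ^= MASK64
        word_disparity = -word_disparity
        invert = 1
    word = (invert << 66) | (kind << 64) | payload
    return word, disparity + word_disparity


def decode_64_67(word: int) -> tuple[int, int]:
    """Decode a 67-bit word into (payload, kind), undoing any inversion."""
    kind = (word >> 64) & 0b11
    if kind not in (DATA, CONTROL):
        raise ValueError("invalid framing bits")
    payload = word & MASK64
    if word >> 66:
        payload ^= MASK64
    return payload, kind
```

The decoder needs no disparity state of its own, because the inversion bit travels with every word.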

In a possible implementation, the first data to be sent comes from a user data interface associated with the first die, and the second data to be received is sent to that user data interface. In some embodiments, the second data to be received undergoes rate adaptation before being sent to the user data interface. Rate adaptation contributes to the overall stability and reliability of data transmission.

In a possible implementation, the second data to be sent is sent to the transmission interface of the die-to-die interface of a second die, which is the peer of the first die. This realizes data interconnection between the first die and the second die.

In a possible implementation, adding, through the protocol processing unit, the cyclic redundancy calculation result to the segmented data and then performing assembly and striped distribution to obtain the distributed data includes: through the protocol processing unit, adding the cyclic redundancy calculation result to the segmented data and then performing assembly and striped distribution according to the burst length setting to obtain the distributed data. In this way, on the basis of data segmentation, protocol control words are assembled together with the data words that carry the data to be transmitted, according to the burst length setting. Striping means that, in line with the requirements on data channels and channelized data transmission, the distributed data can be sent over multiple channels; for example, each channel may correspond to one SERDES physical link. Assembling and stripe-distributing according to the burst length setting therefore takes into account the length of data transferred at one time, which facilitates subsequent data transmission over multiple channels, such as multiple SERDES channels.
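Assembly according to a burst length setting can be pictured with the following Python sketch, which groups data words into bursts and attaches a small descriptor to each one. The `Burst` structure and its `sop`/`eop` flags (start and end of packet) are hypothetical stand-ins for a protocol control word; the description does not define the control word layout.

```python
from dataclasses import dataclass


@dataclass
class Burst:
    sop: bool    # first burst of the packet (hypothetical control-word flag)
    eop: bool    # last burst of the packet (hypothetical control-word flag)
    words: list  # data words carried by this burst


def assemble_bursts(words: list, burst_len: int) -> list:
    """Group data words into bursts of at most burst_len words each."""
    if burst_len <= 0:
        raise ValueError("burst_len must be positive")
    bursts = []
    for start in range(0, len(words), burst_len):
        chunk = words[start:start + burst_len]
        bursts.append(Burst(sop=(start == 0),
                            eop=(start + burst_len >= len(words)),
                            words=chunk))
    return bursts
```

For example, ten words with a burst length of four yield bursts of four, four, and two words, with `sop` set only on the first and `eop` only on the last.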

In a possible implementation, the transmission interface is a serializer/deserializer (SERDES) interface. In a possible implementation, the protocol processing unit is an Interlaken protocol processing unit. In general, an interface between dies requires a large data transmission bandwidth and also has flow control and channelized transmission characteristics. In network chip scenarios, to achieve high-speed digital communication, the interface between dies may combine an interface of a specific communication protocol with SERDES. As mentioned above, in the die-to-die interface interconnection method shown in FIG. 4, the encoding and decoding operations may be based on the specific protocol adopted by the die-to-die interface of the first die, for example following the encoding format and rules it prescribes. For example, the Interlaken protocol prescribes 64/67 encoding as its encoding format. When the protocol processing unit is an Interlaken protocol processing unit, the die-to-die interface of the first die performs data interconnection between dies based on the Interlaken protocol, so the original data must be rewritten into that encoding format according to certain rules, that is, 64-bit data is encoded into 67-bit data.
It should be understood that the Interlaken protocol processing unit is a design optimized for the Interlaken protocol. The protocol processing unit may also be adapted to other chip-to-chip communication protocols, such as the XAUI protocol and the PCIE protocol. Depending on the protocol actually adopted and on the configuration of the transmission interface, corresponding encoding and decoding schemes can be used, so that the requirements of various protocols, rules, and policies on data transmission, flow control, scheduling functions, bandwidth, data channels, and so on can be flexibly accommodated.

In a possible implementation, the protocol processing unit performs rate adaptation on the first data to be sent in the interface buffer, at least before segmenting the first data to be sent to obtain the segmented data. Rate adaptation contributes to the overall stability and reliability of data transmission.

In a possible implementation, while performing the cyclic redundancy check calculation on the segmented data to generate the cyclic redundancy calculation result, the protocol processing unit simultaneously generates a control field that records descriptive information about the segmented data. The generated control field can be used for data segmentation and control word assembly. For example, the control field may record descriptive information about the segmented data that reflects protocol-layer control details, such as the control word structure.
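Generating a control field in the same pass as the CRC can be sketched in Python as a single loop that updates both. The descriptor fields `word_count` and `byte_count` are hypothetical examples of descriptive information, and `zlib.crc32` is again a stand-in for the protocol's actual CRC.

```python
import zlib


def crc_with_control_field(words: list[bytes]) -> tuple[int, dict]:
    """Run the CRC over the words while filling in a descriptor in the same pass."""
    crc = 0
    control = {"word_count": 0, "byte_count": 0}  # hypothetical descriptor fields
    for word in words:
        crc = zlib.crc32(word, crc)  # incremental CRC update with running value
        control["word_count"] += 1
        control["byte_count"] += len(word)
    return crc, control
```

Feeding the running value back into `zlib.crc32` gives the same result as one call over the concatenated words, so the descriptor comes at no extra pass over the data.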

In a possible implementation, the plurality of dies are homogeneous dies or heterogeneous dies. In a possible implementation, the plurality of dies correspond to functional blocks of one system-on-chip, and the plurality of dies are packaged together through their respective die-to-die interfaces to form a system chipset corresponding to that system-on-chip. In a possible implementation, the plurality of dies are packaged together using chiplet technology. It should be understood that different semiconductor manufacturing processes and different integration and packaging approaches may affect the details of the die-to-die interface interconnection. For example, die-to-die interface interconnection may be used for data interconnection between homogeneous dies, such as one central processing unit die and another central processing unit die, or between heterogeneous dies, such as a central processing unit die and a neural network processor die.

Continuing to refer to FIG. 4, the die-to-die interface interconnection method shown in FIG. 4 meets the needs of a high data transmission rate and high data transmission reliability, and can also flexibly adapt to the requirements of various protocols, rules, and policies on data transmission, flow control, scheduling functions, bandwidth, data channels, and so on. In the method shown in FIG. 4, the encoding and decoding operations may be based on the specific protocol adopted by the die-to-die interface of the first die, for example following the encoding format and rules it prescribes. As requirements on data transmission bandwidth and data transmission rate rise, data transmission reliability and data error correction become more challenging. Moreover, the encoding and decoding schemes actually used may vary with the protocols, rules, and policies adopted by the die-to-die interface. It is therefore necessary to provide, in the protocol-layer processing, a general data protection mechanism that can flexibly strengthen data protection according to the requirements of the specific communication protocol and the specific encoding and decoding scheme, including stronger error detection on link data and a bit error correction mechanism. This improves the reliability of high-speed data transmission over the interconnection interface between dies and meets application needs in fields such as networking, large-scale data centers, and artificial intelligence. These improvements are described in detail below.

In a possible implementation, the encoding operation is based on a first encoding scheme, the encoded data includes a plurality of pre-compression data, each of the plurality of pre-compression data has a size of a first number of bits, and the first number is based on the first encoding scheme. Scrambling and framing the encoded data to obtain the second data to be sent then includes: compression-transcoding each of the plurality of pre-compression data to obtain a plurality of compressed data in one-to-one correspondence with the plurality of pre-compression data; performing a forward error correction calculation on the plurality of pre-compression data to generate a redundant error correction code; adding the redundant error correction code to the plurality of compressed data to update the encoded data; and scrambling and framing the updated encoded data to obtain the second data to be sent.
Forward error correction (FEC) controls transmission errors in a communication system by sending additional information along with the data for error recovery, thereby reducing the bit error rate. Specifically, with FEC the data to be sent is augmented with a certain amount of redundant error correction code and the two are sent together, so that the receiver can use the error correction code to detect and correct errors in the received data. Here, starting from the plurality of pre-compression data included in the encoded data, each pre-compression data is compression-transcoded to obtain the corresponding compressed data, a forward error correction calculation over the pre-compression data generates the redundant error correction code, and the redundant error correction code is added to the compressed data to update the encoded data. Generating a redundant error correction code with FEC strengthens data protection and can lower the bit error rate of data transmission over the interface between dies. Compression transcoding in turn reduces the data size, so that the redundant error correction code can be added to the compressed data, for example as a redundant data field at the tail of the data to be sent, which reduces the impact on data transmission bandwidth. Finally, the updated encoded data is scrambled and framed to obtain the second data to be sent, so that data protection is strengthened while the impact on data transmission bandwidth is reduced.
It should be noted that the encoding operation is based on the first encoding scheme, that each of the plurality of pre-compression data has a size of the first number of bits, and that the first number is based on the first encoding scheme. The enhanced data protection mechanism can therefore adapt to the first encoding scheme prescribed by the communication protocol, and can flexibly strengthen data protection according to the requirements of the specific communication protocol and the specific encoding and decoding scheme. In some embodiments, the first encoding scheme is 64/67 encoding and the first number is 67; that is, each of the plurality of pre-compression data is 67 bits in size.
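As a size-budget check for this compress-and-append scheme, the Python lines below work through the figures the description quotes for its 64/67 embodiment: four 67-bit words transcoded into 261 bits with 7 error correction bits per group, and an RS(536, 522) code with a 140-bit correction code protecting 5220 data bits. The 10-bit symbol size is an inference from 5220 bits over 522 data symbols, not something the description states.

```python
WORD_BITS = 67         # one encoded word under 64/67 encoding
WORDS_PER_GROUP = 4    # words transcoded together in the quoted example
COMPRESSED_BITS = 261  # the four words after compression transcoding
ECC_BITS = 7           # error correction bits appended per group

group_bits = WORD_BITS * WORDS_PER_GROUP   # size before transcoding
freed_bits = group_bits - COMPRESSED_BITS  # room freed for the correction code

# RS(536, 522): 536 symbols per codeword, 522 of them data symbols.
N_SYMBOLS, K_SYMBOLS = 536, 522
SYMBOL_BITS = 10  # assumption: inferred from 5220 data bits / 522 symbols

data_bits = K_SYMBOLS * SYMBOL_BITS                  # protected data bits
parity_bits = (N_SYMBOLS - K_SYMBOLS) * SYMBOL_BITS  # correction code bits
groups_per_codeword = data_bits // COMPRESSED_BITS   # groups in one codeword
correctable_symbols = (N_SYMBOLS - K_SYMBOLS) // 2   # standard RS bound

print(group_bits, freed_bits, data_bits, parity_bits, groups_per_codeword)
# prints: 268 7 5220 140 20
```

Under these assumptions the numbers close exactly: twenty groups of 261 data bits fill the 5220-bit data block, twenty 7-bit slots carry the 140-bit correction code, and the total of 5360 bits equals twenty unmodified 268-bit groups, so the link bandwidth is unchanged.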

Further, in some embodiments, the size of the encoded data before the update is the same as the size of the encoded data after the update, the plurality of compressed data correspond to the data fields in the plurality of pre-compression data, and the redundant error correction code generated by performing the forward error correction calculation on the plurality of pre-compression data corresponds to the bit fields used for synchronization in the plurality of pre-compression data. Thus, after the encoding operation based on the first encoding scheme produces the plurality of pre-compression data included in the encoded data, each of them is compressed and transcoded to obtain the plurality of compressed data in one-to-one correspondence with the plurality of pre-compression data. Adding the redundant error correction code to the plurality of compressed data to update the encoded data does not change the size of the data: compression transcoding frees up space that is then used to carry the redundant data field of the FEC algorithm, that is, the FEC error correction code. Because the plurality of compressed data correspond to the data fields of the plurality of pre-compression data, the data fields are kept unchanged and the redundant part stores the error correction code, so the overall size of the encoded data is the same before and after the update. For example, suppose the plurality of pre-compression data are four data of 67 bits each (that is, the first value is 67). After compression, these four 67-bit data occupy 261 bits, and the error correction code occupies 7 bits. The encoded data before the update is therefore 4 × 67 = 268 bits, and the encoded data after the update is 261 + 7 = 268 bits. The FEC mechanism is thus introduced without changing the link transmission bandwidth, and data protection is enhanced by adding the corresponding reverse transcoding and FEC decoding operations at the receiving end. In some embodiments, the FEC algorithm may be RS(536, 522), for which the redundant error correction code is 140 bits and protects 5220 bits of data. That is, the sending end calculates the redundant error correction code (140 bits) from the input data block (5220 bits) and appends it to the tail of the data block, and the two are transmitted together. The receiving end performs the FEC check calculation on the frame-delimited FEC data block; if an error is found in the data field, it is corrected based on the error correction code to recover the correct data. A data protection scheme using this FEC algorithm can correct up to 70 bits of errors in the data field.
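The figures quoted in this example can be cross-checked with a few lines of arithmetic. The sketch below assumes, as an inference that the text does not state, that RS(536, 522) operates on 10-bit symbols (5220 bits / 522 symbols), and it uses the standard Reed-Solomon property that n − k parity symbols allow up to (n − k) / 2 symbol errors to be corrected. Under those assumptions, the 70-bit figure corresponds to the case in which all erroneous bits fall within 7 symbols. The variable names are illustrative only.

```python
# Bit budget of one compressed-and-transcoded group (figures from the text).
WORD_BITS = 67           # size of one 64/67-encoded word
WORDS_PER_GROUP = 4      # words that are compressed and transcoded together
PAYLOAD_BITS = 261       # size of the group after compression transcoding

group_bits = WORD_BITS * WORDS_PER_GROUP          # bits before compression
ecc_bits_per_group = group_bits - PAYLOAD_BITS    # bits freed for the FEC code

# RS(536, 522) figures from the text.
RS_N, RS_K = 536, 522
DATA_BITS, PARITY_BITS = 5220, 140

# Assumption: the symbol size is inferred from DATA_BITS / RS_K.
symbol_bits = DATA_BITS // RS_K
parity_symbols = RS_N - RS_K
correctable_symbols = parity_symbols // 2         # standard RS bound
max_correctable_bits = correctable_symbols * symbol_bits

block_bits = RS_N * symbol_bits                   # one full FEC block on the link
groups_per_block = block_bits // group_bits       # 268-bit groups per FEC block

print(group_bits, ecc_bits_per_group)
print(symbol_bits, parity_symbols, correctable_symbols, max_correctable_bits)
print(block_bits, groups_per_block)
```

With these numbers, one FEC block spans a whole number of 268-bit groups, and the 7 bits freed in each group add up exactly to the 140-bit redundant error correction code.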

Further, in a possible implementation, the decoding operation is based on a first decoding scheme, and the first encoding scheme corresponds to the first decoding scheme, wherein performing data aggregation processing on the decoded data to obtain the aggregated data includes: decompressing and reverse transcoding the decoded data, then performing a forward error correction check, and then performing data aggregation processing to obtain the aggregated data. Thus, in addition to the sending end using FEC to strengthen data protection, the receiving end performs the corresponding reverse transcoding and FEC decoding operations (for example, the receiving end can restore the synchronization header bits that were compressed away and add them back at the corresponding data block positions). This establishes a universal data protection mechanism for data interconnection between dies. On the one hand, FEC is used to generate the redundant error correction code, which is added to the plurality of compressed data to update the encoded data, so that errors in the data field can be corrected using the redundant error correction code and data transmission reliability is improved. On the other hand, compression transcoding keeps the overall size of the encoded data the same before and after the update, which avoids increasing the link transmission bandwidth when the redundant error correction code is introduced. Because it is the redundant field that is reused, the data field itself is not affected (the redundant error correction code generated by performing the forward error correction calculation on the plurality of pre-compression data corresponds to the bit fields used for synchronization in the plurality of pre-compression data), and FEC protection is achieved while keeping the link transmission bandwidth unchanged. In addition, both the encoding operation and the decoding operation can follow the requirements of a specific communication protocol. For example, where the Interlaken protocol specifies 64/67 encoding as the first encoding scheme, the corresponding compression transcoding can be applied: four data of 67 bits each are compressed and transcoded into 261 bits, leaving 7 bits for the redundant error correction code. In other words, a suitable FEC algorithm can be adopted flexibly according to the requirements of the specific communication protocol and the specific encoding and decoding scheme, so that the first encoding scheme specified by the communication protocol can be accommodated. In addition, by selecting the first decoding scheme corresponding to the first encoding scheme, the receiving end performs the reverse transcoding and FEC decoding operations that match the FEC protection applied at the sending end. The resulting data protection mechanism strengthens error detection for link data and provides a bit-error correction mechanism, which improves the reliability of high-speed data transmission over the interconnection interface between dies and meets application needs in fields such as networking, large-scale data centers and artificial intelligence.
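The size bookkeeping described above (261 compressed bits plus a 7-bit redundant field filling the original 268-bit footprint) can be sketched as a pair of pack and unpack helpers. This is a minimal sketch under stated assumptions: the text does not define the internal layout of the 261-bit payload or the bit position of the redundant field, so the payload is treated as an opaque integer and the 7 bits are placed in the low-order positions ("at the end") purely for illustration. The names `pack_group` and `unpack_group` are hypothetical.

```python
from typing import Tuple

PAYLOAD_BITS = 261                       # four 67-bit words after compression
ECC_BITS = 7                             # space freed for the redundant code
GROUP_BITS = PAYLOAD_BITS + ECC_BITS     # same footprint as 4 x 67 bits

def pack_group(payload: int, ecc: int) -> int:
    """Place the redundant bits after the compressed payload (assumed layout)."""
    if not 0 <= payload < (1 << PAYLOAD_BITS):
        raise ValueError("payload does not fit in 261 bits")
    if not 0 <= ecc < (1 << ECC_BITS):
        raise ValueError("ecc does not fit in 7 bits")
    return (payload << ECC_BITS) | ecc

def unpack_group(group: int) -> Tuple[int, int]:
    """Split a 268-bit group back into (payload, ecc) at the receiving end."""
    if not 0 <= group < (1 << GROUP_BITS):
        raise ValueError("group does not fit in 268 bits")
    return group >> ECC_BITS, group & ((1 << ECC_BITS) - 1)
```

Because `GROUP_BITS` equals 4 × 67, a link that carried four 67-bit words per group carries the same number of bits per group after the redundant field is added.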

Further, in a possible implementation, compressing and transcoding each of the plurality of pre-compression data to obtain the plurality of compressed data in one-to-one correspondence with the plurality of pre-compression data includes: compressing the bit fields used for synchronization in the plurality of pre-compression data so that the freed bits can carry the redundant error correction code, and keeping the data fields in the plurality of pre-compression data. In some embodiments, the link transmission bandwidth of the transmission interface associated with the plurality of pre-compression data is equal to the link transmission bandwidth of the transmission interface associated with the plurality of compressed data. In some embodiments, the compression transcoding that produces the plurality of compressed data is based on the first encoding scheme, and the first encoding scheme is based on the die-to-die interface interconnection protocol associated with the protocol processing unit. In some embodiments, the first decoding scheme is likewise based on the die-to-die interface interconnection protocol associated with the protocol processing unit. In this way, a suitable FEC algorithm can be adopted flexibly according to the requirements of the specific communication protocol and the specific encoding and decoding scheme, so that the first encoding scheme specified by the communication protocol can be accommodated and data protection enhanced accordingly. In addition, by selecting the first decoding scheme corresponding to the first encoding scheme, the receiving end performs the reverse transcoding and FEC decoding operations that match the FEC protection applied at the sending end. This establishes a universal data protection mechanism for data interconnection between dies that strengthens error detection for link data and provides a bit-error correction mechanism, which improves the reliability of high-speed data transmission over the interconnection interface between dies and meets application needs in fields such as networking, large-scale data centers and artificial intelligence.

FIG. 5 is a schematic flowchart of a data sending process through a die-to-die interface and based on a first encoding scheme, provided by an embodiment of the present application. As shown in FIG. 5, the data sending process through the die-to-die interface and based on the first encoding scheme consists of the following steps.

Step S502: Data caching.

Step S504: Data cutting.

Step S506: Cyclic redundancy check calculation and control word generation.

Step S508: Assembly and striping.

Step S510: Encoding based on the first encoding scheme.

Step S512: Compression transcoding based on the first encoding scheme.

Step S514: Forward error correction calculation.

Step S516: Scrambling and framing.

For the data sending process shown in FIG. 5, which goes through the die-to-die interface and is based on the first encoding scheme, reference may be made to the die-to-die interface interconnection method shown in FIG. 4. The process can therefore meet the need for a high data transmission rate and high data transmission reliability, and can flexibly accommodate the requirements of various protocols, rules and policies regarding data transmission, flow control, scheduling, bandwidth, data channels and the like. Further, the data sending process shown in FIG. 5 performs compression transcoding based on the first encoding scheme in step S512 and the forward error correction calculation in step S514. The protocol-layer processing thus introduces a universal FEC-based data protection mechanism that can flexibly enhance data protection according to the requirements of the specific communication protocol and the specific encoding and decoding scheme. This includes stronger error detection for link data and a bit-error correction mechanism, which improves the reliability of high-speed data transmission over the interconnection interface between dies and meets application needs in fields such as networking, large-scale data centers and artificial intelligence.

FIG. 6 is a schematic flowchart of a data receiving process through a die-to-die interface and based on a first decoding scheme, provided by an embodiment of the present application. As shown in FIG. 6, the data receiving process through the die-to-die interface and based on the first decoding scheme consists of the following steps.

Step S602: Framing and descrambling.

Step S604: Decoding based on the first decoding scheme.

Step S606: Decompression and reverse transcoding based on the first decoding scheme.

Step S608: Forward error correction check.

Step S610: Data aggregation processing.

Step S612: Cyclic redundancy check.

Step S614: Data combination.

Step S616: Data caching.

For the data receiving process shown in FIG. 6, which goes through the die-to-die interface and is based on the first decoding scheme, reference may be made to the die-to-die interface interconnection method shown in FIG. 4; the process reflects the corresponding operations performed at the receiving end. Further, the data receiving process shown in FIG. 6 performs decompression and reverse transcoding based on the first decoding scheme in step S606 and the forward error correction check in step S608. In the protocol-layer processing, therefore, the sending end uses FEC to strengthen data protection and the receiving end performs the corresponding reverse transcoding and FEC decoding operations, which establishes a universal data protection mechanism for data interconnection between dies.
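Steps S516 and S602 scramble the data before framing and descramble it after framing, but the text does not say which scrambler is used. The following minimal sketch shows one common approach, an additive (synchronous) scrambler, in which both ends XOR the data with the same pseudo-random keystream. The polynomial x^58 + x^39 + 1, the seed handling and the function names are assumptions chosen for illustration, not details taken from this document.

```python
# Additive scrambler sketch: the data is XORed with an LFSR keystream.
# Polynomial and seed handling are illustrative assumptions.
POLY_DEGREE = 58
TAP = 39
MASK = (1 << POLY_DEGREE) - 1

def keystream_bits(seed: int, nbits: int):
    """Yield nbits keystream bits from a Fibonacci LFSR (x^58 + x^39 + 1)."""
    state = seed & MASK
    for _ in range(nbits):
        bit = ((state >> (POLY_DEGREE - 1)) ^ (state >> (TAP - 1))) & 1
        state = ((state << 1) | bit) & MASK
        yield bit

def scramble(data: bytes, seed: int) -> bytes:
    """XOR every data bit with the keystream, most significant bit first."""
    if seed & MASK == 0:
        raise ValueError("seed must be non-zero, or the keystream is all zeros")
    bits = keystream_bits(seed, 8 * len(data))
    out = bytearray()
    for byte in data:
        key = 0
        for _ in range(8):
            key = (key << 1) | next(bits)
        out.append(byte ^ key)
    return bytes(out)

# XOR with the same keystream is its own inverse, so the receiving end
# applies the same operation with the same seed to descramble.
descramble = scramble
```

In this sketch the two ends must agree on the seed; how that state is synchronized in practice is protocol-specific and is not covered here.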

Referring to FIG. 5 and FIG. 6, the data sending process shown in FIG. 5 performs compression transcoding based on the first encoding scheme in step S512 and the forward error correction calculation in step S514, and the data receiving process shown in FIG. 6 performs decompression and reverse transcoding based on the first decoding scheme in step S606 and the forward error correction check in step S608. A suitable FEC algorithm can therefore be adopted flexibly according to the requirements of the specific communication protocol and the specific encoding and decoding scheme, so that the first encoding scheme specified by the communication protocol can be accommodated and data protection enhanced accordingly. By selecting the first decoding scheme corresponding to the first encoding scheme, the receiving end performs the reverse transcoding and FEC decoding operations that match the FEC protection applied at the sending end. This establishes a universal data protection mechanism for data interconnection between dies that strengthens error detection for link data and provides a bit-error correction mechanism, which improves the reliability of high-speed data transmission over the interconnection interface between dies and meets application needs in fields such as networking, large-scale data centers and artificial intelligence.
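The relationship between the two flows can be tabulated. In the sketch below the step identifiers and their order come from FIG. 5 and FIG. 6 as listed above; the pairing of each sending step with a receiving step (`COUNTERPART`) is an assumed reading based on the step names and is not stated explicitly in the text.

```python
# Step order as listed for FIG. 5 (sending) and FIG. 6 (receiving).
TX_STEPS = ["S502", "S504", "S506", "S508", "S510", "S512", "S514", "S516"]
RX_STEPS = ["S602", "S604", "S606", "S608", "S610", "S612", "S614", "S616"]

# Assumed pairing: which receiving step undoes or checks which sending step.
COUNTERPART = {
    "S502": "S616",  # data caching          <-> data caching
    "S504": "S614",  # data cutting          <-> data combination
    "S506": "S612",  # CRC calculation       <-> CRC check
    "S508": "S610",  # assembly and striping <-> data aggregation
    "S510": "S604",  # encoding              <-> decoding
    "S512": "S606",  # compression transcode <-> decompression, reverse transcode
    "S514": "S608",  # FEC calculation       <-> FEC check
    "S516": "S602",  # scrambling, framing   <-> framing, descrambling
}

def mirrored_tx_order():
    """Receiving order that a strict mirror image of the sending flow would give."""
    return [COUNTERPART[step] for step in reversed(TX_STEPS)]

def order_differences():
    """Positions where the listed receiving order departs from the strict mirror."""
    return [
        (index, mirrored, listed)
        for index, (mirrored, listed) in enumerate(zip(mirrored_tx_order(), RX_STEPS))
        if mirrored != listed
    ]
```

Under this assumed pairing, every sending step has exactly one receiving counterpart, and the listed receiving order differs from a strict mirror only in the relative positions of decoding (S604) and the forward error correction check (S608).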

Referring to FIG. 1 to FIG. 6, an embodiment of the present application provides a method of die-to-die interface interconnection. Each die of a plurality of dies includes a die-to-die interface for data interconnection between that die and another die of the plurality of dies. The method is applied to a first die, which is any die of the plurality of dies, and the die-to-die interface of the first die includes an interface cache, a protocol processing unit and a transmission interface. The method includes: in response to data sending by the first die, inputting first data to be sent into the interface cache; through the protocol processing unit, performing data cutting on the first data to be sent to obtain cut data, performing a cyclic redundancy check calculation on the cut data to generate a cyclic redundancy calculation result, adding the cyclic redundancy calculation result to the cut data and then performing assembly and striped distribution to obtain distributed data, performing an encoding operation on the distributed data based on a first encoding scheme to obtain encoded data including a plurality of pre-compression data, compressing and transcoding each of the plurality of pre-compression data to obtain a plurality of compressed data in one-to-one correspondence with the plurality of pre-compression data, performing a forward error correction calculation on the plurality of pre-compression data to generate a redundant error correction code, adding the redundant error correction code to the plurality of compressed data to update the encoded data, and scrambling and framing the updated encoded data to obtain second data to be sent, wherein the size of the encoded data before the update is the same as the size of the encoded data after the update; sending the second data to be sent through the transmission interface; in response to data receiving by the first die, obtaining first data to be received through the transmission interface; through the protocol processing unit, framing and descrambling the first data to be received and then performing a decoding operation based on a first decoding scheme to obtain decoded data, performing data aggregation processing on the decoded data to obtain aggregated data, and performing a cyclic redundancy check and data combination on the aggregated data to obtain second data to be received, wherein the first encoding scheme corresponds to the first decoding scheme; and inputting the second data to be received into the interface cache.
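The outer part of this method (data cutting, cyclic redundancy check calculation, striped distribution, and the matching aggregation, check and combination at the receiving end) can be sketched as a round trip. This is a simplified illustration under stated assumptions: CRC-32 from Python's `zlib` stands in for whatever CRC the protocol specifies, one CRC is appended per segment, striping is plain round-robin, and the segment size and lane count are hypothetical. Encoding, compression transcoding, FEC, scrambling and framing are omitted.

```python
import zlib
from typing import List

SEGMENT_BYTES = 8   # hypothetical segment size used for data cutting
NUM_LANES = 4       # hypothetical number of lanes used for striping
CRC_BYTES = 4       # CRC-32 is a stand-in for the protocol's CRC

def cut(data: bytes, size: int = SEGMENT_BYTES) -> List[bytes]:
    """Data cutting: split the buffered data into fixed-size segments."""
    return [data[i:i + size] for i in range(0, len(data), size)]

def add_crc(segment: bytes) -> bytes:
    """Append the CRC calculation result to a segment."""
    return segment + zlib.crc32(segment).to_bytes(CRC_BYTES, "big")

def stripe(units: List[bytes], lanes: int = NUM_LANES) -> List[List[bytes]]:
    """Striped distribution: hand units to the lanes in round-robin order."""
    out: List[List[bytes]] = [[] for _ in range(lanes)]
    for index, unit in enumerate(units):
        out[index % lanes].append(unit)
    return out

def aggregate(lanes: List[List[bytes]]) -> List[bytes]:
    """Data aggregation: read the lanes back in the same round-robin order."""
    depth = max((len(lane) for lane in lanes), default=0)
    return [lane[row] for row in range(depth) for lane in lanes if row < len(lane)]

def check_crc(unit: bytes) -> bytes:
    """Cyclic redundancy check: verify a unit and return its segment."""
    segment, crc = unit[:-CRC_BYTES], unit[-CRC_BYTES:]
    if zlib.crc32(segment).to_bytes(CRC_BYTES, "big") != crc:
        raise ValueError("CRC mismatch")
    return segment

def send(data: bytes) -> List[List[bytes]]:
    return stripe([add_crc(segment) for segment in cut(data)])

def receive(lanes: List[List[bytes]]) -> bytes:
    """Aggregate, check, and combine the segments back into the original data."""
    return b"".join(check_crc(unit) for unit in aggregate(lanes))
```

For example, `receive(send(payload))` returns `payload` unchanged, while flipping a bit in any unit on any lane causes `check_crc` to raise `ValueError`.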

The method and the device provided in the embodiments of the present application are based on the same inventive concept. Because the method and the device solve the problem on similar principles, their embodiments, implementations and examples may be referred to one another, and repeated content is not described again. An embodiment of the present application further provides a system including a plurality of computing devices, where the structure of each computing device may follow the structure of the computing device described above. For the functions or operations that the system can implement, refer to the specific implementation steps in the above method embodiments and/or the specific functions described in the above apparatus embodiments; details are not repeated here.

An embodiment of the present application further provides a computer-readable storage medium storing computer instructions that, when run on a computer device (such as one or more processors), implement the method steps in the above method embodiments. For the specific implementation of these method steps when a processor executes the instructions stored on the computer-readable storage medium, refer to the specific operations described in the above method embodiments and/or the specific functions described in the above apparatus embodiments; details are not repeated here.

Those skilled in the art will understand that the embodiments of the present application may be provided as a method, a system, or a computer program product. The present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. The embodiments of the present application may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, the above embodiments may be implemented in whole or in part in the form of a computer program product. The present application may take the form of a computer program product implemented on one or more computer-usable storage media containing computer-usable program code. The computer program product includes one or more computer instructions. When the computer program instructions are loaded or executed on a computer, the processes or functions described in the embodiments of the present application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server or data center to another website, computer, server or data center by wired means (for example, coaxial cable, optical fiber, or digital subscriber line) or wireless means (for example, infrared, radio, or microwave). The computer-readable storage medium may be any available medium that a computer can access, or a data storage device, such as a server or data center, that contains one or more sets of available media. The available medium may be a magnetic medium (for example, a floppy disk, hard disk, or magnetic tape), an optical medium, or a semiconductor medium. The semiconductor medium may be a solid-state drive, random access memory, flash memory, read-only memory, erasable programmable read-only memory, electrically erasable programmable read-only memory, a register, or any other suitable form of storage medium.

The present application is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the present application. Each process and/or block in the flowcharts and/or block diagrams, and combinations of processes and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more processes of the flowcharts and/or one or more blocks of the block diagrams. These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more processes of the flowcharts and/or one or more blocks of the block diagrams. These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more processes of the flowcharts and/or one or more blocks of the block diagrams.

In the above embodiments, the description of each embodiment has its own emphasis; for parts not described in detail in one embodiment, refer to the related descriptions of other embodiments. Those skilled in the art can make various changes and modifications to the embodiments of the present application without departing from their spirit and scope. The steps of the methods in the embodiments of the present application may be reordered, merged or deleted according to actual needs, and the modules of the systems in the embodiments of the present application may be divided, merged or deleted according to actual needs. If such modifications and variations fall within the scope of the claims of the present application and their technical equivalents, the present application is intended to include them as well.

Claims

1. A method of die-to-die interface interconnection, wherein each die of a plurality of dies includes a die-to-die interface for data interconnection between that die and another die of the plurality of dies, the method being applied to a first die that is any die of the plurality of dies, the die-to-die interface of the first die including an interface cache, a protocol processing unit, and a transmission interface, the method comprising:

in response to data sending by the first die, inputting first data to be sent to the interface cache;

performing, by the protocol processing unit, data cutting on the first data to be sent to obtain cut data, performing a cyclic redundancy check calculation on the cut data to generate a cyclic redundancy calculation result, adding the cyclic redundancy calculation result to the cut data and performing assembly and striping distribution to obtain distributed data, performing an encoding operation on the distributed data based on a first encoding scheme to obtain encoded data comprising a plurality of pre-compression data, performing compression transcoding on the plurality of pre-compression data to obtain a plurality of post-compression data in one-to-one correspondence with the plurality of pre-compression data, performing a forward error correction calculation on the plurality of post-compression data to generate a redundant error correction code, adding the redundant error correction code to the plurality of post-compression data to update the encoded data, and performing scrambling and framing on the updated encoded data to obtain second data to be sent, wherein the size of the encoded data before the update is consistent with the size of the updated encoded data;

sending the second data to be sent through the transmission interface;

in response to data receiving by the first die, acquiring first data to be received through the transmission interface;

performing, by the protocol processing unit, deframing and descrambling on the first data to be received, then performing a decoding operation based on a first decoding scheme to obtain decoded data, performing data aggregation processing on the decoded data to obtain aggregated data, and performing a cyclic redundancy check and data combination on the aggregated data to obtain second data to be received, wherein the first encoding scheme corresponds to the first decoding scheme;

and inputting the second data to be received to the interface cache.

2. The method of claim 1, wherein adding the cyclic redundancy calculation result to the cut data and performing assembly and striping distribution to obtain the distributed data comprises: adding, by the protocol processing unit, the cyclic redundancy calculation result to the cut data, so that assembly and striping distribution are performed according to a burst length setting to obtain the distributed data.

3. The method of claim 1, wherein the transmission interface is a serializer/deserializer (SerDes) interface.

4. The method of claim 1, wherein the protocol processing unit is an Interlaken protocol processing unit.

5. The method of claim 1, wherein the plurality of dies are homogeneous dies or heterogeneous dies.

6. The method of claim 1, wherein the plurality of dies correspond to functional blocks of a same system-on-chip, the plurality of dies being packaged together through their respective die-to-die interfaces to form a system chipset corresponding to the system-on-chip.

7. The method of claim 1, wherein the plurality of dies are packaged together by chiplet technology.

8. The method of claim 1, wherein each of the plurality of pre-compression data has a size, in bits, of a first value, the first value being based on the first encoding scheme.

9. The method of claim 8, wherein the plurality of post-compression data correspond to data fields in the plurality of pre-compression data, and wherein the redundant error correction code generated by performing the forward error correction calculation on the plurality of post-compression data corresponds to bit fields for synchronization in the plurality of pre-compression data.

10. The method of claim 8, wherein performing data aggregation processing on the decoded data to obtain the aggregated data comprises: performing decompression and reverse transcoding on the decoded data, then performing a forward error correction check, and then performing data aggregation processing to obtain the aggregated data.

11. The method of claim 9, wherein performing compression transcoding on the plurality of pre-compression data to obtain the plurality of post-compression data in one-to-one correspondence with the plurality of pre-compression data comprises:

compressing the bit fields for synchronization in the plurality of pre-compression data so that they are used to transmit the redundant error correction code, while retaining the data fields in the plurality of pre-compression data.

12. The method of claim 11, wherein a link transmission bandwidth of the transmission interface associated with the plurality of pre-compression data is equal to a link transmission bandwidth of the transmission interface associated with the plurality of post-compression data.

13. The method of claim 12, wherein the compression transcoding of the plurality of pre-compression data into the plurality of post-compression data in one-to-one correspondence with the plurality of pre-compression data is based on the first encoding scheme, the first encoding scheme being based on a die-to-die interface interconnection protocol associated with the protocol processing unit.

14. A computer device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the method of any one of claims 1 to 13 when executing the computer program.

15. A computer readable storage medium storing computer instructions which, when run on a computer device, cause the computer device to perform the method of any one of claims 1 to 13.
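
As an illustrative, non-limiting sketch of part of the flow recited in claims 1 and 2, the following Python example models only data cutting according to a burst length, the cyclic redundancy calculation, striping distribution across lanes, and the reverse aggregation, check, and combination on the receiving side. The encoding, compression transcoding, forward error correction, scrambling, and framing steps are omitted. All function names, the parameters `burst_len` and `num_lanes`, the round-robin striping policy, and the use of the zlib CRC-32 polynomial are assumptions made for illustration and are not taken from the claims.

```python
import zlib


def cut(payload: bytes, burst_len: int) -> list[bytes]:
    """Cut the payload into bursts of at most burst_len bytes."""
    return [payload[i:i + burst_len] for i in range(0, len(payload), burst_len)]


def add_crc(burst: bytes) -> bytes:
    """Append a 4-byte CRC-32 to one burst (polynomial choice is an assumption)."""
    return burst + zlib.crc32(burst).to_bytes(4, "big")


def stripe(bursts: list[bytes], num_lanes: int) -> list[list[bytes]]:
    """Distribute bursts round-robin across lanes (assumed striping policy)."""
    lanes: list[list[bytes]] = [[] for _ in range(num_lanes)]
    for i, burst in enumerate(bursts):
        lanes[i % num_lanes].append(burst)
    return lanes


def aggregate(lanes: list[list[bytes]]) -> list[bytes]:
    """Inverse of stripe: interleave the lanes back into the original order."""
    total = sum(len(lane) for lane in lanes)
    n = len(lanes)
    return [lanes[i % n][i // n] for i in range(total)]


def check_and_combine(bursts: list[bytes]) -> bytes:
    """Verify the CRC of every burst, strip it, and rejoin the payload."""
    out = bytearray()
    for burst in bursts:
        body, crc = burst[:-4], int.from_bytes(burst[-4:], "big")
        if zlib.crc32(body) != crc:
            raise ValueError("CRC mismatch")
        out += body
    return bytes(out)


def send(payload: bytes, burst_len: int = 8, num_lanes: int = 4) -> list[list[bytes]]:
    """Sending side: cut, add CRC, then stripe across lanes."""
    return stripe([add_crc(b) for b in cut(payload, burst_len)], num_lanes)


def receive(lanes: list[list[bytes]]) -> bytes:
    """Receiving side: aggregate lanes, check CRC, and combine."""
    return check_and_combine(aggregate(lanes))
```

In this sketch, `receive(send(data))` returns `data` unchanged for any byte string, and flipping a single bit in any striped burst causes `receive` to raise `ValueError`, which mirrors the role of the cyclic redundancy check on the receiving side.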