
WO2016037262A1 - Low latency optically distributed dynamic optical interconnection networks - Google Patents


Info

Publication number
WO2016037262A1
WO2016037262A1 (PCT/CA2015/000486)
Authority
WO
WIPO (PCT)
Prior art keywords
nodes
optical
wavelength
switch
distribution layer
Prior art date
Application number
PCT/CA2015/000486
Other languages
English (en)
Other versions
WO2016037262A8 (fr)
Inventor
Yunqu Liu
Kin-Wai LEONG
Original Assignee
Viscore Technologies Inc.
Priority date
Filing date
Publication date
Application filed by Viscore Technologies Inc. filed Critical Viscore Technologies Inc.
Publication of WO2016037262A1 publication Critical patent/WO2016037262A1/fr
Publication of WO2016037262A8 publication Critical patent/WO2016037262A8/fr

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04QSELECTING
    • H04Q11/00Selecting arrangements for multiplex systems
    • H04Q11/0001Selecting arrangements for multiplex systems using optical switching
    • H04Q11/0005Switch and router aspects
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04QSELECTING
    • H04Q11/00Selecting arrangements for multiplex systems
    • H04Q11/0001Selecting arrangements for multiplex systems using optical switching
    • H04Q11/0062Network aspects
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04QSELECTING
    • H04Q11/00Selecting arrangements for multiplex systems
    • H04Q11/0001Selecting arrangements for multiplex systems using optical switching
    • H04Q11/0005Switch and router aspects
    • H04Q2011/0007Construction
    • H04Q2011/0018Construction using tunable transmitters or receivers
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04QSELECTING
    • H04Q11/00Selecting arrangements for multiplex systems
    • H04Q11/0001Selecting arrangements for multiplex systems using optical switching
    • H04Q11/0005Switch and router aspects
    • H04Q2011/0007Construction
    • H04Q2011/0032Construction using static wavelength routers (e.g. arrayed waveguide grating router [AWGR] )
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04QSELECTING
    • H04Q11/00Selecting arrangements for multiplex systems
    • H04Q11/0001Selecting arrangements for multiplex systems using optical switching
    • H04Q11/0062Network aspects
    • H04Q2011/0064Arbitration, scheduling or medium access control aspects

Definitions

  • This invention relates to optical interconnection networks and more particularly to distributed optical switch networks with cyclic wavelength dependent routing elements and an optically dedicated distributed low latency signaling.
  • Data centers are facilities that store and distribute the data on the Internet. With an estimated 14 trillion web pages on over 750 million websites, data centers contain a lot of data. Further, with almost three billion Internet users accessing these websites, including a growing amount of high bandwidth video, there is a massive amount of data being uploaded and downloaded every second on the Internet.
  • CAGR, the compound annual growth rate, for global IP traffic between users is between 40% based upon Cisco's analysis (see http://www.cisco.com/en/US/solutions/collateral/ns341/ns525/ns537/ns705/ns827/white_paper_c11-481360_ns827_Networking_Solutions_White_Paper.html) and 50% based upon the University of Minnesota's Minnesota Internet Traffic Studies (MINTS) analysis.
  • a data center is filled with tall racks of electronics surrounded by cable racks where data is typically stored on big, fast hard drives.
  • Servers are computers that take requests and move the data using fast switches to access the right hard drives and either write or read the data to the hard drives.
  • In mid-2013 Microsoft stated it had over 1 million servers.
  • Connected to these servers are routers that connect the servers to the Internet and thereby the user and / or other data centers.
  • an optical network comprising:
  • the nodes within a plane and the equivalent nodes within each plane are both connected by a switch exploiting the wavelength dependent data distribution layer and the passive broadcast control distribution layer.
  • a switch exploiting a wavelength dependent data distribution layer and a passive broadcast control distribution layer.
  • a switch exploiting a wavelength dependent data distribution layer and a passive broadcast control distribution layer wherein the transmitters connected to the wavelength dependent data distribution layer are high speed wavelength tunable sources providing 2R or 3R functionality.
  • a network employing a fast tunable optical laser source in combination with a passive wavelength dependent distributed optical switch with discrete optical hyperedge signaling.
  • a reconfigurable 2R or 3R optically tunable laser source connected via a strictly non-blocking wavelength router so as to provide one of a distributed VLB switch, a complete graph switch, and a perfect difference graph switch.
  • Figure 1A depicts data center network connections according to the prior art using two-tier leaf-spine architectures;
  • Figure 1B depicts a chordal interconnection pattern for a ring network according to the prior art for use within the interconnection of servers and data centers;
  • Figure 2 depicts schematically the optical elements of a node within a 12 fiber chordal ring architecture according to the prior art of US 2012/0321309 entitled “Optical Architecture and Channel Plan Employing Multi-Fiber Configurations for Data Center Network Switching”;
  • FIG. 3 depicts schematically a system overview of an Hyperedge Signaled Physically Distributed Optical Switch (HYSPDOS) according to an embodiment of the invention
  • Figures 4 and 5A depict schematically the operating basis of a wavelength dependent interconnection networking element (AWGR) forming part of a Hyperedge Signaled Physically Distributed Optical Switch (HYSPDOS) according to an embodiment of the invention;
  • FIG. 5B depicts schematically the operating basis of a wavelength independent interconnection networking element forming part of a Hyperedge Signaled Physically Distributed Optical Switch (HYSPDOS) according to an embodiment of the invention
  • Figure 6 depicts schematically the result of combining AWGR based Fast Tunable Laser Source Switch (FTLSS) and AWGR based Dynamic Reconfigurable Graph Data center Interconnection Network (DRGDIN) to provide Hyperedge Signaled Physically Distributed Optical Switch (HYSPDOS) based networks with a discrete Optical Hyperedge Signaling Panel (OPHYSIP) for connecting data centers in a three-dimensional architecture;
  • Figures 7A and 7B depict 2R and 3R regenerators exploiting semiconductor based picosecond tunable wavelength converters;
  • Figure 8 depicts an exemplary schematic of an opto-electronic reducer as a receiver structure which offers contention-free operation for a Hyperedge Signaled Physically Distributed Optical Switch (HYSPDOS) according to an embodiment of the invention; and
  • FIG. 9 depicts an exemplary schematic of single receiver structure with a contention control push back for a Hyperedge Signaled Physically Distributed Optical Switch (HYSPDOS) according to an embodiment of the invention.
  • FIG. 10 depicts an exemplary schematic of multiple but not fully populated optoelectronic reducers as receivers with a contention push back for a Hyperedge Signaled Physically Distributed Optical Switch (HYSPDOS) according to an embodiment of the invention
  • Figures 11A to 11C depict schematically alternate data center interconnections in a three-dimensional architecture according to an embodiment of the invention by combining AWGR based Fast Tunable Laser Source Switch (FTLSS) and AWGR based Dynamic Reconfigurable Graph Data center Interconnection Network (DRGDIN) to provide Hyperedge Signaled Physically Distributed Optical Switch (HYSPDOS) based networks with a discrete Optical Hyperedge Signaling Panel (OPHYSIP) and in-plane torus networks.
  • the present invention is directed to optical interconnection networks and more particularly to distributed optical switch networks with cyclic wavelength dependent routing elements and an optically dedicated distributed low latency signaling.
  • the number of computer servers that can be added to two-tier leaf/spine data center network architecture is a direct function of the number of uplinks on the leaf switches. If a fully non-blocking topology is provided then the leaf switches are required to have as many uplinks as downlink interfaces to computer servers.
  • 10 Gbps is the default speed of network interfaces of data center servers and hence, with the number of servers required to support the growth of Hybrid/Multi-Cloud services etc. requiring much larger and more centralized data centers, it has become challenging to design non-blocking and cost-effective data center networking fabrics.
  • a combination of a Public and a Private cloud forms a Hybrid Cloud.
  • the combination of multiple Public Cloud services forms a Multi-Cloud.
  • the combination of a Hybrid Cloud and a Multi-Cloud forms a Hybrid/Multi- Cloud.
  • an oversubscription ratio is defined as the ratio of downlink ports to uplink ports when all ports are of equal speed.
  • 40 Gbps of uplink bandwidth to the spine switches is necessary for every 12 servers.
  • cloud scale data center operators are accepting the constraints of a 3:1 oversubscribed two-tier leaf/spine topology (see for example http://www.ieee802.org/3/400GSG/public/13_07/issenhuth_400_01_0713.pdf) due to the much higher costs of implementing non-blocking fabrics.
  • the 3:1 threshold is generally seen as a maximum allowable level of oversubscription and is carefully understood and managed by the data center operators. Accordingly, referring to Figure 1A there is depicted a 3:1 oversubscribed leaf/spine/core architecture supporting communications within and between a pair of data centers, Data center A 110 and Data center B 120.
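The oversubscription arithmetic above can be checked directly; the following is a minimal sketch (the function name is illustrative, not part of the disclosure):

```python
def oversubscription_ratio(downlink_ports, downlink_gbps, uplink_gbps):
    """Aggregate downlink bandwidth divided by aggregate uplink bandwidth
    on a leaf switch; all ports assumed active simultaneously."""
    return (downlink_ports * downlink_gbps) / uplink_gbps

# 12 servers at 10 Gbps sharing 40 Gbps of uplink gives the 3:1
# oversubscription figure discussed above.
print(oversubscription_ratio(12, 10, 40))  # → 3.0
```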
  • the computer infrastructure generally consists of servers 130 interconnected at 10 Gbps to Top of Rack (ToR) Ethernet switches that act as first level aggregation, the leaf switches 140. These ToR leaf switches 140 then uplink at 40 Gbps into end of row (EoR) Ethernet switches, which act as the spine switches 150 of the leaf/spine topology.
  • the spine switches then connect at 100 Gbps to core routers 160, which in turn interconnect to optical core infrastructure made up of metro/long-haul DWDM/ROADM transport platforms.
  • Whilst this leaf/spine/core architecture is the most pervasive manner of providing any-to-any connectivity with a maximum amount of bisection bandwidth within and across data centers, it is not without its limitations.
  • One such limitation is latency due to the requirement to route via at least one leaf switch 140 or more typically via two leaf switches and two or more spine switches 150 and / or core routers 160 according to the dimensions of the data center, the uplink capacity, downlink capacity, location(s) of the servers being accessed, etc.
  • alternative architectures have been proposed such as chordal networks and spine ring networks.
  • each spine switch is addressed from another spine switch by the selection of the wavelength upon which the data is transmitted. Accordingly, the number of spine switches / core switches traversed may be reduced through Dense Wavelength Division Multiplexing (DWDM) based chordal ring architectures as depicted in Figure 1B as, rather than routing data through multiple spine and / or core switches, the data is routed from a node based upon wavelength wherein the N-th wavelength denotes the N-th node around the ring.
  • a node of a spine ring network is depicted in Figure 2 after Barry et al in US 2012/0321309 entitled "Optical Architecture and Channel Plan Employing Multi-Fiber Configurations for Data Center Network Switching."
  • Plexxi Inc. implement a 12 fibre ring, 6 East and 6 West, each with a Coarse WDM channel plan.
  • At each node one fiber in each direction is dedicated to adjacent node communications with 4 wavelengths.
  • a second fiber terminates in each direction with 8 wavelengths wherein these have been added in pairs at nodes N-2, N-3, N-4, and N-5 where N is the current node.
  • Each node can therefore add either one or both wavelengths on the appropriate fiber to allow data to be sent directly to nodes N+2, N+3, N+4, and N+5.
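The single-hop reachability of such a chordal ring can be sketched as follows; this is an illustrative model, not the patented channel plan itself, taking the hop distances of 1 (dedicated fiber) and 2 to 5 (second fiber) from the description above:

```python
def direct_neighbors(node, ring_size):
    """Nodes reachable in one optical hop from 'node': the adjacent
    nodes (hop 1) via the dedicated fiber plus chordal hops 2..5 via
    the second fiber, in both the East and West directions."""
    reach = set()
    for hop in (1, 2, 3, 4, 5):
        reach.add((node + hop) % ring_size)  # East
        reach.add((node - hop) % ring_size)  # West
    return reach

# In a 12-node ring, node 0 reaches 10 of the 11 other nodes directly.
print(sorted(direct_neighbors(0, 12)))  # → [1, 2, 3, 4, 5, 7, 8, 9, 10, 11]
```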
  • Referring to Figure 3 there is depicted schematically a system overview of a Hyperedge Signaled Physically Distributed Optical Switch (HYSPDOS) according to an embodiment of the invention.
  • Depicted are first to third servers 310A to 310C representing a plurality of servers, which are each connected to a Data Layer fully interconnected network 340 and a Control Plane fully interconnected network 330.
  • each of the first to third servers 310A to 310C is bi-directionally coupled to the Data Layer fully interconnected network 340 and bi-directionally coupled to the Control Plane fully interconnected network 330.
  • the Data Layer fully interconnected network 340 exploits a wavelength dependent routing core whilst the Control Plane fully interconnected network 330 is as depicted and described below in respect of Figure 5B. Accordingly, as the Data Layer fully interconnected network 340 connects each input to every output, each server of the first to third servers 310A to 310C respectively is coupled to every other server of the first to third servers 310A to 310C respectively.
  • As the Data Layer fully interconnected network 340 is a non-centralized switched architecture, i.e. a physically distributed switch, latency is reduced, cost is reduced, and power consumption is reduced. In some embodiments of the invention where the Data Layer fully interconnected network 340 is a passive optical component the power consumption is zero.
  • As the Control Plane fully interconnected network 330 connects each input to every output, each server of the first to third servers 310A to 310C respectively is coupled to every other server of the first to third servers 310A to 310C respectively.
  • Where the Control Plane fully interconnected network 330 is a passive optical component the power consumption is zero.
  • Referring to Figure 4 there is depicted a schematic of a Data Layer fully interconnected network 340 employing a cyclic N x N arrayed waveguide grating router (AWGR) and its subsequent exploitation in Figure 5A within a Physically Distributed Optical Switch (PDOS) allowing tunable switching and dynamic reconfiguration.
  • Here N = 4 such that the AWGR 400 has four input ports 410A to 410D and four output ports 420A to 420D respectively.
  • Considering the first input port 410A then this receives one or more optical signals with wavelengths λ0 to λ3 such that these are routed to the first to fourth output ports 420A to 420D respectively, i.e. λ0→420A, λ1→420B, λ2→420C, λ3→420D.
  • The mappings for the remaining input ports are cyclic shifts of this pattern such that, for example, the third and fourth input ports are mapped as λ0→420C, λ1→420D, λ2→420A, λ3→420B and λ0→420D, λ1→420A, λ2→420B, λ3→420C respectively. Accordingly, as depicted in Figure 5A by tuning the wavelength of a signal coupled to an input port of the first to fourth input ports 410A to 410D, e.g.
  • the AWGR 400 supports, via optically tunable transmitters, the equivalent of a larger non-blocking switched interconnection between the transmitters and receivers.
  • the AWGR 400 removes, through its inherent wavelength routing characteristics between different input ports and the common output port array, substantial optical interconnection complexity, e.g. input and output arrays of 1:N and N:1 optical switches and a perfect shuffle interconnection of complexity N².
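The cyclic routing property can be modelled compactly; a common convention, assumed here since the exact mapping is device dependent, is that wavelength k entering input port i exits output port (i + k) mod N:

```python
def awgr_output_port(input_port, wavelength_index, n):
    """Cyclic N x N AWGR routing under the assumed (i + k) mod N
    convention: input port i on wavelength k exits port (i + k) mod N."""
    return (input_port + wavelength_index) % n

# For a 4x4 AWGR, on any single wavelength every input maps to a
# distinct output, so simultaneous transmissions do not collide.
n = 4
for k in range(n):
    outputs = {awgr_output_port(i, k, n) for i in range(n)}
    assert outputs == set(range(n))  # wavelength k is a permutation
```

Tuning a transmitter's wavelength therefore selects the destination port without any active switching element in the optical path.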
  • an input / output port may be configured with N Tx/Rx pairs whilst another sub-grouping may be configured with 2N Tx/Rx and yet another sub-grouping N/K, e.g. K = 2, Tx/Rx per port.
  • the HYSPDOS thereby leverages its functionality as a Virtual Load Bearing (VLB) switch and the HYSPDOS offers data centers a technology route to reduced hop count, reduced latency switching, and higher bandwidth than deployed commercial 2-tier spine-leaf centralized switching.
  • With a wavelength agile transmitter, data can be routed to the appropriate receiver by simply changing the wavelength of the transmitter. If the wavelength agile transmitter supports fast sub-nanosecond tuning (switching) then the resulting optical network supports dynamic reconfiguration at the packet level. If the wavelength agile transmitter supports multiple wavelengths simultaneously then the optical network supports unicast (one-to-one) routing as well as broadcast (one-to-many) distribution.
  • One such embodiment is a WDM array of laser based transmitters such that the same data signal can be modulated onto multiple WDM wavelengths and hence routed to the appropriate receivers.
  • the Hyperedge Signaled Physically Distributed Optical Switch (HYSPDOS) according to an embodiment of the invention depicted in Figure 3 in addition to the Data Layer fully interconnected network 340 employs a Control Plane fully interconnected network 330.
  • Whilst the Data Layer fully interconnected network 340 supports potentially rapid dynamic reconfiguration and both unicast / broadcast routing, the Control Plane fully interconnected networks 330 within the embodiments of the invention depicted here are based upon broadcast signaling methodologies.
  • the signaling information may be transported on a Control Plane fully interconnected network 330 that is essentially a replica of the Data Layer fully interconnected network 340 or with other network topologies supporting unicast and / or broadcast signaling communications.
  • Depicted in Figure 5B are first and second embodiments 500 and 550 respectively for a wavelength independent interconnection networking element.
  • First embodiment 500 is a passive distribution network formed from 3dB couplers 510 in ranks 520 which are interconnected through perfect shuffle networks 530.
  • Second embodiment 550 is a multimode interference interferometer (MMI) star coupler providing the same 8x8 broadcast capabilities.
  • the inventors have established an optical networking architecture, the Hyperedge Signaled Physically Distributed Optical Switch (HYSPDOS), which exploits AWGR elements with Fast Tunable Laser Source Switch (FASTLASS) to provide AWGR Dynamic Reconfigurable Graph Data center interconnection networks (ADRG-DINs) with discrete optical hyperedge signaling panels (DOEH-SPs).
  • As depicted in Figure 6, the data centers within a Tier are interconnected via first POXN (Hyperedge/AWGR) A 610 and second POXN B 620 whilst data centers across the Tiers are interconnected via third POXN C 630.
  • Each of the POXN is a HYSPDOS providing Optical Hyperedge Signaling, via the Control Layer data interconnection network 330 within the HYSPDOS, for control together with AWGR based Data Layer data interconnection network 340 within the HYSPDOS.
  • the inventors have considered, as the basis of the optical transmitters within the wavelength agile routing networks enabled by the AWGR devices within the Data Layer data interconnection network 340 of the HYSPDOS, designs that exploit semiconductor based picosecond tunable wavelength converters (TWCs) such as depicted in Figures 7A and 7B.
  • Referring initially to Figure 7A there is depicted a so-called 2R regenerator exploiting a semiconductor based picosecond tunable wavelength converter, 2R regenerators in the optical domain being considered to wavelength convert and optically amplify.
  • the optical wavelength converter converts an input signal of wavelength λ1 into an output signal of wavelength λ2.
  • the optical wavelength converter includes a saturable absorber switch (SATABS) 720 which is coupled to the optical input via circulator 710. Also coupled to the SATABS 720 is an optical emitter 730 operating at λ2 providing a signal at wavelength λ2 to the SATABS 720 together with the input signal at wavelength λ1.
  • Optical emitter 730 may be a tunable laser operable over the absorption region of SATABS 720 or an array of lasers each operable at a different wavelength within the operating window of the SATABS 720.
  • SATABS 720 may be coupled to a broadband light source with tunable filter, e.g. EDFA, supercontinuum light source, LED, etc.
  • In some embodiments of the invention the optical signals from optical emitter 730 and the input signal are of comparable powers whilst in other embodiments of the invention the optical emitter 730 is a high power signal or the input optical signal is high power or made high power via an optical amplifier, not shown for clarity.
  • the SATABS 720 generates an output signal at λ2 which is coupled via the optical circulator 710 to the output via filter 740.
  • filter 740 is a band filter to limit noise within the optical network or it may be a tunable optical filter to increase isolation of the input wavelength λ1 at the output.
  • Where SATABS 720 comprises a non-linear absorbing medium then under predetermined conditions, e.g. relatively low intensity light incident upon the non-linear absorbing medium, it is highly absorbing. However, upon illumination by a high intensity beam, the non-linear absorbing medium saturates, becoming less absorbing. An incident optical beam having an associated wavelength within the absorption region of the non-linear absorbing medium can saturate it (making it less absorbing) over its entire absorption range. Thus, it is possible for a high intensity optical beam of wavelength λ1 to switch another optical beam having a wavelength λ2 given that both wavelengths fall within the absorption band of the non-linear absorbing medium.
  • Equally, a low intensity optical beam of wavelength λ2 may be switched by another optical beam of wavelength λ1. Wavelength λ1 can be either greater or smaller than wavelength λ2 as what is important is the optical power level.
  • the 2R wavelength converter can achieve both “up” conversion and “down” conversion functions, where “up” conversion refers to a conversion from a low energy photon (i.e., long wavelength photon) to a high energy photon (i.e. short wavelength photon) and “down” conversion refers to the opposite.
  • Referring to Figure 7B there is depicted the 3R regenerator wherein, in addition to wavelength conversion and optical amplification (i.e., higher output power at the converted (output) wavelength than input power at the input wavelength), the 3R regenerator retimes (and / or reshapes) the optical signal.
  • the 3R regenerator similarly comprises input port, output port, circulator 710, SATABS 720, and filter 740.
  • the optical source at λ2 is now modulated with a clock signal such that the optical signal coupled to the SATABS 720 at λ2 is digital rather than CW such that now the SATABS 720 will only be transparent when both optical signals meet the appropriate condition.
  • the emitted signal at λ2 is now retimed and reshaped when compared to the input signal at λ1.
  • the SATABS 720 may be a semiconductor optical amplifier (SOA).
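The retiming behaviour described above can be illustrated with an idealized bit-level model in which the SATABS acts as an optical AND gate between the data signal at λ1 and the clock-modulated pump at λ2; this is a deliberately simplified abstraction of the device physics, not the disclosed implementation:

```python
def satabs_3r(data_bits, clock_bits):
    """Idealized 3R gate: the output at λ2 is high only when both the
    input data (λ1) and the clocked pump (λ2) are high, so output
    transitions are re-aligned to the clock edges."""
    return [d & c for d, c in zip(data_bits, clock_bits)]

# A data pulse that straddles a clock gap is cut back to clock timing.
print(satabs_3r([1, 1, 0, 1], [1, 0, 1, 1]))  # → [1, 0, 0, 1]
```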
  • Each Contention Reducer 800X as depicted by fourth Contention Reducer 800D comprises an optical demultiplexer (DMUX) 820 that separates the optical signal(s) from fourth port 420D into the discrete N wavelengths.
  • Each optical wavelength is then coupled to an avalanche photodiode (APD) / photodetector (PD) with transimpedance amplifier (TIA) combination, depicted as an N element opto-electronic converter array 830 which provides each serial electrical signal to series to parallel (S2P) converters / buffers 840.
  • the parallel data from the S2P converter / buffer 840 is then coupled to shuffle logic 845.
  • Shuffle logic 845 allows any suitable combination of the electrical data signals output from the N S2P converter / buffer 840 to be provided to the receiver 860D via electrical multiplexer (MUX) 850.
  • shuffle logic 845 may be omitted from a Contention Reducer such as Contention Reducer 800D as may optionally the buffer functionality within S2P converter / buffer 840.
  • the shuffle logic 845 and buffer functionality may be implemented according to an embodiment of the invention as a silicon field programmable gate array (FPGA) or application specific integrated circuit (ASIC) which is connected to the N channel array of APD(PD)/TIA.
  • This silicon FPGA / ASIC chip has a limited buffer size for each APD(PD)/TIA.
  • the Contention Reducer will send a signal back over the Control Plane fully interconnected network 330, depicted as star coupler 810, to all of the transmitters involved in transmission.
  • This reverse signalling via the Control Plane fully interconnected network suppresses the transmitters that are contending with the selected channel on the receiver, in this instance fourth receiver 860D.
  • the selection of the transmitter by a receiver may be established upon a range of conditions including, but not limited to, maintaining an already transmitting transmitter, randomly picking an active transmitter, and cycling active transmitters.
  • the selected transmitter is given continuous sending privilege for a predetermined period of time, e.g. 3 μs or 10 μs.
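The transmitter-selection conditions listed above might be sketched as follows; the policy names and tie-breaking details are illustrative assumptions rather than the disclosed logic:

```python
import random

def select_transmitter(contenders, previous=None, policy="maintain"):
    """Pick one transmitter ID from a set of contending transmitters.
    'maintain' keeps an already transmitting transmitter, 'random'
    picks any active one, 'cycle' rotates through active transmitters."""
    ordered = sorted(contenders)
    if policy == "maintain" and previous in contenders:
        return previous
    if policy == "cycle" and previous is not None:
        later = [t for t in ordered if t > previous]
        return later[0] if later else ordered[0]  # wrap around
    if policy == "random":
        return random.choice(ordered)
    return ordered[0]

print(select_transmitter({1, 2, 3}, previous=2, policy="maintain"))  # → 2
print(select_transmitter({1, 2, 3}, previous=3, policy="cycle"))     # → 1
```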
  • Referring to Figure 9 there is depicted an exemplary schematic of a receiver structure without a contention reducer for the Hyperedge Signaled Physically Distributed Optical Switch (HYSPDOS) according to an embodiment of the invention.
  • the transmitters, first to fourth transmitters 510A to 510D, are again connected via Data Layer fully interconnected network 340 to first to fourth receivers 520A to 520D respectively.
  • the transmitter(s) detect the contention signal and stop transmitting, restarting transmission at a predetermined offset of a plurality of predetermined offsets or according to a time stamp added / present within the signal routed back through the Control Plane fully interconnected network 330.
  • the time stamp may be added by a modulator within the tapped feedback path prior to the Control Plane fully interconnected network 330. Whilst the contention performance and throughput of the design depicted in Figure 9 may not be as high as that depicted in Figure 8, it is a more cost effective approach.
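One way to realize the staggered restart is a fixed per-transmitter offset; the slot duration and offset count below are purely illustrative, as the description leaves these parameters open:

```python
def restart_time(tx_id, contention_time_us, slot_us=1.0, num_offsets=4):
    """Each suppressed transmitter resumes at its own fixed offset
    after the contention signal, so restarts are staggered rather
    than simultaneous and do not immediately re-collide."""
    return contention_time_us + (tx_id % num_offsets) * slot_us

# Four transmitters suppressed at t = 10 us restart at distinct times.
print([restart_time(tx, 10.0) for tx in range(4)])  # → [10.0, 11.0, 12.0, 13.0]
```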
  • a wider bandwidth DMUX at the receiver side may be employed and each transmitter operates upon a subset number of the available transmitter ports.
  • By combining outputs to a single receiver non-blocking operation may still be implemented whilst reducing the contention frequency.
  • Such a scenario is depicted in Figure 10 wherein multiple but not fully populated opto-electronic reducers operate as discussed in respect of Figure 8 with a contention push back for a Hyperedge Signaled Physically Distributed Optical Switch (HYSPDOS) according to an embodiment of the invention.
  • each of the first to fourth Contention Reducers 1000A to 1000D has S2P converters / buffers 840 but their number, M, is now less than the number of channels, N, supported by the Data Layer fully interconnected network 340.
  • Some algorithms may be introduced within the HYSPDOS depicted in Figure 10 on the transmitting side, first to fourth transmitters 510A to 510D respectively, in order to reduce the contention possibility to an even lower level.
  • As the Data Layer fully interconnected network 340 allows multiple transmitters to connect to one receiver side PD/TIA and associated silicon logic, multiple transmitters from a group, e.g. all transmitters or a subset of the transmitters, have no means of determining that their transmission will yield contention for the PD/TIA/logic of a receiving side receiver, e.g. first to fourth receivers 520A to 520D, until they receive a contention signal back from a receiver via the Control Plane fully interconnected network 330.
  • the transmitters may be grouped and transmitters within the same group may employ different prioritizations for sending data between groups and hence for using the same wavelength to transmitters within another group. For instance, Tx1 in Group A may set Group A as one priority whereas Tx2 in Group A sets the priority for Group B at a different level. Hence, when the transmitters in Group A have data for sending upon the same wavelength they choose their first group differently, reducing the possibility of contention between transmitters within the same group.
  • Another method of avoiding contention in the same group is to employ an SOA-TWC such as depicted in Figures 7A and 7B respectively. Accordingly, in a first group, Group A, if Tx1 is transmitting, the SOA-TWC associated with Tx2 will convert the Tx2 transmitted wavelength to the first wavelength of a second group, Group B, and the SOA-TWC associated with Tx3 will convert the Tx3 transmitted wavelength to the first wavelength of a third group, Group C.
  • This method also can significantly reduce the contention ratio and with appropriate rules the contention may theoretically be reduced to zero, i.e. the HYSPDOS operates contention free.
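The intra-group contention-avoidance rule above can be sketched as a deterministic wavelength remapping; the group layout and indexing are assumptions made for illustration:

```python
def twc_target_wavelength(group, tx_index, wavelengths_per_group):
    """Tx 0 of a group keeps its own group's first wavelength; Tx 1's
    SOA-TWC converts onto the first wavelength of the next group,
    Tx 2 onto the group after that, and so on, so contenders within
    one group land on distinct wavelengths."""
    target_group = group + tx_index
    return target_group * wavelengths_per_group  # first wavelength of target

# Three contending transmitters in Group 0 land on distinct wavelengths.
print([twc_target_wavelength(0, tx, 4) for tx in range(3)])  # → [0, 4, 8]
```

With enough groups to absorb the worst-case number of simultaneous contenders, this remapping is what allows the contention ratio to approach zero as stated above.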
  • Referring to Figures 11A to 11C there are depicted first to third schematics 1100A to 1100C respectively in respect of alternate data center interconnections in a three-dimensional architecture according to an embodiment of the invention established by combining AWGR based Fast Tunable Laser Source Switch (FTLSS) and AWGR based Dynamic Reconfigurable Graph Data center Interconnection Network (DRGDIN) to provide Hyperedge Signaled Physically Distributed Optical Switch (HYSPDOS) based networks with a discrete Optical Hyperedge Signaling Panel (OPHYSIP) and in-plane torus networks.
  • In first schematic 1100A a plurality of data centers, which are denoted as Data center A 1140 and Data center B 1160, are configured into a plurality of tiers, depicted in first schematic 1100A as Tier 1 to Tier 8, each having associated with it Storage 1150, representing the internal memory storage within a data center.
  • the data centers may be visualized as a rectangular array even where their physical locations are not.
  • the data centers at two mutually perpendicular edges of the Tier are interconnected via first Data POXN (Hyperedge/AWGR 1) 1110, second Data POXN (Hyperedge/AWGR 2) 1130, and first Control POXN (Hyperedge/POXN) 1120 whilst data centers within a Tier are interconnected by first and second ring networks, Torus A 1170 and Torus B 1180 respectively, which provide virtual horizontal and vertical networks in a mesh network interconnecting the data centers upon a Tier.
  • Each of the first and second Data POXN 1110 and 1130 together with the Control POXN 1120 within a Tier are coupled through equivalent networks that connect all tiers along two mutually perpendicular faces of the rectangular prism formed by each tier of N x M data centers and the R tiers.
  • Figure 11B and second schematic 1100B depict a single Tier whilst in Figure 11C with third schematic 1100C a routing architecture is depicted according to an embodiment of the invention.
  • a first Data Center A (DCA 1) 1145A, rather than routing to a first Data Center B (DCB 1) 1155A via either second Data Center A (DCA 2) 1145B and third Data Center A (DCA 3) 1145C or fourth Data Center A (DCA 4) 1145D and second Data Center B (DCB 2) 1155B, routes in a manner to cross as many Data Centers as possible.
  • this routing is a spiral passing all data centers.
  • first to third schematics 1100A to 1100C represent servers and / or server racks within a single data center.
  • routing may be to pass as many Data Centers as possible without looping back or it may be set to pass all other nodes.
  • Other routing rules may be established including, but not limited to, shortest path.
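The "pass all other nodes" rule above can be illustrated with one concrete path over an N x M tier: a serpentine sweep that crosses every data center exactly once before reaching the destination (the layout and function below are illustrative assumptions, not the patent's specific spiral):

```python
# Sketch: generate a serpentine route over a rows x cols tier that visits
# every (row, col) data center exactly once, one concrete realisation of a
# "pass all nodes" routing rule.

def serpentine_route(rows: int, cols: int) -> list[tuple[int, int]]:
    route = []
    for r in range(rows):
        # Sweep left-to-right on even rows, right-to-left on odd rows.
        cs = range(cols) if r % 2 == 0 else range(cols - 1, -1, -1)
        route.extend((r, c) for c in cs)
    return route

path = serpentine_route(3, 4)
print(len(path))          # 12: every node of the 3 x 4 tier visited once
print(path[0], path[-1])  # (0, 0) (2, 3)
```

A shortest-path rule would instead minimise the hop count between the two endpoints; the two rules trade latency against traversal coverage.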
  • tiers may be interconnected on additional edges or that additional tier-tier interconnections may be provided throughout the "body" of the rectangular prism virtual construction.
  • Whilst linear bus networks are shown connecting the edges of the tiers, and ring networks within a tier, these networks as well as the additional "through" networks may be ring networks, torus networks, mesh networks, and what the inventors refer to as "cube" networks.
  • "Cube" networks are those connecting elements on multiple tiers which within a representation such as that depicted in Figure 6 or Figures 11A to 11C would be represented in three dimensions (3D) rather than one dimension (1D) or two dimensions (2D) within a tier or between tiers.
  • Such "cube" networks may include, for example, a network diagonally through the tiers from one corner to another or from one edge across multiple tiers to another edge along multiple tiers or a single tier.
  • an AWGR distributed optical switching methodology is presented wherein routing and channel selection are distributed to the edge of the switching fabric, leaving a purely passive interconnect in the core. Further, contention control and signaling via a second optical control plane exploiting a similarly purely passive core are presented and described. Also, this disclosure provides a method to expand the total size of networks constructed from the two distributed switch designs.
  • the transmitters connected to ports of the data layer fully interconnected network employ novel sub-nanosecond fast tunable light sources, e.g. SOA based tunable wavelength converters (TWCs), of 2R or 3R functionality to transmit the data packets.
  • TWCs may be replaced by fast tunable lasers, a broadband source with a fast tunable laser, or a fast receiver such as a fast tunable coherent receiver.
  • the disclosure allows designs for optical switching within data center interconnections that are physically distributed in a distance range of particular interest given today's large data centers, namely interconnections at the 1km and above range.
  • the current designs according to embodiments of the invention are possible as a result of the interconnection between the optical data layer and the sub-nanosecond distributed signaling panel, i.e. the Control Layer fully interconnected network, which is aligned with the distributed switch.
  • the contention control of mutual exclusivity is a significant challenge as control signals must traverse the distributed switch and hence incur significant propagation delay travelling end to end, approximately 5 µs per kilometre of fibre.
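As a rough illustration of that delay budget at the 1 km scale the disclosure targets (the fibre group index below is a typical assumed value, not taken from the disclosure):

```python
# Back-of-envelope check of control-plane signalling delay: light in silica
# fibre travels at roughly c / n_group, i.e. about 5 microseconds per km.

C_VACUUM_KM_PER_S = 299_792.458
FIBRE_GROUP_INDEX = 1.468  # typical value for standard single-mode fibre

def one_way_delay_us(fibre_km: float) -> float:
    """One-way propagation delay in microseconds over fibre_km of fibre."""
    return fibre_km / (C_VACUUM_KM_PER_S / FIBRE_GROUP_INDEX) * 1e6

print(round(one_way_delay_us(1.0), 2))   # 4.9 -> ~5 us for a 1 km span
print(round(one_way_delay_us(1.0) * 2))  # 10  -> ~10 us round trip
```

At such distances a request/grant handshake across the fabric costs tens of microseconds, which is why distributed, edge-resident contention control matters.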
  • the inventors exploit high speed reconfigurable AWGR based networking.
  • the performance of the interconnection network according to embodiments of the invention is better than an equivalent physical core switch.
  • the complete graph performance is close to that of a central core switch, at less than half the switch cost.
  • Switches according to embodiments of the invention provide for a 2N configuration plus a central switch.
  • the total wavelength count per fiber sets the port limit of the passive optical cross-connection and also limits the total port number of AWGR based data layer switches.
  • some methodology is needed to expand the total networking size of the disclosed distributed AWGR switch topologies.
  • the inventors exploit what they refer to as the "Cartesian product" to expand the scale of the networks.
  • the Cartesian product of switch connected nodes is topologically homomorphic with the Microsoft Butterfly-Cube (MSFT-BC).
  • MSFT-BC Microsoft Butterfly-Cube
  • the commodity small electronic switch in the MSFT-BC is replaced by a passive optical AWGR based physically distributed (hundreds to thousands of meters) switch.
  • the disclosed design has a lower diameter count (1 vs. 2 per dimension), zero power consumption, and better cost economy than the MSFT-BC design.
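The scaling effect of the Cartesian product can be sketched directly: the product of two fully connected k-node "switch" graphs yields k x k nodes, each reachable in at most one hop per dimension (the construction below is illustrative, not the patent's wiring plan):

```python
# Sketch of network scaling by graph Cartesian product. Two 4-node complete
# graphs (each a fully interconnected "switch") combine into a 16-node
# network where each node has degree (4-1) + (4-1) = 6.

from itertools import product

def cartesian_product_edges(n1: int, n2: int) -> set:
    """Edges of K_n1 x K_n2 (Cartesian product of two complete graphs)."""
    nodes = list(product(range(n1), range(n2)))
    edges = set()
    for (a1, a2), (b1, b2) in product(nodes, nodes):
        # Product rule: adjacent iff the nodes agree in one coordinate and
        # are adjacent (here: simply distinct, complete graphs) in the other.
        if (a1 == b1 and a2 != b2) or (a2 == b2 and a1 != b1):
            edges.add(frozenset({(a1, a2), (b1, b2)}))
    return edges

edges = cartesian_product_edges(4, 4)
print(4 * 4)       # 16 nodes from two 4-port "switches"
print(len(edges))  # 48 links: 16 * 6 / 2
```

This is the sense in which the port count multiplies per added dimension while the network diameter grows only by one hop per dimension.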

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Optical Communication System (AREA)

Abstract

Within data centers, the ratio of intra-data-center traffic to external traffic can be as high as 1000:1 for a single simple request. Within data centers, 90% of the traffic is intra-cluster. The folded Clos topology deployed according to the prior art scales cabling complexity as a quadratic function of the number of nodes. Accordingly, it would be beneficial for new optical fiber interconnection architectures to address traditional hierarchical routing and time division multiplexed (TDM) interconnection and to provide reduced latency, increased flexibility, lower cost, and lower power consumption, and to realize interconnections exploiting N x M x D Gb/s photonic interconnects, wherein N channels are established, each carrying M wavelength division signals at D Gb/s.
PCT/CA2015/000486 2014-09-09 2015-09-09 Réseaux d'interconnexion optique dynamique optiquement répartis à faible latence WO2016037262A1 (fr)

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
US201462047689P 2014-09-09 2014-09-09
US62/047,687 2014-09-09
US62/047,689 2014-09-09
US201462055962P 2014-09-26 2014-09-26
US62/055,962 2014-09-26

Publications (2)

Publication Number Publication Date
WO2016037262A1 true WO2016037262A1 (fr) 2016-03-17
WO2016037262A8 WO2016037262A8 (fr) 2016-04-21

Family

ID=55747593

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CA2015/000486 WO2016037262A1 (fr) 2014-09-09 2015-09-09 Réseaux d'interconnexion optique dynamique optiquement répartis à faible latence

Country Status (1)

Country Link
WO (1) WO2016037262A1 (fr)

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018183526A1 (fr) * 2017-03-29 2018-10-04 Fungible, Inc. Réseau de centre de données à maillage complet, sans blocage et ayant des dispositifs de permutation optique
WO2020014464A1 (fr) * 2018-07-12 2020-01-16 Panduit Corp. Maillage spectral spatial
US10637685B2 (en) 2017-03-29 2020-04-28 Fungible, Inc. Non-blocking any-to-any data center network having multiplexed packet spraying within access node groups
US10659254B2 (en) 2017-07-10 2020-05-19 Fungible, Inc. Access node integrated circuit for data centers which includes a networking unit, a plurality of host units, processing clusters, a data network fabric, and a control network fabric
US10686729B2 (en) 2017-03-29 2020-06-16 Fungible, Inc. Non-blocking any-to-any data center network with packet spraying over multiple alternate data paths
US10725825B2 (en) 2017-07-10 2020-07-28 Fungible, Inc. Data processing unit for stream processing
US10841245B2 (en) 2017-11-21 2020-11-17 Fungible, Inc. Work unit stack data structures in multiple core processor system for stream data processing
US10904367B2 (en) 2017-09-29 2021-01-26 Fungible, Inc. Network access node virtual fabrics configured dynamically over an underlay network
US10929175B2 (en) 2018-11-21 2021-02-23 Fungible, Inc. Service chaining hardware accelerators within a data stream processing integrated circuit
US10965586B2 (en) 2017-09-29 2021-03-30 Fungible, Inc. Resilient network communication using selective multipath packet flow spraying
US11048634B2 (en) 2018-02-02 2021-06-29 Fungible, Inc. Efficient work unit processing in a multicore system
US20210377634A1 (en) * 2019-07-15 2021-12-02 Yunqu Liu Remote data multicasting and remote direct memory access over optical fabrics
US11360895B2 (en) 2017-04-10 2022-06-14 Fungible, Inc. Relay consistent memory management in a multiple processor system
US12212495B2 (en) 2017-09-29 2025-01-28 Microsoft Technology Licensing, Llc Reliable fabric control protocol extensions for data center networks with unsolicited packet spraying over multiple alternate data paths
US12231353B2 (en) 2017-09-29 2025-02-18 Microsoft Technology Licensing, Llc Fabric control protocol for data center networks with packet spraying over multiple alternate data paths
US12278763B2 (en) 2017-09-29 2025-04-15 Microsoft Technology Licensing, Llc Fabric control protocol with congestion control for data center networks
US12294470B2 (en) 2017-09-29 2025-05-06 Microsoft Technology Licensing, Llc Fabric control protocol for large-scale multi-stage data center networks

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107623711B (zh) * 2016-07-15 2020-07-28 北京金山云网络技术有限公司 一种集群中主节点及从节点的分配方法及装置

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5903686A (en) * 1997-08-21 1999-05-11 Macdonald; Robert I. Optical switch module
US6973269B1 (en) * 2001-10-18 2005-12-06 At&T Corp. Metropolitan networks based on fiber and free space access distribution system
US20120321309A1 (en) * 2011-06-20 2012-12-20 Barry Richard A Optical architecture and channel plan employing multi-fiber configurations for data center network switching
EP1368923B1 (fr) * 2001-03-16 2013-04-24 Meriton Networks US Inc. Procede et appareil permettant d'interconnecter une pluralite de transducteurs optiques et un commutateur optique multiplexe en longueur d'onde
US20140016923A1 (en) * 2011-01-13 2014-01-16 Telefonica, S.A. Multilayer communications network system for distributing multicast services and a method for such a distribution

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5903686A (en) * 1997-08-21 1999-05-11 Macdonald; Robert I. Optical switch module
EP1368923B1 (fr) * 2001-03-16 2013-04-24 Meriton Networks US Inc. Procede et appareil permettant d'interconnecter une pluralite de transducteurs optiques et un commutateur optique multiplexe en longueur d'onde
US6973269B1 (en) * 2001-10-18 2005-12-06 At&T Corp. Metropolitan networks based on fiber and free space access distribution system
US20140016923A1 (en) * 2011-01-13 2014-01-16 Telefonica, S.A. Multilayer communications network system for distributing multicast services and a method for such a distribution
US20120321309A1 (en) * 2011-06-20 2012-12-20 Barry Richard A Optical architecture and channel plan employing multi-fiber configurations for data center network switching

Cited By (32)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10986425B2 (en) 2017-03-29 2021-04-20 Fungible, Inc. Data center network having optical permutors
US10425707B2 (en) 2017-03-29 2019-09-24 Fungible, Inc. Non-blocking, full-mesh data center network having optical permutors
WO2018183526A1 (fr) * 2017-03-29 2018-10-04 Fungible, Inc. Réseau de centre de données à maillage complet, sans blocage et ayant des dispositifs de permutation optique
US10637685B2 (en) 2017-03-29 2020-04-28 Fungible, Inc. Non-blocking any-to-any data center network having multiplexed packet spraying within access node groups
US11777839B2 (en) 2017-03-29 2023-10-03 Microsoft Technology Licensing, Llc Data center network with packet spraying
US10686729B2 (en) 2017-03-29 2020-06-16 Fungible, Inc. Non-blocking any-to-any data center network with packet spraying over multiple alternate data paths
US11632606B2 (en) 2017-03-29 2023-04-18 Fungible, Inc. Data center network having optical permutors
US11469922B2 (en) 2017-03-29 2022-10-11 Fungible, Inc. Data center network with multiplexed communication of data packets across servers
US11809321B2 (en) 2017-04-10 2023-11-07 Microsoft Technology Licensing, Llc Memory management in a multiple processor system
US11360895B2 (en) 2017-04-10 2022-06-14 Fungible, Inc. Relay consistent memory management in a multiple processor system
US11842216B2 (en) 2017-07-10 2023-12-12 Microsoft Technology Licensing, Llc Data processing unit for stream processing
US11824683B2 (en) 2017-07-10 2023-11-21 Microsoft Technology Licensing, Llc Data processing unit for compute nodes and storage nodes
US10659254B2 (en) 2017-07-10 2020-05-19 Fungible, Inc. Access node integrated circuit for data centers which includes a networking unit, a plurality of host units, processing clusters, a data network fabric, and a control network fabric
US10725825B2 (en) 2017-07-10 2020-07-28 Fungible, Inc. Data processing unit for stream processing
US11546189B2 (en) 2017-07-10 2023-01-03 Fungible, Inc. Access node for data centers
US11303472B2 (en) 2017-07-10 2022-04-12 Fungible, Inc. Data processing unit for compute nodes and storage nodes
US12294470B2 (en) 2017-09-29 2025-05-06 Microsoft Technology Licensing, Llc Fabric control protocol for large-scale multi-stage data center networks
US11412076B2 (en) 2017-09-29 2022-08-09 Fungible, Inc. Network access node virtual fabrics configured dynamically over an underlay network
US12261926B2 (en) 2017-09-29 2025-03-25 Microsoft Technology Licensing, Llc Fabric control protocol for data center networks with packet spraying over multiple alternate data paths
US11601359B2 (en) 2017-09-29 2023-03-07 Fungible, Inc. Resilient network communication using selective multipath packet flow spraying
US11178262B2 (en) 2017-09-29 2021-11-16 Fungible, Inc. Fabric control protocol for data center networks with packet spraying over multiple alternate data paths
US12278763B2 (en) 2017-09-29 2025-04-15 Microsoft Technology Licensing, Llc Fabric control protocol with congestion control for data center networks
US10904367B2 (en) 2017-09-29 2021-01-26 Fungible, Inc. Network access node virtual fabrics configured dynamically over an underlay network
US10965586B2 (en) 2017-09-29 2021-03-30 Fungible, Inc. Resilient network communication using selective multipath packet flow spraying
US12212495B2 (en) 2017-09-29 2025-01-28 Microsoft Technology Licensing, Llc Reliable fabric control protocol extensions for data center networks with unsolicited packet spraying over multiple alternate data paths
US12231353B2 (en) 2017-09-29 2025-02-18 Microsoft Technology Licensing, Llc Fabric control protocol for data center networks with packet spraying over multiple alternate data paths
US10841245B2 (en) 2017-11-21 2020-11-17 Fungible, Inc. Work unit stack data structures in multiple core processor system for stream data processing
US11734179B2 (en) 2018-02-02 2023-08-22 Fungible, Inc. Efficient work unit processing in a multicore system
US11048634B2 (en) 2018-02-02 2021-06-29 Fungible, Inc. Efficient work unit processing in a multicore system
WO2020014464A1 (fr) * 2018-07-12 2020-01-16 Panduit Corp. Maillage spectral spatial
US10929175B2 (en) 2018-11-21 2021-02-23 Fungible, Inc. Service chaining hardware accelerators within a data stream processing integrated circuit
US20210377634A1 (en) * 2019-07-15 2021-12-02 Yunqu Liu Remote data multicasting and remote direct memory access over optical fabrics

Also Published As

Publication number Publication date
WO2016037262A8 (fr) 2016-04-21

Similar Documents

Publication Publication Date Title
WO2016037262A1 (fr) Réseaux d'interconnexion optique dynamique optiquement répartis à faible latence
US20220150607A1 (en) Photonic switches, photonic switching fabrics and methods for data centers
US11012151B2 (en) Methods and systems relating to optical networks
US9705630B2 (en) Optical interconnection methods and systems exploiting mode multiplexing
US9509408B2 (en) Optical data transmission system
US20150098700A1 (en) Distributed Optical Switching Architecture for Data Center Networking
US8340517B2 (en) Systems and methods for on-chip data communication
US9621967B2 (en) Methods and systems for passive optical switching
CN106233672B (zh) 光交换系统与方法
Yan et al. Archon: A function programmable optical interconnect architecture for transparent intra and inter data center SDM/TDM/WDM networking
Marom et al. Optical switching in future fiber-optic networks utilizing spectral and spatial degrees of freedom
US9800472B2 (en) Network node connection configuration
US9383516B2 (en) System and method for optical input/output arrays
JP2002185482A (ja) Wdmを用いた透過型フォトニックスロットルーティングによる複合パケットスイッチング方法及びシステム
JP2015523827A (ja) 大容量ネットワークノード
US10382158B2 (en) Reversible wavelength channels for optical communication networks
Pal et al. RODA: A reconfigurable optical data center network architecture
Misawa et al. Broadcast-and-select photonic ATM switch with frequency division multiplexed output buffers
Papapavlou et al. Scalability analysis and switching hardware requirements for a novel multi-granular SDM/UWB 10 Pbps optical node
Chaintoutis et al. P-Torus: wavelength-based switching in packet granularity for intra-data-center networks
Aziz et al. Optical interconnects for data center networks
Jones Enabling technologies for in-router DWDM interfaces for intra-data center networks
Ben-Ezra et al. First WDM-SDM Optical Network with Spatial Sub-Group Routing ROADM Nodes Supporting Spatial Lane Changes
Duraisamy et al. POST: a scalable optical data center network
Mukherjee Optical‐Electrical‐Optical (O‐E‐O) Switches

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15839490

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 14.06.2017)

122 Ep: pct application non-entry in european phase

Ref document number: 15839490

Country of ref document: EP

Kind code of ref document: A1

点击 这是indexloc提供的php浏览器服务,不要输入任何密码和下载