Title
Electronic-Photonic Co-Design of Silicon Photonic Interconnects

Permalink
https://escholarship.org/uc/item/91q3v4hs

Author
Lin, Sen

Publication Date
2017

Peer reviewed|Thesis/dissertation
Electronic-Photonic Co-Design of Silicon Photonic Interconnects

by

Sen Lin

A dissertation submitted in partial satisfaction of the
requirements for the degree of
Doctor of Philosophy
in
Engineering – Electrical Engineering and Computer Sciences
in the
Graduate Division
of the
University of California, Berkeley

Committee in charge:

Professor Vladimir Stojanović, Chair
Professor Ming C. Wu
Professor Xiang Zhang

Fall 2017
Abstract

Electronic-Photonic Co-Design of Silicon Photonic Interconnects

by

Sen Lin

Doctor of Philosophy in Engineering – Electrical Engineering and Computer Sciences

University of California, Berkeley

Professor Vladimir Stojanović, Chair

Silicon photonic interconnects hold great promise in meeting the high-bandwidth and low-energy demands of next-generation interconnects. System-level driven electronic-photonic co-design is the key to improving the bandwidth density and energy efficiency. In this study, a comprehensive co-optimization framework is developed for high-speed silicon photonic transmitters utilizing compact models and a detailed optical simulation framework. Given technology and link constraints, microring and Mach-Zehnder transmitter designs are optimized and compared based on a unified optical phase shifter model. Non-return-to-zero (NRZ) and pulse-amplitude-modulation-4 (PAM-4) modulation schemes are analyzed and compared for microring-based transmitters. Using the co-design approach, a monolithic 40Gb/s optical NRZ transmitter based on microring modulators is designed and demonstrated in a zero-change 45nm CMOS SOI process. Electronic-photonic co-design with the high-swing driver enables this transmitter to achieve a total energy efficiency of 330fJ/b and a photonics and modulator driver area bandwidth density of 6.7 Tb/s/mm$^2$. This dissertation also discusses the design and demonstration of the first full silicon photonic interconnect on a 3D integrated electronic-photonic platform. These results make microring-based silicon photonic transceivers an attractive solution for next-generation inter- and intra-rack photonic interconnects. Finally, a short-reach laser-forwarding coherent link architecture is proposed to further improve the energy efficiency of silicon photonic interconnects. The key concepts of the proposed architecture are verified experimentally with microring-based silicon photonic transmitters. The architecture reduces the laser power by 6-7.5x and could enable complex modulation schemes for future short-reach optical links.
# Contents

1 Introduction

2 Background
   2.1 Silicon Photonic Interconnects
       2.1.1 Silicon photonic modulator and photodetector
       2.1.2 WDM link architectures
       2.1.3 Coherent optical links
   2.2 Silicon Photonics Platforms
       2.2.1 Monolithic silicon photonics platform
       2.2.2 3D integrated silicon photonics platform
   2.3 Challenges and Opportunities
       2.3.1 Co-optimizing photonics and electronics
       2.3.2 Pushing speed limits of photonics transmitters
       2.3.3 Reducing the high power consumption of laser sources

3 Electronic-photonic Co-Optimization of Silicon Transmitter
   3.1 Overview of Co-optimization Framework
   3.2 PN-Junction-Based Optical Phase Shifter
       3.2.1 Compact model of optical phase shifter
       3.2.2 Model verification on different platforms
   3.3 Optimization of Microring-based Transmitter
       3.3.1 Static model of microring modulator
       3.3.2 Transient simulation in Simulink
       3.3.3 Optimization of microring modulator design
       3.3.4 Microring-based NRZ transmitter design
       3.3.5 Microring-based PAM4 transmitter design
   3.4 Optimization of Mach-Zehnder Transmitter
       3.4.1 Overview of Mach-Zehnder modulator
       3.4.2 Multi-stage Mach-Zehnder transmitter
       3.4.3 Traveling-wave Mach Zehnder transmitter
   3.5 Comparisons
   3.6 Summary

4 High-speed Monolithic Silicon Photonics Transmitters
   4.1 Microring-based Transmitter Design Challenges
   4.2 Improved Design of Microring Modulator
   4.3 Design of High-speed Transmitter Circuits
       4.3.1 NRZ transmitter with AC-coupled driver
       4.3.2 NRZ transmitter with single-ended driver
       4.3.3 PAM4 transmitter
List of Figures

2.1 A single channel silicon photonic interconnect based on microrings. [1]
2.2 System diagram of a microring-based WDM optical link. [1]
2.3 Cross-section view of the monolithic silicon photonic platform in 45nm SOI CMOS process. [1]
2.4 Photos of the first silicon photonic processor chip in 45nm SOI CMOS process.
2.5 Cross-section and top view of the electronic-photonic wafer [5]
3.1 The flowchart of the co-optimization framework for silicon photonics transmitters. The denotations here are used in derivations in the study.
3.2 Mach-Zehnder and microring modulators based on different PN junction phase shifters. Three common types of phase shifters are listed with top view or cross-section view. The corresponding feature lengths ($L_j$) for these PN junctions are listed as well.
3.3 Modulation efficiency $V_\pi L_\pi$ vs. junction doping level. Dashed lines are predicted $V_\pi L_\pi$ at -1V reverse bias when $L_j/\gamma$ equals 200nm, 400nm, 600nm and 800nm. Reported data on various silicon photonics platforms P1-P8 are marked here. Average concentration of n-type and p-type doping is used.
3.4 Modulation efficiency $V_\pi L_\pi$ vs. reverse bias voltage for phase shifters on three different platforms P4, P5, P7. Detailed information is included in Table 1. Note that [5] refers to the interleaved phase shifter on that platform.
3.5 Reported waveguide loss vs. predicted waveguide loss. The references for each data point are labeled in the figure as P1-P8.
3.6 Micrograph of microring modulator in zero-change 45nm SOI process [2]. Model diagram of a microring modulator with drop port, where coupler and propagation coefficients for electric fields are labeled.
3.7 Modeled power transmission spectra of microring modulator under two different bias voltages. Optimal laser wavelength to maximize OMA is labeled. For phase shifter model, we assumed that $N_A = N_D = 10^{18}$ cm$^{-3}$, $L_j = 500$ nm and $\gamma = 0.75$. The Q factor of this microring modulator is 7700. FSR of this microring is around 20nm.
3.8 Schematic of MRM-based optical link in Simulink and the close-ups of microring modulator (MRM) block and the phase shifter (PS) block.
3.9 Simulated eye diagram at 25 Gb/s. Device parameters are the same as the microring in Fig. 3.7 with optical bandwidth of around 25GHz. Laser detuning is set to optimize OMA. A first-order low-pass filter approximation with 3dB bandwidth of 20GHz is represented with red dashed line.
3.10 Optimized OMA for $f_{optical}$ of 25GHz, 35GHz and 50GHz versus doping levels in the PN junction. Bias conditions are $V_0 = 0.5$ V and $V_1 = -1.5$ V. Technology constraints: PN junction feature length $L_j = 500$ nm and optical mode confinement factor $\gamma = 0.75$. $L_{rt} = 30 \mu m$, intrinsic loss 13 dB/cm [3]. Symmetric PN junctions are assumed for simplicity.

3.11 Key characteristics of the optimal microring designs for different doping levels with the design points corresponding to maximum OMAs labeled. The optimization constraints correspond to the curves in Fig. 3.10. Three operation regions (A-C) are labeled for 25GHz operation as an example. A is the coupling-limited region, C is the loss-limited region and B is the optimal region. Note that $t_1$ and $t_2$ are transmission coefficients at the couplers. Stronger coupling means smaller $t_1$ and $t_2$.

3.12 Transmitter circuits for ring modulator. $C_w$ is the wire and packaging parasitic capacitance, and $C_m$ is the modulator junction capacitance.

3.13 Diagram of a full optical link with external laser source.

3.14 Model-estimated total E/b for microring driver+laser for microring-based NRZ transmitter at 50Gb/s. Two different driver swings are considered (1V and 2V). The microring is optimized for each doping level, which corresponds to the designs in Fig. 3.10 and 3.11.

3.15 Microring-based PAM4 Transmitters: (a) electrical DAC driver (b) optical DAC based on segmented microring (c) microring with two segments.

3.16 (a) Linearity comparison between 5-bit electrical DAC and 5-bit optical DAC for MRM-based transmitter. (b) Transmission spectra for a microring with two binary weighted segments. The microring here has an optical bandwidth of 25GHz.

3.17 Model-estimated total E/b for microring driver+laser for microring-based PAM4 transmitter at 50Gb/s. Two different driver swings are considered (1V and 2V). The microring is optimized for each doping level, which corresponds to the designs in Fig. 3.10 and 3.11.

3.18 Transient simulation of the 50Gb/s MRM-based NRZ transmitter and PAM4 transmitter. The microrings are optimized in each case with the same process and link constraints. The optical power for y-axis is normalized to the input power for microrings.

3.19 (a) Architecture of Multi-stage Mach-Zehnder Modulator (MS-MZM) (b) Architecture of Traveling wave Mach-Zehnder Modulator (TW-MZM)

3.20 Optimization results for multi-stage MZM transmitters at 50Gb/s with two peak-to-peak voltage swings (1V and 2V). (a) the total transmitter E/b, (b) laser E/b, (c) the optimal arm length $L_{opt}$.

3.21 Optimization results for traveling-wave MZM transmitters at 50Gb/s with three differential peak-to-peak voltage swings $V_{TW}$ (0.6, 0.8, 1.0V). (a) the total transmitter E/b, (b) laser E/b, (c) optimal arm length L.

3.22 Detailed energy breakdown and energy efficiency comparison between optimized (A) NRZ-MRM, (B) PAM4-MRM, (C) MS-MZM and (D) TW-MZM transmitters at 50Gb/s. Three different link margins are considered: 0dB, 3dB and 6dB.
3.23 Energy efficiency comparison between optimized NRZ-MRM, MS-MZM and TW-MZM transmitters at different data-rates. The gray line shows the receiver sensitivity vs data-rate according to measurement in [4].

4.1 Bandwidth and OMA trade-off of microring modulator
4.2 Microring modulator design considerations
4.3 Critical coupling condition of microring modulator, and the power coupling coefficient extracted from Lumerical FDTD simulation.
4.4 Block diagram of the AC-coupled NRZ transmitter
4.5 Block diagram of the single-ended NRZ transmitter
4.6 Block diagram of the PAM4 transmitter
4.7 System diagram of the digital PLL
4.8 Block diagram of the digital PLL
4.9 Block diagram of the high-speed divider of the digital PLL
4.10 Block diagram of the DCO of the digital PLL
4.11 Layout of the digital PLL and its sub-blocks
4.12 Die photo of test chip and sub-blocks
4.13 Measured transmit 20Gb/s and 40Gb/s NRZ eye-diagrams and dynamic IL/ER
4.14 Power measurement and breakdown for 20Gb/s and 40Gb/s NRZ
4.15 Performance comparison between single-ended and AC-coupled NRZ transmitters
4.16 Measured PAM4 eye diagram and transmit waveform
4.17 Performance comparison to prior works on high-speed silicon photonics transmitters

5.1 Cross-section of 3D heterogeneous integration process
5.2 Photonic and CMOS die views and Multicell architecture
5.3 Optical transmitter schematic and die photos.
5.4 Measured modulator transmission spectrum and eye diagram.
5.5 3D render and die photo of Ge photodetector.
5.6 Optical receiver schematic.
5.7 Measured photodiode responsivity over 100 nm wavelength range and its frequency response (with 50Ohm load) for different bias voltages.
5.8 Measured receiver average photo-current sensitivity over different data rates and BER bathtub curves for both receiver slices
5.9 System diagram of the thermal tuning loop.
5.10 Automatic thermal locking process with measured eye diagrams.
5.11 Link budget of the full optical link.
5.12 Lab setup for full optical link testing.
5.13 Full optical link BER performance.
5.14 Electrical energy breakdown for TX and RX macros in a 5Gb/s link.
5.15 Comparison to previous works

6.1 Schematic of balanced detection. Signals $L, S, X_1$ and $X_2$ are complex numbers and represent the electric field of the lightwave. $i(t)$ is the final output current. Bias voltages are set to ensure the same reverse biases across the two identical photodiodes.
6.2 Proposed laser-forwarding architectures (a): a single laser source is shared between Chip 1 and Chip 2
6.3 Proposed laser-forwarding architectures (b): each chip has its own co-located laser source and forwards it to the receiving chip. Blue lines are the signal transmission fibers and red lines are the laser distribution fibers.

6.4 System diagram of the receiver in the proposed laser forwarding link architecture.

6.5 (a) BER vs. laser output power at 50Gbps in the noise-limited regime. The proposed LF-BPSK architecture could reduce laser power by 7.3x compared with conventional IMDD link. (b) BER vs. laser output power in the swing-limited regime. The proposed LF-BPSK architecture could reduce laser power by 6.8x compared with IMDD. No link margin is considered for optical power estimation.

6.6 (a) Probability density function (PDF) of the received signal $y$ in an IMDD receiver, conditioned on the transmitted bit, ZERO or ONE. (b) PDF of $y$ in a BPSK link. $I_{th}$ represents the minimum input swing requirement imposed by the data sampler.

6.7 (a) To achieve $10^{-12}$ BER at 50 Gbps, required laser output power versus coupler loss for different link architectures, (b) To achieve $10^{-12}$ BER at 50 Gbps, required laser output power versus sampler-limited swing for different link architectures.

6.8 Noise distribution with laser phase noise effects. The blue margin is reserved for sampler-limited swing $I_{th}$. The red margins on the two sides are chi-squared noise caused by laser phase noise. The sum of thermal noise and shot noise obeys a Gaussian distribution with variance $\sigma^2_0$. The peaks of the conditioned PDF are moved closer due to the added margin $m_p$ for phase noise effects.

6.9 Estimated upper bound of BER for a 50 Gbps LF-BPSK link, considering laser phase noise, shot noise, thermal noise and sampler-limited swing requirement.

6.10 From left to right are an SEM image of a microring modulator fabricated in zero-change 45nm CMOS process [6], model diagram of the microring, and the measured and modeled transmission spectra.

6.11 Left is the transmission spectra and phase response of a microring modulator, where the two dashed lines represent nominal laser wavelength for the two modulation schemes. Right is the phasor diagram of a microring modulator marked with modulation trajectories.

6.12 System diagram of microring-based laser-forwarding coherent link. On-chip 3dB splitting is used in this configuration. $P_L$ is the laser output power, $\alpha_c$ is coupler loss, $\phi_{OS}$ is tunable phase offset for noise tracking and eye optimization.

6.13 Left is the contour of the received signal in LF-BPSK vs. phase offset $\phi_{OS}$ and laser wavelength. The optimal point is marked with the black dot and the optimal phase offset with the dashed line. Right is the received signal for IMDD and LF-BPSK vs. laser wavelength, assuming an optimal $\phi_{OS}$.

6.14 Simulink schematics for optical link simulation. Top schematic is for microring modulator based IMDD link. Bottom schematic is for microring modulator based laser forwarding coherent link. The simulation framework supports all basic optical devices such as laser, modulator, photodetector, coupler and splitter.
6.15 Transient simulation results for microring modulator in BPSK mode. From the top to bottom are the waveforms of drive voltage, amplitude of the modulated signal and phase of the modulated signal.

6.16 Simulated eye diagram for IMDD link and laser forwarding BPSK link. Note the laser power is set the same and the scale for signal amplitude is the same.

6.17 Proposed microring-based WDM coherent link architecture with laser forwarding configuration. Microring-based modulators and filters are used for energy-efficient modulation and intrinsic wavelength selectivity. In this example, on-chip optical power splitting between LO and signal is adopted. Off-chip optical power splitting can also be used.

6.18 Measurement setup of NRZ link and laser-forwarding coherent link.

6.19 Comparison between the eye diagrams of NRZ link and LF-BPSK link. Common conditions: bias condition $V_b = 0.5\,\text{V}$, measurement duration $= 25\,\mu\text{s}$, samples $= 1\,\text{Mpts}$. (a) NRZ modulation, laser output power $P_L = 4\,\text{dBm}$, (b) LF-BPSK, laser output power $P_L = 0\,\text{dBm}$.

6.20 Comparison between the eye diagrams of NRZ link and LF-BPSK link. Common conditions: bias condition $V_b = 1.0\,\text{V}$, measurement window $= 25\,\mu\text{s}$, samples $= 1\,\text{Mpts}$. (a) NRZ modulation, laser output power $P_L = 4\,\text{dBm}$, (b) LF-BPSK, laser output power $P_L = 0\,\text{dBm}$.

6.21 LF-BPSK measurement setup with two standalone receivers.

6.22 The output waveforms of the two receivers in the modified LF-BPSK setup at 10Gbps.

6.23 Comparison between the 10Gb/s eye diagrams of NRZ link and LF-BPSK link. Bias condition $V_b = 1.0\,\text{V}$ (a) 10Gb/s NRZ modulation, laser output power $P_L = 10\,\text{dBm}$, (b) 10Gb/s LF-BPSK, laser output power $P_L = 7\,\text{dBm}$.

6.24 (a) 1Gb/s LF-BPSK eye diagram, measurement window $= 25\,\mu\text{s}$, samples $= 1\,\text{Mpts}$; (b) 1Gb/s LF-BPSK eye diagram, measurement window $= 50\,\mu\text{s}$, samples $= 2\,\text{Mpts}$.

6.25 (a) 3Gb/s LF-QPSK eye diagram, DAC code $= 4/9/15$, measurement window $= 25\,\mu\text{s}$, samples $= 1\,\text{Mpts}$; (b) 3Gb/s LF-QPSK eye diagram, DAC code $= 4/9/15$, measurement window $= 50\,\mu\text{s}$, samples $= 2\,\text{Mpts}$; (c) 3Gb/s LF-QPSK eye diagram, DAC code $= 5/10/15$, measurement window $= 25\,\mu\text{s}$, samples $= 1\,\text{Mpts}$; (d) 3Gb/s LF-QPSK eye diagram, DAC code $= 5/10/15$, measurement window $= 50\,\mu\text{s}$, samples $= 2\,\text{Mpts}$. 
Chapter 1

Introduction

Today's computers are largely limited by communication bandwidth at every level of the system hierarchy: processor chip, server blade, rack, and data center. Silicon photonics co-packaged or integrated with large-scale systems-on-a-chip holds great promise in meeting the high-bandwidth and low-energy demands of machine-learning-driven next-generation data-center and high-performance computing interconnects. In particular, silicon photonics has stepped up as a clear contender for the next-generation 400G inter-rack interconnects and 100G intra-rack interconnects in data centers. Recent years have seen great efforts and rapid progress in developing and commercializing silicon photonics technologies, from platforms, devices, and circuits to systems [1-33]. One recent milestone is the 2015 demonstration of the first single-chip computer that communicates directly with light, based on monolithic photonics [2]. This is an example of how the close integration between photonics and electronics could enhance CMOS capabilities and open the door to new innovations as transistor scaling slows down in the post-Moore’s law era.

However, despite this progress, many challenges remain in the field of silicon photonic interconnects. This dissertation attempts to address three of them. First, close electronic-photonic integration requires a new, system-oriented co-design methodology. Second, the silicon photonics transceivers reported so far, especially the monolithic transmitters, have limited data rate and energy efficiency, mainly due to a lack of co-optimization. Third, due to the limited wall-plug efficiency and integration density of lasers, laser power has become the bottleneck of the overall system energy efficiency. A detailed description of these challenges, as well as the research background, is given in Chapter 2. The challenges are addressed one by one in Chapters 3 to 6, with innovative approaches ranging from modeling and chip-level implementation to system architecture.

In Chapter 3, the author presents a new co-optimization and verification framework to address the challenge of system-oriented electronic-photonic co-design. This framework enables engineers to optimize high-speed silicon photonics transmitters in the context of a practical optical link. It is applicable to most of today’s silicon photonics platforms that rely on PN junction based phase shifters. The author also presents the co-design methodology and co-design techniques for high-speed transmitters and the in-depth comparison between the different modulator types and modulation schemes. The work is published as "Electronic-Photonic Co-Optimization of High-Speed Silicon Photonic Transmitters", Sen Lin, Sajjad Moazeni, Krishna T. Settaluri, Vladimir Stojanović, Journal of Lightwave Technology, 2017.

Next, the author addresses the second challenge and applies this co-design methodology to chip-level high-speed optical interconnect designs based on state-of-the-art silicon photonics platforms. In Chapter 4, the author demonstrates a 40Gb/s optical non-return-to-zero (NRZ) transmitter and a 40Gb/s optical pulse-amplitude-modulation (PAM) transmitter. Both transmitter designs use monolithic silicon microring modulators in a standard 45nm CMOS SOI process. The NRZ transmitter (including a serializer and modulator driver) achieves a total energy efficiency of 330fJ/b and a bandwidth density of $6.7\text{Tb/s/mm}^2$ at 40Gb/s. To our knowledge, it is the fastest and most energy-efficient monolithic optical transmitter demonstrated to date. In this work, the author designed the circuits and optimized the modulators for high-speed transmitters. In Chapter 5, the author presents the first demonstration of a complete silicon photonic interconnect on a 3D integrated electronic-photonic platform. The key circuit blocks for wavelength-division-multiplexing (WDM) architectures are demonstrated along with a state-of-the-art silicon photonic modulator and photodetector. This work was done in collaboration with Krishna Settaluri, Sajjad Moazeni, Chen Sun, Erman Timurdogan, Michele Moresco, Zhan Su, Yu-Hsin Chen, Gerald Leake, Douglas LaTulipe, Colin McDonough, Jeremiah Hebding and Douglas Coolbaugh. It is published as "Demonstration of an Optical Chip-to-Chip Link in a 3D Integrated Electronic-Photonic Platform", European Solid-State Circuits Conference (ESSCIRC), 2015. The author’s main contribution is the design of the transmitter circuits and the full-chip integration of the WDM test chip.

The last part of the study looks beyond conventional silicon photonic interconnects and explores the feasibility of coherent optical communications with silicon photonics. In Chapter 6, the author takes a system-level approach and addresses the laser power challenge by proposing a short-reach laser-forwarding coherent architecture. The key concepts of the proposed architecture are verified both analytically and experimentally. The proposed architecture reduces the laser power by a factor of 6-7.5 and could enable complex coherent modulation for future short-reach optical links.
Chapter 2

Background

2.1 Silicon Photonic Interconnects

Silicon photonics can potentially achieve lower energy and higher bandwidth density than traditional electrical I/O. Additionally, optical links benefit from distance insensitivity due to the inherently low loss of fibers, allowing for new types of connectivity and network organization in modern digital systems and data centers. Wavelength-division multiplexing (WDM) can also be used to place many data channels on a single optical fiber, thereby increasing the bandwidth density while retaining energy efficiency and breaking the I/O pin limitations imposed by the electronics.

Compared with conventional optical interconnects, silicon photonic interconnects reduce manufacturing cost dramatically, as the modulators and photodetectors are fabricated on standard silicon wafers instead of very expensive III-V wafers. In addition, silicon photonics is generally compatible with CMOS processes, enabling large-scale integration between CMOS circuits and photonic devices, such as monolithic integration and 3D integration. Silicon photonic interconnects achieve high bandwidth density and high energy efficiency through close electronic-photonic integration.

2.1.1 Silicon photonic modulator and photodetector

On the transmitter side, three types of silicon photonic modulators are of most interest: the microring modulator (MRM), the Mach-Zehnder modulator (MZM) and the electro-absorption modulator (EAM). Microring modulators and EAMs are much more energy efficient than MZMs due to their compact sizes. Among these three types, the microring modulator is the only one that has inherent wavelength selectivity, and thus it has unmatched potential for future terabit-per-second dense wavelength division multiplexing (DWDM) links. Therefore, we choose silicon microring modulators as the primary device for our high-speed link designs on both monolithic and 3D heterogeneous platforms.

Fig. 2.1 shows the conceptual diagram of a single-channel silicon photonic link based on microrings. A microring modulator is a resonant photonic device that operates as a notch filter in the optical spectrum near the laser wavelength. It modulates the incoming laser light by shifting its own resonance in and out of the laser wavelength. The resonance can be shifted by high-speed voltage drivers for high data rate modulation. Two other important specifications for an optical transmitter are insertion loss (IL) and extinction ratio (ER) (defined as \( T_1 / T_0 \)), as labeled in the same figure. Small IL and high ER help increase the optical modulation amplitude (OMA) (defined as \( T_1 - T_0 \)) or reduce the required laser power in the link. Because receiver sensitivity degrades as the data rate increases, a larger transmit OMA is required at higher data rates.
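
As an illustration of these definitions, the following Python sketch evaluates ER, OMA and IL for an idealized notch filter. The Lorentzian line shape, the function names, and all numerical values (25GHz linewidth, 5% on-resonance transmission, 5GHz and 25GHz laser detunings) are assumptions made for this example and are not taken from the devices in this dissertation. IL is assumed here to be the loss of the ONE level, $-10\log_{10} T_1$, which the text above does not define explicitly.

```python
import math


def notch_transmission(detuning_ghz, linewidth_ghz, t_min):
    """Power transmission of an idealized Lorentzian notch (assumed line shape).

    detuning_ghz:  laser frequency minus resonance frequency
    linewidth_ghz: full width at half maximum of the notch
    t_min:         transmission exactly on resonance (between 0 and 1)
    """
    x = 2.0 * detuning_ghz / linewidth_ghz
    return 1.0 - (1.0 - t_min) / (1.0 + x * x)


def modulation_metrics(t1, t0):
    """Return (ER in dB, IL in dB, OMA) from normalized ONE/ZERO transmissions.

    ER = T1 / T0 and OMA = T1 - T0 follow the definitions in the text.
    IL = -10*log10(T1) is an assumed convention for this sketch.
    """
    er_db = 10.0 * math.log10(t1 / t0)
    il_db = -10.0 * math.log10(t1)
    oma = t1 - t0
    return er_db, il_db, oma


if __name__ == "__main__":
    # Hypothetical operating point: the driver shifts the resonance so that
    # the laser sits 5 GHz (ZERO) or 25 GHz (ONE) away from the notch centre.
    t0 = notch_transmission(5.0, 25.0, 0.05)
    t1 = notch_transmission(25.0, 25.0, 0.05)
    er_db, il_db, oma = modulation_metrics(t1, t0)
    print(f"T0={t0:.3f}  T1={t1:.3f}  ER={er_db:.2f} dB  "
          f"IL={il_db:.2f} dB  OMA={oma:.3f} (normalized to input power)")
```

Sweeping the two detunings in such a model shows the trade-off described above: a larger resonance shift raises $T_1$ (lower IL), while placing the ZERO level closer to resonance lowers $T_0$ (higher ER), and both increase the normalized OMA.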

In Fig. 2.1, the optical receiver can use a microring filter to receive the optical signal at a specific laser wavelength for WDM operation. The drop port of the optical filter is connected to a photodetector to convert optical signals into electrical signals. The photodetectors are typically built in Ge or SiGe because of the good compatibility of these materials with standard CMOS processes. Ge photodetectors can be used for both O-band and C-band receivers, and they can typically achieve close to 1A/W responsivity in C-band [53]. SiGe photodetectors can only be used for O-band and shorter wavelengths, with much lower responsivity. SiGe already exists in standard CMOS processes as a strain-engineering material to improve the carrier mobility, enabling monolithic integration of silicon photonics into a standard CMOS process with zero change [2].

2.1.2 WDM link architectures

Fig. 2.2 shows the system diagram of a microring-based WDM optical link, where a bank of microrings is used for both the transmitter channels and the receiver channels. The microring modulator and filter operate in pairs, as depicted in the single-channel diagram. Ring tuning control is implemented for all the transceiver channels to align the resonance wavelength of the microrings and lock them to the corresponding laser wavelength. Ring tuning control is critical to microring operation, as it calibrates out process variation and compensates for dynamic temperature fluctuations. Detailed implementation and measurement results of a microring thermal tuner are discussed in Chapter 5. The number of channels in a WDM link is mainly limited by the cost of the WDM laser source, which motivates our study on pushing the data rate boundary of silicon photonic interconnects.

2.1.3 Coherent optical links

The link architectures mentioned in Sections 2.1.1 and 2.1.2 are based on amplitude modulation of the optical signal. Optical signals can carry information on both the amplitude and the phase, which allows more complex high-order modulation schemes and higher spectral efficiency. Studies also show that coherent detection schemes could require far fewer photons per bit than intensity modulation direct detection (IMDD) schemes [7–9].

To date, the focus of silicon photonic interconnects research has been on non-coherent optical interconnects. The short-reach optical interconnect standards for data centers are all non-coherent (e.g., 100G-SR4 and 100G-PSM4), which are based on the same intensity modulation direct detection (IMDD) architecture. In practice, coherent optical communication has been largely limited to long haul and metro applications due to its high cost, power, and complexity. For long-reach communication, the primary goal is to achieve high spectral efficiency for each optical channel. Silicon photonics transmitters with high-order modulation schemes such as quadrature amplitude modulation (QAM) have been demonstrated [10–12]. On the receiver side, the major challenge has been optical-carrier phase tracking [7], which requires high speed analog-to-digital converters (ADC) and digital signal processing (DSP). In contrast, cost and energy efficiency are the primary concerns for short-reach optical communication. As a result, coherent optical communication suitable for short-reach applications has yet to be demonstrated. These challenges motivate us to look into new link architecture to enable low-cost short-reach coherent optical communication with silicon photonics. We propose and demonstrate a new, laser forwarded coherent link architecture tailored for low-cost energy efficient short-reach optical communication in Chapter 6.

2.2 Silicon Photonics Platforms

2.2.1 Monolithic silicon photonics platform

Silicon photonic interconnects benefit from the close integration between photonics and electronics in terms of performance, power and cost. Monolithic integration achieves single-chip electronic-photonic integration and has been demonstrated in different SOI and bulk processes. One recent milestone in monolithic integration is the demonstration of the first single-chip computer with photonic I/O in the GF 45nm SOI process [2].

Fig. 2.3 shows the cross-section view of the monolithic silicon photonics platform in 45nm SOI CMOS process [2]. The core of the waveguide is based on the same crystalline silicon layer used for transistors, which has a thickness of 80-100nm. Silicon has a much higher refractive index (~3.45) than silicon oxide (~1.45), and with this high index contrast the optical mode can be well confined in the waveguide with very low propagation loss (3-4dB/cm). Low-level metal routings are typically blocked above the silicon waveguide to avoid excess loss. The thickness of the buried oxide is around 150nm. To prevent light from leaking into the substrate, the silicon substrate is removed with XeF$_2$ dry etching. The author uses the technique to do both full and partial release of the processor chip [2]. This platform allows the integration of photonics and VLSI systems and opens the door to many opportunities in complex electro-optical systems. Based on this platform, we have successfully demonstrated the fastest and most efficient monolithic silicon photonics transmitters in Chapter 4 as well as the concept of the laser-forwarding coherent link in Chapter 6.

![Cross-section view of the monolithic silicon photonics platform in 45nm SOI CMOS process.][1]

2.2.2 3D integrated silicon photonics platform

Although monolithic silicon photonics is a very promising technology, it has its own challenges in the short term. First, integration into nodes below 28nm is not possible using the crystalline Si layer (typically used for silicon photonics) due to the reduced thickness of the epi layers (below 28nm, only thin-body FDSOI and FinFET devices are available). Second, it can impose additional constraints on photonics design and limit the performance of photonic devices, in particular the modulation efficiency of the modulator and the responsivity of the photodetector.

Heterogeneous integration overcomes some of these limitations by decoupling the photonic process from the CMOS process. In this case, transistors and photonic devices can be optimized separately. This approach enables large-scale integration of silicon photonics and advanced bulk CMOS electronics with mature packaging solutions. It also relaxes design constraints on photonic devices and improves the performance of the photonics.

The first wafer-scale 3D integrated silicon photonics platform is developed through SUNY Poly Colleges of Nanoscale Science and Engineering (CNSE). The cross-section view and the top view of the 12-inch electronic-photonic wafer are shown in Fig. 2.5 [53]. The photonic devices on the top wafer and the electronic circuits on the bottom wafer (65nm bulk) are electrically connected through thru-oxide vias (TOVs), which have ultra-low parasitic capacitance of $\sim 3$ fF. Based on this platform, we have successfully demonstrated the first optical link using 3D integrated silicon photonics. The details are reported in Chapter 5.

2.3 Challenges and Opportunities

2.3.1 Co-optimizing photonics and electronics

Optical links based on silicon photonics hold great promise for meeting the demands of next-generation 400G inter-rack interconnects and 100G intra-rack interconnects in data centers. To meet such demands, optical transceivers with datarates at or higher than 50Gb/s are of most interest in both wavelength-division multiplexed (WDM) and parallel single-mode (PSM) systems. Recent years have seen great effort and rapid progress in the development and commercialization of silicon photonics technologies ranging from platforms, devices, and circuits, to large-scale systems [2, 18, 21–27]. Moreover, silicon photonic modulators and optical transceivers beyond 50Gb/s have recently been demonstrated in various photonic platforms [4, 13, 14, 28–30]. At these high data-rates, it is critical to consider holistically the design of the photonic and circuit components from the perspective of link energy-efficiency and bandwidth density.

In a state-of-the-art 50Gb/s NRZ optical link based on a Mach-Zehnder modulator (MZM), the driver and laser together consume more than 10pJ/b, dominating the total link energy, compared to the 1.4-3pJ/b consumed by the receiver [4, 14]. In contrast to MZMs, microring modulators (MRMs) can consume less than 100fJ/b of driver energy due to their compact size. The thermal tuning overhead for microrings can be as low as a few milliwatts per channel [31], which has negligible impact on the overall energy efficiency of the transmitter. Microring modulators have great potential for dense WDM systems due to their inherent wavelength selectivity. They have also shown promising high-speed operation for single-wavelength 50Gb/s links [13, 32, 33]. However, full optical links at such high datarates using microring modulators are yet to be demonstrated, which is in part due to unoptimized device designs and the inherent trade-off between optical modulation amplitude (OMA) and optical bandwidth for microrings [34]. For both types of modulators, there are different architecture choices and also trade-offs between laser power and transmitter power. Additionally, they are both subject to the same technology constraints from the silicon photonics platforms and link specifications. As a result, it is critical yet challenging to co-optimize photonics alongside circuits. To date, there is still debate on which modulator architecture is the better design choice for 50Gb/s optical channels in WDM and PSM systems.

2.3.2 Pushing speed limits of photonics transmitters

Close integration of photonics with transceiver circuitry is critical for achieving ultra-low energy and high bandwidth densities (enabling low area overheads and new interconnect topologies). Monolithic silicon photonics is a potential solution that enables the closest proximity of electronics and photonics and large-scale integration. Recently, optical transceivers using monolithic silicon photonics in 45nm and 90nm SOI processes have been demonstrated [2, 23, 32, 33]. However, achieving high data-rates (25+ Gb/s) with sub-pJ/b energy-efficiency in order to meet the demands of the next wave of optical interconnects is still a challenge for these monolithic NRZ transmitters.

On one hand, Mach-Zehnder modulators (MZMs) have large optical bandwidth, but limited electrical bandwidth and large energy cost. On the other hand, microring modulators (MRMs) have narrower optical bandwidth but relatively large electrical bandwidth and low energy cost (due to their compact size). To achieve a high modulation rate at the target modulation depth while keeping the energy cost low, an MRM-based NRZ transmitter requires careful co-design of electronic circuitry and photonic devices.

2.3.3 Reducing the high power consumption of laser sources

Integrated silicon photonics has shown unmatched potential in providing high-data bandwidth with low-cost and high-energy efficiency [2, 13–18]. However, embedded laser power consumption for silicon photonic links has become a bottleneck for further improving overall energy efficiency. This problem is more prominent for high-speed optical interconnects, as receiver sensitivity degrades significantly at high data rates [19, 20]. The performance of photonic devices also imposes many constraints on the overall energy efficiency of a photonic interconnect. The key constraints include fiber-to-chip coupling loss, modulator insertion loss, and laser wall-plug efficiency. Most silicon-photonic chips today use off-chip lasers, and thus the total channel loss outside the laser module includes losses of at least three optical coupling interfaces: two on the transmitter and one on the receiver. Post-packaging loss of a fiber-to-chip coupler is typically in the range of 2-4 dB [2, 13, 14]. Typical insertion loss of a Mach-Zehnder modulator with a reasonable extinction ratio is around 5 dB [15]. Under these constraints and typical link loss margins of 3dB, the required optical power from the laser has to be 9-15 dBm to maintain a reasonable bit error rate (BER) at 50 Gbps [14] (i.e. at least 1e-4 or 1e-6 for forward error correction (FEC) links and 1e-12 for uncoded low-latency high-performance computing interconnects). In addition, considering a typical
uncooled laser wall-plug efficiency of 10%, the laser itself can consume 80 mW to 320 mW for a single optical channel. This could be prohibitive, especially for scenarios in which photonic transceivers aim to get closer to the processor and switch chips and further cut the interconnect energy. Typically, lasers are thermally stabilized, which adds another 3-4x reduction in wall-plug efficiency and makes the need for link co-optimization even more significant. Our research aims to explore new solutions to this problem from a link architecture perspective. We propose and demonstrate the concepts of laser-forwarding coherent silicon photonic links in Chapter 6.
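The laser power figures quoted above follow from a short piece of arithmetic, which can be sketched as below. This is only an illustrative calculation under the numbers stated in the text (three coupling interfaces at 2-4 dB each, about 5 dB of modulator insertion loss, a 3 dB margin, 9-15 dBm of laser output and 10% wall-plug efficiency); the function names are hypothetical and not part of any released tool.

```python
def dbm_to_mw(p_dbm: float) -> float:
    """Convert optical power from dBm to milliwatts."""
    return 10.0 ** (p_dbm / 10.0)


def channel_loss_db(coupler_loss_db: float, n_couplers: int = 3,
                    modulator_il_db: float = 5.0, margin_db: float = 3.0) -> float:
    """Total loss between laser and receiver: couplers + modulator + link margin."""
    return n_couplers * coupler_loss_db + modulator_il_db + margin_db


def laser_electrical_power_mw(p_opt_dbm: float, wall_plug_eff: float) -> float:
    """Electrical power drawn by the laser for a given optical output power."""
    return dbm_to_mw(p_opt_dbm) / wall_plug_eff


def energy_per_bit_pj(power_mw: float, data_rate_gbps: float) -> float:
    """mW divided by Gb/s gives pJ/b directly."""
    return power_mw / data_rate_gbps


if __name__ == "__main__":
    for coupler_db in (2.0, 4.0):
        print(f"coupler loss {coupler_db} dB -> channel loss {channel_loss_db(coupler_db)} dB")
    for p_dbm in (9.0, 15.0):
        p_el = laser_electrical_power_mw(p_dbm, 0.10)
        print(f"{p_dbm} dBm laser -> {p_el:.0f} mW electrical, "
              f"{energy_per_bit_pj(p_el, 50.0):.2f} pJ/b at 50 Gb/s")
```

The 80 mW to 320 mW range in the text corresponds to the two ends of the 9-15 dBm range at 10% efficiency; the additional 3-4x penalty for thermal stabilization would be applied by lowering `wall_plug_eff` accordingly.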
Chapter 3

Electronic-photonic Co-Optimization of Silicon Transmitter

This chapter addresses the challenges discussed in Section 2.3.1 and provides new insights and intuition into high-speed silicon photonics transmitters. This chapter focuses on a comparison between microring and Mach-Zehnder modulators given the same technology constraints at 50Gb/s. We begin with an overview of the optimization framework in Section 3.1 and introduction of a compact model for phase shifters in Section 3.2. The phase shifter model is verified with experimental data and later sets the foundation for microring and Mach-Zehnder modulator modeling. In Section 3.3, the optimization of the microring-based transmitter is carried out for 50 Gb/s optical links to obtain the best energy efficiency. A new Simulink toolbox is introduced to capture dynamic behaviors of MRMs. This general-purpose toolbox can be used for simulating other optical systems as well. In addition, an MRM-based PAM4 transmitter is analyzed as a potential way to mitigate optical bandwidth constraints. In Section 3.4, a co-optimization of the Mach-Zehnder transmitter is carried out for both multi-stage (MS) and traveling-wave (TW) drivers. Finally, a comparison between optimized MRM-based TX and optimized MZM-based TX given the same technology constraints is discussed in Section 3.5. Section 3.6 summarizes this chapter.

3.1 Overview of Co-optimization Framework

The objectives of the proposed framework are to lay the foundation for silicon photonics device and link co-design and to be readily applicable to a multitude of silicon photonics platforms. This framework is called “co-optimization” as it optimizes photonic device parameters such as doping levels and geometries alongside CMOS circuits and architectural choices. The optimization goal is to minimize the overall energy-per-bit (E/b) of the transmitter macro (laser plus driver) under both technology and link design constraints.
Challenges for designing a comprehensive, yet general, co-optimization framework stem mainly from three important criteria. First, the framework needs to be specific enough to capture the intricacies of technology-dependent photonic device physics, without necessarily overburdening the optimizer. Second, the model needs to be generic enough to characterize the common waveguide and junction designs across many silicon photonics platforms. Third, it needs to consider key link constraints and provide a link-level picture that includes the transmitter, receiver and laser. Previous literature on link-level analysis and modeling of silicon photonics transmitters [19, 31, 35] often treats the optical devices as black boxes and does not consider doping and device design parameters altogether. For example, the critical trade-off between phase shift and optical loss under process constraints is often neglected. Other literature focusing on photonic device modeling [36–39] often relies on analytical expressions of optical mode distribution in the waveguide and can be too complex and cumbersome for link-level analysis. To overcome these issues, we model the silicon photonic modulators based on a simple yet accurate compact model for phase shifters. The study focuses on depletion-mode pn-junction-based phase shifters, as they are widely used for high-speed modulators on different silicon photonics platforms [2,21–25,27,40]. This compact model incorporates waveguide geometry, mode confinement factor, and PN junction doping, all with some reasonable approximations. The compact model fits well with experimental results in various silicon photonics platforms.

As shown in Fig. 3.1, the co-optimization engine uses both technology and link constraints. The technology constraints are related to the photonic processes and include parameters for waveguides, junctions, couplers and lasers. The link constraints are determined by the overall link budget and specific transceiver circuits. The engine optimizes microring and Mach-Zehnder transmitters separately based on the same phase shifter model with the goal of minimizing total E/b for laser and electronic driver combined. An optical simulation toolbox is developed in Simulink to verify the large-signal transient time-domain performance of the co-optimized transmitters. The optimizer is implemented in Matlab and can be integrated seamlessly with Simulink. Although our study focuses on the transmitter side, the simulation toolbox can also be applied to full optical links along with other communication toolboxes. This Simulink electronic-photonic co-design simulation toolbox has been released online [41].

3.2 PN-Junction-Based Optical Phase Shifter

3.2.1 Compact model of optical phase shifter

The common building block for both MRM’s and MZM’s is the high-speed optical phase shifter. More specifically, this phase shift allows the constructive or destructive interference
Due to the lack of the Pockels effect, silicon photonics phase shifters rely on the carrier plasma dispersion effect [42]. High-speed phase modulation is achieved with depletion-mode PN junctions. Within a PN junction, the concentrations of excess electrons and holes strongly affect the refractive index and absorption coefficient. Combined with the applied voltage across the junction, these factors determine the maximum phase shift as well as the loss. Device parameters for phase shifters include intrinsic index and absorption, junction geometries and doping concentrations. The foundries often provide a wide range of doping concentrations by default and can potentially tune the doping levels for customers. Therefore, doping level is considered a key parameter in our optimization framework. There are three main types of junction designs as shown in Fig. 3.2. In this section, we propose a simplified phase shifter model that is applicable to most junction shapes.

The carrier plasma dispersion effect in crystalline silicon was first shown in [42]. Wavelength-dependent expressions for the material properties are commonly used with fitting parameters. According to the models in [3], both index and absorption vary with wavelength $\lambda$ (m). The changes in refractive index $\Delta n(\lambda)$ and absorption $\Delta \alpha(\lambda)$ are given by:

$$\Delta n(\lambda) = -A\lambda^2 \Delta N - B\lambda^2 \Delta P^{0.8} \tag{3.1}$$

$$\Delta \alpha(\lambda) = C\lambda^2 \Delta N + D\lambda^2 \Delta P \quad (\text{cm}^{-1}), \tag{3.2}$$

where $\Delta N$ and $\Delta P$ are changes in electron and hole concentrations (cm$^{-3}$). The fitting parameters are $A = 3.64 \times 10^{-10}$, $B = 3.51 \times 10^{-6}$, $C = 3.52 \times 10^{-6}$ and $D = 2.4 \times 10^{-6}$. Throughout the study, the power absorption coefficient is denoted by $\alpha$, and the field absorption coefficient is denoted by $\alpha_f$, where $\alpha = 2\alpha_f$. The effective refractive index and absorption coefficient for an intrinsic silicon waveguide are denoted by $n_{\text{eff},i}$ and $\alpha_i$ respectively. The impacts of junction doping and external bias voltages can be derived in two steps. As the first step, a doped silicon waveguide without a depletion region is assumed. For simplicity, the waveguide is assumed to be split evenly between uniform n-doping and uniform p-doping. The intermediate effective index $n_{\text{eff},d}$ and absorption $\alpha_d$ for a doped waveguide can thereby be approximated as

$$n_{\text{eff},d} \approx n_{\text{eff},i} - \gamma(A\lambda^2 N_D + B\lambda^2 N_A^{0.8})/2 \tag{3.3}$$

$$\alpha_d \approx \alpha_i + \gamma (C\lambda^2 N_D + D\lambda^2 N_A)/2, \tag{3.4}$$

where \(N_D\) and \(N_A\) are impurity densities for n-doping and p-doping respectively. \(\gamma\) represents the mode confinement factor for the waveguide (0<\(\gamma\)<1). When the optical mode is more confined in the waveguide, \(\gamma\) increases and thus doping has a larger impact on optical properties.

In reality, a depletion region always exists in the PN junction of a depletion-mode phase shifter. As the second step, we assume that a bias voltage \(V\) is applied to the junction (\(V=0\) when there is no external bias, and \(V<0\) under reverse bias). The voltage-dependent effective refractive index and absorption coefficient are derived as

\[
n_{\text{eff}}(V) \approx n_{\text{eff},d} + \frac{\gamma}{L_j} \left( A\lambda^2 N_D x_n(V) + B\lambda^2 N_A^{0.8} x_p(V) \right) \tag{3.5}
\]

\[
\alpha(V) \approx \alpha_d - \frac{\gamma}{L_j} \left( C\lambda^2 N_D x_n(V) + D\lambda^2 N_A x_p(V) \right) \tag{3.6}
\]

where \(x_n(V)\) and \(x_p(V)\) are depletion widths on the n-doping and p-doping side of the PN junction. They are calculated by the set of equations below:

\[
x_n(V) = \sqrt{\frac{2\epsilon N_A (V_{bi} - V)}{qN_D(N_A + N_D)}} \tag{3.7}
\]

\[
x_p(V) = \sqrt{\frac{2\epsilon N_D (V_{bi} - V)}{qN_A(N_A + N_D)}} \tag{3.8}
\]

\[
V_{bi} = \frac{k_B T}{q} \ln \frac{N_A N_D}{n_i^2}. \tag{3.9}
\]

\(L_j\) is defined as a feature length for the PN junction. It is determined by the waveguide geometries and junction shapes. For different junction shapes, the corresponding feature lengths \(L_j\) are listed in Fig. 3.2. Intuitively, reducing \(L_j\) improves phase modulation efficiency as the depletion region takes up a larger portion of the waveguide. For lateral and vertical junctions, the feature length \(L_j\) is correlated with the confinement factor \(\gamma\). For interleaved junctions, they are independent parameters. In general, reducing \(L_j/\gamma\) improves the overlap between the confined optical mode and the depletion region and thereby improves phase modulation efficiency (Eq. 3.5).

This model assumes that the perturbations in effective refractive index and absorption coefficient vary linearly with depletion width. This is an accurate assumption for interleaved junctions, and is a simplified first-order approximation for other junction designs with typical waveguide geometries. For lateral and vertical junctions, the model assumes uniform distribution of optical power in the waveguide.
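As a reading aid, Eqs. (3.3)-(3.9) can be collected into a few lines of code. The following is a minimal sketch and not the released toolbox: lengths are kept in cm so that doping stays in cm$^{-3}$, and the intrinsic carrier density `N_I`, the temperature of 300 K, and the example intrinsic index `n_eff_i = 2.5` are assumptions chosen for illustration only.

```python
import math

Q_E = 1.602176634e-19              # elementary charge (C)
K_B = 1.380649e-23                 # Boltzmann constant (J/K)
EPS_SI = 11.7 * 8.8541878128e-14   # permittivity of silicon (F/cm)
N_I = 1.0e10                       # intrinsic carrier density (cm^-3); assumed value near 300 K

# Fitting parameters quoted in the text (wavelength in m, densities in cm^-3)
A, B, C, D = 3.64e-10, 3.51e-6, 3.52e-6, 2.4e-6

DB_PER_INV_CM = 10.0 / math.log(10.0)  # converts power absorption in cm^-1 to dB/cm


def depletion_widths_cm(v, n_a, n_d, temp_k=300.0):
    """Eqs. (3.7)-(3.9): depletion widths (x_n, x_p) in cm at bias v (V); needs v < V_bi."""
    v_bi = K_B * temp_k / Q_E * math.log(n_a * n_d / N_I**2)
    common = 2.0 * EPS_SI * (v_bi - v) / (Q_E * (n_a + n_d))
    return math.sqrt(common * n_a / n_d), math.sqrt(common * n_d / n_a)


def phase_shifter(v, n_a, n_d, l_j_nm, gamma, n_eff_i, alpha_i_cm, lam_m=1.55e-6):
    """Eqs. (3.3)-(3.6): return (n_eff(V), alpha(V)) with alpha in cm^-1 (power)."""
    lam2 = lam_m**2
    dn_n, dn_p = A * lam2 * n_d, B * lam2 * n_a**0.8
    da_n, da_p = C * lam2 * n_d, D * lam2 * n_a
    # Step 1: uniformly doped waveguide without a depletion region
    n_eff_d = n_eff_i - gamma * (dn_n + dn_p) / 2.0
    alpha_d = alpha_i_cm + gamma * (da_n + da_p) / 2.0
    # Step 2: carriers removed from the depletion region raise the index and lower the loss
    x_n, x_p = depletion_widths_cm(v, n_a, n_d)
    scale = gamma / (l_j_nm * 1e-7)  # gamma / L_j, with L_j converted from nm to cm
    n_eff = n_eff_d + scale * (dn_n * x_n + dn_p * x_p)
    alpha = alpha_d - scale * (da_n * x_n + da_p * x_p)
    return n_eff, alpha


if __name__ == "__main__":
    for bias in (0.5, 0.0, -1.5):
        n_eff, alpha = phase_shifter(bias, 1e18, 1e18, 500.0, 0.75,
                                     n_eff_i=2.5, alpha_i_cm=13.0 / DB_PER_INV_CM)
        print(f"V = {bias:+.1f} V: n_eff = {n_eff:.6f}, loss = {alpha * DB_PER_INV_CM:.1f} dB/cm")
```

Under reverse bias the depletion region widens, so the sketch returns a slightly higher effective index and a slightly lower loss, which is the behaviour the two-step derivation describes.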
Figure 3.3: Modulation efficiency $V_\pi L_\pi$ vs. junction doping level. Dashed lines are the predicted $V_\pi L_\pi$ at -1V reverse bias when $L_j/\gamma$ equals 200nm, 400nm, 600nm and 800nm. Reported data on various silicon photonics platforms P1-P8 are marked here. The average concentration of n-type and p-type doping is used.

### 3.2.2 Model verification on different platforms

Next, the phase shifter model is applied to various silicon photonics platforms developed by multiple foundries. The model is verified against measurement data. The modulation efficiency $V_\pi L_\pi$ is generally used for characterizing phase shifter performance. It is defined as the product of the required voltage swing ($V_\pi$) and phase shifter length ($L_\pi$) for a $\pi$ phase shift. For a voltage swing from 0 to $V$, the product is given by

$$V_\pi L_\pi = \frac{\lambda V}{2(n_{\text{eff}}(V) - n_{\text{eff}}(0))} \tag{3.10}$$

The relationship between $V_\pi L_\pi$ and doping levels for different $L_j/\gamma$ is plotted in Fig. 3.3. The reported data points from multiple silicon photonics processes are marked in the same figure (P1: [21], P2: [22], P3: [23], P4: [24], P5: [25], P6: [40], P7: [27] and P8: [5]).

All the measurement data are taken at around 1550nm for consistency. This figure shows the distribution of doping levels in today’s silicon photonics platforms and their corresponding $V_\pi L_\pi$. In addition, it can be used to estimate the waveguide- and junction-defined factor $L_j/\gamma$ for these platforms.
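The dependence of $V_\pi L_\pi$ on doping and on $L_j/\gamma$ can be reproduced qualitatively with a short sketch of Eq. (3.10). It assumes symmetric doping ($N_A = N_D$), takes the voltage swing as a reverse-bias magnitude, and uses an assumed intrinsic carrier density and temperature; it is an illustration of the compact model, not the code used to produce the figure.

```python
import math

Q_E = 1.602176634e-19              # elementary charge (C)
K_B = 1.380649e-23                 # Boltzmann constant (J/K)
EPS_SI = 11.7 * 8.8541878128e-14   # permittivity of silicon (F/cm)
N_I = 1.0e10                       # intrinsic carrier density (cm^-3); assumed
A, B = 3.64e-10, 3.51e-6           # index fitting parameters quoted in the text


def vpi_lpi_v_cm(doping_cm3, lj_over_gamma_nm, v_rev=1.0, lam_m=1.55e-6, temp_k=300.0):
    """V_pi*L_pi in V.cm for a swing from 0 V to -v_rev, with N_A = N_D = doping_cm3."""
    v_bi = K_B * temp_k / Q_E * math.log(doping_cm3**2 / N_I**2)

    def width_cm(v):
        # Eqs. (3.7)-(3.8) reduce to the same width on both sides for symmetric doping
        return math.sqrt(EPS_SI * (v_bi - v) / (Q_E * doping_cm3))

    dx = width_cm(-v_rev) - width_cm(0.0)
    coeff = lam_m**2 * (A * doping_cm3 + B * doping_cm3**0.8)
    dn_eff = coeff * dx / (lj_over_gamma_nm * 1e-7)   # Eq. (3.5), L_j/gamma in cm
    return (lam_m * 100.0) * v_rev / (2.0 * dn_eff)   # Eq. (3.10), wavelength in cm


if __name__ == "__main__":
    for ratio_nm in (200.0, 400.0, 600.0, 800.0):
        row = ", ".join(f"{vpi_lpi_v_cm(n, ratio_nm):.2f}" for n in (2e17, 1e18, 2.5e18))
        print(f"L_j/gamma = {ratio_nm:.0f} nm -> V_pi L_pi (V.cm) at 2e17, 1e18, 2.5e18: {row}")
```

In this sketch $V_\pi L_\pi$ scales linearly with $L_j/\gamma$ and falls as doping increases, consistent with the trend of the dashed lines described above.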

More details about these waveguides and phase shifters are summarized in Table 3.1. Based on our proposed compact model, the mode confinement factor $\gamma$ can be directly calculated from the measured $V_\pi L_\pi$. It is clear from the calculated results that $\gamma$ decreases as the dimensions of the waveguide cross section shrink because the optical mode is less confined. The typical value for $\gamma$ is between 0.60 and 0.85. In our optimization framework, $\gamma$ is assumed to be fixed and is treated as a technology constraint. In Fig. 3.4, the voltage dependency of $V_\pi L_\pi$ predicted by the model is compared with the reported measurement data from three different silicon photonics platforms P4, P5 and P7. Among them, two use lateral junctions and one uses an interleaved junction. Note that for these three phase shifters, $L_j$ and $\gamma$ are the same as their corresponding values listed in the table. The predicted modulation efficiencies match well with measurement data.

Figure 3.4: Modulation efficiency $V_\pi L_\pi$ vs. reverse bias voltage for phase shifters on three different platforms P4, P5, P7. Detailed information is included in Table 3.1. Note that [5] refers to the interleaved phase shifter on that platform.

Figure 3.5: Reported waveguide loss vs. predicted waveguide loss. The reference for each data point is labeled P1-P8 in the figure.

For higher doping levels, modulation efficiency of the phase shifter improves at the cost of larger optical losses. This inherent trade-off is critical for doping optimization for MRM and MZM devices. The optical losses of the phase shifters, as calculated from Eq. 3.6, match well with measured waveguide losses from various platforms in Fig. 3.5. Overall, the proposed model considers junction design and mode confinement, captures the fundamental device trade-off between loss and phase shift, and fits accurately the voltage dependency on these optical properties. Although the model can be extended to include more physics details for specific designs such as junction asymmetry or actual mode profiles, it is efficient
Table 3.1: Modeled phase shifters on various Si photonic platforms

<table>
<thead>
<tr>
<th>No.</th>
<th>Junction Type</th>
<th>Foundry/Company</th>
<th>Doping (cm$^{-3}$)</th>
<th>Width (nm)</th>
<th>Thickness (nm)</th>
<th>Feature length $L_j$ (nm)</th>
<th>Fitted $\gamma$</th>
</tr>
</thead>
<tbody>
<tr>
<td>P1</td>
<td>Interleaved</td>
<td>SMIC</td>
<td>$2 \times 10^{17}$</td>
<td>450</td>
<td>340</td>
<td>300</td>
<td>0.83</td>
</tr>
<tr>
<td>P2</td>
<td>Lateral</td>
<td>LETI</td>
<td>$3 \times 10^{17}$</td>
<td>400</td>
<td>220</td>
<td>400</td>
<td>0.60</td>
</tr>
<tr>
<td>P3</td>
<td>Interleaved</td>
<td>IBM</td>
<td>$4 \times 10^{17}$</td>
<td>500</td>
<td>135</td>
<td>300</td>
<td>0.62</td>
</tr>
<tr>
<td>P4</td>
<td>Lateral</td>
<td>IME</td>
<td>$6 \times 10^{17}$</td>
<td>500</td>
<td>220</td>
<td>500</td>
<td>0.72</td>
</tr>
<tr>
<td>P5</td>
<td>Lateral</td>
<td>IMEC</td>
<td>$1 \times 10^{18}$</td>
<td>500</td>
<td>220</td>
<td>500</td>
<td>0.76</td>
</tr>
<tr>
<td>P5</td>
<td>Interleaved</td>
<td>IMEC</td>
<td>$2 \times 10^{18}$</td>
<td>500</td>
<td>220</td>
<td>300</td>
<td>0.72</td>
</tr>
<tr>
<td>P6</td>
<td>Lateral</td>
<td>Oracle</td>
<td>$1 \times 10^{18}$</td>
<td>480</td>
<td>300</td>
<td>480</td>
<td>0.85</td>
</tr>
<tr>
<td>P7</td>
<td>Interleaved</td>
<td>IBM</td>
<td>$2 \times 10^{18}$</td>
<td>500</td>
<td>170</td>
<td>280</td>
<td>0.67</td>
</tr>
<tr>
<td>P8</td>
<td>Vertical</td>
<td>AIM</td>
<td>$2.5 \times 10^{18}$</td>
<td>N/A</td>
<td>220</td>
<td>220</td>
<td>0.73</td>
</tr>
</tbody>
</table>

and accurate enough for this study’s system-level optimization.

### 3.3 Optimization of Microring-based Transmitter

#### 3.3.1 Static model of microring modulator

A microring modulator (MRM) typically consists of a silicon microring and three waveguide ports – input, output and drop ports, as shown in Fig. 3.6. The microring has a very small footprint compared to other modulators, with diameters as small as 10\(\mu\)m. This enables very low power modulation at sub–100fJ/b driver energy due to its small device capacitance [2]. The microring itself is a pn-junction-based phase shifter that is driven by voltage drivers. The optical power at the output port of the ring changes as the round-trip phase is modulated by the driver. High-speed operation has been demonstrated with depletion-mode phase shifters [13,32,33].

Microring modulators can be sensitive to temperature variations. For any practical system, the microring resonance needs to be adaptively locked to the laser wavelength through a thermal tuning feedback loop. Robust and efficient thermal tuning for microring modulators has been demonstrated with a running processor on the same chip [2]. As the sensing part of feedback loop, a drop port waveguide is coupled to the microring to provide a port for monitoring the optical power level inside the cavity (Fig. 3.6). The feedback loop can be closed with an embedded heater inside the microring for tuning the temperature. More details about thermal tuning feedback designs and algorithms are shown in [31].
To optimize the modulator performance, we begin with the introduction of the static model of MRMs, which relies on the phase shifter model in Section 3.2. The static model of the microring is derived based on the transfer matrix method (TMM), where the coupling between input/drop waveguides and the ring is represented by transfer matrices [43]. As shown in Fig. 3.6, the key device parameters for a microring include effective index $n_{\text{eff}}$, group index $n_g$, round-trip length $L_{rt}$, input coupler field transmission $t_1$ and drop coupler field transmission $t_2$. Assuming a lossless coupler, we have $|k_i|^2 + |t_i|^2 = 1, (i = 1, 2)$, where $k_1$ and $k_2$ are cross-coupling coefficients. These coupler coefficients depend on the gap between the waveguide and microring cavity and can be determined through FDTD simulation. According to the TMM, the optical power at the output port $P_t$ and drop port $P_d$ can be derived as follows:

\[
P_t = \left| \frac{t_1 - t_2 e^{-\alpha_f L_{rt} + i\theta}}{1 - t_1 t_2 e^{-\alpha_f L_{rt} + i\theta}} \right|^2 \tag{3.11}
\]

\[
P_d = \left| \frac{k_1^* k_2 e^{(-\alpha_f L_{rt} + i\theta)/2}}{1 - t_1 t_2 e^{-\alpha_f L_{rt} + i\theta}} \right|^2 \tag{3.12}
\]

where $\alpha_f$ is the field absorption coefficient and $\theta$ is the round-trip phase shift in the ring: $\theta = 2\pi L_{rt} n_{\text{eff}} / \lambda$. Note that $\alpha_f$ and $n_{\text{eff}}$ are functions of bias voltage $V$ and depend on doping and phase shifter design; both are given by the phase shifter compact model in Section 3.2. As a result, $P_t$ and $P_d$ are functions of bias voltage $V$ as well.

Assuming the bias voltage for 0 and 1 levels are $V_0$ and $V_1$ respectively, the normalized optical modulation amplitude (OMA) with $P_{in} = 1$ is given by

\[
\text{OMA} = P_1 - P_0 = P_t(V_1) - P_t(V_0). \tag{3.13}
\]

Throughout the study, OMA will be used to refer to the normalized optical modulation amplitude (or modulation depth). The power transmission spectra of a typical microring modulator are shown in Fig. 3.7, where the transfer functions of the microring modulator at two different biases are labeled.
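Eqs. (3.11)-(3.13) translate directly into code. The sketch below assumes real-valued coupler coefficients and takes the bias-dependent $(n_{\text{eff}}, \alpha)$ pairs as inputs; the numerical values in the example are hypothetical placeholders rather than outputs of the phase shifter model or parameters of the ring in Fig. 3.7.

```python
import cmath
import math


def ring_ports(theta, alpha_cm, l_rt_um, t1, t2):
    """Eqs. (3.11)-(3.12): (P_t, P_d) normalized to unit input power.

    theta    : round-trip phase (rad)
    alpha_cm : power absorption coefficient (cm^-1); the field value is alpha / 2
    l_rt_um  : round-trip length (um)
    t1, t2   : real field transmission of the input and drop couplers
    """
    k1 = math.sqrt(1.0 - t1**2)  # lossless coupler: |k|^2 + |t|^2 = 1
    k2 = math.sqrt(1.0 - t2**2)
    exponent = -(alpha_cm / 2.0) * (l_rt_um * 1e-4) + 1j * theta
    round_trip = cmath.exp(exponent)
    den = 1.0 - t1 * t2 * round_trip
    p_t = abs((t1 - t2 * round_trip) / den) ** 2
    p_d = abs(k1 * k2 * cmath.exp(exponent / 2.0) / den) ** 2
    return p_t, p_d


def round_trip_phase(lam_m, n_eff, l_rt_um):
    """theta = 2*pi*L_rt*n_eff/lambda."""
    return 2.0 * math.pi * (l_rt_um * 1e-6) * n_eff / lam_m


def oma(lam_m, state0, state1, l_rt_um, t1, t2):
    """Eq. (3.13): P_t(V1) - P_t(V0); each state is (n_eff, alpha_cm) at that bias."""
    p0, _ = ring_ports(round_trip_phase(lam_m, state0[0], l_rt_um), state0[1], l_rt_um, t1, t2)
    p1, _ = ring_ports(round_trip_phase(lam_m, state1[0], l_rt_um), state1[1], l_rt_um, t1, t2)
    return p1 - p0


if __name__ == "__main__":
    # Hypothetical (n_eff, alpha in cm^-1) at V0 and V1; n_eff = 2.48 puts a resonance at 1550 nm
    state0, state1 = (2.48000, 9.0), (2.48008, 8.0)
    grid = [1.549e-6 + i * 1e-12 for i in range(2001)]  # 1549-1551 nm in 1 pm steps
    best = max(grid, key=lambda lam: abs(oma(lam, state0, state1, 30.0, 0.98, 0.995)))
    print(f"best wavelength = {best * 1e9:.3f} nm, "
          f"OMA = {oma(best, state0, state1, 30.0, 0.98, 0.995):+.3f}")
```

Sweeping the laser wavelength and keeping the point of largest $|\text{OMA}|$ mirrors the way the optimal laser wavelength is identified from the two transmission spectra.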
Figure 3.7: Modeled power transmission spectra of microring modulator under two different bias voltages. Optimal laser wavelength to maximize OMA is labeled. For phase shifter model, we assumed that $N_A = N_D = 10^{18}$ cm$^{-3}$, $L_j = 500$ nm and $\gamma = 0.75$. The Q factor of this microring modulator is 7700. FSR of this microring is around 20nm.


The fundamental trade-off between OMA and optical bandwidth is the most critical challenge for high-speed modulation of microring modulators. Modeling the dynamic behavior of microrings accurately is the key to designing optical transmitters, especially at very high datarates such as 50Gb/s. According to coupled mode theory (CMT), the electrical-to-optical modulation bandwidth of microrings should be inversely proportional to the photon lifetime \( \tau_p \) inside the cavity [34, 44]. In addition, an analytic small-signal model has revealed that the small-signal bandwidth depends not only on the photon lifetime but also on the detuning of the laser (the frequency offset between laser and microring resonance) [39]. However, a more accurate modulation bandwidth for microring modulators has to be estimated through large-signal transient simulations.

The photon lifetime \( \tau_p \) can be calculated from \( \tau_p = Q/\omega_0 \), where \( \omega_0 \) is the angular resonance frequency. The Q factor is defined as \( \omega_0 \) times the time-averaged stored energy divided by the total power loss. The stored energy in the ring is given by \( P_cL_{rt}/v_g \) with the group velocity \( v_g \) and the power flow in the cavity \( P_c \) [44]. In our case, the total power loss in the cavity stems from the input port coupling, drop port coupling and round-trip loss. Therefore, the Q factor is derived as

\[
Q = \frac{\omega_0 P_c L_{rt}/v_g}{P_c \left( |k_1|^2 + |k_2|^2 + 1 - e^{-\alpha L_{rt}} \right)} \approx \frac{\omega_0 n_g L_{rt}}{c \left( |k_1|^2 + |k_2|^2 + \alpha L_{rt} \right)} \tag{3.14}
\]

where \( c \) is the speed of light and \( n_g \) is the group index (\( v_g = c/n_g \)). The round-trip loss is assumed to be small in the approximation above, so that \( 1 - e^{-\alpha L_{rt}} \approx \alpha L_{rt} \). The photon lifetime \( \tau_p \) in the microring cavity is then given by

\[
\tau_p = \frac{Q}{\omega_0} = \frac{n_g L_{rt}}{c (|k_1|^2 + |k_2|^2 + \alpha L_{rt})} \tag{3.15}
\]

The optical bandwidth or the corresponding full-width-at-half-maximum (FWHM) bandwidth can be calculated as

\[
f_{\text{optical}} = \frac{1}{2\pi \tau_p} \tag{3.16}
\]

The actual modulation bandwidth \( f_{3\text{dB}} \) of the MRM is proportional to the optical bandwidth \( f_{\text{optical}} \) [34]. Large-signal simulation can be used to estimate the ratio more accurately given the optimized microring design and laser detuning.
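A short numerical sketch of Eqs. (3.14)-(3.16) may help fix the orders of magnitude. The coupling and loss values below are hypothetical and were picked only so that the result lands near the Q of about 7700 and the optical bandwidth of about 25 GHz quoted for the example ring; the group index of 4.0 is likewise an assumption.

```python
import math

C_LIGHT = 299_792_458.0  # speed of light (m/s)


def photon_lifetime_s(n_g, l_rt_um, k1_sq, k2_sq, alpha_cm):
    """Eq. (3.15): photon lifetime, valid when the round-trip loss is small."""
    l_rt_m = l_rt_um * 1e-6
    loss_per_round_trip = k1_sq + k2_sq + alpha_cm * (l_rt_um * 1e-4)
    return n_g * l_rt_m / (C_LIGHT * loss_per_round_trip)


def q_factor(tau_p_s, lam_m):
    """Q = omega_0 * tau_p, with omega_0 = 2*pi*c/lambda."""
    return 2.0 * math.pi * C_LIGHT / lam_m * tau_p_s


def optical_bandwidth_hz(tau_p_s):
    """Eq. (3.16): FWHM optical bandwidth."""
    return 1.0 / (2.0 * math.pi * tau_p_s)


if __name__ == "__main__":
    # Hypothetical ring: |k1|^2 = 0.045, |k2|^2 = 0.002, alpha = 5.4 cm^-1, L_rt = 30 um
    tau = photon_lifetime_s(n_g=4.0, l_rt_um=30.0, k1_sq=0.045, k2_sq=0.002, alpha_cm=5.4)
    print(f"tau_p = {tau * 1e12:.2f} ps, Q = {q_factor(tau, 1.55e-6):.0f}, "
          f"f_optical = {optical_bandwidth_hz(tau) / 1e9:.1f} GHz")
```

Because $\tau_p = Q/\omega_0$, the optical bandwidth in Eq. (3.16) is simply the optical carrier frequency divided by $Q$, which is a convenient cross-check on any computed pair of values.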

An open-source Simulink toolbox is developed for simulating silicon photonics devices and systems [41]. The toolbox contains a library of basic optical elements such as lasers, waveguides, phase shifters and couplers. Complex photonic devices are constructed from these basic building blocks. The basic theory behind the Simulink simulation is the same as that of the previous Verilog-A co-simulation framework [45]. One of the major differences is that the new toolbox adopts the proposed phase shifter model in Section 3.2, which allows more physical details to be included. The Simulink toolbox works seamlessly with the co-optimization framework developed in Matlab.

The Simulink schematic of a microring-based optical link is shown in Fig. 3.8. It consists of two 2x2 couplers for input and drop ports and two phase shifters, each with half the round-trip length. The phase shifter blocks compute the phase shift and optical loss using the proposed phase shifter compact model.
Figure 3.8: Schematic of MRM-based optical link in Simulink and the close-ups of microring modulator (MRM) block and the phase shifter (PS) block.

Figure 3.9: Simulated eye diagram at 25 Gb/s. Device parameters are the same as the microring in Fig. 3.7 with optical bandwidth of around 25GHz. Laser detuning is set to optimize OMA. A first-order low-pass filter approximation with 3dB bandwidth of 20GHz is represented with red dashed line.

As an example, the microring modulator in Fig. 3.7 is simulated using this work’s Simulink toolbox. A 25Gb/s eye diagram is shown in Fig. 3.9. In this transient simulation, the driver signal swings between 0.5V and -1.5V with ideal, sharp transitions. Therefore, the eye diagram is solely governed by the optical dynamic behavior of the microring modulator. It is interesting that the rising transition of the eye is faster than the falling transition and even causes a slight overshoot. This is consistent with the small-signal analysis [39] where larger laser detuning corresponds to larger small-signal bandwidth. The asymmetry in the
eye diagram should be balanced by adjusting driver strength for pulling up and pulling down.

In our optimization engine, the modulation bandwidth is assumed to be limited by the slower falling edge. A first-order low-pass filter with a 3dB bandwidth $f_{3dB}$ is used in Simulink to approximate the modulation bandwidth. The simulation results show $f_{3dB} \approx 0.8f_{optical}$ for the microring in Fig. 3.9. In our optimization, $f_{3dB}$ is chosen to be at least $0.8/T_b$ to ensure ISI-free modulation at a datarate of $1/T_b$ according to the eye diagram ($T_b$ is the bit period in NRZ encoding). Therefore, the optical bandwidth constraint for a microring modulator in the optimization can be simply given by

$$f_{optical} \geq \frac{1}{T_b}.$$  (3.17)

For a 50Gb/s MRM, the optical bandwidth constraint is thereby set to 50GHz with the actual electrical-to-optical modulation bandwidth being around 40GHz. Transient simulations will be used to further verify the dynamic performance for 50Gb/s optimized microring transmitters.
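
The $0.8/T_b$ rule can be sanity-checked against the first-order low-pass approximation in closed form. The sketch below is illustrative only and is not part of the Simulink toolbox; the function names are hypothetical. It computes how far a first-order filter settles within one bit period, and the bandwidths implied by Eq. 3.17 and the $f_{3dB} \approx 0.8f_{optical}$ ratio.

```python
import math


def settled_fraction(f_3db_hz: float, bit_period_s: float) -> float:
    """Fraction of a full step reached after one bit period
    by a first-order low-pass filter with the given 3dB bandwidth."""
    tau = 1.0 / (2.0 * math.pi * f_3db_hz)  # filter time constant
    return 1.0 - math.exp(-bit_period_s / tau)


def required_optical_bandwidth(data_rate_bps: float) -> float:
    """Eq. 3.17 for NRZ: f_optical >= 1/T_b, i.e. numerically the bit rate."""
    return data_rate_bps


def modulation_bandwidth(f_optical_hz: float, ratio: float = 0.8) -> float:
    """Electrical-to-optical bandwidth, using f_3dB ~ 0.8 * f_optical from the text."""
    return ratio * f_optical_hz


if __name__ == "__main__":
    rate = 50e9  # 50Gb/s NRZ
    f_opt = required_optical_bandwidth(rate)
    f_3db = modulation_bandwidth(f_opt)
    print(f"f_optical >= {f_opt / 1e9:.0f} GHz, f_3dB ~ {f_3db / 1e9:.0f} GHz")
    print(f"settled fraction after one bit: {settled_fraction(f_3db, 1.0 / rate):.4f}")
```

With $f_{3dB} = 0.8/T_b$ the filter output reaches more than 99% of its final value within one bit period, which is why residual ISI is small under this approximation.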

### 3.3.3 Optimization of microring modulator design

For a typical MRM-based transmitter, laser power dominates the total power consumption of the transmitter macro as the driver power is usually much lower. Therefore, minimizing the overall E/b for the transmitter (driver plus laser) is equivalent to maximizing the normalized OMA of the microring modulator.

For analysis purposes, we choose to use a typical feature length $L_j$ (500nm) and a typical mode confinement factor $\gamma$ (0.75) for phase shifters in this study. These numbers are within the range of parameters on the typical silicon photonics platforms summarized in Table 3.1. The other fixed parameters for our MRM analysis include round-trip length of the ring and waveguide intrinsic loss, which are set to be 30\(\mu\)m and 13dB/cm [3] respectively. These preset constraints largely depend on the photonic platform and targeted link application. However, the insights and trends discovered through the framework are useful over a wide range of technology and link constraints.

For each doping level for the PN junction ($N_A$ and $N_D$), the optimizer would find the optimal coupling coefficients at input and drop ports ($t_1$ and $t_2$) and the optimal laser detuning $\Delta \lambda$ for thermal locking, with the goal being to maximize the normalized OMA. For simplicity, symmetric pn doping is assumed with $N_A = N_D$. The driver swing is assumed to be from 0.5V to -1.5V. The optimization is subject to the following constraints:

- Optical bandwidth requirement:

$$f_{optical} \geq f_{min}.$$  (3.18)
Figure 3.10: Optimized OMA for $f_{\text{optical}}$ 25GHz, 35GHz and 50GHz versus doping levels in the PN junction. Bias conditions are $V_0 = 0.5\ V$ and $V_1 = -1.5\ V$. Technology constraints: PN junction feature length $L_j = 500\ \text{nm}$ and optical mode confinement factor $\gamma = 0.75$. $L_{\text{rt}} = 30\ \mu\text{m}$, intrinsic loss $13\ \text{dB/cm}$ [3]. Symmetric pn-junctions are assumed for simplicity.

- Extinction ratio (ER) requirement:
  \[
  \text{ER} = \frac{P_t(V_1)}{P_t(V_0)} \geq ER_{\text{min}} \tag{3.19}
  \]

- Enough average drop port power for thermal tuning:
  \[
  \frac{P_d(V_1) + P_d(V_0)}{2} \geq P_{d,\text{min}}. \tag{3.20}
  \]

The extinction ratio requirement $ER_{\text{min}}$ is set to 3.5dB according to 100G PSM4 and CWDM4 technical specifications [46] [47]. Drop port power $P_{d,\text{min}}$ is set to be $0.01P_{\text{in}}$ in order to achieve accurate power monitoring and thermal tuning based on the required drop port current in [2]. Optimizations are carried out for different doping levels for the PN junction. The optimal OMAs are shown in Fig. 3.10 for three targeted NRZ data-rates (25, 35, 50Gb/s). The corresponding optical bandwidths (25, 35, 50GHz) are used in the optimization engine based on the large-signal transient simulation in this study.
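
The three constraints (Eqs. 3.18 to 3.20) amount to a feasibility test that an optimizer would apply to each candidate design. The following is a minimal sketch of such a test, not the optimization engine itself. It assumes all powers are normalized to the input power $P_{in}$, and the function names and the numbers in the usage example are hypothetical.

```python
import math

ER_MIN_DB = 3.5          # extinction ratio requirement from the text
PD_MIN_FRACTION = 0.01   # P_d,min = 0.01 * P_in, powers normalized to P_in


def extinction_ratio_db(pt_v1: float, pt_v0: float) -> float:
    """ER of Eq. 3.19 expressed in dB."""
    return 10.0 * math.log10(pt_v1 / pt_v0)


def normalized_oma(pt_v1: float, pt_v0: float) -> float:
    """Objective to maximize: difference of the two through-port levels."""
    return pt_v1 - pt_v0


def satisfies_constraints(f_optical_hz: float, f_min_hz: float,
                          pt_v1: float, pt_v0: float,
                          pd_v1: float, pd_v0: float) -> bool:
    """Check Eqs. 3.18-3.20 for one candidate design."""
    bandwidth_ok = f_optical_hz >= f_min_hz                        # Eq. 3.18
    er_ok = extinction_ratio_db(pt_v1, pt_v0) >= ER_MIN_DB         # Eq. 3.19
    drop_ok = 0.5 * (pd_v1 + pd_v0) >= PD_MIN_FRACTION             # Eq. 3.20
    return bandwidth_ok and er_ok and drop_ok


if __name__ == "__main__":
    # Hypothetical candidate: through-port levels 0.5 and 0.2, drop levels 0.02 and 0.01.
    ok = satisfies_constraints(50e9, 50e9, 0.5, 0.2, 0.02, 0.01)
    print("feasible:", ok, "OMA:", normalized_oma(0.5, 0.2))
```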

According to Fig. 3.10, an optimal doping level exists for each bandwidth requirement. Intuitively, increasing doping improves the modulation efficiency of the phase shifter and can therefore improve OMA. However, as doping levels increase, the excess optical loss in the ring eventually lowers the Q factor and degrades the OMA. Therefore, it is critical to find the optimal doping levels. Notably, the optimal doping level increases as the required optical bandwidth increases. The optimal doping for achieving 50GHz optical bandwidth is around $3.8 \times 10^{18}\ \text{cm}^{-3}$. This doping level is in fact close to that used in the 56Gb/s microring modulator reported in [13].

Figure 3.11: Key characteristics of the optimal microring designs for different doping levels with the design points corresponding to maximum OMAs labeled. The optimization constraints correspond to the curves in Fig. 3.10. Three operation regions (A-C) are labeled for 25GHz operation as an example. A is the coupling-limited region, C is the loss-limited region and B is the optimal region. Note that $t_1$ and $t_2$ are transmission coefficients at the couplers. Stronger coupling means smaller $t_1$ and $t_2$.

The corresponding device parameters given by the optimization engine are shown in Fig. 3.11, including Q factor, extinction ratio (ER), insertion loss (IL), coupler coefficients ($t_1$ and $t_2$) and the microring coupling factor ($\beta$). Here we define microring coupling factor $\beta$ as $\beta = t_1 e^{\alpha f L}/t_2$ to represent the coupling status of microrings. When $\beta < 1$, the microring is over coupled; when $\beta = 1$, it is critically coupled; when $\beta > 1$, the microring is under coupled. These parameters can be used as a design reference or provide in-depth insights for microring design.

For Fig. 3.11 (a-f), we define three different doping regions to get more insights into the microring optimization. The 25GHz microring is used as an example. Region A is the coupling-limited region where doping levels are relatively low. Q factor is effectively controlled by the coupler designs assuming fixed ring circumference and negligible round trip ring loss. Drop port coupling should be used to match the input port coupling. By doing so, the microring can be brought closer to critical coupling ($\beta = 1$) to improve OMA. Region C is the loss-limited region where doping levels are relatively high. The Q factor drops below the targeted value as it is dictated by the excessive doping loss. Interestingly, $t_1$ has to decrease to prevent the microring from getting too under coupled and breaking the ER constraint. All the microrings eventually get limited by the ER constraint as doping increases.

The optimal design with the maximum OMA is achieved in region B, where doping levels are between the regions A and C. For the optimal designs, input coupling is well balanced with the optical loss inside the cavity resulting in minimum drop port coupling. The microrings are slightly under-coupled. In this region, input coupling decreases as doping increases in order to maintain constant Q factor. If the available doping levels are not in region B, different optimization strategies are needed according to the analysis above.

Figure 3.12: Transmitter circuits for ring modulator. \( C_w \) is the wire and packaging parasitic capacitance, and \( C_m \) is the modulator junction capacitance.

### 3.3.4 Microring-based NRZ transmitter design

The driver circuit is modeled as shown in Fig. 3.12. It consists of a high-speed serializer, pre-drivers and a final driver stage. The final stage driver can be a simple inverter driving one electrode of the modulator in single-ended fashion with voltage swing of \( V_{DD} \). Alternatively, the final stage can be a high-swing driver or a push-pull driver, which can be implemented using stacked transistors and level shifters. The typical swing for a high-speed high-swing driver is \( 2V_{DD} \), where \( V_{DD} \) is 1V for standard CMOS processes.

In our optimization, we assume the voltage swing \( V_A \) to be either 1V or 2V. With \( V_b \) applied on the cathode, the voltage bias on the PN junction swings between \(-V_b\) and \( V_A - V_b \). In order to maintain depletion mode, the maximum forward bias \( V_A - V_b \) should be smaller than the built-in voltage \( V_{bi} \), which is between 0.7V and 1.1V for the typical doping range from \( 10^{16} \) to \( 10^{19} \) cm\(^{-3}\). For simplicity, we always set \( V_A - V_b = 0.5V \) for all doping levels in the optimization engine. This is consistent with experimental settings for microrings on various platforms [2] [28]. Microring performance might degrade due to the effect of free carrier absorption if the forward-bias voltage is further increased. Under these conditions, the E/b for the driver circuits is given by

\[
E_{dr} = \frac{1}{4\eta_d} V_A \int_{-V_b}^{V_A-V_b} (C_m(V) + C_w) \, dV, \tag{3.21}
\]

where driver efficiency \( \eta_d \) is assumed to be 20% considering reasonable fan-out for pre-driver stages at 50Gb/s.

The capacitance density of the PN junction is given by

\[
C_j(V) = \sqrt{\frac{q\epsilon N_A N_D}{2(V_{bi} - V)(N_A + N_D)}}. \tag{3.22}
\]

The modulator capacitance depends on the type of the PN junctions. For lateral junctions, \( C_m \approx C_j(V)LH \); for interleaved junctions, \( C_m \approx C_j(V) \cdot 2LWH/P \); for vertical junctions, \( C_m \approx C_j(V)LW \). \( H, P \) and \( W \) are defined in Fig. 3.2 and \( L \) is the total length of the PN junction. In the case of microrings, \( L = L_{rt} \). With the typical device parameters, \( C_m \) ranges from 15 to 25fF.

The total wiring capacitance \( C_w \) ranges from 5fF to 40fF depending on the packaging type. For 3D integration using copper pillars, the total wiring capacitance would be around 20fF [14]. For 3D integration with through-oxide-vias (TOVs) or monolithic integration, the wiring parasitics can be reduced to 5-10fF [2, 18]. In our analysis, we assumed \( C_w \) to be 20fF. The energy consumption for modulator driver circuits can be calculated based on the equations above.
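
To make Eqs. 3.21 and 3.22 concrete, the sketch below evaluates the driver E/b numerically for a lateral junction. It is an illustration under stated assumptions rather than the optimization engine: the junction height `H` (220nm), the built-in voltage (1.0V) and the silicon permittivity (11.7 times the vacuum permittivity) are assumed values not given in this section, and the function names are hypothetical.

```python
import math

Q = 1.602176634e-19                 # elementary charge (C)
EPS_SI = 11.7 * 8.8541878128e-12    # assumed permittivity of silicon (F/m)


def junction_capacitance_density(v, n_a, n_d, v_bi):
    """Eq. 3.22: depletion capacitance per unit area (F/m^2); dopings in m^-3."""
    return math.sqrt(Q * EPS_SI * n_a * n_d / (2.0 * (v_bi - v) * (n_a + n_d)))


def driver_energy_per_bit(v_a, v_b, c_w, junction_area, n_a, n_d, v_bi,
                          eta_d=0.2, steps=2000):
    """Eq. 3.21 evaluated with a midpoint-rule integral.

    Lateral junction assumed: C_m(V) = C_j(V) * junction_area, with area = L * H.
    """
    lo, hi = -v_b, v_a - v_b
    dv = (hi - lo) / steps
    integral = 0.0
    for i in range(steps):
        v = lo + (i + 0.5) * dv
        c_m = junction_capacitance_density(v, n_a, n_d, v_bi) * junction_area
        integral += (c_m + c_w) * dv
    return v_a * integral / (4.0 * eta_d)


if __name__ == "__main__":
    doping = 3.8e18 * 1e6            # 3.8e18 cm^-3 converted to m^-3
    area = 30e-6 * 220e-9            # L_rt = 30um; H = 220nm is an assumption
    e_dr = driver_energy_per_bit(v_a=2.0, v_b=1.5, c_w=20e-15,
                                 junction_area=area, n_a=doping, n_d=doping,
                                 v_bi=1.0)   # V_bi = 1.0V is an assumption
    print(f"E_dr ~ {e_dr * 1e12:.2f} pJ/b")
```

Under these assumed values the result is roughly 0.2pJ/b for a 2V swing. The numbers depend directly on the assumed junction geometry, so they should be read as an order-of-magnitude check only.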

Even so, laser power dominates the total power for MRM transmitters. For a typical silicon photonic link in Fig. 3.13, the optical power gets attenuated by three fiber-to-chip optical couplers and the transmitter before it reaches the receiver-side photodetector. The minimal OMA required at the receiver input to reach a certain BER target is defined as the receiver sensitivity, denoted by \( P_{RX} \) in this study. The total E/b consumed by the laser source is derived as

\[
E_{\text{laser}} = \frac{P_{RX}}{\eta_m \cdot \eta_w \cdot \alpha_c^3 \cdot \text{OMA}_\text{mod} \cdot f_b}, \tag{3.23}
\]

where \( \eta_w \) is the wall-plug efficiency of the laser module, \( \alpha_c \) is optical coupler loss coefficient, \( f_b \) is the symbol rate and \( \eta_m \) accounts for additional margin in the link budget. \( \text{OMA}_\text{mod} \) is the normalized OMA for the modulator. An optical receiver using 14nm FinFET has achieved -10dBm optical sensitivity at 50Gb/s reaching \( 10^{-12} \) BER [4]. We use the measurement data from this study as a reference for receiver sensitivity throughout the study such that the link constraints could reflect the state-of-the-art CMOS technology.

The total E/b for MRM-based transmitter is the sum of the driver circuit and laser power:

\[
E_{\text{tot}} = E_{dr} + E_{\text{laser}}. \tag{3.24}
\]

Typical numbers for parameters used in Eq. 3.23 are listed in Table 3.2. Based on the results in Fig. 3.10, the energy-per-bit \( E_{\text{tot}} \) for the optimized 50Gb/s MRM-based transmitters can be calculated. The relationship between the optimized \( E_{\text{tot}} \) (laser plus driver power) and doping levels is shown in Fig. 3.14.
Table 3.2: Parameters for 50Gb/s silicon photonic link budgeting

<table>
<thead>
<tr>
<th>Sensitivity $P_{RX}$</th>
<th>Coupler $\alpha_c$</th>
<th>Margin $\eta_m$</th>
<th>Laser $\eta_w$</th>
</tr>
</thead>
<tbody>
<tr>
<td>-10 dBm</td>
<td>3dB</td>
<td>3dB</td>
<td>10%</td>
</tr>
</tbody>
</table>
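
The link budget of Eq. 3.23 with the Table 3.2 parameters can be evaluated directly. The sketch below is a minimal illustration: the dB quantities are converted to linear factors, and the normalized OMA of 0.2 used in the usage example is a hypothetical input, not a value taken from Fig. 3.10. Function names are likewise hypothetical.

```python
def db_to_linear(db: float) -> float:
    """Convert a power ratio in dB to a linear factor."""
    return 10.0 ** (db / 10.0)


def dbm_to_watts(dbm: float) -> float:
    """Convert a power in dBm to watts."""
    return 1e-3 * 10.0 ** (dbm / 10.0)


def laser_energy_per_bit(p_rx_dbm, oma_mod, f_b,
                         coupler_loss_db=3.0, margin_db=3.0,
                         eta_w=0.10, n_couplers=3):
    """Eq. 3.23: wall-plug laser energy per bit (J/b).

    Defaults follow Table 3.2; losses and margin enter as linear factors < 1.
    """
    p_rx = dbm_to_watts(p_rx_dbm)
    eta_m = db_to_linear(-margin_db)
    alpha_c = db_to_linear(-coupler_loss_db)
    return p_rx / (eta_m * eta_w * alpha_c ** n_couplers * oma_mod * f_b)


def total_energy_per_bit(e_dr, e_laser):
    """Eq. 3.24: driver plus laser energy per bit."""
    return e_dr + e_laser


if __name__ == "__main__":
    # Hypothetical normalized OMA of 0.2 at 50Gb/s NRZ.
    e_laser = laser_energy_per_bit(p_rx_dbm=-10.0, oma_mod=0.2, f_b=50e9)
    print(f"E_laser ~ {e_laser * 1e12:.2f} pJ/b")
```

Because every loss term multiplies in the denominator, each additional 3dB of coupler loss or margin doubles the laser energy, which is why the normalized OMA of the modulator is the main lever available to the designer.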

Figure 3.14: Model-estimated total E/b for microring driver+laser for microring-based NRZ transmitter at 50Gb/s. Two different driver swings are considered (1V and 2V). The microring is optimized for each doping level, which corresponds to the designs in Fig. 3.10 and 3.11.

The results show that higher driver swing improves the overall energy efficiency for MRM transmitters, because laser power dominates and higher swing improves OMA. The total transmitter power is not sensitive to the increased driver power due to higher swing. Therefore, it makes sense to always choose high-swing drivers if driver bandwidth allows. In addition, it is also critical to co-optimize the modulator design to maximize OMA as discussed before. The 50Gb/s MRM-based NRZ transmitter with the optimized microring device and 2V driver voltage swing consumes 1.7pJ/b in total: 1.5pJ/b by the laser and only 0.2pJ/b by the driver circuits. More results and the optimal dopings can be found in Table 3.3. Note that the above analysis assumes 3D hybrid integration between circuits and photonics. Switching to monolithic integration would yield even lower driver power and thus further improve the energy efficiency of the transmitter, leaving the total even more strongly dominated by laser power.

### 3.3.5 Microring-based PAM4 transmitter design

As shown in Fig. 3.10, the microring modulator can achieve higher OMA at the cost of optical bandwidth. In other words, reducing bandwidth requirement means improving OMA and lowering laser power for MRM-based optical links. One potential way to relax the bandwidth constraint while maintaining the same data-rate is to use PAM4 instead of conventional NRZ modulation, where the front-end bandwidth is halved in a PAM4 modulation scheme to attain the same bit rate. There has been analysis comparing the energy efficiency
of microring-based PAM4 transmitters with NRZ transmitters [48]. In this study, we use the proposed optimization framework to optimize microring designs for NRZ and PAM4 separately. The optimized microring-based PAM4 transmitter is then compared to the optimized NRZ transmitter at 50 Gb/s under the same process constraints. Transient simulations are also used to verify the transmitter performances. Note that practical design constraints outside the scope of this work would need to be taken into consideration as well to validate the benefits of PAM4 versus NRZ. This study is intended to give a first-pass, fundamental comparison between the two schemes.

There are multiple ways to generate the PAM4 optical signal with silicon microring modulators. For the first approach, an electrical DAC is used to drive the microring modulator [49] [50]. This architecture is shown in Fig. 3.15(a). Due to nonlinearity of the electrical-to-optical response of microrings, a lookup table is required in order to pre-distort the drive signal and achieve symmetric PAM4 eyes. The second approach uses an optical DAC to generate the PAM4 signals instead of using an electrical DAC [32, 33]. As shown in Fig. 3.15(b), the microring is segmented into $2^N$ uniform segments to form an N-bit optical DAC. In this topology, the PAM4 data needs to be thermometer-coded, and each slice of driver connects to one segment in the microring. The segmentation can be directly implemented in microrings with interleaved PN junction [32, 33]. For the third approach, the microring is segmented into only two segments – one LSB and one MSB with binary weights [49]. Each of two segments is driven by one driver and serializer as in Fig. 3.15(c).

Linearity is the key criterion for choosing microring-based PAM4 architectures, which can be evaluated with the static model. The optical responses of a 4-bit electrical DAC and a 4-bit optical DAC are compared in Fig. 3.16(a). For the comparison, the total voltage swing of the electrical DAC equals that of the driver for each small segment in the optical DAC. The same optimized microring modulators are used with 25GHz optical bandwidth and 50Gb/s targeted data-rate. For such microrings, the optical response of the optical DAC is more linear compared to that of the electrical DAC. It also shows that the third architecture using two segments should offer sufficient linearity for generating the balanced PAM4 signals at 50Gb/s.

The choice between the second and third architectures depends on whether programmability is required to handle process variations. Despite the architectural difference, both are based on the same operating principle. The common transfer functions are shown in Fig. 3.16 as different portions of the ring are reverse biased by the corresponding drivers. The microring design and laser detuning are optimized for 50Gb/s PAM4, and the four optical levels show very good linearity.

For 50Gb/s PAM4, the optimization engine optimizes the OMA of a microring modulator with 25GHz optical bandwidth, as shown in Fig. 3.10. A 25Gb/s NRZ receiver could achieve a sensitivity of -14 dBm according to [4]. In our analysis, the required total eye height for 50Gb/s PAM4 receiver is approximated as 3x single eye height for 25Gb/s NRZ receiver, neglecting any other circuit overhead. Therefore the new receiver sensitivity $P_{RX}$ becomes -9.2 dBm, and the new laser power $E_{laser}$ can be calculated according to Eq. 3.23. For the
new driver power $E_{dr}$, the wiring parasitics are now doubled for two-segment microring as packaging capacitance doubles and still dominates $C_w$. The new expression for $E_{dr}$ should be

$$E_{dr} = \frac{1}{2} \cdot \frac{1}{4\eta_d} V_A \int_{-V_b}^{V_A - V_b} \left( C_m(V) + 2C_w \right) dV$$  (3.25)

Now the total energy $E_{tot}$ for driver and laser for the microring-based PAM4 transmitter at 50Gb/s can be calculated. The optimization results at two different driver swings (1V and 2V) are shown in Fig. 3.17 assuming the same bias condition for $V_b$ as the NRZ case. The optimal doping level and the best E/b for microring-based NRZ and PAM4 transmitters are compared in Table 3.3. The required Q for the PAM4 microring modulator is doubled as the optical bandwidth requirement is halved. Therefore the optimal doping for the PAM4 microring is lower than that for the NRZ microring. From Table 3.3, the optimal doping level for PAM4 microrings is roughly half of the optimal doping in the NRZ case. Given the same technology and link constraints, the microring-based PAM4 modulator can save nearly 20% total TX power compared to the NRZ modulator.
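
The PAM4 receiver sensitivity used above follows from a single dB conversion: a 3x larger total eye height corresponds to an additional $10\log_{10}(3)$ dB of required optical power. The sketch below reproduces that step; the function name is hypothetical, and the factor of three is the approximation stated in the text, neglecting other circuit overhead.

```python
import math


def pam4_sensitivity_dbm(nrz_sensitivity_dbm: float,
                         eye_height_ratio: float = 3.0) -> float:
    """Scale an NRZ sensitivity to PAM4 at twice the bit rate.

    The total PAM4 eye must be `eye_height_ratio` times the single NRZ eye
    measured at the same symbol rate.
    """
    return nrz_sensitivity_dbm + 10.0 * math.log10(eye_height_ratio)


if __name__ == "__main__":
    # -14 dBm at 25Gb/s NRZ, per the text.
    print(f"P_RX (PAM4) ~ {pam4_sensitivity_dbm(-14.0):.1f} dBm")
```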

Transient simulations are carried out in order to verify the performance of 50Gb/s mi-
Figure 3.16: (a) Linearity comparison between 5-bit electrical DAC and 5-bit optical DAC for MRM-based transmitter. (b) Transmission spectra for a microring with two binary weighted segments. The microring here has an optical bandwidth of 25GHz.

<table>
<thead>
<tr>
<th>Modulation</th>
<th>$V_A$ (V)</th>
<th>Doping (cm$^{-3}$)</th>
<th>Laser (pJ/b)</th>
<th>Driver (pJ/b)</th>
<th>Total (pJ/b)</th>
</tr>
</thead>
<tbody>
<tr>
<td>NRZ</td>
<td>1.0</td>
<td>$3.7 \times 10^{18}$</td>
<td>2.9</td>
<td>0.06</td>
<td>3.0</td>
</tr>
<tr>
<td>NRZ</td>
<td>2.0</td>
<td>$3.8 \times 10^{18}$</td>
<td>1.5</td>
<td>0.21</td>
<td>1.7</td>
</tr>
<tr>
<td>PAM4</td>
<td>1.0</td>
<td>$1.5 \times 10^{18}$</td>
<td>2.1</td>
<td>0.04</td>
<td>2.1</td>
</tr>
<tr>
<td>PAM4</td>
<td>2.0</td>
<td>$1.7 \times 10^{18}$</td>
<td>1.3</td>
<td>0.10</td>
<td>1.4</td>
</tr>
</tbody>
</table>
Figure 3.17: Model-estimated total E/b for microring driver+laser for microring-based PAM4 transmitter at 50Gb/s. Two different driver swings are considered (1V and 2V). The microring is optimized for each doping level, which corresponds to the designs in Fig. 3.10 and 3.11.

pare the full link power. For driver and laser portion, the potential power saving for PAM4 is around 20% at 50Gb/s. Another observation is that microring PAM4 eye diagram is not balanced due to the asymmetry of the rise and fall time. Therefore, it is even more critical for PAM4 drivers to adjust pull-up and pull-down strengths compared with NRZ drivers. By doing so, the four signal levels in the optical PAM4 eye diagram can be well balanced.

### 3.4 Optimization of Mach-Zehnder Transmitter

### 3.4.1 Overview of Mach-Zehnder modulator

Mach-Zehnder modulators (MZMs) have traditionally been used for optical communication due to their simple interferometric structure. An MZM consists of two balanced arms with embedded phase shifters. The output light intensity is modulated as a result of optical interference when phase shifts are introduced in the arms. On silicon photonics platforms, the phase shifters are normally made of PN junctions, so the same set of technology constraints needs to be applied. There are two major challenges in designing an energy-efficient MZM. First, there is a trade-off between phase modulation efficiency and propagation loss for phase shifters, which can cause high insertion loss and low OMA for the MZM. Second, the device capacitance is much larger than that of microrings, and the driver power could dominate the total power consumption. As a result, co-optimization of the electrical driver and the optical modulator is essential for designing low-power MZM transmitters.

Figure 3.18: Transient simulation of the 50Gb/s MRM-based NRZ transmitter and PAM4 transmitter. The microrings are optimized in each case with the same process and link constraints. The optical power for y-axis is normalized to the input power for microrings.

There are two basic architectures for MZM transmitters, one based on multi-stage drivers and one based on traveling-wave drivers, as shown in Fig. 3.19. For multi-stage MZM (MS-MZM), the arms are divided into multiple segments which are modulated individually by distributed voltage drivers. Delay units are inserted between these electrical drivers to match the propagation velocity of the optical signal inside the waveguide. For traveling-wave MZM (TW-MZM), transmission lines are used as the electrodes. Delay matching is also required between the optical waveguide and the electrical transmission line. Traveling-wave drivers are typically more energy efficient than multi-stage drivers at high data-rates [36]. This is because the power of the TW driver is independent of the device capacitance of the MZM and gets amortized at high data-rates. However, TW-MZM may suffer from limited OMA due to lower voltage swing and high transmission line loss. Therefore, electronic-photonic co-optimization is needed to compare the overall energy efficiency of these two architectures.

The normalized transmitted power of both MZMs can be approximated as the following [36]:

\[
P_t = e^{-\alpha L} \sin^2 \left( \frac{\Delta \phi}{2} \right) \tag{3.26}
\]

where \( \alpha \) is the optical absorption coefficient, \( L \) is the length of each arm, and \( \Delta \phi \) is the phase difference between the two paths. Since the two arms of the MZM are driven differentially, $\Delta \phi$ equals $\phi_0 + \Delta \phi_{\text{mod}}$ for bit “1” or $\phi_0 - \Delta \phi_{\text{mod}}$ for bit “0”. $\phi_0$ is the static phase offset for adjusting OMA and ER. $\Delta \phi_{\text{mod}}$ is the modulation phase shift introduced on each arm by the voltage drivers, which will be derived depending on the architecture choice. The same compact model for optical phase that is used for the microring modulator will be reused for MS-MZM and TW-MZM in the following sections.

Figure 3.19: (a) Architecture of Multi-stage Mach-Zehnder Modulator (MS-MZM) (b) Architecture of Traveling wave Mach-Zehnder Modulator (TW-MZM)

### 3.4.2 Multi-stage Mach-Zehnder transmitter

Since the same voltage swing is applied to each segment of the arm, the total modulation phase shift for one arm now becomes

$$\Delta \phi_{\text{mod}} = (2\pi L/\lambda) \cdot (n_{\text{eff}}(V_1) - n_{\text{eff}}(V_0))$$  (3.27)

where $V_1$ and $V_0$ correspond to bias voltages for generating bit “1” and bit “0”. The effective index $n_{\text{eff}}$ and optical loss $\alpha$ are governed by the phase shifter model. They are both functions of doping levels for the PN junction based phase shifter. In the optimization engine for MS-MZM, the power levels for bit 1 and bit 0 are calculated as

$$P_{t1} = e^{-\alpha_1 L} \sin^2\left(\frac{\phi_0 + \Delta \phi_{\text{mod}}}{2}\right)$$  (3.28)

$$P_{t0} = e^{-\alpha_0 L} \sin^2\left(\frac{\phi_0 - \Delta \phi_{\text{mod}}}{2}\right)$$  (3.29)

The normalized OMA and ER are given by

\[ \text{OMA} = P_{t1} - P_{t0} \tag{3.30} \]

\[ \text{ER} = \frac{P_{t1}}{P_{t0}} \tag{3.31} \]
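
The static MZM model in Eqs. 3.28 to 3.31 is short enough to evaluate directly. The sketch below is a minimal illustration, not the optimization engine: it takes the modulation phase shift and the loss coefficients as given inputs instead of deriving them from the phase shifter model, and the function names and example values are hypothetical.

```python
import math


def db_per_cm_to_per_m(loss_db_per_cm: float) -> float:
    """Convert a power loss in dB/cm to an absorption coefficient in 1/m."""
    return loss_db_per_cm * math.log(10.0) / 10.0 * 100.0


def mzm_levels(phi_0, dphi_mod, length_m, alpha_1, alpha_0):
    """Eqs. 3.28 and 3.29: normalized output power for bit 1 and bit 0."""
    p_t1 = math.exp(-alpha_1 * length_m) * math.sin((phi_0 + dphi_mod) / 2.0) ** 2
    p_t0 = math.exp(-alpha_0 * length_m) * math.sin((phi_0 - dphi_mod) / 2.0) ** 2
    return p_t1, p_t0


def oma_and_er_db(p_t1, p_t0):
    """Eqs. 3.30 and 3.31, with ER returned in dB."""
    return p_t1 - p_t0, 10.0 * math.log10(p_t1 / p_t0)


if __name__ == "__main__":
    # Hypothetical operating point: quadrature bias, 0.2 rad modulation,
    # 2mm arms with 3dB/cm loss in both bit states.
    alpha = db_per_cm_to_per_m(3.0)
    p1, p0 = mzm_levels(math.pi / 2.0, 0.2, 2e-3, alpha, alpha)
    oma, er_db = oma_and_er_db(p1, p0)
    print(f"OMA = {oma:.3f}, ER = {er_db:.2f} dB")
```

At quadrature bias ($\phi_0 = \pi/2$) and without loss, the OMA reduces to $\sin(\Delta \phi_{\text{mod}})$, which makes the trade-off explicit: longer arms increase $\Delta \phi_{\text{mod}}$ but also increase the $e^{-\alpha L}$ penalty.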

The E/b for the laser \( E_{\text{laser}} \) can be calculated according to Eq. 3.23, similar to microring-based optical links. For the 50Gb/s MZM link, the receiver sensitivity, coupler losses, link margin and laser wall-plug efficiency are assumed to be the same as for the 50Gb/s microring-based NRZ link.

MS-MZM drivers are generally very power hungry. The total E/b for the modulator drivers is calculated as follows:

\[ E_{\text{dr,MS}} = \frac{1}{4\eta_d} V_{dd} \int_{-V_{dd}}^{0} (C_m(V) + C_w L) \, dV. \]  (3.32)

where driver efficiency \( \eta_d \) is set to 20\%. \( C_w \) is set to 0.3fF/\( \mu \)m assuming that the parasitic capacitance for the electrodes is 0.2 fF/\( \mu \)m and the amortized pad capacitance is 0.1fF/\( \mu \)m [36]. Modulator capacitance \( C_m \) can be calculated the same way as microring modulators. The optimization engine for MZM assumes the same junction feature length \( L_j \) (500nm) and mode confinement factor \( \gamma \) (0.75) for the waveguides as MRM. The intrinsic loss for the straight waveguide is set to 3dB/cm [3]. We assume that the MZM drivers can be sufficiently sized to meet the bandwidth requirement for the target data rate regardless of doping levels and bias conditions.

The objective of the MZM optimization engine is to minimize total transmitter energy-per-bit \( E/\text{b} \), including both laser wall-plug energy and TX driver energy. When the arms are driven differentially as in Fig. 3.19(a), the total transmitter energy for MS-MZM is given by

\[ E_{\text{TX,MS}} = E_{\text{laser}} + 2E_{\text{dr,MS}} \]  (3.33)

For each doping level in the PN junction (\( N_A \) and \( N_D \)), the optimization engine finds the optimal arm length \( L \) and static phase offset \( \phi_0 \) to minimize the total E/b for the transmitter, \( E_{\text{TX,MS}} \).
Figure 3.21: Optimization results for traveling-wave MZM transmitters at 50Gb/s with three differential peak-to-peak voltage swings $V_{TW}$ (0.6, 0.8, 1.0V). (a) the total transmitter E/b, (b) laser E/b, (c) optimal arm length $L$.

Table 3.4: 50Gb/s MZM Optimal design parameters and power (pJ/b)

<table>
<thead>
<tr>
<th>$V_{pp}$ (V)</th>
<th>Doping (cm$^{-3}$)</th>
<th>$L$ (mm)</th>
<th>Laser</th>
<th>Driver</th>
<th>Total</th>
</tr>
</thead>
<tbody>
<tr>
<td>MS 1.0</td>
<td>3.3×10$^{17}$</td>
<td>2.3</td>
<td>1.4</td>
<td>3.3</td>
<td>4.7</td>
</tr>
<tr>
<td>MS 2.0</td>
<td>3.7×10$^{17}$</td>
<td>1.3</td>
<td>1.1</td>
<td>5.3</td>
<td>6.4</td>
</tr>
<tr>
<td>TW 0.6</td>
<td>6.1×10$^{17}$</td>
<td>2.8</td>
<td>4.5</td>
<td>1.2</td>
<td>5.7</td>
</tr>
<tr>
<td>TW 0.8</td>
<td>3.0×10$^{17}$</td>
<td>1.4</td>
<td>3.5</td>
<td>1.6</td>
<td>5.1</td>
</tr>
<tr>
<td>TW 1.0</td>
<td>2.0×10$^{17}$</td>
<td>2.0</td>
<td>3.2</td>
<td>2.0</td>
<td>5.2</td>
</tr>
</tbody>
</table>

The optimization is subject to the same ER constraint (3.5dB) and the same receiver sensitivity (-10dBm at 50Gb/s) as MRM-based photonic links.

Co-optimization is carried out for MS-MZMs with two different driver voltages (1V and 2V) across the typical doping range. The total transmitter E/b, the laser power and the optimal arm length $L_{opt}$ found by the optimization engine are shown in Fig. 3.20. Optimal doping levels exist for each voltage swing. Initially, increasing the doping levels improves the modulation efficiency and effectively improves the transmitter OMA. When doping levels are relatively high, the increased insertion loss starts to play the dominating role and leads to higher laser power consumption. Another key observation is that the MS-MZM transmitter with 1V driver is in fact more energy efficient than the transmitter with 2V driver. This is because the driver power dominates the total power consumption for MS-MZM under the current technology and link constraints. The optimized MS-MZM transmitter consumes 4.7pJ/b at 50Gb/s. More details about the optimization results can be found in Table 3.4.

### 3.4.3 Traveling-wave Mach-Zehnder transmitter

The driver for traveling-wave MZM could potentially be more energy efficient at high datarates. The output signal of the driver propagates along the on-chip electrode as shown in Fig. 3.19(b). In the optimized design, the RF and optical group velocities are matched. Any mismatch in them degrades the OMA and thus increases the total optimal transmitter energy.
In our optimization engine, such velocity matching condition is assumed to be satisfied for first-order system analysis. The impact of mismatch can be simulated in time domain through the proposed Simulink toolbox for specific designs [41].

The final stage of the driver can be a CML driver with load resistance $R_L$. The differential peak-to-peak output swing of the driver is denoted as $V_{TW}$ and the attenuation coefficient of the electrical signal on the transmission line is denoted as $\alpha_t$. In the optimization engine for TW-MZM, $\alpha_t$ is set based on the frequency-dependent measurement results in [36]. For 50Gb/s modulation, $\alpha_t$ corresponds to 2.5dB/mm. Note that the effect of waveguide dopings on $\alpha_t$ is neglected. As the voltage bias attenuates along the transmission lines, the effective modulation phase shift for TW-MZM can be derived as

$$\Delta \phi_{mod} = \frac{2\pi}{\lambda} \int_0^L [n_{eff} (-V(z)) - n_{eff} (V(z))] \, dz \quad (3.34)$$

where the effective index $n_{eff}$ depends on the location $z$ on the waveguide and the driver voltage $V_{TW}$. Based on the modified $\Delta \phi_{mod}$, the normalized OMA and ER for TW-MZM can thereby be calculated. Given the same link constraints as MS-MZM, the required laser energy-per-bit for TW-MZM $E_{laser}$ can be calculated as well.

When a CML driver is used for the final stage with supply voltage $V_{DD}$ and single-ended swing $V_{TW}/2$, the driver energy-per-bit for TW-MZM can be calculated as

$$E_{dr,TW} = \frac{1}{\eta_d \cdot f_b} \cdot \frac{V_{TW}}{2(Z_0/2)} \cdot V_{DD} = \frac{V_{TW}V_{DD}}{\eta_d Z_0 \cdot f_b} \quad (3.36)$$

The effective load impedance of the parallel transmission lines is $Z_0/2$, and $Z_0$ is assumed to be 50Ω according to the typical transmission line design in [36]. The driver efficiency $\eta_d$ is assumed to be 20%, which accounts for power loss on load resistance $R_L$ and any power consumed by the pre-drivers.
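
Eq. 3.36 can be checked numerically against the driver column of Table 3.4. In the sketch below the line impedance and supply voltage are passed in as parameters; the example inputs (a 1V supply and a 50Ω line) are assumptions made for illustration, and the function name is hypothetical.

```python
def tw_driver_energy_per_bit(v_tw: float, v_dd: float, z0: float,
                             f_b: float, eta_d: float = 0.2) -> float:
    """Eq. 3.36: CML driver energy per bit (J/b) for a traveling-wave MZM.

    The static current V_TW / Z0 is drawn from V_DD, scaled by the driver
    efficiency and amortized over the bit rate.
    """
    return v_tw * v_dd / (eta_d * z0 * f_b)


if __name__ == "__main__":
    for v_tw in (0.6, 0.8, 1.0):
        # Assumed example inputs: V_DD = 1V, Z0 = 50 ohm, 50Gb/s.
        e = tw_driver_energy_per_bit(v_tw, v_dd=1.0, z0=50.0, f_b=50e9)
        print(f"V_TW = {v_tw:.1f} V -> E_dr,TW ~ {e * 1e12:.1f} pJ/b")
```

With these assumed inputs the sketch gives 1.2, 1.6 and 2.0pJ/b for the three swings, matching the TW driver entries in Table 3.4. The expression contains no capacitance term, which reflects the point made earlier that TW driver power is independent of the device capacitance.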

Under the same technology and link constraints, the optimization engine minimizes the total E/b for TW-MZM transmitter by finding the optimal arm length for different doping levels. For 50Gb/s TW-MZM transmitter, co-optimizations are carried out for three different $V_{TW}$ across the typical doping range as shown in Fig. 3.21. The optimization results show that laser power would dominate the total transmitter power and increase dramatically when doping levels are relatively low. From the optimization results, the optimal E/b for TW-MZM transmitter is achieved when the differential peak-to-peak voltage swing is around 0.8V. Similar to MS-MZM, the optimal arm length of TW-MZM also decreases as the doping levels increase.

The optimal doping levels for MS-MZM and TW-MZM as well as their corresponding laser and driver power are listed in Table 3.4. At 50Gb/s, the optimized TW-MZM tends
to consume more laser power, whereas MS-MZM consumes more driver power. Overall, the optimized TW-MZM transmitter consumes around 5.1pJ/b energy, slightly higher than the 4.7pJ/b consumed by the optimized MS-MZM transmitter. For both transmitter architectures, optimizing doping levels is crucial for achieving the best energy efficiency.

### 3.5 Comparisons

The optimization framework allows us to compare the energy efficiency of MRM and MZM optical transmitters including laser and driver power. PAM4 modulation is discussed as a potential way to mitigate the inherent optical bandwidth constraint for microrings. For Mach-Zehnder modulators, we have focused on NRZ modulation and analyzed both multi-stage and traveling-wave MZ-modulators. All the transmitters are optimized under the same technology and link constraints. The impact of doping levels for transmitter designs has been addressed in depth using the optimization framework.

For microring modulators, thermal tuning is essential for keeping the resonant frequency of the microring locked to the laser frequency. The thermal tuning can be done via an embedded microheater and a feedback mechanism. The heaters have been implemented in silicon or polysilicon to be more efficient and robust to electromigration. In a recent work [31], the thermal tuner for microrings achieves a 524GHz (>50°C temperature) tuning range at 3.8µW/GHz, consuming 2mW in the heater driver and 0.74mW in tuner logic. To estimate the thermal tuner power, we assume that the ring's resonance has to be tuned across the entire commercial temperature range (COM) in data-centers (0-70°C), which leads to 3.5mW. Therefore, the thermal tuning power for microrings is almost negligible compared to other link components at 50Gb/s.
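
The 3.5mW estimate can be reproduced with a short calculation. The sketch below assumes that the required tuning range scales linearly with temperature, at the rate of 524GHz per 50°C implied by the figures quoted from [31], and that the tuner logic power is independent of range. Both are simplifying assumptions, and the function name is hypothetical.

```python
def thermal_tuner_power_w(temp_range_c: float,
                          ghz_per_c: float = 524.0 / 50.0,
                          heater_w_per_ghz: float = 3.8e-6,
                          logic_w: float = 0.74e-3) -> float:
    """Heater plus logic power (W) needed to cover a temperature range.

    Assumes linear scaling of the tuning range with temperature.
    """
    tuning_range_ghz = temp_range_c * ghz_per_c
    return tuning_range_ghz * heater_w_per_ghz + logic_w


if __name__ == "__main__":
    p = thermal_tuner_power_w(70.0)      # commercial range, 0-70 C
    e_per_bit = p / 50e9                 # amortized at 50Gb/s
    print(f"tuner power ~ {p * 1e3:.1f} mW, ~ {e_per_bit * 1e12:.2f} pJ/b")
```

Amortized over 50Gb/s, this is below 0.1pJ/b, small next to the laser and driver contributions discussed in this chapter.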

For the analysis above, we have set the link margin to 3dB and assumed 3dB coupler loss, 10% laser wall-plug efficiency and -10dBm receiver sensitivity for NRZ at 50Gb/s. In practice, any deviation from these link constraints can be considered by adjusting the link margin. Now we consider three different link margins (0dB, 3dB and 6dB) and show how the energy efficiency comparison would change between the different transmitter architectures at 50Gb/s. The new energy breakdowns are shown in Fig. 3.22 with different link margins. For Mach-Zehnder modulators, the optimized multi-stage MZM transmitter can be more energy efficient than the traveling-wave MZM transmitter when a higher link margin is used, because the MS-MZM uses significantly less laser power. If a smaller link margin or further relaxed link constraints are used, the traveling-wave MZM may become more energy efficient. In this case, the driver energy takes up a larger portion of the total energy budget and the optimized traveling-wave MZM transmitter benefits from its relatively low driver E/b.
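The effect of the link margin on laser power can be illustrated with a simple dB budget. The sketch below is not the optimization framework itself: the function name and the lumped `path_loss_db` value (standing in for coupler losses plus modulator loss and penalty) are hypothetical, while the receiver sensitivity and wall-plug efficiency are the values stated above.

```python
def laser_wallplug_power_mw(rx_sensitivity_dbm, link_margin_db,
                            path_loss_db, wall_plug_eff):
    """Electrical laser power needed to close the link (simple dB budget)."""
    optical_dbm = rx_sensitivity_dbm + link_margin_db + path_loss_db
    optical_mw = 10 ** (optical_dbm / 10)   # dBm -> mW
    return optical_mw / wall_plug_eff


RX_SENSITIVITY_DBM = -10.0   # from the text, NRZ at 50Gb/s
WALL_PLUG_EFF = 0.10         # from the text
PATH_LOSS_DB = 10.0          # hypothetical lumped loss, for illustration only

powers = {
    margin: laser_wallplug_power_mw(RX_SENSITIVITY_DBM, margin,
                                    PATH_LOSS_DB, WALL_PLUG_EFF)
    for margin in (0.0, 3.0, 6.0)
}
for margin, p_mw in powers.items():
    print(f"margin {margin:.0f} dB -> laser {p_mw:.1f} mW "
          f"({p_mw / 50.0:.2f} pJ/b at 50Gb/s)")
```

Each additional 3dB of margin roughly doubles the laser power while leaving the driver power unchanged, which is why a design with a larger laser share is penalized more at high link margins.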

For 50Gb/s NRZ optical links with a typical 3dB link margin, MRM transmitters could save more than 60% of the total power compared to MZM transmitters when both are optimized through the co-optimization framework. For microring modulators, switching to PAM4 modulation could further save around 20% of the total transmitter power relative to NRZ modulation. For all the cases here, we assumed a fixed receiver sensitivity, a fixed data-rate and the same technology constraints from the same silicon photonics platform.

As the data-rate increases, the sensitivity of the high-speed optical receivers would drop mainly due to the bandwidth limitations of the circuit blocks as shown in Fig. 3.23. In our optimization framework, we set the receiver sensitivity based on the measurement results of the 65Gb/s receiver design in 14nm FinFET [4]. In addition to receiver sensitivity, the optical bandwidth of microrings and the transmission line loss also vary as the targeted data-rate varies. The minimum E/b for transmitters using the optimized NRZ-MRM, MS-MZM and TW-MZM are obtained for 32-60 Gb/s, as shown in Fig. 3.23. Only NRZ links are considered, owing to the limited receiver sensitivity data available. It is clear that the optimized microring modulator always consumes much less power than the optimized Mach-Zehnder modulator for the data-rates of interest. This is generally due to the compact size of microrings. The optimizations at different data-rates are also done under the same technology and link constraints.

### 3.6 Summary

This study proposes a co-optimization framework for designing high-speed silicon photonics transmitters. The new framework integrates a simple but accurate compact model for optical phase shifters, analytical models for photonic modulators and a new Simulink simulation toolbox. It allows us to explore the design trade-offs in depth for microring and Mach-Zehnder optical transmitters and compare their performances given the same set of technology and link constraints. Our results show that silicon photonic links, especially microring-based links, have great potential to provide energy-efficient optical solutions for next-generation inter-rack and intra-rack links.

Although the study does not go into circuit implementation details, it provides a useful co-optimization and verification framework for designing high-speed silicon photonics transmitters in the context of a practical optical link. This framework can be applicable to most of today’s silicon photonics platforms that rely on PN junction based phase shifters. It can be extended to include receiver designs and thermal tuning designs, and assist the co-optimization of the next-generation silicon photonic interconnects.
Chapter 4

High-speed Monolithic Silicon Photonics Transmitters

In this chapter, we apply the co-design techniques to silicon photonics chip design and demonstrate a 40Gb/s optical NRZ transmitter achieving 330fJ/b complete transmitter energy (including clock distribution and serializer sub-systems) in a monolithic zero-change 45nm SOI CMOS process. This platform requires no modifications to the standard CMOS process providing high-yield and low-cost photonics with fast transistors on a single die. The transmitters are based on microring modulators (MRM) with segmented PN junctions.

4.1 Microring-based Transmitter Design Challenges

As resonant devices, MRMs are subject to the fundamental tradeoff between optical bandwidth and optical modulation amplitude (OMA) [34]. Lowering the quality factor (Q) of MRMs can extend the optical bandwidth (which can be considered as equalization in optical domain), however at the cost of decreasing OMA, as illustrated in Fig. 4.1. In our case, the monolithic MRM has a diameter of 10µm and utilizes interleaved PN junctions operating in the depletion mode similar to [32,33].

4.2 Improved Design of Microring Modulator

To achieve an optimal OMA with fixed Q, an MRM needs to meet the critical coupling condition, where the input coupling strength equals the total round-trip loss (Fig. 4.2 and Fig. 4.3). One can balance the two by setting the coupling strength k through the gap between the waveguide and the microring, and by adjusting the round-trip loss due to absorption through the doping levels. However, doping options are limited in most technologies. In our microring designs, we used an optical drop-port as an extra knob to adjust the round-trip loss and achieve critical coupling. Both ends of the drop-port waveguide are tapered to prevent reflections. The drop-port can also be used to close the thermal tuning feedback loop that compensates for thermal and process variations of the microring's resonance wavelength [2,32,33]. To further improve the OMA, we have used high-swing drivers with a differential driver and an AC coupler. In doing so, we increase the depletion width of the junctions, which consequently introduces a larger resonance shift and improves the OMA.

The critical coupling condition for the microring modulator is shown in Fig. 4.3. The cavity absorption coefficient $\alpha_f$ can be extracted from previous measurements for the available doping profiles. The extracted $\alpha_f$ is approximately 400 m$^{-1}$, which corresponds to a propagation loss of about 35dB/cm in the cavity of the microring.
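The conversion between the absorption coefficient and the loss in dB/cm can be checked numerically. The sketch below assumes that $\alpha_f$ is a field (amplitude) coefficient, so that optical power decays as $e^{-2\alpha_f L}$, which is the form used in the Q-factor expression of this section; the round-trip length of 31.4µm is the value given in the text.

```python
import math

ALPHA_F_PER_M = 400.0          # extracted cavity absorption coefficient
ROUND_TRIP_LENGTH_M = 31.4e-6  # round-trip length of the microring


def loss_db_per_cm(alpha_per_m):
    """Power loss in dB/cm, assuming power decays as exp(-2 * alpha * z)."""
    power_atten_per_cm = 2.0 * alpha_per_m / 100.0
    return 10.0 * math.log10(math.e) * power_atten_per_cm


def round_trip_power_loss(alpha_per_m, length_m):
    """Fraction of optical power absorbed in one round trip."""
    return 1.0 - math.exp(-2.0 * alpha_per_m * length_m)


loss_db_cm = loss_db_per_cm(ALPHA_F_PER_M)
rt_loss = round_trip_power_loss(ALPHA_F_PER_M, ROUND_TRIP_LENGTH_M)
print(f"{loss_db_cm:.1f} dB/cm, round-trip power loss {rt_loss:.4f}")
```

With these inputs the loss comes out near 35dB/cm and the round-trip power loss near 0.025.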

According to Equation 3.14, the Q factor of the microring can be derived as

$$Q = \frac{2\pi n_g L}{\lambda \left(|k_{\text{in}}|^2 + |k_{\text{drop}}|^2 + 1 - e^{-2\alpha_f L}\right)} \tag{4.1}$$

Low-Q microring modulator optimization (figure panels). Device parameters: input coupling coefficient $k_{in}$, drop coupling coefficient $k_{drop}$, cavity absorption coefficient $\alpha_f$, round-trip length $L$. Critical coupling condition: $t_{in} = t_{drop} \cdot e^{-\alpha_f L}$, with $k_{in}^2 + t_{in}^2 = 1$ and $k_{drop}^2 + t_{drop}^2 = 1$. Choosing the gap between ring and waveguide: laser wavelength $\lambda = 1.3\,\mu\text{m}$.

Figure 4.2: Microring modulator design considerations

Figure 4.3: Critical coupling condition of microring modulator, and the power coupling coefficient extracted from Lumerical FDTD simulation.

In our case, the target wavelength $\lambda = 1300$nm, the group index $n_g = 2.9$ (based on measured FSR) and the round-trip length of the microring $L = 31.4 \mu m$. Plugging in the values and using the critical coupling constraint in Fig. 4.3, we can get

$$Q = \frac{440}{1 - |t_{in}|^2 + 1 - |t_{drop}|^2 + 0.025} = \frac{440}{1 - 0.975|t_{drop}|^2 + 1 - |t_{drop}|^2 + 0.025} \tag{4.2}$$

Plugging in the target $Q$ into Equation 4.2, we can get the value for $t_{\text{drop}}^2$ and calculate the coupling coefficients. When target $Q = 7500$, $k_{in}^2 = 0.029$ and $k_{drop}^2 = 0.005$, and when target $Q = 6000$, $k_{in}^2 = 0.037$ and $k_{drop}^2 = 0.012$. Lumerical MODE is used to simulate the coupling strength between microring and waveguide. The relationship between the coupling strength and the coupling gap between the microring and waveguide is obtained and shown in Fig. 4.3. The coupling gap sizes are then chosen according to the target $Q$ factors.
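The coupling coefficients quoted above can be reproduced with a few lines of arithmetic. The sketch below solves the Q expression together with the critical coupling condition $t_{in} = t_{drop} \cdot e^{-\alpha_f L}$ for a target Q; the function name is illustrative, and the parameter values are those stated in the text.

```python
import math


def coupling_for_target_q(q_target, wavelength_m=1.3e-6, group_index=2.9,
                          round_trip_m=31.4e-6, alpha_f_per_m=400.0):
    """Return (k_in^2, k_drop^2) that give q_target under critical coupling."""
    prefactor = 2.0 * math.pi * group_index * round_trip_m / wavelength_m  # ~440
    a = math.exp(-2.0 * alpha_f_per_m * round_trip_m)                      # ~0.975
    # Q = prefactor / (k_in^2 + k_drop^2 + 1 - a), with k^2 = 1 - t^2
    total_loss = prefactor / q_target
    # Critical coupling: t_in^2 = a * t_drop^2, so
    # (1 - a*t_drop^2) + (1 - t_drop^2) + (1 - a) = total_loss
    t_drop_sq = (3.0 - a - total_loss) / (1.0 + a)
    t_in_sq = a * t_drop_sq
    return 1.0 - t_in_sq, 1.0 - t_drop_sq


k_in_7500, k_drop_7500 = coupling_for_target_q(7500)
k_in_6000, k_drop_6000 = coupling_for_target_q(6000)
print(f"Q=7500: k_in^2={k_in_7500:.3f}, k_drop^2={k_drop_7500:.3f}")
print(f"Q=6000: k_in^2={k_in_6000:.3f}, k_drop^2={k_drop_6000:.3f}")
```

The results agree with the values in the text to within rounding. Lowering the target Q requires stronger coupling at both the input and the drop port.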

4.3 Design of High-speed Transmitter Circuits

4.3.1 NRZ transmitter with AC-coupled driver

The full NRZ transmitter consists of a 16-to-1 high-speed serializer, a modulator driver stage, clocking circuits and a PRBS31 generator. The circuit block diagram of the transmitter is shown in Fig. 4.4. As discussed in Chapter 3, the OMA of the transmitter increases as the driver swing increases and laser power typically dominates the total power of the optical transmitter. Therefore higher driver swing is preferred as long as the p-n junction is not fully depleted under the highest bias voltage.

The typical microring driver is based on a single-ended inverter chain, therefore the transmitter voltage swing cannot exceed the supply voltage $V_{DD}$. Our improved transmitter doubles the voltage swing by driving the anode and cathode of the microring modulator differentially with two drivers. One driver is AC-coupled using a bias tee circuit with a 2pF coupling capacitor and a 15kΩ bias resistor (approximately 5MHz cut-off frequency). The DC bias voltage is set to $V_b$ on the modulator cathode, so that it sees a voltage swing from $-V_{DD} - V_b$ to $V_{DD} - V_b$. The transmitter contains a custom-designed high-speed 16-to-1 serializer, which uses a fast tri-state gate as its last-stage multiplexer. Inside the 16-to-1 serializer, differential signals are generated from the second-to-last stage instead of the last stage to avoid timing issues. This approach generates differential data signals at 40Gb/s with sufficient timing margins and low power.
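The bias tee numbers above can be checked with the first-order RC high-pass formula. The sketch below also evaluates the voltage range seen by the modulator, using as an example the supply and bias values reported later in Section 4.4.1 (VDD = 1.2V, Vb = 1.7V); the function names are illustrative.

```python
import math


def highpass_cutoff_hz(resistance_ohm, capacitance_f):
    """-3dB corner of a first-order RC high-pass: 1 / (2*pi*R*C)."""
    return 1.0 / (2.0 * math.pi * resistance_ohm * capacitance_f)


def modulator_voltage_range(vdd, vb):
    """Voltage range seen by the modulator: -VDD - Vb to VDD - Vb."""
    return (-vdd - vb, vdd - vb)


cutoff_hz = highpass_cutoff_hz(15e3, 2e-12)        # 15 kOhm bias resistor, 2 pF
v_low, v_high = modulator_voltage_range(1.2, 1.7)  # example values from Section 4.4.1
print(f"cut-off {cutoff_hz / 1e6:.1f} MHz, swing {v_low:.1f} V to {v_high:.1f} V")
```

The peak-to-peak swing is 2·VDD regardless of the bias, while Vb sets how far into reverse bias the junction operates.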

A digital LC-PLL provides 10GHz CMOS and 20GHz CML clock sources for the double data rate (DDR) serializer. This PLL uses a bang-bang phase detector and a digital loop filter with ΣΔ modulation [51]. Two CML-to-CMOS clock converters generate 20GHz differential full-swing clock signals for the last stage in the serializer. The serializer and driver circuits are optimized to support up to 50Gb/s NRZ modulation. The highest modulation data rate is set by the highest clock frequency generated by the LC-PLL.
4.3.2 NRZ transmitter with single-ended driver

A full NRZ transmitter with single-ended driver is designed as a baseline reference as shown in Fig. 4.5. This transmitter uses the same clocking circuits and digital backend. Instead of having two drivers for push-pull operation, only a single driver head is used to drive the microring modulator. Therefore this version of the transmitter would reduce the total circuit power at the cost of a reduced voltage swing and a reduced OMA. The single-ended version also reduces the driver area from 0.0020 mm$^2$ to 0.0016 mm$^2$ by removing the large AC-coupling capacitor. A detailed comparison is discussed in Section 4.4.

4.3.3 PAM4 transmitter

As discussed in Section 3.3.5, two different PAM4 transmitter architectures can be used to generate the PAM4 signal. PAM4 modulation doubles the data rate from NRZ modulation under the same bandwidth constraint. One can either use the linearly segmented microring as the optical DAC or use microrings with two binary-weighted segments to generate the four signal levels in PAM4 signaling. Microring modulators designed for high-speed operation typically have a very linear electro-optical response due to their low quality factor. Therefore, a microring modulator with two binary segments should be the preferred device structure for PAM4 modulation. A high-speed PAM4 microring transmitter is designed in the same process with binary-weighted segments in microrings, reusing most of the high-speed circuits designed for NRZ modulation, as shown in Fig. 4.6. The original 16-to-1 serializer is changed to two 8-to-1 serializers. As a result, the PAM4 transmitter targets half of the baud rate (20GBaud) and the same data rate (40Gb/s).

Since we use the spoked microring modulator with intrinsic p-n junction segments, the modification on photonic devices is only on the metal routing and electrode connections. Two additional segments are tied to constant bias voltage between MSB and LSB segments of the microring in order to form reverse bias and electrically decouple the LSB and MSB signals. The other design parameters of the PAM4 microring stay the same as the optimized NRZ microring.
4.3.4 Digital PLL

A digital LC-PLL is designed to provide the on-chip high-speed clock sources for the transmitter. This PLL uses a bang-bang phase detector (BPD) and a digital loop filter with ΣΔ modulation [51]. ΣΔ modulation is implemented to improve the effective resolution of the digitally controlled oscillator (DCO) and reduce the effect of quantization noise on the jitter performance. The system diagram of the digital PLL is shown in Fig. 4.7.

The detailed block diagram of the digital PLL is shown in Fig. 4.8. The digital backend consists of a bang-bang phase detector, a frequency detector, a digital filter and the ΣΔ modulator. A digitally controlled LC oscillator is custom designed with a tunable capacitor DAC. The tuning range of the DCO frequency is from 16GHz to 21GHz with the reference clock frequency equal to 1/32 of the output clock frequency ($f_{\text{ref}}$: 500MHz - 656MHz).
Fig. 4.7: System diagram of the digital PLL

Fig. 4.9 shows the block diagram of the high-speed clock divider. A CML divider is used for the first stage of the divider chain and is followed by AC-coupled CML-to-CMOS converters and a 16-to-1 CMOS divider chain. Fig. 4.10 shows the schematic of the LC-DCO, where the inductor has an inductance of 580pH and the LC tank has a quality factor of 10. The cap DAC consists of 17 LSB capacitor units and 31 MSB capacitor units, both of which are thermometer coded. The output of the ΣΔ modulator drives one of the LSB units. The sizes of the switch transistors are optimized to achieve the highest quality factor while maintaining a reasonable tuning range. The layout of the digital PLL is shown in Fig. 4.11 with dimensions of 250µm by 80µm. The sub-blocks of the PLL are labeled in the layout, including the inductor, decap array, cap DAC, divider, scan chain and digital control logic.
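To get a feel for the capacitance range the cap DAC has to cover, the sketch below applies the ideal LC resonance formula $f = 1/(2\pi\sqrt{LC})$ to the 580pH inductor over the 16-21GHz tuning range, and derives the reference frequency from the divide-by-32 ratio. It assumes an ideal lossless tank, so the result is the total tank capacitance (cap DAC plus all parasitics), not the DAC value itself.

```python
import math

INDUCTANCE_H = 580e-12   # tank inductance from the text
DIVIDE_RATIO = 32        # output clock / reference clock


def tank_capacitance_f(freq_hz, inductance_h=INDUCTANCE_H):
    """Total tank capacitance for an ideal LC resonance at freq_hz."""
    return 1.0 / ((2.0 * math.pi * freq_hz) ** 2 * inductance_h)


def reference_freq_hz(output_freq_hz, ratio=DIVIDE_RATIO):
    """Reference clock frequency for a given DCO output frequency."""
    return output_freq_hz / ratio


c_at_16ghz = tank_capacitance_f(16e9)
c_at_21ghz = tank_capacitance_f(21e9)
print(f"C = {c_at_16ghz * 1e15:.0f} fF at 16GHz, {c_at_21ghz * 1e15:.0f} fF at 21GHz")
print(f"f_ref = {reference_freq_hz(16e9) / 1e6:.0f} MHz to "
      f"{reference_freq_hz(21e9) / 1e6:.0f} MHz")
```

Under this idealization the total tank capacitance spans roughly 99fF to 171fF across the tuning range.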

4.3.5 Overview of the test chip

The test chip for this transmitter is designed and fabricated in a 45nm SOI CMOS process. The chip is flip-chip packaged onto a high-density printed circuit board and has its silicon substrate removed for electro-optical measurements. The micrographs of the test chip and the sub-blocks are shown in Fig. 4.12. Our platform uses vertical grating couplers and the measured coupling loss is < 3dB per coupler at 1300nm.
**4.4 Measurement Results**

### 4.4.1 NRZ transmitter performance

The optical transmitter with AC-coupled driver (Section 4.3.1) is measured with VDD = 1.2V and Vb = 1.7V, so the voltage that the modulator sees swings between -0.5V and -2.9V. The modulator is always reverse-biased to keep enough electric field in the depletion region to sweep out the generated carriers for fast modulation. The measured Q factor for the microring is around 7000. Fig. 4.13 shows the measured NRZ eye diagrams at 20Gb/s and 40Gb/s. Dynamic insertion loss (IL) and extinction ratio (ER) are measured on the optical scope. At 20Gb/s, the transmitter achieves 4dB IL and 3dB ER, while at 40Gb/s, it achieves 4.7dB IL and 3dB ER with the same supply and bias voltage.

The measured energy efficiency for the full transmitter is 0.36pJ/b at 20Gb/s and 0.33pJ/b at 40Gb/s, excluding the digital PLL, whose power is 14.4mW. Fig. 4.14 shows the detailed power breakdown. For 40Gb/s, the modulator and driver stage consumes only 40fJ/b. The serializer, the clock divider and the clock buffers consume 290fJ/b, and the digital PLL consumes 360fJ/b at this data rate.
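The energy-per-bit figures follow directly from dividing power by data rate. The short sketch below reproduces the 40Gb/s numbers from the values quoted in this paragraph; the helper name is illustrative. It suggests that the 330fJ/b figure is the sum of the modulator/driver and serializer/clocking contributions, with the PLL counted separately.

```python
def energy_fj_per_bit(power_mw, rate_gbps):
    """mW divided by Gb/s gives pJ/b; multiply by 1e3 for fJ/b."""
    return power_mw / rate_gbps * 1e3


RATE_GBPS = 40.0
pll_fj = energy_fj_per_bit(14.4, RATE_GBPS)   # digital PLL, 14.4mW
modulator_driver_fj = 40.0                    # from the text
serializer_clocking_fj = 290.0                # serializer, clock divider, clock buffers

tx_without_pll_fj = modulator_driver_fj + serializer_clocking_fj
tx_with_pll_fj = tx_without_pll_fj + pll_fj
print(f"PLL: {pll_fj:.0f} fJ/b, TX without PLL: {tx_without_pll_fj:.0f} fJ/b, "
      f"TX with PLL: {tx_with_pll_fj:.0f} fJ/b")
```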

As shown in Section 4.3.2, a reference transmitter with 1.2V voltage swing is also designed and tested, where the anode of the modulator is connected to one driver and cathode tied to constant bias. Compared with this reference design, the AC-coupled transmitter increases the optical modulation amplitude by 70% and reduces the insertion loss and hence the required laser power by 40%. This improves the overall energy efficiency significantly, as laser source can be the dominant energy consumer in microring based optical transmitters. The detailed performance comparison and power analysis of these two transmitter designs are summarized in Fig. 4.15.

4.4.2 PAM4 transmitter performance

The optical PAM4 transmitter based on two-segment microrings is also demonstrated as shown in Fig. 4.16. The measured ER is 3.0dB and the measured IL is 6.0dB, which is
consistent with the NRZ results of the single-ended driver in Fig. 4.15. This result proves that the two-segment approach indeed works with very good linearity at high data rates, in this case, 40Gb/s. This provides a simple DAC-less solution for microring-based PAM4 transmission without using an electrical DAC or an optical DAC.
**Figure 4.12: Die photo of test chip and sub-blocks** (figure labels: transmitter test macros; transmitter macro; 0.35 mm; 0.18 mm; grating coupler; digital PLL; driver; serializer; digital backend; heater driver; chip 3.0x3.0mm; 10 µm)

Table 4.1: Measured transmit 20Gb/s and 40Gb/s NRZ eye-diagrams and dynamic IL/ER

<table>
<thead>
<tr>
<th>Modulation</th>
<th>20 Gb/s NRZ</th>
<th>40 Gb/s NRZ</th>
</tr>
</thead>
<tbody>
<tr>
<td>Insertion Loss (IL)</td>
<td>4.0 dB</td>
<td>4.7 dB</td>
</tr>
<tr>
<td>Extinction Ratio (ER)</td>
<td>3.0 dB</td>
<td>3.0 dB</td>
</tr>
<tr>
<td>Voltage for Bit 1 (V1)</td>
<td>-2.9 V</td>
<td>-2.9 V</td>
</tr>
<tr>
<td>Voltage for Bit 0 (V0)</td>
<td>-0.5 V</td>
<td>-0.5 V</td>
</tr>
</tbody>
</table>

---

**Figure 4.13: Measured transmit 20Gb/s and 40Gb/s NRZ eye-diagrams and dynamic IL/ER**
4.4.3 Comparison to prior works

The measured results are summarized and compared with other state-of-the-art optical transmitters in Fig. 4.17. Thanks to the monolithic integration and co-optimization, this work has achieved higher bandwidth density and improved energy efficiency than the MRM-based transmitters with electronics and photonics on separate dies. It has also achieved the fastest data rate and the highest energy efficiency and bandwidth density compared to prior works in monolithic silicon photonics transmitters.

4.5 Summary

We have demonstrated a 40Gb/s optical NRZ transmitter using MRM in 45nm SOI process. Electronic-photonic co-design with the high swing driver enabled this transmitter to achieve a total energy efficiency of 330fJ/b and the photonics and modulator driver area bandwidth density of 6.7Tb/s/mm² at 40Gb/s. These performance metrics make the MRM-based transceivers an attractive solution for the next-generation inter and intra-rack photonic interconnects.
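The bandwidth density quoted above can be reproduced from the data rate and the areas listed for this work in Fig. 4.17 (photonics 0.004 mm², driver 0.002 mm²). The function name in this sketch is illustrative.

```python
def bandwidth_density_tbps_per_mm2(rate_gbps, photonics_area_mm2, driver_area_mm2):
    """Data rate divided by combined photonics and driver area, in Tb/s/mm^2."""
    return rate_gbps / (photonics_area_mm2 + driver_area_mm2) / 1e3


PHOTONICS_AREA_MM2 = 0.004   # from Fig. 4.17
DRIVER_AREA_MM2 = 0.002      # from Fig. 4.17

density_40g = bandwidth_density_tbps_per_mm2(40.0, PHOTONICS_AREA_MM2, DRIVER_AREA_MM2)
density_20g = bandwidth_density_tbps_per_mm2(20.0, PHOTONICS_AREA_MM2, DRIVER_AREA_MM2)
print(f"{density_40g:.1f} Tb/s/mm^2 at 40Gb/s, {density_20g:.1f} Tb/s/mm^2 at 20Gb/s")
```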
Measured Power Consumption Breakdown

<table>
<thead>
<tr>
<th></th>
<th>TX A (20Gb/s)</th>
<th>TX A (40Gb/s)</th>
<th>TX B (20Gb/s)</th>
<th>TX B (40Gb/s)</th>
</tr>
</thead>
<tbody>
<tr>
<td>Modulator &amp; Driver (mW)</td>
<td>0.4</td>
<td>0.8</td>
<td>0.8</td>
<td>1.6</td>
</tr>
<tr>
<td>Serializer (mW)</td>
<td>5.6</td>
<td>10.1</td>
<td>6.4</td>
<td>11.6</td>
</tr>
<tr>
<td>Digital PLL (mW)</td>
<td>14.4</td>
<td>14.4</td>
<td>14.4</td>
<td>14.4</td>
</tr>
<tr>
<td>Total TX Power (mW)</td>
<td>20.4</td>
<td>25.3</td>
<td>21.6</td>
<td>27.6</td>
</tr>
<tr>
<td>TX Energy Efficiency (pJ/b)</td>
<td>1.0</td>
<td>0.63</td>
<td>1.1</td>
<td>0.69</td>
</tr>
</tbody>
</table>

Figure 4.15: Performance comparison between single-ended and AC-coupled NRZ transmitters

Figure 4.16: Measured PAM4 eye diagram and transmit waveform
<table>
<thead>
<tr>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
</tr>
</thead>
<tbody>
<tr>
<td>Integration</td>
<td>Wire Bonding</td>
<td>Wire Bonding</td>
<td>Monolithic</td>
<td>Monolithic</td>
<td>Monolithic</td>
</tr>
<tr>
<td>CMOS Technology</td>
<td>28nm</td>
<td>65nm</td>
<td>90nm SOI</td>
<td>45nm SOI</td>
<td>45nm SOI</td>
</tr>
<tr>
<td>Supply</td>
<td>1.1V</td>
<td>1.2V, 2.4V</td>
<td>-</td>
<td>1.55V</td>
<td>1.2V</td>
</tr>
<tr>
<td>Output Swing</td>
<td>$1.8V_{pp}$</td>
<td>$4.4V_{pp-diff}$</td>
<td>$1.85V_{pp-diff}$</td>
<td>$1.55V_{pp}$</td>
<td>$2.4V_{pp-diff}$</td>
</tr>
<tr>
<td>Modulator Type</td>
<td>Microring</td>
<td>Microring</td>
<td>MZM</td>
<td>Microring</td>
<td>Microring</td>
</tr>
<tr>
<td>NRZ Data-rate (Gb/s)</td>
<td>50</td>
<td>25</td>
<td>32</td>
<td>20</td>
<td>20</td>
</tr>
<tr>
<td>ER (dB)</td>
<td>5.0</td>
<td>7.0</td>
<td>4.4</td>
<td>3.0</td>
<td>3.0</td>
</tr>
<tr>
<td>IL (dB)</td>
<td>5.0 (1)</td>
<td>5.0 (1)</td>
<td>4.9</td>
<td>5.5</td>
<td>4.0</td>
</tr>
<tr>
<td>Modulator and Driver Energy Efficiency (pJ/b)</td>
<td>0.61</td>
<td>2.47</td>
<td>4.4</td>
<td>0.155</td>
<td>0.04</td>
</tr>
<tr>
<td>Serializer Energy Efficiency (pJ/b)</td>
<td>-</td>
<td>1.25</td>
<td>-</td>
<td>0.25</td>
<td>0.32 (4), 0.29 (4)</td>
</tr>
<tr>
<td>Photonics Area (mm$^2$)</td>
<td>0.065 (2)</td>
<td>0.05 (2)</td>
<td>1.46 (2)</td>
<td>0.01</td>
<td>0.004</td>
</tr>
<tr>
<td>Driver Area (mm$^2$)</td>
<td>0.085 (2)</td>
<td>0.1 (3)</td>
<td>0.042 (2)</td>
<td>0.001</td>
<td>0.002 (3)</td>
</tr>
<tr>
<td>Photonics and Driver BW Density (Tb/s/mm$^2$)</td>
<td>0.33</td>
<td>0.17</td>
<td>0.02</td>
<td>1.8</td>
<td>3.3</td>
</tr>
</tbody>
</table>

(1) Estimated from reported DC data  
(2) Measured from chip photos in the paper or related papers  
(3) Including the serializer area in measurements  
(4) Including clock divider and clock buffers

Figure 4.17: Performance comparison to prior works on high-speed silicon photonics transmitters
Chapter 5

3D Integrated Silicon Photonic Interconnects

To enable full optical links for interconnection networks, high speed and low power optical transmitters as well as high bandwidth and high sensitivity optical receivers are required. These requirements call for close integration in order to achieve small parasitic capacitance between electronics and photonic devices. Furthermore, a two-wafer solution is desirable to separately optimize the performance of the photonic components and the CMOS circuits. This work demonstrates for the first time an optical chip-to-chip link built in a heterogeneous, 3D integration platform using thru-oxide via (TOV) technology [52]. The TOV technology overcomes the challenges of close integration of electronic and photonic components, by simultaneously enabling separate wafer optimization of electronic and photonic components while providing a low-capacitance, high-density connection between the photonic and electronic wafers.

The chapter presents a full optical chip-to-chip link demonstrated in a wafer-scale heterogeneous platform [18,53]. We first introduce the background of silicon photonics interconnects and platforms in section 5.1. In section 5.2, we discuss the detailed system and circuit implementation for 3D integrated silicon photonics transceiver. We also present the measurement results for the transceiver. Full link implementation and measurement results are discussed and analyzed in section 5.3.

5.1 3D Integration of CMOS and Photonics

Traditional heterogeneous platforms capitalize on the ability to individually optimize the photonic and electronic macros, an element missing in other forms of integration. However, the large interface capacitance associated with thru-silicon via (TSV) and $\mu$-bump technologies limits the overall system performance as well as energy-efficiency.

As illustrated in Figure 5.1 and Figure 2.5, in this process, 300mm photonic and electronic wafers are manufactured separately in CNSE 300mm foundry and then bonded face-to-face using oxide bonding. The silicon substrate is then removed on the photonic SOI wafer and TOVs are punched through at 4µm pitch to connect the top layer metal of the photonic wafer to the top layer metal on the 65nm bulk CMOS wafer.

For packaging, wire-bonded back metal pads are deposited on top of the selected TOVs. The connection from the CMOS wafer to the photonic device is achieved through the TOVs passivated on top with an oxide layer, which minimizes the parasitic capacitance. Our measurements estimate the TOV capacitance to be 3fF, which enables low-power and high-sensitivity electronic-photonic systems for a variety of applications. This represents an order of magnitude reduction in parasitic capacitance, and two orders of magnitude higher density compared to previously demonstrated $\mu$-bump flip-chip electronic-photonic integration approaches.

5.2 Circuits and System Implementation

5.2.1 Chip architecture

The optical chip-to-chip link is a part of the wafer-scale heterogeneously integrated technology-development and demonstration platform with low-energy optical transmitters, receivers,
and comprehensive backends for performance characterization (Figure 5.2). Apart from containing vertical junction depletion mode microdisk modulators [5] within the photonics die, hetero-epitaxially grown Germanium photodiodes and body crystalline silicon low-loss waveguides are also used to enable electro-optic transceiver functionality. The 16M transistor electronic chip contains 32 Multicell sub-blocks that enable a full self-test of modulators and receivers within the link.

Each Multicell is composed of eight RX as well as eight TX macros, enabling in-situ testing of a wide variety of photonic devices. The Multicell also contains an expansive digital backend infrastructure to enable full, self-contained characterization of each of the eight TX and RX sites. Characterization is accomplished through on-chip, self-seeding PRBS generators and counters. The $2^{31} - 1$ length PRBS data sequence is fed into one of the TX macro sites, which serializes the data and drives the resonant modulator device, imprinting the data sequence on the light in the photonic waveguide. On the RX side, this modulated light is fed into one of the eight RX macros. The output of this RX macro is an eight-channel bus carrying the deserialized input optical data. These eight channels proceed into the backend's bit-error-rate (BER) checkers, which count the total number of errors between the received data from the RX macro and the ideal sequence provided by the seeded PRBS generator. Each of the TX and RX sites also contains thermal tuning circuits for stabilizing the resonance wavelengths of microring devices.
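The self-seeding PRBS characterization described above can be sketched in software. The example below is a serial, bit-at-a-time illustration rather than the on-chip eight-channel implementation, and it assumes the commonly used PRBS31 polynomial $x^{31} + x^{28} + 1$; the text only specifies the sequence length, so the polynomial and all function names are assumptions.

```python
MASK31 = (1 << 31) - 1  # keep the shift register 31 bits wide


def prbs31_step(state):
    """Advance a 31-bit LFSR (x^31 + x^28 + 1) by one bit; return (state, bit)."""
    new_bit = ((state >> 30) ^ (state >> 27)) & 1
    return ((state << 1) | new_bit) & MASK31, new_bit


def prbs31_bits(seed, n):
    """Generate n PRBS31 bits from a nonzero seed."""
    state = seed & MASK31
    if state == 0:
        raise ValueError("seed must be nonzero")
    bits = []
    for _ in range(n):
        state, bit = prbs31_step(state)
        bits.append(bit)
    return bits


def count_bit_errors(received):
    """Seed a local LFSR from the first 31 received bits, then count mismatches."""
    state = 0
    for bit in received[:31]:
        state = ((state << 1) | bit) & MASK31
    errors = 0
    for bit in received[31:]:
        state, expected = prbs31_step(state)
        errors += int(expected != bit)
    return errors


tx_bits = prbs31_bits(seed=0x1234567, n=2000)
rx_bits = list(tx_bits)
rx_bits[1000] ^= 1  # inject a single bit error after the seeding window
print(count_bit_errors(tx_bits), count_bit_errors(rx_bits))
```

Because the checker seeds itself from the incoming data, it needs no knowledge of the transmitter's starting state. The trade-off is that an error inside the 31-bit seeding window would corrupt the local reference.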
5.2.2 Transmitter design

The TX macro (Figure 5.3) consists of a tunable vertical junction depletion-mode ring resonator similar to [5, 53] driven by an 8-to-1 serializer and driver head with on-chip PRBS input. The applied reverse-bias voltage to the junction via the driver head depletes free carriers and perturbs the refractive index of silicon, which in turn shifts the resonance wavelength (or frequency) of the optical modulator. The cathode of the modulator diode is connected to 1.2V while the anode is modulated from 0 to 1.2V. The modulator p-n junction is reverse-biased during modulation.

Given that the leakage current is small, the energy is consumed only when the transitions charge the reverse-biased junction capacitor. With a total modulator driver capacitive load of 12.4fF (modulator diode and TOV), at 6Gb/s the whole macro consumes 100fJ/b (5fJ/b modulator, 15fJ/b driver, and 80fJ/b serializer).
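A first-order estimate of the modulator switching energy can be made from the capacitive load and the voltage swing. The sketch below assumes random NRZ data, so that a charging (0-to-1) transition occurs in one quarter of the bit periods, and assumes that each such transition draws $CV^2$ from the supply. These are textbook assumptions and not necessarily how the 5fJ/b figure was obtained.

```python
def nrz_switching_energy_fj_per_bit(cap_ff, swing_v, transition_prob=0.25):
    """Average supply energy per bit for random NRZ data: P(0->1) * C * V^2.

    With capacitance in fF and voltage in V, the result is in fJ.
    """
    return transition_prob * cap_ff * swing_v ** 2


modulator_fj = nrz_switching_energy_fj_per_bit(cap_ff=12.4, swing_v=1.2)
macro_total_fj = 5 + 15 + 80   # modulator + driver + serializer, from the text
print(f"estimated modulator energy: {modulator_fj:.1f} fJ/b, "
      f"macro total: {macro_total_fj} fJ/b")
```

The estimate of about 4.5fJ/b is in line with the reported 5fJ/b for the modulator.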

Heterogeneous integration allows us to use the state-of-the-art ring resonant modulators with a large electro-optic response of 150pm/V (20GHz/V), which enables low power modulation using small voltage swing (1.2V) while still maintaining sufficient extinction ratio (Figure 5.4(a)). Measured from the modulator transmission spectra at 0V and -1.2V dc biases, the device should ideally achieve 6.2dB extinction ratio (ER) and 1.8dB insertion losses (IL). The modulator can also be modulated between a slightly forward-biased regime and depletion regime by lowering the bias voltage of the anode (i.e. -0.2 to 1.0V). This will further improve extinction ratio of the modulator.
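The two figures quoted for the electro-optic response (150pm/V and 20GHz/V) are related by $\Delta f = c\,\Delta\lambda / \lambda^2$. The sketch below evaluates this conversion, assuming an operating wavelength of 1520nm as used for the measurements in this chapter.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def wavelength_shift_to_ghz(shift_pm, wavelength_nm):
    """Convert a small wavelength shift to a frequency shift: c * dlambda / lambda^2."""
    shift_m = shift_pm * 1e-12
    wavelength_m = wavelength_nm * 1e-9
    return SPEED_OF_LIGHT_M_PER_S * shift_m / wavelength_m ** 2 / 1e9


shift_ghz_per_v = wavelength_shift_to_ghz(150.0, 1520.0)
print(f"150 pm/V at 1520 nm is about {shift_ghz_per_v:.1f} GHz/V")
```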

A tunable CW laser source was coupled to an on-chip silicon waveguide through a vertical grating coupler. The laser frequency was aligned adjacent to the resonance frequency of the modulator ring (λ ≈ 1520nm, see Figure 5.4(a)). The TX circuits drive the 31-bit PRBS sequence into the modulator, achieving the non-return-to-zero on-off keying (NRZ-OOK) modulation eye at 6Gb/s, as shown in Figure 5.4(b), with 6dB extinction ratio and 2dB of insertion loss, which agrees well with the transmission spectrum. The fast rise-time indicates the potential for faster operation, but the results are currently limited by the global high-speed clock distribution network that spans the whole chiplet and supplies the clock to all the Multicell macros.

5.2.3 Receiver design

The receiver (Figure 5.5) consists of a Ge photodiode placed on top of the electronics and connected to the receiver circuitry via TOVs with minimal parasitic capacitance. In Figure 5.6, the TIA-based receiver circuit has a pseudo-differential front-end with a cascode pre-amplifier feeding into double-data rate (DDR) sense-amplifiers and dynamic-to-static converters (D2S). The TIA stage with 3kOhm feedback contains a 5-bit current bleeder at the input node, which is set to the average current of the photodiode. This allows the TIA input and output to swing around the midpoint voltage of the inverter. The TIA input and output are directly fed into a cascode amplifier with resistive pull up.

The bias voltage of the cascode is tuned through a 5-bit DAC. Adjusting this bias voltage results in a trade-off between the output common-mode voltage and the signal gain of the cascode stage. More specifically, increasing this bias voltage results in a higher cascode gain but lower output common-mode voltage that reduces the sense-amplifier speed. For a given data rate, an optimal bias voltage is determined so as to minimize the overall evaluation time of the sense amplifier. The subsequent sense amplifiers then evaluate the cascode outputs before the data are deserialized and fed into on-chip BER checkers. Each sense amplifier has a coarse, 3-bit current bleeding DAC as well as a fine, 5-bit capacitive DAC for offset correction. An external Mach-Zehnder modulator with extinction ratio of about 10dB driven
Figure 5.5: 3D render and die photo of Ge photodetector.

Figure 5.6: Optical receiver schematic.
The responsivity and bandwidth of this process variant of the Ge photodiode in [53] are shown in Figure 5.7. At 1520nm, the responsivity is 0.73A/W, resulting in optical RX sensitivity of -14.5dBm at 7Gb/s, for electrical sensitivity of 26µA. The overall energy consumption is 340fJ/bit. The TIA+cascode pre-amplifier stage consumes 70fJ/bit. The sense amplifier, current plus capacitive correction DACs, and the dynamic-to-static converter together consume 120fJ/bit. Finally, the deserializer consumes 150fJ/bit. Figure 5.8 shows the sensitivity of the receiver as a function of data rate. Additionally, bathtub curves for the two slices of the DDR receiver are also shown.
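The relation between electrical and optical sensitivity can be checked from the photodiode responsivity. The sketch below assumes an electrical sensitivity of 26µA and the stated responsivity of 0.73A/W, and also sums the receiver energy breakdown; the function name is illustrative.

```python
import math


def optical_sensitivity_dbm(current_a, responsivity_a_per_w):
    """Optical power (dBm) needed to produce a given photocurrent."""
    power_w = current_a / responsivity_a_per_w
    return 10.0 * math.log10(power_w / 1e-3)


sensitivity_dbm = optical_sensitivity_dbm(26e-6, 0.73)
rx_energy_fj = 70 + 120 + 150   # TIA+cascode, sense amp/DACs/D2S, deserializer
print(f"optical sensitivity: {sensitivity_dbm:.1f} dBm, RX energy: {rx_energy_fj} fJ/bit")
```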
5.2.4 Thermal Tuner Design

We designed thermal tuning circuits to stabilize the resonance wavelength of microring resonators in order to compensate for process variations and temperature fluctuations. The thermal tuner for microring transmitters is based on a bit-statistical tuning algorithm [31]. A similar thermal tuning backend is implemented in the 65nm process. The system diagram of the tuning backend is shown in Figure 5.9.

As shown in Figure 5.9, a drop-port waveguide is weakly coupled to the microring resonator to detect the power level inside the microring. The photocurrent at the drop port is then integrated and quantized by a ring-oscillator-based SAR ADC. The power levels for optical 1 and 0 can be calculated by the tuning backend based on knowledge of the transmitted data. With the goal of maximizing the optical eye opening, a thermal controller actively sets the coefficients for a sigma-delta heater DAC. This heater DAC drives the embedded silicon heater inside the microring and controls the local temperature and thereby the resonance of the microring. For initial locking, the heater strength is swept to search for the laser wavelength and the optimal locking point (Figure 5.10). The heater strength that maximizes the optical eye opening is stored during this initial sweep. The heater strength is then reset to this optimal value while the thermal tuning loop continues to thermally lock the microring. The eye diagrams captured in a slowed-down thermal locking process show that the thermal tuning loop works as expected.
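The initial heater sweep can be illustrated with a toy model. The sketch below is not the tuning backend of [31]: it assumes a normalized Lorentzian drop-port response, a heater that shifts the resonance linearly with the DAC code, and a fixed modulation-induced resonance shift. All function names and numeric values are hypothetical.

```python
def drop_power(detuning):
    """Normalized Lorentzian drop-port power.

    `detuning` is the laser-to-resonance offset in units of the ring
    linewidth (FWHM). Assumed model, for illustration only.
    """
    return 1.0 / (1.0 + 4.0 * detuning ** 2)


def eye_opening(code, cold_offset=-5.0, shift_per_code=0.05, mod_shift=0.5):
    """Drop-port level difference between bit 1 and bit 0 for one heater code.

    The laser sits at offset 0. The resonance for bit 0 starts at
    `cold_offset` and moves by `shift_per_code` per heater code; a bit 1
    moves it by a further `mod_shift`. All values are hypothetical.
    """
    resonance_bit0 = cold_offset + shift_per_code * code
    resonance_bit1 = resonance_bit0 + mod_shift
    p0 = drop_power(0.0 - resonance_bit0)
    p1 = drop_power(0.0 - resonance_bit1)
    return abs(p1 - p0)


def initial_sweep(codes=range(256)):
    """Sweep all heater codes and return the one with the largest eye."""
    return max(codes, key=eye_opening)


best_code = initial_sweep()
print(best_code, round(eye_opening(best_code), 3))
```

In this toy model the best code places the laser on a flank of the resonance, where the modulation-induced shift produces the largest level difference; a closed-loop tuner would then hold the ring near that point.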

5.3 Link Measurement Results

5.3.1 Link implementation

A 100-meter optical link operating at 5Gb/s is demonstrated (Figure 5.11) illustrating the functionality of all the required optical and electrical components in this heterogeneous platform. The experiment setup in the lab is shown in Figure 5.12 with transmitter chip and receiver chip mounted on the same optical table.

Figure 5.11 also shows the optical power breakdown per stage within the full link. A CW laser at λ = 1520nm is coupled to the on-chip TX macro of Chip 1 using a vertical grating coupler. The coupler results in 7.5dB of loss in optical power. PRBS data generated within this TX macro are fed into the modulator driver, which in turn modulates the ring resonator. The output of the TX macro including the coupler is the modulated light with 6dB extinction ratio. This light is fed into an optical amplifier providing 8dB of gain. The 8dB amplifier is necessary to mitigate part of the 15dB chip-to-chip coupler loss in the optical data path (7.5dB per coupler) due to unoptimized coupler designs. The amplifier feeds into the 100-meter fiber, which is followed by a 90/10 power splitter. A monitoring scope, using the 10% output, is used to ensure that an optical eye is visible. The 90% output is
Figure 5.9: System diagram of the thermal tuning loop.
coupled into the RX macro. The Ge photodiode is used within the RX macro to convert incoming optical data to an electrical bit stream. This photodiode sees -12.3dBm and -18.3dBm of optical power for a bit 1 and a bit 0, respectively.

Figure 5.13 shows the output BER plot, indicating a BER below $10^{-10}$. This BER plot sweeps two parameters within the RX macro. First, the delay of the RX clock with respect to the TX clock is shown on the x-axis. Second, the corrective capacitor DAC within the receiver sense amplifiers is swept and shown on the y-axis. For particular delays and capacitive DAC values, a steady BER $<10^{-10}$ is observed, illustrating the margins for the robust operation of the link. The transceiver electrical energy cost is 560fJ/bit and the optical energy cost is 4.2pJ/bit (taking into account the amplifier gain). With optimized couplers (<3dB loss is readily achievable in the literature [54]), the required optical power would drop below 0dBm (200fJ/bit at 5Gb/s), thereby eliminating the need for the optical amplifier.
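The optical energy figures follow from dividing optical power by bit rate. The short sketch below assumes the 5 Gb/s link rate stated in this section; the equivalent optical power of 21 mW is derived here from the quoted 4.2 pJ/bit and is not a number given in the text.

```python
import math

BIT_RATE_BPS = 5e9  # link data rate used in this section


def dbm_to_watts(power_dbm):
    """Convert dBm to watts."""
    return 1e-3 * 10.0 ** (power_dbm / 10.0)


def watts_to_dbm(power_w):
    """Convert watts to dBm."""
    return 10.0 * math.log10(power_w / 1e-3)


def energy_per_bit_j(power_w, bit_rate_bps):
    """Optical energy per bit for a given average optical power."""
    return power_w / bit_rate_bps


# 4.2 pJ/bit at 5 Gb/s expressed as an equivalent optical power.
optical_power_w = 4.2e-12 * BIT_RATE_BPS
optical_power_dbm = watts_to_dbm(optical_power_w)

# 0 dBm at 5 Gb/s expressed as energy per bit, in fJ.
target_energy_fj = energy_per_bit_j(dbm_to_watts(0.0), BIT_RATE_BPS) * 1e15

print(f"{optical_power_w * 1e3:.1f} mW = {optical_power_dbm:.1f} dBm")
print(f"0 dBm at 5 Gb/s = {target_energy_fj:.0f} fJ/bit")
```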

5.3.2 Analysis and comparison

Figure 5.14 shows the electrical power breakdown of TX and RX macros within the link at 5Gb/s data rate. Figure 5.15 presents the comparison to previous non-monolithic electronic-photonic transceiver works.
Figure 5.11: Link budget of the full optical link.

Figure 5.12: Lab setup for full optical link testing.
Figure 5.13: Full optical link BER performance.

Figure 5.14: Electrical energy breakdown for TX and RX macros in a 5Gb/s link.
5.4 Summary

This work demonstrates the first large-scale 3D integrated photonic chip-to-chip link manufactured in a 300mm CMOS foundry. The functional 3D-assembled chips with 16M transistors and 1000s of photonic devices illustrate the high yield of the CMOS, photonic fabrication and 3D integration processes.

A full optical chip-to-chip link is demonstrated for the first time in a wafer-scale heterogeneous platform, where the photonics and CMOS chips are 3D integrated using wafer bonding and low-parasitic capacitance thru-oxide vias (TOVs). This development platform yields 1000s of functional photonic components as well as 16M transistors per chip module. The transmitter operates at 6Gb/s with an energy cost of 100fJ/bit and the receiver at 7Gb/s with a sensitivity of 26µA (-14.5dBm) and 340fJ/bit energy consumption. A full 5Gb/s chip-to-chip link, with the on-chip calibration and self-test, is demonstrated over a 100m single mode optical fiber with 560fJ/bit of electrical and 4.2pJ/bit of optical energy. These results show that the 3D integrated electronic-photonic platform holds great promise for future energy-efficient high-speed WDM communication links.
Chapter 6

Coherent Silicon Photonic Links

6.1 Introduction

As discussed in Chapter 2, the embedded laser power consumption for silicon photonic links has become a bottleneck for further improving overall energy efficiency. This problem is more prominent for high-speed optical interconnects, as receiver sensitivity degrades significantly at high data rates [19,20]. This chapter aims to explore new solutions to this problem from a link architecture perspective under the existing device constraints. To date, short-reach optical links for data centers are all non-coherent (e.g., 100G-SR4 and 100G-PSM4), using a simple intensity modulation direct detection (IMDD) architecture. Studies show that coherent detection schemes could in theory require far fewer photons per bit than IMDD schemes [7–9]. In practice, coherent optical communication has been largely limited to long-haul and metro applications due to its high cost, power, and complexity. On the receiver side, the major challenges for coherent communication have been optical-carrier phase tracking and polarization mode dispersion (PMD) removal [7], which require high-speed analog-to-digital converters (ADCs) and digital signal processing (DSP). In contrast, cost and energy efficiency are the primary concerns for short-reach optical communication, and PMD is no longer an issue. As a result, coherent optical communication suitable for short-reach applications requires a different architecture from long-haul optical communication, which has yet to be demonstrated.

This study proposes a novel coherent link architecture tailored for low-cost energy efficient short-reach optical communication. The basic idea is to forward some portion of the laser power directly to the homodyne receiver, bypassing a significant part of the optical channel loss. This new architecture can not only enable coherent modulation schemes such as binary phase-shift keying (BPSK) but also save the overall laser power significantly. Section 6.2 describes the working principles and key benefits of the architecture. In Section 6.3, the BERs for the proposed link architecture in the noise-limited and swing-limited regimes
are first derived. The link performance is compared with the non-coherent architecture. Next, the impact of laser phase noise is taken into consideration, and the link performance is reevaluated. In Section 6.3.3, we explore the feasibility of using integrated silicon microring modulators for the proposed coherent architecture. The advantage of the microring-based coherent links is justified through static behavioral modeling and transient simulations. Finally, in Section 6.4, the concept of the laser-forwarding coherent link is demonstrated experimentally with a monolithic silicon photonics chip.

### 6.2 Laser-forwarding Coherent Link Architecture

We start the introduction of the proposed coherent architecture by reviewing the basics of balanced detection. The goal of balanced detection is to obtain accurate phase information from the received optical signal. To achieve that, the received signal $S$ is mixed with a local oscillator (LO) signal $L$ to generate a current signal that carries frequency and phase information [8]. The schematic of a balanced photodetector is shown in Fig. 6.1. In this configuration, signal and LO are first mixed through a 3dB coupler and then converted into currents by two identical photodiodes.

![Figure 6.1: Schematic of balanced detection. Signals $L$, $S$, $X_1$ and $X_2$ are complex numbers and represent the electric fields of the lightwaves. $i(t)$ is the final output current. Bias voltages are set to ensure the same reverse bias across the two identical photodiodes.](image)

The transfer matrix of an ideal lossless 3dB coupler in Fig. 6.1 is

$$
\begin{bmatrix}
X_1 \\
X_2
\end{bmatrix} = \frac{1}{\sqrt{2}} \begin{bmatrix}
1 & 1 \\
1 & -1
\end{bmatrix} \begin{bmatrix}
S \\
L
\end{bmatrix}. \tag{6.1}
$$

Assuming the same responsivity $R$ and different shot-noise $n_{s1}(t)$ and $n_{s2}(t)$, the currents generated by the two photodiodes are

$$
i_1(t) = R|X_1|^2 + n_{s1}(t), \quad i_2(t) = R|X_2|^2 + n_{s2}(t). \tag{6.2}
$$
The final output current is the difference between \( i_1(t) \) and \( i_2(t) \), which is derived as:

\[
i(t) = i_1(t) - i_2(t) = 2R\,\text{Re}\{SL^*\} + n_s(t), \tag{6.3}
\]

where \( n_s(t) \) is the total shot noise of the photodetectors. It is usually approximated as a zero-mean white Gaussian noise with power spectral density (PSD):

\[
S_n(\omega) = qR(|S|^2 + |L|^2). \tag{6.4}
\]
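The balanced-detection identity in Eq. 6.3 can be checked numerically. The sketch below ignores shot noise, applies the coupler matrix of Eq. 6.1 to two complex fields, and compares the photocurrent difference with $2R\,\text{Re}\{SL^*\}$; the power levels, phases and function names are arbitrary illustrative choices.

```python
import cmath
import math


def field(power_w, phase_rad, a=1.0):
    """Complex field with modulation term a, power and phase."""
    return a * math.sqrt(power_w) * cmath.exp(1j * phase_rad)


def balanced_current(s, lo, responsivity=1.0):
    """Noise-free output of the balanced detector (Eqs. 6.1 and 6.2)."""
    x1 = (s + lo) / math.sqrt(2.0)
    x2 = (s - lo) / math.sqrt(2.0)
    return responsivity * (abs(x1) ** 2 - abs(x2) ** 2)


def compare(a, dphi, p_s=1e-5, p_lo=1e-3, responsivity=1.0):
    """Return the detector output, the Eq. 6.3 value and the cosine form."""
    s = field(p_s, dphi, a)
    lo = field(p_lo, 0.0)
    i_detector = balanced_current(s, lo, responsivity)
    i_identity = 2.0 * responsivity * (s * lo.conjugate()).real
    i_cosine = 2.0 * a * responsivity * math.sqrt(p_s * p_lo) * math.cos(dphi)
    return i_detector, i_identity, i_cosine


for a in (+1.0, -1.0):
    for dphi in (0.0, 0.3, math.pi / 2.0):
        print(a, round(dphi, 3), compare(a, dphi))
```

For real-valued $a = \pm 1$ the three values agree, and the output vanishes when the signal and LO are in quadrature, which is why the phase difference has to be controlled.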

When the frequencies of the signal and the LO are the same, their electric fields are

\[
S = a(t)\sqrt{P_S}e^{j(\omega_S t + \phi_S(t))}, \quad L = \sqrt{P_{LO}}e^{j(\omega_{LO} t + \phi_{LO}(t))}, \tag{6.5}
\]

where \( P_S \) and \( P_{LO} \) are the power of the signal carrier and LO, \( \phi_S(t) \) and \( \phi_{LO}(t) \) are their phases, and \( a(t) \) is the ideal modulation term of the signal carrier. Note that \( a(t) \) is a complex number and can represent both phase and amplitude modulations. Now the final output current \( i(t) \) becomes:

\[
i(t) = 2a(t)R\sqrt{P_S P_{LO}} \cos(\phi_S(t) - \phi_{LO}(t)) + n_s(t). \tag{6.6}
\]

If the phase difference between the signal carrier and LO can be fixed, the final current signal would represent the modulation term \( a(t) \) directly. Unfortunately, this is not practical for long-reach coherent communications, as the carrier signal and LO originate from transmit- and receive-side laser sources that are kilometers apart. Therefore the frequencies of the signal carrier and the free running LO are inherently different and the discrepancy in phase is constantly changing. To overcome this issue, optical carrier phase estimation is commonly implemented in the digital domain.

The proposed architectures are shown in Figs. 6.2 and 6.3. Two variants are presented here, but the key principles are the same. In both architectures, the laser power is split at a certain ratio between transmitter and receiver. Part of the laser power goes to the transmitter for modulation, and the remaining laser power is forwarded to the receiver as an LO signal for homodyne balanced detection. Contrary to the case in long-reach communication, the optical LO and the optical carrier now originate from the same laser source. Hence they are inherently phase synchronous in the ideal situation. In practice, this architecture would be limited to short-reach links due to the impact of laser phase noise and phase drift. A detailed analysis of the feasible communication distance is carried out in Section 6.3.2. Throughout the study, the proposed architecture will be referred to as laser-forwarding BPSK or LF-BPSK, assuming a BPSK modulator is used in the link.

In the first architecture (Fig. 6.2), a single laser source is shared between the two chips and provides optical power for the whole bidirectional link. On each chip, laser power is further divided between the transmitter path and receiver LO path. A tunable phase shifter is added onto the LO path to provide the optimal offset phase \( \phi_{OS} \) for balanced detection. Assuming the laser source has zero phase noise for now, the phase offset would
Figure 6.2: Proposed laser-forwarding architectures (a): a single laser source is shared between Chip 1 and Chip 2.

Figure 6.3: Proposed laser-forwarding architectures (b): each chip has its own co-located laser source and forwards it to the receiving chip. Blue lines are the signal transmission fibers and red lines are the laser distribution fibers.
be used to cancel low-frequency phase noise caused by potential temperature fluctuations and mechanical vibrations. Considering laser phase noise, the signal-to-noise ratio (SNR) of the coherent receiver would inevitably degrade as the difference in propagation distances of the LO and the signal carrier increases. The physical distance between two chips is then strictly limited by laser linewidth, which makes it more suitable for ultra short-reach optical links such as on-board chip-to-chip optical interconnects. In this scenario, the shared laser diode could be mounted on the board and considered as an optical power supply for all the connected chips on the same board.

The second architecture in Fig. 6.3 can mitigate the laser linewidth limitation to a large extent. In this case, a separate laser source is placed in close proximity to each chip, and its output is split off-chip between the transmitter and the receiver LO. Part of the laser power is forwarded along with the modulated signal in a fiber bundle to the receiver. Because the lengths of the LO and signal fibers can be matched very well inside the bundle, the laser linewidth requirement can be relaxed significantly. Note that the first architecture does not require additional couplers since one level of power splitting happens on the chip, while the second architecture requires one additional coupler for the LO. However, a fiber array is usually coupled onto a single chip to provide very high bandwidth density. In this case, the LO signal can be coupled through a single coupler and distributed on-chip to the individual sites. Since the cost of one additional LO port is amortized within the fiber bundle, the packaging overhead for the second architecture should be acceptable (especially in the case of PSM-style links, where 4 or more fibers carry data modulated from the same transmit laser).

Figure 6.4: System diagram of the receiver in the proposed laser forwarding link architecture.

The proposed receiver system is shown in Fig. 6.4. The analog frontend (AFE) of the receiver consists of transimpedance amplifier (TIA) and continuous-time linear equalizer (CTLE) stages. A data sampler quantizes the data and outputs digital bits. An adaptive sampler and digital phase tuner are needed to control the offset phase $\phi_{OS}$ and maintain the maximum signal amplitude or the eye opening dynamically, a feedback control scheme similar to adaptive equalization in electrical digital communication [55].

To maximize signal swing, the phase difference $\phi_S(t) - \phi_{LO}(t)$ should be adjusted to zero:

$$i(t) = 2a(t)R\sqrt{P_S P_{LO}} + n_s(t). \tag{6.7}$$
If the modulator utilizes BPSK \((a = +1, -1)\), the noiseless signal swing is \(A = I_{bit1} - I_{bit0} = 4R\sqrt{P_S P_{LO}}\). In the proposed laser-forwarding coherent link, the signal and LO come from the same laser source. Therefore we have \(P_S = \alpha_c^3 \alpha_m k P_L\) and \(P_{LO} = \alpha_c (1-k) P_L\), where \(P_L\) is the laser output power, \(k\) is the splitting ratio \((0 < k < 1)\), \(\alpha_c\) is the coupler loss and \(\alpha_m\) is the modulator insertion loss. Substituting these parameters, the final signal amplitude in the proposed architecture can be written as

\[
A_{LF-BPSK} = 4R\sqrt{P_S P_{LO}} = 4RP_L \alpha_c^2 \sqrt{\alpha_m k (1-k)}. \tag{6.8}
\]

Maximum signal amplitude is reached when laser power is split evenly between signal and LO \((k = 0.5)\). Assuming the same total laser power \(P_L\), signal amplitude for IMDD link is

\[
A_{IMDD} = R\alpha_c^3 \alpha_m P_L. \tag{6.9}
\]

Note that we assume equal modulator insertion loss for IMDD and BPSK links for simplicity. We will revisit this comparison with realistic device models in section 6.3.3. Now with these assumptions, the ratio between the signal amplitude in LF-BPSK and IMDD is

\[
\left( \frac{A_{LF-BPSK}}{A_{IMDD}} \right)_{max} = \frac{2}{\alpha_c \sqrt{\alpha_m}}. \tag{6.10}
\]

Throughout the study, we assume 3dB coupler loss and 5dB modulator insertion loss if not specified. From Eq. 6.10, the signal gain could potentially reach 7.3×. When the noise of the receiver AFE dominates, this is equivalent to 17dB improvement in SNR. On the other hand, the total power can be reduced by 7.3× if the receiver sensitivity is kept the same. It is critical to understand two major reasons behind the laser power reduction. First, BPSK inherently has 3 dB signal gain over IMDD as twice the power is received by the receiver given the same signal amplitude. Second, the forwarded LO only has to go through one coupler in the LF-BPSK architecture, whereas in IMDD link the entire laser power has to go through all three couplers before hitting the photodetector. The new architecture mitigates the impact of coupler loss, saves laser power and potentially relaxes optical packaging requirements.
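Equations 6.8 to 6.10 are easy to evaluate directly. The sketch below assumes exactly 3 dB of coupler loss and 5 dB of modulator insertion loss, sweeps the splitting ratio $k$, and computes the amplitude ratio at the best split; function names are illustrative.

```python
import math


def loss_db_to_linear(loss_db):
    """Convert a loss in dB to a linear power transmission factor."""
    return 10.0 ** (-loss_db / 10.0)


def amplitude_lf_bpsk(p_laser, k, a_c, a_m, responsivity=1.0):
    """Signal swing of the laser-forwarding BPSK link (Eq. 6.8)."""
    return 4.0 * responsivity * p_laser * a_c ** 2 * math.sqrt(a_m * k * (1.0 - k))


def amplitude_imdd(p_laser, a_c, a_m, responsivity=1.0):
    """Signal swing of the IMDD link (Eq. 6.9)."""
    return responsivity * a_c ** 3 * a_m * p_laser


A_C = loss_db_to_linear(3.0)  # coupler loss
A_M = loss_db_to_linear(5.0)  # modulator insertion loss
P_L = 1e-3                    # arbitrary laser power; it cancels in the ratio

# Sweep the splitting ratio and keep the value that maximizes the swing.
k_grid = [i / 100.0 for i in range(1, 100)]
best_k = max(k_grid, key=lambda k: amplitude_lf_bpsk(P_L, k, A_C, A_M))

gain = amplitude_lf_bpsk(P_L, best_k, A_C, A_M) / amplitude_imdd(P_L, A_C, A_M)
print(best_k, round(gain, 2), round(20.0 * math.log10(gain), 1))
```

With these exact loss values the sweep selects $k = 0.5$ and the ratio evaluates to roughly 7.1, close to the figure quoted above; the small difference depends on how the loss values are rounded.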

### 6.3 Modeling of Laser-forwarding Architecture

#### 6.3.1 Link performance analysis

So far we have assumed a noiseless channel and an ideal receiver. In this section, performance of the proposed architecture is analyzed considering photodiode shot noise, circuit thermal noise and sampler sensitivity.

We first take shot noise and thermal noise into consideration. It is well-known that
homodyne coherent detection can achieve higher photon sensitivity than non-coherent links. Here we extend the analysis to the proposed LF-BPSK link. The input-referred thermal noise is dominated by AFE circuits in Fig. 6.4. PSD for thermal noise is \( S_o(\omega) = N_{th} \), and one-side PSD for shot noise is \( S_s(\omega) = qRP \) with responsivity \( R \) and optical power \( P \). Both thermal and shot noise can be approximated as additive white Gaussian. Assuming an ideal maximum likelihood (ML) receiver, one can derive the BER for IMDD receiver and homodyne BPSK receiver accordingly [8].

Taking into account shot noise and thermal noise, BER of an IMDD receiver is derived as

\[
BER = Q \left( \frac{RP_S \sqrt{T_b}}{\sqrt{S_o(\omega)} + \sqrt{S_o(\omega) + qRP_S}} \right), \tag{6.11}
\]

where the signal power is \( P_S = \alpha_c^3 \alpha_m P_L \) and the bit period is \( T_b \).

Similarly, BER of a homodyne BPSK receiver is derived as

\[
BER = Q \left( \frac{2R \sqrt{P_S P_{LO} T_b}}{\sqrt{qR (P_{LO} + P_S) + S_o(\omega)}} \right), \tag{6.12}
\]

where \( P_S = \alpha_c^3 \alpha_m k P_L \) and \( P_{LO} = \alpha_c (1-k) P_L \) in the proposed LF-BPSK architecture.

Plugging in typical parameters in a high-speed optical link [14]: thermal current PSD \( N_{th} = 300 \text{ pA}^2/\text{Hz} \), responsivity \( R = 1.0 \text{ A/W} \) and bit rate \( 1/T_b = 50 \text{ Gbps} \), one can compare the best achievable BER for IMDD and LF-BPSK. 3 dB coupler loss and 5 dB modulator insertion loss are assumed. The relationship between BER and total laser power is shown in Fig. 6.5(a). To show the benefits of using BPSK and bypassing coupler losses separately, we consider a reference case where a BPSK receiver uses the same homodyne detection scheme as LF-BPSK while LO does not bypass any couplers. The reference case is simply labeled BPSK in the figures.

The figure shows that the proposed LF-BPSK architecture can achieve the same BER as IMDD with much lower power. To achieve BER of \( 10^{-15} \), LF-BPSK requires \( 7.3 \times \) less laser power than IMDD. In other words, the total laser wall-plug power can be reduced by 8.6 dB. BPSK modulation scheme contributes 3 dB and the architecture contributes 5.6 dB. As thermal noise is the dominating factor for both IMDD and LF-BPSK architectures assuming the typical link parameters, the total power saving benefit matches the direct estimation from Eq. 6.10.
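The noise-limited comparison can be reproduced approximately from Eqs. 6.11 and 6.12. The sketch below assumes the parameters listed above (300 pA²/Hz thermal noise, 1.0 A/W responsivity, 50 Gbps, 3 dB coupler loss, 5 dB insertion loss, $k = 0.5$) and finds, by bisection, the laser power each architecture needs for a BER of $10^{-15}$. It is an independent sketch, not the code used to generate Fig. 6.5.

```python
import math

Q_E = 1.602e-19           # electron charge, C
N_TH = 300e-24            # thermal current PSD, A^2/Hz (300 pA^2/Hz)
RESP = 1.0                # responsivity, A/W
T_B = 1.0 / 50e9          # bit period at 50 Gbps, s
A_C = 10.0 ** (-3.0 / 10.0)  # 3 dB coupler loss
A_M = 10.0 ** (-5.0 / 10.0)  # 5 dB modulator insertion loss


def q_func(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def ber_imdd(p_laser):
    """BER of the IMDD receiver (Eq. 6.11)."""
    p_s = A_C ** 3 * A_M * p_laser
    sigma_sum = math.sqrt(N_TH) + math.sqrt(N_TH + Q_E * RESP * p_s)
    return q_func(RESP * p_s * math.sqrt(T_B) / sigma_sum)


def ber_lf_bpsk(p_laser, k=0.5):
    """BER of the laser-forwarding BPSK receiver (Eq. 6.12)."""
    p_s = A_C ** 3 * A_M * k * p_laser
    p_lo = A_C * (1.0 - k) * p_laser
    noise = math.sqrt(Q_E * RESP * (p_lo + p_s) + N_TH)
    return q_func(2.0 * RESP * math.sqrt(p_s * p_lo * T_B) / noise)


def required_power(ber_fn, target_ber, lo=1e-6, hi=1.0):
    """Bisect (on a log scale) for the laser power that meets target_ber."""
    for _ in range(200):
        mid = math.sqrt(lo * hi)
        if ber_fn(mid) > target_ber:
            lo = mid
        else:
            hi = mid
    return hi


p_imdd = required_power(ber_imdd, 1e-15)
p_lf_bpsk = required_power(ber_lf_bpsk, 1e-15)
saving = p_imdd / p_lf_bpsk
print(f"IMDD: {p_imdd * 1e3:.2f} mW, LF-BPSK: {p_lf_bpsk * 1e3:.3f} mW")
print(f"saving: {saving:.2f}x ({10.0 * math.log10(saving):.1f} dB)")
```

Under these assumptions the saving comes out at roughly 7×, in line with the thermal-noise-limited estimate from Eq. 6.10 and close to the value reported for Fig. 6.5(a).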

However, the receiver sensitivity is often swing-limited rather than AFE noise-limited at very high data rate [20]. The data sampler in the receiver requires a minimum signal swing for its digital outputs to be regenerated reliably within one bit period through positive feedback. This means the input signal has to be larger than a required swing regardless of thermal noise of AFE circuits. To model the minimum swing requirement, one can allocate \( I_{th} \) in the entire swing for sampler data regeneration, as depicted in Fig. 6.6. When signal
Figure 6.5: (a) BER vs. laser output power at 50Gbps in the noise-limited regime. The proposed LF-BPSK architecture could reduce laser power by 7.3x compared with conventional IMDD link. (b) BER vs. laser output power in the swing-limited regime. The proposed LF-BPSK architecture could reduce laser power by 6.8x compared with IMDD. No link margin is considered for optical power estimation.

\( y \) is lower than threshold 0, it is considered bit 0. When \( y \) is higher than threshold 1, it is considered bit 1. If it falls in between these two thresholds, the bit value is undetermined.

Considering both swing and noise limitations, BER of the conventional IMDD now becomes:

\[
\text{BER} = Q \left( \frac{(R P_S - I_{th}) \sqrt{T_b}}{\sqrt{S_o(\omega) + q R P_S}} \right), \tag{6.13}
\]

where \( P_S = \alpha_c^3 \alpha_m P_L \). BER of the LF-BPSK architecture is recalculated as well:

\[
\text{BER} = Q \left( \frac{(2R \sqrt{P_S P_{LO}} - \frac{I_{th}}{2}) \sqrt{T_b}}{\sqrt{q R (P_{LO} + P_S) + S_o(\omega)}} \right), \tag{6.14}
\]

where \( P_S = \alpha_c^3 \alpha_m k P_L \) and \( P_{LO} = \alpha_c (1 - k) P_L \).

Figure 6.6: (a) Probability density function (PDF) of the received signal \( y \) in an IMDD receiver, conditioned on the transmitted bit, ZERO or ONE. (b) PDF of \( y \) in a BPSK link. \( I_{th} \) represents the minimum input swing requirement imposed by the data sampler.
Assuming $I_{th} = 200 \mu A$ at 50 Gbps, the relationship between BER and laser power for different architectures is shown in Fig.6.5(b). Much higher laser power is required as the receiver now enters swing-limited regime. The new architecture could save 6.8x laser power compared to IMDD. The slight decrease in power saving benefit is caused by stronger shot noise in BPSK due to higher laser power in swing-limited regime.

The analysis above assumed a 200 $\mu A$ sampler-required swing and 3 dB coupler loss. These parameters could vary for different photonic platforms, so it is critical to understand how the benefit of the proposed architecture varies with them. The laser power required to achieve a BER of $10^{-12}$ is calculated for these architectures under different scenarios, as shown in Fig. 6.7. The benefit of LF-BPSK increases as the coupler loss increases and stays constant once the receiver enters the swing-limited regime.
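The swing-limited case can be sketched the same way from Eqs. 6.13 and 6.14. The code below assumes $I_{th} = 200\,\mu A$, a target BER of $10^{-12}$, and the same link parameters as before, with the equations evaluated exactly as written above. It is an illustrative sketch and is not expected to match Fig. 6.5(b) or Fig. 6.7 digit for digit.

```python
import math

Q_E = 1.602e-19           # electron charge, C
N_TH = 300e-24            # thermal current PSD, A^2/Hz
RESP = 1.0                # responsivity, A/W
T_B = 1.0 / 50e9          # bit period at 50 Gbps, s
A_M = 10.0 ** (-5.0 / 10.0)  # 5 dB modulator insertion loss
I_TH = 200e-6             # sampler-required swing, A


def q_func(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def ber_imdd(p_laser, a_c):
    """Swing-limited IMDD BER (Eq. 6.13)."""
    p_s = a_c ** 3 * A_M * p_laser
    noise = math.sqrt(N_TH + Q_E * RESP * p_s)
    return q_func((RESP * p_s - I_TH) * math.sqrt(T_B) / noise)


def ber_lf_bpsk(p_laser, a_c, k=0.5):
    """Swing-limited LF-BPSK BER (Eq. 6.14)."""
    p_s = a_c ** 3 * A_M * k * p_laser
    p_lo = a_c * (1.0 - k) * p_laser
    noise = math.sqrt(Q_E * RESP * (p_lo + p_s) + N_TH)
    swing = 2.0 * RESP * math.sqrt(p_s * p_lo) - I_TH / 2.0
    return q_func(swing * math.sqrt(T_B) / noise)


def required_power(ber_fn, a_c, target_ber=1e-12, lo=1e-6, hi=1.0):
    """Bisect (on a log scale) for the laser power that meets target_ber."""
    for _ in range(200):
        mid = math.sqrt(lo * hi)
        if ber_fn(mid, a_c) > target_ber:
            lo = mid
        else:
            hi = mid
    return hi


def saving(coupler_loss_db):
    """Laser power ratio IMDD / LF-BPSK for a given coupler loss."""
    a_c = 10.0 ** (-coupler_loss_db / 10.0)
    return required_power(ber_imdd, a_c) / required_power(ber_lf_bpsk, a_c)


saving_3db = saving(3.0)
for loss_db in (1.0, 2.0, 3.0, 4.0):
    print(f"{loss_db:.0f} dB coupler loss: {saving(loss_db):.2f}x")
```

With 3 dB couplers this sketch gives a saving of roughly 6×, in the same range as the value reported for Fig. 6.5(b), and the saving grows with coupler loss as described above.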

Figure 6.7: (a) Required laser output power versus coupler loss for different link architectures, to achieve $10^{-12}$ BER at 50 Gbps. (b) Required laser output power versus sampler-limited swing for different link architectures, to achieve $10^{-12}$ BER at 50 Gbps.

#### 6.3.2 Laser phase noise limitations

In practice, phase noise of the laser source also degrades the performance of the coherent link. This is especially critical in the laser-forwarding architecture, as the laser is free running. A laser with linewidth $\Delta \nu = 1.0$ MHz has a coherence length $L = c/\pi \Delta \nu = 96$ m. Intuitively, the maximum distance of LF-BPSK using this laser will be much shorter than 96 m in order to maintain good coherence between the LO and the signal. In this section, we analyze the impact of the laser phase noise on link performance and estimate the feasible range of the proposed link architecture.
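As a quick check of the coherence-length estimate, the sketch below evaluates $L = c/\pi\Delta\nu$ using the vacuum speed of light, which is an assumption made here for illustration.

```python
import math

C_VACUUM = 2.998e8  # speed of light in vacuum, m/s (assumed for this estimate)


def coherence_length_m(linewidth_hz):
    """Coherence length L = c / (pi * linewidth) for a Lorentzian line."""
    return C_VACUUM / (math.pi * linewidth_hz)


length_1mhz = coherence_length_m(1.0e6)
length_100khz = coherence_length_m(100e3)
print(f"{length_1mhz:.1f} m at 1 MHz, {length_100khz:.0f} m at 100 kHz")
```

The 1 MHz case comes out at about 95 m, consistent with the roughly 96 m quoted above, and the coherence length scales inversely with linewidth.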

Taking phase noise into consideration, one can rewrite Eq.6.6 as

$$i(t) = 2aR\sqrt{P_s P_{LO}} \cos(\phi_n(t_p)) + n_s(t), \tag{6.15}$$

where $t_p = \Delta L/c'$ is the difference in propagation time between the signal path and the LO path, $\Delta L$ is the length mismatch between the signal fiber and the laser-forwarding fiber, and $c'$ is the speed of light inside the fiber \((2.1 \times 10^8 \text{ m/s})\). Laser phase noise \(\phi_n(t)\) is a Wiener process such that
\[
\phi_n(t) = \int_0^t \phi'(\tau) \, d\tau, \tag{6.16}
\]
where its time derivative \(\phi'(t)\) is a zero-mean white Gaussian process with PSD \(S_{\phi'}(\omega) = 2\pi\Delta\nu\). As a result, \(\phi_n(t_p)\) also has a Gaussian distribution, centered at zero [8]:
\[
\phi_n(t_p) \sim N(0, 2\pi\Delta\nu t_p). \tag{6.17}
\]
Assuming \(\phi_n(t_p) \ll \pi/2\), one can take the second-order Taylor approximation of the cosine in Eq. 6.15:
\[
i(t) \approx 2aR\sqrt{P_sP_{\text{LO}}}\left(1 - \frac{\phi_n^2(t_p)}{2}\right) + n_0(t). \tag{6.18}
\]
Because the phase noise has a Gaussian distribution, its normalized square obeys a \(\chi^2\) distribution with one degree of freedom:
\[
\frac{\phi_n^2(t_p)}{2\pi\Delta\nu t_p} \sim \chi_1^2. \tag{6.19}
\]
Finally, writing \(n_p = \phi_n^2(t_p)/2\), the receiver current can be derived as
\[
i(t) \approx 2aR\sqrt{P_sP_{\text{LO}}}(1 - n_p) + n_0(t), \tag{6.20}
\]
\[
\frac{n_p}{\sigma_p^2} \sim \chi_1^2 \quad \text{where} \quad \sigma_p^2 = \pi\Delta\nu t_p. \tag{6.21}
\]

One needs to calculate the superposition of the Gaussian noise and the \(\chi^2\) noise to get the BER of a BPSK link, as illustrated in Fig. 6.8. The convolution of these noise sources gives the final noise distribution, which has a non-intuitive form. Instead, one can estimate an upper bound of the final BER through approximations.

The impact of \(\chi^2\) noise on BER can be considered as requiring extra margins on the received signal. For the target BER of \(10^{-12}\), assume half the bit errors are caused solely by chi-squared noise. According to the cumulative distribution function (cdf) of \(\chi^2\) distribution, the probability of a bit error drops below \(e_t = 5 \times 10^{-13}\) when the signal is above \(38.7\sigma_1^2\). Denote the extra margin needed for phase noise as \(m_p\). Let the margin be
\[
m_p = 38.7\sigma_1^2 = 52.2\pi\Delta\nu t_p = 164\Delta\nu\Delta L/c'. \tag{6.22}
\]
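The numerical coefficients in Eq. 6.22 can be checked from the tail of the $\chi_1^2$ distribution. The sketch below uses the identity $P(\chi_1^2 > x) = \text{erfc}(\sqrt{x/2})$, finds the threshold for a tail probability of $5 \times 10^{-13}$ by bisection, and evaluates the resulting margin for a given linewidth-mismatch product. Function names are illustrative, and $c' = 2.1 \times 10^8$ m/s is taken from the text.

```python
import math

C_FIBER = 2.1e8  # speed of light inside the fiber, m/s (value used in the text)


def chi2_1_tail(x):
    """Tail probability P(X > x) for a chi-squared variable with 1 d.o.f."""
    return math.erfc(math.sqrt(x / 2.0))


def chi2_1_threshold(tail_probability):
    """Smallest x with P(X > x) <= tail_probability, found by bisection."""
    lo, hi = 0.0, 200.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if chi2_1_tail(mid) > tail_probability:
            lo = mid
        else:
            hi = mid
    return hi


def phase_noise_margin(linewidth_times_mismatch):
    """Margin m_p = threshold * pi * (linewidth * mismatch) / c'."""
    threshold = chi2_1_threshold(5e-13)
    return threshold * math.pi * linewidth_times_mismatch / C_FIBER


threshold = chi2_1_threshold(5e-13)
margin = phase_noise_margin(5e5)  # e.g. 1 MHz linewidth and 0.5 m mismatch
print(round(threshold, 1), round(threshold * math.pi), round(margin, 3))
```

The threshold evaluates to about 52.2 and the combined coefficient to about 164, matching Eq. 6.22. For a linewidth-mismatch product of $5 \times 10^5$ Hz·m the margin is about 0.39, that is, roughly 39% of the signal swing is reserved for phase noise.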

Due to symmetry in BPSK, one only needs to consider the noise distribution on either the negative or the positive axis. For simplicity, one can assume \(\chi^2\) noise equals \(m_p\) with a probability of \(1 - e_t\) and use \(e_t\) as a baseline BER. This is equivalent to reducing the signal swing by \(m_p\) and assuming an error whenever the chi-squared noise exceeds \(m_p\). Hence, an upper bound of BER
Figure 6.8: Noise distribution with laser phase noise effects. The blue margin is reserved for sampler-limited swing $I_{th}$. The red margins on the two sides are chi-squared noise caused by laser phase noise. The sum of thermal noise and shot noise obeys a Gaussian distribution with variance $\sigma_0^2$. The peaks of the conditioned PDFs are moved closer due to the added margin $m_p$ for phase noise effects.

can be given as

$$BER = Q \left( \frac{(2R\sqrt{P_S P_{LO}}(1 - m_p) - \frac{I_{th}}{2})\sqrt{T_b}}{\sqrt{qR(P_{LO} + P_S) + S_o(\omega)}} \right) (1 - e_t) + e_t. \tag{6.23}$$

The estimated upper bound of the BER for a 50 Gbps LF-BPSK link is shown in Fig. 6.9 with typical device parameters. As expected, the benefit of the laser-forwarding coherent architecture diminishes as the laser linewidth and the length mismatch between the signal and LO fibers increase. When $\Delta\nu\Delta L = 5 \times 10^5$ Hz·m, there is still at least a $4.3 \times$ reduction in laser power for LF-BPSK. In this case, the length mismatch between the signal and LO paths should be less than 0.5m when the laser linewidth is 1 MHz, which is within the typical range of DFB laser modules. If the laser linewidth is reduced to 100 kHz, the maximum length mismatch is extended to 5m. This implies that the laser-forwarding coherent link architecture with on-chip splitting (Fig. 6.2) is more suitable for short-reach on-board optical communication with low-cost DFB lasers. However, the laser-forwarding coherent link with off-chip splitting (Fig. 6.3) can go much farther as long as the length mismatch between the signal and LO paths is small, which is easy to achieve when the two fibers are in a single fiber bundle. The second architecture therefore has the potential for much longer intra-rack and inter-rack interconnects.

#### 6.3.3 Microring modulator for the laser-forwarding BPSK architecture

An ideal BPSK transmitter is assumed in the analysis above, which modulates the optical phase between 0 and $\pi$ with infinite bandwidth and maintains constant optical intensity during modulation. On silicon photonics platforms, BPSK modulator can be realized by modulating a section of p-n diode based on plasma dispersion effect similar to the arm in
Figure 6.9: Estimated upper bound of BER for a 50 Gbps LF-BPSK link, considering laser phase noise, shot noise, thermal noise and sampler-limited swing requirement.

a Mach-Zehnder modulator. However, to obtain a full $\pi$ phase shift, a high voltage swing and a long device structure are needed. This is problematic for energy-efficient short-reach communication, as the insertion loss increases and the modulator driver power becomes too high. A promising solution is to realize BPSK modulation with a silicon microring modulator, an ultra energy-efficient resonator device with a very compact footprint, as shown in Fig. 6.10.

Although most research on microring modulators targets non-coherent optical communication, there have been a few works on designing coherent optical links based on microring modulators [56–59]. However, little work has been done to study the phase-switching dynamics of microring modulators in a BPSK scheme. In this section, we explore the feasibility of using realistic microring modulators for the proposed coherent LF-BPSK architecture through analysis and show the transient dynamics of microring-based phase modulators through simulations.

Figure 6.10: From left to right are an SEM image of a microring modulator fabricated in zero-change 45nm CMOS process [6], model diagram of the microring, and the measured and modeled transmission spectra.

A system-level optical simulation toolbox [41] is built in Simulink based on our previous Verilog-A framework [45]. To extract key parameters for simulation, an analytic model of microring modulator is fitted to measurement data [43]. The key device parameters in the
model include ring radius $R_0$, round-trip loss $\alpha$, effective index of refraction $n_e$, input port coupler transmission coefficient $t$ and coupling coefficient $\kappa$, where $|t|^2 + |\kappa|^2 = 1$. Some parameters are labeled in Fig.6.10. Transmission matrix for the input coupler is

$$
\begin{bmatrix}
  E_{t1} \\
  E_{t2}
\end{bmatrix} =
\begin{bmatrix}
  t & \kappa \\
  -\kappa^* & t^*
\end{bmatrix}
\begin{bmatrix}
  E_{i1} \\
  E_{i2}
\end{bmatrix}. \tag{6.24}
$$

For simplicity, $E_{i1}$ is set to 1. The round-trip transfer function inside the ring resonator is

$$
E_{i2} = \alpha e^{j\theta} E_{t2}, \tag{6.25}
$$

where the round-trip phase shift, with ring circumference $L = 2\pi R_0$, can be calculated as

$$
\theta(\lambda) = \frac{\omega n_e(\lambda) L}{c} = 4\pi^2 n_e(\lambda) \frac{R_0}{\lambda}. \tag{6.26}
$$

Thus the electric field at the through port of the ring can be derived as

$$
E_{t1} = \frac{-\alpha + te^{-j\theta(\lambda)}}{-\alpha t^* + e^{-j\theta(\lambda)}}. \tag{6.27}
$$

The dispersion effect in silicon is modeled as a relationship between effective index $n_e$ and group index $n_g$:

$$
n_e(\lambda) = n_e(\lambda_0) - \frac{\lambda - \lambda_0}{\lambda_0} (n_g - n_e(\lambda_0)), \tag{6.28}
$$

where $\lambda_0$ is the resonance wavelength. Group index is calculated from the measured free spectral range (FSR) of the ring, and effective index is fitted from resonance wavelength. The FSR of a ring is

$$
\text{FSR} = \frac{\lambda^2}{2\pi R_0 n_g}.
$$

(6.29)

Using these equations, one can model the static transmission spectrum of a microring modulator. As shown in Fig. 6.10, the analytical model is fitted to measurement data taken on a realistic high-speed ring modulator in a monolithic zero-change 45nm CMOS process [6,60]. The radius of the ring is $R_0 = 5.0\,\mu\text{m}$ with FSR = 18.0 nm. After fitting, the parameters are $n_g = 2.971$, $n_e = 1.9392$ at the resonance point $\lambda_0 = 1296.25$ nm, $t = 0.9788$, and $\alpha = 0.9860$. The voltage dependence of the effective index is also measured: $dn_e/dV = 4.2 \times 10^{-5}\,\text{V}^{-1}$, which corresponds to a resonance shift of approximately 5 GHz/V.
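
The static model above can be evaluated directly from the fitted parameters. The following is a minimal sketch in plain Python, not the Simulink toolbox of [41]: it assumes a real-valued $t$ (so $t^* = t$), treats FSR as a positive magnitude, and uses a hypothetical helper `find_resonance` that simply grid-searches for the transmission minimum near $\lambda_0$.

```python
import cmath
import math

# Fitted parameters quoted in the text for the 45nm CMOS microring.
R0 = 5.0e-6           # ring radius [m]
LAMBDA0 = 1296.25e-9  # resonance wavelength used as the fitting point [m]
N_G = 2.971           # group index
N_E0 = 1.9392         # effective index at LAMBDA0
T = 0.9788            # coupler transmission coefficient (assumed real)
ALPHA = 0.9860        # round-trip loss factor


def effective_index(lam):
    """Linear dispersion model of Eq. (6.28)."""
    return N_E0 - (lam - LAMBDA0) / LAMBDA0 * (N_G - N_E0)


def round_trip_phase(lam):
    """Round-trip phase of Eq. (6.26)."""
    return 4.0 * math.pi ** 2 * effective_index(lam) * R0 / lam


def through_field(lam):
    """Through-port field of Eq. (6.27) for unit input field."""
    e = cmath.exp(-1j * round_trip_phase(lam))
    return (-ALPHA + T * e) / (-ALPHA * T + e)


def fsr(lam):
    """Magnitude of the free spectral range, Eq. (6.29)."""
    return lam ** 2 / (2.0 * math.pi * R0 * N_G)


def find_resonance(span=0.5e-9, step=1e-12):
    """Grid-search the power-transmission minimum near LAMBDA0 (illustrative helper)."""
    n = int(round(span / step))
    grid = [LAMBDA0 + k * step for k in range(-n, n + 1)]
    lam_min = min(grid, key=lambda lam: abs(through_field(lam)) ** 2)
    return lam_min, abs(through_field(lam_min)) ** 2


if __name__ == "__main__":
    lam_res, t_min = find_resonance()
    print(f"FSR = {fsr(LAMBDA0) * 1e9:.2f} nm")
    print(f"resonance at {lam_res * 1e9:.3f} nm, minimum power transmission {t_min:.4f}")
```

Because the quoted parameters are rounded, the minimum found by this sketch may sit a few tens of picometres away from the quoted $\lambda_0$; the shape of the spectrum is what the sketch is meant to illustrate.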

Based on the fitted model, we obtain the static transmission spectra and phase response of the microring at different voltage biases, as shown in Fig. 6.11. A reverse bias of -4 V is applied across the p-n junction to create sufficient resonance shift for modulation. The nominal laser wavelengths for IMDD and BPSK are different, as labeled in the figure. Fig. 6.11 also contains a phasor diagram of the ring transmission curve with different trajectories for the BPSK and IMDD modulators. In practice, the relative distance between the microring resonance and the laser wavelength can be tracked and stabilized by a thermal tuner in a feedback loop. Robust and energy-efficient thermal tuning for microring modulators has been demonstrated recently [31,60].

Figure 6.11: Left is the transmission spectra and phase response of a microring modulator, where the two dashed lines represent nominal laser wavelength for the two modulation schemes. Right is the phasor diagram of a microring modulator marked with modulation trajectories.

The system diagram of the microring-based laser-forwarding coherent link is shown in Fig. 6.12. Denote the transfer function of the modulator as $a_m$, whose amplitude and phase are $a_{m0,1}$ and $\phi_{m0,1}$ for bit 0 and bit 1, respectively. The optical power at different locations of the laser-forwarding link is labeled in the figure. One can evaluate the system performance with the static model developed above.

In a conventional IMDD link using microring modulator, the received signal $A_{DD}$ is

$$
A_{DD} = R(P_{S1} - P_{S0}) = \alpha_c^3(a_{m1} - a_{m0})RP_L.
$$

(6.30)

In the proposed laser-forwarding coherent link, the received signal is maximized when the laser power is split evenly between the signal and the LO. The received signal in coherent detection can be calculated as

$$
\begin{aligned}
A_{CD} &= 2R \sqrt{P_{S1}P_{LO}} \cos(\phi_{m1} - \phi_{OS}) - 2R \sqrt{P_{S0}P_{LO}} \cos(\phi_{m0} - \phi_{OS}) \\
&= \alpha_c^2 \left(\sqrt{a_{m1}} \cos(\phi_{m1} - \phi_{OS}) - \sqrt{a_{m0}} \cos(\phi_{m0} - \phi_{OS})\right) RP_L.
\end{aligned}
$$

(6.31)

For simplicity, let $RP_L = 1$ for both cases. Coupler loss is again assumed to be 3 dB. The transfer characteristics of the microring modulator vary with the laser wavelength, and the received signal also depends on the phase offset between the signal carrier and the LO. The optimal phase offset and laser wavelength that maximize the received signal can be found from Fig. 6.13. In a real system, the laser wavelength is fixed while the resonance frequency of the ring modulator is tunable through an integrated heater. For the purpose of optimization, tuning the ring modulator with respect to the laser wavelength is equivalent to sweeping the laser wavelength with a fixed ring resonance frequency. For each locking condition, the phase offset $\phi_{OS}$ can be further tuned to optimize the output signal. The optimal combination of phase offset and laser wavelength is in fact the condition for BPSK, although the actual peak-to-peak phase change could be smaller than $\pi$, as in Fig. 6.11. Under this condition, with the optimal laser wavelength, the modulation trajectory is symmetric about the x-axis in the phasor diagram, and with the optimal phase offset the symbols are projected onto the y-axis to obtain the maximum received signal.

The final signal amplitudes for the conventional IMDD link and the proposed LF-BPSK link are compared in Fig. 6.13. Despite the phase-amplitude correlation of the microring modulator, the proposed LF-BPSK link can still achieve around 4× signal gain or 4× laser power reduction. The impact of noise on link performance is similar to the ideal LF-BPSK case above and is not repeated here. Even with noise taken into consideration, the signal gain is large enough to make the microring a promising candidate for the proposed architecture, given its compact footprint and superior energy efficiency.
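
The static comparison can be sketched numerically from Equations 6.30 and 6.31 and the fitted ring model. This is a hedged illustration rather than the analysis behind Fig. 6.13: it assumes the drive simply adds an index shift of $dn_e/dV$ times a 4 V swing, takes $t$ as real, sets $RP_L = 1$, and replaces the sweep over $\phi_{OS}$ with the identity that the maximum of $\sqrt{a_{m1}}\cos(\phi_{m1}-\phi_{OS}) - \sqrt{a_{m0}}\cos(\phi_{m0}-\phi_{OS})$ over $\phi_{OS}$ equals the magnitude of the difference between the two complex through-port fields. The function names are illustrative.

```python
import cmath
import math

R0, LAMBDA0 = 5.0e-6, 1296.25e-9   # ring radius and fitting wavelength [m]
N_G, N_E0 = 2.971, 1.9392          # group index, effective index at LAMBDA0
T, ALPHA = 0.9788, 0.9860          # coupler transmission (assumed real), round-trip factor
DNE_DV = 4.2e-5                    # measured index shift per volt [1/V]


def through_field(lam, delta_n=0.0):
    """Eqs. (6.26)-(6.28), with an extra index shift delta_n from the drive voltage."""
    n_e = N_E0 - (lam - LAMBDA0) / LAMBDA0 * (N_G - N_E0) + delta_n
    e = cmath.exp(-1j * 4.0 * math.pi ** 2 * n_e * R0 / lam)
    return (-ALPHA + T * e) / (-ALPHA * T + e)


def static_signals(lam, swing_v=4.0, alpha_c=0.5):
    """Return (A_DD, A_CD) at one laser wavelength with R*P_L = 1."""
    e0 = through_field(lam)
    e1 = through_field(lam, DNE_DV * swing_v)
    # Eq. (6.30): difference of the two power transmissions through three couplers.
    a_dd = alpha_c ** 3 * abs(abs(e1) ** 2 - abs(e0) ** 2)
    # Eq. (6.31) at the best phase offset: |E1 - E0| scaled by alpha_c squared.
    a_cd = alpha_c ** 2 * abs(e1 - e0)
    return a_dd, a_cd


def best_signals(span=0.5e-9, step=1e-12, **kwargs):
    """Sweep the laser wavelength and keep the best operating point of each scheme."""
    n = int(round(span / step))
    results = [static_signals(LAMBDA0 + k * step, **kwargs) for k in range(-n, n + 1)]
    return max(r[0] for r in results), max(r[1] for r in results)


if __name__ == "__main__":
    a_dd, a_cd = best_signals()
    print(f"IMDD signal    = {a_dd:.4f}")
    print(f"LF-BPSK signal = {a_cd:.4f}")
    print(f"gain           = {a_cd / a_dd:.2f}x")
```

Each scheme is optimized at its own laser wavelength, mirroring the separate nominal wavelengths marked in Fig. 6.11. With these rounded parameters the sketch should land near the roughly 4× gain quoted above, though the exact figure depends on how the voltage-induced shift is modeled.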

So far we have only considered the static characteristics of microrings. However, it is critical to consider the dynamic behavior of the microring for high-speed optical links, as the bandwidth of the microring can be limited by the Q factor of the cavity [34, 44]. Transient simulation of the microring-based coherent link is carried out in Simulink based on the same principles as our previous Verilog-A simulation framework [45]. Simulink schematics for the conventional IMDD link and the proposed LF-BPSK link are shown in Fig. 6.14. A PRBS signal is used as the drive signal to generate the eye diagram at the receiver.

Transient waveforms for a microring modulator in BPSK mode are shown in Fig. 6.15. As expected, the amplitude of the modulated signal dips at each transition between bit 1 and bit 0 due to the Lorentzian transmission curve. Note that the data rate is currently limited to around 40 Gbps by the optical bandwidth of the modeled microring modulator.
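
The optical bandwidth limit mentioned above can be estimated from the fitted static parameters. The sketch below uses the textbook linewidth expression for an all-pass ring, which is an assumption brought in here and not a formula from this chapter; the function names are illustrative.

```python
import math

C_LIGHT = 299_792_458.0  # speed of light in vacuum [m/s]


def loaded_q(lam0, radius, n_g, t, alpha):
    """Loaded Q of an all-pass ring from the standard linewidth expression.

    FWHM = (1 - alpha*t) * lam0**2 / (pi * n_g * L * sqrt(alpha*t)),  Q = lam0 / FWHM,
    where L = 2*pi*radius is the ring circumference.
    """
    length = 2.0 * math.pi * radius
    fwhm = (1.0 - alpha * t) * lam0 ** 2 / (math.pi * n_g * length * math.sqrt(alpha * t))
    return lam0 / fwhm


def optical_bandwidth(lam0, q):
    """Cavity linewidth in hertz, f0 / Q."""
    return C_LIGHT / lam0 / q


if __name__ == "__main__":
    q = loaded_q(1296.25e-9, 5.0e-6, 2.971, 0.9788, 0.9860)
    print(f"loaded Q          ~ {q:.0f}")
    print(f"optical bandwidth ~ {optical_bandwidth(1296.25e-9, q) / 1e9:.1f} GHz")
```

With the fitted values this estimate comes out in the mid-30 GHz range, which is in line with the data-rate limit of around 40 Gbps noted for the modeled ring.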

Figure 6.13: Left is the contour of the received signal in LF-BPSK vs. phase offset $\phi_{OS}$ and laser wavelength. The optimal point is marked with the black dot and the optimal phase offset with the dashed line. Right is the received signal for IMDD and LF-BPSK vs. laser wavelength, assuming an optimal $\phi_{OS}$.

Figure 6.14: Simulink schematics for optical link simulation. Top schematic is for microring modulator based IMDD link. Bottom schematic is for microring modulator based laser forwarding coherent link. The simulation framework supports all basic optical devices such as laser, modulator, photodetector, coupler and splitter.

Figure 6.15: Transient simulation results for microring modulator in BPSK mode. From the top to bottom are the waveforms of drive voltage, amplitude of the modulated signal and phase of the modulated signal.

Figure 6.16: Simulated eye diagram for IMDD link and laser forwarding BPSK link. Note the laser power is set the same and the scale for signal amplitude is the same.

The final receiver eye diagrams of the IMDD link and the laser-forwarding BPSK link are compared in Fig. 6.16. It is clear that with the same laser power, the settled eye height in the BPSK case is 4× that of the IMDD eye. This is consistent with the static modeling results. The benefit of the new architecture comes from the phase modulation scheme itself and, more importantly, from the proposed laser-forwarding architecture. Although BPSK has slower rising and falling edges than IMDD, the signal gain from the coherent architecture is still better than simply trading off bandwidth for eye opening in a non-coherent ring modulator. To achieve a higher data rate, the quality factor of the ring should be further lowered by stronger coupling.

Although the analysis above has focused on single-wavelength optical links, the proposed microring-based architecture can be generalized for Wavelength-Division Multiplexing (WDM) links. One possible configuration for a laser-forwarding coherent WDM architecture is shown in Fig.6.17. In this case, each coherent receiver uses two identical microring filters that are thermally locked to the same wavelength for LO and signal channel selection. The drop ports of these two microring filters are connected to a standard balanced photodetector where homodyne detection takes place. The benefit of the laser forwarding architecture would still exist compared with conventional non-coherent microring-based WDM links. This provides a potential way to scale up the total data bandwidth per fiber while using the proposed link architecture.

![Figure 6.17: Proposed microring-based WDM coherent link architecture with laser forwarding configuration. Microring-based modulators and filters are used for energy-efficient modulation and intrinsic wavelength selectivity. In this example, on-chip optical power splitting between LO and signal is adopted. Off-chip optical power splitting can also be used.](image)

### 6.4 Demonstration of Silicon Photonics Coherent Link

The laser-forwarding coherent link architecture is demonstrated with a microring-based silicon photonics transmitter. Fig. 6.18 shows the measurement setup for the laser-forwarding coherent link. A 50/50 fiber coupler splits the optical power between the transmitter chip and the receiver chip (LO signal). A polarization controller (PC) is used before the transmitter chip to optimize fiber coupling through the vertical grating couplers (VGC). The measured insertion losses of the PC and VGCs are 1.7dB and 4.0dB, respectively. The LO signal and modulated signal are mixed through another 50/50 fiber coupler and converted to electrical signals by an external AC-coupled balanced receiver (Thorlabs PDB480C). The high-speed transmitter chip is designed and fabricated in a 45nm SOI process with the same monolithic silicon microring modulator as in [32], which has a Q factor of 7500 and bandwidth higher than 20GHz. The transmitter circuitry along with the photonics can support up to 20Gb/s NRZ and 40Gb/s PAM4 modulation as demonstrated in [32]. The length of the LO and transmission fibers is approximately two meters. There is no special control over mechanical vibration of the fibers besides taping them down onto the optical table. A non-coherent NRZ link setup is built for performance comparison using the same transmitter chip, the same receiver module and the same channel losses. Although the bandwidth of the discrete balanced receiver (1.6 GHz) limits the maximum speed of the coherent link, the key concepts of the laser-forwarding coherent detection scheme as well as the coherent modulation of microrings can be proven experimentally.

Figure 6.18: Measurement setup of NRZ link and laser-forwarding coherent link

### 6.4.1 Microring-based laser-forwarding BPSK

On the transmitter chip, a PRBS31 generator feeds digital data into an 8-to-1 serializer, which then drives the anode of the microring modulator through a simple inverter chain. The anode of the modulator sees a voltage swing of 1.2V (0 to 1.2V), while the cathode is connected to a constant bias voltage $V_b$. On the receiver side, the output signal of the balanced photodetector is sampled and stored by a 40GS/s real-time oscilloscope. Eye diagrams are generated in Matlab from the captured real-time waveforms. The measurements were taken under two different bias conditions (A: $V_b = 0.5V$ and B: $V_b = 1.0V$). Bias condition A increases the resonance shift of the microring significantly by driving the p-n junction further into forward bias, at the cost of slower modulation. Bias condition B is the typical bias condition for high-speed operation. The eye diagrams of the conventional NRZ link and the laser-forwarding BPSK (LF-BPSK) link are shown in Fig. 6.19 for bias condition A and in Fig. 6.20 for bias condition B. The laser wavelength is fine-tuned manually to maximize the total OMA in each case. In all cases, the channel loss remains the same, and the captured waveform duration is 25$\mu$s, which corresponds to 1 million sampling points.

![Figure 6.19: Comparison between the eye diagrams of NRZ link and LF-BPSK link. Common conditions: bias condition $V_b = 0.5V$, measurement duration = 25$\mu$s, samples = 1Mpts. (a) NRZ modulation, laser output power $P_L = 4$dBm, (b) LF-BPSK, laser output power $P_L = 0$dBm.](image)

In Fig. 6.19 (bias condition A), the height of the NRZ eye diagram is 0.2V with laser output power $P_L = 4$dBm, while the height of the BPSK eye diagram is 0.6V with a smaller laser output power $P_L = 0$dBm. These results show that LF-BPSK achieves 3x larger OMA with 2.5x less laser power. In other words, the microring-based LF-BPSK link achieves 7.5x total OMA gain or 7.5x total laser power reduction compared to conventional microring-based NRZ links. In Fig. 6.20, with the typical high-speed bias condition (B), the height of the NRZ eye diagram is 0.15V with laser output power $P_L = 4$dBm, while the height of the BPSK eye diagram is 0.37V with a smaller laser output power $P_L = 0$dBm. These results show that the microring-based LF-BPSK link achieves 6.2x total OMA gain or 6.2x total laser power reduction.

The bandwidth of the balanced receiver in Fig. 6.18 limits the highest data rate that can be demonstrated with this setup. To verify the scheme at higher data rates, we replace the balanced receiver with two standalone receivers (RX0 and RX1) with higher bandwidth (Thorlabs PDA8GS, 9.5 GHz), as shown in Fig. 6.21. The output signals of the two receivers are recorded by the real-time scope, and the measured waveforms are shown in Fig. 6.22. By subtracting the output of RX1 from that of RX0, we obtain the same result as an actual balanced receiver.
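
The equivalence between subtracting two standalone receiver outputs and using a balanced receiver can be illustrated with an idealized model. This sketch assumes a lossless 50/50 combiner written in the 180-degree-hybrid convention, identical receiver responsivities, and no noise; a real fiber coupler adds a fixed phase that is absorbed into the phase offset. All function names are illustrative.

```python
import cmath
import math


def hybrid_outputs(e_sig, e_lo):
    """Ideal lossless 50/50 combiner (180-degree hybrid convention, assumed)."""
    s = 1.0 / math.sqrt(2.0)
    return s * (e_sig + e_lo), s * (e_sig - e_lo)


def receiver_outputs(p_sig, p_lo, phase, responsivity=1.0):
    """Photocurrents of the two standalone receivers RX0 and RX1."""
    e_sig = math.sqrt(p_sig) * cmath.exp(1j * phase)  # signal field with relative phase
    e_lo = math.sqrt(p_lo)                            # LO field taken as the phase reference
    out0, out1 = hybrid_outputs(e_sig, e_lo)
    return responsivity * abs(out0) ** 2, responsivity * abs(out1) ** 2


def balanced_signal(p_sig, p_lo, phase, responsivity=1.0):
    """RX0 - RX1: the common-mode power cancels, leaving 2R*sqrt(Ps*Plo)*cos(phase)."""
    i0, i1 = receiver_outputs(p_sig, p_lo, phase, responsivity)
    return i0 - i1


if __name__ == "__main__":
    for phase in (0.0, math.pi / 2, math.pi):
        print(f"phase = {phase:.2f} rad, RX0 - RX1 = {balanced_signal(1e-4, 1e-3, phase):+.3e}")
```

The difference term has the same form as each term of Equation 6.31, while the sum of the two outputs carries only the total power and no phase information.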

With the modified setup in Fig. 6.21, we are able to operate the microring modulator at 10Gb/s for the LF-BPSK architecture with $V_b = 1.0V$. We also measured 10Gb/s NRZ modulation with the same 9.5GHz receiver using the NRZ setup in Fig. 6.18. The eye diagrams of the 10Gb/s NRZ and BPSK modulation are shown in Fig. 6.23 with a measurement duration of 10µs. The height of the NRZ eye diagram is 0.016V with laser output power $P_L = 10$dBm, while the height of the LF-BPSK eye diagram is 0.050V with a smaller laser output power $P_L = 7$dBm. These results show that at 10Gb/s the microring-based LF-BPSK link achieves 6.3x total OMA gain or 6.3x total laser power reduction compared to conventional microring-based NRZ links. This gain is consistent with the 1Gb/s results in Fig. 6.20 since the bias condition for the microring modulator is the same in both cases. Although the data rate is still limited by the available optical receivers, this measurement demonstrates that microring-based LF-BPSK has the potential to be applied at even higher data rates.
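
The total gain figures quoted for the three measurements combine an eye-height ratio with a laser-power ratio. A minimal sketch of that arithmetic follows, assuming the eye height scales linearly with laser power so the two ratios can be multiplied; the function name and the dictionary are illustrative.

```python
def total_gain(v_coherent, v_nrz, p_nrz_dbm, p_coherent_dbm):
    """Eye-height ratio multiplied by the laser-power ratio (dBm difference to linear)."""
    oma_ratio = v_coherent / v_nrz
    power_ratio = 10.0 ** ((p_nrz_dbm - p_coherent_dbm) / 10.0)
    return oma_ratio * power_ratio


# Eye heights [V] and laser powers [dBm] as quoted in the text.
MEASUREMENTS = {
    "1Gb/s, Vb = 0.5V": (0.6, 0.2, 4.0, 0.0),
    "1Gb/s, Vb = 1.0V": (0.37, 0.15, 4.0, 0.0),
    "10Gb/s, Vb = 1.0V": (0.050, 0.016, 10.0, 7.0),
}

if __name__ == "__main__":
    for label, args in MEASUREMENTS.items():
        print(f"{label}: total gain = {total_gain(*args):.2f}x")
```

The results agree with the quoted 7.5x, 6.2x and 6.3x to within the rounding of the dB-to-linear conversion.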

Figure 6.21: LF-BPSK measurement setup with two standalone receivers

Figure 6.22: The output waveforms of the two receivers in the modified LF-BPSK setup at 10Gbps

Figure 6.23: Comparison between the 10Gb/s eye diagrams of NRZ link and LF-BPSK link. Bias condition $V_b = 1.0V$. (a) 10Gb/s NRZ modulation, laser output power $P_L = 10$dBm, (b) 10Gb/s LF-BPSK, laser output power $P_L = 7$dBm.

In summary, the microring-based LF-BPSK achieves 6-7.5x gain in signal or reduction in laser power under the two bias conditions tested. According to the analysis in Section 6.3.3, the microring modulator based LF-BPSK achieves an optimal gain of 4.0x with 3.0dB vertical grating coupler (VGC) loss ($\alpha_c = 0.5$). In our case, the effective VGC loss is approximately 5dB ($\alpha_c = 0.3$) considering the loss from the polarization controller. From Equation 6.31 and Equation 6.30, the signal gain of LF-BPSK is inversely proportional to $\alpha_c$. Given the higher coupling loss, the simulated gain of LF-BPSK in Section 6.3.3 becomes 6.7x, which is close to the actual signal gain measured in the experiments. The measurement results also show that a larger resonance shift ($V_b=0.5V$) increases the gain of LF-BPSK over NRZ. This indicates that with typical bias conditions, the phase shift of the microring modulator is smaller than $\pi$, and the OMA can be further improved by optimizing the modulator ring designs discussed in Chapter 3.
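
The scaling with coupler loss can be written out explicitly. As a sketch, let $G$ (a symbol introduced here for illustration) denote the ratio of the two received signals, and assume the modulator terms $a_{m0,1}$ and $\phi_{m0,1}$ do not depend on the coupler loss. Dividing Equation 6.31 by Equation 6.30 then gives

$$
\begin{aligned}
G = \frac{A_{CD}}{A_{DD}} &= \frac{1}{\alpha_c} \cdot \frac{\sqrt{a_{m1}} \cos(\phi_{m1} - \phi_{OS}) - \sqrt{a_{m0}} \cos(\phi_{m0} - \phi_{OS})}{a_{m1} - a_{m0}}, \\
\frac{G(\alpha_c = 0.3)}{G(\alpha_c = 0.5)} &= \frac{0.5}{0.3} \approx 1.67, \qquad G(\alpha_c = 0.3) \approx 4.0 \times 1.67 \approx 6.7.
\end{aligned}
$$

The single factor of $1/\alpha_c$ reflects that the LO in the coherent link passes through fewer couplers than the signal does, so each additional decibel of coupler loss penalizes the non-coherent link more.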

### 6.4.2 Impact of the random phase drift on coherent detection

The major challenge of the laser-forwarding coherent architecture is the impact of the random phase drift between the LO and the received signal. This low-frequency phase noise is mainly caused by temperature fluctuation and mechanical vibration, which exist in any practical fiber-optic system. One potential solution to this issue is to use an on-chip phase tuner to track and cancel the phase drift adaptively, as shown in Fig. 6.5. For this adaptive phase tuning scheme to be feasible, the bandwidth of the phase drift needs to be relatively low.

The bandwidth of the random phase drift is estimated by varying the measurement window for the eye diagram. In Fig. 6.24(a), the measurement window is set to 25$\mu$s and the corresponding record length is 1Mpts. In Fig. 6.24(b), the measurement window is set to 50$\mu$s and the corresponding record length is 2Mpts. With the 50$\mu$s measurement window, the random phase drift starts to degrade the eye quality by making the edges more jittery. Therefore, the bandwidth of the phase drift is below roughly 40kHz (the reciprocal of the 25$\mu$s window) in our measurement setup, which makes real-time adaptive phase tracking possible.
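
The record lengths and the 40kHz figure follow from the oscilloscope sampling rate and the window duration. A small sketch of that arithmetic is given below; treating the reciprocal of the longest clean window as a bandwidth bound is an order-of-magnitude heuristic assumed here, and the function names are illustrative.

```python
def record_length(sample_rate_hz, window_s):
    """Number of samples captured in one measurement window."""
    return round(sample_rate_hz * window_s)


def drift_bandwidth_bound(clean_window_s):
    """Heuristic upper bound on the drift bandwidth: reciprocal of the longest clean window."""
    return 1.0 / clean_window_s


if __name__ == "__main__":
    for window in (25e-6, 50e-6):
        print(f"{window * 1e6:.0f} us window -> {record_length(40e9, window)} samples")
    print(f"drift bandwidth bound ~ {drift_bandwidth_bound(25e-6) / 1e3:.0f} kHz")
```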

![Figure 6.24: (a) 1Gb/s LF-BPSK eye diagram, measurement window = 25$\mu$s, samples = 1Mpts; (b) 1Gb/s LF-BPSK eye diagram, measurement window = 50$\mu$s, samples = 2Mpts](image)

### 6.4.3 Microring-based laser-forwarding QPSK

The microring transmitter used for our coherent measurements is also capable of PAM4 modulation. Each segment of the segmented microring modulator is driven by an individual driver to form an optical DAC. High-speed PAM4 modulation based on this microring transmitter has been demonstrated in [32]. Therefore, we can generate multi-level signals with the same segmented microring based on the same principle, and decode the received signal with the laser-forwarding coherent detection scheme.

With the same experimental setup as for LF-BPSK, we demonstrated 3Gb/s laser-forwarding QPSK transmission using the microring-based optical DAC. The measured QPSK eye diagrams are shown in Fig. 6.25 with different DAC codes and different measurement windows. The signal gain or laser power reduction from the laser-forwarding architecture is the same as in the LF-BPSK case because the highest and lowest levels directly correspond to level 1 and level 0 in the BPSK case. The measurement results show that the ratio between the three eye openings depends on the DAC code (comparing (a) and (c)). They also show that the random phase drift between the LO and the received signal impacts the eye quality in a similar way as in LF-BPSK. These results show the potential of using the laser-forwarding architecture for more complex coherent modulation schemes.

Figure 6.25: (a) 3Gb/s LF-QPSK eye diagram, DAC code = 4/9/15, measurement window = 25µs, samples = 1Mpts; (b) 3Gb/s LF-QPSK eye diagram, DAC code = 4/9/15, measurement window = 50µs, samples = 2Mpts; (c) 3Gb/s LF-QPSK eye diagram, DAC code = 5/10/15, measurement window = 25µs, samples = 1Mpts; (d) 3Gb/s LF-QPSK eye diagram, DAC code = 5/10/15, measurement window = 50µs, samples = 2Mpts.

### 6.5 Summary

A new laser-forwarding link architecture is proposed, analyzed and demonstrated in this chapter. The new architecture enables phase modulation and coherent detection for short-reach optical communication. The key advantage is that it significantly improves laser photon efficiency by utilizing homodyne detection and bypassing coupler losses in the system. Analysis has shown that with typical technology parameters, the laser-forwarding BPSK link could potentially save laser power by $7\times$ compared to conventional non-coherent links. Moreover, the performance of the silicon microring-based coherent link has been evaluated based on static modeling and transient simulations. Compared with microring-based non-coherent links, a 6 dB reduction in total laser power is shown to be possible using realistic device parameters. The impact of shot noise, thermal noise, phase noise, and the sampler swing requirement on link performance is also studied in this chapter. Among the noise sources, the phase noise of the laser source imposes the fundamental limit on the potential communication distance of the proposed coherent architecture. Around 1m of distance mismatch between the LO fiber and the signal fiber can be tolerated with typical lasers used in data-center link applications. In the proposed architectures, this is enough to address a variety of photonic interconnects, from on-board links to intra- and inter-rack links (assuming the LO is sent in the same fiber bundle as the modulated signals). Although this work has focused on simple BPSK modulation, the architecture could also be used for high-order modulation in short-reach optical links.

The key concepts of the proposed laser-forwarding architecture are proven experimentally. Coherent optical links (BPSK and QPSK) based on silicon microring transmitters are demonstrated with the laser-forwarding coherent architecture. Based on the experiments, we can achieve 6-7.5x OMA gain or 6-7.5x laser power reduction with microring modulators. A fast balanced receiver and a phase tracking loop would increase the data rate and reliability of this scheme for short-reach optical communications. In summary, the proposed architecture could potentially solve the energy bottleneck imposed by laser sources and open opportunities for coherent communication in short-reach optical links with low cost and high energy efficiency.
Chapter 7

Final Thoughts and Conclusions

Silicon photonics is a fascinating example of interdisciplinary technology. The recent progress in large-scale integration between photonics and VLSI systems has opened the door to a wide range of research topics and industrial applications, from high-speed optical links, LIDAR and photonic ADCs to bio-sensing. At the same time, the close integration presents new challenges for researchers and engineers, and solving these challenges requires expertise in both electronics and photonics.

The goals of this study are to develop co-design techniques, push performance limits and explore new architectures for high-speed silicon photonic interconnects. This study took the perspective of electronic-photonic co-design and tackled the major design challenges from device, circuits and system levels. Built upon the achievements of other researchers, the author was fortunate to go through the whole process of silicon photonics research, including platform development, model development, photonics design, circuits design, chip tapeout and chip measurements. In this process, new models, new designs and new architectures have been proposed, implemented and verified. The contributions of this work are summarized as follows.

First, a new co-optimization framework is developed for designing high-speed silicon photonics transmitters. It enables engineers to explore the design trade-offs in depth for microring and Mach-Zehnder optical transmitters and to compare their performance given the same set of technology and link constraints. This framework is applicable to most of today’s silicon photonics platforms and can be extended to include receiver designs and thermal tuning designs, assisting the co-optimization of next-generation silicon photonic interconnects.

Second, a full 40Gb/s optical NRZ transmitter using microring modulators has been demonstrated in a 45nm SOI process with record energy efficiency and bandwidth density. Electronic-photonic co-design with the high swing driver enabled this transmitter to achieve a total energy efficiency of 330fJ/b and a photonics and modulator driver area bandwidth density of $6.7\text{Tb/s/mm}^2$ at 40Gb/s. A full 40Gb/s PAM4 transmitter based on two-segment microring modulators has also been demonstrated on the same platform. These results make the microring-based transceivers an attractive solution for the next-generation 100Gb/s and 400Gb/s interconnects.

Third, a full 3D integrated chip-to-chip photonic link has been demonstrated for the first time. The transceiver chips are developed in a new wafer-scale heterogeneous platform, where the photonics and CMOS chips are 3D integrated using wafer bonding and low-parasitic capacitance thru-oxide vias (TOVs). This development platform yields 1000s of functional photonic components as well as 16M transistors per chip module. The transmitter operates at 6Gb/s with an energy cost of 100fJ/bit and the receiver at 7Gb/s with a sensitivity of $26\mu\text{A}$ (-14.5dBm) and 340fJ/bit energy consumption. A full 5Gb/s chip-to-chip link, with the on-chip calibration and self-test, is demonstrated over a 100m single mode optical fiber with 560fJ/bit of electrical and 4.2pJ/bit of optical energy.

Finally, a new short-reach laser-forwarding coherent link architecture is proposed, analyzed and verified with experiments. Coherent optical links (BPSK and QPSK) based on silicon microring transmitters are demonstrated with the laser-forwarding architecture. It achieves an OMA gain or a laser power reduction of 6-7.5x with microring modulators. This new approach could open opportunities for coherent communication in short-reach optical links with low cost and high energy efficiency.

Based on the co-design approach in this work, there are a few directions to further improve the performance of microring-based silicon photonic interconnects. From the photonics perspective, the doping options were limited for the microring modulators in the 40Gb/s monolithic transmitter design. A higher doping level should be used to achieve a larger resonance shift and thus higher OMA at higher data rates. In addition, the maximum depletion width was much smaller than the pitch of the interleaved junctions. Therefore, it is beneficial to reduce the pitch of the junctions as far as the DRC constraints allow. From the circuits perspective, driver circuits with even higher swing should be explored to further improve the OMA and save laser power, as there is still room in the total resonance shift. From the architecture perspective, clock-forwarded microring-based WDM can be implemented to improve the system energy efficiency by simplifying the clocking circuits.

This work has laid the theoretical foundation for the laser-forwarding coherent architecture and verified the key concepts through experiments. More work still needs to be done to make the proposed scheme practical in real-world short-reach interconnects. The most critical system block that needs to be developed is the phase tracking loop for compensating the slow phase drift in the fibers.

Lastly, I was fortunate to experience the beauty of photonics and electronics on this quest for faster and better silicon photonic interconnects. I hope that more people will join the effort to push the frontier of silicon photonics and contribute to the success of the industry.
Bibliography


[41] [Online]. Available: https://github.com/isgcal/SiPh_Simulink


