Integrated Circuit Design Techniques for High-Speed Low-Power Analog-to-Digital Converters and On-Chip Calibration of Sensor Interface Circuits

A Dissertation Presented

by

SEYED ALIREZA ZAHRAI

to

The Department of Electrical and Computer Engineering

in partial fulfillment of the requirements for the degree of

Doctor of Philosophy

in the field of

Electrical Engineering

Northeastern University
Boston, Massachusetts
August 2017
Abstract

To improve software-defined radio (SDR) architectures, it is desirable to capture wideband radio frequency (RF) signals with minimal analog receiver front-end circuitry, and to quantize these signals prior to adaptable digital signal processing operations. However, the high power consumption of wideband analog-to-digital converters limits the application range of this SDR design approach, especially for battery-powered devices. This creates an incentive for the development of low-power high-speed data converter architectures that enable the processing of wideband signals in communication applications such as direct RF sampling transceivers for SDRs and ultra-wideband (UWB) radios.

An 8-bit 1GS/s hybrid analog-to-digital converter (ADC) for high-speed low-power applications is introduced in this dissertation. It has a subranging architecture with a 3-bit flash ADC as a first stage and a 5-bit 4-channel time-interleaved comparator-based asynchronous binary search (CABS) ADC as a second stage. In each channel, a merged sample-and-hold and capacitive digital-to-analog converter (SHDAC) performs the tasks of sampling the input and generating the residue voltage for the subranging operation. The effects of the parasitic capacitances on the SHDAC linearity are analyzed, and a linearity correction method is introduced to enable power-efficient high-speed operation in the presence of parasitics. Furthermore, the sampling network configuration incorporates an error reduction technique to alleviate the clock feedthrough of bootstrap switches. The offsets of the comparators in the flash ADC are calibrated in the foreground using a built-in on-chip calibration system. Post-layout simulations of an 8-bit 1GS/s hybrid ADC design in 130nm complementary metal-oxide-semiconductor (CMOS) technology resulted in an effective number of bits (ENOB) above 7.37 up to the Nyquist frequency while consuming 13.3mW from a 1.2V supply. A prototype chip was fabricated in 130nm CMOS technology for experimental verification of the concepts. The evaluations of operation with 6-bit resolution at 1GS/s demonstrated a measured ENOB above 5.26 up to the Nyquist frequency with a power consumption of 10.5mW from a 1.2V supply.

Another part of this dissertation research addresses the enhancement of sensor interfaces through calibration to optimize signal conditioning. In particular, an on-chip digital calibration system was developed to automatically boost the input impedance of an analog front-end for monitoring of electroencephalography (EEG) signals over long durations. In calibration mode, an on-chip test signal is injected into the input of an instrumentation amplifier with digitally programmable negative capacitance generation feedback (NCGFB). The digital calibration unit has been designed to automatically control the capacitors in the NCGFB network based on the outputs of on-chip comparators for amplitude detection. An oscillation detection feature prevents unstable operation in the analog front-end. The main benefit of this built-in calibration approach is the added capability to automatically tune the system for compensation of manufacturing process variations. The on-chip calibration technique was demonstrated with measurements of a prototype chip fabricated in 130nm CMOS technology.
Acknowledgements

First and foremost, I would like to thank my Ph.D. advisor, Professor Marvin Onabajo, for his invaluable guidance and support throughout the years of my Ph.D. study. I would also like to thank my Ph.D. committee members, Professor Yong-Bin Kim and Professor Bradley Lehman, for their guidance through the final stages of the degree completion.

I would like to express my deepest appreciation to my parents for their love, patience, and support. Without their consistent support, I would not have accomplished this achievement. I would also like to thank my brother, Mehdi, for all his support and consultations during the years of my graduate studies.

I would like to thank Nicolas Le Dortz and Marina Zlochisti for their collaboration on the hybrid ADC project. I would also like to thank Chun-hsiang Chang, Li Xu, Kainan Wang, and Ibrahim Farah for collaborating on the SCAFELAB project. Without their hard work, I could not have completed the two research projects with success.

I would also like to thank the ECE Department and the College of Engineering at Northeastern University for the teaching assistantship opportunity, which was a unique and joyful experience for me in parallel with my Ph.D. research.

I thank the MOSIS service for providing the fabrication service for the microchips of the research projects. I would also like to thank the National Science Foundation (NSF) for the financial support of the SCAFELAB research project.
Table of Contents

1. Introduction
   1.1 Overview of Existing and Emerging Applications
   1.2 Design Challenges for Low-Power High-Speed ADCs
   1.3 Analog Front-End Requirements for Long-Term EEG Monitoring
   1.4 Contributions of this Research
       1.4.1 Low-power high-sampling-rate hybrid ADC with a subranging time-interleaved architecture
       1.4.2 Automatic on-chip digital calibration of an analog front-end for EEG
   1.5 Dissertation Structure

   2.1 Analog-to-Digital Converters
       2.1.1 Sampling theory
       2.1.2 Quantization
       2.1.3 Dynamic performance parameters
       2.1.4 Figure of Merit (FoM)
   2.2 Conventional Nyquist-Rate ADC Architectures
       2.2.1 Flash ADCs
       2.2.2 Interpolating and folding ADCs
       2.2.3 Subranging and two-step ADCs
       2.2.4 Pipelined ADCs
       2.2.5 Successive approximation register ADCs
       2.2.6 Time-interleaved ADCs
   2.3 Summary

3. Proposed High-Sampling Rate Hybrid ADC Architecture
   3.1 Time-Interleaved ADC Design Considerations
       3.1.1 Channel offset mismatch
       3.1.2 Channel gain mismatch
       3.1.3 Channel timing mismatch (timing skews)
       3.1.4 Channel bandwidth mismatch
   3.2 Sub-ADC Architectures in Time-Interleaved ADCs
       3.2.1 Overview of SAR ADC architectures
       3.2.2 Comparator-based asynchronous binary search (CABS) ADC
   3.3 Power-Efficient High-Speed Medium-Resolution ADCs
   3.4 Proposed Hybrid ADC Architecture
       3.4.1 Architectural power and area tradeoffs for the resolutions of the coarse and fine ADCs

4. Hybrid ADC Design Considerations and Circuit-Level Implementation
   4.1 Merged Sample-and-Hold and Digital-to-Analog Converter (SHDAC) Circuit
   4.2 Analysis and Correction of the Parasitic Capacitances’ Impacts on the Residue Voltage
   4.3 Bootstrap Switches
   4.4 Analysis of the Clock Feedthrough Cancellation Technique for Bootstrap Switches
       4.4.1 Sampling error due to charge injection
       4.4.2 Sampling error due to clock feedthrough
       4.4.3 Sampling voltage error cancellation
   4.5 Flash ADC
   4.6 Unity-Gain Voltage Buffer
   4.7 Comparator-Based Asynchronous Binary Search (CABS) ADC
   4.8 Clock Generation System
       4.8.1 Timing considerations for time-interleaved high-frequency ADCs
       4.8.2 Clock buffer
       4.8.3 Circuit implementation of the clock generation system
       4.8.4 Synchronous clock reset
       4.8.5 Layout and routing considerations for clock generation
   4.9 Channel Bandwidth Mismatch Considerations
   4.10 Calibration Technique for Flash ADC Offset Cancellation
   4.11 Hybrid ADC Post-Layout Simulation Results
   4.12 Interpretation of the Hybrid ADC Simulation Results
   4.13 Summary

5. Hybrid ADC Testing and Measurement Results
   5.1 Bit Alignment Unit
   5.2 Low-Voltage Differential Signaling (LVDS) Driver
   5.3 ADC Chip Fabrication and Packaging
   5.4 Hybrid ADC Test Setup
       5.4.1 Interface circuits for the input and clock signals of the ADC
       5.4.2 ADC output interface
   5.5 Printed Circuit Board
   5.6 Manual Flash ADC Offset Calibration
   5.7 Measurement Results
   5.8 Summary

6. On-Chip Digital Calibration of an Analog Front-End for Biopotential Measurements
   6.1 Self-Calibrated Analog Front-End for Long Acquisitions of Biosignals (SCAFELAB)
   6.2 Oscillation Detection Technique to Prevent Unstable Operation
   6.3 Analysis of the Required Time for Automatic Calibration
   6.5 Post-Layout Simulation Results
   6.6 Test Setup and Measurement Results
   6.7 Summary

7. General Conclusion and Future Work

8. References
List of Figures

Figure 1. Envisioned application of a wideband ADC in a short-range wireless transceiver.
Figure 2. Block diagram of a typical EEG acquisition system.
Figure 3. Sampling and quantization operations on an analog signal.
Figure 4. Frequency domain representation of a sampled signal with bandwidth of $f_B$ sampled at a sampling rate of $f_S$ when (a) $f_S > 2f_B$ and (b) $f_S < 2f_B$.
Figure 5. Ideal 3-bit ADC transfer function.
Figure 6. Quantization error of an ideal 3-bit ADC.
Figure 7. Spurious free dynamic range (SFDR) definition for an ADC.
Figure 8. Conventional flash ADC (2-bit resolution).
Figure 9. Interpolation concept.
Figure 10. Folding ADC example.
Figure 11. Subranging ADC architecture.
Figure 12. Pipelined ADC architecture.
Figure 13. SAR ADC architecture.
Figure 14. Time-interleaved ADC architecture.
Figure 15. Modeling of voltage offsets between channels in a TI ADC.
Figure 16. Modeling of gain errors between channels in a TI ADC.
Figure 17. Timing mismatches (clock skews) of sampling clocks between time-interleaved channels.
Figure 18. Modeling of clock skews amongst channels in a TI ADC.
Figure 19. Modeling of channel bandwidths in a TI ADC.
Figure 20. Block diagram of the proposed hybrid ADC architecture. (A single-ended representation is shown for simplicity.)
Figure 21. Operational sequences of the hybrid ADC.
Figure 22. Timing diagram for one channel of the hybrid ADC.
Figure 23. Estimated (a) power and (b) area for this hybrid ADC architecture with L-M bits, where $L =$ flash ADC resolution and $M =$ CABS ADC resolution.
Figure 24. SHDAC transfer function during sampling and residue generation.
Figure 25. Fully differential schematic of the semi-thermometer-coded SHDAC.
Figure 26. SHDAC controller and flash ADC thermometer-to-binary encoder.
Figure 27. Model for analysis with parasitic capacitances at the output of the SHDAC in one channel (single-ended equivalent).
Figure 28. Residue voltage generated by the SHDAC using the reference scaling method with (a) non-optimized design ($C_{pX1} \neq C_{pX2}$), (b) $C_{pX1}$ and $C_{pX2}$ values that are close to each other (through design optimization).
Figure 29. Bootstrap switch between the SHDAC and flash ADC with a dummy switch ($M_{13}$).
Figure 30. Simplified differential model of the sampling network to analyze the operation of the proposed dummy switch technique.
Figure 31. Simulated differential output voltage of the SHDAC, showing the voltage errors for three simulation cases.
Figure 32. Histogram of the error of the sampled voltage from 100 Monte Carlo post-layout simulation runs.
Figure 33. Single-ended equivalent of the 3-bit flash ADC architecture in the hybrid ADC.
Figure 34. Dynamic latched comparator with kickback reduction and offset compensation circuitry.
Figure 35. Simulation results with and without kickback reduction transistors: (a) differential input voltage signal of the flash ADC, (b) kickback voltage error vs. input amplitude.
Figure 36. Impacts of flash ADC comparator offsets on the subranging transfer function.
Figure 37. (a) Unity-gain buffer configuration, (b) schematic of the telescopic OTA in the unity-gain buffer.
Figure 38. Simulated (a) PSRR, and (b) THD of the unity-gain buffer from 100 Monte Carlo simulation runs.
Figure 39. 5-bit CABS ADC architecture (single-ended equivalent).
Figure 40. Outputs of the activated CABS comparators for one 5-bit conversion.
Figure 41. (a) CABS ADC comparator schematic, (b) pull up/down latch encoder.
Figure 42. Histogram of the CABS comparator’s offset from 100 Monte Carlo simulations.
Figure 43. Results from simulations to assess metastability: (a) regenerative latch output waveforms in the CABS comparator, and (b) propagation delay of the latch and the complete comparator.
Figure 44. (a) Generation of delayed versions of the main clock, (b) insertion of further delay and fan-out strength to drive the flash ADC.
Figure 45. Generation of the gating signals from a single reference clock using a ring counter.
Figure 46. (a) Clock gating scheme and (b) the combinational logic, shown for two of the four identical channels of the hybrid ADC.
Figure 47. Modeling the die pad and chip package parasitics.
Figure 48. Generation of non-overlapping signals for the second set of bootstrap switches (shown for two of the four identical channels).
Figure 49. Ring counter with two circulating bits of “1” to produce the clocks for the CABS ADCs.
Figure 50. Circuit for synchronous resetting.
Figure 51. Layout of the clock generation system.
Figure 52. Histogram of timing skews from 100 Monte Carlo simulations.
Figure 53. Histogram of (a) the total simulated BW mismatch $\sigma(\text{BW})/\text{BW}$ among TI channels, (b) the simulated BW mismatch $\sigma(\text{BW})/\text{BW}$ among TI channels caused only by the second bootstrap switch.
Figure 54. Offset calibration system for the flash ADC (single-ended equivalent).
Figure 55. Offset calibration ranges of the flash comparators.
Figure 56. Histograms of the comparator offsets from 100 Monte Carlo runs in the presence of transient noise (a) before and (b) after calibration.
Figure 57. Hybrid ADC dynamic performance at $f_s = 1\text{GS/s}$ vs. input frequency.
Figure 58. Layout of the hybrid ADC.
Figure 59. Output spectra (1024-point FFT) of the 8-bit 1GS/s hybrid ADC from post-layout simulation: (a) $f_{\text{in}} = 6.84\text{MHz}$, (b) $f_{\text{in}} = 491.2\text{MHz}$ with and without flash offset calibration.
Figure 60. Breakdown of the simulated power consumptions in the hybrid ADC.
Figure 61. Bit alignment unit for the hybrid ADC.
Figure 62. Timing of the bit alignment unit’s synchronized outputs.
Figure 63. On-chip LVDS Driver circuit schematic.
Figure 64. Simulated transient differential output waveform of the LVDS driver.
Figure 65. Hybrid ADC die layout with pads.
Figure 66. Micrograph of the fabricated hybrid ADC chip.
Figure 67. Test setup configuration at the ADC’s differential inputs.
Figure 68. Test setup configuration at the ADC’s differential inputs for low-frequency measurements.
Figure 69. Test setup configuration for the single-ended 1GHz input clock signal.
Figure 70. Test setup configuration at the ADC’s outputs.
Figure 71. Evaluation board for the hybrid ADC.
Figure 72. Measured DNL and INL of the hybrid ADC before flash ADC calibration (6-bit evaluation).
Figure 73. Measured DNL and INL of the hybrid ADC after flash ADC calibration (6-bit evaluation).
Figure 74. Measured output spectra (8192-point FFT) of the 8-bit 1GS/s hybrid ADC output for (a) $f_{\text{in}} = 10.193\text{MHz}$, (b) $f_{\text{in}} = 493.958\text{MHz}$ before and after flash offset calibration.
Figure 75. Measured output spectra (8192-point FFT) of the 6-bit 1GS/s hybrid ADC output for (a) $f_{\text{in}} = 10.193\text{MHz}$, (b) $f_{\text{in}} = 493.958\text{MHz}$ before and after flash offset calibration.
Figure 76. Hybrid SNDR and SFDR vs. input frequency at $f_s = 1\text{GS/s}$ (6-bit evaluation).
Figure 77. Measured ENOB of the hybrid ADC vs. input frequency at $f_s = 1\text{GS/s}$ (6-bit evaluation).
Figure 78. Model of a conventional dry skin-electrode-amplifier interface.
Figure 79. Self-calibrated analog front-end for dry-contact EEG measurements.
Figure 80. (a) Instrumentation amplifier (IA) with direct current feedback and negative capacitance generation feedback (NCGFB), (b) implemented NCGFB with programmable capacitor bank.
Figure 81. Monitoring scheme at the test amplifier output voltage for maximum impedance detection with comparators and SR latches.
Figure 82. Conceptual waveform diagrams of the IA’s differential output for the two possible cases when oscillation occurs after switching from a stable code to an unstable code.
Figure 83. Oscillation detection circuit.
Figure 84. Total calibration time vs. input capacitance for different alternatives to respond to oscillation events during calibrations.
Figure 85. Calibration flow chart.
Figure 86. On-chip digital calibration unit.
Figure 87. Layout of the digital calibration unit in 130nm CMOS technology: (a) control block, (b) memory block.
Figure 88. Simulated waveforms of the test amplifier output and the digital control signals for the capacitor bank switches ($C_{inn} = C_{inp} = 100$ pF).
Figure 89. Chip micrograph of the EEG front-end with circuits for automatic input impedance boosting (130nm CMOS technology).
Figure 90. The evaluation board designed for testing of the SCAFELAB chip.
Figure 91. Measurement setup to test the digital calibration of the SCAFELAB chip.
Figure 92. Transient waveforms of the (a) instrumentation amplifier’s single-ended buffered output during the complete calibration, and (b) the test amplifier’s single-ended output during the complete calibration, (c) before calibration, and (d) after calibration.
Figure 93. Digital switch control bits acquired with a logic analyzer at the (a) start and (b) end of a calibration with $C_{inp} = C_{inn} = 100$ pF.
List of Tables

Table 1. Comparison of various options for the first and second stage resolutions in the proposed ADC architecture.
Table 2. SHDAC switch control truth table for the residue generation and the resulting analog output voltage.
Table 3. Sampling error from schematic simulations for the three different cases.
Table 4. Hybrid ADC ENOB and SFDR for different process corner cases.
Table 5. Detailed simulation results of the hybrid ADC for all PVT corner cases.
Table 6. Performance summary and comparison.
Table 7. Specification summary of high-speed ADCs designed in 130nm and 90nm CMOS technologies.
Table 8. Offset calibration codes for the comparators in the flash ADC.
Table 9. Summary of the hybrid ADC measurement results and comparison to other works.
Table 10. Summary of the digital calibration unit on the prototype chip.
Table 11. Simulation results for different parasitic capacitance values and process corners.
Table 12. On-chip calibration unit’s output code with resulting instrumentation amplifier (IA) amplitude and input impedance.
List of Abbreviations

ADC ................................................................. analog-to-digital converter
AM ................................................................. amplitude modulation
ASIC .............................................................. application-specific integrated circuit
BER ........................................................................ bit error rate
BLE .......................................................................... Bluetooth low energy
BPF .......................................................................... band-pass filter
BW ........................................................................ bandwidth
CABS ....................................................................... comparator-based asynchronous binary search
CMOS ...................................................................... complementary metal-oxide-semiconductor
CMRR ...................................................................... common-mode rejection ratio
DAC .......................................................................... digital-to-analog converter
DFF ..................................................................... D flip-flop
DNL ........................................................................ differential nonlinearity
EEG ........................................................................ electroencephalography
ENOB ....................................................................... effective number of bits
FFT ........................................................................ fast Fourier transform
FoM ........................................................................ figure of merit
IA .............................................................................. instrumentation amplifier
IC .............................................................................. integrated circuit
INL ........................................................................ integral nonlinearity
IoT ........................................................................ internet-of-things
LSB ........................................................................ least significant bit
LVDS ...................................................................... low-voltage differential signaling
LVTTL ....................................................................... low voltage transistor-transistor logic
MDAC ........................................................................ multiplying digital-to-analog converter
MICS ....................................................................... medical implant communications service
MIM ........................................................................ metal-insulator-metal
MOM ........................................................................ metal-oxide-metal
MSB ........................................................................ most significant bit
NCGFB ....................................................................... negative capacitance generation feedback
Opamp ................................................................. operational amplifier
OTA ............................................................... operational transconductance amplifier
PCB ............................................................... printed circuit board
PDF ............................................................... probability density function
PM ............................................................... phase modulation
PSRR ............................................................. power supply rejection ratio
PVT ............................................................... process voltage temperature
RC ............................................................... resistor-capacitor
RF ............................................................... radio frequency
RMS .............................................................. root-mean-square
SAR ............................................................... successive approximation register
SCAFELAB .......... self-calibrated analog front-end for long acquisitions of biosignals
SDR .............................................................. software-defined radio
SFDR ............................................................. spurious free dynamic range
SH ............................................................... sample-and-hold
SHDAC ......................................................... sample-and-hold and capacitive digital-to-analog converter
SNDR ............................................................ signal-to-noise-and-distortion ratio
SNQR ............................................................ signal-to-quantization noise ratio
SNR ............................................................... signal-to-noise ratio
SoC .............................................................. system-on-a-chip
TH ............................................................... track-and-hold
THD ............................................................. total harmonic distortion
TI ................................................................. time-interleaved
UWB ............................................................. ultra-wideband
VGA ............................................................ variable gain amplifier
WMTS .......................................................... wireless medical telemetry service
1. Introduction

Analog-to-digital converters (ADCs) are fundamental building blocks in electronic systems that process or store analog signals in the digital domain. With the advances of complementary metal-oxide-semiconductor (CMOS) process technologies, the cost and power consumption of digital signal processing per function are decreasing. Furthermore, the flexibility and software reprogrammability provided by digital systems persuade designers to transfer more analog signal processing tasks into the digital domain. High-speed ADCs are widely used in various applications that will be reviewed in this chapter. Lowering the power consumption of such wideband ADCs can make it feasible to employ them in a broader range of demanding portable applications.

1.1 Overview of Existing and Emerging Applications

High sampling rate (1-3 GS/s) ADCs with medium resolutions (6-10 bits) are utilized in diverse applications including wireless communication systems [1], [2], ultra-wideband (UWB) radios [3], direct-sampling TV receivers [4], [5] and digital oscilloscopes [6]. Additional applications of these wideband ADCs are in high-speed communication systems such as serial-link receivers [7], [8], optical communications [9], and disk drive read channels [10]–[12].

Remote healthcare (telemedicine) offers convenience, early diagnosis, and lower healthcare costs. One of the device development goals for the internet-of-things (IoT) is to improve healthcare systems [13], [14]. This can be realized through wireless communication with implantable or wearable sensors [15], [16]. Thanks to the high processing capability of modern digital processors, software-defined radio (SDR) architectures can be utilized to minimize the number of analog blocks in wireless transceivers and to process signals in the digital domain. Integrating this type of wireless transceiver into portable devices such as smartphones or smartwatches demands low power consumption to extend battery lifetime, especially for high-speed analog-to-digital conversion. As an envisioned application, a low-power wideband ADC can be utilized in future short-range communication applications as visualized in Figure 1. The receivers in such systems can employ discrete-time radio frequency (RF) sampling analog front-ends
to selectively capture a wide range of RF input signals [17]–[19]. A wideband ADC is required as one of the key elements in these transceivers, and it usually consumes excessive power due to the high specification requirements. Utilizing a low-power 1GS/s ADC in systems such as the one depicted in Figure 1 will enable a 500MHz signal to be digitized before it is processed digitally to extract signal components of interest; e.g., Medical Implant Communications Service (MICS) signals in the 402–405MHz band. For certain communication standards, the input signal would need to be down-converted to the 500MHz range. This is the case for several standards with frequency bands centered at 2.4GHz, such as IEEE 802.15.4 ZigBee, 802.15.6 Medical Body-Area Networks and Bluetooth Low Energy (BLE). Other standards operating at lower frequencies, such as Wireless Medical Telemetry Service (WMTS) in the 608-614MHz, 1395-1400MHz and 1427-1432MHz bands, should also be down-converted prior to quantization.

1.2 Design Challenges for Low-Power High-Speed ADCs

Designing energy-efficient wideband ADCs is essential, especially for portable battery-powered devices. Traditionally, flash ADCs have been popular for high-speed analog-to-digital conversion [20]–[22]. However, their input capacitance and power consumption increase exponentially with the number of bits, which makes them less power-efficient when designed for higher resolution. A time-interleaved (TI) architecture
that uses lower-speed ADCs in parallel is an alternative for simultaneously achieving a high sampling rate and high energy efficiency. The energy per conversion step of a TI-ADC is ideally equal to that of the sub-ADC in each channel, but in practice there is a power overhead caused by multi-channel clock generation and calibration of channel mismatches [4], [25], [26]. Successive approximation register (SAR) ADCs are commonly used to implement each channel of a TI-ADC [4], [23], [24]. SAR ADCs are usually power efficient for medium resolutions (6-10 bits) and medium sampling rates (10-200 MS/s), and their digital features benefit from modern CMOS technologies [27]–[29]. However, their sampling rate is limited by the need for a high-speed clock for the SAR logic, and by the settling time of the capacitive digital-to-analog converter (DAC) in every cycle. Multi-bit per cycle SAR ADCs are suitable for high-speed low-power performance because of the reduced number of required cycles for a full conversion [3], [30]–[32]. However, when fabricated in CMOS technologies with relatively long channel length [3], they are not as power efficient as they are in short-channel CMOS technologies [30], [31]. Comparator-based asynchronous binary search (CABS) ADCs [33], [34] are capable of high conversion rates while consuming relatively low power. In comparison to conventional asynchronous SAR ADCs [30], a CABS ADC can support higher speed since it does not depend on settling delays associated with switched capacitors or changing reference voltages for each comparator decision. Using subranging or two-step architectures can help to reduce the power consumption in ADCs [1], [35].

In recent years, new architectures such as the subranging flash-SAR ADC [29] or flash-TI-SAR ADCs [36]–[39] were introduced as alternatives to TI-SAR ADCs. However, redundancy or calibration techniques are required to suppress the mismatches between the two stages in such architectures. Parasitic capacitances are one of the main challenges during high-speed ADC design. They impact the maximum speed by increasing the resistor-capacitor (RC) delays at internal nodes, degrade the linearity of the ADC, and limit its power-efficiency. Therefore, techniques are required to minimize the impacts of inevitable parasitic capacitances in order to achieve high-speed and medium-resolution with reduced power consumption. In addition, clock signals in high-speed ADCs have a
significant importance due to their impact on performance, making the generation and distribution of multi-phase clock signals a part of the research efforts.

1.3 Analog Front-End Requirements for Long-Term EEG Monitoring

Electroencephalography (EEG) signals are biopotential signals across the human scalp resulting from ionic current between the neurons of brain cells [40]. These signals are recorded from different regions of the scalp, usually by non-invasive electrodes. Each region has its own special importance, as they represent the neuronal activity at different locations of the brain [40]. EEG is commonly used for the study and diagnosis of neurologic disorders such as epilepsy [41]. On the scalp, EEG signals have small amplitudes of 10μV to 100μV when acquired by electrodes [42], requiring an appropriate analog front-end with circuits for signal amplification and filtering prior to recording or processing. Figure 2 represents a block diagram of a typical EEG acquisition system. As a popular and non-invasive method, the EEG signal is collected with an array of electrodes distributed at different locations on the scalp to be transferred to the analog front-end through cables. In the analog front-end, the signal is amplified by the instrumentation amplifier (IA), filtered by a low-pass and/or notch filter, and then amplified by a gain stage such as a variable gain amplifier (VGA). Then, an ADC converts the analog signal to a digital signal to be processed or recorded by computers or other digital signal processors.

![Figure 2. Block diagram of a typical EEG acquisition system.](image)

In emerging health monitoring applications with brain-computer interfaces, EEG signals are acquired and analyzed over long time periods. Some applications for long-term EEG monitoring are epilepsy diagnosis, drowsiness detection, and the recognition of a person’s intentions [43]. In general, dry electrodes are better suited for long-term monitoring, but their use is associated with increased contact resistances [44]. This characteristic complicates the measurement of small biopotentials because it requires a very high input impedance at the analog front-end amplifier, as high as 500 MΩ [45]. However, the input impedance is degraded by parasitic capacitances from the package of the integrated circuit as well as electrode cable and printed circuit board (PCB) capacitances, which can amount to 50-150 pF at the IA input. An instrumentation amplifier was designed in our research group with a negative capacitance generation feedback (NCGFB) technique to cancel the adverse effects of input capacitances from electrode cables and printed circuit boards [46], [47]. However, the IA’s NCGFB has to be adjusted to effectively boost the input impedance in the presence of process voltage temperature (PVT) variations and expected changes of electrodes/cables. In addition, overcompensation from the extra negative capacitance of the NCGFB method can lead to an unstable condition and oscillation in the analog front-end. To resolve these issues, a built-in on-chip calibration technique [48] has been created as part of this dissertation research to automatically tune the digitally-controllable NCGFB capacitor bank in the IA for adaptive boosting of the analog front-end’s input impedance.

1.4 Contributions of this Research

1.4.1 Low-power high-sampling-rate hybrid ADC with a subranging time-interleaved architecture

To address the rising demands for low-power high-speed ADCs, this research advances the concept of subranging and time-interleaved operation by realizing a novel hybrid flash-TI-CABS ADC architecture. A flash ADC is employed as the coarse ADC to resolve the most significant bits (MSBs), whereas four time-interleaved CABS ADCs in the second stage resolve the least significant bits (LSBs). The fast MSB conversion by the flash ADC, together with the use of high-speed CABS ADCs in a TI structure, helps to reduce the number of interleaved channels, which results in higher input bandwidth. This work introduces the first-time use of a CABS ADC in a time-interleaved architecture to take advantage of its high-speed low-power characteristics, providing a high time-interleaved sampling rate with a low number of channels. A new sample-and-hold and capacitive digital-to-analog converter (SHDAC) was designed to perform the sampling and residue generation for the subranging operation in each channel. Furthermore, a linearity
enhancement technique that involves scaling of the SHDAC reference voltages is introduced to suppress the impacts of parasitic capacitances on the residue voltage.

A clock feedthrough cancellation technique for bootstrap switches has been developed in this research to suppress the corresponding sampling errors and to enhance linearity. The systematic and random offsets of the flash ADC comparators are reduced using a foreground calibration technique to assure sufficient matching between the subrange stages. In addition, through the use of calibration, there is no requirement for preamplifiers in the flash ADC, resulting in significant reduction of its power consumption. Furthermore, the designed clock generation circuitry satisfies the jitter and timing-skew requirements for the 1GS/s operation with 8-bit resolution.

The proposed ADC architecture and associated design techniques described in this dissertation will assist designers to address some of the major challenges related to advancing the high-speed ADC state-of-the-art. Commercial ADCs with a similar range of sampling frequency and resolution are available, but they consume excessive amounts of power. The ADC design approach in this work will allow system designers to develop new applications that benefit from its wideband and low-power characteristics, such as software-reconfigurable transceivers with direct RF sampling in portable devices.

1.4.2 Automatic on-chip digital calibration of an analog front-end for EEG

Digitally-assisted analog circuit design approaches can substantially improve system performance [48], [49]. They make analog blocks more robust through digital calibration and correction techniques that compensate for device non-idealities and fabrication process variations [50]. In this dissertation research, an on-chip digital calibration scheme has been developed to automatically boost the input impedance of the instrumentation amplifier in analog front-ends for electroencephalography (EEG) measurements with dry electrodes. The digital on-chip calibration system automatically controls two on-chip programmable capacitor banks to boost the input impedance of an instrumentation amplifier (IA) in an EEG signal acquisition front-end. This promises to enable long-term brain signal measurements that require very high input impedance. In addition, an oscillation detection scheme was designed for the analog front-end to prevent unstable
operation that can occur due to overcompensation of the IA. The digital on-chip calibration system was integrated into an analog EEG front-end designed by other research group members. This prototype chip was fabricated in 130nm CMOS technology and verified with measurements.

Integration of the calibration system into a system-on-a-chip (SoC) with the analog front-end helps to adaptively boost the input impedance while minimizing additional power consumption and off-chip circuitry. Such automatic impedance boosting leads to better reliability and accuracy during long-term EEG signal monitoring. Hence, it is expected to aid the design of systems used by researchers and doctors who analyze the recorded EEG data, resulting in improved diagnosis of neurologic disorders as well as helping to develop new brain-controlled machine interfaces. In addition, the calibration can potentially reduce the patient preparation time prior to clinical EEG measurements.

1.5 Dissertation Structure

The organization of this dissertation is as follows: The relevant fundamentals of analog-to-digital conversion are presented in Chapter 2 together with an overview of the most conventional Nyquist-rate ADC architectures. Chapter 3 reviews general design considerations for time-interleaved ADCs as well as their sub-ADCs. In addition, it introduces the proposed hybrid ADC architecture. In Chapter 4, the design considerations, analysis of the errors, circuit-level implementation, and post-layout simulation results of the hybrid ADC are discussed. The ADC test setup and measurement results are presented in Chapter 5. Chapter 6 describes the on-chip digital calibration system for the EEG front-end with input impedance boosting, the experimental setup, and chip measurement results. An overall conclusion and suggestions for future research are provided in Chapter 7.
2. Analog-to-Digital Conversion Fundamentals and Conventional Architectures

In this chapter, we review the fundamentals of analog-to-digital conversion. The common Nyquist-rate ADC architectures will be summarized with a focus on their advantages and drawbacks that relate to more complex architectures such as hybrid ADCs.

2.1 Analog-to-Digital Converters

The analog-to-digital converter (ADC) is an electronic system that transforms a continuous-time and continuous-amplitude analog signal to a discrete-time and discrete-amplitude digital signal. An ADC can be designed and utilized as a standalone integrated circuit (IC) or as a subsystem of a larger system such as an application-specific IC (ASIC) or a system-on-a-chip (SoC). As shown in Figure 3, the two essential operations required for analog-to-digital conversion are sampling and quantization.

![Figure 3. Sampling and quantization operations on an analog signal.](image-url)

2.1.1 Sampling theory

An ideal sampling function in the time domain is equivalent to multiplying the signal with a train of impulses having a period of $T_s$ (Figure 3). Thus, the frequency domain equivalent of the sampled signal is the convolution of the Fourier transform of the signal with a train of impulses spaced by $f_s$ (in the frequency domain), where $f_s = 1/T_s$ is the sampling frequency. This implies that the frequency domain representation of the signal is repeated at integer multiples of $f_s$, as shown in Figure 4.

According to Nyquist’s theorem, the sampling frequency \( f_s \) must be more than twice the analog signal bandwidth \( f_B \) to be able to recover the original signal from the sampled version, as shown in Figure 4(a). As seen from Figure 4(b), if \( f_s/2 < f_B \), the frequency components of the desired bandwidth will be mixed with the ones from the neighboring replicas. This effect is known as aliasing [51], and once it occurs the original signal cannot be correctly recovered from the sampled version. In practice, the signal can be passed through an anti-aliasing filter before sampling to ensure that the undesired frequency components of the input that are above \( f_s/2 \) are filtered out before entering the ADC [52]. ADCs that operate based on the Nyquist sampling assumption \( f_s > 2f_B \) are referred to as Nyquist-rate ADCs. On the other hand, ADCs that operate with a sampling rate much higher than the input signal bandwidth \( f_s \gg 2f_B \) are called oversampling ADCs, which normally do not require a front-end anti-aliasing filter [52], [53]. A sample-and-hold (or track-and-hold) circuit is typically used in practice to sample the analog input signal.
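The folding of out-of-band tones into the first Nyquist zone can be illustrated with a short numerical sketch. It assumes ideal impulse sampling of a single tone; the function name `aliased_frequency` and the chosen test frequencies are illustrative and are not taken from the dissertation.

```python
def aliased_frequency(f_in: float, f_s: float) -> float:
    """Apparent frequency (0 to f_s/2) of a tone at f_in after ideal sampling at f_s."""
    f = f_in % f_s            # the sampled spectrum repeats every f_s
    return min(f, f_s - f)    # fold the replica into the first Nyquist zone


f_s = 1e9  # 1 GS/s sampling rate (illustrative choice)
for f_in in (403.5e6, 611e6, 1397.5e6):  # hypothetical input tones in Hz
    f_alias = aliased_frequency(f_in, f_s)
    status = "in band" if f_in < f_s / 2 else "aliased"
    print(f"{f_in / 1e6:8.1f} MHz -> {f_alias / 1e6:6.1f} MHz ({status})")
```

In this sketch, a tone below $f_s/2$ keeps its frequency, whereas a tone above $f_s/2$ appears at a folded frequency that cannot be distinguished from a genuine in-band component after sampling.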

2.1.2 Quantization

To be able to use the sampled signal for digital processing, the analog amplitudes must be approximated with a limited number of discrete levels. This process is called quantization. For an ADC with \( N \)-bit resolution, the total number of quantization levels is \( 2^N \). The least significant bit (LSB) of an \( N \)-bit ADC, as the minimum detectable input
voltage step, is defined as $\text{LSB} = \frac{V_{\text{FS}}}{2^N}$, where $V_{\text{FS}}$ is the full-scale amplitude of the analog input signal. Figure 5 displays the transfer function of an ideal 3-bit ADC, where the quantized voltage levels can be observed from the corresponding digital output codes.

![Figure 5. Ideal 3-bit ADC transfer function.](image)

There is a fundamental limitation while converting analog values to a limited number of quantized levels: The voltage difference between an analog voltage and its corresponding quantized voltage level is defined as quantization error. If the ADC resolution could approach infinity, then the quantization error would theoretically become zero. However, any realizable ADC has a finite number of levels, and increasing the number of bits is associated with complexity, chip area, and power consumption tradeoffs. Figure 6 shows the quantization error plotted for an ideal 3-bit ADC, calculated from the subtraction of the weighted digital outputs of the ADC from the analog inputs (Figure 5). With the assumption of a uniform probability density function (PDF) for the quantization error over the interval of $-\text{LSB}/2$ to $\text{LSB}/2$, the calculated quantization noise power ($P_{\text{qn}}$) is equal to $\text{LSB}^2/12$ [54].
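The $\text{LSB}^2/12$ result can be checked numerically with a minimal sketch of an ideal quantizer. The sketch assumes a unipolar input range from 0 to $V_{\text{FS}}$ and reconstruction levels placed at the center of each quantization step, which may differ from the exact transfer function in Figure 5; the function name `quantize` and the ramp length are illustrative.

```python
import math


def quantize(v: float, n_bits: int, v_fs: float = 1.0):
    """Ideal quantizer: returns (output code, reconstructed level at the step center)."""
    lsb = v_fs / 2 ** n_bits
    code = min(max(math.floor(v / lsb), 0), 2 ** n_bits - 1)
    return code, (code + 0.5) * lsb


n_bits, v_fs = 3, 1.0
lsb = v_fs / 2 ** n_bits

# Sweep a fine ramp across the full-scale range and collect the quantization error.
points = 8000
errors = []
for k in range(points):
    v = (k + 0.5) / points * v_fs
    _, level = quantize(v, n_bits, v_fs)
    errors.append(v - level)

noise_power = sum(e * e for e in errors) / len(errors)
print(f"simulated noise power: {noise_power:.6e}")
print(f"LSB^2 / 12           : {lsb ** 2 / 12:.6e}")
```

With a uniformly distributed input, the error stays within half an LSB of zero and its mean-square value approaches $\text{LSB}^2/12$.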
2.1.3 Dynamic performance parameters

The signal-to-noise ratio (SNR) of an ADC, also referred to as signal-to-quantization noise ratio (SNQR), with a full-scale sinusoidal input signal having power of $P_{\text{sig}}$ is defined as

$$SNR = 10 \log \frac{P_{\text{sig}}}{P_{\text{qn}}} = 10 \log \left( \frac{(V_{FS}/2)^2 / 2}{LSB^2 / 12} \right) = 10 \log \left( \frac{2^{2N-2} \cdot LSB^2 / 2}{LSB^2 / 12} \right) = 6.02N + 1.76 \qquad (1)$$

This SNR is calculated over the entire Nyquist band ($f_s / 2$). If the input signal bandwidth is much smaller than $f_s / 2$, the effective quantization noise inside the band of interest would be smaller. To improve the SNR, the quantization noise outside the signal bandwidth can be filtered out without degrading the desired signal, which is a popular technique in oversampling ADCs [35], [53].
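The ideal SNR and the benefit of limiting the noise bandwidth can be reproduced with a few lines of arithmetic. The sketch below assumes that the quantization noise is white (uniformly spread from 0 to $f_s/2$) and that an ideal brick-wall filter removes everything above the signal bandwidth; both are simplifying assumptions, and the function names are illustrative.

```python
import math


def ideal_snr_db(n_bits: int, v_fs: float = 1.0) -> float:
    """Quantization-limited SNR for a full-scale sine wave over the whole Nyquist band."""
    lsb = v_fs / 2 ** n_bits
    p_sig = (v_fs / 2) ** 2 / 2      # power of a sine with amplitude V_FS / 2
    p_qn = lsb ** 2 / 12             # quantization noise power
    return 10 * math.log10(p_sig / p_qn)


def inband_snr_db(n_bits: int, f_s: float, f_b: float) -> float:
    """SNR when only the noise within the signal bandwidth f_b is kept (white-noise assumption)."""
    return ideal_snr_db(n_bits) + 10 * math.log10((f_s / 2) / f_b)


print(f"8-bit, full Nyquist band : {ideal_snr_db(8):.2f} dB")
print(f"8-bit, f_b = f_s / 8     : {inband_snr_db(8, 1e9, 125e6):.2f} dB")
```

Under these assumptions, every factor of four between $f_s/2$ and the signal bandwidth improves the in-band SNR by about 6 dB, which corresponds to roughly one additional bit of resolution.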

In the theoretical equation (1) representing an ideal ADC, the only limiting factor of the SNR is the quantization noise. However, in a real ADC there are other noise sources that degrade the SNR, such as thermal noise, static errors (nonlinearities), and clock jitter noise. The signal-to-noise-and-distortion ratio (SNDR), also abbreviated as SINAD in some references [52], [55], is defined similarly to the SNR but also includes the distortion components generated by an input sine wave; i.e., the total harmonic distortion (THD) [55]. Hence, SNDR is defined as the ratio of the input signal power ($P_{\text{sig}}$) to the sum of the total noise power ($P_N$) and the total power of the harmonic components ($P_D$):

$$SNDR = 10 \log \left( \frac{P_{\text{sig}}}{P_N + P_D} \right) \qquad (2)$$

The effective resolution of an ADC is always less than the number of output bits (N) due to the practical non-idealities that degrade the SNDR. The effective number of bits (ENOB) of a Nyquist-rate ADC is defined in equation (3). The ENOB is a critical specification that is often used to convey the dynamic performance of an ADC [55].

$$ENOB = \frac{SNDR - 1.76}{6.02} \qquad (3)$$
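As a brief illustration of how the SNDR and ENOB definitions relate, the following sketch converts between the two quantities. The power values passed to `sndr_db` are hypothetical numbers chosen only for the example, and the function names are not from the dissertation.

```python
import math


def sndr_db(p_sig: float, p_noise: float, p_dist: float) -> float:
    """SNDR in dB from signal, noise, and harmonic distortion powers."""
    return 10 * math.log10(p_sig / (p_noise + p_dist))


def enob(sndr: float) -> float:
    """Effective number of bits from an SNDR given in dB, per equation (3)."""
    return (sndr - 1.76) / 6.02


def sndr_for_enob(bits: float) -> float:
    """Inverse of equation (3): SNDR in dB that corresponds to a given ENOB."""
    return 6.02 * bits + 1.76


example_sndr = sndr_db(p_sig=1.0, p_noise=2.0e-5, p_dist=1.0e-5)  # hypothetical powers
print(f"SNDR = {example_sndr:.2f} dB -> ENOB = {enob(example_sndr):.2f} bits")
print(f"ENOB of 7.37 bits <-> SNDR = {sndr_for_enob(7.37):.2f} dB")
```

The inverse relation is convenient when a specification is stated in bits: an ENOB of 7.37, for example, corresponds to an SNDR slightly above 46 dB.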

The spurious free dynamic range (SFDR) is the ratio of the root-mean-square (RMS) amplitude at the fundamental frequency to the RMS amplitude of the largest distortion component in a specified frequency range ($f_s/2$ for Nyquist-rate ADCs). In decibels (dB), it is the distance from the fundamental input signal to the worst (or highest) spur in a spectrum plot, as visualized in Figure 7. SFDR is important because noise and harmonics limit the dynamic range of a data converter [56], [57]. Hence, the SFDR is a critical specification in telecommunication and video applications, where the signal needs to be distinguished from the other frequency components that might be located close to the fundamental frequency [55].

![Figure 7. Spurious free dynamic range (SFDR) definition for an ADC.](image-url)
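The SFDR definition can be sketched numerically by computing a spectrum and comparing the fundamental with the largest remaining component. The example below uses a plain discrete Fourier transform from the standard library, coherent sampling (an integer number of input periods in the record), and a synthetic signal with one hypothetical third-harmonic spur; excluding the DC bin from the spur search is an assumption of this sketch.

```python
import cmath
import math


def amplitude_spectrum(samples):
    """Single-sided amplitude spectrum for bins 0 .. N/2 - 1 (naive DFT)."""
    n = len(samples)
    spectrum = []
    for k in range(n // 2):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(samples))
        spectrum.append(2 * abs(acc) / n)
    return spectrum


def sfdr_db(spectrum, fund_bin):
    """Distance in dB from the fundamental to the largest other component (DC excluded)."""
    spurs = [a for k, a in enumerate(spectrum) if k not in (0, fund_bin)]
    return 20 * math.log10(spectrum[fund_bin] / max(spurs))


n, fund_bin = 64, 5                      # 5 input periods in a 64-point record
samples = [
    math.sin(2 * math.pi * fund_bin * i / n)
    + 0.01 * math.sin(2 * math.pi * 3 * fund_bin * i / n)  # hypothetical third harmonic
    for i in range(n)
]
sfdr = sfdr_db(amplitude_spectrum(samples), fund_bin)
print(f"SFDR = {sfdr:.1f} dB")
```

Because both components are sinusoids, the ratio of their peak amplitudes equals the ratio of their RMS amplitudes, so a spur at 1% of the fundamental amplitude corresponds to an SFDR of 40 dB.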
2.1.4 Figure of Merit (FoM)

Equation (4) is a frequently used figure of merit (FoM) for ADCs, which represents the consumed energy per conversion step to compare efficiencies of different designs [58]. Although a few other ADC FoMs have been used (for instance in [9], [59]), it should be noted that these FoMs do not take all design aspects into account, such as the fabrication technology and supply voltage. Fabrication technology can significantly limit the ADC speed, resolution, and power consumption. Therefore, it is fairer to compare ADCs that are designed and fabricated in similar technologies.

$$FoM = \frac{\text{Total Power}}{2^{ENOB} \cdot f_s} \qquad (4)$$
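As a rough numerical illustration of equation (4), the sketch below evaluates the FoM with the power, ENOB, and sampling rate figures quoted in the abstract. Since the abstract states the ENOB as a lower bound ("above 7.37" and "above 5.26"), the results are back-of-the-envelope upper bounds on the energy per conversion step and are not the FoM values reported elsewhere in the dissertation; the function name is illustrative.

```python
def fom_energy_per_step(power_w: float, enob: float, f_s: float) -> float:
    """Energy per conversion step in joules, per equation (4)."""
    return power_w / (2 ** enob * f_s)


# Figures taken from the abstract; ENOB values are lower bounds up to Nyquist.
fom_post_layout = fom_energy_per_step(power_w=13.3e-3, enob=7.37, f_s=1e9)
fom_measured = fom_energy_per_step(power_w=10.5e-3, enob=5.26, f_s=1e9)

print(f"post-layout (8-bit mode): {fom_post_layout * 1e15:.1f} fJ/conversion-step")
print(f"measured (6-bit mode)   : {fom_measured * 1e15:.1f} fJ/conversion-step")
```

The exponential dependence on ENOB means that losing one effective bit doubles the FoM at constant power and sampling rate.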

2.2 Conventional Nyquist-Rate ADC Architectures

The task of analog-to-digital conversion can be carried out with different techniques and various types of ADCs. The required resolution, speed, power, area, and latency for the ADC in a system are dictated by the specific application. In general, each type of ADC exhibits its most efficient performance for a certain range of speed and resolution. For example, some types of ADCs are efficient with high resolutions but only with a low sampling rate, while others are very fast but not efficient in high-resolution applications. In the remainder of this chapter, the most conventional ADC architectures are briefly reviewed to establish how their pros and cons motivated the development of the hybrid ADC in Chapter 3.

2.2.1 Flash ADCs

An N-bit flash ADC consists of \(2^N-1\) parallel comparators that are all clocked simultaneously. As an example, Figure 8 shows a conventional 2-bit flash ADC. The reference voltages are generated by a ladder with \(2^N\) resistors having identical values, dividing the full-scale range (\(V_{FS}\)) into \(2^N\) regions. All comparators simultaneously compare the input signal with their corresponding reference voltages. If the input voltage is larger than a reference voltage, then the corresponding comparator generates a logic output of 1 (high), otherwise the logic output is equal to 0 (low). As a result, when monitoring the outputs of the comparators connected between the low and high reference
voltages, there will be a series of 1s that transitions to a series of 0s. This pattern, often referred to as thermometer code, allows the quantization level closest to the input amplitude to be determined. A thermometer-to-binary encoder can generate the final binary output.

![Figure 8. Conventional flash ADC (2-bit resolution).](image)
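The flash conversion principle can be summarized with a short behavioral sketch. It assumes ideal, offset-free comparators and a resistor ladder that spans 0 to $V_{FS}$; the thermometer-to-binary encoder is modeled as a simple count of the 1s, which is one of several possible encoder implementations. The function name `flash_adc` is illustrative.

```python
def flash_adc(v_in: float, n_bits: int, v_fs: float = 1.0):
    """Behavioral model of an ideal N-bit flash ADC; returns (thermometer code, binary code)."""
    n_levels = 2 ** n_bits
    # Ladder taps: 2^N identical resistors produce 2^N - 1 reference voltages.
    refs = [k * v_fs / n_levels for k in range(1, n_levels)]
    # All comparators decide in parallel: 1 if the input exceeds the reference.
    thermometer = [1 if v_in > ref else 0 for ref in refs]
    # Thermometer-to-binary encoding by counting the 1s.
    return thermometer, sum(thermometer)


for v in (0.10, 0.30, 0.60, 0.90):
    thermometer, code = flash_adc(v, n_bits=2)
    print(f"v_in = {v:.2f} -> thermometer {thermometer} -> code {code:02b}")
```

For the 2-bit case, an input of 0.6 with a full-scale range of 1 exceeds the references at 0.25 and 0.5 but not the one at 0.75, which gives the thermometer code 1, 1, 0 and the binary output 10.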

Since all the comparators operate in parallel and at the same time, the conversion time of a flash ADC equals only one clock cycle, making it well suited for high-speed applications. However, since the number of comparators increases exponentially with the number of bits, the flash ADC is not area- and power-efficient when high resolution is required. Increasing the number of comparators creates a large input capacitance that originates from the total parasitic capacitances of the transistors at the input of the comparators as well as the total routing capacitance from distributing the input signal to the comparators on the chip.

Due to the inevitable mismatches between the input and clock routing networks to the comparators, the RC delays of the signal paths to each comparator vary. These timing mismatches can result in significant conversion errors (especially for high input frequencies) because each comparator processes a different (i.e., delayed) voltage during the same conversion. To avoid this problem, a front-end sample-and-hold (S/H) or track-and-hold (T/H) is often included in flash ADCs. Moreover, the parasitic capacitances of the transistors at the comparator inputs vary with the applied input voltage amplitude.
Therefore, the total input capacitance of a flash ADC changes nonlinearly with the input voltage. This nonlinear input capacitance can cause SNDR degradation at the output of the front-end S/H.

The input-referred offset voltage of the comparators in a flash ADC is another problem, which creates nonlinearity errors that become more severe for higher resolutions. To reduce the input-referred offset, a common method is to use preamplifiers for each comparator [60], such that the input-referred offset is divided by the gain of the preamplifier. However, these preamplifiers should be designed with high bandwidth and gain, which significantly increases the power consumption. As a low-power alternative, there are several offset calibration techniques to suppress comparator offset errors [21], [61], [62]. In flash ADCs with high resolution, not only does the offset requirement become more stringent, but the number of comparators to be calibrated also increases exponentially, which makes such calibration systems more complicated and requires more layout area for on-chip implementation.

2.2.2 Interpolating and folding ADCs

Interpolating and folding architectures have been introduced to alleviate some of the main limitations of flash ADCs such as high power consumption and large layout area [63]. These architectures operate with single-step conversion, and can be as fast as a flash ADC in theory. Figure 9 displays the concept of interpolation in ADCs. One output from each of the two adjacent preamplifiers is connected to the middle comparator in a way that it effectively compares the signal with a reference voltage in the middle of $V_{R1}$ and $V_{R2}$. Interpolating reduces the number of preamplifiers as well as resistors in the reference ladder to half (or less, depending on the interpolation factor) in comparison to a conventional flash ADC, resulting in a reduction of the total area and power consumption. However, the number of latched comparators in an interpolating ADC is still the same as in a standard flash ADC with the same resolution. Another benefit of interpolation is improved linearity due to the distribution of the errors [64]. In addition, further interpolation levels are possible with extra interpolation resistor ladders between the preamplifiers and latched comparators [65].

Figure 9. Interpolation concept.

Figure 10 displays the block diagram of a folding ADC. A coarse ADC resolves the most significant bits (MSBs). In parallel, a folding circuit divides the input signal into several regions, and then a fine ADC converts the least significant bits (LSBs) independently of the coarse ADC outputs. In a folding ADC, the MSBs are resolved in parallel with the folding operation and fine ADC decision, but in practice there is a small delay for the folding operation. Similar to the flash ADC, a folding ADC requires a front-end S/H to ensure that the same sampled value is processed by the folding circuit and the coarse ADC to avoid conversion errors. Folding significantly reduces the number of comparators because it divides the high-resolution ADC into coarse and fine ADCs with lower resolutions. It is a popular technique to combine interpolation and folding architectures to achieve better power and area efficiency, as in [66], [67] for instance.

Figure 10. Folding ADC example.
The high power consumption and limited bandwidth of the preamplifiers, nonlinearity of the practical folding circuit, and the delay of the folding path are the main limiting factors in folding and interpolating ADC architectures. Therefore, for high-speed analog-to-digital conversion with medium to high resolutions, other architectures such as time-interleaved ADCs are usually more power-efficient.

2.2.3 Subranging and two-step ADCs

As mentioned in Section 2.2.1, the number of comparators in flash ADCs exponentially increases with the number of bits, resulting in high power consumption and large chip area. Subranging and two-step architectures were introduced as a solution to this problem. In a subranging or two-step ADC, a high-resolution conversion task is divided between two ADCs with lower resolution that operate sequentially, as depicted in Figure 11. The coarse ADC operates with full-scale range, and resolves the MSBs from the sampled input voltage. The DAC generates a quantized reference level according to the MSBs. Next, the DAC output is subtracted from the sampled input voltage, generating a residue voltage. Finally, the fine ADC in the second stage, operating with a sub-range of the full-scale range, resolves the LSBs from the residue voltage. A two-step ADC is similar to a subranging ADC, but utilizes a gain stage to amplify the residue voltage before delivering it to the fine ADC [68]. Using such architectures can significantly reduce the number of comparators in an ADC. For example, a flash ADC with 8-bit resolution requires 255 comparators activated in parallel, while an equivalent subranging ADC consisting of a 4-bit coarse ADC and 4-bit fine ADC only requires 30 comparators in total, leading to significant reduction of power consumption and area on chip.

![Subranging ADC Architecture](image)

Figure 11. Subranging ADC architecture.
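
The conversion steps and the comparator count discussed above can be sketched with a short behavioral model. This is only an idealized illustration (no offsets, noise, or range mismatch, and a unipolar input range from 0 to `v_fs`); the function names are hypothetical and both stages are assumed to be flash sub-ADCs.

```python
def flash_comparators(n_bits):
    """Comparators in an n-bit flash ADC: one per decision threshold."""
    return 2 ** n_bits - 1


def subranging_comparators(coarse_bits, fine_bits):
    """Total comparators when both stages are flash sub-ADCs."""
    return flash_comparators(coarse_bits) + flash_comparators(fine_bits)


def quantize(v, v_low, v_high, n_bits):
    """Ideal n-bit quantizer over [v_low, v_high); returns a code 0..2**n-1."""
    levels = 2 ** n_bits
    code = int((v - v_low) / (v_high - v_low) * levels)
    return min(max(code, 0), levels - 1)


def subranging_convert(v_in, v_fs, coarse_bits, fine_bits):
    """Ideal subranging conversion of v_in in [0, v_fs)."""
    msb = quantize(v_in, 0.0, v_fs, coarse_bits)        # coarse ADC, full-scale range
    sub_range = v_fs / 2 ** coarse_bits                 # width of one coarse step
    v_dac = msb * sub_range                             # DAC level selected by the MSBs
    residue = v_in - v_dac                              # subtraction
    lsb = quantize(residue, 0.0, sub_range, fine_bits)  # fine ADC, sub-range only
    return (msb << fine_bits) | lsb


print(flash_comparators(8))            # 255
print(subranging_comparators(4, 4))    # 30
print(subranging_convert(0.7, 1.0, 4, 4))
```

In this ideal case the combined code equals that of a direct 8-bit quantizer; the range-mismatch effects discussed next are not modeled.
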
Despite the advantages of subranging and two-step ADC architectures, there are some drawbacks that should be considered. Since the second stage must wait for the completion of the first conversion and of the residue generation, there is an inevitable latency in the final digital output. It should also be noted that a front-end S/H is necessary in subranging or two-step ADCs. Moreover, if there is any mismatch between the generated residue voltage range and the input range of the fine ADC, then the fine ADC converts an erroneous residue voltage, causing severe linearity issues such as missing codes or non-monotonicity. Thus, the design should be robust and well-trimmed to ensure sufficient matching between the two stages for a given target resolution. The use of redundancy (implying additional resolution in the coarse ADC and/or fine ADC) as well as calibration are techniques to suppress such range-mismatch issues [69], [70]. It is also noteworthy that any non-ideality in the DAC can introduce errors in the LSBs generated by the fine ADC. An N-bit subranging (two-step) ADC can be constructed from L-bit coarse and M-bit fine ADCs, where N = L+M. Although the coarse and fine ADCs have lower resolutions, they still have to be designed and optimized for N-bit offset accuracy. An amplifier in a two-step ADC with a gain of $2^L$, which scales the residue back to the full-scale range, will relax the offset requirement for the fine ADC to only M-bit. In practice, the gain, linearity, bandwidth, and offset of the amplifier should also satisfy the N-bit accuracy of the combined ADC to avoid errors in the residue voltage. Moreover, the power consumption of such an amplifier can be significantly high due to the stringent requirements for high resolutions, especially at high conversion rates.

2.2.4 Pipelined ADCs

Pipelined ADC architectures (such as the one in Figure 12) combine the concept of two-step analog-to-digital conversion with a pipelining technique to extend high-resolution operation to higher conversion rates. After sampling, the first stage starts to generate its corresponding output bits. It also generates the residue voltage and amplifies it to full-scale to be delivered to the next stage. This operation continuously occurs in all the subsequent stages. While the current stage is converting a sample, the preceding stage is processing the next sample. The last few LSBs in the final stage of a pipelined ADC are generally resolved by a low-resolution flash ADC. The final digital output code is generated at the same conversion rate as that of one pipelined stage. There is a substantial latency in pipelined ADCs because a complete digital output will be resolved after all stages finish their conversion. Nevertheless, the output codes are generated with high speeds thanks to the pipelining operation.

![Figure 12. Pipelined ADC architecture.](image)

Each pipelined stage realizes the functions of a sample-and-hold, subtraction, DAC, and amplification, similar to a two-step ADC but without the fine ADC. Such a subsystem is referred to as a multiplying digital-to-analog converter (MDAC) [69], [71]. A switched-capacitor circuit, comparators, and a high-performance operational amplifier (opamp) are usually used to build MDACs [72]. High-speed high-resolution pipelined ADCs require high-performance power-hungry amplifiers [73], [74], which are often the bottlenecks of the design. Similar to subranging ADCs, the use of redundancy techniques is very popular in pipelined ADCs. For example, a 1.5-bit pipelined stage can significantly relax the offset requirement of the comparators in each stage [72], [75].
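
The offset tolerance of a 1.5-bit stage can be illustrated with a small behavioral sketch. It assumes the textbook 1.5-bit stage (thresholds at $\pm V_{ref}/4$, decisions in {-1, 0, +1}, residue gain of 2) with an ideal MDAC and a bipolar input range of $\pm V_{ref}$; the function names and the random offset model are hypothetical and are not taken from the cited designs.

```python
import random


def stage_1p5_bit(v_in, v_ref, offset_hi=0.0, offset_lo=0.0):
    """One 1.5-bit stage: returns (d, residue) with d in {-1, 0, +1}."""
    if v_in > v_ref / 4 + offset_hi:
        d = 1
    elif v_in < -v_ref / 4 + offset_lo:
        d = -1
    else:
        d = 0
    residue = 2.0 * v_in - d * v_ref    # ideal MDAC: subtract DAC level, gain of 2
    return d, residue


def pipeline_estimate(v_in, v_ref, n_stages, max_offset=0.0, rng=None):
    """Reconstruct v_in from the stage decisions (binary-weighted sum)."""
    if rng is None:
        rng = random.Random(0)
    estimate = 0.0
    v = v_in
    for i in range(1, n_stages + 1):
        # Each comparator gets its own random offset within +/- max_offset.
        off_hi = rng.uniform(-max_offset, max_offset)
        off_lo = rng.uniform(-max_offset, max_offset)
        d, v = stage_1p5_bit(v, v_ref, off_hi, off_lo)
        estimate += d * v_ref / 2 ** i
    return estimate


rng = random.Random(1)
worst = max(abs(pipeline_estimate(v / 100.0, 1.0, 8, max_offset=0.2, rng=rng) - v / 100.0)
            for v in range(-99, 100))
print(worst <= 1.0 / 2 ** 8)
```

As long as the comparator offsets stay below $V_{ref}/4$, the residue remains within $\pm V_{ref}$ in this model, so the reconstruction error is bounded by $V_{ref}/2^K$ for $K$ stages regardless of the individual offsets.
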

Popular methods to reduce the power consumption of a pipelined ADC are stage power scaling [69], [76], opamp sharing [77], and switched-opamp [78] techniques. However, these techniques cannot be used for high-speed high-resolution ADCs due to limitations such as memory effects and additional delay because of extra clock phases. Alternative techniques such as comparator-based and charge-pump-based pipelined ADC architectures have been introduced together with calibrations to reduce the power consumption [79], [80]. However, they are not appropriate for high resolution at high speeds.
2.2.5 Successive approximation register ADCs

The simplified block diagram of a typical successive approximation register (SAR) ADC is shown in Figure 13, which consists of a S/H, comparator, DAC, and digital logic (controller). A full analog-to-digital conversion in a SAR ADC is performed over multiple clock cycles. The reference voltage of the comparator is generated by a DAC that has a resolution equal to the SAR ADC resolution and that is controlled by the digital SAR logic. The first clock cycle is generally dedicated to sampling the input. In the first cycle after the sampling, the DAC output is set to \( V_{FS}/2 \), such that the comparator compares the sampled voltage with the middle reference level, resolving the first MSB. Depending on the comparator’s output after each comparison, the SAR logic sets the DAC input bits to generate the appropriate reference voltage for the next comparison. The process continues until the last bit is resolved. Using a successive approximation algorithm (e.g., binary search algorithm), one output bit is generated during each conversion cycle. Therefore, a minimum of \( N+1 \) clock cycles is required to carry out a full \( N \)-bit conversion with a basic SAR ADC.

![Figure 13. SAR ADC architecture.](image)
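
The binary search described above can be written as a minimal behavioral sketch. It assumes an ideal comparator and DAC and a unipolar input range from 0 to `v_fs`; the function name and the cycle counter are illustrative only.

```python
def sar_convert(v_in, v_fs, n_bits):
    """Ideal N-bit binary-search SAR conversion of v_in in [0, v_fs).

    Returns (code, cycles), where cycles counts one sampling cycle plus
    one comparison cycle per bit.
    """
    code = 0
    cycles = 1                          # sampling cycle
    for bit in range(n_bits - 1, -1, -1):
        trial = code | (1 << bit)       # tentatively set the current bit
        v_dac = trial * v_fs / 2 ** n_bits
        if v_in >= v_dac:               # comparator decision
            code = trial                # keep the bit
        cycles += 1
    return code, cycles


print(sar_convert(0.7, 1.0, 8))   # (179, 9)
```

The first trial code sets only the MSB, which corresponds to a DAC output of $V_{FS}/2$, and the returned cycle count reproduces the $N+1$ cycles of a basic SAR ADC.
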

SAR ADCs are generally power-efficient because they do not require a power-hungry component such as an opamp. However, their conversion speed is limited due to the large number of conversion cycles required for a full analog-to-digital conversion. For an \( N \)-bit SAR ADC, the internal clock frequency, comparator delay, and settling time of the DAC have to be optimized to be \( N \) times faster than the nominal sampling rate, which can easily reach the practical limits of a CMOS technology. Since there is only one comparator in a conventional SAR ADC, the comparator offset will appear as a universal offset for the complete ADC transfer function. Such offset impact can be calibrated with simple means, which is another advantage of this architecture.
Due to the high compatibility of SAR ADCs with digital CMOS and modern deep-submicron technologies, they have become very popular in recent years. Depending on the topology and fabrication technology, SAR ADCs can achieve a wide range of characteristics as standalone ADCs, such as ultra-low power [81], [82], high speed [83], and high resolution [84]. Furthermore, they can be used as a part of a hybrid ADC such as in [4], [8], [85]. There are many different techniques and architectures to implement SAR ADCs, which will be elaborated in Section 3.2.1.

2.2.6 Time-interleaved ADCs

In a time-interleaved (TI) ADC, multiple ADCs operate in parallel to effectively achieve higher sampling rates. Figure 14 shows the block diagram of a simple M-channel time-interleaved ADC. By time-interleaving a number M of ADCs, each with sampling rate of \( f_s \), a total sampling rate of \( M \times f_s \) is attainable in theory. Therefore, conversion speeds of several GS/s are achievable with this architecture. In each channel, a sample-and-hold captures the input signal and the sub-ADC resolves the digital output at a conversion rate of \( f_s \). After combining the channel outputs with a digital multiplexer, the effective ADC output is generated at the rate of \( M \times f_s \). Each channel has \( 1/(M \times f_s) \) seconds delay compared to its neighboring channels.
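
The round-robin operation can be sketched as follows. The snippet is an idealized illustration in which the sub-ADCs are replaced by pass-through lists; the helper names are hypothetical.

```python
def interleave(samples, n_channels):
    """Round-robin distribution: sample n goes to channel n mod M."""
    return [samples[ch::n_channels] for ch in range(n_channels)]


def multiplex(channel_outputs):
    """Digital multiplexer: recombine channel outputs in sampling order."""
    n_channels = len(channel_outputs)
    total = sum(len(c) for c in channel_outputs)
    return [channel_outputs[n % n_channels][n // n_channels] for n in range(total)]


def channel_sample_times(n_channels, f_s_channel, n_per_channel):
    """Sampling instants per channel; neighbors are offset by 1/(M*f_s)."""
    t_step = 1.0 / (n_channels * f_s_channel)
    return [[(k * n_channels + ch) * t_step for k in range(n_per_channel)]
            for ch in range(n_channels)]


channels = interleave(list(range(10)), 4)
print(channels)               # [[0, 4, 8], [1, 5, 9], [2, 6], [3, 7]]
print(multiplex(channels))    # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```

For example, with four channels at 250 MS/s each, `channel_sample_times(4, 250e6, 2)` places neighboring channels 1 ns apart, which corresponds to an aggregate rate of 1 GS/s.
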

A time-interleaved ADC can employ several power-efficient ADCs in parallel to achieve the same performance as a flash ADC but with lower power consumption in many applications, although the benefit also depends on the characteristics of a given CMOS technology. It is possible to use different types of ADCs (such as SAR, pipelined, and flash ADCs) in a time-interleaved architecture; the choice has to be made by the designer under consideration of the specific application and power efficiency requirements.

The main challenges with time-interleaving are offset mismatch, gain mismatch, timing mismatch (timing skew), and bandwidth mismatch among the channels, which will be discussed in Section 3.1. Thus, TI ADCs often require calibration techniques to achieve medium-to-high resolutions at relatively high speeds. In theory, the total power consumption of an M-channel TI ADC is equal to M times the power of the single ADC in each channel. Therefore, the overall energy per conversion of the TI ADC is expected to be the same as that of the single ADC used in the channels. In practice, however, a TI ADC will always be less energy-efficient than its sub-ADCs because of the power overhead associated with interleaving [4]. This overhead includes the generation and distribution of multiple clock phases, the distribution of the input and reference signals to all channels, and the correction of errors from channel mismatches by overdesign or calibration. Therefore, to achieve the best efficiency, the power consumption of each individual channel as well as the time-interleaving overhead should be minimized.

2.3 Summary

The basic Nyquist-rate ADC architectures were reviewed in this chapter. Flash, folding, and interpolating ADCs resolve the digital outputs in one cycle, achieving high conversion rates. However, they are not area- and power-efficient when designed for medium to high resolutions due to the high numbers of active comparators. Subranging, pipelined, and SAR ADCs require multiple clock cycles to complete a full analog-to-digital conversion, resulting in latency of the output. Although they are usually slower, these architectures tend to be more power-efficient than flash ADCs. Time-interleaved ADCs, on the other hand, can achieve high conversion rates by operating several such sub-ADCs in parallel. As exemplified in the next chapter, a hybrid ADC architecture can be constructed by combining different basic ADC architectures to accomplish higher sampling rate and efficiency.
3. Proposed High-Sampling Rate Hybrid ADC Architecture

In this chapter, the general design challenges of TI ADCs are briefly studied first. Then, a concise overview of SAR ADC architectures is presented, as well as a review of some existing high-frequency ADC architectures. Towards the end, the proposed ADC architecture is introduced along with a description of system-level design aspects.

3.1 Time-Interleaved ADC Design Considerations

Time-interleaving is an effective technique to increase the sampling-rate of ADCs as described in Section 2.2.6. Next, the impacts of channel mismatches are outlined to bring attention to the most common issues in time-interleaved ADCs. Interested readers can refer to [25], [58], [86]–[88] for additional theory and analysis.

3.1.1 Channel offset mismatch

Offset mismatches among TI ADC channels can originate from the difference between DC offsets of the buffers or amplifiers, charge injection errors, and also offset errors of each sub-ADC in TI channels. Figure 15 models the total input-referred offset voltage ($V_{OS}$) of each channel in an M-channel TI ADC. The offset mismatches between each channel cause an error signal with fixed amplitude and periodic pattern in the time-domain ADC output [25]. In the frequency domain, the undesired frequency components due to offset mismatch error of an M-channel TI-ADC occur at

$$f_{\text{offset}} = \frac{k}{M} f_s, \quad k = 1, 2, \ldots, \quad (5)$$

where $f_s$ is the sampling frequency of the TI ADC. The SNR degradation due to the offset mismatch is constant and independent of the input frequency and amplitude. The corresponding SNR degradation can be calculated from the amount of offset mismatch ($\sigma_{os}$). The required standard deviation of the channel offset in an M-channel N-bit TI ADC can be calculated with the following equation [88]:

$$\sigma_{os}^2 \leq \left( \frac{M}{M-1} \right) \left( \frac{2 \cdot P}{3 \cdot 2^{2N}} \right), \quad (6)$$

where P is the input signal power.
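
The spur locations of equation (5) can be checked numerically with a small sketch. It assumes an otherwise ideal 4-channel TI ADC, a coherently sampled unit-amplitude sine, and hypothetical channel offsets; a direct DFT evaluation at selected bins is used so that only the Python standard library is needed.

```python
import cmath
import math


def dft_magnitude(x, k):
    """Magnitude of DFT bin k, normalized by the record length."""
    n = len(x)
    acc = sum(x[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n))
    return abs(acc) / n


def ti_output_with_offsets(n_samples, signal_bin, offsets):
    """Unit-amplitude sine sampled by len(offsets) interleaved channels,
    each adding its own input-referred offset."""
    m = len(offsets)
    return [math.sin(2 * math.pi * signal_bin * i / n_samples) + offsets[i % m]
            for i in range(n_samples)]


n_samples, m = 256, 4
offsets = [0.004, -0.003, 0.001, -0.002]    # hypothetical channel offsets
y = ti_output_with_offsets(n_samples, 19, offsets)

# Bins that correspond to k*fs/M up to the Nyquist frequency.
spur_bins = [k * n_samples // m for k in range(1, m // 2 + 1)]
for b in spur_bins:
    print(b, dft_magnitude(y, b))
```

In this model the tones at $f_s/4$ and $f_s/2$ keep the same magnitude when the input frequency is changed, which is consistent with the statement that the offset-induced degradation is independent of the input signal.
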
3.1.2 Channel gain mismatch

Gain mismatches in a TI ADC mainly result from mismatches between gains of the buffers (or amplifiers) and gain errors of each sub-ADC in the TI channels. The gain of each channel can be modeled as shown in Figure 16. The largest error magnitude due to gain mismatch occurs at the peaks of the sinusoidal input signal, which is similar to amplitude modulation (AM) [25].

In the frequency domain, the undesired components due to gain mismatch errors in an M-channel TI-ADC fall at the following locations:

\[
f_{\text{gain}} = \pm f_{\text{in}} + \frac{k}{M} f_s, \quad k = 1, 2, \ldots, \quad (7)
\]

where $f_{in}$ is the input signal frequency. The SNR degradation due to gain mismatches is independent of the input frequency, but depends on the amplitude of the input signal. The required standard deviation of the channel gain ($\sigma_{Gain}$) in an M-channel N-bit TI ADC can be obtained with [88]

$$\sigma_{Gain}^2 \leq \left( \frac{M}{M - 1} \right) \left( \frac{2}{3 \cdot 2^{2N}} \right). \quad (8)$$
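
The spur locations of equation (7) and the amplitude dependence can be illustrated with a sketch similar in spirit to a spectral test bench. It assumes an otherwise ideal 4-channel TI ADC, a coherently sampled sine, and hypothetical channel gains; all names are illustrative.

```python
import cmath
import math


def dft_magnitude(x, k):
    """Magnitude of DFT bin k, normalized by the record length."""
    n = len(x)
    return abs(sum(x[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n))) / n


def gain_spur_bins(n_samples, signal_bin, m):
    """Bins of +/-f_in + k*fs/M, folded into the first Nyquist zone."""
    bins = set()
    for k in range(1, m):
        for sign in (+1, -1):
            b = (sign * signal_bin + k * n_samples // m) % n_samples
            bins.add(min(b, n_samples - b))
    bins.discard(signal_bin)
    return sorted(bins)


def ti_output_with_gains(n_samples, signal_bin, gains, amplitude):
    """Sine of the given amplitude, scaled by a per-channel gain."""
    m = len(gains)
    return [gains[i % m] * amplitude * math.sin(2 * math.pi * signal_bin * i / n_samples)
            for i in range(n_samples)]


gains = [1.01, 0.99, 1.005, 0.995]          # hypothetical channel gains
y = ti_output_with_gains(256, 19, gains, 1.0)
for b in gain_spur_bins(256, 19, 4):
    print(b, dft_magnitude(y, b))
```

Halving the input amplitude halves each spur in this model, while the bins at $k \cdot f_s/M$ (where offset spurs would appear) remain empty.
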

### 3.1.3 Channel timing mismatch (timing skews)

Timing mismatches between sampling clocks, also known as clock skews or timing skews, are systematic errors due to the small differences between the actual sampling clock edges compared to the ideal sampling moments in the TI channels. Figure 17 exemplifies the timing skews of sampling clocks for a 4-channel TI ADC. The main sources of timing skews are from device mismatches in the sampling clock generation circuitry, threshold voltage mismatch of the MOS switches [89] in each sample-and-hold, and the routing mismatches of the sampling clock signals on the chip [90].

Figure 18 visualizes the timing mismatches of sampling clocks in an M-channel TI ADC, where $\Delta t_i$ is the deviation of a sampling moment in channel “i” from the ideal value. In the time domain, the largest error occurs when the input signal has the highest slew rate (at the zero crossing for differential sinusoidal input), which is like phase modulation (PM) noise [25]. In the frequency domain, the undesired frequency components due to timing mismatch errors occur at:
\[ f_{\text{skew}} = \pm f_{\text{in}} + \frac{k}{M} f_s, \quad k = 1, 2, \ldots \]  

which is similar to the case of gain mismatch according to equation (7). Notably, the amplitudes of these frequency components increase with increasing input frequency. The SNR degradation due to timing mismatches depends on both the amplitude and the frequency of the input signal [25], [86]. SNR degrades when input frequency increases, which can be a severe issue in TI ADCs because they are mainly used for broadband applications. Calibration of timing skews in TI ADCs is more complicated than offset and gain mismatch calibration. Many timing-skew calibration techniques have been proposed in theory and have also been implemented on-chip or off-chip [23], [36], [85], [86], [91], [92]. The key design considerations related to the clock signal generation for TI ADCs will be discussed in Section 4.8.
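
The frequency dependence of the skew-induced error can be illustrated with a time-domain sketch. It assumes an otherwise ideal 4-channel TI ADC at 1 GS/s with hypothetical zero-mean skews of about 1 ps; the function name and the values are illustrative and do not describe the proposed design.

```python
import math


def skew_limited_snr_db(f_in, f_s, skews, n_samples=4096):
    """SNR of an otherwise ideal TI ADC whose channel i samples at n/f_s + skews[i]."""
    m = len(skews)
    sig_power = 0.0
    err_power = 0.0
    for n in range(n_samples):
        t_ideal = n / f_s
        ideal = math.sin(2 * math.pi * f_in * t_ideal)
        actual = math.sin(2 * math.pi * f_in * (t_ideal + skews[n % m]))
        sig_power += ideal ** 2
        err_power += (actual - ideal) ** 2
    return 10 * math.log10(sig_power / err_power)


skews = [1e-12, -1e-12, 0.5e-12, -0.5e-12]   # hypothetical skews in seconds
for f_in in (50e6, 150e6, 450e6):
    print(f_in, skew_limited_snr_db(f_in, 1e9, skews))
```

For small skews the error is approximately proportional to the signal slope, so the SNR in this model drops by roughly 20 dB per decade of input frequency.
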

![Figure 18. Modeling of clock skews amongst channels in a TI ADC.](image)

### 3.1.4 Channel bandwidth mismatch

Mismatches between the sampling bandwidths of TI channels cause SNR degradation [58]. Each sample-and-hold (S/H) can be approximately modeled with an RC circuit, functioning like a low-pass filter with a cutoff frequency (or bandwidth) of \( f_c = \frac{1}{2\pi \cdot R \cdot C} \), where R and C are the total resistance of the sampling path and total sampling capacitance respectively [58], [93]. As shown in Figure 19, there are differences between bandwidths of TI channels, which originate from several sources [85], [93]: First, RC mismatch coming from the MOS switch resistance and the sampling capacitance in each sample-and-hold. Second, the systematic RC mismatch between the input signal routing among the channels on the chip. Moreover, if a buffer amplifier is used in each S/H [60], the amplifier bandwidth mismatch will also contribute to the TI bandwidth mismatch [93].

Figure 19. Modeling of channel bandwidths in a TI ADC.

The analysis of channel bandwidth mismatches is usually performed by writing the transfer function of the sampling channel to evaluate the impact of bandwidth mismatch on both amplitude and phase [25], [93]. The bandwidth mismatch has nonlinear dependence on both input signal amplitude and frequency [87]. The bandwidth mismatch impact on SNR degradation is a combination of gain and phase mismatches, where for low input frequencies the impact of the phase errors is dominant [58].
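
The relative weight of the gain and phase errors can be illustrated with the first-order RC model mentioned above. The sketch below compares a nominal channel with one whose RC product deviates by 1%; the component values (50 Ω, 1 pF) are hypothetical and chosen only to give a bandwidth of a few GHz.

```python
import math


def rc_gain_phase(f, r, c):
    """Magnitude and phase (radians) of a first-order RC low-pass at frequency f."""
    f_c = 1.0 / (2 * math.pi * r * c)
    ratio = f / f_c
    return 1.0 / math.sqrt(1.0 + ratio ** 2), -math.atan(ratio)


def channel_mismatch(f, r, c, rel_rc_error):
    """Gain and phase differences between a nominal channel and one whose
    RC product deviates by rel_rc_error."""
    g0, p0 = rc_gain_phase(f, r, c)
    g1, p1 = rc_gain_phase(f, r * (1 + rel_rc_error), c)
    return abs(g1 - g0), abs(p1 - p0)


for f in (100e6, 250e6, 500e6):
    gain_err, phase_err = channel_mismatch(f, 50.0, 1e-12, 0.01)
    print(f, gain_err, phase_err)
```

For input frequencies well below the cutoff, the gain difference scales approximately with $(f/f_c)^2$ while the phase difference scales with $f/f_c$, so the phase term is the larger of the two in this model.
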

### 3.2 Sub-ADC Architectures in Time-Interleaved ADCs

A time-interleaved ADC requires several channels to achieve high conversion rates, which in turn increases the overall chip area and power consumption. Therefore, designing a low-power high-performance sub-ADC to be used in each TI channel is of significant importance to optimize the total power and area of a time-interleaved ADC. SAR ADCs are commonly used in TI ADC channels [4], [23], [32], [36], [94], [95]. However, depending on the application, other ADC architectures such as flash and pipelined can also be utilized to construct a TI ADC [92], [96]. In this section, we review some of the low-power architectures (including SAR and binary search ADCs) that can be employed in time-interleaved architectures.
3.2.1 Overview of SAR ADC architectures

SAR ADCs have become very popular over the last decade as the CMOS process technologies evolved because their performance significantly benefits from technology scaling. The quality and density of capacitors as well as the switching speed of transistors are improving, which helps to implement more efficient SAR ADCs with capacitive DACs. In this brief study, we categorize SAR ADCs by their DAC structures, conversion-speed enhancement techniques, and switching techniques for higher energy efficiency.

The most common DAC architecture in SAR ADCs is the binary-weighted capacitive DAC (CDAC). Its drawback is a large total capacitance of $2^N \cdot C_u$, where $N$ is the number of bits and $C_u$ is the unit capacitor value, which limits the sampling speed of the ADC and increases the required area on the chip as the resolution increases. On the other hand, this architecture has very good device matching characteristics, which results in high linearity. Another type of SAR ADC has a split-capacitor architecture (segmented DAC), which uses two capacitor banks connected by an attenuation (bridge) capacitor [97]. A SAR ADC with a C-2C ladder DAC is an alternative [5]. These two latter architectures have the advantage of reducing the total capacitance in comparison to the conventional binary-weighted CDAC. However, they are more sensitive to parasitic capacitances, which cause considerable non-linearity errors, and in general they require calibration because of such errors. Most of the state-of-the-art SAR ADCs contain binary-weighted CDACs because achieving higher sampling rates with small area is possible with these CDACs in modern short-channel technologies, which often necessitates the use of very small custom-designed capacitors ($\leq 1\text{fF}$) [4], [36].

Resistive DACs can be used in SAR ADCs. However, their problems are the static power consumption and the need for a separate S/H. Nonetheless, a few high speed state-of-the-art SAR ADCs contain resistive DACs [32] instead of capacitive DACs. In [98], a hybrid resistive-capacitive SAR architecture has been reported to save area on the chip because the total capacitance is significantly reduced. However, there is a tradeoff between area and linearity due to the higher mismatches of resistors in comparison to capacitors.
For an N-bit synchronous SAR ADC with a conversion rate of fs, an internal clock with frequency of (N+1)·fs is required. Therefore, the comparator must operate with such a high-speed clock. For every output bit, the comparator decision and DAC settling must be completed in one clock cycle. Thus, (N+1) clock cycles are required to perform one complete conversion, limiting the overall speed of synchronous SAR ADCs. Several techniques have been proposed to improve the speed of SAR ADCs. An asynchronous SAR algorithm has been introduced in [99], where the triggering of the internal comparisons from MSB to LSB occurs by a ripple-like procedure. Hence, the quantization time allocated to each bit is no longer limited by the slowest conversion bit, but by the average conversion time, leading to speed enhancement in comparison to synchronous architectures. Asynchronous architectures have been used frequently in recent designs ([33], [34], [83]) to shorten the overall conversion time. While an asynchronous technique helps to achieve higher speeds, it usually requires more complicated digital blocks to generate signals with unequal pulse widths. Converting more than one bit per cycle is another effective way of increasing the conversion speed of a SAR ADC. Several SAR or TI-SAR ADCs with 2 bits/cycle have been introduced such as [3], [30]–[32]. They can achieve higher sampling rates because they require a smaller number of cycles compared to a 1 bit/cycle SAR ADC. However, the disadvantages of multi-bit/cycle SAR ADCs are the larger number of comparators, and the more complex DAC structure. In addition, unlike in a 1 bit/cycle architecture, offset calibration is often required for the comparators in multi-bit/cycle SAR ADCs.
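
The timing benefit of asynchronous operation can be illustrated with a rough sketch. It assumes a simple latch regeneration model in which the decision time grows with the logarithm of the inverse comparator overdrive, plus a fixed DAC settling time per bit; this model, the clamp on the smallest overdrive, and the values of `tau` and `t_dac` are assumptions made for illustration and are not taken from [99].

```python
import math


def comparison_times(v_in, v_fs, n_bits, tau, t_dac):
    """Per-bit time (comparator regeneration + DAC settling) of a binary search."""
    times = []
    code = 0
    min_overdrive = v_fs / 2 ** (n_bits + 2)    # assumed smallest resolvable overdrive
    for bit in range(n_bits - 1, -1, -1):
        trial = code | (1 << bit)
        v_dac = trial * v_fs / 2 ** n_bits
        overdrive = max(abs(v_in - v_dac), min_overdrive)
        times.append(tau * math.log(v_fs / overdrive) + t_dac)
        if v_in >= v_dac:
            code = trial
    return times


def conversion_times(v_in, v_fs, n_bits, tau, t_dac):
    """Returns (t_sync, t_async) for the N comparison cycles of one conversion."""
    times = comparison_times(v_in, v_fs, n_bits, tau, t_dac)
    worst_bit = tau * math.log(2 ** (n_bits + 2)) + t_dac
    t_sync = n_bits * worst_bit     # every bit is given the worst-case slot
    t_async = sum(times)            # each bit takes only the time it needs
    return t_sync, t_async


print(conversion_times(0.7, 1.0, 8, 10e-12, 20e-12))
```

Since at most a few comparisons in a binary search see a small overdrive, the summed asynchronous time in this model is shorter than N worst-case slots; the sampling cycle is the same in both cases and is omitted.
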

Several techniques have been reported to increase the energy efficiency of SAR ADCs, especially through reducing the power consumed by switching operations in the CDAC. The switching scheme determines the DAC size and energy efficiency in a SAR ADC. Capacitive SAR ADCs operate based on one of the following two concepts: charge redistribution and charge sharing. In charge redistribution architectures, the total DAC capacitance is fixed and the DAC output is set by changing the voltage on the bottom plate of the capacitors [36]. Most of the standard capacitive SAR ADCs operate based on charge redistribution. Moreover, there is no attenuation of the sampled input voltage (when neglecting the impacts of parasitic capacitances). Monotonic capacitor switching [59] is an example of efficient charge redistribution that requires one cycle less than conventional capacitive SAR ADCs. Furthermore, the total capacitance is reduced to half of its counterpart in a typical SAR ADC. Monotonic switching reduces power consumption and increases the sampling rate. However, the variation of the input common-mode voltage in monotonic switching causes signal-dependent offsets that degrade the linearity of the ADC. In charge sharing architectures, the DAC output is varied by connecting pre-charged capacitors to the DAC nodes. Therefore, the total capacitance of the DAC increases during the conversion [100]. A charge sharing switching scheme has better energy efficiency than the conventional charge redistribution switching technique. Nevertheless, the input voltage will be attenuated at the output of the DAC due to the increment of the total capacitance after sampling. In addition, unlike the charge redistribution architectures, the charge sharing approach necessitates an explicit S/H.

### 3.2.2 Comparator-based asynchronous binary search (CABS) ADC

The comparator-based asynchronous binary search (CABS) ADC [33] can be regarded as an architecture in-between the flash and SAR ADCs, having characteristics that resemble both types. Unlike a SAR ADC, which employs one comparator (for 1-bit/cycle) and varying reference levels for each cycle, a CABS ADC consists of a comparator tree. For N-bit resolution, it requires $2^N-1$ comparators (like a flash ADC), but only N comparators are activated during a complete conversion. The first comparator is triggered by a clock, while the others are triggered asynchronously by the output of a previous comparator. The CABS architecture combines the advantages of both flash and SAR ADCs to realize high speed operation with low power consumption. However, due to its large number of comparators, it has a high input capacitance and occupies a relatively large chip area. A version of the CABS ADC with fewer comparators is presented in [34], where the total number of comparators has been reduced to $2\cdot N-1$. However, the required time for the reference settling and the operation of additional digital gates for each comparison can limit the speed of this reduced architecture.
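
The tree traversal can be sketched with a short behavioral model. It assumes ideal comparators with fixed, uniformly spaced thresholds over a unipolar range from 0 to `v_fs` and stores the tree with heap-style indices; the indexing scheme and the function names are choices made for this illustration and do not describe how [33] implements the thresholds or the triggering.

```python
def build_cabs_thresholds(n_bits, v_fs):
    """Fixed thresholds of the 2**N - 1 comparators, heap-indexed from 1."""
    thresholds = {}
    for level in range(n_bits):
        for pos in range(2 ** level):
            index = 2 ** level + pos
            thresholds[index] = (2 * pos + 1) * v_fs / 2 ** (level + 1)
    return thresholds


def cabs_convert(v_in, thresholds, n_bits):
    """Walk the comparator tree; returns (code, activated comparator indices)."""
    index = 1                              # first comparator, triggered by the clock
    activated = []
    for _ in range(n_bits):
        activated.append(index)
        decision = 1 if v_in >= thresholds[index] else 0
        index = 2 * index + decision       # output triggers one comparator of the next level
    return index - 2 ** n_bits, activated


thresholds = build_cabs_thresholds(5, 1.0)
print(len(thresholds))                     # 31
print(cabs_convert(0.7, thresholds, 5))
```

For 5-bit resolution the model contains 31 comparators, of which exactly five are visited per conversion, and no reference level has to change between decisions.
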

### 3.3 Power-Efficient High-Speed Medium-Resolution ADCs

Flash, folding, and interpolating ADC architectures are inherently suitable for high-speed analog-to-digital conversion. However, for medium resolutions and above, their power efficiency degrades significantly due to the large number of active comparators [9], [20], [21], [101], [102]. For pipelined ADCs, high performance design becomes more difficult as scaled CMOS supply voltages continue to decrease due to the stringent gain and bandwidth requirements for the opamps. In addition, the high power consumption of high-performance opamps limits the minimum power of pipelined ADCs. However, high-speed single-channel pipelined ADCs have recently been reported with considerably low power consumption thanks to techniques such as calibration [103] and incomplete settling [7], which relax the required opamp specifications. Moreover, time-interleaved pipelined ADC design is another method to achieve high-speed low-power performance as in [96], which also uses opamp-sharing for further energy savings.

Technology scaling with reduced supply voltages in digital CMOS processes favors ADC topologies that have only a few analog elements, such as SAR ADCs. In deep submicron technologies, SAR ADCs can achieve faster sampling rates, lower power consumptions, and smaller layout areas. Their sampling rate is limited by the need for a high-speed internal clock for the SAR logic and by the settling time of the DAC in every cycle. However, with time-interleaving SAR ADCs, high-speed and low-power performance is achievable [3]–[5], [31], [85]. Most of the SAR ADCs reviewed in Section 3.2.1 can be used in a TI architecture. The conversion speed of the SAR ADC determines the number of required TI channels. In addition, there are tradeoffs between the number of channels, the total power consumption and the input bandwidth of the TI ADC. The ADCs in [3], [31] utilize multi-bit per cycle SAR ADCs in TI-SAR ADCs for high-speed low-power performance. The higher conversion speed of multi-bit/cycle SAR ADCs helps to reduce the number of time-interleaved channels. However, when fabricated in CMOS technologies with relatively long channel length [3], they are not as power efficient as they are in short-channel CMOS technologies [30], [31].

Hybrid ADC architectures benefit from the combination of several ADCs to achieve high-speed and power-efficient operation. Using subranging or two-step architectures can facilitate the reduction of power consumption in ADCs [1], [35]. The ADCs reported in [10] and [104] have similar two-step and subranging architectures. They combine two-step, time-interleaved, and flash ADC architectures together. The MSBs are resolved by one flash ADC in the first stage, while the conversions for LSBs are performed by two-channel time-interleaved flash ADCs in the second stage. In addition, an MDAC is used to amplify the residue voltage because both (coarse and fine) ADCs operate with full-scale range. Using more power-efficient ADC architectures inside a hybrid ADC can lead to additional power saving for higher resolutions. For example, a subranging flash-SAR ADC has been described in [29], which resolves the MSBs with a flash ADC that controls the MSB capacitors of the CDAC in the SAR ADC of the second stage to resolve the remaining LSBs. Some low-power hybrid ADC architectures have recently been introduced, combining methods of subranging and time-interleaving with flash and SAR stages [36]–[38]. A subranging TI ADC has been reported in [36], which uses a front-end flash ADC to resolve the MSBs at the full conversion rate of 1GS/s. The fine stage consists of eight time-interleaved 10-bit SAR ADCs, where their MSB capacitors in the CDAC are controlled by the flash ADC outputs. Redundancy in the flash ADC and the CDAC is used to relax the offset constraints of the flash ADC. In [37], two flash-TI-SAR ADCs are time-interleaved to lower the sampling rate of the front-end flash in order to save more power compared to [36]. It is worthwhile to mention that for these architectures the mismatches between the two subranging stages must be reconciled through calibration or redundancy techniques to avoid linearity problems. Due to their high power-efficiency with high speeds and medium resolutions, hybrid ADCs are viable alternatives to conventional TI-SAR ADCs.
3.4 Proposed Hybrid ADC Architecture

This research advances the concept of a subranging time-interleaved architecture by realizing a hybrid flash-TI ADC with four time-interleaved CABS ADCs. To the best of the author’s knowledge, this architecture is the first to use a CABS ADC in a time-interleaved hybrid ADC configuration to take advantage of its power efficiency at relatively high speeds compared to conventional SAR ADCs. A flash ADC resolves the most significant bits (MSBs), whereas a time-interleaved ADC resolves the least significant bits (LSBs). The fast MSB conversion by the flash ADC together with subranging helps to reduce the number of interleaved ADCs, which results in higher input bandwidth. A merged sample-and-hold and capacitive digital-to-analog converter (SHDAC) performs sampling as well as residue generation for the subranging operation. The systematic and random offsets of the flash ADC comparators are calibrated using a foreground calibration technique.

A single-ended illustration of the proposed ADC architecture is displayed in Figure 20. The architecture is a subranging ADC comprised of a 3-bit 1 GS/s flash ADC in the first stage and four time-interleaved 250MS/s 5-bit CABS ADCs in the second stage. The hybrid ADC has a fully differential architecture to increase the dynamic range and to suppress common-mode distortion and noise. It does not require an extra front-end sample-and-hold because the SHDAC in each channel both samples the input for the flash ADC and performs the residue generation for the CABS ADC. In particular, since each SHDAC is shared by the flash ADC and the CABS ADC, it is ensured that an identical sampled voltage is processed by both stages. In comparison to hybrid ADC architectures in which the signal is sampled separately for the MSB stage and the LSB stage [37], this architecture is more immune to the sampling errors that originate from clock skews and resistor-capacitor (RC) mismatches between the two stages. However, it requires additional bootstrap switches to connect the flash ADC to the proper SHDAC during each sampling cycle.
In comparison to conventional asynchronous SAR ADCs ([30], [36]), a CABS ADC can support higher speeds since it does not depend on settling delays associated with switched capacitors or changing reference voltages for each comparator decision. Due to the asynchronous operation, achieving 5-bit resolution at a relatively high speed of 250MS/s is more feasible with a CABS ADC than with a synchronous SAR ADC. However, due to the large number of comparators, the CABS ADC has an input capacitance that is comparable to the total sampling capacitance in the SHDAC. Thus, the loading effect can change the output of the capacitive network, causing errors in the CABS ADC decisions. To alleviate this issue, a unity-gain voltage buffer is placed between the SHDAC and the CABS ADC (Figure 20). The buffer also isolates the SHDAC from the kickback noise of the CABS ADC. To calibrate the flash ADC’s offsets, one extra sampling channel is present, which is only activated when the ADC is in the calibration mode. The flash ADC’s systematic and mismatch offsets are calibrated with a foreground calibration that imitates the sampling conditions in the main channels. Since the extra calibration channel is disconnected during normal operation, it does not significantly affect the input bandwidth when switches with small parasitic capacitances are employed.

The conversion phases of the hybrid ADC are visualized in Figure 21. A complete analog-to-digital conversion in each channel consists of four phases: (1) the SHDAC samples the input signal, (2) the flash ADC resolves the MSBs, (3) the SHDAC performs the residue generation, and (4) the CABS ADC resolves the LSBs. Figure 22 shows the clock diagram for one channel. Each channel has the same timing scheme with a delay of 1ns compared to its neighboring channel. The clock phases are generated from a 1GHz master clock. At the beginning of a conversion, the SHDAC of channel i (clocked by \( \text{CLK}_{\text{SAMP},i} \)) samples the input signal, and the switch between the SHDAC and flash ADC (controlled by \( \text{CLK}_{\text{SAMPX},i} \)) is closed while the ones in the other channels are open. The clock signals \( \text{CLK}_{\text{SAMPX},1} \) through \( \text{CLK}_{\text{SAMPX},4} \) must be non-overlapping to avoid changing the charge that is held on the adjacent SHDACs. After the sampling phase is complete, the flash ADC resolves the first three MSBs. Based on the latched thermometer output of the flash ADC at the beginning of the third phase, the SHDAC generates the residue voltage that passes through the unity-gain buffer. Finally, the buffered residue voltage is delivered to the CABS ADC, which operates with one-eighth of the full-scale range.
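
The four phases can be illustrated with a purely behavioral sketch of an ideal 3+5 bit subranging conversion. It is not the circuit implementation: the function name, the normalization of the differential input to [-V_R, V_R), and the floor-based quantizers are assumptions made for illustration, and all non-idealities (offsets, parasitics, buffer gain error) are ignored.

```python
import math

def subranging_convert(v_in, v_r=1.0, flash_bits=3, fine_bits=5):
    """Ideal subranging conversion of a differential input in [-v_r, v_r).

    Returns (coarse code, shift coefficient k, residue, combined output code).
    """
    n_regions = 2 ** flash_bits                # 8 MSB regions
    region_width = 2.0 * v_r / n_regions       # each region spans V_R/4
    # Phase 2: the flash stage selects the MSB region (clipped to the valid range).
    coarse = min(max(math.floor((v_in + v_r) / region_width), 0), n_regions - 1)
    # Phase 3: shift by k*V_R so that the residue is centered on zero;
    # k runs from +7/8 (lowest region) to -7/8 (highest region).
    k = (n_regions - 1 - 2 * coarse) / n_regions
    residue = v_in + k * v_r
    # Phase 4: the fine stage quantizes the residue over one-eighth of full scale.
    n_fine = 2 ** fine_bits
    fine_lsb = region_width / n_fine
    fine = min(max(math.floor((residue + region_width / 2) / fine_lsb), 0), n_fine - 1)
    return coarse, k, residue, coarse * n_fine + fine

if __name__ == "__main__":
    for v in (-0.99, -0.30, 0.10, 0.99):
        print(v, subranging_convert(v))
```

In this idealized model the combined code equals that of a single 8-bit quantizer, and the residue always stays within one-eighth of the full-scale range, which is the property the second stage relies on.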
Adding redundancy is a way to relax the offset requirement in subranging ADCs. However, it requires extra comparators. In the flash ADC, for example, this would increase the power consumption, input parasitic capacitance, kickback, and chip area. Similarly, adding redundancy to the second stage becomes challenging when the speed/power tradeoff is of foremost importance, especially due to the increased loading from the second stage (with a CABS ADC) or the increased decision time (with a conventional SAR ADC). Furthermore, driving the extra comparators would increase the power consumption in the clock generation circuitry. A foreground offset calibration method was therefore developed in this work to achieve low-power operation at the cost of area overhead. Since the calibration is performed offline, it does not increase the power consumption during normal operation.

3.4.1 Architectural power and area tradeoffs for the resolutions of the coarse and fine ADCs

When designing the proposed hybrid ADC architecture for a particular application, the decision concerning the number of bits for the flash ADC and the CABS ADC should be made under consideration of power and area impacts. The analytical formulas and quantitative comparisons in terms of power efficiency and total area are provided in Table 1 to compare different implementation options for the 8-bit subranging architecture.

In Table 1, $P_{\text{flash,comp}} = 143\, \mu \text{W}$ and $A_{\text{flash,comp}} = 448\, \mu \text{m}^2$ are the power consumption and area of each comparator in the flash ADC, and $P_{\text{CABS,comp}} = 48\, \mu \text{W}$ and $A_{\text{CABS,comp}} = 504\, \mu \text{m}^2$ are the power consumption and area of each comparator in the CABS ADC, respectively. L designates the flash ADC resolution, and M designates the CABS ADC resolution. According to Table 1, there is a tradeoff between power efficiency and area, as also visualized through the plots in Figure 23. The $L-M = 2-6$ architecture has the largest area and the $L-M = 5-3$ architecture has the highest power consumption, making them the least suitable options. Minimizing the power consumption was the main priority of this work, which is why the 3-5 configuration was chosen over the 4-4 configuration. In addition to power and area considerations, selecting a 3-bit instead of a 4-bit flash ADC leads to approximately half the amount of kickback noise and input capacitance.

Table 1. Comparison of various options for the first and second stage resolutions in the proposed ADC architecture

<table>
<thead>
<tr>
<th>Subranging choice*</th>
<th>No. of comp. in flash ADC</th>
<th>No. of comp. in CABS ADC</th>
<th>No. of activated comparators during each 8-bit conversion</th>
<th>Est. power of the flash and CABS for each 8-bit conversion (mW)</th>
<th>Est. total minimum area for one flash and four CABS ADCs (mm$^2$)</th>
</tr>
</thead>
<tbody>
<tr>
<td>$L-M$</td>
<td>$2^L - 1$</td>
<td>$2^M - 1$</td>
<td>$(2^L - 1) + M$</td>
<td>$(2^L - 1) \cdot P_{\text{flash,comp}} + M \cdot P_{\text{CABS,comp}}$</td>
<td>$(2^L - 1) \cdot A_{\text{flash,comp}} + 4 \cdot (2^M - 1) \cdot A_{\text{CABS,comp}}$</td>
</tr>
<tr>
<td>2-6</td>
<td>3</td>
<td>63</td>
<td>3 + 6</td>
<td>0.717</td>
<td>0.128</td>
</tr>
<tr>
<td>3-5</td>
<td>7</td>
<td>31</td>
<td>7 + 5</td>
<td>1.241</td>
<td>0.066</td>
</tr>
<tr>
<td>4-4</td>
<td>15</td>
<td>15</td>
<td>15 + 4</td>
<td>2.337</td>
<td>0.037</td>
</tr>
<tr>
<td>5-3</td>
<td>31</td>
<td>7</td>
<td>31 + 3</td>
<td>4.577</td>
<td>0.028</td>
</tr>
</tbody>
</table>

* $L = \text{flash ADC resolution, } M = \text{CABS ADC resolution}$

Figure 23. Estimated (a) power and (b) area for this hybrid ADC architecture with $L-M$ bits, where $L = \text{flash ADC resolution and } M = \text{CABS ADC resolution.}$
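
The entries of Table 1 follow directly from the formulas in its first row and the per-comparator values quoted above. The short script below is a sketch that re-evaluates them; the function and constant names are illustrative choices, not part of the original design flow.

```python
# Per-comparator power (uW) and area (um^2) as quoted in the text for Table 1.
P_FLASH_COMP_UW = 143.0
P_CABS_COMP_UW = 48.0
A_FLASH_COMP_UM2 = 448.0
A_CABS_COMP_UM2 = 504.0
N_CHANNELS = 4  # four time-interleaved CABS ADCs

def estimate(l_bits, m_bits):
    """Return (power in mW, area in mm^2) for an L-M split."""
    n_flash = 2 ** l_bits - 1
    n_cabs = 2 ** m_bits - 1
    # Every flash comparator fires, but only M CABS comparators are activated
    # during one conversion.
    power_mw = (n_flash * P_FLASH_COMP_UW + m_bits * P_CABS_COMP_UW) / 1e3
    # Area counts one flash ADC plus all comparators of the four CABS ADCs.
    area_mm2 = (n_flash * A_FLASH_COMP_UM2
                + N_CHANNELS * n_cabs * A_CABS_COMP_UM2) / 1e6
    return power_mw, area_mm2

if __name__ == "__main__":
    for l_bits, m_bits in [(2, 6), (3, 5), (4, 4), (5, 3)]:
        power, area = estimate(l_bits, m_bits)
        print(f"{l_bits}-{m_bits}: {power:.3f} mW, {area:.3f} mm^2")
```

The estimate counts comparators only, so it is suited to ranking the options against each other rather than predicting absolute power or area.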
4. Hybrid ADC Design Considerations and Circuit-Level Implementation

In this chapter, the design approach and circuit-level innovations for the proposed hybrid ADC architecture are described. In addition, post-layout simulation results are summarized for a comparison to state-of-the-art ADCs.

4.1 Merged Sample-and-Hold and Digital-to-Analog Converter (SHDAC) Circuit

Figure 24 shows the SHDAC transfer function during the residue generation. $V_{CM}$ is the common-mode voltage, $V_{RP}$ and $V_{RN}$ are the positive and negative reference levels of the flash ADC, $V_I$ is the differential input and $V_X$ is the differential SHDAC output. The reference levels of the CABS ADC are $V_{RP,CABS} = V_{CM} + (V_{RP} - V_{RN})/16$ and $V_{RN,CABS} = V_{CM} - (V_{RP} - V_{RN})/16$. Considering the number of required states to shift the sampled voltage (Figure 24) for residue generation, a simplified thermometer SHDAC was designed to implement the SHDAC with three reference levels ($V_{CM}$, $V_{RP}$ and $V_{RN}$) as shown in Figure 25. Due to the reduced number of switches and states, one can refer to it as a semi-thermometer SHDAC. During the sampling phase (CLK$_{SAMP}$ high), the input switches (at $V_{IP}$ and $V_{IN}$ in Figure 25) and $S_0$ through $S_3$ are closed, and the differential input signal is sampled onto the capacitor bank. In the residue generation phase, the control switches $S_0$-$S_3$, $CP_0$-$CP_3$ and $CN_0$-$CN_3$ connect the capacitor bottom plates to one of the reference voltages ($V_{RP}$, $V_{RN}$ and $V_{CM}$) depending on the thermometer code of the flash ADC.

![Figure 24. SHDAC transfer function during sampling and residue generation.](image-url)
Figure 25. Fully differential schematic of the semi-thermometer-coded SHDAC.

Table 2 lists all the configurations of the switches in the SHDAC during the shifting operation with the associated output voltages, where $V_R = V_{RP} - V_{RN}$ is the full-scale differential reference of the flash ADC. The output voltages in the table are calculated assuming ideal capacitance matching and no parasitics.

Table 2. SHDAC switch control truth table for the residue generation and the resulting analog output voltage

<table>
<thead>
<tr>
<th>Flash Binary Output</th>
<th>$V_{CM}$ Switches</th>
<th>$V_{RP}$ and $V_{RN}$ Switches</th>
<th>Differential Output Voltage $(V_X)$</th>
</tr>
</thead>
<tbody>
<tr>
<td>B3 B2 B1</td>
<td>S3 S2 S1 S0</td>
<td>CP3 CN3 CP2 CN2 CP1 CN1 CP0 CN0</td>
<td></td>
</tr>
<tr>
<td colspan="3">0 0 0 0 0 0 0 0 1 0 1 0 1 0 1 0</td>
<td>$V_I + (7/8)V_R$</td>
</tr>
<tr>
<td colspan="3">0 0 1 1 0 0 0 0 0 0 1 0 1 0 1 0</td>
<td>$V_I + (5/8)V_R$</td>
</tr>
<tr>
<td colspan="3">0 1 0 1 1 0 0 0 0 0 0 1 0 1 0 0</td>
<td>$V_I + (3/8)V_R$</td>
</tr>
<tr>
<td colspan="3">0 1 1 1 1 1 0 0 0 0 0 0 1 0 1 0</td>
<td>$V_I + (1/8)V_R$</td>
</tr>
<tr>
<td colspan="3">1 0 0 1 1 1 0 0 0 0 0 0 0 0 0 1</td>
<td>$V_I - (1/8)V_R$</td>
</tr>
<tr>
<td colspan="3">1 0 1 1 1 0 0 0 0 0 1 0 1 0 1 1</td>
<td>$V_I - (3/8)V_R$</td>
</tr>
<tr>
<td colspan="3">1 1 0 1 0 0 0 0 0 0 1 0 1 0 1 1</td>
<td>$V_I - (5/8)V_R$</td>
</tr>
<tr>
<td colspan="3">1 1 1 0 0 0 0 0 0 1 0 1 0 1 0 1</td>
<td>$V_I - (7/8)V_R$</td>
</tr>
</table>

* During the sampling phase, all $S_i$ switches are turned on and all $CP_i$ and $CN_i$ switches are turned off.

Each flash comparator’s output is latched by a D flip-flop (DFF) as shown in Figure 26. Next, the thermometer code is converted to a binary code that is held until the end of the CABS ADC conversion. The SHDAC bottom plate switches are configured by a digital control block as a function of the flash thermometer code. As opposed to the binary-weighted SHDAC in [38], this architecture eliminates the propagation delay of the thermometer-to-binary encoder in the control signal path, which relaxes the propagation delay requirement of the flash comparators by approximately 80ps. It also simplifies the residue generation control logic.

![Diagram of SHDAC controller and flash ADC thermometer-to-binary encoder](image)

Figure 26. SHDAC controller and flash ADC thermometer-to-binary encoder.

Minimizing the unit capacitance ($C_u$) is of foremost importance to minimize the settling time within the SHDAC (i.e., to avoid limiting the input bandwidth of the SHDAC). However, the minimum acceptable $C_u$ value is set by thermal noise and technology-dependent requirements for layout matching. In addition, the value of $C_u$ has to be sufficiently large to tolerate the impacts of parasitic capacitances and kickback noise from the flash ADC, which were assessed through simulations to select the appropriate $C_u$ value. A $C_u$ of 14.8fF was found to be optimal for this example design, and $C_u$ was implemented with a 2-layer metal-oxide-metal (MOM) capacitor.
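
As a rough plausibility check of the thermal noise constraint, the sketch below evaluates the textbook kT/C sampling noise for the eight unit capacitors of one single-ended branch. Several inputs are assumptions rather than values from this design: a temperature of 300 K, a 1 V differential full-scale range for the 8-bit LSB, and the neglect of parasitic capacitances and of the doubled noise power in differential operation.

```python
import math

BOLTZMANN = 1.380649e-23  # J/K

def ktc_noise_rms(c_total_farads, temperature_k=300.0):
    """RMS sampling noise sqrt(kT/C) on a capacitance (textbook estimate)."""
    return math.sqrt(BOLTZMANN * temperature_k / c_total_farads)

C_U = 14.8e-15                       # unit capacitance quoted in the text
c_sample = 8 * C_U                   # eight unit capacitors per branch
v_noise = ktc_noise_rms(c_sample)    # roughly 0.19 mV rms

# Assumed 8-bit LSB for a 1 V differential full scale (about 3.9 mV).
lsb = 1.0 / 2 ** 8
v_quant = lsb / math.sqrt(12)        # ideal quantization noise, rms

if __name__ == "__main__":
    print(f"kT/C noise: {v_noise * 1e6:.0f} uV rms")
    print(f"quantization noise: {v_quant * 1e6:.0f} uV rms")
```

Under these assumptions the kT/C contribution is well below the quantization noise, which would be consistent with the statement that parasitics, kickback, and matching also enter the choice of $C_u$.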

As illustrated in Figure 25, the input signal is differentially sampled using bootstrap switches (controlled by CLK$_{SAMP}$) connected to the SHDAC capacitors’ top plates. The bootstrap switch, similar to that of [75], maintains the gate-source voltage of the sampling NMOS switch approximately equal to $V_{DD}$, which ensures a low on-resistance that is independent of the input signal amplitude. Signal-dependent charge injection is reduced by using bootstrap switches and differential sampling.

The switches that connect $V_{RP}$ and $V_{RN}$ to the bottom plates of the SHDAC capacitors (Figure 25) are PMOS and NMOS transistors, respectively. The switches connected to $V_{CM}$ must have a lower on-resistance than the ones at the other reference voltages because they add series resistance in the sampling path, thereby creating RC delay. Hence, their width/length ratios should be higher. In the available 130nm technology, PMOS transistors exhibit four times the on-resistance of NMOS transistors with the same dimensions when conducting the common-mode voltage (600mV). Therefore, instead of using transmission gates, only NMOS transistors were used for the $V_{CM}$ switches to save area and reduce the parasitic capacitance.

### 4.2 Analysis and Correction of the Parasitic Capacitances’ Impacts on the Residue Voltage

In this section, we analyze the adverse effects of parasitic capacitances at the output of the SHDAC after residue generation, and present a technique to alleviate these effects. For simplicity, we use the single-ended model shown in Figure 27, where $C_{pbs}$, $C_{ps1}$, $C_{pf}$, $C_{pbuf}$, and $C_{pr}$ respectively represent the parasitic capacitances at node $V_X$ from the sampling bootstrap switch ($S_{bs}$), from the switch between the flash and the SHDAC ($S_1$), from the input of the flash ADC, from the input of the voltage buffer, and from routing.

![Figure 27. Model for analysis with parasitic capacitances at the output of the SHDAC in one channel (single-ended equivalent).](image)

Since the clock signals controlling the switches between the flash and the SHDAC (CLK$_{SAMPX}$) are non-overlapping (Figure 22), the switch $S_1$ opens slightly earlier than the start of the residue generation phase. Furthermore, the reconfiguration of the SHDAC switches starts with a delay due to the propagation delay of the control logic. Therefore, $S_1$ is considered as an open switch in the residue generation phase and $C_{pf}$ is excluded from the analysis. The total parasitic capacitance \( C_{pX} \) at the output of the SHDAC (node \( V_X \)) is equal to

\[
C_{pX} = C_{pbs} + C_{ps1} + C_{pbuf} + C_{pr} .
\] (10)

Since the values of the parasitic (junction) capacitances are dependent on bias conditions [105], they vary for different input voltage amplitudes. Therefore, we refer to the total parasitic capacitance at node \( X \) before and after the residue generation moment as \( C_{pX1} \) and \( C_{pX2} \), respectively. With these notations, the total charge at node \( X \) before residue generation is

\[
Q_{X1} = \left( 8C_u + C_{pX1} \right) \cdot V_{X1} ,
\] (11)

and after residue generation the total charge is

\[
Q_{X2} = 8k \cdot C_u \left( V_{X2} - V_R \right) + 8 \left( 1 - k \right) C_u \cdot V_{X2} + C_{pX2} \cdot V_{X2} ,
\] (12)

where \( V_{X1} \) is the differential sampled input voltage (\( \cong V_i \)), \( V_{X2} \) is the differential residue voltage after the residue generation, and \( V_R \) is the differential reference voltage. The shifting coefficient \( k \) can take the values of \( \pm 7/8, \pm 5/8, \pm 3/8 \) or \( \pm 1/8 \) based on the flash ADC output (see Table 2). According to the charge conservation principle, \( Q_{X1} \) is equal to \( Q_{X2} \), and \( V_{X2} \) can be expressed as

\[
V_{X2} = \left[ 1 + \frac{C_{pX1} - C_{pX2}}{8C_u + C_{pX2}} \right] V_{X1} + k \cdot \frac{8C_u}{8C_u + C_{pX2}} V_R .
\] (13)

To separate the error voltage from the ideal voltage, we rewrite equation (13) as

\[
V_{X2} = V_{X1} + \alpha \cdot k \cdot V_R + \beta \cdot V_{X1} ,
\] (14)

where

\[
\alpha = \frac{8C_u}{8C_u + C_{pX2}} \quad \text{and} \quad \beta = \frac{C_{pX1} - C_{pX2}}{8C_u + C_{pX2}} .
\]
An ideal operation is performed when $\alpha = 1$ and $\beta = 0$, which yields the expressions in Table 2. One way to reduce the effects of parasitic capacitances is to use a large unit capacitor ($C_u$), but this approach increases the settling time of the sampling, which makes it impractical for such high-speed operation (1GS/s). The ratio $\alpha$ is always less than 1 because of $C_{pX2}$, which attenuates the shift of the sampled voltage during the residue generation. On the other hand, $\beta \neq 0$ because $C_{pX1}$ and $C_{pX2}$ are not equal. This causes the transfer function slope to deviate inside each MSB region because the relevant error depends on the input amplitude. The applied voltage associated with $C_{pX1}$ varies widely (full-swing) in comparison to $C_{pX2}$ which only sees one eighth of full-swing (Figure 24). Therefore, it is essential to minimize the total parasitic capacitances at node X as much as possible to minimize its input-dependent variation, which reduces the $\beta$ term of the error in equation (14). Since the scaled input voltage range after shifting is significantly lower than the ADC full-scale range, the $C_{pX2}$ variation is limited. As a consequence, the attenuation factor $\alpha$ is relatively constant throughout all regions. Hence, our design approach involves compensation of this non-ideality by applying a constant scaling factor of $g = \alpha^{-1}$ to the differential SHDAC references. The $\alpha$ coefficient can be canceled when using a value of $g \cdot V_R$ instead of $V_R$ for the differential SHDAC reference voltage in equation (14), leading to

$$V_{X2} = V_{X1} + k \cdot V_R + \beta \cdot V_{X1}. \quad (15)$$

The value of $g$ was selected based on transistor-level simulations. For the hybrid ADC design whose simulation results are summarized in Section 4.11, the values of $g = 1.425$ for schematic-level simulation and $g = 1.28$ for post-layout simulation cancel the majority of the shifting error in all MSB regions.
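
The algebra from equations (11) and (12) to equations (14) and (15) can be checked numerically. In the sketch below, only $C_u$ = 14.8fF is taken from the text; the parasitic capacitances, the sampled voltage, and the normalized reference are hypothetical values chosen for illustration, and the function names are not from the original work.

```python
def residue_from_charge(v_x1, k, v_r, c_u, c_px1, c_px2):
    """Solve Q_X1 = Q_X2 (equations (11) and (12)) for V_X2."""
    q_x1 = (8 * c_u + c_px1) * v_x1
    return (q_x1 + 8 * k * c_u * v_r) / (8 * c_u + c_px2)

def alpha_beta(c_u, c_px1, c_px2):
    """Attenuation and slope-error coefficients defined below equation (14)."""
    alpha = 8 * c_u / (8 * c_u + c_px2)
    beta = (c_px1 - c_px2) / (8 * c_u + c_px2)
    return alpha, beta

C_U = 14.8e-15                   # unit capacitance from the text
C_PX1, C_PX2 = 35e-15, 33e-15    # hypothetical parasitics before/after shifting
V_X1, K, V_R = 0.4, -3 / 8, 1.0  # hypothetical sample, shift coefficient, reference

alpha, beta = alpha_beta(C_U, C_PX1, C_PX2)
g = 1.0 / alpha                  # reference scaling factor

direct = residue_from_charge(V_X1, K, V_R, C_U, C_PX1, C_PX2)
via_eq14 = V_X1 + alpha * K * V_R + beta * V_X1

# Scaling the reference by g removes the alpha attenuation (equation (15)).
corrected = residue_from_charge(V_X1, K, g * V_R, C_U, C_PX1, C_PX2)
via_eq15 = V_X1 + K * V_R + beta * V_X1

if __name__ == "__main__":
    print(f"alpha = {alpha:.4f}, beta = {beta:.4f}, g = {g:.4f}")
    print(f"residue without scaling: {direct:.6f} V")
    print(f"residue with scaling:    {corrected:.6f} V")
```

The remaining difference between the corrected residue and the ideal value $V_{X1} + k \cdot V_R$ is the $\beta \cdot V_{X1}$ term, which the reference scaling does not address.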

As can be observed from equation (15), the term $\beta \cdot V_{X1}$ is not nullified even with scaled reference voltage levels, causing a small error after shifting as visualized in Figure 28(a). Note that the worst-case shifting errors due to $\beta$ occur at the peak amplitude levels of the input signal. To suppress the $\beta$ error, the $C_{pX1}$ and $C_{pX2}$ values must be as close as possible. To this end, the $S_1$ switch at the input of the flash ADC (in Figure 27) is implemented with a bootstrap configuration. The bootstrap NMOS switch transistor (2μm/130nm) has a significantly smaller size and parasitic capacitance ($C_{ps1}$) than a CMOS transmission gate with the same on-resistance (4μm/130nm NMOS in parallel with 16μm/130nm PMOS). In addition, $C_{pbuf}$ was minimized by avoiding a large input pair in the voltage buffer. Thus, the $C_{pX}$ value as well as its variation can be kept small, leading to a negligible $\beta$ error. In this design, the maximum error caused by $\beta \cdot V_{X1}$ is less than LSB/4 in the worst case. The conceptual representations in Figure 28(b) of the SHDAC outputs after reference voltage adjustment (blue) portray how the mentioned optimizations alleviate the $\beta$ error, particularly in regions associated with high input amplitudes.

![Graph](a)

**Figure 28.** Residue voltage generated by the SHDAC using the reference scaling method with (a) non-optimized design ($C_{pX1} \neq C_{pX2}$), (b) $C_{pX1}$ and $C_{pX2}$ values that are close to each other (through design optimization).
4.3 Bootstrap Switches

Bootstrap switches are utilized between the SHDAC and flash ADC to minimize variations of the parasitic capacitances while maintaining low on-resistance during the sampling phase. Bootstrap switches are less sensitive to charge injection effects due to their constant gate-to-source voltage. However, there is still an input-dependent error of the sampled voltage caused by clock feedthrough when these switches open. Therefore, the sampled voltage at the SHDAC output would be disturbed, causing nonlinearities (up to 1 LSB) in the generated residue voltage that is delivered to the CABS ADC through the buffer. To suppress this error, a dummy NMOS switch (M13) is included at the input of the bootstrap switch (S1 in Figure 27) that is connected to the SHDAC output. As shown in Figure 29, this dummy switch is clocked by CLK_SAMPX, unlike conventional dummy switches with inverted sampling clock signal. The dimensions of M13 are designed to generate an input-dependent charge injection with a magnitude close to the total clock feedthrough error from the two bootstrap switches (the one inside SHDAC, and the one between the SHDAC and flash ADC), but with opposite polarity. An analysis is provided in the next section to show the functionality of this error reduction method. In this design, the method reduces the total error from 3.74mV to 0.32mV for the worst case that occurs at the peak input amplitude.

![Bootstrap switch between the SHDAC and flash ADC with a dummy switch (M13).](image-url)
4.4 Analysis of the Clock Feedthrough Cancellation Technique for Bootstrap Switches

To obtain insights into the functionality and effectiveness of the correction technique with the dummy switch clocked by CLK$_{SAMPX}$, a simplified differential model of the sample-and-hold network in the hybrid ADC was used, which is displayed in Figure 30. The transistors of the two bootstrap switches that directly contribute to the sampling error were kept in the figure, and the rest of each bootstrap switch circuit was simplified with a floating voltage source for the sake of brevity.

![Figure 30. Simplified differential model of the sampling network to analyze the operation of the proposed dummy switch technique.](image)

The annotated voltages at each node in Figure 30 show the worst-case scenario, which occurs when the input amplitude is at the peak value (i.e., $V_{XP} = 850\text{mV}$ and $V_{XN} = 350\text{mV}$), resulting in the largest error of the sampled voltage. Note that the first bootstrap switches inside the SHDAC ($M_{10bsP}$ and $M_{10bsN}$) are already open when the second bootstrap switches ($M_{10P}$ and $M_{10N}$) start to open (with the falling edge of CLK$_{SAMPX}$ according to the clock diagram in Figure 22). To include the sampling error from the first bootstrap switches in the analysis, their corresponding node voltages are annotated in Figure 30 for the moment at which they are closed.

In Figure 30, \(V_{IP}\) and \(V_{IN}\) are the SHDAC inputs, \(V_{XP}\) and \(V_{XN}\) are the SHDAC outputs, and \(V_{FP}\) and \(V_{FN}\) are the flash ADC inputs. \(M_{9bsP,N}\) and \(M_{10bsP,N}\) represent the main sampling bootstrap switches in the SHDAC, and \(M_{9P,N}\) and \(M_{10P,N}\) represent the bootstrap switches that connect the SHDAC to the flash ADC. \(C_u\) is the unit capacitor in the SHDAC, \(C_{pX}\) is the total parasitic capacitance at the SHDAC output (as already defined in equation (10)), and \(C_{pF}\) is the total parasitic capacitance at the flash ADC input. The total sampling capacitance at the SHDAC output (node \(V_X\)) is \(C_{tot} = 8C_u + C_{pX}\). The total error of the sampled voltage (\(\Delta V_{tot}\)) mainly originates from charge injection (\(\Delta V_{CI}\)) and clock feedthrough (\(\Delta V_{CF}\)). Similar to the analysis in [105], this error on the sampled voltage can be defined as

\[
\Delta V_{tot} = \Delta V_{CI} + \Delta V_{CF}, \quad \text{such that} \quad V_X = V_I - \Delta V_{tot},
\] (16)

where \(V_I\) is the differential input voltage, and \(V_X\) is the differential sampled voltage at the SHDAC output. Next, the sampling network is analyzed under consideration of the two error sources to assess how this method helps to alleviate errors.

### 4.4.1 Sampling error due to charge injection

Assuming half of the channel charge flows to each of the source and drain terminals of the MOSFET switch after it opens, then the error from charge injection is defined as [105]:

\[
\Delta V_{CI,MOSFET} = \frac{Q_{ch}}{2C_{tot}} = \frac{WLC_{ox}(V_{GS} - V_{th})}{2C_{tot}},
\] (17)

where \(Q_{ch}\) is the channel charge, \(W\) is the transistor channel width, \(L\) is the transistor channel length, \(C_{ox}\) is the oxide capacitance per area, \(V_{GS}\) is the gate-to-source voltage, and \(V_{th}\) is the threshold voltage. The impact of threshold voltage variation due to the body effect on the sampling error is neglected in this analysis for simplicity. Thus, the following equation can be written for the total charge injection error at the \(V_{XP}\) node from transistors \(M_{10bsP}, M_{9P}, M_{10P}\) and \(M_{13P}\):

\[
\begin{aligned}
\Delta V_{CI,P} &= \Delta V_{CI,10bsP} + \Delta V_{CI,10P} + \Delta V_{CI,9P} + \Delta V_{CI,13P} \\
&= \frac{(WL)_{10bs} C_{ox} (V_{GS,10bsP} - V_{th,10bsP})}{2C_{tot}} + \frac{(WL)_{10} C_{ox} (V_{GS,10P} - V_{th,10P})}{2C_{tot}} \\
&\quad + \frac{(WL)_{9} C_{ox} (V_{GS,9P} - V_{th,9P})}{2C_{tot}} + \frac{(WL)_{13} C_{ox} (V_{GS,13P} - V_{th,13P})}{2C_{tot}} .
\end{aligned}
\] (18)

Similarly, for the negative branch:

\[
\begin{aligned}
\Delta V_{CI,N} &= \Delta V_{CI,10bsN} + \Delta V_{CI,10N} + \Delta V_{CI,9N} + \Delta V_{CI,13N} \\
&= \frac{(WL)_{10bs} C_{ox} (V_{GS,10bsN} - V_{th,10bsN})}{2C_{tot}} + \frac{(WL)_{10} C_{ox} (V_{GS,10N} - V_{th,10N})}{2C_{tot}} \\
&\quad + \frac{(WL)_{9} C_{ox} (V_{GS,9N} - V_{th,9N})}{2C_{tot}} + \frac{(WL)_{13} C_{ox} (V_{GS,13N} - V_{th,13N})}{2C_{tot}} .
\end{aligned}
\] (19)

$V_{GS}$ is constant (equal to $V_{DD}$) for the bootstrap switch transistors ($M_{10bs}$ and $M_{10}$) as well as for $M_9$, which implies that $V_{GS,10bsP} = V_{GS,10bsN}$, $V_{GS,10P} = V_{GS,10N}$, and $V_{GS,9P} = V_{GS,9N}$. Consequently, their charge injection errors cancel in differential mode. However, for the dummy switches in each branch ($M_{13P}$ and $M_{13N}$ in Figure 30), $V_{GS,13P} = V_{DD} - V_{XP}$ and $V_{GS,13N} = V_{DD} - V_{XN}$. Hence, the total differential sampling voltage error from charge injection can be estimated as:

\[
\Delta V_{CI} = \Delta V_{CI,P} - \Delta V_{CI,N} \approx \left[ \frac{C_{ox}}{2C_{tot}} (WL)_{13} \right] \cdot (V_{XN} - V_{XP}) = - \left[ \frac{C_{ox}}{2C_{tot}} (WL)_{13} \right] \cdot (V_{XP} - V_{XN}) .
\] (20)

### 4.4.2 Sampling error due to clock feedthrough

The error on the sampled voltage from clock feedthrough in a bootstrap switch is input-dependent because the gate terminal voltage of the bootstrap switch is equal to \( V_{DD} + V_{in} \) before opening, and transitions to 0V afterwards. The impact of this error on the sampled voltage can be estimated by a voltage division from the gate terminal of the sampling switch, which is between the overlap capacitance \( W \cdot C_{ov} \) (either on source or drain side) and \( C_{tot} \), where \( C_{ov} \) is the overlap capacitance per width. Equation (21) below shows the clock feedthrough error for a MOSFET switch [105].

\[
\Delta V_{CF,MOSFET} = \frac{W C_{ov}}{C_{tot} + W C_{ov}} \cdot V_{G}
\] (21)

In (21), $V_G$ is the gate voltage of the MOSFET switch. Here, since the dummy switch is connected to the $V_X$ node with both its source and drain terminals (Figure 30), the impact of the clock feedthrough error from the dummy switch is multiplied by 2. Thus, the following equation can be written for the positive branch of the sampling channel:

\[
\begin{aligned}
\Delta V_{CF,P} &= \Delta V_{CF,10bsP} + \Delta V_{CF,10P} + \Delta V_{CF,9P} + \Delta V_{CF,13P} \\
&= \frac{W_{10bs} C_{ov}}{C_{tot} + W_{10bs} C_{ov}} \cdot V_{G,10bsP} + \frac{W_{10} C_{ov}}{C_{tot} + W_{10} C_{ov}} \cdot V_{G,10P} + \frac{W_{9} C_{ov}}{C_{tot} + W_{9} C_{ov}} \cdot V_{G,9P} + \frac{2 \cdot W_{13} C_{ov}}{C_{tot} + W_{13} C_{ov}} \cdot V_{G,13P} \\
&= \frac{W_{10bs} C_{ov}}{C_{tot} + W_{10bs} C_{ov}} \cdot (V_{DD} + V_{XP}) + \frac{W_{10} C_{ov}}{C_{tot} + W_{10} C_{ov}} \cdot (V_{DD} + V_{XP}) + \frac{W_{9} C_{ov}}{C_{tot} + W_{9} C_{ov}} \cdot (V_{DD} + V_{XP}) + \frac{2 \cdot W_{13} C_{ov}}{C_{tot} + W_{13} C_{ov}} \cdot V_{DD} .
\end{aligned}
\] (22)

Similarly, for the negative branch it can be obtained that

\[
\begin{aligned}
\Delta V_{CF,N} &= \Delta V_{CF,10bsN} + \Delta V_{CF,10N} + \Delta V_{CF,9N} + \Delta V_{CF,13N} \\
&= \frac{W_{10bs} C_{ov}}{C_{tot} + W_{10bs} C_{ov}} \cdot V_{G,10bsN} + \frac{W_{10} C_{ov}}{C_{tot} + W_{10} C_{ov}} \cdot V_{G,10N} + \frac{W_{9} C_{ov}}{C_{tot} + W_{9} C_{ov}} \cdot V_{G,9N} + \frac{2 \cdot W_{13} C_{ov}}{C_{tot} + W_{13} C_{ov}} \cdot V_{G,13N} \\
&= \frac{W_{10bs} C_{ov}}{C_{tot} + W_{10bs} C_{ov}} \cdot (V_{DD} + V_{XN}) + \frac{W_{10} C_{ov}}{C_{tot} + W_{10} C_{ov}} \cdot (V_{DD} + V_{XN}) + \frac{W_{9} C_{ov}}{C_{tot} + W_{9} C_{ov}} \cdot (V_{DD} + V_{XN}) + \frac{2 \cdot W_{13} C_{ov}}{C_{tot} + W_{13} C_{ov}} \cdot V_{DD} .
\end{aligned}
\] (23)

Thus, the total sampling error from clock feedthrough can be expressed as follows for the differential mode:

\[
\Delta V_{CF} = \Delta V_{CF,P} - \Delta V_{CF,N} = \left[ \frac{W_{10bs} C_{ov}}{C_{tot} + W_{10bs} C_{ov}} + \frac{W_{10} C_{ov}}{C_{tot} + W_{10} C_{ov}} + \frac{W_{9} C_{ov}}{C_{tot} + W_{9} C_{ov}} \right] \cdot (V_{XP} - V_{XN}) .
\] (24)

As can be seen from equation (24), the dummy switches do not contribute to clock feedthrough during the differential processing because the gate voltage of both dummy switches ($M_{13p}, M_{13n}$) is $V_{DD}$, which results in cancellations of the corresponding terms in equations (22) and (23).

### 4.4.3 Sampling voltage error cancellation

As already defined in equation (16), the total sampling error is the summation of the errors from clock feedthrough and charge injection. Hence, by substituting (20) and (24) into (16), the total differential sampling voltage error in this configuration is

\[
\Delta V_{tot} = \left[ \frac{W_{10bs} C_{ov}}{C_{tot} + W_{10bs} C_{ov}} + \frac{W_{10} C_{ov}}{C_{tot} + W_{10} C_{ov}} + \frac{W_{9} C_{ov}}{C_{tot} + W_{9} C_{ov}} \right] \cdot (V_{XP} - V_{XN}) - \left[ \frac{C_{ox}}{2 C_{tot}} \cdot (WL)_{13} \right] \cdot (V_{XP} - V_{XN}) .
\] (25)

As seen from equation (25), the total clock feedthrough error impact of the bootstrap switches on the sampled voltage has an opposite polarity compared to the charge injection effect from the dummy switches. Thus, the total sampled voltage error at the SHDAC output in differential mode can be suppressed significantly by design through the selection of proper transistor dimensions and confirmation of the cancellation via simulations.

To verify the sampling error cancellation, transient simulations were performed with the bootstrap switches connected to the ADC circuitry for three different cases: with a dummy switch clocked by CLK$_{SAMPX}$, without a dummy switch, and with the same dummy switch clocked by CLK$_{SAMPX\_B}$ (inverted clock). The differential output waveforms of the SHDAC in these cases are shown in Figure 31 for the worst-case condition, which occurs when the differential input voltage is at its peak, implying that $V_{IP} = 850\text{mV}$ and $V_{IN} = 350\text{mV}$. Note that the sampled voltage errors are measured from the end of the sampling phase to the moment before the start of the residue generation (voltage shifting). Table 3 summarizes the sampling voltage errors corresponding to Figure 31. As seen from the simulation results and Table 3, using the properly sized dummy switch clocked by CLK$_{SAMPX}$ effectively reduces the total input-dependent sampling error in agreement with the above analysis.

![Figure 31. Simulated differential output voltage of the SHDAC, showing the voltage errors for three simulation cases.](image-url)
Table 3. Sampling error from schematic simulations for the three different cases

<table>
<thead>
<tr>
<th></th>
<th>Dummy switch with CLK<sub>SAMPX</sub></th>
<th>No dummy switch</th>
<th>Dummy switch with CLK<sub>SAMPX_B</sub></th>
</tr>
</thead>
<tbody>
<tr>
<td>\( \Delta V_{XP} \)</td>
<td>36.07mV</td>
<td>33.29mV</td>
<td>28.98mV</td>
</tr>
<tr>
<td>\( \Delta V_{XN} \)</td>
<td>35.75mV</td>
<td>29.55mV</td>
<td>21.65mV</td>
</tr>
<tr>
<td>\( \Delta V_{tot} \)</td>
<td><strong>0.32mV</strong></td>
<td>3.74mV</td>
<td>7.34mV</td>
</tr>
</tbody>
</table>

To evaluate the sampling error reduction method in the presence of process-voltage-temperature (PVT) variations, simulations were completed for all possible combinations of process corner cases (SS, TT, FF), supply voltages (1.14V, 1.2V, 1.26V assuming ±5% variation), and temperatures (-10°C, 27°C, 85°C). The sampling error obtained from the simulations across all PVT corners ranges from -931µV to 567µV.
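
The corner set described above can be enumerated as sketched below to show how many simulation runs it implies and how the reported error range compares with the LSB. The LSB of 4mV is the approximate value quoted in this chapter; the variable names are illustrative, and no simulator is invoked here.

```python
from itertools import product

PROCESS_CORNERS = ("SS", "TT", "FF")
SUPPLY_VOLTAGES = (1.14, 1.20, 1.26)   # 1.2 V nominal, +/-5 %
TEMPERATURES_C = (-10, 27, 85)

# Every combination of process, voltage, and temperature: 3 x 3 x 3 = 27 runs.
pvt_corners = list(product(PROCESS_CORNERS, SUPPLY_VOLTAGES, TEMPERATURES_C))

# Extremes of the sampling error reported in the text, in volts.
ERROR_MIN_V, ERROR_MAX_V = -931e-6, 567e-6
LSB_V = 4e-3                            # approximate LSB quoted in the chapter

worst_error_lsb = max(abs(ERROR_MIN_V), abs(ERROR_MAX_V)) / LSB_V

if __name__ == "__main__":
    print(f"{len(pvt_corners)} PVT combinations")
    print(f"worst-case error: {worst_error_lsb:.2f} LSB")
```

With these numbers the worst-case corner error stays below one quarter of an LSB.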

As stated in the literature [106], the gate-drain and gate-source overlap capacitances per unit width can be approximated by \( C_{ov} = L_D \cdot C_{ox} \), where \( L_D \) is the side diffusion length at the drain or source terminals. Considering that \( W \cdot C_{ov} \ll C_{tot} \), equation (25) can be simplified as:

\[
\Delta V_{tot} \approx \frac{C_{ox}}{C_{tot}} \left[ (WL_D)_{14D} + (WL_D)_{100} + (WL_D)_{90} - \frac{1}{2} (WL_D)_{3P} \right] \cdot (V_{XP} - V_{XN}) . \quad (26)
\]

Since the transistors in the bootstrap switch are located close to each other in the layout, it can be assumed that \( C_{ox} \) is approximately the same for all of them. Hence, equation (26) is a fair estimation to analytically assess mismatches and PVT variations. The values of \( W, L, \) and \( L_D \) can have some deviations and can impact the error. However, as evaluated with comprehensive post-layout Monte Carlo simulations, the standard deviation of the sampled voltage error is 0.59mV (Figure 32). According to the PVT and Monte Carlo simulation results, all error values are in an acceptable range for the target resolution (LSB \( \approx \) 4mV), indicating the robustness of the correction method.
Figure 32. Histogram of the error of the sampled voltage from 100 Monte Carlo post-layout simulation runs.

Additional design optimizations were made with consideration of the delays between different clock paths in the bootstrap switch. The dummy switch pre-charges the output node (using charge injection) slightly before the clock feedthrough occurs, and afterwards the error is suppressed through cancellation, even with short delays. According to the theory and calculations provided in [106], the sharpness of the sampling clock edge at the switch turn-off moment affects the charge injection and clock feedthrough. For this reason, the dummy transistor dimensions were optimized for a wide range of variation of the signal CLK\textsubscript{SAMPX} (fall time from 30ps to 100ps), while maintaining the related error due to variations below 0.4mV. The design optimizations to ensure acceptable sharpness for the clock signal were completed with post-layout simulations.

4.5 Flash ADC

The 3-bit flash ADC has a fully differential topology with seven comparators that compare the input signal against differential reference voltages. The flash ADC was designed in collaboration with another research group member and adapted for the hybrid ADC architecture [107]. Figure 33 depicts the single-ended equivalent diagram of the flash ADC. Four sets of DFFs and thermometer-to-binary encoders capture the outputs of the comparators and generate the 3-bit binary code for each time-interleaved channel. \(V_{cp,i}\), \(V_{fp,i}\), \(V_{cn,i}\) and \(V_{fn,i}\) are the coarse and fine calibration voltages for the two halves of comparator “i”.
Figure 33. Single-ended equivalent of the 3-bit flash ADC architecture in the hybrid ADC.
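
The comparator bank and the thermometer-to-binary step described above can be illustrated with a small behavioural model. This is only a sketch under stated assumptions: the seven thresholds are taken as uniformly spaced over a differential range of \( \pm V_{FS}/2 \), the comparators are ideal (no offset, no kickback), and the encoder simply counts ones, which is one possible realization and not necessarily the encoder used on the chip. The function name `flash_3bit` is hypothetical.

```python
def flash_3bit(vin_diff, v_fs=1.0):
    """Behavioural 3-bit flash: seven ideal comparators, then an encoder.

    Assumes a differential input between -v_fs/2 and +v_fs/2 and
    uniformly spaced reference levels (illustrative values only).
    """
    lsb = v_fs / 8
    # Seven decision thresholds between the eight output codes.
    thresholds = [-v_fs / 2 + k * lsb for k in range(1, 8)]
    # Each comparator outputs 1 when the input exceeds its reference.
    thermometer = [1 if vin_diff > t else 0 for t in thresholds]
    # For a bubble-free thermometer code, the binary value is the count of ones.
    code = sum(thermometer)
    return thermometer, code


for v in (-0.5, -0.2, 0.01, 0.49):
    therm, code = flash_3bit(v)
    print(f"vin = {v:+.2f} V  thermometer = {therm}  code = {code:03b}")
```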

Figure 34 displays the architecture of the modified StrongARM comparator [108] in the flash ADC. This architecture without preamplifier was chosen for its high speed of operation with low power consumption. The absence of a preamplifier results in a high kickback noise at the input of the flash ADC. The kickback noise is reduced by placing the series NMOS switches (M5-M6) between the input pairs (M1-M4) and the regenerative back-to-back inverters (M7-M10) as reported in [92], [109]. With this kickback reduction technique, the nodes at the drains of the input transistors are floating during the reset phase instead of pre-charging to VDD as in the conventional case. This results in less voltage variation at the drains of M1-M4 during the comparator’s operation, thereby reducing the kickback noise at the input. Figure 35(a) shows the differential input waveform of the flash ADC with and without kickback reduction in the worst case, where the input voltage is equal to the peak value of 500mV. Figure 35(b) displays the kickback errors at the input of the flash ADC versus input voltage amplitude from simulations with and without kickback reduction transistors. In this design, the maximum kickback (absolute value) was reduced from 23.2mV to 1.12mV in transistor-level flash ADC simulations. Note that simulations revealed that a kickback of 23.2mV would severely degrade the ENOB of the hybrid ADC to 3.16.
Figure 34. Dynamic latched comparator with kickback reduction and offset compensation circuitry.

Figure 35. Simulation results with and without kickback reduction transistors: (a) differential input voltage signal of the flash ADC, (b) kickback voltage error vs. input amplitude.

To ensure sufficient linearity in a standalone flash ADC, it is required to satisfy the condition $3 \sigma < \text{LSB}_{\text{Flash}}/2$, where $\sigma$ is the standard deviation of the comparators’ offsets [110]. In a 3-bit case, for instance, $\text{LSB}_{\text{Flash}} = 125\text{mV}$ with a $1\text{V}_{\text{p-p}}$ full-scale swing. However, when the flash ADC is part of a hybrid ADC, its comparators must satisfy the overall ADC resolution. The impact of comparator offsets in the flash ADC on the transfer function of the subranging architecture is conceptually depicted in Figure 36. A large comparator offset can cause a significant residue voltage error by sending the generated voltage out of the operating range (green area in Figure 36) of the CABS ADC. With a full-scale swing of $1\text{V}_{\text{p-p}}$ in this design, a target offset specification of $3 \sigma < \text{LSB}_{\text{Hybrid}}/2 \approx 2\text{mV}$ was used since $\text{LSB}_{\text{Hybrid}} = 1\text{V} / 2^8 \approx 3.9\text{mV}$.

\begin{figure}[h]
\centering
\includegraphics[width=0.5\textwidth]{figure36.png}
\caption{Impacts of flash ADC comparator offsets on the subranging transfer function.}
\end{figure}
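
To make the difference between the two offset targets concrete, the following Python sketch evaluates the condition $3\sigma < \text{LSB}/2$ for the standalone 3-bit flash case and for the 8-bit hybrid case. It assumes the 1V peak-to-peak full-scale swing quoted above; the helper name `offset_sigma_limit` is hypothetical.

```python
def offset_sigma_limit(v_fs, n_bits):
    """Return (LSB, largest sigma) in volts for the rule 3*sigma < LSB/2."""
    lsb = v_fs / 2**n_bits
    sigma_max = lsb / 2 / 3
    return lsb, sigma_max


flash = offset_sigma_limit(1.0, 3)    # standalone 3-bit flash ADC
hybrid = offset_sigma_limit(1.0, 8)   # flash used inside the 8-bit hybrid ADC

for name, (lsb, sigma) in (("flash", flash), ("hybrid", hybrid)):
    print(f"{name:6s} LSB = {lsb * 1e3:7.3f} mV, sigma limit = {sigma * 1e3:6.3f} mV")
```

Under these assumptions the allowed offset standard deviation shrinks by a factor of 32 when the flash ADC has to meet the hybrid resolution, which is the motivation for the calibration discussed next.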

Low-power high-speed comparators are typically designed with small device dimensions to minimize parasitic capacitances. However, this increases the input offset, which is inversely proportional to the square root of the transistor layout area. Using large input pair transistors to reduce the mismatch offset is not an option here because they increase the total input capacitance of the flash ADC as well as the kickback noise. Another alternative would be the use of preamplifiers, but they are avoided in this work to minimize power consumption. Since digitally-assisted design approaches are effective for improving robustness to process variations with system-level design flexibility [48], [50], [110], a digital offset calibration scheme was developed for this flash ADC, which is described further in Section 4.10.

Two pairs of transistors ($\text{M}_{15}$-$\text{M}_{18}$) are used to adjust the offset of each comparator (Figure 34) by creating a current imbalance between its branches [62]. The amount of current injected in each branch is controlled through the gate voltage of the calibration transistors. Alternative approaches use varactors [61] or banks of capacitors [36] for calibration. For flash ADC calibration within the hybrid ADC architecture (Figure 20), using the current injection method rather than the capacitor-based techniques leads to less extra parasitic capacitance as well as reduced kickback noise.

With typical simulation corner models in 130nm CMOS technology, the comparator has a propagation delay of 200ps for a voltage difference of LSB/16 (250µV) between the input and the reference. The comparator consumes 0.143mW at 1GHz. The flash ADC consumes 1mW from a 1.2V supply voltage, including the comparators and resistor ladder.

4.6 Unity-Gain Voltage Buffer

The unity-gain voltage buffer between the SHDAC outputs and CABS ADC inputs (Figure 20) suppresses the loading and kickback noise effects from the CABS ADC at the SHDAC output. Since the flash ADC input range is equal to the full-scale voltage of the hybrid ADC, driving the high-speed flash would demand an operational transconductance amplifier (OTA) with very high input/output swing capabilities, which would result in excessive power consumption. On the other hand, the CABS ADC input range is one eighth of the full-scale range. Thus, a low-voltage swing OTA with lower power is sufficient to drive the second stage. For this reason, the flash ADC is directly connected to the SHDAC output without buffer in this hybrid ADC architecture. The need to isolate the flash from the SHDAC was eliminated with the kickback reduction method within the flash, as described in Section 4.5. Figure 37(a) illustrates the unity-gain buffer configuration to drive the CABS ADCs. The single-stage low-power telescopic OTA [105] designed for the unity-gain buffer is shown in Figure 37(b). The OTA’s DC gain and unity-gain bandwidth are 36.1dB and 1.42GHz with phase margin of 60.1°, providing stability and sufficiently fast settling behavior for this application. Each OTA in the unity-gain configuration consumes 0.65mW from a 1.2V supply.
Figure 37. (a) Unity-gain buffer configuration, (b) schematic of the telescopic OTA in the unity-gain buffer.

The unity-gain buffer’s power supply rejection ratio (PSRR) was evaluated by running 100 Monte Carlo simulations. Figure 38(a) shows the PSRR plots from the transfer function analysis (xf) in Cadence, where the minimum PSRR (at 10MHz) is equal to 54.9dB. Furthermore, linearity distortion was evaluated by applying a sinusoidal input with peak-to-peak amplitude of \( V_{FS}/8 = 62.5\text{mV} \) (i.e., the maximum full-scale range of the second stage), and running 100 Monte Carlo simulations. The total harmonic distortion (THD) was calculated from the FFT of the output voltage, of which the histogram is displayed in Figure 38(b). The simulated worst-case THD is -66.04dB.

Figure 38. Simulated (a) PSRR, and (b) THD of the unity-gain buffer from 100 Monte Carlo simulation runs.
In the case of a 5% supply voltage drop, the simulated minimum gain and unity-gain frequency of the telescopic OTA are 34.8dB and 1.387GHz, respectively. In addition, the layout of the current mirror for the unity-gain buffers was completed with a common centroid approach to enhance the matching between the buffers in each time-interleaved channel.

4.7 Comparator-Based Asynchronous Binary Search (CABS) ADC

The 5-bit CABS ADC designed for the proposed hybrid ADC is depicted in Figure 39. This CABS architecture is similar to the one introduced in [33], except that the reference voltages are generated by a resistive ladder. The low-power property of the CABS ADC originates from the asynchronous operation because only one comparator operates at a time. With the exception of the single comparator in the first stage, which is triggered by CLK\textsubscript{CABS}, the comparators in each stage are triggered by the outputs of the active comparator in the preceding stage. At each level of the comparator tree, only one of the comparator outputs transitions to high, triggering one of the comparators in the next stage. In Figure 39, V\textsubscript{R.C} is the full-scale voltage of the CABS ADC.

![Figure 39. 5-bit CABS ADC architecture (single-ended equivalent).](image)
As displayed in Figure 40, the CABS output bits are asynchronously resolved from MSB to LSB until the last stage is reached. Unlike conventional SAR ADCs, the CABS ADC does not require a digital control unit or an internal clock signal that is multiple times faster than its conversion rate.

![Figure 40. Outputs of the activated CABS comparators for one 5-bit conversion.](image)
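
The MSB-to-LSB traversal of the comparator tree can be sketched with a behavioural model in Python. The model is an idealization: it assumes a single-ended input between 0 and the CABS full-scale voltage, ideal comparators, and thresholds placed at the midpoints that a uniform reference ladder would provide. The function name `cabs_convert` and the example full-scale value of 125mV are illustrative assumptions.

```python
def cabs_convert(vin, v_range, n_bits=5):
    """Walk a binary comparator tree; exactly one comparator fires per level.

    Returns the output code and the list of (level, comparator index)
    pairs that were activated during the conversion.
    """
    code = 0
    fired = []
    for level in range(n_bits):
        # The bits resolved so far select which comparator of this level fires.
        fired.append((level, code))
        # Its threshold is the midpoint of the remaining search interval.
        trial = (code << 1) | 1
        threshold = trial * v_range / 2 ** (level + 1)
        bit = 1 if vin > threshold else 0
        code = (code << 1) | bit
    return code, fired


code, fired = cabs_convert(vin=0.125 * 10.5 / 32, v_range=0.125)
print(f"code = {code:05b}, comparators used = {len(fired)} of {2**5 - 1}")
```

In this model only five of the 31 comparators in the tree are activated per conversion, which reflects the low-power property described above.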

The dynamic comparator shown in Figure 41(a) was designed for the CABS ADC. Memory effects due to the remaining charges on the parasitic capacitances at critical nodes can degrade the performance of the CABS ADC. For this reason, transistors M9-12 are used to reset the comparator nodes when the clock signal is low, which suppresses memory effects. The CMOS inverters at the outputs provide rail-to-rail swing as well as improve the capability to drive the next stage with high speed. As shown in Figure 41(b), a pull up/down latch encoder is embedded in each comparator to resolve and latch the output bit immediately after the comparison has been finished. This helps to reduce the total conversion time of the CABS ADC compared to [38] where all outputs were generated once at the end of the LSB conversion by a more complex CMOS encoder. All binary outputs are held at the encoder outputs without being reset until the next conversion, which ensures reliable latching of the outputs at the end of the conversion.
The maximum propagation delay of the CABS comparator for a voltage difference of LSB/16 is 225ps while clocked at 250MHz and consuming a power of 48µW. As shown in Figure 39, a ladder of 100Ω polysilicon resistors generates the reference voltages. Grounded on-chip MOS bypass capacitors (C_{dec} = 6pF) are connected to every reference voltage node of the resistor ladder to reduce ripples from kickback and input feedthrough. The total layout size of the resistor ladder with bypass capacitors in each CABS ADC is 500µm × 80µm. Further ripple reduction occurs due to partial cancellation of the single-ended ripples in differential mode. The capacitor values were chosen based on transient simulation results to ensure that the peak values of the ripples on the differential reference voltages are below 0.19mV. The total power consumption of the CABS ADC at 250MS/s is 280µW. The standard deviation of the CABS comparator offset is equal to 0.699mV from 100 Monte Carlo runs (Figure 42). This offset was achieved by using input pairs and M_{tail} transistors with relatively large dimensions, as well as conservative layout design with matching techniques.

Figure 41. (a) CABS ADC comparator schematic, (b) pull up/down latch encoder.
The asynchronous operation of the CABS ADC helps to suppress the effect of metastability in this architecture because the decision time budget for each stage is flexible with a large overall timing margin. If metastability occurs in one stage, then the decision in this stage will be made with additional delay, but after that additional delay the comparators in the subsequent stages of the CABS ADC will still be triggered. The propagation delays of the regenerative latch in the CABS comparator and of the complete comparator (i.e., after the output inverters) were evaluated by running transient simulations for various input voltage differences from 1nV to 10mV. The key waveforms and simulation results for metastability evaluation are displayed in Figure 43, showing that the probability of a metastable state is very low, especially with the benefit from asynchronous operation. For instance, let us assume a case in which a metastable condition occurs in one stage of the CABS ADC and the latch outputs become ready after a long delay of 500ps, which is very pessimistic considering the simulation results in Figure 43(b). In that case, the comparators of the remaining four stages will still have plenty of time during the conversion, especially since their delays should be significantly shorter than the pessimistic worst-case event because the voltage difference at their inputs is relatively large (≥1LSB = 4mV). There is a 2ns time budget for the 5-bit CABS ADC conversion in this design. Thus, no error would occur in such a case as long as all five comparisons are completed in less than 2ns. However, to evaluate the robustness of the CABS ADC, we estimated the bit error rate (BER) with a pessimistic assumption of losing 1 bit in the event of a 350ps propagation delay. Using the calculation method described in [111], the estimated BER of the CABS ADC is $2.39 \times 10^{-17}$, which is comparable to commercial high-speed ADCs [112].
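
The timing margin argument can be checked with a few lines of Python. The sketch assumes that one stage takes the pessimistic 500ps while each of the other four stages takes no longer than the 225ps maximum delay quoted earlier for an LSB/16 input difference. It ignores encoder and routing delays, so it is a rough estimate and not a substitute for the transient simulations.

```python
T_BUDGET_PS = 2000.0      # conversion window stated in the text
T_METASTABLE_PS = 500.0   # pessimistic delay of a metastable stage
T_NORMAL_PS = 225.0       # assumed upper bound for a non-metastable stage
N_STAGES = 5


def conversion_time_ps(n_metastable=1):
    """Total delay when n_metastable of the five stages are slow."""
    n_normal = N_STAGES - n_metastable
    return n_metastable * T_METASTABLE_PS + n_normal * T_NORMAL_PS


t = conversion_time_ps()
margin = T_BUDGET_PS - t
print(f"conversion time = {t:.0f} ps, margin = {margin:.0f} ps")
```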

![Graph](image)

(a)

<table>
<thead>
<tr>
<th>Point</th>
<th>Corner</th>
<th>vi_diff</th>
<th>Pass/Fail</th>
<th>Propagation_Delay</th>
<th>Latch_Delay</th>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
<td>nom</td>
<td>1n</td>
<td></td>
<td>430.1p</td>
<td>422p</td>
</tr>
<tr>
<td>2</td>
<td>nom</td>
<td>10n</td>
<td></td>
<td>353.8p</td>
<td>335.7p</td>
</tr>
<tr>
<td>3</td>
<td>nom</td>
<td>100n</td>
<td></td>
<td>357.7p</td>
<td>343.4p</td>
</tr>
<tr>
<td>4</td>
<td>nom</td>
<td>1u</td>
<td></td>
<td>321.7p</td>
<td>313.1p</td>
</tr>
<tr>
<td>5</td>
<td>nom</td>
<td>10u</td>
<td></td>
<td>265.9p</td>
<td>276.7p</td>
</tr>
<tr>
<td>6</td>
<td>nom</td>
<td>100u</td>
<td></td>
<td>250.6p</td>
<td>240.3p</td>
</tr>
<tr>
<td>7</td>
<td>nom</td>
<td>1m</td>
<td></td>
<td>215.6p</td>
<td>203.6p</td>
</tr>
<tr>
<td>8</td>
<td>nom</td>
<td>10m</td>
<td></td>
<td>182p</td>
<td>185.7p</td>
</tr>
</tbody>
</table>

(b)

Figure 43. Results from simulations to assess metastability: (a) regenerative latch output waveforms in the CABS comparator, and (b) propagation delay of the latch and the complete comparator.
4.8 Clock Generation System

Clock generation for time-interleaved (TI) analog-to-digital converter (ADC) architectures becomes increasingly challenging at high sampling rates because timing skew mismatches between the channels in a TI architecture can cause significant signal-to-noise ratio (SNR) degradation at high input frequencies [25], [90], [86].

4.8.1 Timing considerations for time-interleaved high-frequency ADCs

For an M-channel time-interleaved ADC with an input signal of \( V_{in} = A \cdot \cos(2\pi f_{in} \cdot t) \), where the peak-to-peak amplitude is \( V_{FS} = 2 \cdot A \), the SNR limit due to the impact of timing skews between channels can be estimated with the following equation [90]:

\[
SNR = 20 \cdot \log \left( \frac{1}{2\pi \cdot f_{in} \cdot \sigma_{skew}} \right), \quad \text{where} \quad \sigma_{skew} = \sqrt{\frac{1}{M} \sum_{i=1}^{M-1} \Delta t_{i}^2}, \quad (27)
\]

and \( \Delta t_{i} \) is the deviation of channel \( i \)'s sampling instant from its nominal value (assuming that the average of the \( \Delta t_{i} \) values is 0).

Another important constraint for clock signal generation is noise due to jitter, which is a common challenge for all ADC architectures. Jitter stems from random variations of the sampling instant around the ideal sampling time. It can be caused by the phase noise of the oscillator, thermal noise, and power supply noise. For the same sinusoidal input, the SNR limit of an ADC under the influence of jitter is given by [90]:

\[
SNR = 20 \cdot \log \left( \frac{1}{2\pi \cdot f_{in} \cdot \sigma_{jitter}} \right), \quad (28)
\]

where \( \sigma_{jitter} \) is the root-mean-square jitter in seconds. On the other hand, the quantization noise power can be calculated as follows:

\[
P_{\text{quantization}} = \frac{\Delta^2}{12} = \frac{A^2}{12 \times 2^{2N-2}}, \quad (29)
\]

where \( N \) is the number of bits and \( \Delta = V_{FS}/2^N = A/2^{N-1} \) is the least significant bit (LSB) of the ADC. For a sine input, the signal power is \( P_{\text{sig}} = A^2/2 \). Hence, based on equations (27) and (28), the power of the timing skew noise ($P_{\text{skew}}$) and of the jitter noise ($P_{\text{jitter}}$) can be estimated as

$$
P_{\text{skew}} = \frac{A^2}{2} \left(2\pi \cdot f_{\text{in}} \cdot \sigma_{\text{skew}}\right)^2 \quad \text{and} \quad P_{\text{jitter}} = \frac{A^2}{2} \left(2\pi \cdot f_{\text{in}} \cdot \sigma_{\text{jitter}}\right)^2 . \quad (30)
$$

The total noise power of the considered non-idealities can be expressed as $P_{\text{noise}} = P_{\text{quantization}} + P_{\text{skew}} + P_{\text{jitter}}$. To limit the SNR degradation from jitter and timing skews to 3dB (i.e., ENOB reduction of 0.5), it is required that $P_{\text{skew}} + P_{\text{jitter}} \leq P_{\text{quantization}}$, which results in the following condition:

$$
\frac{A^2}{2} \left(2\pi \cdot f_{\text{in}} \cdot \sigma_{\text{skew}}\right)^2 + \frac{A^2}{2} \left(2\pi \cdot f_{\text{in}} \cdot \sigma_{\text{jitter}}\right)^2 \leq \frac{A^2}{12 \times 2^{2N-2}} . \quad (31)
$$

It follows from equation (31) that, for an 8-bit 1GS/s ADC with 500MHz (worst case) sinusoidal input, the target values for $\sigma_{\text{skew}}$ and $\sigma_{\text{jitter}}$ should meet the following constraint:

$$
\sqrt{\sigma_{\text{skew}}^2 + \sigma_{\text{jitter}}^2} \leq 1.02 \, \text{ps} . \quad (32)
$$
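
The numerical bound in equation (32) can be reproduced from equation (31) with a short Python sketch. The amplitude \( A \) cancels on both sides, so only the resolution and the input frequency are needed; the function name `max_total_timing_sigma` is hypothetical.

```python
import math


def max_total_timing_sigma(n_bits, f_in_hz):
    """Largest sqrt(sigma_skew^2 + sigma_jitter^2) allowed by equation (31).

    Dividing (31) by A^2/2 gives
    (2*pi*f_in)^2 * (sigma_skew^2 + sigma_jitter^2) <= 2 / (12 * 2^(2N-2)).
    """
    rhs = 2.0 / (12.0 * 2 ** (2 * n_bits - 2))
    return math.sqrt(rhs) / (2 * math.pi * f_in_hz)


sigma_max = max_total_timing_sigma(n_bits=8, f_in_hz=500e6)
print(f"allowed total timing error: {sigma_max * 1e12:.3f} ps")
```

Because the bound scales with \( 1/f_{in} \) and halves for every additional bit, the same helper can be used to explore how the budget changes for other resolutions or input frequencies.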

The driving capability of the clock generation circuit is another design consideration. Clock signals in hybrid time-interleaved ADCs are required to drive input capacitances of numerous gates in addition to the parasitic capacitances associated with long routing distances. To guarantee fast transitions, the clock generation circuitry is designed with extensive use of output buffers that are designed with proper fan-in and fan-out considerations.

Finally, the layout quality of clock signal generation systems in TI ADCs is critical to minimize timing skews and crosstalk. Long and asymmetric routing for high-frequency clock distribution can significantly degrade the ADC performance. Furthermore, the coupling effects of the clock distribution network throughout the chip can degrade the performance of the analog blocks if digital clock signals are routed too close to them.
4.8.2 Clock buffer

An external low-jitter high-frequency signal generator can be used to provide a reference signal that appears sinusoidal-like on the chip if an on-chip frequency synthesizer is not available. As depicted in Figure 44(a), a CMOS inverter-based clock buffer was designed with a chain of inverters to generate sharp square-wave-shaped clocks, i.e. CLK and CLK. To synchronize the edges of the final clocks in each channel, sets of inverters were added to create the required delays. The flash ADC clock was designed to quantize with an intentional delay after the falling edge of the sampling clock to accommodate the hold settling time of the SHDAC. Furthermore, clocking all dynamic comparators in the flash ADC at the same time requires a high driving capability at 1GHz due to the high total load capacitance. To achieve the two mentioned goals, an inverter chain with a high fan-out was designed to generate the flash ADC clock as shown in Figure 44(b).

![Clock Buffer Diagram]

Figure 44. (a) Generation of delayed versions of the main clock, (b) insertion of further delay and fan-out strength to drive the flash ADC.

4.8.3 Circuit implementation of the clock generation system

The sampling clocks are the ones that predominantly degrade the SNR of the ADC if affected by small amounts of timing skew or jitter, especially with high input signal frequencies. To minimize these clock skews, a clock gating technique [4] has been adapted to generate all sampling clocks from one reference clock. The ring counter in Figure 45 was employed to generate the four channel gating signals (Phi.1 to Phi.4), for which two examples are shown in Figure 46(a). The gating signals and one reference clock (CLKDL) generate the sampling clocks for each channel via a NAND gate as visualized in Figure 46(b). With this technique, possible timing skews between the gating signals (Phi.1 to Phi.4) do not create timing skews between the final sampling clocks (CLKSAMP.1 - CLKSAMP.4 in Figure 20) because each sampling edge is derived from an edge of the same reference clock.

Figure 45. Generation of the gating signals from a single reference clock using a ring counter.

Figure 46. (a) Clock gating scheme and (b) the combinational logic, shown for two of the four identical channels of the hybrid ADC.
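
The insensitivity of the gated sampling clocks to skew on the gating signals can be illustrated with an idealized timing model in Python. The window positions are assumptions chosen for illustration: the reference clock is taken to be high for the first half of each 1ns period, and each gating signal is assumed to bracket one reference pulse with a quarter period of margin on both sides. The model uses a logical AND for readability, which differs from the NAND gate in Figure 46(b) only in polarity. The function name `gated_pulse` is hypothetical.

```python
T_REF_PS = 1000.0  # reference clock period for a 1 GHz clock


def gated_pulse(channel, phi_skew_ps=0.0):
    """Return (rise, fall) in ps of AND(Phi_channel, reference clock).

    Assumed timing: the reference pulse for this channel is high during
    [k*T, k*T + T/2] and the gating signal is nominally high during
    [k*T - T/4, k*T + 3*T/4], shifted by phi_skew_ps.
    """
    clk_rise = channel * T_REF_PS
    clk_fall = clk_rise + T_REF_PS / 2
    phi_rise = clk_rise - T_REF_PS / 4 + phi_skew_ps
    phi_fall = clk_rise + 3 * T_REF_PS / 4 + phi_skew_ps
    # The AND output is high only while both inputs are high.
    return max(clk_rise, phi_rise), min(clk_fall, phi_fall)


for skew in (-100.0, 0.0, 100.0):
    print(f"Phi skew {skew:+6.1f} ps -> gated pulse {gated_pulse(2, skew)}")
```

In this model the output edges stay at the reference clock edges for any gating skew smaller than the assumed quarter-period margin; a larger skew would start to clip the pulse.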

CMOS gates operating at high frequencies cause the supply voltage to bounce due to bonding wire inductances, package parasitics, and on-chip routing of power supply and ground lines, which can severely increase jitter. To alleviate such high-frequency noise, the supply for the jitter-sensitive blocks (the circuits in Figure 44(a) and Figure 46(b)) is separated from the supply for the rest of the clock generation circuits. In addition, large on-chip decoupling capacitors (600pF for the jitter-sensitive supply and 200pF for the less critical supply) are connected to the supply voltage pads to reduce high-frequency noise. Figure 47 displays the circuit that was used during simulations to model the pad and chip packaging parasitic elements.

![Figure 47. Modeling the die pad and chip package parasitics.](image)

As indicated in Figure 20, the clock signals $\text{CLK}_{\text{SAMPX,1}}$ through $\text{CLK}_{\text{SAMPX,4}}$ control the bootstrap switches between each SHDAC output and the flash ADC input. These clocks must be non-overlapping with each other to avoid changing the charge that is held on the adjacent SHDACs. Furthermore, in each channel, the rising instant of $\text{CLK}_{\text{SAMPX,i}}$ should be close to the rising instant of the main sampling clock ($\text{CLK}_{\text{SAMP,i}}$). However, clock gating is not required because a difference (less than 30ps) between the start times of these two clocks is acceptable as long as the switches have a low on-resistance. Therefore, a replica of the ring counter configuration in Figure 45 is used (clocked with CLK) to generate Phi.1X to Phi.4X. The reason for using CLK (buffer output in Figure 44 (a)) instead of $\text{CLK}_{\text{DL}}$ is to compensate for the long additional propagation delay (~80ps) from the D flip-flops in the Phi.iX generation circuit. The circuit in Figure 48 generates $\text{CLK}_{\text{SAMPX,i}}$ and $\text{CLK}_{\text{SAMPX,i,B}}$ (shown for two of the four identical channels). It ensures that there is no overlap between neighboring channels, and it provides a proper fan-out.

![Figure 48. Generation of non-overlapping signals for the second set of bootstrap switches (shown for two of the four identical channels).](image)
As illustrated in Figure 49, a ring counter with two cycling “1s” and clocked by CLKDL.C in Figure 44(a) generates the clocks for the CABS ADCs. It generates four clock signals with a duty cycle of 50% and a pulse width of 2ns, while the rising edge of each channel has 1ns latency compared to the next channel. Inverters are used as buffers between the outputs of the D flip-flops and the CABS ADCs to maintain sharp rise and fall times.

![Figure 49. Ring counter with two circulating bits of “1” to produce the clocks for the CABS ADCs.](image)
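
The behaviour of a four-stage ring counter with two circulating ones can be sketched in Python at the level of one state per input clock cycle. The initial state and the rotation direction are assumptions for illustration; with a 1GHz input clock, each step corresponds to 1ns. The function names `ring_counter` and `rising_edges` are hypothetical.

```python
def ring_counter(n_cycles, state=(1, 1, 0, 0)):
    """Return the flip-flop states for n_cycles input clock cycles."""
    states = []
    s = list(state)
    for _ in range(n_cycles):
        states.append(tuple(s))
        # Each flip-flop takes the value of its neighbour; the last wraps around.
        s = [s[-1]] + s[:-1]
    return states


def rising_edges(states, ch):
    """Cycle indices at which output ch changes from 0 to 1."""
    return [k for k in range(1, len(states))
            if states[k][ch] and not states[k - 1][ch]]


states = ring_counter(8)
for ch in range(4):
    wave = "".join(str(s[ch]) for s in states)
    print(f"output {ch}: {wave}  rising edges at cycles {rising_edges(states, ch)}")
```

Each output in this model is high for two of every four input cycles, which corresponds to a 50% duty cycle with a 2ns pulse width at a 1GHz input clock, and the rising edges of consecutive outputs are one input cycle (1ns) apart.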

### 4.8.4 Synchronous clock reset

It is essential to ensure the correct ordering of the clock signals within each channel and of the clock signals between the TI channels. If the reset signal’s falling edge is not defined correctly, it can change the order of the clocks and cause a severe problem. The reset synchronization circuit in Figure 50 prevents this issue. It ensures that the synchronous reset’s falling edge for all three ring counters occurs after the rising edge of CLKDL with a constant lag of one D flip-flop delay.

![Figure 50. Circuit for synchronous resetting.](image)
4.8.5 Layout and routing considerations for clock generation

Figure 51 shows the layout of the clock generation circuits. The sampling clocks are the most critical clock signals, and are therefore placed close to the ADC core to minimize the coupling effects and the routing parasitic capacitances. To ensure fast operation of the clock generation circuitry, a set of high-speed logic gates with low threshold voltage MOS transistors was custom-designed and laid out. The clock buffer and the clock gating circuits, which are critical for low jitter operation, have a separate supply to reduce supply noise caused by high-frequency transitions. Transistors with low threshold voltage variation (i.e., with long distances from the edges of n-wells) are used to decrease timing skews caused by device-to-device mismatches. The clocks are routed to the four channels of the ADC with an H-tree network. However, unavoidable differences of the routing distances from the SHDACs to the clock generator cause timing skew mismatches. Small metal-oxide-metal (MOM) capacitors (each less than 20fF) have been added to the three fastest channels to compensate for those deviations. As a result, the maximum timing skew from routing was reduced to below 0.3ps.

![Figure 51. Layout of the clock generation system.](image)
The clock generation circuits for the hybrid ADC were designed and simulated in Cadence using 130nm CMOS technology. The fan-out of the buffers was optimized with extracted layout parasitics to be able to drive the additional routing capacitances. A post-layout simulation at 1GHz revealed a 3.88mW power consumption for the clock generation core with a 1.2V supply. The input clock buffer power consumption is 1.45mW with a 1GHz sinusoidal input signal. In addition, a simplified version of the same clock generation circuitry was designed for the calibration mode to drive the extra channel (Figure 54). The clock generation for calibration produces the $\text{CLK}_{\text{SAMP,CAL}}$ and $\text{CLK}_{\text{SAMPX,CAL}}$ signals, and is only activated during the calibration mode described in Section 4.10.

The layout of the clock generation core (Figure 51) occupies 77µm × 240µm. The simulations included transient noise effects, as well as pad and bonding/package parasitic models causing supply noise. The input signal (at the clock buffer input) is a 1GHz sinusoid with 1.2V_{p-p} amplitude and 0.6V DC, which is terminated by a 50Ω AC-coupled resistor on the chip. The simulations show a jitter of 93fs. To evaluate the timing skews with device mismatches, 100 Monte Carlo simulations were performed in Cadence using foundry-supplied device models, after which the falling edges of the sampling clocks were compared to the sampling clock of the first channel as reference. The histogram of the resulting timing skews is displayed in Figure 52, showing that they have a standard deviation of 958fs. Therefore, the total standard deviation of the timing errors is $\sqrt{(93\text{fs})^2 + (958\text{fs})^2} \approx 0.963\text{ps}$, which meets the requirement from equation (32) and is sufficient to guarantee the required SNDR for input frequencies up to the Nyquist rate.

![Figure 52. Histogram of timing skews from 100 Monte Carlo simulations.](image-url)
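
As a cross-check of the simulated timing errors against the budget, the following Python sketch combines the jitter and skew values by root-sum-square and evaluates the timing-limited SNR from equations (27) and (28) at the worst-case 500MHz input. The comparison value is the ideal 8-bit quantization SNR from the standard \( 6.02N + 1.76 \)dB expression; variable names are illustrative.

```python
import math

SIGMA_JITTER_S = 93e-15   # simulated jitter
SIGMA_SKEW_S = 958e-15    # simulated timing skew standard deviation
F_IN_HZ = 500e6           # worst-case (Nyquist) input frequency
N_BITS = 8

# Independent error sources add in power, so the sigmas add as a root-sum-square.
sigma_total = math.hypot(SIGMA_JITTER_S, SIGMA_SKEW_S)

# SNR limit set by timing errors alone, same form as equations (27) and (28).
snr_timing_db = 20 * math.log10(1 / (2 * math.pi * F_IN_HZ * sigma_total))

# Ideal quantization-limited SNR for a full-scale sine input.
snr_quant_db = 6.02 * N_BITS + 1.76

print(f"total timing error = {sigma_total * 1e12:.3f} ps")
print(f"timing-limited SNR = {snr_timing_db:.2f} dB")
print(f"quantization SNR   = {snr_quant_db:.2f} dB")
```

Under these assumptions the timing-limited SNR stays slightly above the quantization-limited SNR, which is consistent with the condition \( P_{\text{skew}} + P_{\text{jitter}} \leq P_{\text{quantization}} \).
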
4.9 Channel Bandwidth Mismatch Considerations

According to the analysis for a 4-channel TI ADC in [25], an overdesign of the sample-and-hold bandwidth (BW) suppresses the impact of bandwidth mismatches, as also verified through measurements in [58]. The SHDAC was overdesigned with 6.92GHz bandwidth, which is significantly higher than the maximum input signal BW (max. \( f_{\text{in}} = 500\text{MHz} \)), and it achieves a 60dB SNDR. Monte Carlo simulations with the post-layout extracted netlist of the time-interleaved sampling core of the ADC were performed in Cadence, including the SHDACs, bootstrap switches, flash ADC, unity-gain buffers, and routing. The resulting histogram in Figure 53(a) reveals \( \sigma(BW)/BW = 0.013 \) as the total bandwidth mismatch. This result was obtained from \( f_{3\text{dB}} \) of the channels, defined as \( f_{3\text{dB}} = BW = 1/(2\pi R_{\text{eff}} C_{\text{tot}}) \), where \( R_{\text{eff}} \) is the effective resistance of the sampling path and \( C_{\text{tot}} \) is the total sampling capacitance. According to Figure 17 in reference [25], this value corresponds to a maximum achievable SNR of 57dB when the channel bandwidth is overdesigned by a factor of ten compared to the input bandwidth, implying sufficient margin for this 8-bit time-interleaved architecture.

![Figure 53](image_url)

Figure 53. Histogram of (a) the total simulated BW mismatch \( \sigma(BW)/BW \) among TI channels, (b) the simulated BW mismatch \( \sigma(BW)/BW \) among TI channels caused only by the second bootstrap switch.

The SHDACs of the four TI channels are located as close as possible to each other in the center of the ADC core layout to minimize the systematic BW mismatch from the input routing. The total BW mismatch assessment results in Figure 53(a) are comprehensive and also include the impact of the mismatches related to the second bootstrap switches. Theoretically, the impact of BW mismatches from the second bootstrap switch should be very small because it only charges the input capacitance of the flash ADC, which is 10 times smaller than the total sampling capacitance, implying a much shorter RC time constant in comparison to the main switches. To support this theoretical expectation, another Monte Carlo simulation was completed while activating mismatches only for the devices that are associated with the second bootstrap switches. As shown in Figure 53(b), the resulting bandwidth mismatch from the second switch is \( \sigma(BW)/BW = 2.70 \times 10^{-4} \approx 0.0003 \), having a negligible impact on the SNR limit.

### 4.10 Calibration Technique for Flash ADC Offset Cancellation

The offsets of the flash ADC comparators in the proposed hybrid architecture have two sources: the main one is random static offsets from device mismatches, and the second one is a small systematic offset caused by common-mode voltage (V\textsubscript{CM}) variations at the flash input. The V\textsubscript{CM} variations originate from asymmetric charge injection after sampling and from kickback noise of the flash ADC. The overall systematic offset is larger for higher input amplitudes, creating larger gate-source voltage differences for the comparator input transistors that cause asymmetric characteristics. A calibration system was designed by another group member to remove systematic and random offsets, ensuring proper operation of the flash ADC. A brief overview of this calibration system is provided here.

Figure 54 illustrates the block diagram of the closed-loop foreground offset calibration. The system automatically controls the gate voltages of the coarse tuning (M\textsubscript{15}-M\textsubscript{16}) and fine tuning (M\textsubscript{17}-M\textsubscript{18}) current injection transistors of each comparator in Figure 34 to achieve the required input-referred offset. The coarse transistors have a larger W/L ratio than the fine transistors, such that a change of their gate voltage leads to more drain current change and therefore more offset adjustment. During calibration mode, the flash ADC input is disconnected from the main signal path, and successively connected to each of the seven reference voltages via an extra sampling path identical to the ones in main ADC channels. The calibration range was selected to cover \( 3\cdot\sigma \) of the random offset in addition to the systematic offset, which in this design corresponds to maximum offset compensation of 165mV. The coarse correction has an offset step size of 23mV, and the fine correction has a step size slightly smaller than 1.4mV. The coarse and fine corrections are controlled by a DAC with 36 levels. The DAC that generates the coarse and fine voltages is implemented with one resistor ladder that has a voltage range from 500mV to 1.2V with steps of 40mV and 20mV for coarse and fine tuning, respectively.
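The tap voltages of the shared resistor ladder can be enumerated with a short sketch. It assumes that the coarse voltages are simply every second tap of the same ladder, which the text does not state explicitly; the function and variable names are hypothetical.

```python
LADDER_MIN_MV = 500    # lowest ladder voltage from the text
LADDER_MAX_MV = 1200   # highest ladder voltage from the text
FINE_STEP_MV = 20      # fine tuning voltage step
COARSE_STEP_MV = 40    # coarse tuning voltage step


def ladder_taps(step_mv, v_min_mv=LADDER_MIN_MV, v_max_mv=LADDER_MAX_MV):
    """Tap voltages in mV, from v_min up to the highest tap not above v_max."""
    return list(range(v_min_mv, v_max_mv + 1, step_mv))


fine_taps = ladder_taps(FINE_STEP_MV)
coarse_taps = ladder_taps(COARSE_STEP_MV)

print(len(fine_taps), fine_taps[0], fine_taps[-1])
print(len(coarse_taps), coarse_taps[0], coarse_taps[-1])
```

Under this assumption, the 20mV grid yields the 36 DAC levels mentioned above, and the 40mV grid ends at 1.18V, so the coarse setting can be addressed with a 5-bit code.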

![Diagram](image)

**Figure 54.** Offset calibration system for the flash ADC (single-ended equivalent).

Figure 55 shows offset coverage for the seven comparators when sweeping the coarse voltage from 500mV to 1.18V for both positive and negative offset polarities with steps of 40mV, while the fine voltage is set to minimal and maximal values (500mV and 1.2V). The zigzag characteristic of the plots is due to overlaps of the offset compensation regions defined by the digital coarse codes. The compensation range is designed to have enough overlaps to ensure that no offset value is missed by the calibration. Each comparator has a different coverage range for the same control code because the offset shift through current injection depends on the imbalance of a comparator’s differential reference voltages (VRN, VRP in Figure 34). The top and bottom comparators (±375mV reference levels) have the widest coverage range, while the middle comparator (0mV reference level) has the least coverage as evident from Figure 55.

![Figure 55. Offset calibration ranges of the flash comparators.](image)

The calibration scheme is structured to mimic the hybrid ADC’s normal operation. During the calibration mode, all switches between the main SHDACs (1-4) and the flash ADC input in Figure 54 are opened. Instead, the CLK\_SAMP\_CAL signal activates the sampling in the calibration channel. The calibration is executed serially for each comparator and the coarse/fine tuning transistors inside the comparator. The counter bits from the control logic set the switches of the resistor ladder DAC to generate the corresponding reference voltage for each comparator. This DC input voltage is applied through a buffer to be sampled by SHDAC CAL. The unity-gain buffer in Figure 54 is required to drive the SHDAC and to isolate the input reference generation DAC from the kickback noise of the flash ADC. A rail-to-rail operational amplifier with low output resistance similar to [113] was designed for this buffer in the calibration path. The amplifier has an open-loop gain of 64.6dB, a bandwidth of 690kHz, and a 44° phase margin. It consumes 4.1mW with a 1.2V supply, but is powered down during the ADC’s normal mode of operation. The outputs of the comparators are latched with DFFs that are clocked by CLK\_SAMP\_CAL\_B, which is a non-overlapped inverted version of CLK\_SAMP\_CAL.
At the beginning of each cycle, the proper differential input reference level is set for the comparator under calibration. Next, the polarity of the offset is determined according to the latched comparator output. Afterwards, the coarse calibration process begins by sweeping the 5-bit coarse code to set the switches for the coarse calibration voltage with steps of 40mV. The coarse calibration stops when the comparator output flips to a different state from the initially detected polarity, which triggers the start of the fine calibration. If the sweep reaches the maximum coarse calibration voltage and the output does not flip, then the coarse calibration stops and fine calibration begins automatically. Otherwise, the stored code is set to one code before the flipping of the comparator output, such that the comparator returns to its original polarity. Next, the gates of the coarse tuning transistors in the comparator are set to voltages corresponding to the code at which the coarse calibration stopped. Then, the fine calibration proceeds in the same manner but with voltage steps of 20mV. At the end of the fine calibration, a “calib_done” signal resets all state machines and calibration codes, and starts the next comparator calibration. The control block also increments the counter to activate different switches at the resistor ladder that generates the differential reference voltage for the next comparator. In each cycle, the calibration codes are stored in the memory, which directly sets the switches of the coarse and fine calibration DACs. It was observed from Monte Carlo simulations that a minimum of two consecutive calibrations are required to obtain correct offset compensation codes because the calibration converges to smaller residual offset when the systematic offset at the start of the calibration cycle has already been reduced through a prior cycle. To ensure reliable operation, the calibration was designed to sequentially cycle through the comparators three times.
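The coarse/fine search described above can be sketched behaviorally as follows. This is a simplified model under stated assumptions, not the implemented control logic: the correction is assumed to be linear in the code (23mV per coarse code, 1.4mV per fine code), the comparator is ideal and noiseless, the fine search is assumed to step back one code like the coarse search, and all names are hypothetical.

```python
COARSE_STEP_MV = 23.0   # coarse offset step from the text
FINE_STEP_MV = 1.4      # fine offset step (text: slightly smaller than 1.4 mV)
COARSE_CODES = 18       # assumed number of coarse codes (500 mV to 1.18 V)
FINE_CODES = 36         # assumed number of fine codes (500 mV to 1.2 V)


def comparator(offset_mv, correction_mv):
    """Ideal comparator at its reference level: True if residual offset > 0."""
    return (offset_mv - correction_mv) > 0


def sweep(offset_mv, polarity, base_mv, step_mv, n_codes):
    """Raise the code until the output flips; return the code before the flip.

    If the output never flips, the last code is returned.
    """
    sign = 1.0 if polarity else -1.0
    for code in range(n_codes):
        correction = sign * (base_mv + code * step_mv)
        if comparator(offset_mv, correction) != polarity:
            return max(code - 1, 0)
    return n_codes - 1


def calibrate(offset_mv):
    """Return (coarse_code, fine_code, residual_offset_mv) for one comparator."""
    polarity = comparator(offset_mv, 0.0)          # detect offset polarity
    coarse = sweep(offset_mv, polarity, 0.0, COARSE_STEP_MV, COARSE_CODES)
    fine = sweep(offset_mv, polarity, coarse * COARSE_STEP_MV,
                 FINE_STEP_MV, FINE_CODES)
    sign = 1.0 if polarity else -1.0
    correction = sign * (coarse * COARSE_STEP_MV + fine * FINE_STEP_MV)
    return coarse, fine, offset_mv - correction


for offset in (50.0, -30.0, 120.0):
    print(offset, calibrate(offset))
```

In this idealized model, the residual offset after one pass is bounded by one fine step. The real system additionally has to deal with noise and with the dependence of the systematic offset on the starting condition, which is why multiple passes are used.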

Since the calibration involves settling times and the calibration logic operates with a clock frequency of 10MHz that is much lower than the ADC sampling clock, the simulations with the AMS simulator in Cadence require a long time. For this reason, the DAC and calibration logic were implemented with Verilog-A modules for Monte Carlo simulations, while the other circuits and components are on the transistor level. Figure 56 presents the histogram of the offset values before and after calibration for the middle comparator (with differential reference of 0V) from 100 Monte Carlo simulation runs of the calibration system with transient noise enabled. The offset standard deviation of this comparator was 11.02mV before calibration, which reduced to 396µV after calibration. For the other six comparators, the offset standard deviations before calibration were 21.67mV, 13.65mV, 11.17mV, 12.04mV, 14.38mV and 24.79mV, which reduced after calibration to 0.79mV, 0.42mV, 0.40mV, 0.41mV, 0.66mV and 0.13mV, respectively. To consider possible variations in supply voltage and/or temperature after the foreground calibration, the flash ADC was simulated for all voltage and temperature corners while using the same calibration codes as from the nominal case (1.2V, 27°C). The resulting maximum offset of -2.31mV was observed at 85°C with 1.14V.

![Figure 56. Histograms of the comparator offsets from 100 Monte Carlo runs in the presence of transient noise (a) before and (b) after calibration.](image)

A manual calibration feature was also incorporated to be able to read and write the coarse and fine calibration codes off-chip. In manual calibration mode, the memory in Figure 54 is controlled externally. The write function is performed by setting the address of the memory location and setting the calibration codes externally on the data line.

### 4.11 Hybrid ADC Post-Layout Simulation Results

The proposed hybrid ADC was designed in 130nm CMOS technology and simulated with Cadence Spectre. All circuits and sub-systems of the ADC’s analog core as well as the digital CMOS blocks were designed on the transistor level and layout level. To simulate the ADC for ENOB and SFDR evaluation, a differential full-scale (1V<sub>p-p</sub>) sinusoid signal from a source with 50 Ω resistance was applied to the ADC inputs through a single-ended to differential balun model with 50Ω termination at each output for impedance matching. Figure 57 shows the dynamic performance of the hybrid ADC from the schematic simulation results for different input frequencies with transient noise enabled. The ADC has a promising SNDR and SFDR over the complete Nyquist bandwidth.

![Figure 57. Hybrid ADC dynamic performance at f_s = 1GS/s vs. input frequency.](image)

The simulation results for different process corners are summarized in Table 4, which reveal an estimated worst-case ENOB of 7.43 in the SS process corner at nominal voltage and temperature. The ADC was also simulated across all possible combinations of process corner cases (SS, TT, FF), supply voltages (1.14V, 1.2V, 1.26V; assuming ±5% variation), and temperatures (-10°C, 27°C, 85°C) without recalibrating for each corner case. The results in Table 5 reveal a worst-case ENOB of 7.12 (SS, 1.26V, -10°C) with a near Nyquist rate input signal.

Table 4. Hybrid ADC ENOB and SFDR for different process corner cases

<table>
<thead>
<tr>
<th>Process Corner</th>
<th>TT</th>
<th>SS</th>
<th>FF</th>
</tr>
</thead>
<tbody>
<tr>
<td>ENOB (L)</td>
<td>7.83</td>
<td>7.43</td>
<td>7.80</td>
</tr>
<tr>
<td>ENOB (NQ)</td>
<td>7.71</td>
<td>7.61</td>
<td>7.58</td>
</tr>
<tr>
<td>SFDR (dB) (L)</td>
<td>60.20</td>
<td>52.87</td>
<td>63.19</td>
</tr>
<tr>
<td>SFDR (dB) (NQ)</td>
<td>61.59</td>
<td>59.15</td>
<td>59.30</td>
</tr>
</tbody>
</table>

* L: \( f_{\text{in}} = 2.93\text{MHz} \); NQ: \( f_{\text{in}} = 491.2\text{MHz} \)

Table 5. Detailed simulation results of the hybrid ADC for all PVT corner cases

<table>
<thead>
<tr>
<th>Corner</th>
<th>Models</th>
<th>VDD (V), Temp. (°C)</th>
<th>Low f_{in} (2.9MHz)</th>
<th>High f_{in} (492MHz)</th>
</tr>
</thead>
<tbody>
<tr>
<td></td>
<td></td>
<td></td>
<td>SNDR</td>
<td>ENOB</td>
</tr>
<tr>
<td>FF_0</td>
<td>ff</td>
<td>1.14 -10</td>
<td>40.73</td>
<td>7.513</td>
</tr>
<tr>
<td>FF_1</td>
<td>ff</td>
<td>1.14 27</td>
<td>45.82</td>
<td>7.533</td>
</tr>
<tr>
<td>FF_2</td>
<td>ff</td>
<td>1.14 85</td>
<td>48.85</td>
<td>7.622</td>
</tr>
<tr>
<td>FF_3</td>
<td>ff</td>
<td>1.2 -10</td>
<td>45.92</td>
<td>7.533</td>
</tr>
<tr>
<td>FF_4</td>
<td>ff</td>
<td>1.2 27</td>
<td>48.73</td>
<td>7.803</td>
</tr>
<tr>
<td>FF_5</td>
<td>ff</td>
<td>1.2 85</td>
<td>48.64</td>
<td>7.522</td>
</tr>
<tr>
<td>FF_6</td>
<td>ff</td>
<td>1.26 -10</td>
<td>45.06</td>
<td>7.761</td>
</tr>
<tr>
<td>FF_7</td>
<td>ff</td>
<td>1.26 27</td>
<td>45.43</td>
<td>7.763</td>
</tr>
<tr>
<td>FF_8</td>
<td>ff</td>
<td>1.26 85</td>
<td>48.22</td>
<td>7.718</td>
</tr>
<tr>
<td>SS_0</td>
<td>ss</td>
<td>1.14 -10</td>
<td>45.03</td>
<td>7.196</td>
</tr>
<tr>
<td>SS_1</td>
<td>ss</td>
<td>1.14 27</td>
<td>45.05</td>
<td>7.495</td>
</tr>
<tr>
<td>SS_2</td>
<td>ss</td>
<td>1.14 85</td>
<td>45.10</td>
<td>7.370</td>
</tr>
<tr>
<td>SS_3</td>
<td>ss</td>
<td>1.2 -10</td>
<td>45.67</td>
<td>7.293</td>
</tr>
<tr>
<td>SS_4</td>
<td>ss</td>
<td>1.2 27</td>
<td>45.67</td>
<td>7.481</td>
</tr>
<tr>
<td>SS_5</td>
<td>ss</td>
<td>1.2 85</td>
<td>47.54</td>
<td>7.585</td>
</tr>
<tr>
<td>SS_6</td>
<td>ss</td>
<td>1.26 -10</td>
<td>45.25</td>
<td>7.224</td>
</tr>
<tr>
<td>SS_7</td>
<td>ss</td>
<td>1.26 27</td>
<td>46.69</td>
<td>7.464</td>
</tr>
<tr>
<td>SS_8</td>
<td>ss</td>
<td>1.26 85</td>
<td>47.39</td>
<td>7.579</td>
</tr>
<tr>
<td>TT_0</td>
<td>tt</td>
<td>1.14 -10</td>
<td>45.05</td>
<td>7.765</td>
</tr>
<tr>
<td>TT_1</td>
<td>tt</td>
<td>1.14 27</td>
<td>45.53</td>
<td>7.776</td>
</tr>
<tr>
<td>TT_2</td>
<td>tt</td>
<td>1.14 85</td>
<td>48.81</td>
<td>7.616</td>
</tr>
<tr>
<td>TT_3</td>
<td>tt</td>
<td>1.2 -10</td>
<td>45.38</td>
<td>7.741</td>
</tr>
<tr>
<td>TT_4</td>
<td>tt</td>
<td>1.2 27</td>
<td>48.87</td>
<td>7.526</td>
</tr>
<tr>
<td>TT_5</td>
<td>tt</td>
<td>1.2 85</td>
<td>47.73</td>
<td>7.006</td>
</tr>
<tr>
<td>TT_6</td>
<td>tt</td>
<td>1.26 -10</td>
<td>47.78</td>
<td>7.541</td>
</tr>
<tr>
<td>TT_7</td>
<td>tt</td>
<td>1.26 27</td>
<td>45.45</td>
<td>7.755</td>
</tr>
<tr>
<td>TT_8</td>
<td>tt</td>
<td>1.26 85</td>
<td>48.55</td>
<td>7.774</td>
</tr>
</tbody>
</table>
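
The 27 corner combinations of Table 5 follow from the three process models, three supply voltages, and three temperatures stated in the text. A minimal sketch of this enumeration is given below; the naming scheme (FF_0 to TT_8) mirrors the table labels, while the function name and the loop order (supply voltage outer, temperature inner) are assumptions inferred from the table.

```python
from itertools import product

PROCESS = ("ff", "ss", "tt")    # order in which Table 5 lists the corners
VDD_V = (1.14, 1.20, 1.26)      # 1.2 V nominal with +/-5 % variation
TEMP_C = (-10, 27, 85)          # simulated temperatures in degrees Celsius


def pvt_corners():
    """Return (label, model, vdd, temperature) for every PVT combination."""
    corners = []
    for proc in PROCESS:
        for idx, (vdd, temp) in enumerate(product(VDD_V, TEMP_C)):
            corners.append((f"{proc.upper()}_{idx}", proc, vdd, temp))
    return corners


corners = pvt_corners()
print(len(corners))
for corner in corners[:3]:
    print(corner)
```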

The layout of the ADC core and digital offset calibration was completed in 130nm CMOS technology as shown in Figure 58. The hybrid ADC occupies 1600µm × 450µm, including the flash and CABS ADCs, SHDACs, unity-gain buffers, clock generation and all digital circuitry. The on-chip digital calibration circuitry occupies 1300µm × 750µm (0.68mm² after subtracting the 30% empty space), which includes the calibration logic, DACs, low-bandwidth unity-gain buffers, and one extra SHDAC. While the choice of a CABS ADC within the hybrid ADC has speed and power benefits, it can be observed from Table 6 that this particular design occupies a relatively large layout area. In this prototype, the 5-bit CABS ADC (each: 180µm × 560µm) has not been optimized for area efficiency. Instead, a conservative floor plan was implemented to route signals with minimal crossings of wires and minimal routing over active devices. Furthermore, note that designing a CABS ADC in technologies with shorter channel length results in significant area reduction. For example, the 6-bit 250MS/s CABS ADC in [33] only occupies 60µm × 200µm in 90nm CMOS technology.

![Figure 58. Layout of the hybrid ADC.](image)

The calibration circuits were synthesized from Verilog HDL code and verified by comparing post-layout simulation results of the standalone calibration path with the results from the initial Verilog-A modules. Due to simulation resource constraints, the automatic flash offset calibration was verified by simulating the hybrid ADC core layout with extracted parasitics in a test setup with the Verilog-A modules. Afterwards, the optimized codes were set for the simulation of the complete hybrid ADC layout with extracted parasitics in a different testbench. Figure 59 displays the output spectra of the hybrid ADC from post-layout simulations at a sampling rate of 1GS/s with a low-frequency input and a high-frequency input (transient noise enabled). Furthermore, the impact of flash ADC comparator offsets on the overall ADC performance was evaluated. Figure 59 also contains the output spectra before offset calibration, demonstrating that the ENOB improvement through calibration is more than 1 bit. The ADC achieves an ENOB of 7.37 and a SFDR of 56.73dB with an input close to the Nyquist frequency (Figure 59 (b)) based on this post-layout simulation with clock signal generation circuits and loading from the extra SHDAC for calibration.
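
The relation between SNDR and ENOB used for such results can be checked with a short sketch. It assumes the standard definition ENOB = (SNDR − 1.76dB)/6.02dB, which is not spelled out in this section; the SNDR values are the near-Nyquist entries listed for this work in Table 6.

```python
def enob_from_sndr(sndr_db):
    """Effective number of bits from SNDR in dB (standard definition)."""
    return (sndr_db - 1.76) / 6.02


def sndr_from_enob(enob_bits):
    """SNDR in dB that corresponds to a given ENOB."""
    return 6.02 * enob_bits + 1.76


SNDR_SCHEMATIC_DB = 48.15    # schematic simulation, near-Nyquist input
SNDR_POST_LAYOUT_DB = 46.16  # post-layout simulation, near-Nyquist input

print(f"schematic ENOB:   {enob_from_sndr(SNDR_SCHEMATIC_DB):.3f}")
print(f"post-layout ENOB: {enob_from_sndr(SNDR_POST_LAYOUT_DB):.3f}")
```

Both values agree with the reported ENOBs of 7.71 and 7.37 to within 0.01 bit.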

![Output spectra before offset calibration](image)

Figure 59. Output spectra (1024-point FFT) of the 8-bit 1GS/s hybrid ADC from post-layout simulation: (a) $f_{in} = 6.84MHz$, (b) $f_{in} = 491.2MHz$ with and without flash offset calibration.

The impact of mismatches in the unity-gain buffers on the performance of the hybrid ADC was assessed with Monte Carlo simulations. A worst-case ENOB of 7.34 was observed, in which the degradation is mainly caused by the buffer offset and gain mismatches among the TI channels. To evaluate the robustness of the correction method introduced in Section 4.2, extreme cases with ±5mV error (>1LSB = 4mV) in the SHDAC reference voltages were simulated, which is equal to ±0.01 variation in terms of the scaling factor “g”. The minimum ENOB resulting from the described condition was 7.48, which is comparable to the nominal case (i.e., 7.71 from Table 4).

From post-layout simulations with $F_s = 1$GHz, the analog and mixed-signal core of the ADC consumes 8.18mW from a 1.2V analog supply; including the SHDACs, bootstrap switches, flash ADC, CABS ADCs, and buffers. The power consumption of all digital circuits is 1.36mW from a separate 1.2V digital supply; which includes the DFF sets, control logic for the SHDACs, and the thermometer-to-binary output encoders for the flash ADC. The power consumption of the clock generation circuits is 3.75mW. Figure 60 shows a diagram that visualizes the power dissipation breakdown. The total power consumption of the 8-bit 1GS/s hybrid ADC is 13.29mW. Based on the post-layout simulations, the ADC has a figure-of-merit [$\text{FoM} = \text{Power} / (2^\text{ENOB} \times F_s)$] of 80.3fJ/conv. step for the near-Nyquist input frequency. The estimated power consumption of the calibration system is 600µW when it is activated with a 10MHz clock. Since the offset calibration is deactivated during normal operation, its power consumption was not included in the total ADC power.
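
The power breakdown and the figure-of-merit above can be reproduced with a few lines. The sketch only uses the numbers stated in this paragraph; the dictionary keys and function name are chosen for illustration.

```python
POWER_MW = {
    "analog and mixed-signal core": 8.18,
    "digital circuits": 1.36,
    "clock generation": 3.75,
}
ENOB_NYQUIST = 7.37   # post-layout ENOB with a near-Nyquist input
FS_HZ = 1e9           # sampling rate


def figure_of_merit_fj(power_w, enob_bits, fs_hz):
    """FoM = Power / (2^ENOB * Fs), returned in fJ per conversion step."""
    return power_w / (2.0 ** enob_bits * fs_hz) * 1e15


total_mw = sum(POWER_MW.values())
fom_fj = figure_of_merit_fj(total_mw * 1e-3, ENOB_NYQUIST, FS_HZ)

for name, p_mw in POWER_MW.items():
    print(f"{name}: {p_mw:.2f} mW ({100.0 * p_mw / total_mw:.1f} %)")
print(f"total: {total_mw:.2f} mW, FoM: {fom_fj:.1f} fJ/conv. step")
```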

![Figure 60. Breakdown of the simulated power consumptions in the hybrid ADC.](image)

### 4.12 Interpretation of the Hybrid ADC Simulation Results

Table 6 contains an overview of the hybrid ADC specifications in comparison to state-of-the-art ADCs having similar resolution and speed. The proposed ADC in 130nm CMOS technology is among the ADCs with relatively low FoM due to its power efficiency. It is noteworthy that the ADCs with lower FoM were designed in 28nm, 45nm, and 65nm CMOS technologies with the benefit of transistors that have higher transition frequencies \(f_T\) and considerably lower parasitic capacitances. In a technology with shorter channel length, one can expect a power reduction and FoM improvement for the presented hybrid ADC. On the other hand, most other results listed in Table 6 are measurement results. The circuits of the presented ADC were overdesigned with margins and simulated to consider impacts of PVT variations. Nevertheless, some performance degradation is expected after fabrication due to non-idealities such as offset, gain, and timing mismatches between channels in time-interleaved architectures [25]. In particular, the effect of timing skews among the sampling clocks can degrade SNDR for higher input frequencies. Fortunately, such non-idealities are fairly well-known, and can be calibrated with on-chip [4], [23], [37] or off-chip techniques [24], [36]. In comparison to the specifications from other works in Table 6, the simulation results of the proposed ADC architecture provide a promising proof-of-concept for its feasibility.

To fairly assess the overall efficiency of this hybrid ADC architecture, it can be compared to other ADC architectures designed under similar technology constraints. The ADCs in Table 7 were all designed in 130nm and 90nm CMOS technologies with similar resolutions and sampling rates. The ADCs with flash architectures [9], [101], [114] have a minimum FoM of 1386 fJ/conv.step for resolutions ranging from 6-bit to 8-bit and sampling rates from 1.2GS/s to 1.6GS/s in 130nm CMOS. The FoM reported for the folding ADC architecture in [102] (7-bit, 0.8GS/s) is 3834 fJ/conv.step. Among the two-step TI architectures in [10], [104] (both 6-bit 1GS/s), a minimum FoM of 1239 fJ/conv.step was reported. The 7-bit TI-pipelined ADC in [96] has a FoM of 462 fJ/conv.step with 1.1GS/s operation. For reported ADCs having 6-bit to 9-bit resolution and 0.6GS/s to 1.25GS/s sampling rates, TI-SAR architectures [3], [99], [115] have the tendency to be the most power-efficient designs, reaching a FoM down to 215.1 fJ/conv.step in 130nm CMOS. The hybrid ADC architecture presented in this work has a FoM of 80.3 fJ/conv.step, which compares favorably to the other architectures in 130nm and 90nm CMOS technologies.

Table 6. Performance summary and comparison

<table>
<thead>
<tr>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
</tr>
</thead>
<tbody>
<tr>
<td></td>
<td>Schematic</td>
<td>Post-layout</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>Sampling Rate (GS/s)</td>
<td>1</td>
<td>1</td>
<td>1.25</td>
<td>1.5</td>
<td>1.62</td>
<td>0.75</td>
<td>1</td>
<td>1</td>
<td>1.6</td>
<td>0.6</td>
<td>1</td>
<td>1.4</td>
<td></td>
</tr>
<tr>
<td>Resolution (bit)</td>
<td>8</td>
<td>8</td>
<td>6</td>
<td>6</td>
<td>7</td>
<td>9</td>
<td>8</td>
<td>8</td>
<td>10</td>
<td>10</td>
<td>6</td>
<td>8</td>
<td>7</td>
</tr>
<tr>
<td>CMOS Technology (nm)</td>
<td>130</td>
<td>130</td>
<td>55</td>
<td>65</td>
<td>90</td>
<td>40</td>
<td>28</td>
<td>65</td>
<td>65</td>
<td>45</td>
<td>130</td>
<td>65</td>
<td>45</td>
</tr>
<tr>
<td>ENOB @ NQ³</td>
<td>7.71</td>
<td>7.37</td>
<td>6.19</td>
<td>5.0</td>
<td>5.25</td>
<td>6.05</td>
<td>7.68</td>
<td>6.9</td>
<td>6.84</td>
<td>8.25</td>
<td>9.03</td>
<td>5.02</td>
<td>7.6</td>
</tr>
<tr>
<td>SNDR @ NQ (dB)</td>
<td>48.15</td>
<td>46.16</td>
<td>39</td>
<td>32</td>
<td>33.4</td>
<td>38.2</td>
<td>48</td>
<td>43.3</td>
<td>42.98</td>
<td>51.4</td>
<td>56.1</td>
<td>32</td>
<td>47.5</td>
</tr>
<tr>
<td>SFDR @ NQ (dB)</td>
<td>61.59</td>
<td>56.73</td>
<td>53</td>
<td>35</td>
<td>41.03</td>
<td>46.6</td>
<td>62</td>
<td>57.5</td>
<td>58.56</td>
<td>59.1</td>
<td>61.2</td>
<td>46</td>
<td>57.8</td>
</tr>
<tr>
<td>Supply Voltage (V)</td>
<td>1.2</td>
<td>1.2</td>
<td>1.2</td>
<td>1.2</td>
<td>1.2</td>
<td>1.1</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>1.1</td>
<td>1.2</td>
<td>1.2</td>
<td>1.15</td>
</tr>
<tr>
<td>Power (mW)</td>
<td>13.4</td>
<td>13.29</td>
<td>16</td>
<td>32</td>
<td>62</td>
<td>204</td>
<td>93</td>
<td>4.5</td>
<td>3.8</td>
<td>18.9</td>
<td>17.3</td>
<td>5.3</td>
<td>33.24</td>
</tr>
<tr>
<td>Area (mm²)</td>
<td>-</td>
<td>0.72⁵</td>
<td>1.4⁶</td>
<td>0.2</td>
<td>0.09</td>
<td>0.3</td>
<td>1.2</td>
<td>0.83</td>
<td>0.004</td>
<td>0.013</td>
<td>0.78</td>
<td>0.36</td>
<td>0.12</td>
</tr>
<tr>
<td>FoM⁴ @ NQ (fJ/conv. step)</td>
<td>64.4</td>
<td>80.3</td>
<td>219</td>
<td>800</td>
<td>1629</td>
<td>2053</td>
<td>283</td>
<td>50.2</td>
<td>33.2</td>
<td>62.3</td>
<td>21</td>
<td>272</td>
<td>390</td>
</tr>
</tbody>
</table>

1: simulation results, 2: measurement results, 3: NQ = Nyquist-rate input frequency, 4: \( \text{FoM} = \text{Power} / (2^{\text{ENOB @ NQ}} \times F_s) \), 5: ADC core area only, 6: total area with calibration circuitry

Table 7. Specification summary of high-speed ADCs designed in 130nm and 90nm CMOS technologies

<table>
<thead>
<tr>
<th>Ref.</th>
<th>Architecture</th>
<th>Technology (CMOS)</th>
<th>Resolution (bit)</th>
<th>fs (GHz)</th>
<th>SNDR @ NQ (dB)</th>
<th>ENOB @ freq.</th>
<th>Power (mW)</th>
<th>FoM @ NQ (fJ/conv.step)</th>
</tr>
</thead>
<tbody>
<tr>
<td>[9]</td>
<td>Flash</td>
<td>130nm</td>
<td>6</td>
<td>1.6</td>
<td>31.0</td>
<td>4.86</td>
<td>180</td>
<td>3874</td>
</tr>
<tr>
<td>[114]</td>
<td>Flash</td>
<td>130nm</td>
<td>6</td>
<td>1.2</td>
<td>35.5</td>
<td>5.60</td>
<td>160</td>
<td>2749</td>
</tr>
<tr>
<td>[101]</td>
<td>Flash</td>
<td>90nm</td>
<td>8</td>
<td>1.25</td>
<td>43.3</td>
<td>6.90</td>
<td>207</td>
<td>1387</td>
</tr>
<tr>
<td>[102]</td>
<td>Folding</td>
<td>90nm</td>
<td>7</td>
<td>0.8</td>
<td>33.6</td>
<td>5.29</td>
<td>120</td>
<td>3834</td>
</tr>
<tr>
<td>[10]</td>
<td>Two-Step-TI</td>
<td>130nm</td>
<td>6</td>
<td>1</td>
<td>33.7</td>
<td>5.31</td>
<td>49</td>
<td>1239</td>
</tr>
<tr>
<td>[104]</td>
<td>Two-Step-TI</td>
<td>90nm</td>
<td>6</td>
<td>1</td>
<td>33.8</td>
<td>5.32</td>
<td>55</td>
<td>1375</td>
</tr>
<tr>
<td>[96]</td>
<td>TI-pipelined</td>
<td>90nm</td>
<td>7</td>
<td>1.1</td>
<td>36.1</td>
<td>5.70</td>
<td>92</td>
<td>1609</td>
</tr>
<tr>
<td>[3]</td>
<td>TI-SAR</td>
<td>130nm</td>
<td>6</td>
<td>1.25</td>
<td>32.0</td>
<td>5.0</td>
<td>32</td>
<td>800</td>
</tr>
<tr>
<td>[115]</td>
<td>TI-SAR</td>
<td>130nm</td>
<td>9</td>
<td>0.6</td>
<td>43.0</td>
<td>6.85</td>
<td>23.6</td>
<td>340</td>
</tr>
<tr>
<td>[99]</td>
<td>TI-SAR</td>
<td>130nm</td>
<td>6</td>
<td>0.6</td>
<td>32.0</td>
<td>5.02</td>
<td>5.3</td>
<td>272</td>
</tr>
<tr>
<td>This work</td>
<td>Subrange-TI</td>
<td>130nm</td>
<td>8</td>
<td>1</td>
<td>46.16</td>
<td>7.37</td>
<td>13.29</td>
<td>80.3</td>
</tr>
</tbody>
</table>

### 4.13 Summary

A low-power high-speed hybrid ADC has been proposed and described in Chapters 3 and 4 of this dissertation. The subranging time-interleaved architecture is comprised of a 3-bit flash ADC converting at full speed in the first stage, and four time-interleaved 5-bit CABS ADCs in the second stage. Sharing of the high-speed flash ADC between time-interleaved channels together with utilizing high-speed power-efficient CABS ADCs resulted in high input bandwidth and conversion rate with relatively low power consumption. A novel sample-and-hold and capacitive digital-to-analog converter (SHDAC) was constructed to perform the sampling and residue generation for the subranging operation in each channel. The analysis in this dissertation revealed how the proposed design approach alleviates linearity errors during the residue generation. Furthermore, the sampling network configuration incorporates an error reduction technique to alleviate the clock feedthrough of bootstrap switches. The offsets of the comparators in the flash ADC are calibrated in the foreground using a built-in reference signal via an extra sampling channel. A prototype ADC was designed and simulated in 130nm CMOS technology. Based on post-layout simulations, it achieves an ENOB above 7.37 over the Nyquist bandwidth while operating at 1GS/s. The power consumption of the ADC is 13.3mW from 1.2V analog and digital supplies.

### 5. Hybrid ADC Testing and Measurement Results

The input/output interface circuits, printed circuit board (PCB) design, and the experimental setup of the hybrid ADC are described in this chapter. Measurement results are provided and interpreted through discussions.

### 5.1 Bit Alignment Unit

To acquire the digital outputs of the time-interleaved hybrid ADC (Figure 20), the outputs of the multiple channels can be read as one multiplexed channel or as multiple (separated) channels. Multiplexing the channels into one high-frequency (e.g., > 1GHz) channel is possible on the chip. However, sending the data off-chip at such a high rate is very challenging. To avoid this issue, the multiplexed output is often down-sampled to a much lower rate (for example, to less than 50MHz) on the chip before it is transferred off-chip. In this work, the four TI channels are read separately, each at 250MS/s, which is a practical data rate for high-speed interface standards such as low-voltage differential signaling (LVDS). For this purpose, the outputs of channels 1-3 are delayed by additional DFFs for synchronization with channel 4 to ease the acquisition with a logic analyzer, as shown in Figure 61.

Figure 61. Bit alignment unit for the hybrid ADC.

CLKCABS4_B is used as the reference clock for bit alignment, and proper on-chip buffers were inserted to assure its driving capability with the fan-out requirement. Figure 62 displays the synchronized outputs of the bit alignment unit. If the outputs were not synchronized with the clock edge of one channel, then the logic analyzer would have to read the outputs with a four times higher rate in order to account for the latency between the four channels.
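
The data ordering behind this scheme can be illustrated with a small behavioral sketch. It models only how the full-rate sample sequence is distributed over four channels and rebuilt from words captured on a common clock edge; it does not model the DFF timing, and the function names and test pattern are hypothetical.

```python
N_CHANNELS = 4


def split_into_channels(samples, n_channels=N_CHANNELS):
    """Channel k (0-based) converts samples k, k + n, k + 2n, ..."""
    return [samples[k::n_channels] for k in range(n_channels)]


def merge_aligned_channels(channels):
    """Rebuild the full-rate sequence from words captured on a common edge."""
    merged = []
    for words in zip(*channels):   # one aligned word per channel per capture
        merged.extend(words)
    return merged


codes = [(7 * n) % 256 for n in range(16)]   # arbitrary 8-bit test pattern
channels = split_into_channels(codes)        # four streams at one quarter rate
restored = merge_aligned_channels(channels)

print(channels)
print(restored == codes)
```

Because all four words of one conversion group are available on the same clock edge after alignment, the acquisition equipment only has to sample at the per-channel rate.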

Figure 62. Timing of the bit alignment unit’s synchronized outputs.

### 5.2 Low-Voltage Differential Signaling (LVDS) Driver

LVDS is widely used for data communication. In addition to the high-speed switching capability, LVDS also has the benefit of common-mode rejection. Hence, noise and distortion will be partially cancelled out by the differential processing in the LVDS receiver. A conventional on-chip LVDS driver circuit [118] was designed in this work, as shown in Figure 63. Here, $V_{IP,LVDS}$ and $V_{IN,LVDS}$ are the digital CMOS inputs of the LVDS driver coming from the bit alignment unit (Figure 61), and $V_{OP,LVDS}$ and $V_{ON,LVDS}$ are the low-swing differential LVDS outputs going to the pads of the ADC chip. When $V_{IP,LVDS}$ is high (VDD) and $V_{IN,LVDS}$ is low (0V), M1 and M4 are turned on, and M2 and M3 are turned off. Therefore, a current from the PMOS and NMOS tail transistors ($M_{tp}$ and $M_{tn}$) circulates through M1/M4 and an off-chip differential 100Ω termination resistor that is connected between the two outputs. The current through the 100Ω termination resistor is converted to a single-ended voltage signal in the LVDS receiver.

The LVDS driver was evaluated through simulations using a pad and package model (Figure 47) at each output in addition to a 6pF capacitor and 100Ω resistor between the differential outputs to model the off-chip LVDS receiver IC termination. Figure 64 displays the simulated differential output of the LVDS driver. A total of 32 on-chip LVDS drivers are used to transfer the 4-channel 8-bit output data of the ADC. A 1-to-32 PMOS and NMOS current mirror was designed to bias all LVDS drivers using an off-chip reference current that can range from 2mA to 3mA. Based on simulations, the LVDS driver is designed to operate at up to 500Mbps with 300mVp-p to 500mVp-p differential amplitudes, depending on the reference current generated on the PCB.

Figure 63. On-chip LVDS Driver circuit schematic.

Figure 64. Simulated transient differential output waveform of the LVDS driver.

### 5.3 ADC Chip Fabrication and Packaging

The hybrid ADC die was fabricated in 130nm CMOS technology and assembled in a TQFP128 package with a ground plane underneath. The full layout of the ADC is shown in Figure 65. A pad frame with 132 pads was created, where 3 of the pads are bonded to the ground plane of the package and one pad is unused. Due to the required density rules of the technology, the majority of the area on chip was filled with top metal layers.

Figure 66 displays the micrograph of the fabricated ADC. To benefit from the available 4mm×4mm silicon area, most of the unused areas include large decoupling MIM capacitors to reduce the high-frequency noise on sensitive reference and supply voltage lines. All pins on the left and right side of the chip are assigned to 8-bit differential LVDS outputs of channels 1-4 (64 pins in total). Several pads are assigned to power supply and ground to reduce the overall bonding wire inductances in order to minimize the bouncing noise. The pads assigned to the ADC input and clock signals are located on the two opposite sides (bottom and top) of the chip to minimize the coupling between the corresponding bonding wires as well as the on-chip routings.

Figure 66. Micrograph of the fabricated hybrid ADC chip.

The ADC core occupies 0.69mm², including all SHDACs, CABS ADCs, flash ADC, bootstrap switches, DFFs, and thermometer-to-binary encoders. The clock generation circuits occupy 0.03mm², including the on-chip clock buffer that is only added for the prototype test interface. Combined, the ADC core and clock generation occupy an area of 0.72mm². The digital calibration occupies 0.71mm², which includes the calibration logic, DAC, test signal generation, and extra calibration channel (SHDAC, unity-gain buffer).

### 5.4 Hybrid ADC Test Setup

### 5.4.1 Interface circuits for the input and clock signals of the ADC

As shown in Figure 67, an on-chip matching network with two 50Ω high-precision poly resistors ensures impedance matching. The common-mode voltage of the differential input signals is generated off-chip and is buffered through an opamp IC (TI OPA2626) in unity-gain configuration. The opamp has only 1Ω open-loop output resistance, which provides a very low impedance for the reference voltage. The 2Ω series resistor at the output of the opamp improves stability. The single-ended input signal is generated with an RF signal generator (Keysight N5173B). The tunable band-pass filter (BPF) serves as an antialiasing filter to ensure removal of the non-desired harmonics of the sinusoidal input signal. Using a BPF is optional, but highly recommended. The DC blockers remove the DC level from the external signals because the DC level of the inputs is set by the on-chip matching network.

![Figure 67. Test setup configuration at the ADC’s differential inputs.](image)

A single-ended to differential balun (Marki BAL-0006) is used to convert the single-ended input signal to differential signals at the ADC inputs (Figure 67). However, it was
observed that the linearity and amplitude balance of the balun’s differential outputs degrade for frequencies below 50MHz. Therefore, an evaluation board with a single-ended to differential amplifier configuration (TI THS4509 EVM) is used to drive the ADC for that input frequency range (f_{in} < 50MHz), as depicted in Figure 68.

![Figure 68](image)

**Figure 68.** Test setup configuration at the ADC’s differential inputs for low-frequency measurements.

The 1GHz clock signal is generated by a low-jitter RF signal generator (Agilent E8257D), and applied as shown in Figure 69. A bias tee (Mini-Circuits ZFBT-6GW-FT+) is used to set the DC level of the sinusoidal clock signal. The clock signal is terminated on the chip with an AC-coupled 50Ω resistor to assure impedance matching.

![Figure 69](image)

**Figure 69.** Test setup configuration for the single-ended 1GHz input clock signal.

### 5.4.2 ADC output interface

Figure 70 displays the interface to acquire the digital outputs of the hybrid ADC. An LVDS receiver IC (TI LVDT386) converts the differential LVDS outputs to single-ended LVTTL signals before the acquisition with a logic analyzer (Keysight 16852A). This LVDS receiver IC can detect differential voltages as low as 100mV_{p-p} and consists of 16 receiver channels with an integrated 110Ω termination resistor for each channel. The logic analyzer can read the LVTTL signals at sampling rates up to 2.5GS/s, which is 10 times faster than the output rate of each ADC channel.

![Figure 70. Test setup configuration at the ADC’s outputs.](image)

### 5.5 Printed Circuit Board

A custom printed circuit board (PCB) was designed to evaluate the hybrid ADC. As shown in Figure 71, SMA connectors are used for the ADC inputs and clock signal, as well as for the 10MHz clock for the calibration logic. Adjustable voltage regulators (Maxim MAX8526) generate separate 1.2V analog and digital supply voltages for the hybrid ADC chip, as well as a 2.5V supply for the opamp ICs and a 3.3V supply for the LVDS receiver ICs. The reference voltages are generated on-board and delivered to the ADC through unity-gain voltage buffers with a low output impedance, similar to the generation of V_{CM} in Figure 67. For bias voltages that are connected to high-impedance nodes on the ADC chip (gates of MOS transistors), simple voltage dividers were employed without buffering. The ADC can operate in normal mode and calibration mode, which can be selected with an on-board switch. Another switch is used to set and reset the clock generation circuitry. To also allow manual calibration, the PCB contains DIP switches for the 6-bit input data of the coarse and fine codes, 4-bit address lines, and control signals of the calibration logic.

Figure 71. Evaluation board for the hybrid ADC.
5.6 Manual Flash ADC Offset Calibration

As mentioned in Section 4.10, a manual calibration option allows reading and writing the calibration codes in the on-chip memory to control the fine and coarse voltages connected to the calibration transistors in the flash ADC comparators. The range for the 5-bit coarse code is from 00000 to 10010 (0 to 18 in decimal), and the range for the 6-bit fine code is from 000000 to 100011 (0 to 35 in decimal). The on-chip memory has 14 rows of 6-bit codes. The 4-bit memory addresses 0001 to 0111 (1 to 7 in decimal) store the 5-bit coarse code together with the 1-bit polarity, and the addresses 1001 to 1111 (9 to 15 in decimal) store the 6-bit fine code for the corresponding comparators in the flash ADC.
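
The address and code ranges above can be illustrated with a short sketch. The helper below is hypothetical: the text does not state which address belongs to which comparator or where the polarity bit sits in the 6-bit word, so the mapping (comparator n at addresses n and n + 8) and the bit position (polarity as the MSB of the coarse word) are assumptions made only for illustration.

```python
COARSE_MAX = 18  # 5-bit coarse code range 0..18 (from the text)
FINE_MAX = 35    # 6-bit fine code range 0..35 (from the text)


def memory_words(comparator, polarity, coarse, fine):
    """Return {address: 6-bit word} for one flash ADC comparator.

    Assumptions (not stated in the text): comparator n uses address n for
    the coarse word and n + 8 for the fine word, and the polarity bit is
    the MSB of the coarse word.
    """
    if not 1 <= comparator <= 7:
        raise ValueError("comparator must be in 1..7")
    if polarity not in (0, 1):
        raise ValueError("polarity must be 0 or 1")
    if not 0 <= coarse <= COARSE_MAX:
        raise ValueError("coarse code out of range")
    if not 0 <= fine <= FINE_MAX:
        raise ValueError("fine code out of range")
    coarse_word = (polarity << 5) | coarse
    return {comparator: coarse_word, comparator + 8: fine}


# Example with the same field values as a row with polarity 1, coarse 3, fine 4.
for address, word in memory_words(1, 1, 3, 4).items():
    print(f"address {address:04b}: word {word:06b}")
```

Range checks of this kind are useful during manual entry with DIP switches, since a 6-bit field can physically encode values above the valid maximum of 35.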

To manually calibrate the flash ADC, the differential nonlinearity (DNL) and integral nonlinearity (INL) of the hybrid ADC were first measured for the case in which all coarse and fine codes are reset to 0 (no calibration). The polarity and amount of offset were estimated according to the DNL and INL values at the codes with transitions of the 3 MSBs. For the tested ADC chip on the PCB, the optimum coarse and fine codes with corresponding polarity for each comparator in the flash ADC (Figure 33) are listed in Table 8. The impact of the flash offset calibration on ADC performance can be observed from the measurement results in the next section.

<table>
<thead>
<tr>
<th>Comparator</th>
<th>Polarity</th>
<th>Coarse Code</th>
<th>Fine Code</th>
</tr>
</thead>
<tbody>
<tr>
<td>Comparator 1</td>
<td>1</td>
<td>00011 (3)</td>
<td>000100 (4)</td>
</tr>
<tr>
<td>Comparator 2</td>
<td>1</td>
<td>00000 (0)</td>
<td>000010 (2)</td>
</tr>
<tr>
<td>Comparator 3</td>
<td>0</td>
<td>00010 (2)</td>
<td>001011 (11)</td>
</tr>
<tr>
<td>Comparator 4</td>
<td>0</td>
<td>00010 (2)</td>
<td>010000 (16)</td>
</tr>
<tr>
<td>Comparator 5</td>
<td>0</td>
<td>00110 (6)</td>
<td>100000 (32)</td>
</tr>
<tr>
<td>Comparator 6</td>
<td>0</td>
<td>00011 (3)</td>
<td>010100 (20)</td>
</tr>
<tr>
<td>Comparator 7</td>
<td>0</td>
<td>00101 (5)</td>
<td>011000 (24)</td>
</tr>
</tbody>
</table>
5.7 Measurement Results

The ADC performance was measured after proper adjustment of the reference and bias voltages, and with the optimum calibration codes written into the on-chip memory. The sinusoidal input and 1GHz clock signals were applied as described in Section 5.4.1. After the recording of the digital output data from the four channels with the logic analyzer, it was evaluated in MATLAB.

A histogram testing method was employed to evaluate DNL and INL errors of the hybrid ADC [119]. The DNL and INL were measured with a 3.234863 MHz sinusoidal input signal having a swing slightly larger than the full-scale voltage. This swing ensures that the output of the ADC is slightly clipped at the peaks, such that all codes are present in the acquired data. With the histogram testing method, a large number of data points is required to ensure the accuracy of the calculated results. The DNL and INL were statistically calculated in MATLAB by comparing the resulting histogram of the sinusoidal output data to the bathtub-shaped histogram of an ideal ADC with the same sinusoidal signal.
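
A minimal sketch of the sine-wave histogram (code density) calculation is shown below in Python rather than MATLAB. It is not the script used for the measurements. It assumes a zero-offset sinusoid that slightly overdrives the ADC, and it estimates the code transition levels by inverting the cumulative distribution of a sine wave. The function names and the ideal quantizer used to generate test data are hypothetical.

```python
import math


def dnl_inl_from_sine_histogram(codes, n_bits):
    """Estimate DNL and INL (in LSB) from ADC output codes of a sine input.

    Assumes a zero-offset sine that slightly clips the ADC, so that every
    code occurs. Returns values for the inner codes 1 .. 2**n_bits - 2.
    """
    n_codes = 2 ** n_bits
    hist = [0] * n_codes
    for code in codes:
        hist[code] += 1
    total = len(codes)

    # Transition levels normalized to the sine amplitude, obtained by
    # inverting the cumulative distribution of a sine wave.
    transitions = []
    cumulative = 0
    for k in range(n_codes - 1):
        cumulative += hist[k]
        transitions.append(-math.cos(math.pi * cumulative / total))

    # Code widths of the inner codes, normalized to their mean (1 LSB).
    widths = [transitions[k] - transitions[k - 1] for k in range(1, n_codes - 1)]
    lsb = sum(widths) / len(widths)
    dnl = [w / lsb - 1.0 for w in widths]

    # INL as the running sum of the DNL values.
    inl = []
    running = 0.0
    for d in dnl:
        running += d
        inl.append(running)
    return dnl, inl


def ideal_adc_codes(n_bits, n_samples, overdrive=1.05):
    """Hypothetical test data: ideal quantizer driven by a clipped sine."""
    n_codes = 2 ** n_bits
    ratio = (math.sqrt(5.0) - 1.0) / 2.0  # irrational f_in / f_s ratio
    codes = []
    for i in range(n_samples):
        x = overdrive * math.sin(2.0 * math.pi * ratio * i)
        code = int(math.floor((x + 1.0) / 2.0 * n_codes))
        codes.append(min(n_codes - 1, max(0, code)))
    return codes


dnl, inl = dnl_inl_from_sine_histogram(ideal_adc_codes(6, 200000), 6)
print(max(abs(d) for d in dnl), max(abs(v) for v in inl))
```

For an ideal quantizer the reported errors should be close to zero, which gives a quick sanity check before the routine is applied to measured data. The end codes are excluded because they absorb the clipped samples.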

The measured DNL and INL for the 8-bit outputs of the hybrid ADC revealed large nonlinearity errors (DNL of -1/+1.88 LSB_8-bit and INL of -3.58/+2.79 LSB_8-bit after flash ADC calibration), which also limited the dynamic performance (SNDR and SFDR) of the ADC for 8-bit accuracy. In retrospect, the main factor for the large 8-bit DNL/INL is the variation of the offsets of the fabricated CABS ADC comparators caused by random device mismatches. The CABS comparator offsets were estimated under the assumption of correlations between parameters during the Monte Carlo simulations, which was based on the proximity of the devices on the chip and matched layout configurations. However, the measurements revealed that the offsets after fabrication were higher than the estimates. As observed during measurements, removing the last two LSBs of the CABS ADC stage leads to suitable linearity performance for the ADC when evaluated with 6-bit accuracy. For this reason, the results in this section focus mainly on tests with 6-bit resolution (i.e., not using the last two LSB outputs). Figure 72 and Figure 73 display the measured DNL and INL errors of the hybrid ADC with 6-bit equivalence before and after flash ADC calibration, respectively. The results demonstrate that the nonlinearity errors of the hybrid ADC have been significantly reduced by the calibration of the flash ADC. As seen from Figure 73, the 6-bit DNL and INL errors after flash ADC offset calibration are within -0.41/+0.50 LSB and -0.77/+0.52 LSB, respectively.

![Figure 72. Measured DNL and INL of the hybrid ADC before flash ADC calibration (6-bit evaluation).](image1)

![Figure 73. Measured DNL and INL of the hybrid ADC after flash ADC calibration (6-bit evaluation).](image2)
Figure 74 displays the output spectra of the 1GS/s 8-bit hybrid ADC from the captured data with low-frequency \((f_{in} = 10.193\text{MHz})\) and high-frequency \((f_{in} = 493.958\text{MHz})\) sinusoidal full-scale input signals before and after flash ADC offset calibration.

![Figure 74](image)

**Figure 74.** Measured output spectra (8192-point FFT) of the 8-bit 1GS/s hybrid ADC output for (a) \(f_{in} = 10.193\text{MHz}\), (b) \(f_{in} = 493.958\text{MHz}\) before and after flash offset calibration.
As seen from Figure 74, the flash ADC offset calibration improved the ENOB of the hybrid ADC by 2.02 and 1.31 bits for the low and high input frequencies, respectively. The 8-bit hybrid ADC achieves 36.32dB SNDR and 48.06dB SFDR with a low input frequency, and 34.74dB SNDR and 46.03dB SFDR with a near Nyquist rate input frequency. The minimum ENOB of the 8-bit 1GS/s ADC, obtained with a near Nyquist rate input frequency, is 5.48. As explained earlier in this section, the ENOB degradation of the 8-bit ADC is mainly due to the relatively high nonlinearity (DNL and INL) errors for 8-bit accuracy that cause large distortions in the output of the ADC.
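
The ENOB values quoted in this section are consistent with the standard conversion from SNDR for a full-scale sinusoidal input, ENOB = (SNDR - 1.76dB) / 6.02dB. The short sketch below applies this relation to SNDR values reported in this section; the function name is hypothetical.

```python
def enob_from_sndr(sndr_db):
    """Effective number of bits for a full-scale sine input."""
    return (sndr_db - 1.76) / 6.02


cases = [
    ("8-bit, near Nyquist", 34.74),
    ("6-bit, low frequency", 34.94),
    ("6-bit, near Nyquist", 33.42),
]
for label, sndr_db in cases:
    print(f"{label}: {enob_from_sndr(sndr_db):.2f} bits")
```

With these inputs the relation reproduces the reported ENOB values of 5.48, 5.51, and 5.26 bits.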
Figure 75 shows the measured output spectra of the hybrid ADC with 6-bit evaluation at a sampling rate of 1GS/s with low-frequency and high-frequency (near Nyquist rate) sinusoidal full-scale input signals before and after flash ADC offset calibration.

Figure 75. Measured output spectra (8192-point FFT) of the 6-bit 1GS/s hybrid ADC output for (a) $f_{in} = 10.193$MHz, (b) $f_{in} = 493.958$MHz before and after flash offset calibration.
As observed from the measurements (Figure 75), by dropping the last two LSBs from the 8-bit output (i.e., 6-bit evaluation), the ENOB at the Nyquist rate only degraded by 0.22 bits in comparison to the 8-bit case (Figure 74). The flash offset calibration has improved the ENOB of the 6-bit hybrid ADC output by 1.87 and 1.22 bits for low and high input frequencies, respectively. The 6-bit 1GS/s hybrid ADC achieves 34.94dB SNDR and 48.52dB SFDR with a low input frequency, and 33.42dB SNDR and 45.71dB SFDR with a near Nyquist rate input frequency. The 6-bit evaluation of the 1GS/s hybrid ADC revealed an ENOB of 5.26 with a near Nyquist rate input frequency. The undesired frequency component at 250MHz in the output spectra originates from the offset mismatches among the TI channels. Furthermore, it can be observed that the components caused by the timing mismatch between TI channels (equation (9)) increase with high input frequencies (Figure 75(b)). As discussed in Section 3.1, such non-idealities can be calibrated off-chip or on-chip to enhance the performance of a TI ADC. However, for the evaluations of this work, the post-processing did not involve the calibration of the impacts from time-interleaved channel mismatches.
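
The spur locations mentioned above follow from the standard expressions for an M-channel time-interleaved ADC: offset mismatch produces tones at k·f_s/M, while gain and timing mismatch produce tones at k·f_s/M ± f_in. The sketch below evaluates these for the four-channel 1GS/s case with the near-Nyquist input frequency. The function names are hypothetical, and it is assumed (not verified here) that these expressions agree with the form of equation (9).

```python
def fold_to_nyquist(freq, fs):
    """Alias a frequency into the first Nyquist zone [0, fs/2]."""
    freq = freq % fs
    return fs - freq if freq > fs / 2 else freq


def ti_spurs(fin, fs, channels):
    """Return (offset_spurs, gain_timing_spurs), folded and de-duplicated."""
    offset = set()
    mismatch = set()
    for k in range(1, channels):
        offset.add(round(fold_to_nyquist(k * fs / channels, fs), 6))
        for sign in (1, -1):
            tone = k * fs / channels + sign * fin
            mismatch.add(round(fold_to_nyquist(tone, fs), 6))
    return sorted(offset), sorted(mismatch)


# Frequencies in MHz: f_in = 493.958MHz, f_s = 1GS/s, 4 channels.
offset, mismatch = ti_spurs(fin=493.958, fs=1000.0, channels=4)
print("offset spurs (MHz):", offset)
print("gain/timing spurs (MHz):", mismatch)
```

For these settings the offset tones fall at 250MHz and 500MHz, and the gain/timing tones fall near 6.042MHz, 243.958MHz, and 256.042MHz, which is where residual interleaving spurs would be expected in an uncalibrated spectrum.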

The dynamic performance of the hybrid ADC was also assessed with measurements at various input frequencies over the Nyquist bandwidth using 6-bit equivalence. Figure 76 displays the measured SNDR and SFDR of the ADC at 1GS/s versus input frequency.

Figure 76. Measured SNDR and SFDR of the hybrid ADC vs. input frequency at f_s = 1GS/s (6-bit evaluation).

Figure 77 displays the ENOB of the 6-bit 1GS/s ADC versus input frequency. As can be observed from Figure 77, the ADC has a viable ENOB (>5.26) over the complete Nyquist bandwidth.

![ENOB vs Frequency Graph](image)

Figure 77. Measured ENOB of the hybrid ADC vs. input frequency at $f_s = 1$GS/s (6-bit evaluation).

The total measured analog power is 6.6mW, including the flash ADC (1.05mW), all CABS ADCs (1.19mW), SHDACs with decoders (0.81mW), and the four unity-gain buffers with eight OTAs (3.55mW). The total digital power of 0.86mW includes the DFFs and thermometer-to-binary encoders at the flash ADC output. The measured power consumption of the clock generation circuitry and the clock buffer are 3.54mW and 2.35mW, respectively. Accordingly, the total power consumption of the 8-bit 1GS/s ADC is 11mW, excluding the clock buffer, which was only used for testing. The power measurements were performed at room temperature while applying a 10.2MHz full-scale sinusoidal input signal to the ADC clocked at 1GHz. The power consumption of each circuit was measured by disconnecting its corresponding 1.2V supply voltage jumper and measuring the average current with a multimeter.

For a fair 6-bit evaluation of the hybrid ADC, the power consumption should be adjusted to account for the fact that the last two LSBs of the ADC output are not used. By reducing the CABS ADC resolution to 3 bits, the number of CABS comparators is reduced from 31 to 7. This would result in a significant reduction of the area and input capacitance of the CABS ADC by a factor of 4. Moreover, the number of active comparators in the CABS ADC is reduced from 5 to 3, leading to a 40% reduction of the power consumed by the CABS ADC, which corresponds to approximately 0.5mW. Therefore, the total power consumption of the hybrid ADC for 6-bit evaluation is 10.5mW, which is based strictly on measurement data. It is worth mentioning that, for a 3-bit redesign of the CABS ADC, the total area of the four CABS ADCs would become more than 4 times smaller, resulting in further area savings.

Furthermore, the OTA in the unity-gain buffer was designed to drive the input capacitance of the 5-bit CABS ADC. In case of a 3-bit CABS ADC design, the load capacitance of the buffer would be reduced from 250fF to only 60fF. To estimate the buffer power consumption associated with driving a 3-bit CABS ADC, the OTA in the buffer was simulated with lower bias currents and a 60fF load capacitance. From the simulation results, the same OTA can drive a 60fF load while consuming only 320µW of power, and achieving similar DC gain and GBW (35.6dB, 1.53GHz) compared to the OTA in Section 4.6. The simulated total harmonic distortion (THD) of the low-power buffer is 4.7dB higher, which is acceptable considering that the 6-bit hybrid ADC requires approximately 12dB less SNDR performance compared to the 8-bit hybrid ADC. This would reduce the total power consumption of the unity-gain buffers by a factor of 2, if the 5-bit CABS were replaced by a 3-bit CABS, leading to an estimated total power consumption of 8.7mW for a 6-bit hybrid ADC.
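
The power figures in the preceding paragraphs can be cross-checked with simple arithmetic. The sketch below only restates the measured block values and the two adjustments described above (a CABS ADC saving rounded to 0.5mW, and a halving of the buffer power for the redesign estimate); the variable names are hypothetical.

```python
# Measured block powers in mW (values from the text).
analog_mw = {
    "flash ADC": 1.05,
    "CABS ADCs": 1.19,
    "SHDACs with decoders": 0.81,
    "unity-gain buffers": 3.55,
}
digital_mw = 0.86
clock_generation_mw = 3.54  # the 2.35mW test clock buffer is excluded

total_8bit_mw = sum(analog_mw.values()) + digital_mw + clock_generation_mw

# 3 of 5 CABS comparators active: 40% saving, quoted as about 0.5mW.
cabs_saving_mw = 0.4 * analog_mw["CABS ADCs"]
total_6bit_mw = total_8bit_mw - 0.5

# Redesign estimate: buffer power reduced by a factor of 2.
total_redesign_mw = total_6bit_mw - analog_mw["unity-gain buffers"] / 2

print(f"8-bit total: {total_8bit_mw:.1f} mW")
print(f"CABS saving: {cabs_saving_mw:.3f} mW")
print(f"6-bit total: {total_6bit_mw:.1f} mW")
print(f"6-bit redesign estimate: {total_redesign_mw:.3f} mW")
```

The unrounded CABS saving is 0.476mW, so the 10.5mW figure carries a rounding of a few tens of microwatts, and the redesign estimate lands at roughly 8.7mW.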

Table 9 lists the measured performance specifications of the fabricated hybrid ADC chip in comparison to other reported ADCs with similar resolutions and sampling rates. The power consumption of the hybrid ADC is reported as the measured power for the 8-bit and 6-bit cases. Furthermore, an estimated power for a 6-bit redesign of the hybrid ADC is reported for comparison based on the reasoning in the previous two paragraphs. The hybrid ADC fabricated in 130nm CMOS technology stands amongst the ADCs with a low FoM due to its power efficiency. A lower FoM would be expected if the hybrid ADC were designed and fabricated in a technology with smaller channel length because the architecture would benefit from transistors with higher transition frequencies ($f_T$) and considerably lower parasitic capacitances. Hence, a design in a newer CMOS technology would lead to lower power consumption and significant area reduction.
Overall, in comparison to the specifications of other works in Table 9, the measurement results of the proposed ADC architecture provide a proof-of-concept for its efficiency and performance.
Table 9. Summary of the hybrid ADC measurement results and comparison to other works

<table>
<thead>
<tr>
<th>Specification</th>
<th>This work</th>
<th>[1]</th>
<th>[3]</th>
<th>[7]</th>
<th>[8]</th>
<th>[20]</th>
<th>[99]</th>
<th>[117]</th>
<th>[120]</th>
<th>[121]</th>
<th>[122]</th>
</tr>
</thead>
<tbody>
<tr>
<td>Sampling Rate (GS/s)</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>1.25</td>
<td>1</td>
<td>1.6</td>
<td>1.5</td>
<td>0.6</td>
<td>1.4</td>
<td>1</td>
<td>1.2</td>
</tr>
<tr>
<td>Resolution (bit)</td>
<td>6</td>
<td>8</td>
<td>8</td>
<td>6</td>
<td>6</td>
<td>6</td>
<td>7</td>
<td>6</td>
<td>7</td>
<td>6</td>
<td>8</td>
</tr>
<tr>
<td>CMOS Technology (nm)</td>
<td>130</td>
<td>130</td>
<td>55</td>
<td>130</td>
<td>65</td>
<td>90</td>
<td>90</td>
<td>130</td>
<td>45</td>
<td>65</td>
<td>65</td>
</tr>
<tr>
<td>Architecture</td>
<td>Subr.-TI</td>
<td>Subr.-TI</td>
<td>Subr.</td>
<td>TI-SAR</td>
<td>Pipeline</td>
<td>TI-SAR</td>
<td>Flash</td>
<td>Async. SAR</td>
<td>Flash</td>
<td>Interp.-Subr.</td>
<td>Twostep-SAR</td>
</tr>
<tr>
<td>ENOB @ NQ</td>
<td>5.26</td>
<td>5.48</td>
<td>6.19</td>
<td>5.0</td>
<td>5.25</td>
<td>4.44</td>
<td>6.05</td>
<td>5.02</td>
<td>6.17</td>
<td>5.16</td>
<td>6.97</td>
</tr>
<tr>
<td>SNDR @ NQ (dB)</td>
<td>33.42</td>
<td>34.74</td>
<td>39</td>
<td>32</td>
<td>33.4</td>
<td>28.5</td>
<td>38.2</td>
<td>32</td>
<td>38.9</td>
<td>32.8</td>
<td>43.7</td>
</tr>
<tr>
<td>SFDR @ NQ (dB)</td>
<td>45.71</td>
<td>46.03</td>
<td>53</td>
<td>35</td>
<td>41.03</td>
<td>35.5</td>
<td>46.6</td>
<td>46</td>
<td>NA</td>
<td>44</td>
<td>58.1</td>
</tr>
<tr>
<td>Supply Voltage (V)</td>
<td>1.2</td>
<td>1.2</td>
<td>1.2</td>
<td>1.2</td>
<td>1</td>
<td>1.3</td>
<td>1.2</td>
<td>1.2</td>
<td>1.15</td>
<td>1.1</td>
<td>1.3</td>
</tr>
<tr>
<td>Power (mW)</td>
<td>10.5, 8.72²</td>
<td>11</td>
<td>16</td>
<td>62</td>
<td>20.1</td>
<td>204</td>
<td>5.3</td>
<td>33.24</td>
<td>9.9</td>
<td>5</td>
<td>3.62</td>
</tr>
<tr>
<td>Area (mm²)</td>
<td>&lt; 0.72³,  &lt; 1.4⁴</td>
<td>0.72³</td>
<td>1.4⁴</td>
<td>0.2</td>
<td>0.09</td>
<td>0.3</td>
<td>0.24</td>
<td>1.2</td>
<td>0.12</td>
<td>0.085</td>
<td>0.044</td>
</tr>
<tr>
<td>FoM¹ @ NQ (fJ/conv. step)</td>
<td>274, 227⁵</td>
<td>246</td>
<td>219</td>
<td>800</td>
<td>1629</td>
<td>579</td>
<td>2053</td>
<td>272</td>
<td>330</td>
<td>278</td>
<td>35</td>
</tr>
</tbody>
</table>

1: FoM = Power / (2^{ENOB @ NQ} \times F_s), 2: estimated power consumption for a 6-bit redesign, 3: ADC core area only,
4: total area with calibration circuitry, 5: estimated FoM for a 6-bit redesign
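
The figure of merit in footnote 1 can be evaluated directly from the power, ENOB, and sampling rate. The sketch below uses the values reported for this work; the function name and the choice of units are hypothetical conveniences.

```python
def walden_fom_fj(power_mw, enob, fs_gsps):
    """FoM = Power / (2**ENOB * F_s), returned in fJ per conversion step."""
    power_w = power_mw * 1e-3
    fs_hz = fs_gsps * 1e9
    return power_w / (2 ** enob * fs_hz) * 1e15


print(f"6-bit evaluation: {walden_fom_fj(10.5, 5.26, 1.0):.0f} fJ/conv. step")
print(f"8-bit evaluation: {walden_fom_fj(11.0, 5.48, 1.0):.0f} fJ/conv. step")
print(f"6-bit redesign:   {walden_fom_fj(8.72, 5.26, 1.0):.1f} fJ/conv. step")
```

The first two cases give approximately 274 and 246 fJ/conversion step. The redesign estimate evaluates to between 227 and 228 fJ/conversion step, with the exact last digit depending on how the estimated power is rounded.
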
5.8 Summary

The proposed hybrid ADC architecture was demonstrated with a design fabricated in 130nm CMOS technology to verify its feasibility with measurements. In this chapter, the test and measurement methodologies were described in detail. Offsets of the comparators in the flash ADC were manually calibrated using the digitally-controlled design feature, which resulted in a significant performance improvement. The fabricated 1GS/s ADC was primarily evaluated for 6-bit accuracy. According to the measurement results, the 6-bit 1GS/s ADC has an ENOB of 5.51 and 5.26 when applying a full-scale sinusoidal input signal at low and high (near Nyquist-rate) input frequencies, respectively. The ADC consumes 10.5mW from a 1.2V supply.

6. On-Chip Digital Calibration of an Analog Front-End for Biopotential Measurements

In this chapter, an automatic on-chip digital calibration system for the acquisition of biopotential signals is introduced. The circuit design and system-level innovations are described prior to a discussion of prototype chip measurement results.

6.1 Self-Calibrated Analog Front-End for Long Acquisitions of Biosignals (SCAFELAB)

The development of dry-electrode measurement techniques has been a research challenge [44] due to rising demands in health monitoring applications. As a consequence, there is a need to create integrated analog front-ends with higher input impedance in addition to an excellent common-mode rejection ratio (CMRR). In general, dry-electrode measurements are better suited for long-term monitoring applications [44], such as drowsiness detection, epilepsy diagnosis, and intent recognition to enable a person to command computers or robots. In long-term EEG monitoring applications, the use of dry electrodes avoids gels or adhesives that can cause irritations or allergic reactions [123], making dry-electrode measurements more comfortable and easier to integrate into wearable headsets. However, the use of dry electrodes, such as inexpensive Ag/AgCl electrodes, is associated with increased contact resistances that can be above 1MΩ [44]. This characteristic complicates the measurement of small biopotentials by requiring a very high impedance at the input of the analog front-end instrumentation amplifier, as high as 500MΩ [45]. Figure 78 displays a generalized representation of a typical body-electrode interface at the front-end amplifier for electroencephalography (EEG) applications, which contains a simplified dry electrode model [44]. The problem is that the input impedance is affected by parasitic capacitances from the package of the integrated circuit as well as electrode cable and printed circuit board (PCB) capacitances. Such capacitances can be as large as 50-150pF at the instrumentation amplifier (IA) input, and are modeled as $C_{\text{inp}}$ and $C_{\text{inn}}$ in Figure 78. The IA introduced in [46] utilizes a negative capacitance generation technique to boost the input impedance from a few megaohms to above 500MΩ. The self-calibrated analog front-end for long acquisitions of biosignals (SCAFELAB) project was carried out in our research group towards the goal of automating the input impedance boosting for integrated EEG signal acquisition front-ends [47].

![Diagram of a conventional dry skin-electrode-amplifier interface.](image)

Figure 78. Model of a conventional dry skin-electrode-amplifier interface.

Figure 79 displays the block diagram of the analog EEG front-end chip with self-calibration. The instrumentation amplifier was designed with feedback that generates negative capacitances at its inputs [46]. An on-chip oscillator, frequency divider, voltage limiter and operational transconductance amplifier (OTA) create a test signal [124] that is monitored with a test amplifier and comparator bank for digitally-assisted impedance boosting [48]. In this project, the analog front-end circuitry and test signal generation blocks were designed by other team members. The scope of this dissertation research is the digitally-assisted analog design technique for enhanced integration, which entailed the development of an on-chip calibration system to automatically boost the input impedance of the analog front-end.
As visualized in Figure 79, the calibration system consists of a digital calibration unit, a test amplifier, a set of four comparators to detect the amplitude, and two comparators to detect if an oscillation (unstable condition) occurs. The calibration unit automatically determines the optimum code for two 8-bit programmable capacitor banks between the input stage and the feedback loop of the IA to cancel the unwanted input capacitances. When the four switches in Figure 79 are closed, the system operates in calibration mode such that the 19.5Hz test signal current ($i_t$) is injected into the circuit under test and the calibration system is connected to the signal path. The DC decoupling capacitors are placed at the OTA outputs to prevent leakage currents. The power line interference is suppressed by the notch in the low-pass filter response [125].
Figure 80. (a) Instrumentation amplifier (IA) with direct current feedback and negative capacitance generation feedback (NCGFB), (b) implemented NCGFB with programmable capacitor bank.

Figure 80 shows the schematics of the IA and the programmable capacitors that form the negative capacitance generation feedback (NCGFB) [46] designed by another group member, which is automatically controlled by the digital calibration unit. The NCGFB is implemented with an 8-bit digitally-controlled capacitor bank (\(C_p\) to \(2^7 \cdot C_p\)) and one fixed capacitor \((C_{p0})\) between nodes \(V_{i+}\) and \(V_C\), as well as between \(V_{i-}\) and \(V_D\) (with \(C_n\) to \(2^7 \cdot C_n\) and \(C_{n0}\)). The maximum and minimum capacitance values occur with \(S_{7,6,5,4,3,2,1,0} = [11111111]\) and \([00000000]\), respectively, where “1” or “0” indicates that the switch is turned “on” or “off”. Since the IA circuit is not perfectly symmetric (i.e., \(M_3\) is diode-connected while \(M_4\) is not), it was observed that the optimum result can be achieved by setting the capacitor sizes in the negative path \((C_n)\) to around half of the sizes of the positive path \((C_p)\). The switches of both capacitor banks receive the same control signals from the digital calibration unit. The tuning range of the 8-bit capacitor networks should be designed to compensate
for the expected 50-150pF capacitances from the cables and PCB by generating negative capacitance at the IA input.
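
For reference, the total capacitance selected by a switch code can be written compactly. The expression below is a sketch that assumes binary-weighted branches ($2^j \cdot C_p$ for switch $S_j$) connected in parallel with the fixed capacitor, which is consistent with the description above but not stated explicitly. Here, $C_{total,i+}$ and $C_{total,i-}$ denote the totals in the positive and negative paths, and $S_j \in \{0, 1\}$:

$$C_{total,i+} = C_{p0} + C_p \sum_{j=0}^{7} S_j \cdot 2^j, \qquad C_{total,i-} = C_{n0} + C_n \sum_{j=0}^{7} S_j \cdot 2^j$$

Under this assumption, the code [00000000] leaves only the fixed capacitors, and the code [11111111] yields $C_{p0} + 255 \cdot C_p$ and $C_{n0} + 255 \cdot C_n$.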

The calibration technique takes advantage of the knowledge that the amplitude of the test amplifier output reflects the equivalent impedance at the instrumentation amplifier input because the test current magnitude is known. NOR-type SR latches are connected at the output of the comparators to hold the outputs until the reset signal, as visualized in Figure 81.

![Figure 81. Monitoring scheme at the test amplifier output voltage for maximum impedance detection with comparators and SR latches.](image)

The minimum acceptable output amplitude was determined based on simulations with worst-case PVT variation corners prior to selecting the number of comparators and their reference voltage levels. Four comparators are used to identify the amplitude around the required peak voltage swing (with minimum acceptable input impedance) to assure detection capability in the presence of PVT variations. In each cycle, the latched outputs of the comparators can be interpreted as a thermometer-coded representation of the front-end input impedance. This approach avoids direct measurement of the input impedance by using low-power area-efficient circuits in an amplitude detection scheme [124].

### 6.2 Oscillation Detection Technique to Prevent Unstable Operation

Overcompensation with excessive negative input capacitance causes oscillations with amplitudes at the IA output that are at least three times higher than during stable operation in calibration mode. For this reason, the threshold of the oscillation detector in Figure 79 was set at twice the maximum signal swing during stable operation. Switch settings for which oscillations occur are discarded by the calibration unit. In an unstable condition, the frequency of the oscillation is less than the frequency of the 19.5Hz test signal. As visualized in Figure 82, the IA output might remain at the positive or negative peak for the complete period of the oscillation evaluation (e.g., after waiting 7 clock cycles). In an earlier version of the calibration system [48], only one comparator was used to detect the high amplitude of the oscillation, creating a risk of missing the observation of oscillation cases in which the differential IA output (Figure 82) remains at the negative peak amplitude during the evaluation moment. To cover both cases, a second comparator was added to the oscillation detection scheme as shown in Figure 83, where \( V_{o+} \) and \( V_{o-} \) are the differential outputs of the IA output buffer (not shown in Figure 80, but included in [47]). The high and low reference levels for the oscillation detection comparators (\( V_{\text{Ref.Osc.P}} \) and \( V_{\text{Ref.Osc.N}} \) in Figure 83) were set to 950mV and 750mV, such that the differential comparison levels (Figure 82) are \( +V_{\text{Ref. osc}} = V_{\text{Ref.Osc.P}} - V_{\text{Ref.Osc.N}} \) and \( -V_{\text{Ref. osc}} = V_{\text{Ref.Osc.N}} - V_{\text{Ref.Osc.P}} \). The SR latch in Figure 83 holds the high comparator output for either case: when a positive peak of an oscillation is detected (Pos.Osc signal), or when a negative peak (Neg.Osc signal) is detected.
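
The two-comparator detection with a shared latch can be summarized with a small behavioral model. This is a sketch only: the class and method names are hypothetical, the comparators are treated as ideal, and the latch is modeled as a flag that stays set until reset.

```python
V_REF_OSC_P = 0.950  # high reference level in volts (from the text)
V_REF_OSC_N = 0.750  # low reference level in volts (from the text)


class OscillationDetector:
    """Behavioral model: two ideal comparators feeding one SR latch."""

    def __init__(self):
        self.latched = False

    def reset(self):
        """Clear the latch at the start of a calibration step."""
        self.latched = False

    def sample(self, vo_p, vo_n):
        """Evaluate the differential output and update the latch."""
        v_diff = vo_p - vo_n
        v_ref = V_REF_OSC_P - V_REF_OSC_N  # differential level of 200mV
        pos_osc = v_diff > v_ref           # positive peak detected
        neg_osc = v_diff < -v_ref          # negative peak detected
        if pos_osc or neg_osc:
            self.latched = True
        return self.latched


detector = OscillationDetector()
print(detector.sample(0.90, 0.80))  # stable swing of +100mV
print(detector.sample(0.70, 1.00))  # output stuck at a negative peak of -300mV
```

The second sample illustrates the case that a single positive-threshold comparator would miss: the differential output sits at a negative peak, and only the added comparator sets the latch.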

**Figure 82.** Conceptual waveform diagrams of the IA’s differential output for the two possible cases when oscillation occurs after switching from a stable code to an unstable code.
6.3 Analysis of the Required Time for Automatic Calibration

Three different options were investigated for the programmed response after a detected oscillation during the calibration process. Each option leads to different total calibration times for different input capacitances in the 50-150pF design range. With the first method, the calibration logic requires 10 clock periods to evaluate every code, and if an oscillation occurs, then the corresponding code is saved in memory. This leads to a constant calibration time of 131s (with a 19.5Hz calibration control clock), which is independent of the input capacitance values. With the second method, the last 2 periods (to write to memory) are skipped for the codes that result in oscillation. Thus, it only takes 8 clock periods per code in such cases, which includes the codes creating a negative capacitance that overcompensates the input capacitance. Since the control logic sweeps the switch codes (S7,6,5,4,3,2,1,0) to increase the total digitally-controlled capacitance (C_{total,i+} and C_{total,i-}) during a complete calibration, the oscillation starts at a particular code and persists for subsequent codes. This creates a calibration time dependence on the input capacitance (C_{inp} and C_{inn}). For method 2, the total calibration time (t_{Cal,total}) can be calculated with the following equation, where k is the code number (between 0 and 255 with 8-bit control) at which the oscillation begins and T_{Clk.Cal} (≈ 51.2ms) is the period of the 19.5Hz calibration clock:

\[
t_{Cal,total} = \sum_{i=0}^{k-1} 10 \times T_{Clk.Cal} + \sum_{i=k}^{255} 8 \times T_{Clk.Cal}
\]

(33)
With the third method, the calibration is interrupted immediately when an oscillation event is detected, and the last code before oscillation remains saved in memory. If oscillation occurs at code $k$, which depends on the particular input capacitance value, then the total calibration time with method 3 is:

$$t_{Cal,total} = \sum_{i=0}^{k} 10 \times T_{Clk.Cal}$$

(34)
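
The three calibration-time expressions can be compared numerically with a few lines of code. The sketch below uses the 51.2ms clock period and the 256 codes stated above; the function names are hypothetical, and k is the code at which oscillation begins.

```python
T_CLK_CAL = 0.0512  # period of the calibration clock in seconds
N_CODES = 256       # 8-bit switch code


def t_method1():
    """Every code takes 10 clock periods, independent of k."""
    return N_CODES * 10 * T_CLK_CAL


def t_method2(k):
    """Equation (33): 10 periods for codes below k, 8 periods from k on."""
    return (k * 10 + (N_CODES - k) * 8) * T_CLK_CAL


def t_method3(k):
    """Equation (34): codes 0..k are evaluated, then calibration stops."""
    return (k + 1) * 10 * T_CLK_CAL


for k in (0, 64, 128, 255):
    print(k, round(t_method1(), 2), round(t_method2(k), 2), round(t_method3(k), 2))
```

Method 1 always takes about 131s, whereas the time for method 3 scales with k and therefore with the input capacitance that has to be compensated.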

To evaluate the impact of input capacitances on the calibration time under the influence of circuit non-idealities, transient simulations of the circuits in the signal path were performed to obtain the code $k$ at which oscillation starts with different input capacitance values ($C_{\text{inp}} = C_{\text{inn}}$) between 50pF and 150pF. Figure 84 compares the total calibration time of the three methods vs. input capacitance values. The first method was chosen for the calibration logic implementation on the prototype chip to ensure that the code leading to the largest stable input amplitude is always selected, even if intermittent oscillations occur for one or more codes during the sweep due to errors from noise on the chip or PCB. In retrospect, the choice of method 1 was overly conservative, and method 3 will be the preferred option in most applications to reduce calibration time. Considering that electrode (cable) changes are expected to be infrequent, the maximum wait time of 131s is acceptable because the calibration would only be initiated by the user from time to time.

![Figure 84. Total calibration time vs. input capacitance for different alternatives to respond to oscillation events during calibrations.](image)

The digital calibration approach is depicted as a flowchart in Figure 85. At each step, the calibration unit first sends a reset signal to all the SR latches located at the outputs of the comparators. Next, it sets the 8 control signals that are connected to the switches (S7-S0) of the IA’s negative capacitance generation block, where the initial output code (S[7:0]) is [00000000]. After waiting for 7 periods of the test signal to allow for settling of the analog signals, the reset signal goes to “0” and the SR latches hold the output values of the four comparators (Comparator[3:0]) from which the test signal amplitude can be inferred. If the Oscillation Detection bit is “1”, then the unit will skip the Comparator[3:0] code and increase the current S[7:0] code by one. If there is no oscillation, then Comparator[3:0] is compared with MaxInput[3:0] (initially zero). If Comparator[3:0] is equal to or larger than MaxInput[3:0], then MaxInput[3:0] is replaced with the value of Comparator[3:0] in the memory. Furthermore, the related S[7:0] is saved as BestCode[7:0] in another memory location. Afterwards, S[7:0] is incremented by one, and the process is repeated (Figure 85). When S[7:0] reaches [11111111] (255 in decimal), the calibration unit reads BestCode[7:0] from the memory and applies the corresponding S[7:0] to set the programmable capacitor bank at the IA for optimum impedance at the end of the calibration. The optimum code is applied to the IA’s capacitor bank until the calibration is restarted after changing electrode cables.
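
The sweep described above can be sketched as a short behavioral model. This is not the on-chip logic: the function `run_calibration` and the callback `measure` are hypothetical, the latch reset and the settling wait are assumed to happen inside `measure`, and the thermometer-coded Comparator[3:0] word is compared as an integer.

```python
def run_calibration(measure, n_codes=256):
    """Sweep all switch codes and return (best_code, max_input).

    measure(code) is a hypothetical callback that applies S[7:0] = code,
    waits for settling, and returns (comparator_word, oscillation_bit).
    """
    max_input = 0  # MaxInput[3:0], initially zero
    best_code = 0  # BestCode[7:0]
    for code in range(n_codes):
        comparator_word, oscillation = measure(code)
        if oscillation:
            continue  # discard codes that cause oscillation
        if comparator_word >= max_input:
            max_input = comparator_word
            best_code = code
    return best_code, max_input


def toy_front_end(code):
    """Toy model: amplitude grows with the code, oscillation from code 100."""
    if code >= 100:
        return 0b1111, True
    level = min(4, code // 25)        # number of comparators that trip
    return (1 << level) - 1, False    # thermometer code, e.g. 0b0111


best_code, max_input = run_calibration(toy_front_end)
print(best_code, format(max_input, "04b"))
```

Because the comparison is "equal to or larger than", the last stable code that reaches the highest detected amplitude is retained, which in this toy model is the code just before the onset of oscillation.
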
The calibration unit has been implemented with the control and memory blocks shown in Figure 86. The start signal resets all registers at the beginning of the calibration process. The functions of setting the control signals for switches, decision-making, and saving the desired values in memory are completed during each cycle of the algorithm.
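
The sweep described above can be summarized in a short behavioral sketch. This is an illustrative Python model, not the Verilog implemented on the chip. The function names `run_calibration` and `toy_front_end`, the toy amplitude model, and the oscillation threshold at code 87 are assumptions made only for this example; the names MaxInput and BestCode follow the text.

```python
def run_calibration(measure, n_codes=256):
    """Sweep all switch codes and keep the best non-oscillating one.

    `measure(code)` is a hypothetical stand-in for one calibration step:
    it returns (Comparator[3:0] as an integer, oscillation-detection bit).
    """
    max_input = 0   # MaxInput[3:0], initially zero
    best_code = 0   # BestCode[7:0]
    for code in range(n_codes):          # S[7:0] from 00000000 to 11111111
        comparator, oscillating = measure(code)
        if oscillating:
            continue                     # skip this code, move to the next
        if comparator >= max_input:      # "equal to or larger" keeps the latest code
            max_input = comparator
            best_code = code
    return best_code, max_input


def toy_front_end(code):
    """Assumed toy model: amplitude grows with the code, and codes above 87
    overcompensate and oscillate (values chosen for illustration only)."""
    oscillating = code > 87
    comparator = (1 << min(code // 25, 4)) - 1   # thermometer word: 0, 1, 3, 7, 15
    return comparator, oscillating


best_code, max_input = run_calibration(toy_front_end)
print(f"{best_code:08b}", f"{max_input:04b}")
```

Because the comparison accepts equal values, the sweep settles on the highest code that reaches the largest comparator level without oscillation, which is 87 in this toy model.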

Figure 85. Calibration flow chart.

Figure 86. On-chip digital calibration unit.
The memory block receives the latched outputs of the comparators as inputs. It generates and saves the 8-bit codes to control the switches in the capacitor bank. The main clock for the calibration block is derived from the same oscillator that is part of the test signal generation circuitry in Figure 79. The calibration unit uses a clock that is two times faster than the 19.5Hz test signal. Such a low clock speed is sufficient for this application since the test signal has a very low frequency and the calibration unit has to wait proportionally long before making decisions. The control block operates with an internal clock signal (Clk.Cal) generated by the memory block with half the frequency of the main clock. The internal frequency divider is implemented with a 1-bit counter inside the memory block.

The memory cells in the memory block are built from shift registers. Each of the flip-flops in the shift registers imitates the function of an SRAM cell. Therefore, the memory cells require address, read and write signals based on SRAM principles [126]. The code comparisons and final decision-making tasks are also performed in the memory block.

The control block contains a 4-bit counter to count the number of test signal cycles after changing the switch codes (S[7:0]), which is important to ensure sufficient settling time for the analog signals. The control block also has an 8-bit counter to generate all 256 states of the switch controls, which are sent to the memory block as Code_i[7:0] and also sent out as S[7:0] for both of the capacitor banks of the IA (Figure 80(b)). The reset signal for the SR latches is generated by the control block during each calibration step. At the end, the Finish signal from the control block is sent to the memory block to stop the calibration process. Afterwards, the S[7:0] code is set to the optimum value and remains unaltered.
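
As a rough consistency check on the timing, the numbers quoted in this chapter (19.5Hz test signal, 256 codes, 7 settling periods per code, and a maximum wait time of 131s) can be combined as follows. The sketch assumes that the 131s figure refers to a full 256-code sweep with an equal duration per code; the resulting per-code period count is an inference, not a value stated in the text.

```python
TEST_SIGNAL_HZ = 19.5                 # test signal frequency
MAIN_CLOCK_HZ = 2 * TEST_SIGNAL_HZ    # main calibration clock (39 Hz)
N_CODES = 256                         # all states of S[7:0]
SETTLING_PERIODS = 7                  # wait after each code change
REPORTED_MAX_TIME_S = 131.0           # maximum wait time quoted in the text

# Test-signal periods per code implied by the reported sweep time
# (assumes every code takes the same time).
periods_per_code = REPORTED_MAX_TIME_S * TEST_SIGNAL_HZ / N_CODES

# Portion of the sweep spent purely on analog settling.
settling_only_time_s = N_CODES * SETTLING_PERIODS / TEST_SIGNAL_HZ

print(f"periods per code : {periods_per_code:.2f}")
print(f"settling only    : {settling_only_time_s:.1f} s")
```

Under these assumptions, each code occupies close to 10 test-signal periods, of which 7 are settling time; the remainder would cover the reset, latching, and decision steps, and fits within the range of the 4-bit cycle counter.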

With a test input current amplitude of $i_t = 1$ pA and 20dB of gain in the IA and filter combination, the voltage swings at the filter output are below 25mV$_{\text{p-p}}$ across process corner cases. The number of bits for the programmable capacitors was chosen such that multiple codes meet the minimum input impedance requirement, which makes the calibration scheme more reliable. To account for the different corner cases and temperature conditions, we defined the differential reference levels for the comparators in Figure 81 as three equally-spaced values (300mV$_{\text{p-p}}$, 200mV$_{\text{p-p}}$, 100mV$_{\text{p-p}}$) and one additional level (40mV$_{\text{p-p}}$) to cover the range with some extra margin below the minimum.
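
The way the four reference levels quantize the test amplifier amplitude can be illustrated with a minimal sketch. The bit ordering (bit 0 tied to the lowest reference) and the function name `comparator_word` are assumptions for this example; the text does not specify how the comparators map to Comparator[3:0].

```python
# Differential reference levels in mV (peak-to-peak), lowest first.
REFERENCE_LEVELS_MV = (40, 100, 200, 300)


def comparator_word(amplitude_mv_pp):
    """Return a 4-bit thermometer word for a given peak-to-peak amplitude.

    Assumed convention: bit i is set when the amplitude exceeds
    REFERENCE_LEVELS_MV[i], so a larger amplitude gives a larger integer.
    """
    word = 0
    for bit, reference in enumerate(REFERENCE_LEVELS_MV):
        if amplitude_mv_pp > reference:
            word |= 1 << bit
    return word


for amplitude in (30, 60, 150, 250, 350):
    print(amplitude, f"{comparator_word(amplitude):04b}")
```

With this convention, the integer comparison between Comparator[3:0] and the stored maximum is equivalent to comparing amplitudes, since a thermometer word increases monotonically with the signal level.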

6.5 Post-Layout Simulation Results

Synthesizable Verilog HDL code was written to implement the calibration algorithm on a chip. All required logic gates were manually designed and laid out in 130nm CMOS technology. A cell library was created from these logic gates to be used with digital synthesis tools (RTL design). The Verilog code was synthesized to gate-level netlists with Cadence RTL Compiler. Cadence Encounter was used to automatically place and route the layouts from the gate-level netlists. The layouts of the control block and memory block of the calibration unit are displayed in Figure 87. They occupy a total chip area of approximately 0.062mm² in 130nm CMOS technology. From post-layout simulations, the total power consumption of the digital calibration unit at the main clock frequency of 39Hz is 22.9μW with a 1.2V supply voltage. A summary of the digital calibration unit specifications is given in Table 10.

![Figure 87. Layout of the digital calibration unit in 130nm CMOS technology: (a) control block, (b) memory block.](image)

<table>
<thead>
<tr>
<th>Block</th>
<th>Number of Gates</th>
<th>Area</th>
<th>Power</th>
</tr>
</thead>
<tbody>
<tr>
<td>Control</td>
<td>113</td>
<td>0.026mm²</td>
<td>---</td>
</tr>
<tr>
<td>Memory</td>
<td>146</td>
<td>0.036mm²</td>
<td>---</td>
</tr>
<tr>
<td>Total</td>
<td>259</td>
<td>0.062mm²</td>
<td>22.9μW</td>
</tr>
</tbody>
</table>
Figure 88. Simulated waveforms of the test amplifier output and the digital control signals for the capacitor bank switches ($C_{\text{inn}} = C_{\text{inp}} = 100\text{pF}$).
Figure 88 shows the simulated output waveforms of the test amplifier, the output of the oscillation detection circuit, and the digital signals from the digital calibration unit that control the switches in the capacitor bank. For the purpose of simulation, the test current generator is disconnected when the “finish” signal (the topmost digital signal in the figure) transitions to high. If oscillation occurs, as in Figure 88, then the oscillation detection bit (second digital signal from the top) is set to “1”. In this case, the calibration block saves the best code without oscillation and sets the switches to it after cycling through all combinations.

In the typical process corner case with parasitic capacitances ($C_{\text{inp}}$ and $C_{\text{inn}}$ in Figure 79) at the IA input both equal to 100pF, the best code identified by the calibration system is “01010111”. With this code, the simulated input impedance is equal to 2.2GΩ at 50Hz. Table 11 lists simulation results of the EEG front-end with the calibration unit for other process corner cases and parasitic input capacitances in the 50-150pF range. In all cases, the simulated input impedance of the IA remains above the 500MΩ target for dry-contact EEG measurement methods.
Table 11. Simulation results for different parasitic capacitance values and process corners

<table>
<thead>
<tr>
<th>C_{inp}, C_{inn} (in Figure 79)</th>
<th>Switch Control Code (S_7 - S_0 in Figure 80(b))</th>
<th>Process Corner</th>
<th>IA Input Impedance at 50Hz</th>
</tr>
</thead>
<tbody>
<tr>
<td>150pF, 150pF</td>
<td>10011001</td>
<td>SS</td>
<td>1.12GΩ</td>
</tr>
<tr>
<td>150pF, 150pF</td>
<td>10101101</td>
<td>TT</td>
<td>1.90GΩ</td>
</tr>
<tr>
<td>150pF, 150pF</td>
<td>10111111</td>
<td>FF</td>
<td>2.34GΩ</td>
</tr>
<tr>
<td>150pF, 100pF</td>
<td>01011111</td>
<td>SS</td>
<td>669MΩ</td>
</tr>
<tr>
<td>150pF, 100pF</td>
<td>01110101</td>
<td>TT</td>
<td>1.28GΩ</td>
</tr>
<tr>
<td>150pF, 100pF</td>
<td>01111111</td>
<td>FF</td>
<td>873MΩ</td>
</tr>
<tr>
<td>100pF, 100pF</td>
<td>01001001</td>
<td>SS</td>
<td>1.33GΩ</td>
</tr>
<tr>
<td>100pF, 100pF</td>
<td>01010111</td>
<td>TT</td>
<td>2.20GΩ</td>
</tr>
<tr>
<td>100pF, 100pF</td>
<td>01011111</td>
<td>FF</td>
<td>1.47GΩ</td>
</tr>
<tr>
<td>100pF, 50pF</td>
<td>00011101</td>
<td>SS</td>
<td>859MΩ</td>
</tr>
<tr>
<td>100pF, 50pF</td>
<td>00011011</td>
<td>TT</td>
<td>1.53GΩ</td>
</tr>
<tr>
<td>100pF, 50pF</td>
<td>00100101</td>
<td>FF</td>
<td>1.70GΩ</td>
</tr>
<tr>
<td>50pF, 50pF</td>
<td>00000000</td>
<td>SS</td>
<td>10.30GΩ</td>
</tr>
<tr>
<td>50pF, 50pF</td>
<td>00010000</td>
<td>TT</td>
<td>2.77GΩ</td>
</tr>
<tr>
<td>50pF, 50pF</td>
<td>00001111</td>
<td>FF</td>
<td>2.49GΩ</td>
</tr>
</tbody>
</table>
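
The claim that all cases in Table 11 stay above the 500MΩ target can be checked directly from the tabulated values. The short script below only restates the table entries and reports the worst case; the variable names are chosen for this example.

```python
TARGET_OHMS = 500e6  # input impedance target for dry-contact EEG measurements

# (C_inp in pF, C_inn in pF, process corner, simulated |Z_in| at 50 Hz in ohms)
TABLE_11 = [
    (150, 150, "SS", 1.12e9), (150, 150, "TT", 1.90e9), (150, 150, "FF", 2.34e9),
    (150, 100, "SS", 669e6),  (150, 100, "TT", 1.28e9), (150, 100, "FF", 873e6),
    (100, 100, "SS", 1.33e9), (100, 100, "TT", 2.20e9), (100, 100, "FF", 1.47e9),
    (100, 50,  "SS", 859e6),  (100, 50,  "TT", 1.53e9), (100, 50,  "FF", 1.70e9),
    (50,  50,  "SS", 10.30e9), (50, 50,  "TT", 2.77e9), (50,  50,  "FF", 2.49e9),
]

# Worst case is the row with the smallest impedance.
worst_case = min(TABLE_11, key=lambda row: row[3])
all_above_target = all(row[3] > TARGET_OHMS for row in TABLE_11)
margin = worst_case[3] / TARGET_OHMS

print("all above target:", all_above_target)
print("worst case:", worst_case, f"margin x{margin:.2f}")
```

The lowest value occurs for the mismatched 150pF/100pF case in the SS corner, which still exceeds the target by roughly a third.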

6.6 Test Setup and Measurement Results

The analog front-end with built-in calibration system was fabricated in 130nm CMOS technology. Figure 89 displays a micrograph of the prototype chip, which was assembled in a PLCC84 package and tested with a socket on a custom printed circuit board. The die areas occupied by the instrumentation amplifier, lowpass-notch filter, variable gain amplifier and test current generator are 0.183mm², 0.314mm², 0.222mm², and 0.156mm², respectively. On the prototype chip, an external switch control option is also available, which was implemented for testing purposes during measurements. The total area occupied by the digital calibration unit and the I/O interface circuits for manual switch control is 0.08mm². The total power consumptions of the digital calibration unit operating with a 39Hz clock and the six comparators are 22.9µW and 32.3µW with a 1.2V supply voltage, respectively.

Figure 89. Chip micrograph of the EEG front-end with circuits for automatic input impedance boosting (130nm CMOS technology).

A custom printed circuit board (PCB) was designed for testing of the prototype chip, which is shown in Figure 90. The differential reference voltages of the comparators in the calibration system are generated on the PCB by voltage dividers with resistors and potentiometers. Two 4-bit DIP switches provide external control for the on-chip capacitor banks as an option for manual programming when the automatic calibration is deactivated.
Figure 90. The evaluation board designed for testing of the SCAFELAB chip.

Figure 91 displays the test setup to evaluate the calibration system. During the automatic calibration, the jumpers in Figure 91 are disconnected to be able to read the 8-bit switch control code with the logic analyzer. An 8-channel CMOS buffer IC (NXP 74LV244N) was used to drive the input connectors of the logic analyzer. The instrumentation amplifier output, test amplifier outputs, and the oscillation detection signal were monitored with the oscilloscope during the automatic calibration for functionality verification of the digital calibration system. The DIP switches allowed setting the codes off-chip for manual calibration (i.e., testing purposes), and to separately measure the input impedance with manual adjustments of the codes.
Figure 91. Measurement setup to test the digital calibration of the SCAFELAB chip.

The analog front-end was tested by activating the on-chip test current generator and calibration unit. Figure 92(a) and Figure 92(b) display the single-ended buffered output of the instrumentation amplifier and single-ended output of the test amplifier during a complete calibration process with $C_{\text{inp}} = C_{\text{inn}} = 100\text{pF}$, respectively. The oscillation detection signal is also shown on the top of each waveform in the figure. It can be observed from Figure 92(a) and Figure 92(b) that the amplitudes increase as the NCGFB control codes cycle from low to high values during the automatic calibration. This amplitude growth during calibration is due to the boosted input impedance. Furthermore, the oscillation detection signal indicates when a code results in overcompensation. After sweeping through all codes, the on-chip digital calibration unit automatically sets the switches to the code that resulted in the highest amplitude without oscillation, which corresponds to the highest instrumentation amplifier input impedance. Figure 92(c) and Figure 92(d) show zoomed-in parts of the test amplifier output waveform before and after calibration to illustrate the amplitude growth and to validate its output.
Figure 92. Transient waveforms of the (a) instrumentation amplifier’s single-ended buffered output during the complete calibration, and (b) the test amplifier’s single-ended output during the complete calibration, (c) before calibration, and (d) after calibration.
Figure 93 displays the measured switch control signals generated by the calibration logic at the beginning and towards the end of the calibration, starting from S0 at the top to S7 at the bottom. The inverted signals (on-chip) are applied to the PMOS switches in Figure 80(b).

Table 12 lists the final switch codes from the digital calibration unit together with the corresponding single-ended (buffered) IA peak-to-peak output amplitudes as well as the IA input impedances at 20Hz and 50Hz for four different input capacitance cases. The experimental results demonstrate the impedance-boosting capability of the automatic on-chip calibration system.
Table 12. On-chip calibration unit’s output code with resulting instrumentation amplifier (IA) amplitude and input impedance

| C_{pp} | C_{pn} | Output Code | IA Output Amplitude (peak-to-peak) | \|Z_{in}\| at 20Hz | Estimated \|Z_{in}\| at 50Hz |
|--------|--------|-------------|-------------------------------------|--------------------|------------------------------|
| 50 pF  | 50 pF  | 00001011    | 104mV                               | 2.056GΩ            | 823MΩ                        |
| 100 pF | 100 pF | 01100011    | 92mV                                | 1.820GΩ            | 728MΩ                        |
| 120 pF | 120 pF | 10011110    | 84mV                                | 1.661GΩ            | 665MΩ                        |
| 150 pF | 150 pF | 10101101    | 72mV                                | 1.424GΩ            | 570MΩ                        |
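
The estimated 50Hz values in Table 12 appear consistent with scaling the 20Hz impedance by the frequency ratio, as expected for a predominantly capacitive input impedance ($|Z_{in}| \propto 1/f$). The sketch below checks this interpretation against the table; the scaling rule is an assumption inferred from the numbers rather than a method stated in the text.

```python
# Input capacitance (pF) -> |Z_in| at 20 Hz from Table 12, in ohms.
Z_AT_20HZ = {50: 2.056e9, 100: 1.820e9, 120: 1.661e9, 150: 1.424e9}
# Input capacitance (pF) -> estimated |Z_in| at 50 Hz from Table 12, in ohms.
Z_AT_50HZ_REPORTED = {50: 823e6, 100: 728e6, 120: 665e6, 150: 570e6}


def scale_capacitive(z_ohms, f_from_hz, f_to_hz):
    """Scale an impedance magnitude assuming |Z| is proportional to 1/f."""
    return z_ohms * f_from_hz / f_to_hz


for c_pf, z20 in Z_AT_20HZ.items():
    z50 = scale_capacitive(z20, 20.0, 50.0)
    diff = abs(z50 - Z_AT_50HZ_REPORTED[c_pf])
    print(f"{c_pf} pF: scaled {z50 / 1e6:.1f} MOhm, "
          f"reported {Z_AT_50HZ_REPORTED[c_pf] / 1e6:.0f} MOhm, "
          f"difference {diff / 1e6:.1f} MOhm")
```

All four rows agree with the reported estimates to within 1MΩ, which suggests that the residual differences are due to rounding.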

6.7 Summary

An on-chip digital calibration technique for adaptive input impedance boosting in biopotential signal measurement applications was introduced through this research. Validated by experimental results, the on-chip calibration system automatically controls the programmable negative capacitance generation feedback of an instrumentation amplifier. As a consequence, the instrumentation amplifier’s differential input impedance can be increased to more than 570MΩ at 50Hz and to more than 1.4GΩ at 20Hz when the input equivalent capacitance is up to 150pF.
7. General Conclusion and Future Work

This dissertation research concentrated on two fundamental research topics: design of low-power high-speed hybrid ADCs with linearity enhancement and offset calibration techniques, and automatic on-chip calibration for input impedance boosting in EEG measurement applications with dry-contact electrodes.

As part of the high-speed ADC research, a 1GS/s subranging time-interleaved ADC was designed in 130nm CMOS technology and verified through prototype chip measurements. The hybrid ADC architecture includes a flash ADC in the first stage to resolve the most significant bits (MSBs), and four time-interleaved CABS ADCs in the second stage to resolve the least significant bits (LSBs). This hybrid ADC architecture is the first to realize a second stage with time-interleaved CABS ADCs, which is a significant factor for its power efficiency. A novel merged sample-and-hold and DAC (SHDAC) in each TI channel performs the sampling and residue generation for the subranging operation. A linearity enhancement technique was created for the SHDAC to suppress the impact of parasitic capacitances. The measurement results with 8-bit and 6-bit output evaluations of the 1GS/s hybrid ADC revealed minimum ENOBs of 5.48 bits and 5.26 bits for near Nyquist-rate input frequencies with power consumptions of 11mW and 10.5mW from a 1.2V supply, respectively. Compared to other state-of-the-art high-speed ADCs, this hybrid ADC has a competitive performance because of its high power efficiency. The low-power high-speed operation of the ADC in this dissertation is expected to facilitate the development of emerging applications, especially in portable devices that are powered by batteries. For instance, future research can involve the development of a low-power software-defined radio transceiver architecture using the hybrid ADC. Furthermore, designing forthcoming versions of the hybrid ADC in technologies with shorter channel lengths, such as 28nm, 45nm and 65nm CMOS technologies, will lead to higher speed with lower power and area because the architecture benefits from technology scaling. Moreover, future research can investigate new SAR ADC architectures for use in the time-interleaved architecture.

In the second research effort, an on-chip digital calibration technique was developed for adaptive input impedance boosting in biopotential signal measurement applications. The calibration system was fabricated together with an analog front-end in 130nm CMOS technology. Measurement results demonstrated the system’s capability to automatically control the programmable negative capacitance generation feedback of the instrumentation amplifier, boosting its differential input impedance to above 570MΩ at 50Hz. Other calibration algorithms such as binary search can be utilized in the future to reduce the calibration time. Furthermore, the digitally-assisted design techniques from this research can be extended to other analog design efforts to enhance circuit and system performance.
8. References


[57] I. Beavers, “MS-2660 - Understanding spurious-free dynamic range in wideband GSPS ADCs,” *Analog Devices Technical Article*, 2014. Available online:


