# Time Domain Control of Asymmetric Interleaving in Low-Voltage CMOS Power Management Systems

Braedon Salz, *Student Member, IEEE* University of Illinois at Urbana-Champaign Website: http://www.bradysalz.com

Abstract—We present a new technique for minimizing the input current ripple in a multi-phase power converter. The target application is for Internet of Things (IoT) devices, where one power converter must provide numerous supply rail voltages. Based on previous analytical work, a method was found to minimize the input current ripple and ultimately reduce the power converter area. In this work, we apply that analysis to a digital time-based control system. A prototype of this work was simulated in Verilog and 65nm CMOS.

*Index Terms*—Time domain control, DCDL, DLL, asymmetric operation, ripple minimization, multiphase converter.

#### I. INTRODUCTION

The largest elements in low-voltage power converters are increasingly becoming the passive filter elements. One solution to minimize this effect is to use an interleaved multi-phase converter, where an N phase converter has each of its switching waveforms offset by  $2\pi/N$ . This reduces the switching stress seen by each component, but also requires us to have N multiples of each component. With perfect interleaving, there is no effective ripple on the input and output nodes in a single-output system. This methodology can also be applied to power systems where instead of having a singular output, we have multiple outputs.

One such use case is IoT devices. These systems require one input power source to supply numerous loads, often with varying voltage and current requirements. There is even more pressure in these systems to reduce package size and board area, so proper interleaving is even more critical. However, the proper phase shift to minimize voltage and current ripple is not immediately clear. In a single-input single-output (SISO) converter, the ideal phase shifts are always multiples of $2\pi/N$. In an asymmetric converter, this spacing is beneficial but non-optimal. This work expands on the algorithmic approach presented in [1] and applies it in a time domain control loop.

We implement this algorithm using a digital control loop with two stages. First, we sample each of the three buck converters at both the high-side output resistance and the low-side sense resistor. This allows us to measure the output voltage ( $V_{out,k}$ ) and current ( $I_{out,k}$ ). The sampling is performed by two six-bit flash ADCs. The high-side value is fed directly into the ADC, while the low-side sense voltage is first amplified by a factor of 50 to reach full scale. In the second stage, we pass these values to a microcontroller to compute the Fourier coefficients, here implemented using



Fig. 1. One branch of the proposed control design

Verilog-AMS with 32-bit precision, which outputs a six-bit code to the routing engine. Finally, the routing engine takes in the output control bits and routes the appropriate signals to the delay-locked loop (DLL) and digitally controlled delay line (DCDL). One subsystem of this controller is shown in Fig. 1.

This paper is organized as follows. In Section II, we discuss the input ripple minimization algorithm as derived in previous work. In Section III, we derive the minimal digital precision needed for the control loop. In Section IV, we discuss the implementation of each block in the control loop. In Section V, we present the simulation results. Lastly, in Section VI, we present our conclusions and ideas for further work.

#### **II. MINIMIZATION ALGORITHM**

This system was designed using a single input, three output buck converter. Using the algorithm derived in [1], we can derive the necessary phase offset for each branch. We begin by defining the input current for the k-th branch  $(i_{in_k})$  as:

$$i_{in_k}(t) = \begin{cases} I_{out_k} - \frac{\Delta I_k}{2} + \frac{\Delta I_k}{D_k T} t & \text{for } 0 < t \le D_k T \\ 0 & \text{for } D_k T < t \le T \end{cases} \tag{1}$$

where D refers to the duty cycle,  $\Delta I$  the current ripple, and  $I_{out}$  the average output current. The current ripple in a buck converter is expressed as:

$$\Delta I_k = \frac{V_{in}(1 - D_k)D_k}{f_{sw}L_k} \tag{2}$$

where  $f_{sw}$  represents the switching frequency, and L the inductance value.
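As a quick numerical check of (2), the sketch below evaluates the ripple of the three branches using the operating conditions of Table I. It assumes an ideal buck converter, so that $D_k = V_{out,k}/V_{in}$; the function and variable names are illustrative only and are not part of the original design.

```python
# Illustrative evaluation of Eq. (2); all names here are hypothetical.
V_IN = 2.0                 # input voltage [V] (Table I)
F_SW = 1e6                 # switching frequency [Hz] (Table I)
INDUCTANCE = 1e-6          # L_1 = L_2 = L_3 [H] (Table I)
V_OUT = [1.5, 1.25, 1.0]   # output voltages [V] (Table I)


def ripple_current(v_in, duty, f_sw, inductance):
    """Peak-to-peak inductor current ripple of a buck converter, Eq. (2)."""
    return v_in * (1.0 - duty) * duty / (f_sw * inductance)


def branch_ripples():
    """Return (duty cycle, ripple) per branch, assuming ideal D_k = V_out,k / V_in."""
    results = []
    for v_out in V_OUT:
        duty = v_out / V_IN
        results.append((duty, ripple_current(V_IN, duty, F_SW, INDUCTANCE)))
    return results


if __name__ == "__main__":
    for k, (duty, ripple) in enumerate(branch_ripples(), start=1):
        print(f"branch {k}: D = {duty:.3f}, delta_I = {ripple:.3f} A")
```

Under these assumptions the ripple evaluates to 0.375 A, roughly 0.469 A, and 0.5 A for branches 1 to 3.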

We can then express each current component in its Fourier series form:

$$i_{in_k} = \frac{a_{k0}}{2} + \sum_{n=1}^{\infty} A_{kn} e^{-j\psi_{kn}} \tag{3}$$
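The coefficients in (3) can be estimated numerically from the waveform in (1). The sketch below is one way to do this under a stated assumption: it takes the real-valued form $a_{k0}/2 + \sum_n A_{kn}\cos(n\omega t - \psi_{kn})$ as the convention for amplitude and phase, which may differ from the convention used in [1]. The function names and the midpoint-rule integration are illustrative choices.

```python
import cmath
import math


def input_current(t, i_out, delta_i, duty, period):
    """Branch input current of Eq. (1) at time t, repeated every period."""
    t = t % period
    if 0.0 < t <= duty * period:
        return i_out - delta_i / 2.0 + delta_i * t / (duty * period)
    return 0.0


def harmonic(n, i_out, delta_i, duty, period, samples=4000):
    """Return (amplitude, phase) of harmonic n via a midpoint-rule integral.

    Assumed convention: i(t) = a0/2 + sum_n A_n * cos(n*w*t - psi_n).
    For n = 0 the returned amplitude is a0, so a0/2 is the average current.
    """
    acc = 0j
    for m in range(samples):
        t = (m + 0.5) * period / samples
        acc += input_current(t, i_out, delta_i, duty, period) * cmath.exp(
            -2j * math.pi * n * t / period)
    c_n = acc / samples                      # complex Fourier coefficient
    return 2.0 * abs(c_n), -cmath.phase(c_n)


if __name__ == "__main__":
    # Branch 1 of Table I, assuming ideal D = 0.75 and delta_I = 0.375 A.
    a0, _ = harmonic(0, 0.75, 0.375, 0.75, 1e-6)
    a1, psi1 = harmonic(1, 0.75, 0.375, 0.75, 1e-6)
    print(f"average = {a0 / 2:.4f} A, A_1 = {a1:.4f} A, psi_1 = {psi1:.4f} rad")
```

A useful sanity check is that $a_{k0}/2$ equals $D_k I_{out_k}$, the average of (1) over one period.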

With these coefficients, we can calculate the necessary phase shift $\theta_k$ for each branch. For the three-branch system considered here, they take the form:

$$\theta_1 = -\psi_1 \tag{4}$$

$$\theta_2 = \cos^{-1}\left(\frac{A_3^2 - A_2^2 - A_1^2}{2A_1 A_2}\right) - \psi_2 \tag{5}$$

$$\theta_3 = \cos^{-1}\left(\frac{A_2^2 - A_3^2 - A_1^2}{2A_1A_3}\right) - \psi_3 \tag{6}$$
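Equations (4) to (6) can be evaluated directly once the amplitudes and phases are known. The sketch below is a minimal illustration with hypothetical amplitudes; it also checks that the arguments of the inverse cosine lie in $[-1, 1]$, which holds only when the three amplitudes satisfy the triangle inequality. The function names are not from the original work, and the sign (quadrant) of each angle is taken exactly as printed above.

```python
import cmath
import math


def branch_angles(a1, a2, a3):
    """Inverse-cosine terms of Eqs. (5) and (6) for amplitudes A_1, A_2, A_3."""
    arg2 = (a3 ** 2 - a2 ** 2 - a1 ** 2) / (2.0 * a1 * a2)
    arg3 = (a2 ** 2 - a3 ** 2 - a1 ** 2) / (2.0 * a1 * a3)
    if not (-1.0 <= arg2 <= 1.0 and -1.0 <= arg3 <= 1.0):
        raise ValueError("amplitudes violate the triangle inequality")
    return math.acos(arg2), math.acos(arg3)


def phase_shifts(amplitudes, psis):
    """Return (theta_1, theta_2, theta_3) as written in Eqs. (4) to (6)."""
    alpha2, alpha3 = branch_angles(*amplitudes)
    return (-psis[0], alpha2 - psis[1], alpha3 - psis[2])


def resultant_magnitudes(a1, a2, a3):
    """Magnitudes |A_1 + A_2 e^{j alpha_2}| and |A_1 + A_3 e^{j alpha_3}|."""
    alpha2, alpha3 = branch_angles(a1, a2, a3)
    return (abs(a1 + a2 * cmath.exp(1j * alpha2)),
            abs(a1 + a3 * cmath.exp(1j * alpha3)))


if __name__ == "__main__":
    amps = (0.3, 0.4, 0.5)   # hypothetical amplitudes, not simulation results
    print(phase_shifts(amps, (0.0, 0.0, 0.0)))
    print(resultant_magnitudes(*amps))
```

By the law of cosines, the first resultant magnitude equals $A_3$ and the second equals $A_2$, which is the geometric condition the inverse-cosine terms encode.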

#### **III. LOOP PRECISION**

With the proposed system, the first design choice is the precision of the phase control. Fig. 2 shows the input current ripple as the phase resolution is stepped. The function is highly irregular, as the calculations used to derive the Fourier coefficients are periodic and have numerous local minima and maxima. While it may seem that there is very little tradeoff between one and ten degree precision on average, this is strongly a result of the output load parameters. A different  $V_{out,k}$ ,  $I_{out,k}$  profile would give radically different curves, with different minima.

The trend line, however, would remain the same. This logarithmic nature comes from the magnitude to phase relationship in the Fourier Series. When taking the vector sum of this, we can see that:

$$\log(Error) \propto \sum_{k} \log(A_k) \theta_k \tag{7}$$

This implies that to truly evaluate the robustness of the system, we must look at the local maxima in Fig. 2. To further illustrate the reasoning behind this periodic behavior, Fig. 3 shows the calculated phase offset compared to the ideal phase offset for each output. Again, while we may occasionally "get lucky" and choose a phase resolution that is a natural factor of the ideal phase offset, in reality that is quite rare and holds for only a small set of operating conditions. We can see that the amplitude of the phase mismatch also follows a logarithmic trend line, as expected.

A precision of one degree per phase step was chosen as a compromise between accuracy and input ripple minimization. Beyond this point, most of the input ripple will most likely be due to higher order harmonics or effects such as the ESR ripple of the input capacitors. To determine the digital precision needed, we bound the necessary shift between -45 and 45 degrees and the duty cycle to a reasonable range (0.25 to 1). This resulted in an effective phase shift range of 61 degrees, which requires six bits of precision at one degree per step. Therefore, all digital blocks were designed with this in mind.
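The bit-width arithmetic behind this choice can be sketched as follows. The helper name is hypothetical, and it assumes the number of codes equals the phase range divided by the step size.

```python
import math


def bits_needed(phase_range_deg, step_deg):
    """Smallest word width that can index every phase step in the range.

    Assumes the number of codes is range / step (rounded up).
    """
    levels = math.ceil(phase_range_deg / step_deg)
    return max(1, math.ceil(math.log2(levels)))


if __name__ == "__main__":
    print(bits_needed(61, 1.0))   # 61 codes fit in 2**6 = 64
    print(bits_needed(61, 0.5))   # halving the step costs one more bit
```

With a 61 degree range and one degree steps, 61 codes are needed, which fits within the $2^6 = 64$ codes of a six-bit word.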



Fig. 2. The input current ripple is plotted as a function of the output phase mismatch. The x-axis represents the phase control resolution from 0.1 to 10 degree steps.



Fig. 3. The ideal phase for each phase resolution step is shown to further clarify the problem of local minima and maxima.

#### IV. IMPLEMENTATION AND TESTING DETAILS

##### A. Flash ADC Design

A data converter was needed to sample the buck converter output waveforms and present them as digital data. Due to the presumed slow update of these loads, a flash ADC was chosen. Flash ADCs are usually avoided because their area and active component count, and consequently their power, scale exponentially with the number of output bits (on the order of $2^N$ comparators for $N$ bits). However, with a low sampling speed and a low number of bits, a flash ADC is a suitable design choice. In order to minimize power consumption, the ADC is clocked at 1/64th the switching frequency. Lowering the sampling frequency also acts as a pseudo-lowpass filter, which further simplifies



Fig. 4. A subsection of the Flash ADC used. Only two slices are shown here, but in the system 64 of these are repeated in a cascading manner.

the design of our next system. The ADC is implemented using a chain of 64 segmented resistors feeding into the negative inputs of 64 comparators. The positive input of each comparator is the input voltage.

Each comparator is designed as a single-stage open-loop amplifier. The system was designed to minimize power while having high enough gain to resolve a 10mV difference in under 50ns. This corresponds to an effective comparator bandwidth of 20MHz, which is sufficient for the purpose. While lowering the bandwidth would further reduce the static power of the ADC, it would lead to slower response times. The ADC is designed to support the full-scale voltage of the buck converter output, while the current reading taken across the sense resistors requires an amplifier.
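A behavioral sketch of the flash conversion described above is given below. It is only a model under assumptions: the reference ladder is taken to place tap $k$ at $k \cdot V_{FS}/64$, the full-scale voltage of 2V is a placeholder, and the thermometer-to-binary step simply counts tripped comparators. None of these names or choices come from the original circuit.

```python
N_BITS = 6
N_COMPARATORS = 2 ** N_BITS   # 64 slices, as in the text


def comparator_outputs(v_in, v_full_scale):
    """Thermometer code: slice k trips when v_in >= k * V_FS / 64 (assumed taps)."""
    lsb = v_full_scale / N_COMPARATORS
    return [1 if v_in >= k * lsb else 0 for k in range(N_COMPARATORS)]


def flash_adc(v_in, v_full_scale=2.0):
    """Convert a voltage to a six-bit code by counting tripped comparators.

    v_full_scale = 2.0 V is a hypothetical value chosen for illustration.
    """
    ones = sum(comparator_outputs(v_in, v_full_scale))
    return max(ones - 1, 0)   # slice 0 trips for any non-negative input


if __name__ == "__main__":
    for v in (0.0, 1.0, 1.25, 1.5, 2.0):
        print(f"{v:.2f} V -> code {flash_adc(v)}")
```

In this model the code is the input voltage divided by one LSB, rounded down and clamped to the range 0 to 63.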

##### B. Digital LPF

The digital filter is a simple averaging lowpass filter with $L$ taps, which takes the form:

$$y[n] = \frac{1}{L}\sum_{k=0}^{L-1} x[n-k] \tag{8}$$

Through simulation, a four-tap filter was shown to be the most effective due to the low sampling rate and average ripple in the buck converter load. Piecewise averaging was used to minimize the possibility of overflow, and the final filter was implemented as:

$$y[n] = \frac{\frac{x[n] + x[n-1]}{2} + \frac{x[n-2] + x[n-3]}{2}}{2} \tag{9}$$
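The piecewise form in (9) can be sketched with integer arithmetic as shown below. This is an illustration rather than the actual Verilog implementation: the class and function names are hypothetical, and each halving is assumed to be a truncating right shift.

```python
def piecewise_average(x0, x1, x2, x3):
    """Eq. (9) with truncating integer halving at each stage."""
    first = (x0 + x1) >> 1    # average of the two newest samples
    second = (x2 + x3) >> 1   # average of the two oldest samples
    return (first + second) >> 1


class FourTapAverager:
    """Four-tap moving average over the most recent ADC codes."""

    def __init__(self):
        self.history = [0, 0, 0, 0]   # x[n], x[n-1], x[n-2], x[n-3]

    def update(self, sample):
        """Shift in a new sample and return the filtered output."""
        self.history = [sample] + self.history[:3]
        return piecewise_average(*self.history)


if __name__ == "__main__":
    lpf = FourTapAverager()
    print([lpf.update(40) for _ in range(4)])
```

With six-bit inputs, no intermediate sum exceeds 126, so seven bits suffice at every stage, whereas summing all four samples first would need eight bits (up to 252).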

##### C. Phase Calculations

The calculation of the Fourier coefficients for finding the necessary phase was done using Verilog-AMS with a real number implementation. This was used in lieu of a microcontroller, and was not counted in the power budget. All computation was done using the real number implementations of the sine, cosine, and square root functions. Attempts to linearize these functions failed due to the abundance of local minima and maxima, for the reasons described in Section III. The inverse



Fig. 5. Single pole delay locked loop. Only one delay stage is shown for simplicity.

cosine calculation was successfully linearized, and was expanded into its corresponding Taylor series:

$$\theta = \cos^{-1}(x) \approx \frac{\pi}{2} - x - \frac{x^3}{6} \tag{10}$$

Furthermore, since we know the calculated phase difference is bounded between  $\pi/2$  and  $\pi$ , we can linearize the system around x = -1/2 for minimal error. Note that the above Taylor series is only valid for  $|x| \leq 1$ . The resultant output phase shift was converted to a six-bit digital word, and then routed to the DLL and DCDL appropriately.
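To give a feel for the accuracy of a truncated series, the sketch below compares the standard third-order Maclaurin expansion $\cos^{-1}(x) \approx \pi/2 - x - x^3/6$ against the library inverse cosine. The function names are hypothetical, and the comparison is only an illustration of the approximation error, not a reproduction of the Verilog-AMS model.

```python
import math


def acos_taylor(x):
    """Third-order Maclaurin approximation of the inverse cosine."""
    if abs(x) > 1.0:
        raise ValueError("inverse cosine is only defined for |x| <= 1")
    return math.pi / 2.0 - x - x ** 3 / 6.0


def error_deg(x):
    """Approximation error in degrees relative to math.acos."""
    return math.degrees(math.acos(x) - acos_taylor(x))


if __name__ == "__main__":
    for x in (0.0, -0.25, -0.5, -0.75, -1.0):
        print(f"x = {x:+.2f}: error = {error_deg(x):.3f} deg")
```

At $x = -1/2$ the error is about 0.16 degrees, below the one degree phase step, but it grows quickly as $x$ approaches $-1$.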

##### D. DLL

The delay-locked loop (DLL) is used to set the three coarse, or most significant, bits in the phase selection process. The DLL functions by adjusting the control voltage on a chain of delay cells in order to precisely lock the delay to  $T_{ref}$ . The loop gain transfer function from input delay to output delay is given by:

$$\frac{\Phi_o}{\Phi_i} = \frac{K_{pd}I_{cp}K_{vcdl}}{sC} \tag{11}$$

where  $K_{pd}$  is the phase detector gain,  $I_{cp}$  the charge pump current, and  $K_{vcdl}$  the voltage-controlled delay line gain in s/V. This loop is stable and phase-locked assuming we are within the bandwidth of the integrator. A standard NAND-based tristate PFD was used here.
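As a sketch of what (11) implies dynamically, and treating (11) as the loop gain $G(s)$ of a unity-feedback loop (an assumption, since the closed-loop form is not stated above), the closed-loop response reduces to a single pole:

$$H(s) = \frac{G(s)}{1 + G(s)} = \frac{1}{1 + s/\omega_L}, \qquad G(s) = \frac{K_{pd}I_{cp}K_{vcdl}}{sC}, \qquad \omega_L = \frac{K_{pd}I_{cp}K_{vcdl}}{C}$$

Here $H(s)$, $G(s)$, and $\omega_L$ are symbols introduced only for this illustration. Under this assumption the lock time constant is $1/\omega_L = C/(K_{pd}I_{cp}K_{vcdl})$, so a larger loop capacitor or a smaller charge pump current slows the loop.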

This loop was designed to consume the minimal necessary power, which comes at the cost of having a very slow transient response. Since we assume the DLL will only need to lock once during system turn-on, this choice was easily made. Each delay cell consists of two inverters in series acting as a buffer, with a 15pF capacitor in between them to force the delay. Each inverter cell consists of a current-starved inverter, and a current mirror to bias the high side control in order to match the rise and fall times. When in lock, each cell delays the input clock by exactly  $T_{ref}/8$ . For the given switching frequency of



Fig. 6. One delay cell inside the DLL.



Fig. 7. DCDL with three control bits.

TABLE I OPERATING CONDITIONS

| Parameter              | Value        |
|------------------------|--------------|
| $V_{in}$               | 2V           |
| $V_{out,1}, I_{out,1}$ | 1.5V, 0.75A  |
| $V_{out,2}, I_{out,2}$ | 1.25V, 1.25A |
| $V_{out,3}, I_{out,3}$ | 1.0V, 0.8A   |
| $L_1, L_2, L_3$        | $1 \mu H$    |
| $f_{sw}$               | 1 MHz        |

1MHz, the loop required a 100pF capacitor and  $70\mu$ A biasing current. Each delay stage had a 15pF capacitor in between the inverters, with the NMOS sizes being 200/60 nm, and the PMOS 600/60 nm. One such delay cell is shown in Fig. 6.

##### E. DCDL

The three fine, or least significant, bits control the DCDL. An example DCDL is shown in Fig. 7. Each bit switches an NMOS transistor on or off, which correspondingly either shorts the capacitive element to ground or leaves it floating. While the capacitances would ideally scale as  $1C_{ref}$ ,  $2C_{ref}$ ,  $4C_{ref}$ , the delay effect is non-linear, so the capacitances deviate from this binary weighting. The values were found through simulation and optimization, and are 1pF, 2.5pF, and 4.3pF. The transistors are the same size as in the DLL delay cells.
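The split of the six-bit phase word between the DLL and the DCDL can be sketched as below. The capacitor values are those quoted above, but the assignment of the smallest capacitor to the least significant bit, the convention that a set bit connects its capacitor, and all names are assumptions made for illustration. The resulting delay is not computed, since it depends non-linearly on the load.

```python
# Capacitor values from the text [pF]; the bit assignment (LSB first) is assumed.
DCDL_CAPS_PF = (1.0, 2.5, 4.3)


def split_phase_word(word):
    """Split a six-bit word into (coarse DLL bits, fine DCDL bits)."""
    if not 0 <= word < 64:
        raise ValueError("phase word must fit in six bits")
    return word >> 3, word & 0b111


def dcdl_load_pf(fine_code):
    """Total capacitance switched onto the delay node for a 3-bit fine code."""
    return sum(cap for bit, cap in enumerate(DCDL_CAPS_PF)
               if (fine_code >> bit) & 1)


if __name__ == "__main__":
    coarse, fine = split_phase_word(0b101011)
    print(coarse, fine, dcdl_load_pf(fine))
    print([round(dcdl_load_pf(code), 1) for code in range(8)])
```

Listing the load for all eight fine codes makes the deviation from binary weighting visible, for example 3.5pF for code 3 and 4.3pF for code 4.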

#### V. RESULTS

The system was tested using parameters similar to those presented in the original research [1]. The operating conditions are listed in Table I, and the individual component parameters are listed in

TABLE II STATIC POWER CONSUMPTION

| Block         | Static power |
|---------------|--------------|
| ADCs          | 12.5mW       |
| DLL           | $800 \mu W$  |
| DCDL          | $200 \mu W$  |
| Routing Logic | $150 \mu W$  |
| Total         | 13.8mW       |

TABLE III COMPONENT PARAMETERS

| Parameter       | Value                 |
|-----------------|-----------------------|
| Technology      | 65nm                  |
| Switch $r_{on}$ | $50 \mathrm{m}\Omega$ |
| Inductor DCR    | $40 \mathrm{m}\Omega$ |
| $R_{sense}$     | $10 \mathrm{m}\Omega$ |
| $C_{in}$        | $4 \times 0.1 \mu F$  |
| $C_{out}$       | $4 \times 10 \mu F$   |

Table III. The controller operated as intended and correctly calculated the necessary phase offsets. However, the system did not measurably improve the input current ripple in simulation. This is most likely due to the use of an ideal input source, but the cause could not be confirmed through debugging. We expect that, with proper configuration, a reduced voltage and current ripple would be observed.

This system also had very low static power consumption, as presented in Table II. However, this does not include the power needed for the microcontroller or gate drivers, which would also be critical for evaluating the system performance.



Fig. 8. Clocking waveforms once in lock. Shown for each of the three loads.

#### VI. CONCLUSION

An alternative control strategy for power management systems with asymmetric outputs was presented here. While the ultimate goal of complete digital control without any computation was not met, the amount of full precision computation was still minimized. Additionally, since the phase shift can be set simply by writing one output register in most microcontrollers (assuming an eight-bit register), this is a substantial improvement over the delay mentioned in the previous work (10ms due to the clocking update delay).

Future improvements could certainly be made to the system. First, if one needs tighter phase control over a wider range, one must increase the sampling resolution. Doing this would likely mean switching from a flash ADC to a SAR ADC, or if even higher resolution is needed, a  $\Delta\Sigma$  ADC. Both would work well in this use case due to the very low sampling rate.

If one wished for a fully integrated digital system, there are several options available. A coarse search methodology of sweeping each phase could work, but it is very susceptible to the numerous local minima and maxima described previously. Most of the calculations can be linearized through Taylor series, but only over certain ranges. Because the Fourier coefficients appear as arguments of sine and cosine, one would need to linearize separately over sub-ranges (i.e.  $0 \le D \le 0.25$ ,  $0.25 \le D \le 0.5$ , etc.), but this is certainly doable.

#### ACKNOWLEDGMENT

The author would like to thank Professor Pilawa for providing the opportunity to investigate this work, as well as Marcel Schuck and Aaron Ho for designing the algorithmic approach.

#### REFERENCES

- [1] M. Schuck, A. Ho, and R. Pilawa-Podgurski, "Asymmetric Interleaving in Low-Voltage CMOS Power Management with Multiple Supply Rails", in *IEEE Transactions on Power Electronics*, vol. PP, no. 99, pp. 1-1.
- [2] Yongsam Moon, Jongsang Choi, Kyeongho Lee, Deog-Kyoon Jeong and Min-Kyu Kim, "An all-analog multiphase delay-locked loop using a replica delay line for wide-range operation and low-jitter performance", in *IEEE Journal of Solid-State Circuits*, vol. 35, no. 3, pp. 377-384, March 2000.
- [3] P. Krein, *Elements of Power Electronics*, 2nd ed., New York, Oxford University Press, 1998.
- [4] R. Erickson, D. Maksimovic, *Fundamentals of Power Electronics*, 2nd ed., New York, Springer Printing Press, 2001.