# Low-Power Crossbar Switch With Two-Varistor Selected Complementary Atom Switch (2V-1CAS; Via-Switch) for Nonvolatile FPGA

Naoki Banno, Koichiro Okamoto, Noriyuki Iguchi, Hiroyuki Ochi, *Member, IEEE*, Hidetoshi Onodera, *Fellow, IEEE*, Masanori Hashimoto, *Senior Member, IEEE*, Tadahiko Sugibayashi, Toshitsugu Sakamoto, and Munehiro Tada, *Fellow, IEEE*

Abstract—A nonvolatile and programmable routing switch featuring a two-varistor selected complementary atom switch (2V-1CAS, also known as via-switch) is evaluated. The a-Si/SiN/a-Si varistor, used as a selector for the atom switch, shows superior nonlinear current–voltage characteristics with a high selectivity of $\sim 10^5$, which originates from the staircase barrier height in the layers. A crossbar switch block with multiple fan-outs, realized by the two control lines connected to the varistors, is demonstrated; the atom switch is placed at each cross point and programmed through the varistors. The developed via-switch crossbar switch is a strong candidate for achieving energy-efficient nonvolatile field-programmable gate arrays in Internet-of-Things applications.

*Index Terms*— Atom switch, electrochemical reaction, field-programmable gate array (FPGA), nonvolatile memory, polymer solid electrolyte (PSE), reconfigurable logic.

## I. INTRODUCTION

IN the era of the Internet-of-Things (IoT), a huge number of devices connect to networks through wireless communication, and logic large-scale integrated (LSI) circuits with high energy efficiency and flexibility contribute to achieving high-performance mobile edge computing. Among logic LSIs, central processing units (CPUs) and general-purpose computing on graphics processing units (GPGPUs) generally offer high flexibility but low energy efficiency. Field-programmable gate arrays (FPGAs) are known to be energy efficient, but reducing power consumption in mobile edge-computing applications is still essential for extending battery life [1]. Power gating is also known to be an effective way to reduce the overall chip power of CMOS circuits [2]. However, it is not easy to adopt conventional power gating for the FPGA, since the static random access memory (SRAM) used as configuration memory in the FPGA is volatile and must be reconfigured after every power-gating event. In other words, frequent power gating lowers the static power but consumes considerable power to wake up and reconfigure the FPGA. Thus, in addition to the energy efficiency of the logic operation itself, nonvolatility of the configuration information is desired for these applications.

Manuscript received March 31, 2019; revised May 24, 2019; accepted June 7, 2019. Date of publication July 3, 2019; date of current version July 23, 2019. This work was supported by JST CREST under Grant JPMJCR1432. The review of this paper was arranged by Editor T.-H. Kim. (*Corresponding author: Naoki Banno.*)

N. Banno, K. Okamoto, N. Iguchi, T. Sugibayashi, T. Sakamoto, and M. Tada are with NEC Corporation, Tsukuba 305-8501, Japan (e-mail: banno@bu.jp.nec.com).

H. Ochi is with the Department of Computer Science, College of Information Science and Engineering, Ritsumeikan University, Kusatsu 525-8577, Japan.

H. Onodera is with the Department of Communications and Computer Engineering, Graduate School of Informatics, Kyoto University, Kyoto 606-8501, Japan.

M. Hashimoto is with the Department of Information Systems Engineering, Graduate School of Information Science Technology, Osaka University, Suita 565-0871, Japan.

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TED.2019.2922352

Recently, resistive random access memory (ReRAM) and magnetoresistive random access memory (MRAM) have been introduced to store the configuration information for nonvolatile operation in FPGAs [3]–[5]. However, these works still used pass transistors for routing signals and, therefore, did not improve the energy efficiency of logic operations. To overcome this issue, ReRAM has been used not only as configuration memory but also as a routing switch for signal transmission [6]–[9].

Besides, an atom switch (also known as NanoBridge), which is a nonvolatile resistive-change device integrated in the copper back-end-of-line (Cu-BEOL) layers, has been developed as a nonvolatile programmable switch tailored for the FPGA [10]. The first FPGA implementing an atom switch was fabricated and reported in [11]. The atom switch consists of a polymer solid electrolyte (PSE) sandwiched between Ru and Cu electrodes [12]. Its resistance changes in a nonvolatile manner through the formation of a Cu bridge. Applying a positive voltage to the Cu electrode drives Cu ions to form the Cu bridge by an electrochemical reaction, and the switch turns to the ON-state. Conversely, applying a negative voltage annihilates the Cu bridge, and the switch turns to the OFF-state. In addition, a complementary atom switch (CAS), consisting of two atom switches connected in series with opposite directions, improves the OFF-state reliability with a reduced programming voltage of 2 V [13]. FPGA implementations with the CAS and their silicon results are reported in [14]–[16]. The atom switch and the conductive bridge random access memory (CBRAM) [17], [18], which is one type of ReRAM, share the same switching mechanism in that both use conductive bridges. However, when applied as the routing switch of an FPGA, the atom switch has the advantages of a high ON/OFF resistance ratio, forming-free operation, and high reliability owing to the PSE.

However, all the FPGA implementations with ReRAM, the atom switch, and the CAS mentioned above require one or two select transistors for each programmable switch. Even though the switch itself has a small footprint, the switch density is limited by the area of the transistor. If the select transistor can be eliminated or replaced, the switch density can be dramatically improved, and the interconnect delay and energy for signal transmission can be reduced. Motivated by this idea, a bidirectional diode [19]–[22] using TaO stacked on the CAS (DCAS) was proposed [23], but the DCAS is functionally limited to a single fan-out (FO) in the crossbar, so that many switch blocks are consumed. To obtain high functionality with multiple cross-point programming per column or row (multiple-FOs) of the crossbar switch, the two-varistor selected CAS (2V-1CAS) structure was proposed [24], [25]. The 2V-1CAS is named "via-switch" since it achieves an extremely small footprint of $18F^2$, which is more than one order of magnitude smaller than that of the conventional SRAM-based FPGA. The a-Si/SiN/a-Si varistor has the advantage of high compatibility with the CMOS process, since it uses only standard material elements. It has been estimated that the via-switch FPGA (VS-FPGA) achieves a further 75% area reduction relative to the conventional 1T-1CAS-FPGA [26]. A case study of application mapping shows that introducing the via-switch can reduce the array area by 21.7%, thanks to the bidirectional interconnection. A highly dense FPGA has been developed with a 26× crossbar density, a 90% reduction in delay, and a 94% reduction in energy in the interconnection at 0.5-V operation [27]. Fig. 1 summarizes the comparison of these FPGAs. The remaining development challenge is to keep the static power of the via-switch crossbar low, especially for edge-computing applications.

0018-9383 © 2019 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications\_standards/publications/rights/index.html for more information.

Fig. 1. Comparison of novel VS-FPGA with conventional FPGA using SRAM-based switch. By replacing select transistors with varistors, chip size reduction is estimated to be more than 90%. 1T-1CAS-FPGA is called "AS-FPGA." TMG stands for transmission gate.

In this paper, we first discuss the conduction mechanism of the a-Si/SiN/a-Si varistor with the aim of improving its nonlinearity (NL). The effects of the a-Si layer and of the composition of the SiN layer are analyzed. Using X-ray photoelectron spectroscopy (XPS) and X-ray absorption fine structure (XAFS) measurements, we clarify that the improved NL of the a-Si/SiN/a-Si varistor is caused by its staircase barrier height. Second, to clarify the effect of the varistor characteristics on the crossbar energy, the static power of the crossbar switch is calculated for various NL values and configurations. The effect of depopulating the cross points is also discussed, as depopulation is a very effective way to reduce the cell area and line capacitance. Finally, a $50 \times 20$ crossbar switch is fabricated, and its programmability and signal transfer are demonstrated.

## II. CROSSBAR SWITCH

In our previous work, we demonstrated a CAS-based FPGA in which the CAS-based crossbar switch is used for both the routing multiplexer (MUX) and the configuration memory of a lookup table (LUT) (Fig. 2) [26]. The logic block has two pairs of 4-input LUTs and a flip-flop. The signal is routed via the switch MUX (SMUX) and the interconnection MUX (IMUX). In the SRAM-based design, a MUX composed of SRAM and pass transistors is used for signal routing. In the CAS-based design, on the other hand, a simple CAS-based crossbar circuit can replace the SRAM and the pass transistors. Single-stage routing and the small input capacitance of the CAS improve the signal delay and active power. The crossbar switch is also usable as the configuration memory of the LUT, so that every component of the programmable cell becomes nonvolatile.

Fig. 2. Schematic of programmable logic. SRAM-based routing switch and configuration memory of LUTs are replaced by crossbar switch with via-switch.

In this paper, the CASs of the SMUX, the IMUX, and the LUT memory are replaced by via-switches without select transistors (Fig. 2). The via-switch-based crossbar switch needs two control lines, vertical and horizontal, which program the CAS through each varistor without a sneak path [Fig. 3(a)]. The varistors are connected to independent control lines through the control 1 (C1) and control 2 (C2) terminals, so that the CASs can be programmed individually without select transistors [Fig. 3(b)]. The signal lines of the CAS connect to terminal 1 (T1) and terminal 2 (T2). Multiple-FOs are achieved since the CASs can be programmed individually [27]. The two varistors are thus essential for selecting the CAS while suppressing the sneak current during programming. The ON-resistance ($R_{ON}$) of the via-switch does not change with the number of ON-state switches sharing the signal line, that is, with the number of FOs.

## III. EXPERIMENTAL PROCEDURE

First, to evaluate the performance of a simple varistor, the a-Si/SiN/a-Si varistor is fabricated on a Cu line (M1) in a 65-nm node Cu BEOL. The Ru-alloy electrode and the TiN/a-Si/SiN/a-Si/TiN layers are deposited directly on Cu through a contact hole. Second, to evaluate the via-switch, the CAS and the varistor are stacked on the edge of two Cu lines. The buffer and PSE layers are deposited on Cu through a contact hole. During the PSE deposition, the buffer metal prevents the Cu electrode from oxidizing and is itself converted to metal oxide. This metal oxide works as a part of the solid electrolyte [15], [16]. Then, the TiN/a-Si/SiN/a-Si/TiN layers are deposited on the CAS stack. The SiN and a-Si layers are deposited by plasma-enhanced chemical vapor deposition (PE-CVD) at 400 °C. The via-switch stack is dry-etched using the dual-hard-mask (DHM) process [24] of SiCN/SiO<sub>2</sub>. The DHM process allows the via-switch pattern to be transferred onto the stack without plasma damage to the varistor and the CAS. To form the pattern of the varistor region, a part of the SiO<sub>2</sub>-HM is etched using the first mask. After that, to form the pattern of the switch stack region including the varistor region, the residual SiO<sub>2</sub>-HM is etched using the second mask. The SiCN-HM protects the via-switch stack from O<sub>2</sub>-ashing damage. After the via-switch stack is etched with the metal etcher, the top electrode of the varistor (TiN) is etched except in the varistor region, and the top electrode of the CAS (Ru-alloy) remains on both the CAS and the varistor region. The via-switch can be integrated with a footprint of $18F^2$ [Fig. 3(b)]. Before integrating the varistor on a Cu BEOL, we compare the performance of varistor film stacks deposited on a 300-mm wafer. SiN or a-Si is deposited on 300-mm Si wafers as samples for the XPS and XAFS measurements. The XAFS measurements are carried out on beamline BL-2 at the SR Center of Ritsumeikan University (Shiga, Japan).

Fig. 3. (a) Schematic of a crossbar switch block with via-switch. (b) Schematic of the via-switch device structure. Terminal 1 (T1) and terminal 2 (T2) connect to signal lines; control 1 (C1) and control 2 (C2) connect to control lines.

Fig. 4. (a) I-V characteristics of a-Si and Si<sub>0.5</sub>N<sub>0.5</sub> films for varistor layers. SiN or a-Si films with different thicknesses are deposited on TiN electrode. (b) I-V characteristics of a-Si/SiN<sub>x</sub>/a-Si films. Single Si<sub>0.5</sub>N<sub>0.5</sub> and triple-layered SiN stacks are compared.

## IV. a-Si/SiN/a-Si VARISTOR

This section describes the effect of the stacked layers in a-Si/SiN<sub>x</sub>/a-Si on the NL characteristics in comparison with a single a-Si or SiN layer, and clarifies the reasons for its high NL. Fig. 4 shows the I-V characteristics of various varistor film stacks deposited on a 300-mm wafer. As shown in Fig. 4(a), the I-V characteristic of each single-insulator layer [i.e., metal/insulator/metal (MIM)] shows poor NL performance. To improve the NL performance, we propose a new technique that uses the a-Si layer as a barrier-height control layer [i.e., metal/semiconductor/insulator/semiconductor/metal (MSISM)] [Fig. 4(b)]. Furthermore, we propose a triple-layered SiN stack to improve on the barrier-height control layers (i.e., MSIIISM). The triple-layered SiN stack consists of a thin $Si_{0.49}N_{0.51}$ layer sandwiched between two $Si_{0.51}N_{0.49}$ layers and shows a high NL of over $10^5$. The NL is defined as the ON/OFF current ratio between the ON-state at 2 V and the OFF-state at 0.25 V [24]. The composition of SiN can be controlled by adjusting the amounts of N<sub>2</sub> and SiH<sub>4</sub> gas in the PE-CVD, which provides a sufficient process window for the SiN composition.

Fig. 5. XPS spectra of $Si_{0.49}N_{0.51}$ and $Si_{0.51}N_{0.49}$. (a) Si-N peak of N 1s and (b) valence band.

Fig. 6. XANES spectra of $Si_{0.49}N_{0.51}$ for (a) normalized intensity and (b) differentiated intensity. XANES spectra of $Si_{0.51}N_{0.49}$ for (c) normalized intensity and (d) differentiated intensity.
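The NL figure of merit used in this section (ON current at 2 V divided by OFF current at 0.25 V) can be extracted from a sampled I-V curve as sketched below. This is only an illustration: the helper names `interp_log_current` and `nonlinearity`, the log-linear interpolation between samples, and the sample values are assumptions introduced here, not measured data or the authors' extraction procedure.

```python
import math

def interp_log_current(voltages, currents, v):
    """Interpolate |I| at voltage v, linearly in log10(|I|) between samples.

    Assumes distinct sample voltages and nonzero currents.
    """
    pts = sorted(zip(voltages, currents))
    for (v0, i0), (v1, i1) in zip(pts, pts[1:]):
        if v0 <= v <= v1:
            t = (v - v0) / (v1 - v0)
            log_i = (1 - t) * math.log10(abs(i0)) + t * math.log10(abs(i1))
            return 10.0 ** log_i
    raise ValueError("voltage outside the measured range")

def nonlinearity(voltages, currents, v_on=2.0, v_off=0.25):
    """NL = I(v_on) / I(v_off), with the paper's 2 V and 0.25 V as defaults."""
    i_on = interp_log_current(voltages, currents, v_on)
    i_off = interp_log_current(voltages, currents, v_off)
    return i_on / i_off

# Hypothetical I-V samples (volts, amperes); NOT taken from Fig. 4 or Fig. 8.
v_samples = [0.25, 0.5, 1.0, 1.5, 2.0]
i_samples = [1e-9, 1e-8, 1e-6, 1e-5, 1e-4]
nl = nonlinearity(v_samples, i_samples)
```

With these made-up samples the ratio is five decades, which is the order of magnitude the triple-layered stack is reported to reach.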

Next, to investigate the difference in barrier height between $Si_{0.49}N_{0.51}$ and $Si_{0.51}N_{0.49}$, the bandgaps are analyzed by XPS and XAFS measurements. Fig. 5 shows the N 1s and valence-band XPS spectra of $Si_{0.49}N_{0.51}$ and $Si_{0.51}N_{0.49}$. The differences between the binding energy of the N 1s peak and that of the valence-band onset are 395.87 eV for $Si_{0.49}N_{0.51}$ and 396.84 eV for $Si_{0.51}N_{0.49}$. Fig. 6 shows the X-ray absorption near-edge structure (XANES) spectra of $Si_{0.49}N_{0.51}$ and $Si_{0.51}N_{0.49}$. The onset positions of the N K-edge in XANES are 400.82 eV for $Si_{0.49}N_{0.51}$ and 400.72 eV for $Si_{0.51}N_{0.49}$. From Figs. 5 and 6, the bandgaps of $Si_{0.49}N_{0.51}$ and $Si_{0.51}N_{0.49}$ are calculated to be 4.9 and 3.9 eV, respectively. Fig. 7 shows simple band diagrams of the MIM, MSISM, and MSIIISM varistors. In the case of the MSIIISM varistor, it is assumed that the tunneling current under high voltage is enhanced by using a thin $Si_{0.49}N_{0.51}$ layer, and that the leak current under low voltage is kept low by the large bandgap of $Si_{0.49}N_{0.51}$ and the two steps of high barrier height in a-Si/$Si_{0.51}N_{0.49}$.
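The bandgap values quoted above can be checked with a short worked calculation. This is one reading that is consistent with the quoted numbers rather than a procedure spelled out in the text: it assumes that the XPS difference gives the separation between the N 1s level and the valence-band edge, and that the XANES N K-edge onset gives the separation between the N 1s level and the conduction-band edge. The symbols $\Delta E_{\mathrm{XPS}}$ and $E_{\mathrm{onset}}$ are introduced here for illustration only.

$$
\begin{aligned}
E_g &\approx E_{\mathrm{onset}} - \Delta E_{\mathrm{XPS}} \\
Si_{0.49}N_{0.51}:\quad E_g &\approx 400.82\ \mathrm{eV} - 395.87\ \mathrm{eV} = 4.95\ \mathrm{eV} \\
Si_{0.51}N_{0.49}:\quad E_g &\approx 400.72\ \mathrm{eV} - 396.84\ \mathrm{eV} = 3.88\ \mathrm{eV}
\end{aligned}
$$

Under this assumption, the results agree with the reported 4.9 and 3.9 eV to within about 0.05 eV, and the roughly 1-eV difference between the two compositions is what produces the staircase barrier.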

Fig. 7. Band diagrams of (a) single layer of SiN varistor (MIM), (b) a-Si/SiN/a-Si layer (MSISM), and (c) a-Si/SiN/a-Si layer with triple-layered SiN stack (MSIIISM).

Fig. 8. Current density–voltage characteristics of integrated a-Si/SiN stack/a-Si varistor with triple-layered SiN stack.

Fig. 9. (a) Top-view SEM image and (b) cross-sectional TEM image of via-switch.

Next, the developed a-Si/SiN/a-Si varistor with the triple-layered SiN is integrated on a Cu line in a 65-nm node BEOL on a 300-mm wafer. The bottom Ru-alloy electrode and the varistor stack of TiN/a-Si/SiN/a-Si/TiN layers are deposited directly on Cu through a contact hole. Fig. 8 shows the current density–voltage (J-V) curve of the integrated varistor, which exhibits an NL of $1.1 \times 10^5$, an OFF resistance ($R_{OFF}$) of 270 M$\Omega$ at 0.25 V, and a maximum current density ($J_{max}$) of 1.63 MA/cm<sup>2</sup>. The small temperature dependence of the ON and OFF currents originates from the small activation energies ($E_a$) of 0.032 eV in the ON-state and 0.037 eV in the OFF-state [25]. These results support the tunneling conduction model of the varistor.
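To give a feel for how weak a temperature dependence these activation energies imply, the sketch below evaluates a simple Arrhenius scaling, $I \propto \exp(-E_a / k_B T)$. The Arrhenius form, the function name `arrhenius_ratio`, and the 25 °C and 125 °C temperatures are assumptions chosen for illustration; the text reports only the $E_a$ values.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K

def arrhenius_ratio(e_a_ev, t_low_c, t_high_c):
    """Current ratio I(t_high) / I(t_low), assuming I ~ exp(-Ea / (kB * T))."""
    t_low = t_low_c + 273.15    # convert Celsius to kelvin
    t_high = t_high_c + 273.15
    return math.exp(e_a_ev / K_B_EV * (1.0 / t_low - 1.0 / t_high))

# Activation energies from the paper; the temperature span is hypothetical.
on_ratio = arrhenius_ratio(0.032, 25.0, 125.0)   # ON-state
off_ratio = arrhenius_ratio(0.037, 25.0, 125.0)  # OFF-state
```

Under these assumptions both currents change by well under a factor of two over a 100 °C span, which is the kind of weak dependence expected for tunneling-dominated conduction.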

## V. DEMONSTRATION OF $50 \times 20$ CROSSBAR SWITCH

Fig. 9 shows the device structure of the via-switch stack fabricated using the DHM etching process. The buffer, PSE, and Ru-alloy (CAS) layers are deposited in the hole, followed by the TiN/a-Si/SiN stack/a-Si/TiN varistor stack. The a-Si/SiN/a-Si stack has good CMOS-process compatibility and can be etched readily under the etching conditions used for the atom switch. The developed varistor is therefore suitable for forming the via-switch structure. Moreover, high NL is confirmed in the crossbar switch application of the FPGA. Fig. 9(a) shows a top-view SEM image of the via-switch after the stack etching. The two varistors are clearly separated on the CAS. Fig. 9(b) shows a cross-sectional TEM image of the via-switch. The varistor and the atom switch are stacked on the Cu interconnects.

Fig. 10. (a) Set/reset characteristics of via-switch single side. (b) ON- and OFF-state characteristics of via-switch with leak current between varistors.

Fig. 10(a) shows the I-V characteristics of a single side of the integrated via-switch. By applying a positive voltage to T1 while keeping C1 grounded [Fig. 3(b)], the Cu bridge is formed in the PSE, and the atom switch turns to the ON-state at 3 V. $R_{ON}$ of the via-switch depends on the I-V characteristics of the varistor, since the atom switch is connected to the varistor in series: the varistor current at the moment the via-switch turns to the ON-state determines $R_{ON}$. The programming current compliance is thus provided by the stacked varistor. In contrast, by applying a negative voltage to T1, the Cu bridge is annihilated, and the atom switch turns to the OFF-state at -3 V. The set and reset voltages are slightly increased due to the voltage drop across the varistor. After the set programming of the atom switch, the low current at a low voltage of around 0.25 V corresponds to the OFF current of the varistor, which is consistent with the characteristics of the single varistor shown in Fig. 8. To check the cross-point characteristics during programming, the ON/OFF characteristics between T1 and T2 and the leak current between C1 and C2 are measured [Fig. 10(b)]. The electrical separation between the two varistors and the high ON/OFF current ratio of the CAS are confirmed. The retention characteristics depend on the atom switch. In our previous work, we reported ON-state data-retention tests of the atom switch, performed at 260 °C, corresponding to the lead-free soldering temperature, and at 150 °C for long operation lifetime. No failure was observed in the ON-state atom switches for 1 h at 260 °C or for 3000 h at 150 °C [28].
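The set/reset behavior of a single side described above can be summarized as a toy threshold model. The class name `ViaSwitchSide` and the idealized step-like thresholds are assumptions made for illustration; a real device has a distribution of switching voltages and a varistor-limited current that this sketch ignores.

```python
class ViaSwitchSide:
    """Toy threshold model of one side of a via-switch (T1 vs. grounded C1)."""

    V_SET = 3.0     # turn-ON voltage reported for the integrated via-switch (V)
    V_RESET = -3.0  # turn-OFF voltage reported for the integrated via-switch (V)

    def __init__(self):
        self.on = False  # a fresh switch is assumed to start in the OFF-state

    def apply(self, v_t1):
        """Apply a voltage to T1 with C1 grounded and return the new state."""
        if v_t1 >= self.V_SET:
            self.on = True       # Cu bridge formed
        elif v_t1 <= self.V_RESET:
            self.on = False      # Cu bridge annihilated
        # Voltages between the thresholds leave the state unchanged (nonvolatile).
        return self.on

side = ViaSwitchSide()
states = [side.apply(v) for v in (1.0, 3.0, 0.5, -1.0, -3.0)]
```

The state is retained for any voltage between the two thresholds, which is why signal-level voltages such as 0.5 V do not disturb a programmed switch in this idealized picture.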

Fig. 11. Schematic of $50 \times 20$ crossbar switches. Crossbar switches with (a) 100% population and (b) 50% population.

Fig. 12. Schematic of leak paths in crossbar switch using via-switch at operation. Leak current flows through two varistors and one atom switch during signal transfer. $I_{PRO}$ is the leak current via programming lines and $I_{CAS}$ is the leak current via signal lines.

Fig. 13. Five different configurations for simulating relationships between static power of $50 \times 20$ crossbar switch and NL of varistor.

Next, the relationship between NL and the static power of the $50 \times 20$ crossbar switch is investigated. The effect of depopulating the cross points is also discussed, since depopulation is a very effective way to reduce the cell area and line capacitance. In our previous study, we reported that crossbar switches with select transistors were depopulated by 50% to reduce the cell area without degrading application mappability, while keeping the same functionality [29]. Understanding the effect of via-switch depopulation on the performance and power of the crossbar is essential for practical implementation. Fig. 11 shows the schematic of the $50 \times 20$ crossbar switches with 100% population and 50% population, where a via-switch is placed at every other cross point. A leak path (LP) through the varistors occurs when a programmed via-switch connects the signal lines. Fig. 12 shows the schematic of the LPs in the crossbar switch using the via-switch. The leak current flows through two varistors and one OFF-state atom switch. To reduce the power consumption of the crossbar switch, the leak current should be kept low. We evaluate the leak current in the $50 \times 20$ crossbar switch in terms of the NL of the varistor. Here, we assume the five different configurations shown in Fig. 13. The number of LPs in each configuration is indicated in the figure. The black dots show the via-switches that are programmed to the ON-state and transfer the signals. The electric potential of the horizontal lines connected to the ON-state via-switches rises to $V_{DD}$ and creates LPs to the grounded lines. Configurations 1, 2, 3, and 5 include multiple-FOs. One of two adjacent input lines is biased at high voltage ($V_{DD} = 0.5$ V), and the other is grounded. Fig. 14 shows five other configurations using a 50% depopulated crossbar switch.
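The difference between the fully populated and the 50% depopulated $50 \times 20$ crossbars can be made concrete with a small data-structure sketch. The checkerboard placement used here is an assumption: the text states only that the cross points are thinned out by half, and the exact pattern is given in Fig. 11. The names `population_mask` and `count_switches` are illustrative.

```python
ROWS, COLS = 50, 20  # crossbar dimensions from the paper

def population_mask(rows, cols, depopulated):
    """Return a grid of booleans: True where a via-switch is present.

    The checkerboard pattern for the depopulated case is an assumption.
    """
    return [[(not depopulated) or (r + c) % 2 == 0 for c in range(cols)]
            for r in range(rows)]

def count_switches(mask):
    """Count populated cross points."""
    return sum(cell for row in mask for cell in row)

full = population_mask(ROWS, COLS, depopulated=False)
half = population_mask(ROWS, COLS, depopulated=True)
n_full = count_switches(full)
n_half = count_switches(half)
# Rows still reachable from the first column after depopulation.
rows_per_column = sum(half[r][0] for r in range(ROWS))
```

Halving the number of cross points halves the number of devices that can form leak paths, at the cost of each line reaching only half of the crossing lines.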



Fig. 14. Five different configurations by using 50% depopulated crossbar switch for simulation of relationship between static power of  $50 \times 20$  crossbar switch and NL of varistor.



Fig. 15. Simulated relationships between static power of  $50 \times 20$  crossbar switch and NL of varistors in five different configurations in (a) Fig. 13 and (b) Fig. 14. ON current is supposed to be 500  $\mu$ A for extracting NL.



Fig. 16. (a) Programming configuration of  $50 \times 20$  crossbar switch with via-switch. (b) Outputs of OUT1-50 when signal applies to IN5 and IN10.

Fig. 15 shows the static power of the $50 \times 20$ crossbar switch with 100% population and 50% population. The leak current is composed of two components: the leak current via the programming lines ($I_{PRO}$) and the leak current via the signal lines ($I_{CAS}$), as shown in Fig. 12. $I_{PRO}$ depends on the $R_{OFF}$ of the CAS and of the varistor, namely, on NL. $I_{PRO}$ is an additional leak current introduced by the varistor into the conventional CAS-based FPGA. $I_{CAS}$ depends only on the $R_{OFF}$ of the CAS, which is 200 M$\Omega$ in the calculations. As NL increases in Fig. 15, the static power decreases and approaches the level determined by $I_{CAS}$. A high NL of over $10^5$ gives low static power, irrespective of the configuration. An NL higher than $10^5$ is desirable to keep the static power of the crossbar switch below 0.2 $\mu$W [Fig. 15(a)]. However, when NL is below $10^5$, the static power is strongly affected by the configuration and the number of LPs, and a thorough mapping algorithm may be required to minimize the power. When NL is high enough, the introduction of a varistor does not impact the static power in the FPGA. Depopulating the cross points is an effective way of reducing the static power because it directly reduces the number of LPs [Fig. 15(b)]. It is confirmed that depopulation is useful to reduce not only the cell area and capacitance of the crossbar but also the static power.

Finally, to demonstrate multiple-FOs in the crossbar switch, six CASs are programmed along two column lines [columns 5 and 10 in Fig. 16(a)]. After programming, we apply signal waveforms to IN5 and IN10 and monitor the output signals at all output ports. As a result, accurate signal transfer is confirmed with small crosstalk [Fig. 16(b)]. Thus, a large-scale crossbar switch with the via-switch is successfully demonstrated.
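The multiple-FO demonstration can be pictured as a routing table: each programmed cross point ties one input column to one output row, and several cross points on the same column give that input several fan-outs. In the sketch below, the function `route`, the output indices, and the three-plus-three split between IN5 and IN10 are hypothetical; the text states only that six CASs are programmed along columns 5 and 10.

```python
def route(programmed, active_inputs, n_outputs=50):
    """Map each output (1..n_outputs) to the set of active inputs driving it.

    `programmed` is a set of (output, input) pairs for ON-state cross points.
    """
    driven = {out: set() for out in range(1, n_outputs + 1)}
    for out, inp in programmed:
        if inp in active_inputs:
            driven[out].add(inp)
    return driven

# Hypothetical programming pattern: three ON cross points on each column.
programmed = {(3, 5), (17, 5), (42, 5), (8, 10), (25, 10), (31, 10)}

driven = route(programmed, active_inputs={5, 10})
fan_out = {inp: sum(1 for _, i in programmed if i == inp) for inp in (5, 10)}
active_outputs = sorted(out for out, srcs in driven.items() if srcs)
```

Outputs that are not listed in `active_outputs` correspond to the ports where only crosstalk should be observed in a measurement such as Fig. 16(b).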

## VI. CONCLUSION

A via-switch was successfully integrated into the Cu-BEOL, yielding a compact crossbar switch that supports multiple-FOs without select transistors. The newly proposed triple-layered a-Si/SiN stack/a-Si varistor improves the NL and the OFF resistance to more than $10^5$ and 270 M$\Omega$, respectively, making it applicable to the routing switches of low-power FPGAs. A large-scale $50 \times 20$ crossbar switch was demonstrated with multiple-FOs. Simulation shows that depopulating the crossbar switch reduces not only the cell area and capacitance but also the static power.

## ACKNOWLEDGMENT

Part of the device processing was carried out by AIST, Japan.

## REFERENCES

- [1] I. Kuon and J. Rose, "Measuring the gap between FPGAs and ASICs," *IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst.*, vol. 26, no. 2, pp. 203–215, Feb. 2007. doi: 10.1109/TCAD.2006.884574.
- [2] K. Kim, R. Kanj, and R. V. Joshi, "Impact of FinFET technology for power gating in nano-scale design," in *Proc. 15th Int. Symp. Qual. Electron. Design*, Mar. 2014, pp. 543–547. doi: 10.1109/ISQED.2014.6783374.
- [3] E. Vianello *et al.*, "Resistive memories for ultra-low-power embedded computing design," in *IEEE IEDM Tech. Dig.*, Dec. 2014, pp. 6.3.1–6.3.4. doi: 10.1109/IEDM.2014.7046995.
- [4] A. Kawahara *et al.*, "An 8 Mb multi-layered cross-point ReRAM macro with 443 MB/s write throughput," in *IEEE ISSCC Dig. Tech. Papers*, Feb. 2012, pp. 432–434. doi: 10.1109/ISSCC.2012.6177078.
- [5] G. Hu *et al.*, "STT-MRAM with double magnetic tunnel junctions," in *IEDM Tech. Dig.*, Dec. 2015, pp. 26.3.1–26.3.4. doi: 10.1109/IEDM.2015.7409772.
- [6] P.-E. Gaillardon *et al.*, "Design and architectural assessment of 3-D resistive memory technologies in FPGAs," *IEEE Trans. Nanotechnol.*, vol. 12, no. 1, pp. 40–50, Jan. 2013. doi: 10.1109/TNANO.2012.2226747.
- [7] X. Tang, G. De Micheli, and P.-E. Gaillardon, "A high-performance FPGA architecture using one-level RRAM-based multiplexers," *IEEE Trans. Emerg. Topics Comput.*, vol. 5, no. 2, pp. 210–222, Apr./Jun. 2017. doi: 10.1109/TETC.2016.2630121.
- [8] J. Cong and B. Xiao, "FPGA-RPI: A novel FPGA architecture with RRAM-based programmable interconnects," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 22, no. 4, pp. 864–877, Apr. 2014. doi: 10.1109/TVLSI.2013.2259512.
- [9] Y. Y. Liauw, Z. Zhang, W. Kim, A. El Gamal, and S. S. Wong, "Nonvolatile 3D-FPGA with monolithically stacked RRAM-based configuration memory," in *IEEE ISSCC Dig. Tech. Papers*, Feb. 2012, pp. 406–408. doi: 10.1109/ISSCC.2012.6177067.

- [10] M. Miyamura *et al.*, "Programmable cell array using rewritable solid-electrolyte switch integrated in 90 nm CMOS," in *IEEE ISSCC Dig. Tech. Papers*, Feb. 2011, pp. 228–229. doi: 10.1109/ISSCC.2011.5746296.
- [11] M. Miyamura *et al.*, "First demonstration of logic mapping on nonvolatile programmable cell using complementary atom switch," in *IEDM Tech. Dig.*, Dec. 2012, pp. 247–250. doi: 10.1109/IEDM.2012.6479020.
- [12] M. Tada, K. Okamoto, T. Sakamoto, M. Miyamura, N. Banno, and H. Hada, "Polymer solid-electrolyte switch embedded on CMOS for nonvolatile crossbar switch," *IEEE Trans. Electron Devices*, vol. 58, no. 12, pp. 4398–4406, Dec. 2011. doi: 10.1109/TED.2011.2169070.
- [13] M. Tada *et al.*, "Improved off-state reliability of nonvolatile resistive switch with low programming voltage," *IEEE Trans. Electron Devices*, vol. 59, no. 9, pp. 2357–2362, Sep. 2012. doi: 10.1109/TED.2012.2204263.
- [14] M. Tada *et al.*, "Improved reliability and switching performance of atom switch by using ternary Cu-alloy and RuTa electrodes," in *IEEE IEDM Tech. Dig.*, Dec. 2012, pp. 29.8.1–29.8.4. doi: 10.1109/IEDM.2012.6479133.
- [15] N. Banno *et al.*, "Nonvolatile 32×32 crossbar atom switch block integrated on a 65-nm CMOS platform," in *Proc. Symp. VLSI Technol. (VLSIT)*, Jun. 2012, pp. 39–40. doi: 10.1109/VLSIT.2012.6242450.
- [16] N. Banno *et al.*, "Improved switching voltage variation of Cu atom switch for nonvolatile programmable logic," *IEEE Trans. Electron Devices*, vol. 61, no. 11, pp. 3827–3832, Nov. 2014. doi: 10.1109/TED.2014.2355830.
- [17] S. Fujii *et al.*, "Scaling the CBRAM switching layer diameter to 30 nm improves cycling endurance," *IEEE Electron Device Lett.*, vol. 39, no. 1, pp. 23–26, Jan. 2018. doi: 10.1109/LED.2017.2771718.
- [18] J. Zahurak *et al.*, "Process integration of a 27nm, 16Gb Cu ReRAM," in *IEDM Tech. Dig.*, Dec. 2014, pp. 6.2.1–6.2.4. doi: 10.1109/IEDM.2014.7046994.
- [19] L. Zhang *et al.*, "Ultrathin metal/amorphous-silicon/metal diode for bipolar RRAM selector applications," *IEEE Electron Device Lett.*, vol. 35, no. 2, pp. 199–201, Feb. 2014. doi: 10.1109/LED.2013.2293591.
- [20] W. Lee *et al.*, "Varistor-type bidirectional switch (JMAX>10<sup>7</sup>A/cm<sup>2</sup>, selectivity 10<sup>4</sup>) for 3D bipolar resistive memory arrays," in *Proc. Symp. VLSI Technol. (VLSIT)*, Jun. 2012, pp. 37–38. doi: 10.1109/VLSIT.2012.6242449.
- [21] E. Cha *et al.*, "Nanoscale (~10nm) 3D vertical ReRAM and NbO<sub>2</sub> threshold selector with TiN electrode," in *IEDM Tech. Dig.*, Dec. 2013, pp. 10.5.1–10.5.4. doi: 10.1109/IEDM.2013.6724602.
- [22] Q. Luo *et al.*, "Cu BEOL compatible selector with high selectivity (>10<sup>7</sup>), extremely low off-current (~pA) and high endurance (>10<sup>10</sup>)," in *IEDM Tech. Dig.*, Dec. 2015, pp. 10.4.1–10.4.4. doi: 10.1109/IEDM.2015.7409669.
- [23] K. Okamoto *et al.*, "Bidirectional TaO-diode-selected, complementary atom switch (DCAS) for area-efficient, nonvolatile crossbar switch block," in *Proc. Symp. VLSI Technol.*, Jun. 2013, pp. T242–T243.
- [24] N. Banno *et al.*, "A novel two-varistors (a-Si/SiN/a-Si) selected complementary atom switch (2V-1CAS) for nonvolatile crossbar switch with multiple fan-outs," in *IEDM Tech. Dig.*, Dec. 2015, pp. 2.5.1–2.5.4. doi: 10.1109/IEDM.2015.7409614.
- [25] N. Banno *et al.*, "50×20 crossbar switch block (CSB) with two-varistors (a-Si/SiN/a-Si) selected complementary atom switch for a highly-dense reconfigurable logic," in *IEDM Tech. Dig.*, Dec. 2016, pp. 16.4.1–16.4.4. doi: 10.1109/IEDM.2016.7838431.
- [26] M. Miyamura *et al.*, "Low-power programmable-logic cell arrays using nonvolatile complementary atom switch," in *Proc. 15th Int. Symp. Qual. Electron. Design*, Mar. 2014, pp. 330–334. doi: 10.1109/ISQED.2014.6783344.
- [27] H. Ochi *et al.*, "Via-switch FPGA: Highly dense mixed-grained reconfigurable architecture with overlay via-switch crossbars," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 26, no. 12, pp. 2723–2736, Dec. 2018. doi: 10.1109/TVLSI.2018.2812914.
- [28] N. Banno *et al.*, "A fast and low-voltage Cu complementary-atom-switch 1Mb array with high-temperature retention," in *Symp. VLSI Technol. (VLSI-Technol.) Dig. Tech. Papers*, Jun. 2014, pp. 1–2. doi: 10.1109/VLSIT.2014.6894437.
- [29] Y. Tsuji *et al.*, "A 2× logic density programmable logic array using atom switch fully implemented with logic transistors at 40nm-node and beyond," in *Proc. IEEE Symp. VLSI Circuits*, Jun. 2016, pp. 1–2. doi: 10.1109/VLSIC.2016.7573461.