CAO Huamin, WANG Qi, LIU Fei, HUO Zongliang. A Novel Plane-Based Control Bus Design with Distributed Registers in 3D NAND Flash Memories[J]. Chinese Journal of Electronics, 2022, 31(4): 647-651. DOI: 10.1049/cje.2021.00.283

A Novel Plane-Based Control Bus Design with Distributed Registers in 3D NAND Flash Memories

Funds: This work was supported by the National Science and Technology Major Project of China (21-02)
  • Author Bio:

    CAO Huamin: (corresponding author) was born in 1987. She received the B.E. degree in integrated circuit engineering from Tsinghua University, China, in 2012. She is a Ph.D. candidate at the Institute of Microelectronics, Chinese Academy of Sciences, and the University of Chinese Academy of Sciences. Her research interests include circuit design in memories and design for testability. (Email: caohuamin2005@163.com)

    WANG Qi: received the Ph.D. degree from Fudan University, China, in 2005. He is a Professor at the Institute of Microelectronics, Chinese Academy of Sciences, and the University of Chinese Academy of Sciences. His research interests include novel memory design, error correction codes, and mass storage techniques. (Email: wangqi1@ime.ac.cn)

    LIU Fei: received the Ph.D. degree from Peking University, China, in 2003. He is a Professor at the Institute of Microelectronics, Chinese Academy of Sciences, and the University of Chinese Academy of Sciences. His research interests include novel memory design, phase-locked loops, and analog-to-digital converters. (Email: liufei@ime.ac.cn)

    HUO Zongliang: received the Ph.D. degree from Peking University, China, in 2003. He is a Professor at the Institute of Microelectronics, Chinese Academy of Sciences, and the University of Chinese Academy of Sciences. His research interests include process, device characterization, and reliability of novel flash memory devices. (Email: huozongliang@ime.ac.cn)

  • Received Date: August 20, 2021
  • Accepted Date: February 21, 2022
  • Available Online: March 17, 2022
  • Published Date: July 04, 2022
  • This work presents a novel plane-based, area-saving control bus design with distributed registers for 3D NAND flash memory. Compared with the conventional control circuit design, the number of control signal routing wires is reduced by 99.47%. Independent multi-plane read is compatible with the existing read operations because the register addresses are suitably assigned. Furthermore, a power-saving plane gating scheme based on the register group address is proposed, which saves about 2.9 mW of bus toggling power. A four-plane control bus design with 20K-bit registers has been demonstrated in a field programmable gate array tester. The results show that the plane-based control bus design is beneficial to high-performance 3D NAND flash memory design.
  • 3D NAND flash has been widely used in mass storage because of its large storage capacity and low cost per bit[1-3]. Other metrics, such as low read and write latencies, have also attracted extensive attention. MPR (multi-plane read) was introduced in ONFI 3.0 (Open NAND Flash Interface Specification 3.0) to perform read operations on multiple planes at once. Unfortunately, because the data from all planes become ready at the same time, they can only be output serially. IMPR (independent multi-plane read) was therefore proposed recently to overcome this limitation of MPR[4]. In IMPR, read operations in multiple planes are fully independent, so the utilization of the output ports is improved. However, independent plane read requires separate control logic circuits[4], which increases the area and power consumption of the peripheral CMOS circuit. So far, little research has discussed this problem in detail.

    The most important function of the control logic circuit is to command the memory to perform operations (e.g., read, program, erase, and other operations). To accomplish this, a multitude of registers is indispensable for storing the control signal bits. Conventional control logic is implemented with an automatic digital design flow from RTL (register transfer level) to APR (auto place and route). The registers are therefore often centralized in the control logic circuit in order to simplify the design flow and the timing check. As a result, a large routing area is occupied by the control signals running from the control logic to the controlled circuit modules. To make matters worse, the number of control signals and the routing length both grow quickly as the NAND flash die size increases. In recent years, very few works have been published on control architectures and control methods for 3D NAND flash memories. A previous work presents an overview of logic architectures in flash memories[5] and mentions that “program/erase controller drives a lot of control signals.” In addition, a system and method of controlling a 3D memory has been disclosed[6], in which the controller provides control signals to a 3D memory array and related circuits[6]. Neither work addresses how to reduce the routing area of the control signals.

    A novel plane-based, area-saving control bus architecture with distributed registers is proposed and demonstrated in this work. Control bus wires from the control logic are placed in each plane direction, and routing area for only 27 control bus signals is needed in each direction. The registers for the control signals are distributed in the local area of each plane and connected to the plane control bus. Moreover, the register addresses are carefully designed to support NR (normal read), MPR and IMPR. The registers are classified into two types: one for plane-independent circuit modules and the other for plane-shared circuit modules. A power-saving plane gating scheme based on the register address is presented, and an ASIC register byte/bit design is also introduced. Finally, a four-plane control bus design with 20K-bit registers is implemented and demonstrated in an FPGA (field programmable gate array) tester. The results show that the plane-based control bus design is beneficial to high-performance 3D NAND flash memory design.

    A diagram of the proposed plane-based control bus design in 3D NAND flash memory is shown in Fig.1. There are four plane regions (PLANE0, PLANE1, PLANE2, PLANE3), each of which includes the plane-independent memory array (ARRAY), WL voltage switches (WL SW) and BL voltage switches (BL SW), as well as the plane-shared high-voltage circuit modules (HV) and data-path circuit modules (DATA PATH). The control logic circuit (CTRL LOGIC) is located in the central area of the flash memory chip. It configures and controls the circuit modules through the control buses and registers. The control buses (BUS_PL0, BUS_PL1, BUS_PL2, BUS_PL3) are placed in the different plane directions. Register groups (reg00–reg04, reg10–reg14, reg20–reg24, reg30–reg34) are located in PLANE0, PLANE1, PLANE2 and PLANE3, respectively. Each register group consists of 1024 register bits. The inputs of these registers are connected to the control bus and the outputs are connected to the controlled circuit modules. The input/output pins (INPUT/OUTPUT) are used to transmit read and write data for the planes.

    Figure  1.  Plane-based control bus diagram in 3D NAND flash

    According to the register allocation in Fig.1, the number of effective control signal bits is 5120 per plane (5 groups per plane and 1024 register bits per group). A conventional control logic design would therefore need 5120 signal routing wires per plane. With the proposed plane-based control bus design, only 27 signal routing wires are needed. The control bus signals in each plane direction are listed in Table 1. The number of control signal routing wires, and with it the routing area, is reduced to 0.53% (27/5120) of that in the conventional control logic design. The proposed plane-based control bus design with distributed registers thus greatly reduces the signal routing area, which improves the memory integration density and reduces the storage cost per bit.

    Table  1.  Bus signals for each plane
    Signal          Bits   Description
    bus_clk         1      Bus clock signal
    bus_write       1      Bus write enable
    bus_read        1      Bus read enable
    bus_addr        7      Register group address
    wdata_is_addr   1      Write data is register address
    bus_wdata       8      Bus write data
    bus_rdata       8      Bus read data
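
The wire counts quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch that uses only the bit widths of Table 1 and the register allocation of Fig.1; the names `BUS_SIGNALS` and `routing_summary` are illustrative and do not come from the paper.

```python
# Bus signal widths per plane direction, taken from Table 1.
BUS_SIGNALS = {
    "bus_clk": 1,
    "bus_write": 1,
    "bus_read": 1,
    "bus_addr": 7,
    "wdata_is_addr": 1,
    "bus_wdata": 8,
    "bus_rdata": 8,
}

GROUPS_PER_PLANE = 5     # five register groups per plane (Fig.1)
BITS_PER_GROUP = 1024    # register bits per group


def routing_summary():
    """Return (bus wires, dedicated wires, remaining fraction) per plane."""
    bus_wires = sum(BUS_SIGNALS.values())
    conventional_wires = GROUPS_PER_PLANE * BITS_PER_GROUP
    return bus_wires, conventional_wires, bus_wires / conventional_wires


bus_wires, conventional_wires, ratio = routing_summary()
print(f"{bus_wires} bus wires vs {conventional_wires} dedicated wires per plane")
print(f"remaining routing: {ratio:.2%}, reduction: {1 - ratio:.2%}")
```

The sum of the Table 1 widths gives the 27 wires per plane direction, and the ratio reproduces both the 0.53% and the 99.47% figures.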

    The most important goal of the register group address assignment is to support NR, MPR and IMPR at the same time. The design difficulty is that different read operations need different control timing. Fig.2 illustrates the read sequences of the three read operations and the circuit modules involved in each sequence. Fig.2(a) is the read sequence of NR. Single-plane read operations, each of which reads array data to the peripheral circuit, are executed one after another. The WL SW, BL SW and HV modules work during a read operation, so they are listed below it in Fig.2(a). A data output operation follows each read operation and transfers the data to the output pins. DATAPATH works during data output, so it is listed below that operation in Fig.2(a). Fig.2(b) shows the MPR sequence. In contrast to the NR sequence, multiple planes are read simultaneously. Before the next MPR operation, data output operations are executed to transfer the data of the previous MPR operation. The involved modules are listed below each operation in Fig.2(b). Fig.2(c) is the IMPR sequence. A single-plane read operation can start at any time once the data has been transferred out. The read operations of different planes are fully independent, so read performance is greatly improved in the IMPR sequence. The plane-shared HV and DATAPATH operations are almost the same in all three read operations; the only difference is the HV working duration. The major differences between the read operations lie in the WL SW and BL SW operations. As shown in Fig.2, the main difference between NR and MPR is the number of operating planes. In NR, the WL SW and BL SW of just one plane are active, while in MPR the WL SW and BL SW of multiple planes are active (Fig.2 takes a two-plane MPR as an example). The new challenge in IMPR is the arbitrary overlap between the operations of different planes, which requires the control logic circuit to switch planes at different times.

    Figure  2.  Read sequence of (a) NR, (b) MPR, (c) IMPR with the involved circuit modules below each operation

    Based on the previous analysis, the register group addresses are assigned as shown in Table 2. The register groups are divided into two types: one for plane-shared circuit modules (HV and DATA PATH) and the other for plane-independent circuit modules (WL SW and BL SW). The first type includes reg02–reg04, reg12–reg14, reg22–reg24 and reg32–reg34. Different addresses are assigned to them so that they can be controlled independently, and the HV and DATA PATH circuits can be enabled separately according to the operation timing requirements. The second type includes reg00–reg01, reg10–reg11, reg20–reg21 and reg30–reg31. These register groups share the same group addresses across planes, and a plane address is introduced to select among them. Thus, the number of active planes for WL SW and BL SW is decided by the plane address in NR and MPR, and the WL SW and BL SW operations of different planes can be switched by changing the plane address during IMPR. In short, this register group address assignment makes IMPR compatible with NR and MPR.

    Table  2.  Register group address assignment
    Registers      Address      Description
    reg00–reg01    0x00–0x01    PL0 WL SW, BL SW
    reg02–reg04    0x10–0x12    PL0 HV, DATAPATH
    reg10–reg11    0x00–0x01    PL1 WL SW, BL SW
    reg12–reg14    0x20–0x22    PL1 HV, DATAPATH
    reg20–reg21    0x00–0x01    PL2 WL SW, BL SW
    reg22–reg24    0x30–0x32    PL2 HV, DATAPATH
    reg30–reg31    0x00–0x01    PL3 WL SW, BL SW
    reg32–reg34    0x40–0x42    PL3 HV, DATAPATH
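
To make the two address types concrete, the sketch below rebuilds the Table 2 assignment as a lookup and reports which register groups would respond to a given access. It is only an illustration: `REGISTER_MAP` and `responding_groups` are hypothetical names, and representing the plane address as a Python set of plane numbers is an assumption made for readability.

```python
# Register group address assignment from Table 2.
# Key: register group name, value: (plane, bus_addr, controlled modules).
REGISTER_MAP = {}
for plane in range(4):
    # WL SW / BL SW groups: the same addresses 0x00-0x01 in every plane.
    for i in range(2):
        REGISTER_MAP[f"reg{plane}{i}"] = (plane, 0x00 + i, "WL SW, BL SW")
    # HV / DATAPATH groups: a unique address block per plane (0x10, 0x20, ...).
    for i in range(3):
        addr = ((plane + 1) << 4) + i
        REGISTER_MAP[f"reg{plane}{i + 2}"] = (plane, addr, "HV, DATAPATH")


def responding_groups(bus_addr, selected_planes):
    """Return the register groups addressed by bus_addr (hypothetical helper).

    selected_planes is a set of plane numbers standing in for the plane
    address; it only matters for the addresses reused across planes.
    """
    hits = []
    for name, (plane, addr, _modules) in REGISTER_MAP.items():
        if addr != bus_addr:
            continue
        if bus_addr <= 0x01 and plane not in selected_planes:
            continue  # same address in every plane, so the plane address decides
        hits.append(name)
    return hits


print(responding_groups(0x00, {0}))      # NR on plane 0
print(responding_groups(0x00, {0, 1}))   # two-plane MPR
print(responding_groups(0x21, set()))    # HV/DATAPATH group of plane 1
```

In this model, NR and MPR differ only in how many planes the plane address selects, and an IMPR sequence corresponds to changing `selected_planes` between accesses.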

    As mentioned in the previous section, plane gating is applied to the plane-independent register groups. In fact, all register groups can adopt a plane gating scheme to reduce power consumption. This work presents a plane gating scheme based on the register group address, as shown in Fig.3. Signal op_impr is the IMPR operation flag from the user command decoder in the control logic circuit. Signal pl_addr_ext[3:0] is the plane address from the user command, while signal pl_addr_int[3:0] is the plane address from the control logic circuit during IMPR operation. When bus_addr[6:4] equals 3'b000, signal match_shared goes high while signals match_ind0–match_ind3 stay low. As can be seen in Fig.3, the plane selection signal bus_sel[3:0] then depends on pl_addr[3:0], which is chosen from pl_addr_ext or pl_addr_int. When bus_addr[6:4] equals 3'b001, match_ind0 and bus_sel[0] go high, so only plane 0 is selected. The same applies to each of the other three planes. To sum up, only the selected bus is active for writing and reading registers, while the rest stay in a low-power idle state.
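
The gating decision described above can be written as a small behavioral model. This is a sketch rather than the circuit of Fig.3: the function name `plane_gating` is hypothetical, and it assumes that each bit of the 4-bit plane address selects one plane (bit n for plane n), which the text does not state explicitly.

```python
def plane_gating(bus_addr, op_impr, pl_addr_ext, pl_addr_int):
    """Return bus_sel[3:0] as an integer; bit n set means plane n is selected."""
    upper = (bus_addr >> 4) & 0b111                  # bus_addr[6:4]
    match_shared = upper == 0b000                    # addresses reused by all planes
    match_ind = [upper == n + 1 for n in range(4)]   # match_ind0..match_ind3

    # During IMPR the plane address comes from the control logic,
    # otherwise it comes directly from the user command.
    pl_addr = pl_addr_int if op_impr else pl_addr_ext

    bus_sel = 0
    for n in range(4):
        by_plane_addr = match_shared and bool((pl_addr >> n) & 1)
        if by_plane_addr or match_ind[n]:
            bus_sel |= 1 << n
    return bus_sel


# Plane-specific address 0x10: only plane 0, whatever the plane address is.
print(f"{plane_gating(0x10, False, 0b1111, 0b0000):04b}")
# Shared address 0x00 outside IMPR: planes follow the external plane address.
print(f"{plane_gating(0x00, False, 0b0011, 0b0100):04b}")
# Shared address 0x01 during IMPR: planes follow the internal plane address.
print(f"{plane_gating(0x01, True, 0b0011, 0b0100):04b}")
```

Any plane whose bit in the result is 0 keeps its bus idle, which is where the power saving comes from.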

    Figure  3.  Register group address-based plane gating diagram

    Ref.[3] reports that the typical die size of the latest 3D NAND flash memory is 136 mm². By rough estimation, the length of a bus signal wire is therefore about 7.5 mm (2.5 mm vertically and 5 mm horizontally). The capacitance of each bus signal routing wire is about 1.35 pF in a typical 65 nm CMOS technology. The rail-to-rail voltage is 1.2 V, and the bus clock frequency is 50 MHz. According to Eq.(1), where t is the bus clock period (20 ns at 50 MHz), the power consumption of a wire that performs one rail-to-rail charge-discharge per clock cycle is 97.2 μW. Simultaneous read and write are not supported in this paper, which means that bus_wdata and bus_rdata do not toggle at the same time. Therefore, at most 19 bus signals are active in a bus read or bus write operation. bus_clk is the clock signal, which performs one charge-discharge in each clock cycle. The other 18 signals can only toggle at the falling edge of bus_clk, so each of them performs at most half a charge-discharge per clock cycle. That is to say, the toggling per plane in each clock cycle is equivalent to at most 10 (1+0.5×18) full charge-discharge signals. The power consumption with and without the new register group address-based plane gating design is therefore 972 μW and 3888 μW, respectively.

    $$P = \frac{C V^{2}}{t} \qquad (1)$$
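
The power figures in this section follow from a short calculation, reproduced below as a sketch. All numeric inputs are the values quoted in the text; the variable and function names are illustrative, and the clock frequency is used in place of the period (f = 1/t), which is equivalent.

```python
C_WIRE = 1.35e-12      # capacitance of one bus wire, in farads
V_RAIL = 1.2           # rail-to-rail voltage, in volts
F_BUS = 50e6           # bus clock frequency, in hertz (period 20 ns)
PLANES = 4

BUS_SIGNALS = 27       # wires per plane direction (Table 1)
IDLE_DATA_BITS = 8     # bus_wdata and bus_rdata never toggle together


def bus_power():
    """Return (per-wire, gated, ungated) worst-case toggling power in watts."""
    p_full = C_WIRE * V_RAIL ** 2 * F_BUS           # one charge-discharge per cycle
    active = BUS_SIGNALS - IDLE_DATA_BITS           # 19 signals can toggle
    equivalent = 1 + 0.5 * (active - 1)             # bus_clk + 18 half-rate signals
    p_gated = equivalent * p_full                   # only one plane bus active
    p_ungated = PLANES * p_gated                    # all four plane buses toggling
    return p_full, p_gated, p_ungated


p_full, p_gated, p_ungated = bus_power()
for label, watts in [("per wire", p_full), ("with gating", p_gated),
                     ("without gating", p_ungated),
                     ("saving", p_ungated - p_gated)]:
    print(f"{label}: {watts * 1e6:.1f} uW")
```

The difference between the ungated and gated cases is 2916 μW, which is the "about 2.9 mW" bus toggling power saving quoted in the abstract.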

    An ASIC register byte and bit design is introduced in this work. First, the registers distributed in one plane are divided into register groups according to the controlled modules. Each register group contains 128 register bytes, or 1024 register bits. Diagrams of the register byte and register bit are shown in Fig.4. The register byte is the smallest unit for reading and writing, so a 7-bit register byte address is required. Signal reg_addr[6:0] is the register byte address, which is hard-wired to a fixed value in the range 0–127. Signal reg_addr_int[6:0] is controlled inside the register group to realize random or sequential register access. When reg_addr_int[6:0] equals reg_addr[6:0], the register byte is selected. Signals wr_match and rd_match are then determined by bus_write and bus_read, respectively. They are the inputs of the register bits that enable register write and read. A register bit consists of one tri-state buffer (tri), one flip-flop (dff with enable pin) and one multiplexer (mux). The signal reset is an asynchronous reset signal. Signal por_end is the termination flag of the POR (power on read) operation. During the POR operation, the flash memory chip reads out the initial values stored in the memory array and writes them into the registers. These initial values are the configuration data of the circuits, which the circuits use to work after POR. The control signals of the circuit need another register design without the multiplexer, so that the circuit can be controlled to operate properly during the POR operation.

    Figure  4.  Diagram of an ASIC register byte and register bit

    Signal config_data[7:0] carries the control signals connected to the circuit module. Signal bus_clk is the bus clock signal, which runs at a frequency of 50 MHz and is only valid when bus_write or bus_read is high. In the normal operation of 3D NAND flash memory, the control logic circuit obtains the working status of a circuit module through bus_rdata[7:0] and then decides whether to proceed to the next operation step. For this purpose, another, read-only register design needs to be used. In addition, during the testing of 3D NAND flash memory, the control logic circuit reads back the data written to a register through bus_rdata[7:0] to determine whether the register has been written correctly. Finally, most of the time, the control logic circuit controls the operation of the circuit modules by writing control signals to the registers through bus_wdata[7:0].

    A four-plane control bus design with 20K-bit registers adopting the above design is implemented and verified in an FPGA tester environment. The timing waveform of the control bus is shown in Fig.5. All bus operations can be divided into the following three basic operation units. The first is register write, which writes data to the register group specified by bus_addr[6:0]. If bus_write is high and wdata_is_addr is low, bus_wdata[7:0] is written to the register. The second is register read, which reads data from the register group specified by bus_addr[6:0]. If bus_read is high, the data is read out through bus_rdata[7:0]. The last is register addressing, which specifies the byte address within the register group specified by bus_addr[6:0]. If bus_write and wdata_is_addr are both high, bus_wdata[7:0] is taken as the register byte address. For continuous reads and writes, only one register addressing operation is needed: the specified register byte address is the initial address, and the register byte address is incremented inside the register group. For random reads and writes, register addressing is required before each read or write. Signal reg_addr_int[6:0] is the internal register byte address generated in the register group. Signal bus_rdata_dp[7:0] is the output of the control logic that transmits bus_rdata[7:0] to the data path circuit. As shown in Fig.5, continuous write and read are executed for register bytes 5, 6 and 7 of register group 0x11.
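
The three basic bus operations can be mimicked with a small behavioral model of one register group. This is a sketch under stated assumptions, not the RTL of the paper: the class `RegisterGroup` and its method names are hypothetical, the byte address is assumed to increment after every read or write and to wrap at 128, the model re-addresses before reading back, and the data values are arbitrary.

```python
class RegisterGroup:
    """Behavioral model of one 128-byte register group on the control bus."""

    def __init__(self, group_addr):
        self.group_addr = group_addr    # address matched against bus_addr[6:0]
        self.data = [0] * 128           # 128 register bytes = 1024 register bits
        self.reg_addr_int = 0           # internal register byte address

    def cycle(self, bus_addr, bus_write=False, bus_read=False,
              wdata_is_addr=False, bus_wdata=0):
        """Apply one bus operation; return bus_rdata for a read, else None."""
        if bus_addr != self.group_addr:
            return None                              # another group is addressed
        if bus_write and wdata_is_addr:
            self.reg_addr_int = bus_wdata & 0x7F     # register addressing
            return None
        if bus_write:
            self.data[self.reg_addr_int] = bus_wdata & 0xFF   # register write
            self._advance()
            return None
        if bus_read:
            value = self.data[self.reg_addr_int]     # register read
            self._advance()
            return value
        return None

    def _advance(self):
        # Assumption: post-increment with wrap-around for continuous access.
        self.reg_addr_int = (self.reg_addr_int + 1) % 128


# Continuous write then read of bytes 5, 6 and 7 in register group 0x11.
group = RegisterGroup(0x11)
group.cycle(0x11, bus_write=True, wdata_is_addr=True, bus_wdata=5)
for value in (0xA5, 0x5A, 0xFF):                     # arbitrary test data
    group.cycle(0x11, bus_write=True, bus_wdata=value)
group.cycle(0x11, bus_write=True, wdata_is_addr=True, bus_wdata=5)
readback = [group.cycle(0x11, bus_read=True) for _ in range(3)]
print([hex(v) for v in readback])
```

Only two addressing operations are issued for six data transfers, which illustrates why continuous access is cheaper than random access, where every transfer needs its own addressing step.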

    Figure  5.  Timing waveform of control bus in continuous write and read operation

    Fig.6 shows the test waveform of the plane gating scheme. Signals bus_sel[0]–bus_sel[3] are the gating signals of the four planes, respectively. When bus_addr[6:0] is in 0x10–0x42, only the control bus in one plane direction is selected, while those of the other planes remain unchanged. When bus_addr[6:0] is in 0x00–0x01, the selected plane is determined by the plane address. In IMPR, the plane address comes from an internal register, pl_addr_int[3:0], controlled by the command decoder in the control logic circuit. Otherwise, the plane address comes from pl_addr_ext[3:0], which is determined directly by the user command.

    Figure  6.  Test waveform of plane gating scheme

    In this paper, a plane-based, area-saving control bus design with distributed registers for 3D NAND flash applications has been proposed. A four-plane bus design with 20K-bit registers has been implemented and verified in an FPGA tester. The plane-based control bus design realizes operation control through only 27 signals in each plane direction, and the IMPR operation is compatible with the existing read operations. The test results show that the novel plane-based control bus design with distributed registers is suitable for high-performance flash memory.

  • [1] D. Kang, M. Kim, C. J. Su, et al., “A 512Gb 3-bit/Cell 3D 6th-generation V-NAND flash memory with 82MB/s write throughput and 1.2Gb/s interface,” IEEE International Solid-State Circuits Conference (ISSCC), San Francisco, CA, USA, pp.216–218, 2019.
    [2] N. Shibata, K. Kanda, T. Shimizu, et al., “A 1.33-Tb 4-bit/cell 3-D flash memory on a 96-word-line-layer technology,” IEEE Journal of Solid-State Circuits, vol.55, no.1, pp.178–188, 2020. DOI: 10.1109/JSSC.2019.2941758
    [3] D. Kim, H. Kim, S. Yun, et al., “A 1Tb 4b/Cell NAND flash memory with tPROG=2ms, tR=110us and 1.2Gb/s high-speed IO rate,” IEEE International Solid-State Circuits Conference (ISSCC), San Francisco, CA, USA, pp.218–220, 2020.
    [4] Y. B. Wakchaure, A. S. Madraswala, D. J. Pelster, et al., “Independent NAND memory operations by plane,” Patent, 10877696 B2, USA, 2020-12-29.
    [5] A. Silvagni, G. Fusillo, R. Ravasio, et al., “An overview of logic architectures inside flash memory devices,” Proceedings of the IEEE, vol.91, no.4, pp.569–580, 2003. DOI: 10.1109/JPROC.2003.811707
    [6] L. G. Fasoli, “System and method of controlling a three-dimensional memory,” Patent, 7149119 B2, USA, 2006-12-12.
