Flash memory stores information in an array of memory cells made from
floating-gate transistors. In
single-level cell (SLC) devices, each cell stores only one bit of information.
Multi-level cell (MLC) devices, including
triple-level cell (TLC) devices, can store more than one bit per cell. The floating gate may be conductive (typically
polysilicon in most kinds of flash memory) or non-conductive (as in
SONOS flash memory).
Floating-gate MOSFET
In flash memory, each memory cell resembles a standard
metal–oxide–semiconductor field-effect transistor (MOSFET) except that the transistor has two gates instead of one. Each cell can be seen as an electrical switch in which current flows between two terminals (source and drain) and is controlled by a floating gate (FG) and a control gate (CG). The CG is similar to the gate in other MOS transistors, but below it is the FG, which is insulated all around by an oxide layer. The FG is interposed between the CG and the MOSFET channel. Because the FG is electrically isolated by its insulating layer, electrons placed on it are trapped. When the FG is charged with electrons, this charge
screens the
electric field from the CG, thus increasing the
threshold voltage (VT) of the cell. This means that the VT of the cell can be changed between the
uncharged FG threshold voltage (VT1) and the higher
charged FG threshold voltage (VT2) by changing the FG charge. To read a value from the cell, an intermediate voltage (VI) between VT1 and VT2 is applied to the CG. If the channel conducts at VI, the FG must be uncharged (if it were charged, there would be no conduction because VI is less than VT2). If the channel does not conduct at VI, the FG is charged. The binary value of the cell is sensed by determining whether current flows through the transistor when VI is asserted on the CG. In a multi-level cell device, which stores more than one
bit per cell, the amount of current flow is sensed (rather than simply its presence or absence) to determine more precisely the level of charge on the FG.
Floating-gate MOSFETs are so named because an electrically insulating tunnel oxide layer separates the floating gate from the silicon, so the gate "floats" above the silicon. The oxide keeps the electrons confined to the floating gate. Degradation or wear (and the limited endurance of floating-gate flash memory) occurs because of the extremely high
electric field (10 million volts per centimeter) experienced by the oxide. Such high field strengths can break atomic bonds over time in the relatively thin oxide, gradually degrading its insulating properties. Electrons can then become trapped in the oxide or leak from the floating gate through it. Because the quantity of electrons on the floating gate represents the stored charge level (each level being assigned to a different combination of bits in MLC flash), this leakage increases the likelihood of data loss. This is why data retention goes down and the risk of data loss increases with increasing degradation.
The silicon oxide in a cell degrades with every erase operation. Electrons trapped in the oxide increase the amount of negative charge in the cell over time and negate some of the control gate voltage. Over time, this also makes erasing the cell slower; to maintain the performance and reliability of the NAND chip, the cell must eventually be retired from use.
Endurance also decreases with the number of bits in a cell. With more bits per cell, the number of possible states (each represented by a different voltage level) increases, and the cell becomes more sensitive to the voltages used for programming. Programming voltages may be adjusted to compensate for degradation of the silicon oxide, but a cell with more states is less tolerant of such adjustments, because there is less space between the voltage levels that define each state.
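The read rule described above can be sketched in a few lines of Python. This is a minimal model, not a description of any real sense amplifier: the voltages (VT1 = 1.0 V, VT2 = 5.0 V, VI = 3.0 V), the function names, and the mapping of an uncharged floating gate to a binary 1 are all illustrative assumptions. The multi-level read is simplified to a comparison of the cell's threshold voltage against several reference voltages, and `state_spacing` shows why the gap between adjacent states shrinks as bits per cell increase.

```python
from typing import Sequence


def read_slc(cell_vt: float, v_read: float) -> int:
    """Return 1 if the channel conducts at v_read (FG uncharged), else 0.

    The 1/0 assignment is an assumption made for illustration.
    """
    conducts = v_read > cell_vt
    return 1 if conducts else 0


def read_mlc(cell_vt: float, references: Sequence[float]) -> int:
    """Return the index of the threshold-voltage window the cell falls in.

    Simplified model: count how many reference voltages lie below cell_vt.
    """
    return sum(1 for ref in references if cell_vt > ref)


def state_spacing(vt_min: float, vt_max: float, bits_per_cell: int) -> float:
    """Spacing between adjacent states if they are spread evenly over a
    fixed threshold-voltage window (an assumed, idealized layout)."""
    states = 2 ** bits_per_cell
    return (vt_max - vt_min) / (states - 1)


# Hypothetical voltages chosen only for this sketch.
VT1, VT2, VI = 1.0, 5.0, 3.0
uncharged_bit = read_slc(VT1, VI)  # channel conducts at VI
charged_bit = read_slc(VT2, VI)    # channel stays off at VI
```

Under these assumptions, going from one bit per cell to three divides the same voltage window among eight states instead of two, which is the loss of margin the text describes.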
Fowler–Nordheim tunneling
The process of moving electrons from the channel through the tunnel oxide and into the floating gate is called
Fowler–Nordheim tunneling, and it fundamentally changes the characteristics of the cell by increasing the MOSFET's threshold voltage. This, in turn, changes the drain-source current that flows through the transistor for a given gate voltage, which is ultimately used to encode a binary value. The Fowler–Nordheim tunneling effect is reversible, so electrons can be added to or removed from the floating gate, processes traditionally known as writing and erasing.
Internal charge pumps
Despite the need for relatively high programming and erasing voltages, virtually all flash chips today require only a single supply voltage and produce the required high voltages using on-chip
charge pumps. Over half the energy used by a 1.8 V NAND flash chip is lost in the charge pump itself. Since
boost converters are inherently more efficient than charge pumps, researchers developing
low-power SSDs have proposed returning to the dual Vcc/Vpp supply voltages used on all early flash chips, driving the high Vpp voltage for all flash chips in an SSD with a single shared external boost converter. In spacecraft and other high-radiation environments, the on-chip charge pump is the first part of the flash chip to fail, although flash memories will continue to work in read-only mode at much higher radiation levels.
NOR flash
In both NOR and NAND flash memories, the cells are arranged in a grid. We can think of the memory as consisting of "words" of a certain number of bits (or cells), with each word being confined to a particular column of the grid, and the bits being in different rows. All the bits of a particular word are linked by a
wordline, a conductor connecting to the control gates of all the bits of that word. All the first bits of a certain number of adjacent words (columns) are linked by a
bitline, as are all the second bits and so on. The bitlines connect to one of the terminals (source or drain) of the cells. By manipulating the voltages on the wordlines one can read a certain bit by measuring the voltage on the corresponding bitline. The way to do this depends on whether the memory chip is a NOR or a NAND flash. In NOR flash, each cell has one end connected directly to ground, and the other end connected directly to a bit line. This arrangement is called "NOR flash" because it acts like a
NOR gate: when any of the word lines (connected to the CG of the cells) is brought high, the corresponding storage transistor may pull the output bit line low, depending on the charge in the floating gate. Since several words are connected by the bit line, the output depends not on only two wordlines (the bitline staying high if neither the first NOR the second wordline is high) but on all of them (the bitline remaining high if NONE of the wordlines is high). So to read a bit of a certain word, all the wordlines except that of the desired word are put low.
NOR flash continues to be the technology of choice for embedded applications requiring a discrete non-volatile memory device. The low read latencies characteristic of NOR devices allow for both direct code execution and data storage in a single memory product.
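The NOR read behaviour described above can be modelled as a small truth function. This is a sketch under stated assumptions: the function names are hypothetical, each cell is reduced to a boolean "floating gate charged" flag, and a cell is assumed to conduct only when its wordline is high and its floating gate is uncharged.

```python
def nor_bitline(wordlines, cells_charged):
    """Return True if the shared bitline stays high, False if pulled low.

    wordlines[i]     -- True if wordline i is driven to the read voltage
    cells_charged[i] -- True if cell i's floating gate is charged
    """
    # A cell pulls the bitline low only if its wordline is high and its
    # floating gate is uncharged, so that it conducts at the read voltage.
    pulled_low = any(
        wl and not charged for wl, charged in zip(wordlines, cells_charged)
    )
    return not pulled_low


def nor_read_bit(cells_charged, index):
    """Read one cell: raise only its wordline, keep all the others low."""
    wordlines = [i == index for i in range(len(cells_charged))]
    bitline_high = nor_bitline(wordlines, cells_charged)
    # Assumed convention: a conducting (uncharged) cell reads as 1.
    return 0 if bitline_high else 1


# Three cells on one bitline; only the middle one has a charged FG.
cells = [False, True, False]
bits = [nor_read_bit(cells, i) for i in range(len(cells))]
```

With every wordline low, no cell can conduct and the bitline stays high, which is the NOR behaviour that gives the architecture its name.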
Programming
A single-level NOR flash cell in its default state is logically equivalent to a binary "1" value, because current will flow through the channel under application of an appropriate voltage to the control gate, so that the bitline voltage is pulled down. A NOR flash cell can be programmed, or set to a binary "0" value, by the following procedure:
• an elevated on-voltage (typically >5 V) is applied to the CG
• the channel is now turned on, so electrons can flow from the source to the drain (assuming an NMOS transistor)
• the source-drain current is sufficiently high to cause some high energy electrons to jump through the insulating layer onto the FG, via a process called hot-electron injection.
Erasing
To erase a NOR flash cell (resetting it to the "1" state), a large voltage
of the opposite polarity is applied between the CG and source terminal, pulling the electrons off the FG through
Fowler–Nordheim tunneling (FN tunneling). This is known as negative gate source erase. Newer NOR memories can erase using negative gate channel erase, which biases the wordline of a NOR memory cell block and the P-well of that block so that FN tunneling erases the cell block. Older memories used source erase, in which a high voltage was applied to the source to move electrons from the FG to the source. Modern NOR flash memory chips are divided into erase segments (often called blocks or sectors). The erase operation can be performed only on a block-wise basis; all the cells in an erase segment must be erased together. Programming of NOR cells, however, generally can be performed one byte or word at a time.
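The asymmetry between byte-wise programming and block-wise erasing can be illustrated with a toy model. The class and method names below are hypothetical, and the model ignores timing, voltages, and wear; it only captures the rule that programming can clear bits from 1 to 0 while only a block erase can set them back to 1.

```python
class NorFlashModel:
    """Toy NOR flash: byte-wise programming, block-wise erase."""

    def __init__(self, num_blocks: int, block_size: int) -> None:
        self.block_size = block_size
        # Erased cells read as 1, so every byte starts as 0xFF.
        self.data = bytearray([0xFF] * (num_blocks * block_size))

    def program_byte(self, address: int, value: int) -> None:
        # Programming can only change bits from 1 to 0, which is
        # equivalent to AND-ing the new value into the stored byte.
        self.data[address] &= value

    def erase_block(self, block: int) -> None:
        # Erasing resets every cell in the erase segment to 1.
        start = block * self.block_size
        self.data[start:start + self.block_size] = b"\xff" * self.block_size


flash = NorFlashModel(num_blocks=2, block_size=4)
flash.program_byte(1, 0xC3)   # byte 1 now holds 1100 0011
flash.program_byte(5, 0x0F)   # a byte in the second block
flash.program_byte(1, 0x3C)   # cannot restore cleared bits: result is 0x00
after_reprogram = flash.data[1]
flash.erase_block(0)          # only block 0 returns to all ones
```

Reprogramming byte 1 without an erase leaves it at 0x00 rather than 0x3C, which is why software must erase a whole segment before rewriting arbitrary data into it.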
NAND flash
NAND flash also uses a grid of
floating-gate transistors (see above), but they are connected in a way that resembles a
NAND gate: the transistors corresponding to a given bit of several words are connected in series, and the bitline is pulled low only if all the word lines are pulled high (above the transistors' VT). To read the bit of a particular word, its wordline is put low and all the other wordlines are put high, and then the bitline will reflect the state of the floating gate of the desired cell. These groups are then connected via some additional transistors to a NOR-style bit line array in the same way that single transistors are linked in NOR flash.
Compared to NOR flash, replacing single transistors with serial-linked groups adds an extra level of addressing. Whereas NOR flash might address memory by page then word, NAND flash might address it by page, word and bit. Bit-level addressing suits bit-serial applications (such as hard disk emulation), which access only one bit at a time. Execute-in-place applications, on the other hand, require every bit in a word to be accessed simultaneously. This requires word-level addressing. In any case, both bit and word addressing modes are possible with either NOR or NAND flash.
To read data, first the desired group is selected (in the same way that a single transistor is selected from a NOR array). Next, most of the word lines are pulled up above VT2, while one of them is pulled up to VI. The series group will conduct (and pull the bit line low) if the selected bit has not been programmed.
Despite the additional transistors, the reduction in ground wires and bit lines allows a denser layout and greater storage capacity per chip. (The ground wires and bit lines are actually much wider than the lines in the diagrams.) In addition, NAND flash is typically permitted to contain a certain number of faults (NOR flash, as is used for a
BIOS ROM, is expected to be fault-free). Manufacturers try to maximize the amount of usable storage by shrinking the size of the transistors or cells; however, the industry can avoid this and achieve higher storage densities per die by using 3D NAND, which stacks cells on top of each other. NAND flash cells are read by analysing their response to various voltages.
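The read procedure for a series-connected group described above can be sketched as follows. This is an idealized model: the voltages (VT1 = 1.0 V, VT2 = 5.0 V, a read voltage VI = 3.0 V and a pass voltage of 6.0 V above VT2) and the function names are assumptions made for illustration only.

```python
def cell_conducts(gate_voltage, charged, vt1=1.0, vt2=5.0):
    """A cell conducts when its gate voltage exceeds its threshold.

    The threshold is vt2 for a charged floating gate and vt1 otherwise
    (both values are hypothetical).
    """
    threshold = vt2 if charged else vt1
    return gate_voltage > threshold


def nand_string_conducts(cells_charged, selected, v_read=3.0, v_pass=6.0):
    """Return True if the whole series string conducts during a read.

    Unselected wordlines get v_pass (above VT2), so those cells conduct
    regardless of their charge; the selected wordline gets v_read (VI).
    """
    for i, charged in enumerate(cells_charged):
        gate = v_read if i == selected else v_pass
        if not cell_conducts(gate, charged):
            return False  # one non-conducting cell blocks the string
    return True


# A four-cell string in which every cell except index 1 is programmed.
string = [True, False, True, True]
unprogrammed_read = nand_string_conducts(string, selected=1)
programmed_read = nand_string_conducts(string, selected=0)
```

Because the pass voltage turns on every unselected cell, the state of the string depends only on the selected cell: it conducts (pulling the bit line low) when that cell has not been programmed.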
Writing and erasing
NAND flash uses
tunnel injection for writing and
tunnel release for erasing. NAND flash memory forms the core of the removable
USB storage devices known as
USB flash drives, as well as most
memory card formats and
solid-state drives available today.
The hierarchical structure of NAND flash starts at the cell level, which establishes strings, then pages, blocks, planes and ultimately a die. A string is a series of connected NAND cells in which the source of one cell is connected to the drain of the next one. Depending on the NAND technology, a string typically consists of 32 to 128 NAND cells. Strings are organised into pages, which are then organised into blocks in which each string is connected to a separate line called a bitline. All cells with the same position in the string are connected through the control gates by a wordline. A plane contains a certain number of blocks that are connected through the same bitline. A flash die consists of one or more planes and the peripheral circuitry needed to perform all the read, write, and erase operations.
The architecture of NAND flash means that data can be read and programmed (written) in pages, typically between 4 KiB and 16 KiB in size, but can only be erased at the level of entire blocks consisting of multiple pages. When a block is erased, all the cells are logically set to 1. Data can only be programmed in one pass to a page in a block that was erased. The programming process sets one or more cells from 1 to 0. Any cells that have been set to 0 by programming can only be reset to 1 by erasing the entire block. This means that before new data can be programmed into a page that already contains data, the current contents of the page plus the new data must all be copied to a new, erased page. If a suitable erased page is available, the data can be written to it immediately. If no erased page is available, a block must be erased before copying the data to a page in that block. The old page is then marked as invalid and is available for erasing and reuse. This differs from the operating system's
LBA view; for example, if the operating system writes 1100 0011 to the flash storage device (such as an
SSD), the data actually written to the flash memory may be 0011 1100.
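The page-program and block-erase rules above, together with the copy-to-a-new-page update, can be sketched with a toy model. All class, method, and attribute names here are hypothetical, the model covers a single block only, and it is not meant to describe how any real controller or flash translation layer is implemented.

```python
class NandBlock:
    """Toy NAND block: pages are programmed once, erased all together."""

    def __init__(self, pages_per_block: int, page_size: int) -> None:
        self.page_size = page_size
        self.pages = [bytes([0xFF] * page_size) for _ in range(pages_per_block)]
        self.programmed = [False] * pages_per_block

    def program_page(self, page: int, data: bytes) -> None:
        if self.programmed[page]:
            raise ValueError("page already programmed; erase the block first")
        if len(data) != self.page_size:
            raise ValueError("data must fill exactly one page")
        self.pages[page] = bytes(data)
        self.programmed[page] = True

    def erase(self) -> None:
        # Erasing sets every cell in every page of the block back to 1.
        self.pages = [bytes([0xFF] * self.page_size) for _ in self.pages]
        self.programmed = [False] * len(self.pages)


class TinyPageMapper:
    """Maps logical pages to physical pages and updates out of place."""

    def __init__(self, block: NandBlock) -> None:
        self.block = block
        self.mapping = {}     # logical page -> physical page
        self.invalid = set()  # physical pages holding stale data

    def write(self, logical: int, data: bytes) -> None:
        free = next(
            (p for p, used in enumerate(self.block.programmed) if not used),
            None,
        )
        if free is None:
            raise RuntimeError("no erased page available; an erase is needed")
        self.block.program_page(free, data)
        old = self.mapping.get(logical)
        if old is not None:
            self.invalid.add(old)  # old copy awaits erasing and reuse
        self.mapping[logical] = free

    def read(self, logical: int) -> bytes:
        return self.block.pages[self.mapping[logical]]


block = NandBlock(pages_per_block=4, page_size=2)
mapper = TinyPageMapper(block)
mapper.write(0, b"\xc3\x00")  # first version lands in physical page 0
mapper.write(0, b"\x3c\x00")  # update goes to physical page 1 instead
```

After the second write, logical page 0 points at physical page 1 and physical page 0 is merely marked invalid; it only becomes writable again once the whole block is erased.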
Vertical NAND
Vertical NAND (V-NAND) or 3D NAND memory stacks memory cells vertically and uses a
charge trap flash architecture. The vertical layers allow larger areal bit densities without requiring smaller individual cells. It is also sold as
BiCS Flash, a trademark of Kioxia Corporation (formerly Toshiba Memory Corporation). 3D NAND was first announced by
Toshiba in 2007. V-NAND was first commercially manufactured by
Samsung Electronics in 2013.
Structure
V-NAND uses a
charge trap flash geometry (which was commercially introduced in 2002 by
AMD and
Fujitsu). An individual memory cell is made up of one planar polysilicon layer containing a hole filled by multiple concentric vertical cylinders. The hole's polysilicon surface acts as the gate electrode. The outermost silicon dioxide cylinder acts as the gate dielectric, enclosing a silicon nitride cylinder that stores charge, in turn enclosing a silicon dioxide cylinder as the tunnel dielectric that surrounds a central rod of conducting polysilicon which acts as the conducting channel. 3D NAND arrays may also be built separately, but stacked together to create a product with a higher number of 3D NAND layers on a single die. Often, two or three arrays are stacked. The misalignment between plugs is on the order of 10 to 30 nm.
Construction
Growth of a group of V-NAND cells begins with an alternating stack of conducting (doped) polysilicon layers and insulating silicon dioxide layers. As the number of layers increases, the capacity and endurance of flash memory may be increased.
Cost
The wafer cost of 3D NAND is comparable with that of scaled-down (32 nm or less) planar NAND flash. However, with planar NAND scaling stopping at 16 nm, the cost-per-bit reduction can continue with 3D NAND, starting at 16 layers. However, because the sidewall of the hole etched through the layers is not perfectly vertical, even a slight deviation leads to a minimum bit cost, i.e., a minimum equivalent design rule (or maximum density), for a given number of layers; this minimum-bit-cost layer number decreases for smaller hole diameters.
Limitations