===Two modes===
The 65C816 has two operating modes: "emulation mode", in which the 16-bit operations are invisible (the index registers are forced to eight bits) and the chip appears to be very similar to the 6502, with the same cycle timings for the opcodes; and "native mode", which exposes all new features. The CPU automatically enters emulation mode when it is powered on or reset, which allows it to replace a 65(C)02, assuming one makes the required circuit changes to accommodate the different pin layout.
===16-bit registers===
The most obvious change to the 65C816 when running in native mode is the expansion of the various registers from 8-bit to 16-bit sizes. This enhancement affects the accumulator (A), the X and Y index registers, and the stack pointer (SP). It does not affect the program counter (PC), which has always been 16-bit.

When running in native mode, two bits in the status register change their meaning. In the original 6502, bits 4 and 5 were not used, although bit 4 is referred to as the break (b) flag. In native mode, bit 4 becomes the x flag and bit 5 becomes the m flag. These bits control whether the index registers (x) and accumulator/memory (m) are 8-bit or 16-bit in size. Zeros in these bits set 16-bit sizes, ones set 8-bit sizes. These bits are locked at ones when the processor is powered on or reset, but become changeable when the processor is switched to native mode.

In native mode operation, the accumulator and index registers may be set to 16- or 8-bit sizes at the programmer's discretion by using the REP and SEP instructions to manipulate the m and x status register bits. This feature gives the programmer the ability to perform operations on either word- or byte-size data. As the accumulator and index register sizes are independently settable, it is possible, for example, to have the accumulator set to eight bits and the index registers set to 16 bits, giving the programmer the ability to manipulate individual bytes over a 64 KB range without having to perform pointer arithmetic.

When register sizes are set to 16 bits, a memory access will fetch or store two contiguous bytes at the rate of one byte per clock cycle. Hence a read-modify-write instruction, such as ROR, when used while the accumulator is set to 16 bits, will affect two contiguous bytes of memory, not one, and will consume more clock cycles than when the accumulator is set to eight bits. Similarly, all arithmetic and logical operations will be 16-bit operations.
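The effect of REP and SEP on the register widths can be illustrated with a small behavioral model. This is only a sketch in Python, not 65C816 code: the names `rep`, `sep` and `widths` are hypothetical helpers, and the model assumes REP clears, and SEP sets, every status bit that is 1 in its operand.

```python
# Status register bit masks in native mode, per the text: bit 4 = x, bit 5 = m.
X_FLAG = 0x10  # index register width: 1 = 8-bit, 0 = 16-bit
M_FLAG = 0x20  # accumulator/memory width: 1 = 8-bit, 0 = 16-bit


def rep(status, mask):
    """Model of REP: clear every status bit that is set in mask."""
    return status & ~mask & 0xFF


def sep(status, mask):
    """Model of SEP: set every status bit that is set in mask."""
    return (status | mask) & 0xFF


def widths(status):
    """Return (accumulator_bits, index_bits) implied by the m and x flags."""
    acc_bits = 8 if status & M_FLAG else 16
    idx_bits = 8 if status & X_FLAG else 16
    return acc_bits, idx_bits


# Both flags start at 1, so both register groups are 8 bits wide.
status = M_FLAG | X_FLAG
status = rep(status, X_FLAG)           # like REP #$10: 16-bit index registers
print(widths(status))                  # (8, 16)
status = rep(status, M_FLAG)           # like REP #$20: 16-bit accumulator as well
print(widths(status))                  # (16, 16)
status = sep(status, M_FLAG | X_FLAG)  # like SEP #$30: everything back to 8 bits
print(widths(status))                  # (8, 8)
```

The first step reproduces the mixed configuration mentioned above: an 8-bit accumulator with 16-bit index registers.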
===24-bit addressing===
The other major change to the system while running in native mode is that the memory model is expanded to a 24-bit format from the original 16-bit format of the 6502. The 65C816 makes use of two 8-bit registers, the data bank register (DB) and the program bank register (PB), to set bits 16-23 of the address, effectively generating 24-bit addresses. In both cases, "bank" refers to a contiguous 64 KB segment of memory that is bounded by the address range $xx0000–$xxFFFF, where xx is the bank address, that is, bits 16-23 of the effective address. Both DB and PB are initialized to $00 at power-on or reset.

During an opcode or operand fetch cycle, PB is prepended to the program counter (PC) to form the 24-bit effective address. Should PC "wrap" (return to zero), PB will not be incremented. Hence a program is bounded by the limits of the bank in which it is executing. Implied by this memory model is that branch and subroutine targets must be in the same bank as the instruction making the branch or call, unless "long" jumps or subroutine calls are used to execute code in another bank. To ensure code coherency when transferring control across banks, PB can only be changed when PC is also set at the same time.

During a data fetch or store cycle, DB is prepended to a 16-bit data address to form the 24-bit effective address at which data will be accessed. This processor characteristic makes it possible to sanely execute 6502 or 65C02 code that uses 16-bit addresses to reference data elements. DB can be changed under program control, something that might be done to access data beyond the limits of 16-bit addressing. Also, DB will temporarily increment if an address is indexed beyond the limits of the bank currently in DB. DB is ignored if a 24-bit address is specified as the operand to a data fetch/store instruction, or if the effective address is on direct (zero) page or the hardware stack. In those cases, an implied bank $00 is used to generate the effective address.

A further addition to the register set is the 16-bit direct page register (DP), which sets the base address for what was formerly called the zero page but is now referred to as the direct page. Direct page addressing uses an 8-bit address, which results in faster access than when a 16- or 24-bit address is used. Also, some addressing modes that offer indirection are only possible on direct page. In the 65(C)02, the direct page is always the first 256 bytes of memory, hence "zero page". In native mode, the 65C816 can relocate direct (zero) page anywhere in bank $00 (the first 64 KB of memory) by writing the 16-bit starting address into DP. There is a one-cycle access penalty if DP is not set to an exact page boundary, that is, if the value in DP is not $xx00, where xx is the most-significant byte.
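The address formation rules described in this section can be summarized in a short Python model. It is a sketch under stated assumptions rather than a description of the hardware: all function names are hypothetical, the indexed data address is assumed to wrap at 24 bits, and the direct page address is assumed to wrap within bank $00.

```python
def fetch_address(pb, pc):
    """Opcode/operand fetch: PB supplies bits 16-23, PC supplies bits 0-15."""
    return ((pb & 0xFF) << 16) | (pc & 0xFFFF)


def next_pc(pc):
    """Advance PC by one; it wraps to zero inside the bank and PB is untouched."""
    return (pc + 1) & 0xFFFF


def data_address(db, addr16, index=0):
    """Data access through a 16-bit operand: DB is prepended to the address.

    Adding an index can carry into bits 16-23, which corresponds to the
    temporary increment of DB described in the text.
    Assumption: the sum is truncated to 24 bits.
    """
    base = ((db & 0xFF) << 16) | (addr16 & 0xFFFF)
    return (base + index) & 0xFFFFFF


def direct_page_address(dp, offset8):
    """Direct page access: DB is ignored and bank $00 is implied.

    Assumption: the sum of DP and the 8-bit offset wraps within bank $00.
    """
    return (dp + (offset8 & 0xFF)) & 0xFFFF


def dp_penalty_cycles(dp):
    """One extra cycle when DP is not on a page boundary ($xx00)."""
    return 1 if dp & 0xFF else 0


print(f"{fetch_address(0x12, 0x3456):06X}")        # 123456
print(f"{next_pc(0xFFFF):04X}")                    # 0000
print(f"{data_address(0x7E, 0xFFF0, 0x20):06X}")   # 7F0010
print(f"{direct_page_address(0x1234, 0x10):06X}")  # 001244
print(dp_penalty_cycles(0x1234), dp_penalty_cycles(0x1200))  # 1 0
```

The third example shows an indexed access starting in bank $7E and ending in bank $7F, while the second shows PC wrapping without any change to PB.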
===Switching between modes===
The current mode of operation is stored in the emulation (e) bit. With the new x and m bits added to the previous set of six flags in the status register (SR), there were no bits left to hold the new mode bit. Instead, a unique solution was used in which the mode bit was left "invisible", unable to be directly accessed. The XCE (eXchange Carry with Emulation) instruction exchanges the value of the emulation bit with the carry (c) bit, bit 0 in SR. For instance, if one wants to enter native mode after the processor has started up, one would use CLC to clear the carry bit, and then XCE to write it to the emulation bit. Returning to 65C02 emulation mode uses SEC followed by XCE.

Internally, the 65C816 is a fully 16-bit design. The m and x bits in SR determine how the user registers (accumulator and index) appear to the rest of the system. Upon reset, the 65C816 starts in 6502 emulation mode, in which m and x are locked to 1. Hence the registers are locked to eight-bit size. The most significant byte (MSB) of the accumulator (the B-accumulator) is not directly accessible but can be swapped with the least significant byte (LSB) of the accumulator (the A-accumulator) by using the XBA instruction. There is no corresponding operation for the index registers (X and Y), whose MSBs are locked at $00.

Upon being switched to native mode, the MSB of X and Y will be zero, and the B-accumulator will be unchanged. If the m bit in SR is cleared, the B-accumulator will be "ganged" to the A-accumulator to form a 16-bit register (called the C-accumulator). A load/store or arithmetic/logical operation involving the accumulator or memory will be a 16-bit operation, and two bus cycles are required to fetch or store a 16-bit value. If the x bit in SR is cleared, both index registers will be set to 16 bits. If used to index an address, e.g., LDA SOMEWHERE,X, the 16-bit value in the index register will be added to the base address to form the effective address.

If the m bit in SR is set, the accumulator will return to being an 8-bit register and subsequent operations on the accumulator, with a few exceptions, will be 8-bit operations. The B-accumulator will retain the value it had when the accumulator was set to 16 bits.
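The exchange performed by XCE, and the byte swap performed by XBA, can be modeled in a few lines of Python. This is an illustrative sketch only: `xce` and `xba` are hypothetical helper functions that mimic the described behavior on plain integers, not an emulator of the processor.

```python
def xce(carry, emulation):
    """Model of XCE: exchange the carry flag with the hidden emulation bit.

    Returns the pair (new_carry, new_emulation).
    """
    return emulation, carry


def xba(c_acc):
    """Model of XBA: swap the B (high) and A (low) bytes of the accumulator."""
    c_acc &= 0xFFFF
    return ((c_acc << 8) | (c_acc >> 8)) & 0xFFFF


# After reset the processor is in emulation mode (e = 1).
emulation = 1

# CLC followed by XCE: enter native mode.
carry = 0
carry, emulation = xce(carry, emulation)
print(carry, emulation)  # 1 0  (carry now holds the previous mode bit)

# SEC followed by XCE: return to emulation mode.
carry = 1
carry, emulation = xce(carry, emulation)
print(carry, emulation)  # 0 1

# XBA makes the otherwise hidden B-accumulator reachable as the low byte.
print(f"{xba(0x12AB):04X}")  # AB12
```

Because XCE is an exchange rather than a plain write, the carry flag afterwards reports the mode the processor was in before the switch.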
The exceptions are the instructions that transfer the direct page register (DP) and stack pointer (SP) to/from the accumulator. These operations are always 16 bits wide in native mode, regardless of the condition of the m bit in SR.

If the x bit in SR is set, not only will the index registers return to being 8 bits, but whatever was in the MSB while they were 16 bits wide will be lost, something an assembly language programmer cannot afford to forget.

==Applications==