MOS Technology 6502
from Wikipedia
MOS Technology 6502
6502 processor in a DIP-40 plastic package. The four-digit date code indicates it was made in the 45th week (November 4–10) of 1985.
General information
Launched: 1975; 50 years ago (1975)
Common manufacturer
Performance
Max. CPU clock rate: 1 MHz to 3 MHz
Data width: 8 bits
Address width: 16 bits
Architecture and classification
Instruction set: MOS 6502
Number of instructions: 56 (55 originally)
Physical specifications
Transistors
Package
History
Predecessors
Successors

The MOS Technology 6502 (typically pronounced "sixty-five-oh-two" or "six-five-oh-two")[3] is an 8-bit microprocessor that was designed by a small team led by Chuck Peddle for MOS Technology and was launched in September 1975. The design team had formerly worked at Motorola on the Motorola 6800 project; the 6502 is essentially a simplified, less expensive and faster version of that design using depletion-load NMOS technology that made using the microchip in computers much cheaper.

When it was introduced, the 6502 was the least expensive microprocessor on the market by a considerable margin. It initially sold for less than one-sixth the cost of competing designs from larger companies, such as the 6800 or Intel 8080. Its introduction caused rapid decreases in pricing across the entire processor market. Along with the Zilog Z80, it sparked a series of projects that resulted in the home computer revolution of the early 1980s.

Home video game consoles and home computers of the 1970s through the early 1990s, such as the Atari 2600, Atari 8-bit computers, Apple II, Nintendo Entertainment System, Commodore 64, Atari Lynx, BBC Micro and others, use the 6502 or variations of the basic design. Soon after the 6502's introduction, MOS Technology was purchased outright by Commodore International, who continued to sell the microprocessor and licenses to other manufacturers. In the early days of the 6502, it was second-sourced by Rockwell and Synertek, and later licensed to other companies.

In 1981, the Western Design Center started development of a CMOS version, the 65C02. This continues to be widely used in embedded systems, with estimated production volumes in the hundreds of millions.[4]

History and use


Conception


The origins of the 6502 chip date back to 1960, after the Soviet Union launched the first artificial Earth satellite – the Sputnik 1. During this time, Chuck Peddle worked at General Electric as an engineer-in-training, designing tests and systems for missiles and spaceships. As he advanced into his engineering career, he found room-sized computers to be a flawed model of centralized intelligence, and instead, considered distributing it locally. However, General Electric sold its computer division to Honeywell in 1970, liquidating the entire section he worked in.[5]

Undeterred, Peddle took this severance and started his own company in 1972 to make intelligent terminals for word-processing.[6][7] Shortly after, Peddle found himself in a technological struggle; even though electronics were evolving at the time, it was still ridiculously complex to run the system he conceived. His idea required a microprocessor that would be capable of running programs. However, many companies were competing on the same technology for the same reason, including Motorola.[7]

Origins at Motorola

Motorola 6800 demonstration board built by Chuck Peddle and John Buchanan in 1974

Motorola started the 6800 microprocessor project in 1971 with Tom Bennett as the main architect. Motorola's engineers could run analog and digital simulations on an IBM 370-165 mainframe computer.[8] The chip layout began in late 1972, the first 6800 chips were fabricated in February 1974 and the full family was officially released in November 1974.[9][10]

John Buchanan was the designer of the 6800 chip[11][12] and Rod Orgill, who later did the 6501, assisted Buchanan with circuit analyses and chip layout.[13] Bill Mensch joined Motorola in June 1971 after graduating from the University of Arizona (at age 26).[14] His first assignment was helping define the peripheral ICs for the 6800 family and later he was the principal designer of the 6820 Peripheral Interface Adapter (PIA).[15] Bennett hired Chuck Peddle in 1973 to do architectural support work on the 6800 family products already in progress.[16] He contributed in many areas, including the design of the 6850 ACIA (serial interface).[17]

Motorola's target customers were established electronics companies such as Hewlett-Packard, Tektronix, TRW, and Chrysler.[18] In May 1972, Motorola's engineers began visiting select customers and sharing the details of their proposed 8-bit microprocessor system with ROM, RAM, parallel and serial interfaces.[19] In early 1974, they provided engineering samples of the chips so that customers could prototype their designs. Motorola's "total product family" strategy did not focus on the price of the microprocessor, but on reducing the customer's total design cost. They offered development software on a timeshare computer, the "EXORciser" debugging system, onsite training and field application engineer support.[20][21] Both Intel and Motorola had initially announced a US$360 price for a single microprocessor.[22][23] The actual price for production quantities was much less. Motorola offered a design kit containing the 6800 with six support chips for US$300.[24]

Peddle, who would accompany the salespeople on customer visits, found that customers were put off by the high cost of the microprocessor chips.[25] At the same time, these visits invariably resulted in the engineers he presented to producing lists of required instructions that were much smaller than "all these fancy instructions" that had been included in the 6800.[26] Peddle and other team members started outlining the design of an improved-feature, reduced-size microprocessor. At that time, Motorola's new semiconductor fabrication facility in Austin, Texas, was having difficulty producing MOS chips, and mid-1974 was the beginning of a year-long recession in the semiconductor industry. Also, many of the Mesa, Arizona employees were displeased with the upcoming relocation to Austin.[27]

Motorola's Semiconductor Products Division management showed no interest in Peddle's low-cost microprocessor proposal. Eventually, Peddle was given an official letter telling him to stop working on the system.[28] Peddle responded to the order by informing Motorola that the letter represented an official declaration of "project abandonment", and as such, the intellectual property he had developed to that point was now his.[29] The 6502 was designed by many of the same engineers that had designed the Motorola 6800 microprocessor family.[30]

In a November 1975 interview, Motorola's Chairman, Bob Galvin, ultimately agreed that Peddle's concept was a good one and that the division missed an opportunity, "We did not choose the right leaders in the Semiconductor Products division." The division was reorganized and the management replaced. The new group vice president John Welty said, "The semiconductor sales organization lost its sensitivity to customer needs and couldn't make speedy decisions."[31]

MOS Technology

A 1973 MOS Technology advertisement highlighting their custom integrated circuit capabilities
MOS Technology MCS6501, in white ceramic package, made in late August 1975

Peddle began looking outside Motorola for a source of funding for this new project. He initially approached Mostek CEO L. J. Sevin, but was declined. Sevin later admitted this was because he was afraid Motorola would sue them.[32]

While Peddle was visiting Ford Motor Company on one of his sales trips, Bob Johnson, later head of Ford's engine automation division, mentioned that their former colleague John Paivinen had moved to General Instrument and taught himself semiconductor design.[33] Paivinen then formed MOS Technology in Valley Forge, Pennsylvania in 1969 with two other executives from General Instrument, Mort Jaffe and Don McLaughlin. Allen-Bradley, a supplier of electronic components and industrial controls, acquired a majority interest in 1970.[34] The company designed and fabricated custom ICs for customers and had developed a line of calculator chips.[35]

After the Mostek efforts fell through, Peddle approached Paivinen, who "immediately got it".[36] On 19 August 1974, Chuck Peddle, Bill Mensch, Rod Orgill, Harry Bawcom, Ray Hirt, Terry Holdt, and Wil Mathys left Motorola to join MOS. Mike Janes joined later. Of the seventeen chip designers and layout people on the 6800 team, eight left. The goal of the team was to design and produce a low-cost microprocessor for embedded applications and to target as wide as possible a customer base. This would be possible only if the microprocessor was low cost, and the team set the price goal for volume purchases at $5.[37] Mensch later stated the goal was not the processor price itself, but to create a set of chips that could sell at $20 to compete with the recently introduced Intel 4040 that sold for $29 in a similar complete chipset.[38]

Chips are produced by printing multiple copies of the chip design on the surface of a wafer, a thin disk of highly pure silicon. Smaller chips can be printed in greater numbers on the same wafer, decreasing their relative price. Additionally, wafers always include some number of tiny physical defects that are scattered across the surface. Any chip printed in that location will fail and has to be discarded. Smaller chips mean any single copy is less likely to be printed on a defect. For both of these reasons, the cost of the final product is strongly dependent on the size of the chip design.[39]

The original 6800 chips were intended to be 180 by 180 mils (4.6 mm × 4.6 mm), but layout was completed at 212 by 212 mils (5.4 mm × 5.4 mm), or an area of 29.0 mm2.[40] For the new design, the cost goal demanded a size goal of 153 by 168 mils (3.9 mm × 4.3 mm), or an area of 16.6 mm2, roughly half that of the 6800.[41] Several new techniques would be needed to hit this goal.
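The area figures above can be checked with a few lines of arithmetic. The following is a minimal sketch; the helper name `die_area_mm2` is illustrative only, and the sole input beyond the quoted dimensions is the standard conversion of 1 mil = 0.0254 mm.

```python
MM_PER_MIL = 0.0254  # one mil is a thousandth of an inch


def die_area_mm2(width_mils, height_mils):
    """Return the die area in square millimetres for dimensions given in mils."""
    return (width_mils * MM_PER_MIL) * (height_mils * MM_PER_MIL)


m6800_area = die_area_mm2(212, 212)   # 6800 as actually laid out
target_area = die_area_mm2(153, 168)  # size goal for the new design

print(round(m6800_area, 1))                # 29.0
print(round(target_area, 1))               # 16.6
print(round(target_area / m6800_area, 2))  # 0.57, i.e. roughly half
```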

Moving to NMOS


Two significant advances arrived in the market just as the 6502 was being designed that provided significant cost reductions. The first was the move to depletion-load NMOS. The 6800 used an early NMOS process, enhancement mode, that required three supply voltages. One of the 6800's headlining features was an onboard voltage doubler that allowed a single +5 V supply to be used for +5, −5 and +12 V internally, as opposed to other chips of the era like the Intel 8080 that required three separate supply pins.[42] While this feature reduced the complexity of the power supply and pin layout, it still required separate power lines to the various gates on the chip, driving up complexity and size. By moving to the new depletion-load design, a single +5 V supply was all that was needed, eliminating all of this complexity.[43]

A further advantage was that depletion-load designs used less power while switching, thus running cooler and allowing higher operating speeds. Another practical offshoot is that the clock signal for earlier CPUs had to be strong enough to survive all the dissipation as it traveled through the circuits, which almost always required a separate external chip that could supply a powerful signal. With the reduced power requirements of depletion-load design, the clock could be moved onto the chip, simplifying the overall computer design. These changes greatly reduced complexity and the cost of implementing a complete system.[43]

A wider change taking place in the industry was the introduction of projection masking. Previously, chips were patterned onto the surface of the wafer by placing a mask on the surface of the wafer and then shining a bright light on it. The masks often picked up tiny bits of dirt or photoresist as they were lifted off the chip, causing flaws in those locations on any subsequent masking. With complex designs like CPUs, 5 or 6 such masking steps would be used, and the chance that at least one of these steps would introduce a flaw was very high. In most cases, 90% of such designs were flawed, resulting in a 10% yield. The price of the working examples had to cover the production cost of the 90% that were thrown away.[44]

In 1973, Perkin-Elmer introduced the Micralign system, which projected an image of the mask on the wafer instead of requiring direct contact. Masks no longer picked up dirt from the wafers and lasted on the order of 100,000 uses rather than 10. This eliminated step-to-step failures and the high flaw rates formerly seen on complex designs. Yields on CPUs immediately jumped from 10% to 60 or 70%. This meant the price of the CPU declined roughly the same amount and the microprocessor suddenly became a commodity device.[44]
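The effect of yield on price can be illustrated with a simple cost model. This is only a sketch: the function `cost_per_good_die` and the wafer cost and die count passed to it are hypothetical values chosen for illustration, not figures from the period. The model assumes the whole wafer cost is spread over the working dies.

```python
def cost_per_good_die(wafer_cost, dies_per_wafer, yield_fraction):
    """Spread the full wafer cost over only the dies that work."""
    if not 0 < yield_fraction <= 1:
        raise ValueError("yield_fraction must be in (0, 1]")
    return wafer_cost / (dies_per_wafer * yield_fraction)


# Hypothetical inputs: a 100-unit wafer carrying 100 dies.
before = cost_per_good_die(100.0, 100, 0.10)  # contact masking, 10% yield
after = cost_per_good_die(100.0, 100, 0.60)   # projection masking, 60% yield

print(before / after)  # about 6: the cost per working die falls sixfold
```

Under this model the cost per working die is inversely proportional to yield, so the ratio does not depend on the assumed wafer cost or die count.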

MOS Technology's existing fabrication lines were based on the older PMOS technology; they had not yet begun to work with NMOS when the team arrived. Paivinen promised to have an NMOS line up and running in time to begin the production of the new CPU. He delivered on the promise: the new line was ready by June 1975.[45]

Design notes


Chuck Peddle, Rod Orgill, and Wil Mathys designed the initial architecture of the new processors. A September 1975 article in EDN magazine gives this summary of the design:[46]

The MOS Technology 650X family represents a conscious attempt of eight former Motorola employees who worked on the development of the 6800 system to put out a part that would replace and outperform the 6800, yet undersell it. With the benefit of hindsight gained on the 6800 project, the MOS Technology team headed by Chuck Peddle, made the following architectural changes in the Motorola CPU…

The main change in terms of chip size was the elimination of the tri-state drivers from the address bus outputs. A three-state bus has states for 1, 0 and high impedance. The last state is used to allow other devices to access the bus, and is typically used for multiprocessing, or more commonly in these roles, for direct memory access (DMA). While useful, this feature is expensive in terms of on-chip circuitry. The 6502 simply removed this feature, in keeping with its design as an inexpensive controller being used for specific tasks and communicating with simple devices. Peddle suggested that anyone who required this style of access could implement it with a 74158.[47][a]

The next major difference was to simplify the registers. To start with, one of the two accumulators was removed. General-purpose registers like accumulators have to be accessed by many parts of the instruction decoder, and thus require significant amounts of wiring to move data to and from their storage. Having two accumulators makes many coding tasks easier but adds significant complexity to the chip design itself.[46] Further savings were made by reducing the stack register from 16 to 8 bits, meaning that the stack could only be 256 bytes long, which was enough for its intended role as a microcontroller.[46]

The 16-bit IX index register was split in two, becoming X and Y. More importantly, the style of access changed. In the 6800, IX held a 16-bit address which was offset by an 8-bit number stored with the instruction and added to the address. In the 6502 (and most other contemporary designs), the 16-bit base address was stored in the instruction, and the 8-bit X or Y was added to it.[47]
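The difference between the two indexing styles can be sketched as two effective-address calculations. The function names below are illustrative, and the sketch assumes the 6800's instruction offset is treated as an unsigned 8-bit value and that results wrap within the 16-bit address space.

```python
def ea_6800_indexed(ix, offset):
    """6800 style: 16-bit address in IX plus an 8-bit offset from the instruction."""
    return ((ix & 0xFFFF) + (offset & 0xFF)) & 0xFFFF


def ea_6502_absolute_indexed(base, index):
    """6502 style: 16-bit base address from the instruction plus 8-bit X or Y."""
    return ((base & 0xFFFF) + (index & 0xFF)) & 0xFFFF


# Both styles reach the same byte, but the 16-bit part lives in different places:
# in a register on the 6800, in the instruction stream on the 6502.
print(hex(ea_6800_indexed(ix=0x2000, offset=0x10)))          # 0x2010
print(hex(ea_6502_absolute_indexed(base=0x2000, index=0x10)))  # 0x2010
```

One consequence visible in the sketch is that a 6502 loop only has to increment an 8-bit register to walk a table of up to 256 bytes, whereas moving the base requires changing the address embedded in the instruction.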

Finally, the instruction set was simplified, simplifying the decoder and control logic. Of the original 72 instructions in the 6800, 56 were implemented. Among those removed were instructions that operated between the 6800's two accumulators, and several branch instructions inspired by the PDP-11.[47]

The chip's high-level design had to be turned into drawings of transistors and interconnects. At MOS Technology, the layout was a very manual process done with colored pencils and vellum paper. The layout consisted of thousands of polygon shapes on six different drawings; one for each layer of the fabrication process. Given the size limits, the entire chip design had to be constantly considered. Mensch and Paivinen worked on the instruction decoder[49] while Mensch, Peddle and Orgill worked on the ALU and registers. A further advance, developed at a party, was a way to share some of the internal wiring to allow the ALU to be reduced in size.[50]

Despite their best efforts, the final design ended up being larger than the original target. The first 6502 chips were 168 by 183 mils (4.3 mm × 4.6 mm), for an area of 19.8 mm2. The original version of the processor had no rotate right (ROR) capability, so the instruction was omitted from the original documentation. The next iteration of the design shrank the chip and added the rotate right capability, and ROR was included in revised documentation.[51][b]

Introducing the 6501 and 6502

Introductory advertisement for the MOS Technology MCS6501 and MCS6502 microprocessors

MOS would introduce two microprocessors based on the same underlying design: the 6501 would plug into the same socket as the Motorola 6800, while the 6502 re-arranged the pinout to support an on-chip clock oscillator. Both would work with other support chips designed for the 6800. They would not run 6800 software because they had a different instruction set, different registers, and mostly different addressing modes.[3] Rod Orgill was responsible for the 6501 design; he had assisted John Buchanan at Motorola on the 6800. Bill Mensch did the 6502; he was the designer of the 6820 PIA at Motorola. Harry Bawcom, Mike Janes and Sydney-Anne Holt helped with the layout.

MOS Technology's microprocessor introduction was different from the traditional months-long product launch. The first run of a new integrated circuit is normally used for internal testing and shared with select customers as engineering samples. These chips often have minor design defects that will be corrected before production begins. Chuck Peddle's goal was to sell the first run 6501 and 6502 chips to the attendees at the WESCON trade show in San Francisco beginning on September 16, 1975. Peddle was a very effective spokesman and the MOS Technology microprocessors were extensively covered in the trade press. One of the earliest was a full-page story on the MCS6501 and MCS6502 microprocessors in the July 24, 1975 issue of Electronics magazine.[55] Stories also ran in EE Times (August 24, 1975),[56] EDN (September 20, 1975), Electronic News (November 3, 1975), Byte (November 1975)[57] and Microcomputer Digest (November 1975).[58] Advertisements for the 6501 appeared in several publications the first week of August 1975. The 6501 would be for sale at WESCON for $20 each.[59] In September 1975, the advertisements included both the 6501 and the 6502 microprocessors. The 6502 would cost only $25 (equivalent to $146 in 2024).[60]

When MOS Technology arrived at Wescon, they found that exhibitors were not permitted to sell anything on the show floor. They rented the MacArthur Suite at the St. Francis Hotel and directed customers there to purchase the processors. At the suite, the processors were stored in large jars to imply that the chips were in production and readily available. The customers did not know the bottom half of each jar contained non-functional chips.[61] The chips were $20 and $25 while the documentation package was an additional $10. Users were encouraged to make photocopies of the documents, an inexpensive way for MOS Technology to distribute product information. The preliminary data sheets listed just 55 instructions and excluded the Rotate Right (ROR) instruction, which was not supported on these early chips. The reviews in Byte and EDN noted the lack of the ROR instruction. The next revision of the layout fixed this problem and the May 1976 datasheet listed 56 instructions. Peddle wanted every interested engineer and hobbyist to have access to the chips and documentation, whereas other semiconductor companies only wanted to deal with "serious" customers. For example, Signetics was introducing the 2650 microprocessor and its advertisements asked readers to write for information on their company letterhead.[62]

Motorola lawsuit

The May 1976 datasheet omitted the 6501 microprocessor that was in the August 1975 version.

The 6501/6502 introduction in print and at Wescon was a success. The press coverage got Motorola's attention, precipitating pricing adjustments and lawsuits. In October 1975, Motorola reduced the price of a single 6800 microprocessor from $175 to $69. The $300 system design kit was reduced to $150 and it now came with a printed circuit board.[63] On November 3, 1975, Motorola sought an injunction in Federal Court to stop MOS Technology from making and selling microprocessor products. They also filed a lawsuit claiming patent infringement and misappropriation of trade secrets. Motorola claimed that seven former employees joined MOS Technology to create that company's microprocessor products.[64]

Motorola was a billion-dollar company with a plausible case and expensive lawyers. On October 30, 1974, Motorola had filed numerous patent applications on the microprocessor family and was granted twenty-five patents. The first was in June 1976 and the second was to Bill Mensch on July 6, 1976, for the 6820 PIA chip layout. These patents covered the 6800 bus and how the peripheral chips interfaced with the microprocessor.[65] Motorola began making transistors in 1950 and had a portfolio of semiconductor patents. Allen-Bradley decided not to fight this case and sold their interest in MOS Technology back to the founders. Four of the former Motorola engineers were named in the suit: Chuck Peddle, Wil Mathys, Bill Mensch and Rod Orgill. All were named inventors in the 6800 patent applications. During the discovery process, Motorola found that one engineer, Mike Janes, had ignored Peddle's instructions and brought his 6800 design documents to MOS Technology.[66] In March 1976, the now independent MOS Technology was running out of money and had to settle the case. They agreed to drop the 6501 processor, pay Motorola $200,000 and return the documents that Motorola contended were confidential. Both companies agreed to cross-license microprocessor patents.[67] That May, Motorola dropped the price of a single 6800 microprocessor to $35. By November, Commodore had acquired MOS Technology.[68][69]

Computers and games


With legal troubles behind them, MOS was still left with the problem of getting developers to try their processor, prompting Chuck Peddle to design the MDT-650 (microcomputer development terminal) single-board computer. Another group inside the company designed the KIM-1, which was sold semi-complete and could be turned into a usable system with the addition of a 3rd party computer terminal and compact cassette drive. While it sold well to its intended market, the company found the KIM-1 also sold well to hobbyists and tinkerers. The related Rockwell AIM-65 control, training, and development system also did well. The software in the AIM 65 was based on that in the MDT. Another roughly similar product was the Synertek SYM-1.

One of the first external uses for the design was the Apple I microcomputer, introduced in 1976. The 6502 was next used in the Commodore PET and Apple II,[70] both released in 1977. It was later used in the Atari 8-bit computers, Acorn Atom, BBC Micro,[70] VIC-20 and other designs both for home computers and business, such as Ohio Scientific and Oric computers. The 6510, a direct successor of the 6502 with a digital I/O port and a tri-state address bus, was the CPU utilized in the best-selling[71][72] Commodore 64 home computer.

Another important use of the 6500 family was in video games. The first to make use of the processor design was the 1977 Atari VCS, later renamed the Atari 2600. The VCS used a 6502 variant named the 6507, which had fewer pins, so it could address only 8 KB of memory. Millions of the Atari consoles would be sold, each with a MOS processor. Another significant use was by the Nintendo Entertainment System (NES) and Famicom. The 6502 used in the NES was a second source version by Ricoh, a partial system on a chip, that lacked the binary-coded decimal mode but added 22 memory-mapped registers and on-die hardware for sound generation, joypad reading, and sprite list DMA. Called 2A03 in NTSC consoles and 2A07 in PAL consoles,[c] this processor was produced exclusively for Nintendo.

6502 or variants were used in all of Commodore's floppy disk drives for all of their 8-bit computers, from the PET line through the Commodore 128D, including the Commodore 64. 8-inch PET drives had two 6502 processors. Atari used the same 6507 used in the Atari 2600 for its 810 and 1050 disk drives used for all of their 8-bit computer line, from the 400/800 through the XEGS.

In the 1980s, the popular electronics magazine Elektor used the processor in its microprocessor development board, the Junior Computer.

The CMOS successor to the 6502, the WDC 65C02, also saw use in home computers and video game consoles. Apple used it in the Apple II line starting with the Apple IIc and later variants of the Apple IIe and also offered a kit to upgrade older IIe systems with the new processor.[73] The Hudson Soft HuC6280 chip used in the TurboGrafx-16 was based on a 65C02 core. The Atari Lynx used a custom chip named "Mikey"[74] designed by Epyx which included a VLSI VL65NC02 licensed cell. The G65SC12 variant by GTE Microcircuits (later renamed California Micro Devices) was used in the BBC Master. Some models of the BBC Master also included an additional G65SC102 co-processor.

Programmers model

6502 processor die. The regular section at the top is the instruction decoding ROM, the seemingly random section in the center is the control logic, and at the bottom are the registers (right) and the ALU (left). The data bus connections are along the lower right, and the address bus along the bottom and lower left.[41]
6502 pin configuration (40-pin DIP)
MOS 6502 registers (bit positions 15 to 0)
Main registers
  A: Accumulator (8 bits)
Index registers
  X: X index (8 bits)
  Y: Y index (8 bits)
Stack pointer
  S: Stack pointer (8 bits; the upper address byte is fixed at 0 0 0 0 0 0 0 1)
Program counter
  PC: Program counter (16 bits)
Status register
  N V - B D I Z C: Processor flags

The 6502 is a little-endian 8-bit processor with a 16-bit address bus. The original versions were fabricated using an 8 µm[75] process technology chip with a die size of 3.9 mm × 4.3 mm (153 by 168 mils), for a total area of 16.6 mm2.[41]

The internal logic runs at the same speed as the external clock rate. It features a simple pipeline; on each cycle, the processor fetches one byte from memory and processes another. This means that any single instruction can take as few as two cycles to complete, depending on the number of operands that instruction uses. For comparison, the Zilog Z80 required two cycles to fetch memory, and the minimum instruction time was four cycles. Thus, despite the lower clock speeds compared to competing designs, typically in the neighborhood of 1 to 2 MHz, the 6502's performance was competitive with CPUs using significantly faster clocks. This is partly due to a simple state machine implemented by combinational (clockless) logic to a greater extent than in many other designs; the two-phase clock (supplying two synchronizations per cycle) could thereby control the machine cycle directly.

This design also led to one useful design note of the 6502, and the 6800 before it. Because the chip only accessed memory during a certain part of the clock cycle, and this duration was indicated by the φ2-low clock-out pin, other chips in a system could access memory during those times when the 6502 was off the bus. This was sometimes known as "hidden access". This technique was widely used by computer systems; they would use memory capable of access at 2 MHz, and then run the CPU at 1 MHz. This guaranteed that the CPU and video hardware could interleave their accesses, with a total performance matching that of the memory device. Because this access was every other cycle, there was no need to signal the CPU to avoid using the bus, making this sort of access easy to implement without any bus logic.[76] When faster memories became available in the 1980s, newer machines could use this same technique while running at higher clock rates; the BBC Micro, for example, used newer RAM that allowed its CPU to run at 2 MHz while still using the same bus sharing techniques.

Like most simple CPUs of the era, the dynamic NMOS 6502 chip is not sequenced by microcode but decoded directly using a dedicated PLA. The decoder occupied about 15% of the chip area. This compares to later microcode-based designs like the Motorola 68000, where the microcode ROM and decoder engine represented about a third of the gates in the system.

Registers


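Like its precursor, the 6800, the 6502 has very few registers. They include[77] an 8-bit accumulator (A), two 8-bit index registers (X and Y), an 8-bit stack pointer (S), a 16-bit program counter (PC), and a processor status register holding the N, V, B, D, I, Z and C flags.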

This compares to a contemporaneous competitor, the Intel 8080, which likewise has one 8-bit accumulator and a 16-bit program counter, but has six more general-purpose 8-bit registers (which can be combined into three 16-bit pointers) and a larger 16-bit stack pointer.[80]

In order to make up somewhat for the lack of registers, the 6502 includes a zero page addressing mode that uses one address byte in the instruction instead of the two needed to address the full 64 KB of memory. This provides fast access to the first 256 bytes of RAM by using shorter instructions. For instance, an instruction to add a value from memory to the value in the accumulator would normally be three bytes, one for the instruction and two for the 16-bit address. Using the zero page reduces this to an 8-bit address, reducing the total instruction length to two bytes, and thus improving instruction performance.
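The saving in instruction length can be seen by encoding the same add instruction both ways. The sketch below uses the ADC opcodes for absolute ($6D) and zero-page ($65) addressing; the helper `encode_adc` is a hypothetical name, and choosing the short form automatically for addresses below $0100 is a simplification of what an assembler might do.

```python
ADC_ABSOLUTE = 0x6D   # ADC with a 16-bit address operand
ADC_ZERO_PAGE = 0x65  # ADC with an 8-bit zero-page address operand


def encode_adc(address):
    """Encode 'ADC address', using the zero-page form when the address fits in 8 bits."""
    if not 0 <= address <= 0xFFFF:
        raise ValueError("address must fit in 16 bits")
    if address <= 0xFF:
        return bytes([ADC_ZERO_PAGE, address])
    # The 6502 is little-endian: low address byte first, then high byte.
    return bytes([ADC_ABSOLUTE, address & 0xFF, address >> 8])


print(encode_adc(0x1234).hex())  # 6d3412 (three bytes)
print(encode_adc(0x0034).hex())  # 6534   (two bytes)
```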

The stack address space is hardwired to memory page $01, i.e. the address range $0100–$01FF (256–511). Software access to the stack is done via four implied addressing mode instructions, whose functions are to push or pop (pull) the accumulator or the processor status register. The same stack is also used for subroutine calls via the JSR (jump to subroutine) and RTS (return from subroutine) instructions and for interrupt handling.
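A minimal model of this stack behaviour is sketched below. The class name `Stack6502` is illustrative, and starting the stack pointer at $FF is an assumption (software conventionally initialises it; the hardware does not guarantee that value). A push stores the byte at $0100 + S and then decrements S; a pull increments S and then reads, with S wrapping within 8 bits.

```python
STACK_PAGE = 0x0100  # the stack is hardwired to page $01


class Stack6502:
    """Illustrative model of the 8-bit stack pointer addressing page $01."""

    def __init__(self, memory, s=0xFF):
        self.memory = memory  # 64 KB address space as a bytearray
        self.s = s            # assumed initial value; normally set by software

    def push(self, value):
        self.memory[STACK_PAGE | self.s] = value & 0xFF
        self.s = (self.s - 1) & 0xFF  # wraps from $00 to $FF

    def pull(self):
        self.s = (self.s + 1) & 0xFF  # wraps from $FF to $00
        return self.memory[STACK_PAGE | self.s]


memory = bytearray(0x10000)
stack = Stack6502(memory)
stack.push(0x42)
print(hex(memory[0x01FF]), hex(stack.s))  # 0x42 0xfe
print(hex(stack.pull()))                  # 0x42
```

Because S is only 8 bits wide, pushing more than 256 bytes silently overwrites the oldest entries rather than leaving the page.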

Addressing


The chip uses the index and stack registers effectively with several addressing modes, including a fast "direct page" or "zero page" mode, similar to that found on the PDP-8, that accesses memory locations from addresses 0 to 255 with a single 8-bit address (saving the cycle normally required to fetch the high-order byte of the address)—code for the 6502 uses the zero page much as code for other processors would use registers. On some 6502-based microcomputers with an operating system, the operating system uses most of zero page, leaving only a handful of locations for the user.

Addressing modes also include implied (1-byte instructions); absolute (3 bytes); indexed absolute (3 bytes); indexed zero-page (2 bytes); relative (2 bytes); accumulator (1); indirect,x and indirect,y (2); and immediate (2). Absolute mode is a general-purpose mode. Branch instructions use a signed 8-bit offset relative to the instruction after the branch; the numerical range −128..127 therefore translates to 128 bytes backward and 127 bytes forward from the instruction following the branch (which is 126 bytes backward and 129 bytes forward from the start of the branch instruction). Accumulator mode operates on the accumulator register and does not need any operand data. Immediate mode uses an 8-bit literal operand.
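The branch range described above follows from a short calculation. In this sketch, `branch_target` is a hypothetical helper that takes the address of the branch opcode and the raw offset byte, interprets the byte as a signed two's-complement value, and adds it to the address of the following instruction (two bytes after the opcode).

```python
def branch_target(branch_address, offset_byte):
    """Return the destination of a relative branch located at branch_address."""
    offset_byte &= 0xFF
    # Interpret the operand as a signed 8-bit value in the range -128..127.
    offset = offset_byte - 0x100 if offset_byte & 0x80 else offset_byte
    # The offset is relative to the instruction after the 2-byte branch.
    return (branch_address + 2 + offset) & 0xFFFF


start = 0x1000
print(branch_target(start, 0x7F) - start)  # 129 (farthest forward)
print(branch_target(start, 0x80) - start)  # -126 (farthest backward)
print(branch_target(start, 0xFE) == start)  # True: offset -2 branches to itself
```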

Indirect addressing

The indirect modes are useful for array processing and other looping. With the 5/6 cycle "(indirect),y" mode, the 8-bit Y register is added to a 16-bit base address read from zero page, which is located by a single byte following the opcode. The Y register is therefore an index register in the sense that it is used to hold an actual index (as opposed to the X register in the 6800, where a base address was directly stored and to which an immediate offset could be added). Incrementing the index register to walk the array byte-wise takes only two additional cycles. With the less frequently used "(indirect,x)" mode the effective address for the operation is found at the zero page address formed by adding the second byte of the instruction to the contents of the X register. Using the indexed modes, the zero page effectively acts as a set of up to 128 additional (though very slow) address registers.
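As a rough illustration of how the two indirect modes form their effective addresses, the following Python sketch models memory as a byte array. The helper names (`read_word_zp`, `ea_indirect_y`, `ea_indirect_x`) are assumptions made for this example, and cycle timing is not modelled.

```python
def read_word_zp(mem, zp_addr):
    """Read a little-endian 16-bit pointer from zero page (wraps at $FF)."""
    lo = mem[zp_addr & 0xFF]
    hi = mem[(zp_addr + 1) & 0xFF]
    return lo | (hi << 8)


def ea_indirect_y(mem, operand, y):
    """(zp),Y: fetch the base pointer from zero page, then add Y."""
    return (read_word_zp(mem, operand) + y) & 0xFFFF


def ea_indirect_x(mem, operand, x):
    """(zp,X): add X to the operand within zero page, then fetch the pointer."""
    return read_word_zp(mem, (operand + x) & 0xFF)


mem = bytearray(65536)
mem[0x80], mem[0x81] = 0x00, 0x04  # pointer at $80 -> $0400
mem[0x84], mem[0x85] = 0x00, 0x05  # pointer at $84 -> $0500

print(hex(ea_indirect_y(mem, 0x80, 3)))  # 0x403
print(hex(ea_indirect_x(mem, 0x80, 4)))  # 0x500
```

In the first call Y acts as an index into the array at $0400; in the second, X selects which zero-page pointer to use.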

The 6502 is capable of performing addition and subtraction in binary or binary-coded decimal. Placing the CPU into BCD mode with the SED (set D flag) instruction results in decimal arithmetic, in which $99 + $01 would result in $00 and the carry (C) flag being set. In binary mode (CLD, clear D flag), the same operation would result in $9A and the carry flag being cleared. Other than Atari BASIC, BCD mode was seldom used in home-computer applications.
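The difference between the two modes can be sketched in Python. This is a simplified model that assumes valid BCD operands in decimal mode and returns only the result and the carry; the function name `adc` mirrors the mnemonic but the code is not taken from any emulator.

```python
def adc(a, operand, carry_in, decimal):
    """Add with carry; returns (result, carry_out).

    In decimal mode both inputs are assumed to be valid packed BCD.
    """
    if not decimal:
        total = a + operand + carry_in
        return total & 0xFF, int(total > 0xFF)
    lo = (a & 0x0F) + (operand & 0x0F) + carry_in
    hi = (a >> 4) + (operand >> 4)
    if lo > 9:          # low digit overflowed: carry into the high digit
        lo -= 10
        hi += 1
    carry = 0
    if hi > 9:          # high digit overflowed: set the carry flag
        hi -= 10
        carry = 1
    return (hi << 4) | lo, carry


print("%02X C=%d" % adc(0x99, 0x01, 0, decimal=True))   # 00 C=1
print("%02X C=%d" % adc(0x99, 0x01, 0, decimal=False))  # 9A C=0
```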

See the Hello world! article for a simple but characteristic example of 6502 assembly language.

Instructions and opcodes

6502 instruction operation codes (opcodes) are 8 bits long and have the general form AAABBBCC, where AAA and CC define the opcode, and BBB defines the addressing mode.[81] For example, the ORA instruction performs a bitwise OR on the bits in the accumulator with another value. The instruction opcode is of the form 000bbb01, where bbb may be 010 for an immediate mode value (constant), 001 for zero-page fixed address, 011 for an absolute address, and so on.[81] This pattern is not universal, as there are exceptions, but it allows opcode values to be easily converted to assembly mnemonics for the majority of instructions, handling the edge cases with special-purpose code.[81]
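A minimal Python sketch of the AAABBBCC split follows, using only the three ORA addressing modes named above. The names `split_opcode`, `ORA_MODES` and `describe_ora` are assumptions for illustration; a real disassembler needs the full mode table and the special cases.

```python
def split_opcode(opcode):
    """Split an opcode byte into its AAA, BBB and CC bit fields."""
    aaa = (opcode >> 5) & 0b111
    bbb = (opcode >> 2) & 0b111
    cc = opcode & 0b11
    return aaa, bbb, cc


# Only the ORA modes mentioned in the text; other BBB values exist.
ORA_MODES = {0b010: "immediate", 0b001: "zero page", 0b011: "absolute"}


def describe_ora(opcode):
    """Return the addressing mode if the opcode is an ORA, else None."""
    aaa, bbb, cc = split_opcode(opcode)
    if (aaa, cc) != (0b000, 0b01):
        return None
    return ORA_MODES.get(bbb, "other mode")


print(describe_ora(0x09))  # immediate
print(describe_ora(0x05))  # zero page
print(describe_ora(0x0D))  # absolute
```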

Of the 256 possible opcodes available using an 8-bit pattern, the original 6502 uses 151 of them, organized into 56 instructions with (possibly) multiple addressing modes. Depending on the instruction and addressing mode, the opcode may require zero, one or two additional bytes for operands. Hence 6502 machine instructions vary in length from one to three bytes.[82][83] The operand is stored in the 6502's customary little-endian format.
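To make the instruction lengths and the little-endian operand order concrete, here is a small Python sketch. The helper `encode` is hypothetical; the opcode values used ($0D for ORA absolute, $09 for ORA immediate, $C8 for INY) are the standard 6502 encodings.

```python
def encode(opcode, operand=None, operand_size=0):
    """Build the bytes of one instruction: opcode plus 0, 1 or 2 operand bytes."""
    if operand_size == 0:
        return bytes([opcode])
    # Multi-byte operands are stored low byte first (little-endian).
    return bytes([opcode]) + operand.to_bytes(operand_size, "little")


print(encode(0x0D, 0x1234, 2).hex())  # 0d3412  ORA $1234 (3 bytes)
print(encode(0x09, 0x20, 1).hex())    # 0920    ORA #$20  (2 bytes)
print(encode(0xC8).hex())             # c8      INY       (1 byte)
```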

Each CPU machine instruction takes up a certain number of clock cycles, usually equal to the number of memory accesses. For example, the absolute indexing mode of the ORA instruction takes 4 clock cycles; 3 cycles to read the instruction and 1 cycle to read the value of the absolute address. If no memory is accessed, the number of clock cycles is two. The minimum clock cycles for any instruction is two. When using indexed addressing, if the result crosses a page boundary an extra clock cycle is added. Also, when a zero page address is used in indexing mode (e.g. zp,X) an extra clock cycle is added.
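The timing rules above can be sketched for a few ORA addressing modes. The base cycle counts are the commonly documented NMOS 6502 values; the table covers only some modes, and the names `ORA_BASE_CYCLES` and `ora_cycles` are assumptions for this example.

```python
# Base cycle counts for some ORA addressing modes on the NMOS 6502.
ORA_BASE_CYCLES = {
    "immediate": 2,
    "zero page": 3,
    "zero page,X": 4,   # one more than plain zero page
    "absolute": 4,
    "absolute,X": 4,
    "absolute,Y": 4,
}


def ora_cycles(mode, base_addr=0, index=0):
    """Return the cycle count, adding one if indexing crosses a page."""
    cycles = ORA_BASE_CYCLES[mode]
    if mode in ("absolute,X", "absolute,Y"):
        # A carry out of the low address byte means a page boundary was crossed.
        if (base_addr & 0xFF) + index > 0xFF:
            cycles += 1
    return cycles


print(ora_cycles("absolute,X", 0x1200, 0x20))  # 4
print(ora_cycles("absolute,X", 0x12F0, 0x20))  # 5
```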

The 65C816, the 16-bit CMOS descendant of the 6502, also supports 24-bit addressing, which results in instructions being assembled with three-byte operands, also arranged in little-endian format.

The remaining 105 opcodes are undefined. In the original design, instructions where the low-order 4 bits (nibble) were 3, 7, B or F were not used, providing room for future expansion. Likewise, the $x2 column had only a single entry, LDX #constant. These five columns account for 79 of the undefined opcodes; the other 26 empty slots were scattered among the remaining columns. Some of the empty slots were used in the 65C02 to provide both new instructions and variations on existing ones with new addressing modes. The $xF instructions were initially left free to allow 3rd-party vendors to add their own instructions, but later versions of the 65C02 standardized a set of bit manipulation instructions developed by Rockwell Semiconductor.

Assembly language

A 6502 assembly language statement consists of a three-character instruction mnemonic, followed by any operands. Instructions that do not take a separate operand but target a single register based on the addressing mode combine the target register in the instruction mnemonic, so the assembler uses INX as opposed to INC X to increment the X register.

Instruction table

Example code

The following 6502 assembly language source code is for a subroutine named TOLOWER, which copies a null-terminated character string from one location to another, converting upper-case letter characters to lower-case letters. The string being copied is the "source", and the string into which the converted source is stored is the "destination".

             ; TOLOWER:
             ;
             ;   Convert a null-terminated character string to all lower case.
             ;   Maximum string length is 255 characters, plus the null
             ;   terminator.
             ;
             ; Parameters:
             ;
             ;   SRC – Source string address
             ;   DST – Destination string address
             ;
0080                 ORG $0080
             ;
0080  00 04  SRC     .WORD $0400     ;source string pointer
0082  00 05  DST     .WORD $0500     ;destination string pointer
             ;
0600                 ORG $0600       ;execution start address
             ;
0600  A0 00  TOLOWER LDY #$00        ;starting index
             ;
0602  B1 80  LOOP    LDA (SRC),Y     ;get from source string
0604  F0 11          BEQ DONE        ;end of string
             ;
0606  C9 41          CMP #'A'        ;if lower than UC alphabet...
0608  90 06          BCC SKIP        ;copy unchanged
             ;
060A  C9 5B          CMP #'Z'+1      ;if greater than UC alphabet...
060C  B0 02          BCS SKIP        ;copy unchanged
             ;
060E  09 20          ORA #%00100000  ;convert to lower case
             ;
0610  91 82  SKIP    STA (DST),Y     ;store to destination string
0612  C8             INY             ;bump index
0613  D0 ED          BNE LOOP        ;next character
             ;
             ; NOTE: If Y wraps the destination string will be left in an undefined
             ;  state. We set carry to indicate this to the calling function.
             ;
0615  38             SEC             ;report string too long error &...
0616  60             RTS             ;return to caller
             ;
0617  91 82  DONE    STA (DST),Y     ;terminate destination string
0619  18             CLC             ;report conversion completed &...
061A  60             RTS             ;return to caller
             ;
061B                 .END
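
For readers less familiar with assembly, the control flow of TOLOWER can be mirrored in Python. This is an illustrative model only: `mem` stands in for the 64 KB address space, the return value stands in for the carry flag, and the function name `tolower` is an assumption.

```python
def tolower(mem, src, dst):
    """Mirror of TOLOWER: returns 0 when done, 1 if the string is too long."""
    for y in range(256):                  # Y counts 0..255, then wraps
        ch = mem[src + y]                 # LDA (SRC),Y
        if ch == 0:                       # BEQ DONE
            mem[dst + y] = 0              # terminate destination string
            return 0                      # CLC, RTS
        if ord("A") <= ch <= ord("Z"):    # CMP #'A' / CMP #'Z'+1
            ch |= 0b00100000              # ORA #%00100000
        mem[dst + y] = ch                 # STA (DST),Y
    return 1                              # Y wrapped: SEC, RTS


mem = bytearray(65536)
msg = b"Hello, 6502!\x00"
mem[0x0400:0x0400 + len(msg)] = msg
carry = tolower(mem, 0x0400, 0x0500)
print(carry, bytes(mem[0x0500:0x0500 + len(msg) - 1]))  # 0 b'hello, 6502!'
```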

IC hardware design

6502 processor die with drawn-in NMOS transistors and labels hinting at the functionality of the 6502's components

The processor's non-maskable interrupt (NMI) input is edge sensitive, which means that the interrupt is triggered by the falling edge of the signal rather than its level. Consequently a wired-OR interrupt circuit cannot be readily supported. However, this also prevents nested NMI interrupts from occurring until the interrupt source hardware makes the NMI input inactive again, often under control of the NMI interrupt handler.

The simultaneous assertion of the NMI and IRQ (maskable) hardware interrupt lines causes IRQ to be ignored. However, if the IRQ line remains asserted after the servicing of the NMI, the processor will almost immediately respond to IRQ, as IRQ is level sensitive. Thus a sort of built-in interrupt priority was established in the 6502 design. (The first opcode of the NMI handler is executed before the IRQ is detected again.)

The 6502 periodically samples the output of its NMI edge detector and its IRQ input. An IRQ signal that is driven low is recognized only if IRQs are not masked by the I flag. If an NMI request or (maskable) IRQ is detected in this way, the B flag is set to zero and the processor executes the BRK instruction next, instead of the next instruction indicated by the program counter.[84][85]

The BRK instruction then pushes the processor status onto the stack, with the B flag bit set to zero. At the end of its execution the BRK instruction resets the B flag's value to one. This is the only way the B flag can be modified. If an instruction other than the BRK instruction pushes the B flag onto the stack as part of the processor status[86] the B flag always has the value one.

A high-to-low transition on the SO input pin will set the processor's overflow status bit. This can be used for fast response to external hardware. For example, a high-speed polling device driver can poll the hardware once in only three cycles using a Branch-on-oVerflow-Clear (BVC) instruction that branches to itself until overflow is set by an SO falling transition. The Commodore 1541 and other Commodore floppy disk drives use this technique to detect when the serializer is ready to transfer another byte of disk data. The system hardware and software design must ensure that an SO will not occur during arithmetic processing and disrupt calculations.

Variations and derivatives

The 6502 was the most prolific variant of the 65xx series family from MOS Technology.

The 6501 and 6502 have 40-pin DIP packages; the 6503, 6504, 6505, and 6507 are 28-pin DIP versions, for reduced chip and circuit board cost. In all of the 28-pin versions, the pin count is reduced by leaving off some of the high-order address pins and various combinations of function pins, making those functions unavailable.

Pinout differences
Pin   6800                   6501                   6502
2     Halt                   Ready                  Ready
3     ∅1 (in)                ∅1 (in)                ∅1 (out)
5     Valid memory address   Valid memory address   N.C.
7     Bus available          Bus available          SYNC
36    Data bus enable        Data bus enable        N.C.
37    ∅2 (in)                ∅2 (in)                ∅0 (in)
38    N.C.                   N.C.                   Set overflow flag
39    Three-state control    N.C.                   ∅2 (out)

Typically, the 12 pins omitted to reduce the pin count from 40 to 28 are the three not connected (NC) pins, one of the two Vss pins, one of the clock pins, the SYNC pin, the set overflow (SO) pin, either the maskable interrupt or the non-maskable interrupt (NMI), and the four most-significant address lines (A12–A15). The omission of four address pins reduces the external addressability to 4 KB (from the 64 KB of the 6502), though the internal PC register and all effective address calculations remain 16-bit.

The 6507 omits both interrupt pins in order to include address line A12, providing 8 KB of external addressability but no interrupt capability. The 6507 was used in the popular Atari 2600 video game console, the design of which divides the 8 KB memory space in half, allocating the lower half to the console's internal RAM and peripherals, and the upper half to the Game Cartridge, so Atari 2600 cartridges have a 4 KB address limit (and the same capacity limit unless the cartridge contains bank switching circuitry).
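
The addressability figures follow directly from the number of address pins, as this short Python check shows. The function name `addressable_bytes` is an assumption for illustration.

```python
def addressable_bytes(address_pins):
    """Each address pin doubles the externally addressable memory."""
    return 1 << address_pins


for chip, pins in (("6503", 12), ("6507", 13), ("6502", 16)):
    print(chip, addressable_bytes(pins) // 1024, "KB")
# 6503 4 KB
# 6507 8 KB
# 6502 64 KB

# The Atari 2600 gives the upper half of the 6507's 8 KB space to the cartridge.
print(addressable_bytes(13) // 2 // 1024, "KB")  # 4 KB
```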

One popular 6502-based computer, the Commodore 64, used a modified 6502 CPU, the 6510. Unlike the 6503–6505 and 6507, the 6510 is a 40-pin chip that adds internal hardware: a 6-bit parallel I/O port mapped to addresses 0000 and 0001. The 6508 is another chip that, like the 6510, adds internal hardware: 256 bytes of SRAM and an 8-bit I/O port similar to those featured by the 6510. Though these chips do not have reduced pin counts compared to the 6502, they need new pins for the added parallel I/O port. In this case, no address lines are among the removed pins.

Variations
Company Model Description
6502 A 1 MHz chip used in KIM-1 and other single board computers in the mid-1970s.
6502A A 1.5 MHz chip used in Asteroids Deluxe and at 2 MHz in the BBC Micro
6502B Version of the 6502 capable of running at a maximum speed of 3 MHz instead of 2 MHz. The B was used in both the Apple III and early Atari 8-bit computers, each running at ~1.8 MHz.[d]
6502C The “official” 6502C was a version of the original 6502 able to run at up to 4 MHz.

Not to be confused with SALLY, a custom 6502 designed for Atari (and sometimes referred to by them as "6502C"[87]) nor with the similarly named 65C02.

SALLY, C014806, "6502C" Custom 6502 variant designed for Atari, used in later Atari 8-bit computers and Atari 5200 and Atari 7800 consoles.

Has a HALT signal on pin 35 and the R/W signal on pin 36 (these pins are not connected (N/C) on a standard 6502). Pulling HALT low latches the clock, pausing the processor. This was used to allow the video circuitry direct memory access (DMA).[88]

Although sometimes referred to as "6502C" in Atari documentation, this is not the same as the official 6502C and the chip itself is never marked as such.[87]

MOS 6503 Reduced memory addressing capability (4 KB) and no RDY input, in a 28-pin DIP package (with the phase 1 (OUT), SYNC, redundant Vss, and SO pins of the 6502 also omitted).[89]
MOS 6504 Reduced memory addressing capability (8 KB), no NMI, and no RDY input, in a 28-pin DIP package (with the phase 1 (OUT), SYNC, redundant Vss, and SO pins of the 6502 also omitted).[89]
MOS 6505 Reduced memory addressing capability (4 KB) and no NMI, in a 28-pin DIP package (with the phase 1 (OUT), SYNC, redundant Vss, and SO pins of the 6502 also omitted).[89]
MOS 6506 Reduced memory addressing capability (4 KB), no NMI, and no RDY input, but all 3 clock pins of the 6502 (i.e. a 2-phase output clock), in a 28-pin DIP package (with the SYNC, redundant Vss, and SO pins of the 6502 also omitted).[89]
MOS 6507 Reduced memory addressing capability (8 KB) and no interrupts, in a 28-pin DIP package (with the phase 1 (OUT), SYNC, redundant Vss, and SO pins of the 6502 also omitted).[89] This chip was used in the Atari 2600 video game system and in many Atari 8-bit computer peripherals.
MOS 6508 Has a built-in 8-bit input/output port and 256 bytes of internal static RAM.
MOS 6509 Can address up to 1 MB of RAM as 16 banks of 64 KB and was used in the Commodore CBM-II series.
MOS 6510 Has a built-in 6-bit programmable input/output port and was used in the Commodore 64. The 8500 is effectively an HMOS version of the 6510, and replaced it in later versions of the C64.
MOS 6512, 6513, 6514, 6515
The MOS Technology 6512, 6513, 6514, and 6515 each rely on an external clock, instead of using an internal clock generator like the 650x (e.g. 6502). This was used to advantage in some designs where the clocks could be run asymmetrically, increasing overall CPU performance.

The 6512 is a 6502 with a 2-phase clock input for an external clock oscillator, instead of an on-board clock oscillator.[89] The 6513, 6514 and 6515 are similarly equivalent to (respectively) a 6503, 6504 and 6505 with the same 2-phase clock input.[89]

The 6512 was used in the BBC Micro B+64.

Ricoh RP2A03, RP2A07 Unlicensed 6502 variants running at ~1.8 MHz[d], including an audio processing unit but lacking the BCD mode, used in the Nintendo Entertainment System.
MOS 6591, 6592 System-on-a-chip designs that integrate a complete Atari 2600 in a 48-pin DIP package.[90][91]
WDC 65C02 CMOS version of the NMOS 6502 that was designed by Bill Mensch of the Western Design Center (WDC), featuring reduced power consumption, support for much higher clock speeds, new instructions, new addressing modes for some existing instructions, and correction of NMOS errata, such as the JMP ($xxFF) bug.
CSG, MOS 65CE02 CMOS variant developed by the Commodore Semiconductor Group (CSG), formerly MOS Technology. The 65CE02 provides a further enhanced instruction set from the 65C02, featuring a third indexing register (Z), base page register, 16-bit stack and faster program execution with the minimal instruction timing reduced from 2 to 1 clock cycles.
Rockwell R6511Q; R6500/11, R6500/12, R6500/15 "One-Chip Microcomputers" Enhanced versions of the 6502-based processor, also including individual bit manipulation operations (RMB, SMB, BBR and BBS), on-chip 192-byte zero-page RAM, UART, etc.[92][93]
Rockwell R65F11, R65F12 The Rockwell R65F11 (introduced in 1983) and the later R65F12 are enhanced versions of the 6502-based processor, also including individual bit manipulation operations (RMB, SMB, BBR and BBS), on-chip zero-page RAM, on-chip Forth kernel ROM, a UART, etc.[94][95][96][97][98]
GTE G65SC12 Drop-in CMOS variant of the 6502 without individual bit manipulation operations (RMB, SMB, BBR and BBS).[99] This was used in the BBC Master.
GTE G65SC102 Software compatible with the 6502, but has a slightly different pinout and oscillator circuit. The BBC Master Turbo included the 4 MHz version of this CPU on a coprocessor card, which could also be bought separately and added to the Master 128.
Rockwell R65C00, R65C21, R65C29 The R65C00, R65C21, and R65C29 have two enhanced CMOS 6502s in a single chip, and the R65C00 and R65C21 additionally contain 2 KB of mask-programmable ROM.[100][101]
CM630 A 1 MHz Eastern Bloc clone of the 6502, used in the Pravetz 8A and 8C, Bulgarian clones of the Apple II.[102]
MOS 7501, 8501 6510 (an enhanced 6502) variants, introduced in 1984.[103] They extended the number of I/O port pins from 6 to 7, but omitted pins for non-maskable interrupt and clock output.[104] Used in Commodore's C-16, C-116 and Plus/4 computers. The main difference between 7501 and 8501 CPUs is that the 7501 was manufactured with the HMOS-1 process and the 8501 with HMOS-2.[103]
MOS 8500 Introduced in 1985 as an HMOS version of the 6510 (which is in turn based on the 6502). Other than the process modification, the 8500 is virtually identical to the NMOS version of the 6510. It replaced the 6510 in later versions of the Commodore 64.
MOS 8502 Designed by MOS Technology and used in the Commodore 128. Based on the MOS 6510 used in the Commodore 64, the 8502 was able to run at double the clock rate of the 6510.[105] The 8502 family also includes the MOS 7501, 8500 and 8501.
Hudson Soft HuC6280 Japanese video game company Hudson Soft's improved version of the WDC 65C02. Manufactured for them by Seiko Epson and NEC for the SuperGrafx. The most notable product using the HuC6280 is NEC's TurboGrafx-16 video game console.
VLSI VL65NC02[106] A licensed 65C02 variant from VLSI, included in the Atari Lynx's main system IC, named Mikey.

16-bit derivatives

The Western Design Center designed and, as of 2025, still produces the W65C816S processor, a 16-bit, static-core successor to the 65C02. The W65C816S is a newer variant of the 65C816, which is the core of the Apple IIGS computer and is the basis of the Ricoh 5A22 processor that powers the Super Nintendo Entertainment System. The W65C816S incorporates minor improvements over the 65C816 that make the newer chip not an exact hardware-compatible replacement for the earlier one. Among these improvements was conversion to a static core, which makes it possible to stop the clock in either phase without the registers losing data. Available through electronics distributors, as of March 2020, the W65C816S is officially rated for 14 MHz operation.

The Western Design Center also designed and produced the 65C802, which was a 65C816 core with a 64-kilobyte address space in a 65(C)02 pin-compatible package. The 65C802 could be retrofitted to a 6502 board and would function as a 65C02 on power-up, operating in "emulation mode." As with the 65C816, a two-instruction sequence would switch the 65C802 to "native mode" operation, exposing its 16-bit accumulator and index registers, and other 65C816 features. The 65C802 was not widely used and production ended.

Bugs and quirks

The 6502 had several bugs and quirks, which had to be accounted for when programming it:

  • The earliest revisions of the 6502, such as those shipped with some KIM-1 computers, did not have a ROR (rotate right memory or accumulator) instruction. In these chips, the operation of the opcode that was later assigned to ROR is effectively an ASL (arithmetic shift left) instruction that does not affect the carry bit in the status register. Initially, MOS intentionally excluded ROR from the instruction set, deeming it not of enough value to justify its costs. In reaction to many customer inquiries, MOS promised in the second edition of the MCS6500 Programming Manual (document no. 6500-50A) that ROR would appear in 6502 chips starting in 1976.[51][e] The vast majority of 6502 chips in existence today do feature the ROR instruction; these include all those CPUs originally installed in popular fully-assembled microcomputers such as the Apple II and Commodore 64 lines, all of which were manufactured after 1976.
  • The NMOS 6502 family has a variety of undocumented instructions, which vary from one chip manufacturer to another. The 6502 instruction decoding is implemented in a hardwired logic array (similar to a programmable logic array) that is only defined for 151 of the 256 available opcodes. The remaining 105 trigger strange and occasionally hard-to-predict actions, such as crashing the processor, performing two valid instructions consecutively, performing strange mixtures of two instructions, or simply doing nothing at all. Some hardware designers used the undefined opcodes to extend the 6502 instruction set by detecting when a certain undefined opcode is fetched and performing an extension operation externally to the processor while substituting a neutral (NOP-like) opcode to the 6502 in order to make it idle while the external hardware handles the extension operation. Also, some programmers utilized this feature to extend the 6502 instruction set by providing functionality for the unimplemented opcodes with specially written software intercepted at the BRK instruction's 0xFFFE vector.[107][108] All of the undefined opcodes have been replaced with NOP instructions in the 65C02, an enhanced CMOS version of the 6502, although with varying byte sizes and execution times. (Some of them actually perform memory read operations but then ignore the data.) In the 65C802/65C816, all 256 opcodes perform defined operations.
  • The 6502's memory indirect jump instruction, JMP (<address>), has a nonintuitive limitation which many users consider a defect. If <address> is hex xxFF (i.e., any word ending in FF), the processor will not jump to the address stored in xxFF and xxFF+1 as expected, but rather the one defined by xxFF and xx00 (for example, JMP ($10FF) would jump to the address stored in 10FF and 1000, instead of the one stored in 10FF and 1100). This can be avoided simply by not placing any indirect jump target address across a page boundary, and the MOS Technology MCS6500 Programming Manual gives reason to believe that this was the intention of the designers of the 6502, in order to save space on the chip that would have been used to implement the more complex behavior of conditionally adding 1 clock cycle to propagate the carry when necessary. This ostensible defect continued through the entire NMOS line but was corrected in the CMOS derivatives.
  • The NMOS 6502 indexed addressing across page boundaries will do an extra read of an invalid address in the page of the base address (to which the index is added). This characteristic may cause random issues by accessing hardware that acts on a read, such as clearing timer or IRQ flags, sending an I/O handshake, etc. The extra read can be predicted and managed to avoid such problems, but only with special care in both hardware and software design. This flaw continued through the entire NMOS line but was corrected in the CMOS derivatives, in which the processor does an extra read of the last instruction byte.
  • The 6502 read–modify–write instructions perform one read and two write cycles. First, the unmodified data that was read is written back, and then the modified data is written. This characteristic may cause issues by twice accessing hardware that acts on a write. This anomaly continued through the full NMOS line but was fixed in the CMOS derivatives, in which the processor does two reads and one write cycle. Defensive programming practice will generally avoid this problem by not executing read/modify/write instructions on hardware registers.
  • The N (result negative), V (sign bit overflow) and Z (result zero) status flags are generally meaningless when performing arithmetic operations while the processor is in BCD mode, as these flags are undefined in decimal mode and have been empirically shown to reflect the binary, not BCD, result. This limitation was removed in the CMOS derivatives, at the cost of one added clock cycle for an ADC or SBC instruction in decimal mode (except on the 65C816). Therefore, this feature may be used to distinguish a CMOS processor from an NMOS version (by relying on the undocumented behavior of the NMOS version).[109]
  • If the 6502 happens to be in BCD mode when a hardware interrupt occurs, it will not revert to binary mode. This characteristic could result in obscure bugs in the interrupt service routine (ISR) if it fails to clear BCD mode before performing any arithmetic operations. The 6502 programming manual thus requires each ISR to reset or set the D flag if it uses the ADC or SBC instruction, but occasionally a human programmer may mistakenly omit to do this, causing a bug. For example, the Commodore 64's KERNAL did not correctly handle this processor characteristic, requiring that IRQs be disabled or re-vectored during BCD math operations. This issue was addressed in the CMOS derivatives also, by making reset and all interrupts automatically reset the D flag. (The change has the one disadvantage that it makes a [rare] program that operates continuously in decimal mode slightly longer and slower, as now every ISR must set the D flag before executing ADC or SBC.)
  • The 6502 instruction set includes BRK (opcode $00), which is technically a software interrupt (similar in spirit to the SWI mnemonic of the Motorola 6800 and ARM processors). BRK is most often used to interrupt program execution and start a machine language monitor for testing and debugging during software development. BRK could also be used to route program execution using a simple jump table (analogous to the manner in which the Intel 8086 and derivatives handle software interrupts by number). However, if a maskable hardware interrupt occurs when the processor is fetching a BRK instruction, the NMOS version of the processor will fail to execute BRK and instead proceed as if only the hardware interrupt had occurred. This fault—an unequivocal hardware bug—was corrected in the CMOS implementation of the processor, which first calls the ISR for the hardware interrupt and then executes the BRK instruction.
  • When executing JSR (jump to subroutine) and RTS (return from subroutine) instructions, the return address pushed to the stack by JSR is that of the last byte of the JSR operand (that is, the most significant byte of the subroutine address), rather than the address of the following instruction. This is because the actual copy (from program counter to stack and then conversely) takes place before the automatic increment of the program counter that occurs at the end of every instruction.[110] This characteristic would go unnoticed unless the code examined the return address in order to retrieve parameters in the code stream (a 6502 programming idiom documented in the ProDOS 8 Technical Reference Manual). It remains a characteristic of 6502 derivatives to this day. The original MCS6500 Programming Manual points it out and explains the reason: it saves one clock cycle in the JSR by not incrementing the PC before pushing it, while in the RTS instruction, the deferred increment of the pulled PC is overlapped with other steps and adds no clock cycle. As designed, a JSR and RTS take 12 clock cycles total; if the JSR pushed the incremented PC, the call and return would take 13 clock cycles.
  • The SBC instruction (Subtract Memory from Accumulator with Borrow) uses the inverted carry flag as borrow: a carry flag of zero signals a borrow from the previous subtraction. If no borrow is desired, the carry flag must be set before the SBC instruction. This lets the ALU's subtract logic reuse the add logic with only the second input inverted, which saves many gates.
  • The read access of the CPU can be delayed by setting the RDY pin to low temporarily. However, during write access, which can take up to three consecutive clock cycles for a BRK instruction, the CPU will stop only in the next read cycle.[111] This quirk was corrected in the CMOS derivatives and also in the 6510 and its variants.
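
Two of the quirks listed above, the indirect JMP page wrap and the inverted borrow of SBC, can be modelled in a few lines of Python. This is a sketch of the described behaviour only; the function names and the `nmos` parameter are assumptions for illustration, and SBC is shown in binary mode.

```python
def jmp_indirect_target(mem, ptr, nmos=True):
    """Return the target of JMP (ptr), with or without the NMOS page-wrap bug."""
    lo = mem[ptr]
    if nmos:
        # The high byte is fetched without carrying into the page number.
        hi_addr = (ptr & 0xFF00) | ((ptr + 1) & 0x00FF)
    else:
        hi_addr = (ptr + 1) & 0xFFFF
    return lo | (mem[hi_addr] << 8)


def sbc(a, operand, carry_in):
    """Binary-mode SBC computed as A + ~M + C; returns (result, carry_out)."""
    total = a + (operand ^ 0xFF) + carry_in
    return total & 0xFF, int(total > 0xFF)


mem = bytearray(65536)
mem[0x10FF], mem[0x1000], mem[0x1100] = 0x34, 0x12, 0x56
print(hex(jmp_indirect_target(mem, 0x10FF, nmos=True)))   # 0x1234
print(hex(jmp_indirect_target(mem, 0x10FF, nmos=False)))  # 0x5634

print(sbc(0x50, 0x10, 1))  # (64, 1): carry set beforehand, no borrow
print(sbc(0x50, 0x10, 0))  # (63, 1): carry clear acts as a borrow of 1
```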

from Grokipedia
The MOS Technology 6502 is an 8-bit microprocessor developed by MOS Technology Inc. and first introduced to customers in September 1975, renowned for its affordability at a retail price of $25, which made it a pivotal component in the early personal computing revolution. It features a 16-bit address bus capable of addressing up to 64 kilobytes of memory and operates at typical clock speeds of around 1 MHz, with internal logic enabling efficient processing despite the modest external clock.

Designed primarily by Chuck Peddle and a team of engineers at MOS Technology, the 6502 was created as a cost-effective alternative to more expensive processors like the Intel 8080, emphasizing simplicity, low power consumption, and compatibility with NMOS technology for broad manufacturability. The chip's development stemmed from Peddle's prior work on the 6800 microprocessor at Motorola, where he sought to simplify it, resulting in a cleaner architecture with 56 instructions and support for zero-page addressing to optimize memory access in resource-constrained systems. MOS Technology, founded in 1969 and acquired by Allen-Bradley in 1970 before becoming independent, introduced the 6502 in 1975. It followed up with the KIM-1 single-board computer in 1976, which used the 6502 and served as an early development platform that helped popularize the processor. Its low cost and performance advantages, being faster and smaller than contemporaries, enabled widespread adoption, powering iconic systems such as the Apple II (1977), Commodore PET (1977), Atari 400/800 (1979), and BBC Micro (1981).

Technically, the 6502 employs a non-pipelined design with an accumulator-based architecture, three general-purpose registers (A, X, Y), a stack pointer, and a program counter, supporting both binary and BCD arithmetic modes. Variants like CMOS versions such as the 65C02 (with additional instructions) extended its lifespan into the 1980s and beyond, while second-sourced chips from Rockwell and Synertek ensured availability. The processor's influence extended to gaming consoles, including the Nintendo Entertainment System (via the Ricoh 2A03 derivative) and Atari 2600, where its efficient interrupt handling and direct memory access capabilities supported real-time operations.

The 6502's legacy endures as a foundational element of computing history, embodying a design philosophy that prioritized accessibility and sparking innovations in home computing, education, and embedded systems; its open architecture even inspired modern recreations and emulations for retro computing enthusiasts. In 2025, the 6502 marked its 50th anniversary, with continued interest in emulations and hardware recreations. By enabling affordable machines that brought computing to millions, it played a crucial role in democratizing technology and laying groundwork for the personal computer industry.

History

Origins at Motorola

Chuck Peddle joined Motorola in 1973, brought on by project lead Tom Bennett to help complete the development of the MC6800 8-bit microprocessor, which was in its final stages. His primary contributions focused on system-level integration rather than the core CPU design, including the creation of the MC6821 Programmable Interface Adapter (PIA), a crucial peripheral chip that handled input/output operations for the 6800 family. Peddle also addressed timing and interfacing issues in the overall architecture, drawing on his prior experience with custom integrated circuits at earlier firms like Collins Radio.

Peddle grew increasingly frustrated with the 6800's architecture, which demanded multiple support chips to function effectively in a system, such as the MC6875 clock generator and various interface adapters, complicating board layouts and driving up manufacturing costs for potential users. Specific design elements exacerbated this, including the 8-bit bidirectional data bus that required careful signal management to avoid contention during read and write cycles, and the need for an external two-phase non-overlapping clock signal operating at up to 1 MHz, which added circuitry overhead and power consumption. These features, while innovative for enabling efficient data flow in an 8-bit system with a 16-bit address bus, made the 6800 less accessible for low-end applications than simpler alternatives like Intel's 4-bit 4040.

Throughout 1974, the year the 6800 was readied for market, Peddle traveled extensively to demonstrate the chip to potential customers and performed detailed cost analyses, concluding that a minimal 6800-based system (the CPU, support chips, and basic memory) would retail for approximately $300, pricing it out of reach for hobbyists and emerging consumer electronics markets. This coincided with Peddle's growing advocacy for a stripped-down, lower-cost microprocessor variant, a push that ultimately clashed with Motorola's focus on industrial and embedded applications.

The 6800 project involved a small core team of about eight engineers, including Bill Mensch, who had joined Motorola in 1971 after graduating from the University of Arizona and contributed significantly to the microprocessor's instruction set and register design. Mensch's work on simplifying data path logic helped shape the 6800's orthogonal architecture, but like Peddle, he shared concerns over the system's overall affordability and complexity.

Conception and Development at MOS Technology

In 1974, Chuck Peddle, a key engineer on Motorola's MC6800 microprocessor project, left the company along with seven colleagues, including Bill Mensch, Gary Ingram, and Rod Orgill, to join MOS Technology, a small semiconductor firm founded in 1969 in Valley Forge, Pennsylvania, by former General Instrument executives John Paivinen, Mort Jaffe, and Don McLaughlin. At MOS, Peddle advocated for the development of a low-cost standalone 8-bit microprocessor, building on the 6800 architecture but redesigned for affordability and ease of use to target broader markets beyond industrial applications.

The team's core design decisions focused on cost reduction through architectural simplifications, aiming for a retail price of $20–$25 per chip. Like the 6800, the 6502 used a dedicated 16-bit unidirectional address bus, an 8-bit bidirectional data bus, and a single 5 V power supply. Unlike the 6800, which depended on an external two-phase clock generator, the 6502 produced its two-phase non-overlapping clock on-chip from a single-phase input, removing a support chip from the board and lowering integration expenses. Instruction decoding used a hardwired programmable logic array (PLA) instead of microcode, which helped keep the die small on MOS's NMOS fabrication process. The transistor count was held to approximately 3,510 enhancement-mode devices (plus about 1,018 depletion-mode pull-ups), fewer than contemporaries like the 6800 or Intel 8080, allowing efficient production on MOS's existing 8 µm NMOS process. A notable companion part was the 6501 variant, engineered for drop-in compatibility with the 6800's 40-pin package and bus signaling, so that it could be fitted to existing 6800 boards with minimal hardware changes; its instruction set was that of the 6502, so 6800 software still had to be rewritten.

Prototyping commenced in late 1974 shortly after the team's arrival, with initial schematics hand-drawn and layout work beginning by year's end; the design taped out in March 1975, yielding first functional samples by June 1975 for internal testing and early customer evaluation. This rapid development cycle, spanning less than a year, underscored MOS's focus on leveraging the team's Motorola experience to deliver a commercially viable CPU that prioritized simplicity and economic accessibility.

MOS Technology unveiled the 6502 microprocessor at the Wescon trade show in San Francisco on September 16, 1975, pricing it at $25 per unit in quantities of 100, while the pin-compatible 6501 variant, designed to directly replace the Motorola 6800, was offered at $20 per unit. This aggressive pricing strategy, far below the $360 initial cost of the Motorola 6800 or Intel 8080, aimed to disrupt the market by making microprocessor technology accessible to hobbyists and small firms, with initial production limited to samples for trade show attendees and early orders. Early customer feedback highlighted the 6502's cost-effectiveness and performance, though the earliest chips shipped without a working ROR (rotate right) instruction, a defect that was corrected in revisions released during 1976.

In response to the 6501's compatibility with its 6800, Motorola filed a lawsuit against MOS Technology in November 1975, alleging patent infringement, misappropriation of trade secrets, and unfair competition based on design similarities derived from former Motorola engineers. The litigation strained MOS's resources amid low-margin sales volumes, exacerbating financial difficulties as production ramped up and legal costs mounted. The case settled out of court in March 1976, with MOS agreeing to pay Motorola $200,000, cease production of the 6501, and return confidential documents, while retaining rights to continue manufacturing and selling the 6502 without further royalties.

The lawsuit's financial toll contributed to MOS Technology's cash flow crisis, prompting its acquisition by Commodore International in October 1976 for an equity stake valued at approximately $12 million, providing stability for ongoing 6502 production. This move allowed Commodore to vertically integrate chip manufacturing, mitigating supply risks while MOS focused on scaling output to meet growing demand.

Adoption in Computing and Gaming

The MOS Technology 6502 microprocessor played a pivotal role in the early personal computing era by powering several landmark systems that democratized access to computing. The Apple I, released in 1976, and the Apple II, introduced in 1977, both used the 6502 as their central processor, with the Apple II series ultimately selling approximately 6 million units over its lifespan and becoming a foundational platform for hobbyists, education, and business applications. Similarly, Commodore International adopted the 6502 for its PET in 1977 and VIC-20 in 1980 (over 1 million units sold), and a 6502 derivative for the Commodore 64 in 1982 (around 12.5 million units sold), making the latter one of the most commercially successful home computers of all time. In the United Kingdom, the BBC Micro, released in 1981 by Acorn Computers, featured the 6502 and sold roughly 1.5 million units, largely driven by its integration into the national education system through the BBC Computer Literacy Project.

In gaming, the 6502's influence extended prominently through dedicated consoles. The Atari 2600 home video game console, launched in 1977, employed the MOS 6507, a reduced version of the 6502, and achieved sales of about 30 million units, bringing programmable entertainment into living rooms and supporting titles that defined the second generation of video games. The Nintendo Entertainment System (NES), released as the Famicom in Japan in 1983 and in North America in 1985, used a 6502 derivative called the Ricoh 2A03; it sold about 61.91 million units worldwide and revived the North American video game market after the 1983 crash.

The 6502's low cost (initially $25, compared with competitors' $175–$200 pricing) enabled an affordable home computing revolution, with systems based on it or its derivatives collectively exceeding 50 million units sold and introducing millions to programming, education, and entertainment. This widespread adoption fostered innovations in software ecosystems, including early versions of BASIC interpreters tailored for the processor.

As of 2025, the 50th anniversary of the 6502's September 1975 launch, its legacy endures in modern contexts. In September 2025, Microsoft open-sourced the source code for its original 1976 Microsoft BASIC Version 1.1 for the 6502 under the MIT license, allowing developers to explore and extend the interpreter that powered many early systems. The processor continues to see use in embedded applications for its simplicity and low power consumption, as well as in retro computing projects like the Olimex NEO6502 board, which combines a modern 65C02 variant with contemporary peripherals for educational and hobbyist experimentation.

Programmer's Model

Registers

The MOS Technology 6502 microprocessor features a minimal register set designed for simplicity and efficiency in 8-bit computing, consisting of six registers: the 8-bit accumulator (A), two 8-bit index registers (X and Y), an 8-bit stack pointer (S or SP), a 16-bit program counter (PC), and an 8-bit status register (P). This limited architecture emphasizes the use of memory locations, particularly the zero page ($0000–$00FF), as pseudo-registers for additional data storage and manipulation.

The accumulator (A) serves as the central register for performing arithmetic, logical, and data transfer operations, acting as the primary input and output for the ALU (arithmetic logic unit). Most instructions that read from or write to memory route data through the accumulator, making it essential for computations like addition, subtraction, AND, OR, and shifts.

The index registers X and Y are 8-bit registers optimized for memory addressing and iteration tasks. Both can index zero-page and absolute addresses; X additionally supports the pre-indexed indirect mode ($addr,X), while Y supports the post-indexed indirect mode ($addr),Y that is typically used for pointer-based array traversal in loops. Both can also hold temporary values and can be incremented, decremented, and compared, but arithmetic and logical results pass through the accumulator.

The stack pointer (SP or S) is an 8-bit register that manages the hardware stack, a fixed 256-byte region in memory from $0100 to $01FF, which grows downward from $01FF toward $0100. It points to the next free stack location: a push stores a byte at $0100 + SP and then decrements SP, while a pull increments SP and then reads the byte. This supports subroutine calls, interrupts, and temporary data storage via instructions like PHA (push accumulator) and PHP (push processor status). The stack's fixed location simplifies hardware design but limits its size and prevents relocation.

The program counter (PC) is a 16-bit register that holds the address of the next instruction byte to fetch, incrementing automatically after each fetch. It spans the 6502's 64 KB address space and is loaded during jumps, branches, and returns from subroutines and interrupts, forming the core of sequential and non-sequential program flow.

The status register (P), also called the processor status or flags register, is an 8-bit register in which each bit represents a condition flag set or cleared by instructions to reflect operation results or processor state. The bits, ordered from MSB (bit 7) to LSB (bit 0), are: N (Negative), V (Overflow), – (unused, always 1 when read or pushed to the stack), B (Break), D (Decimal), I (Interrupt disable), Z (Zero), and C (Carry).

The Negative flag (N) is set if the most significant bit (bit 7) of the result is 1, indicating a negative value in two's complement arithmetic. The Overflow flag (V) signals signed arithmetic overflow, set when addition or subtraction produces a result outside the representable range for 8-bit signed integers (–128 to +127). The Break flag (B) is not stored in the processor itself; it exists only in the copy of P pushed to the stack, where it is 1 when the push comes from BRK or PHP and 0 when it comes from a hardware IRQ or NMI, which lets an interrupt handler tell a software BRK from a hardware interrupt. The Decimal flag (D) enables binary-coded decimal (BCD) mode for ADC and SBC, treating each byte as two 4-bit decimal digits. The Interrupt disable flag (I) blocks maskable interrupts (IRQ) when set, allowing critical sections of code to run uninterrupted. The Zero flag (Z) is set if the operation result is zero and is used for conditional branching on equality. The Carry flag (C) indicates unsigned overflow or borrow in arithmetic, shift, and rotate operations, and is essential for multi-byte arithmetic. Flags are updated selectively by instructions and can be tested for control flow decisions.
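The bit layout of the status register can be illustrated with a small Python sketch. The names `FLAG_BITS`, `pack_status`, and `unpack_status` are hypothetical helpers chosen for this illustration, not part of any 6502 toolchain, and the model only covers how the byte is assembled when it is pushed to the stack.

```python
# Bit positions of the 6502 status register P, from bit 7 down to bit 0.
# "U" is an illustrative name for the unused bit 5.
FLAG_BITS = {"N": 7, "V": 6, "U": 5, "B": 4, "D": 3, "I": 2, "Z": 1, "C": 0}


def pack_status(flags, pushed_by_software=True):
    """Build the status byte as it would appear on the stack.

    Bit 5 always reads as 1. Bit 4 (B) is 1 when the push comes from
    PHP or BRK and 0 when it comes from a hardware IRQ or NMI.
    """
    value = 1 << FLAG_BITS["U"]
    if pushed_by_software:
        value |= 1 << FLAG_BITS["B"]
    for name in flags:
        if name in ("U", "B"):
            continue  # not programmer-controlled flags in this model
        value |= 1 << FLAG_BITS[name]
    return value


def unpack_status(value):
    """Return the names of all bits that are 1 in a status byte."""
    return {name for name, bit in FLAG_BITS.items() if value & (1 << bit)}
```

For example, `pack_status({"Z", "C"})` gives `0x33`, the byte PHP would push when only Zero and Carry are set.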

Addressing Modes

The MOS Technology 6502 microprocessor supports 13 distinct addressing modes, which determine how operands are specified and accessed from memory, enabling efficient code generation by allowing programmers to trade off between address range, speed, and instruction size. These modes leverage the processor's 16-bit address bus while optimizing for common operations like array traversal and conditional branching, with many instructions supporting multiple modes to suit different scenarios. The design emphasizes zero-page addressing for performance-critical tasks, as it reduces fetch cycles compared to full 16-bit modes.

Immediate Mode

In immediate mode, the operand is provided directly as part of the instruction, denoted in assembly as #value, where value is an 8-bit constant. The value is fetched from the instruction stream itself, with no separate data access, making this mode ideal for loading constants or immediate comparisons; it typically requires 2 cycles. It supports only 8-bit values because the operand field is a single byte.

Zero-Page Mode

Zero-page mode addresses locations in the first 256 bytes of memory (addresses $00 to $FF), using a single-byte operand for the low-order address while implying a high byte of $00.[](https://www.princeton.edu/~mae412/HANDOUTS/Datasheets/6502.pdf) The syntax is $addr, and most load/store operations execute in 3 cycles, saving one cycle and one instruction byte over absolute mode by avoiding the high-byte fetch. Programmers often reserve zero page for frequently accessed variables or pointers to maximize speed in performance-sensitive code.

Absolute Mode

Absolute mode provides full 16-bit addressing across the entire 64 KB memory space, with the operand consisting of two bytes: low byte followed by high byte, denoted as $HHHH. It requires 4 cycles for read operations, as the processor fetches both bytes before accessing the target location. This mode is essential for accessing data or code anywhere in memory but incurs higher latency than zero-page variants, making it less suitable for tight loops.
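The size difference between the zero-page and absolute encodings can be sketched in Python. The helper `encode_lda` is hypothetical and only mimics the choice an assembler makes for LDA; the opcodes used are $A5 (LDA zero page) and $AD (LDA absolute).

```python
def encode_lda(address):
    """Return the machine-code bytes for LDA with a direct address operand.

    Addresses in page zero use the 2-byte zero-page form; anything else
    uses the 3-byte absolute form, with the low byte stored first.
    """
    if not 0 <= address <= 0xFFFF:
        raise ValueError("address must fit in 16 bits")
    if address <= 0xFF:
        return bytes([0xA5, address])
    return bytes([0xAD, address & 0xFF, address >> 8])
```

`encode_lda(0x20)` yields two bytes (`A5 20`), while `encode_lda(0x8020)` yields three (`AD 20 80`), which shows the little-endian operand order.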

Indexed Modes

Indexed modes modify zero-page or absolute addresses by adding the contents of the 8-bit X or Y index register to the base address, facilitating array indexing or pointer arithmetic; the syntax is $addr,X or $addr,Y for zero page and $HHHH,X or $HHHH,Y for absolute. Zero-page indexed reads take 4 cycles, while absolute indexed reads require 4 cycles if no page boundary is crossed or 5 cycles otherwise, because the processor needs an extra cycle to correct the high byte of the address. In the zero-page forms the sum wraps around within page zero (for example, $F0 indexed by $20 addresses $10), so the effective address never leaves the first 256 bytes.
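A minimal Python sketch of the effective-address arithmetic for indexed modes follows. The function names are illustrative assumptions, and the cycle counts apply to read instructions only.

```python
def zero_page_indexed(base, index):
    """Effective address for $addr,X or $addr,Y: wraps within page zero."""
    return (base + index) & 0xFF


def absolute_indexed(base, index):
    """Effective address and read-cycle count for $HHHH,X or $HHHH,Y.

    A read costs 4 cycles, plus 1 when adding the index carries into
    the high byte of the address (a page crossing).
    """
    address = (base + index) & 0xFFFF
    crossed = (address & 0xFF00) != (base & 0xFF00)
    return address, 5 if crossed else 4
```

With a base of `0x20F0` and an index of `0x20`, `absolute_indexed` returns `(0x2110, 5)`, since the sum crosses from page $20 into page $21.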

Indirect Mode

Indirect mode, denoted as ($addr), loads a 16-bit effective address from the memory location specified by the operand, then uses that address for the operation; the zero-page forms are ($addr,X) and ($addr),Y.[](https://www.princeton.edu/~mae412/HANDOUTS/Datasheets/6502.pdf) The absolute indirect variant ($HHHH), used only by JMP, is particularly bug-prone: when the low byte of the operand address is $FF (the last byte of a page), the high byte of the target is incorrectly read from offset $00 of the same page instead of from the first byte of the next page. Cycle counts vary: the zero-page indirect modes take 5 to 6 cycles for reads, providing flexible indirection for tables or vectors at the cost of additional fetches.
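The pointer fetches behind the indirect modes, including the page-wrap behaviour of JMP indirect on NMOS parts, can be sketched as follows. Memory is modelled as a 64 KB `bytearray`, and all function names are hypothetical.

```python
def indexed_indirect(memory, zp, x):
    """($addr,X): add X to the zero-page operand, then read a pointer there."""
    ptr = (zp + x) & 0xFF
    return memory[ptr] | (memory[(ptr + 1) & 0xFF] << 8)


def indirect_indexed(memory, zp, y):
    """($addr),Y: read a zero-page pointer, then add Y to it."""
    base = memory[zp] | (memory[(zp + 1) & 0xFF] << 8)
    return (base + y) & 0xFFFF


def jmp_indirect_target(memory, pointer, nmos_bug=True):
    """JMP ($HHHH): with nmos_bug set, the high-byte fetch does not carry
    into the next page when the pointer's low byte is $FF."""
    low = memory[pointer]
    if nmos_bug:
        high_address = (pointer & 0xFF00) | ((pointer + 1) & 0x00FF)
    else:
        high_address = (pointer + 1) & 0xFFFF
    return (memory[high_address] << 8) | low
```

Setting `nmos_bug=False` models a part with the fix applied, as on the later CMOS 65C02.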

Relative Mode

Relative mode is used exclusively for branch instructions, where the operand is a signed 8-bit offset (-128 to +127) from the program counter, allowing conditional jumps within a local range. A branch executes in 2 cycles if it is not taken, 3 cycles if it is taken, and 4 cycles if it is taken and the target lies in a different page, promoting compact control flow in routines without full absolute jumps. The target is calculated by adding the offset to the address of the next instruction, enabling efficient short-range branching in assembly code.
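The offset arithmetic for relative branches can be written out in a few lines of Python. Both helpers are illustrative; `opcode_address` is assumed to be the address of the branch opcode itself, so the next instruction starts two bytes later.

```python
def branch_target(opcode_address, offset_byte):
    """Address reached when a branch at opcode_address is taken."""
    next_instruction = (opcode_address + 2) & 0xFFFF
    # Interpret the operand byte as a signed two's complement value.
    signed = offset_byte - 0x100 if offset_byte & 0x80 else offset_byte
    return (next_instruction + signed) & 0xFFFF


def branch_offset(opcode_address, target):
    """Operand byte an assembler would emit for a branch to target."""
    delta = target - (opcode_address + 2)
    if not -128 <= delta <= 127:
        raise ValueError("target out of range for a relative branch")
    return delta & 0xFF
```

A branch at `0xC010` with operand `0xFB` (that is, -5) lands on `0xC00D`, and `branch_offset(0xC010, 0xC00D)` recovers the same operand byte.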

Stack and Implied Modes

Stack mode operates implicitly on the hardware stack at $0100–$01FF, using the stack pointer register for push and pull operations without an explicit address operand; a push stores the byte and then decrements the stack pointer, while a pull increments the stack pointer and then reads the byte. Implied mode requires no operand at all, with the instruction itself naming the registers involved, as in register transfers or flag changes. Implied instructions are the fastest, typically 2 cycles, because they avoid address resolution entirely; stack pushes take 3 cycles and pulls 4, with the stack providing LIFO storage for subroutines and interrupts. Accumulator mode, a close relative of implied mode, applies shifts and rotates directly to the accumulator, enhancing code density for register-centric tasks.
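On the 6502 a push writes to $0100 + SP and then decrements SP, and a pull increments SP before reading. The Python sketch below models just that behaviour; the class name `Stack6502` is hypothetical, and the initial SP value of $FF assumes that start-up code has executed LDX #$FF followed by TXS.

```python
class Stack6502:
    """Minimal model of the 6502 hardware stack in page 1."""

    def __init__(self):
        self.memory = bytearray(0x10000)
        self.sp = 0xFF  # assumption: software initialised SP to $FF

    def push(self, value):
        # Store first, then decrement (SP wraps within 8 bits).
        self.memory[0x0100 | self.sp] = value & 0xFF
        self.sp = (self.sp - 1) & 0xFF

    def pull(self):
        # Increment first, then read.
        self.sp = (self.sp + 1) & 0xFF
        return self.memory[0x0100 | self.sp]
```

Because SP is only 8 bits wide, pushing more than 256 bytes silently wraps around and overwrites the oldest entries rather than raising any error.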

Instruction Set and Opcodes

The MOS Technology 6502 microprocessor features an instruction set comprising 56 distinct instructions, implemented across 151 unique opcodes within the 8-bit opcode space of 256 possible values. These opcodes encode both the operation and the addressing mode. Many follow a regular bit pattern, often written aaabbbcc, in which the top three bits together with the bottom two select the operation and the middle three bits select the addressing mode, although the pattern has exceptions. The remaining 105 opcodes are undocumented in official specifications; on NMOS parts many of them behave deterministically, often as combinations of documented operations or as no-operations, and some software exploited them for efficiency, while a few are unstable or halt the processor.
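The regular part of the opcode map can be illustrated by decoding the group of eight accumulator instructions whose low two bits are 01. This Python sketch is a partial decoder under that assumption, not a full disassembler; the table and function names are hypothetical.

```python
def split_opcode(opcode):
    """Split an opcode byte into its aaa, bbb, and cc bit fields."""
    return (opcode >> 5) & 0b111, (opcode >> 2) & 0b111, opcode & 0b11


# Operation (aaa) and addressing mode (bbb) tables for opcodes with cc == 01.
GROUP_ONE_OPS = ["ORA", "AND", "EOR", "ADC", "STA", "LDA", "CMP", "SBC"]
GROUP_ONE_MODES = ["(zp,X)", "zp", "#imm", "abs", "(zp),Y", "zp,X", "abs,Y", "abs,X"]


def decode_group_one(opcode):
    """Return (mnemonic, mode) for a documented cc == 01 opcode, else None."""
    aaa, bbb, cc = split_opcode(opcode)
    if cc != 0b01:
        return None  # outside the group this sketch covers
    if GROUP_ONE_OPS[aaa] == "STA" and GROUP_ONE_MODES[bbb] == "#imm":
        return None  # $89: there is no store-immediate instruction
    return GROUP_ONE_OPS[aaa], GROUP_ONE_MODES[bbb]
```

For instance, `decode_group_one(0xA9)` returns `("LDA", "#imm")` and `decode_group_one(0xBD)` returns `("LDA", "abs,X")`.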

Load and Store Instructions

The load and store instructions facilitate data movement between registers and memory locations, supporting the accumulator (A), index registers X and Y, and the stack pointer (SP). Key load instructions include LDA (load accumulator), LDX (load X), and LDY (load Y), which fetch 8-bit values into their respective registers and update the negative and zero flags based on the result. Corresponding store instructions are STA (store accumulator), STX (store X), and STY (store Y), which write register contents to memory without affecting flags. Register transfer instructions such as TAX (A to X), TAY (A to Y), TXA (X to A), TYA (Y to A), TSX (SP to X), and TXS (X to SP) enable efficient data shuffling between registers and the stack pointer; all of them update the negative and zero flags except TXS, which affects no flags. The load and store instructions apply addressing modes such as immediate, zero page, absolute, and indexed variants to specify operands. Example opcodes include $A9 for LDA immediate, $A5 for LDA zero page, $BD for LDA absolute indexed by X, $8D for STA absolute, and $AA for TAX (implied addressing).

Arithmetic Instructions

Arithmetic operations on the 6502 center on addition, subtraction, comparison, and increment/decrement, with support for both binary and binary-coded decimal (BCD) modes controlled by the decimal flag (D). The ADC (add with carry) instruction adds an operand and the carry flag to the accumulator, setting flags for carry, overflow, negative, and zero; in BCD mode, it performs decimal arithmetic. SBC (subtract with borrow) similarly subtracts an operand and the borrow (inverted carry) from the accumulator, with analogous flag updates and BCD support. Comparison instructions CMP (accumulator), CPX (X register), and CPY (Y register) subtract the operand from the register without altering it, updating the carry, negative, and zero flags to indicate relationships like greater than or equal. Increment and decrement operations include INC and DEC for memory locations, along with register-specific INX, DEX (for X), and INY, DEY (for Y); all of these update the negative and zero flags and leave the carry and overflow flags untouched. Notably, the 6502 lacks dedicated multiply or divide instructions, requiring software implementations for such operations. Example opcodes are $69 for ADC immediate, $E9 for SBC immediate, $C9 for CMP immediate, $CA for DEX (implied), and $EE for INC absolute.
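How ADC and SBC derive their flags in binary mode can be sketched in Python. Decimal mode is deliberately left out, the function names are illustrative, and `add16` is a hypothetical helper showing how the carry chains a two-byte addition.

```python
def adc(a, operand, carry_in):
    """Binary-mode ADC: returns (result, flags) for 8-bit inputs."""
    total = a + operand + carry_in
    result = total & 0xFF
    flags = {
        "C": total > 0xFF,
        "Z": result == 0,
        "N": bool(result & 0x80),
        # Overflow: both inputs share a sign that the result does not.
        "V": bool((~(a ^ operand) & (a ^ result)) & 0x80),
    }
    return result, flags


def sbc(a, operand, carry_in):
    """Binary-mode SBC, computed as ADC of the operand's one's complement."""
    return adc(a, operand ^ 0xFF, carry_in)


def add16(x, y):
    """16-bit add built from two 8-bit ADCs (CLC, ADC low, ADC high)."""
    low, flags = adc(x & 0xFF, y & 0xFF, 0)
    high, _ = adc(x >> 8, y >> 8, int(flags["C"]))
    return (high << 8) | low
```

Treating SBC as ADC with an inverted operand is why the carry must be set (SEC) before a subtraction with no incoming borrow.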

Logical Instructions

Logical operations manipulate bits in the accumulator or memory, providing bitwise AND, exclusive-OR, inclusive-OR, bit testing, and shifts/rotates. AND (bitwise AND), EOR (exclusive OR), and ORA (inclusive OR) combine the accumulator with an operand, storing the result back in the accumulator and updating the negative and zero flags; these are essential for masking and toggling bits. The BIT instruction tests specified bits in a memory operand against the accumulator, setting the negative and overflow flags to the operand's bit 7 and 6 respectively, and the zero flag based on the AND result without modifying registers. Shift and rotate instructions include ASL (arithmetic shift left, equivalent to multiply by 2), LSR (logical shift right, divide by 2), ROL (rotate left through carry), and ROR (rotate right through carry), applicable to the accumulator (implied) or memory; they update the negative, zero, and carry flags accordingly. Example opcodes include $29 for AND immediate, $49 for EOR immediate, $09 for ORA immediate, $24 for BIT zero page, $0A for ASL accumulator, and $4A for LSR accumulator.
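The rotate-through-carry behaviour and the flag results of BIT can be modelled in a few lines of Python. These helpers are illustrative sketches that return the new value and carry (or the affected flags) rather than updating any processor state.

```python
def rol(value, carry_in):
    """Rotate left through carry: returns (result, carry_out)."""
    result = ((value << 1) | carry_in) & 0xFF
    return result, (value >> 7) & 1


def ror(value, carry_in):
    """Rotate right through carry: returns (result, carry_out)."""
    result = (value >> 1) | (carry_in << 7)
    return result, value & 1


def bit_test(a, operand):
    """Flags produced by BIT: N and V copy operand bits 7 and 6."""
    return {
        "N": bool(operand & 0x80),
        "V": bool(operand & 0x40),
        "Z": (a & operand) == 0,
    }
```

Because the carry acts as a ninth bit, nine successive rotations in the same direction restore both the original value and the original carry.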

Branch Instructions

Branch instructions enable conditional control flow using relative addressing, offsetting the program counter (PC) by a signed 8-bit value (-128 to +127 bytes). These test specific processor flags: BEQ (branch if equal, zero flag set), BNE (branch if not equal, zero flag clear), BPL (branch if positive, negative flag clear), BMI (branch if minus, negative flag set), BCC (branch if carry clear), BCS (branch if carry set), BVC (branch if overflow clear), and BVS (branch if overflow set). The unconditional JMP (jump) instruction uses absolute or indirect addressing and can reach any address. None of these instructions affect flags; the branches rely on prior operations to set them. Example opcodes are $F0 for BEQ relative, $D0 for BNE relative, $4C for JMP absolute, and $6C for JMP indirect.

Stack Instructions

The 6502's stack, a fixed 256-byte LIFO structure in page 1 ($0100–$01FF), supports subroutine calls, interrupts, and register preservation via dedicated push and pull operations. PHA (push accumulator) and PHP (push processor status) store the accumulator or flags (with the break flag always set) on the stack, decrementing the stack pointer (SP) by 1; PLA (pull accumulator) and PLP (pull processor status) reverse this, incrementing SP and loading values while updating flags (PLP ignores the break flag bit). Subroutine instructions include JSR (jump to subroutine), which pushes the return address minus 1, high byte first and then low byte, and RTS (return from subroutine), which pulls that address and adds 1. For interrupts, RTI (return from interrupt) pulls the status flags first (ignoring the break bit), then the PC low byte, then the PC high byte. The 6502 lacks direct push/pull for X or Y in its base set, though later variants add them. Example opcodes are $48 for PHA (implied), $08 for PHP (implied), $68 for PLA (implied), $28 for PLP (implied), $20 for JSR absolute, $60 for RTS (implied), and $40 for RTI (implied).
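The "return address minus one" convention of JSR and RTS can be traced with a short Python sketch. The functions below are hypothetical and operate on a 64 KB `bytearray`; `pc` is assumed to be the address of the JSR opcode, so the instruction occupies `pc` through `pc + 2`.

```python
def jsr(memory, sp, pc, target):
    """Model JSR at address pc: returns (new_pc, new_sp)."""
    saved = (pc + 2) & 0xFFFF  # last byte of the JSR, not the next opcode
    memory[0x0100 | sp] = saved >> 8      # high byte is pushed first
    sp = (sp - 1) & 0xFF
    memory[0x0100 | sp] = saved & 0xFF    # then the low byte
    sp = (sp - 1) & 0xFF
    return target, sp


def rts(memory, sp):
    """Model RTS: pull low then high byte and add 1. Returns (new_pc, new_sp)."""
    sp = (sp + 1) & 0xFF
    low = memory[0x0100 | sp]
    sp = (sp + 1) & 0xFF
    high = memory[0x0100 | sp]
    return (((high << 8) | low) + 1) & 0xFFFF, sp
```

A JSR at `0x8000` leaves `0x8002` on the stack, and the matching RTS resumes at `0x8003`, the instruction that follows the call.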

Flag and Miscellaneous Instructions

Flag instructions provide direct control over individual status flags using implied addressing, each executing in 2 cycles without affecting other flags. They include: CLC (clear carry flag, $18), SEC (set carry flag, $38), CLD (clear decimal mode, $D8), SED (set decimal mode, $F8), CLI (clear interrupt disable, $58), SEI (set interrupt disable, $78), and CLV (clear overflow flag, $B8). These are crucial for configuring arithmetic modes, enabling/disabling interrupts, and managing carry/overflow in multi-byte operations. BRK (break, $00) is an implied instruction that initiates a software interrupt: it pushes the program counter plus 2 and then the processor status (with the B bit set to 1) onto the stack, sets the interrupt disable flag (I=1), and transfers control through the interrupt vector at $FFFE–$FFFF. It is used for debugging, error handling, or invoking operating system services. NOP (no operation, $EA) is an implied instruction that performs no computational action, simply advancing the program counter by 1 byte and consuming 2 cycles. It is commonly used for precise timing delays or as a placeholder in code.
| Instruction Group | Number of Instructions | Example Opcodes (Hex) |
| --- | --- | --- |
| Load/Store/Transfer | 12 | LDA imm: $A9, STA abs: $8D, TAX: $AA |
| Arithmetic/Compare/Inc-Dec | 11 | ADC imm: $69, SBC zp: $E5, CMP abs: $CD, DEX: $CA |
| Logical/Shift/Rotate | 8 | AND absX: $3D, BIT zp: $24, ASL acc: $0A, ROR abs: $6E |
| Branch/Jump | 9 | BEQ rel: $F0, BCC rel: $90, JMP abs: $4C |
| Stack/Subroutine/Interrupt | 7 | PHA: $48, JSR abs: $20, RTS: $60, RTI: $40 |
| Flag and Miscellaneous | 9 | CLC: $18, SEC: $38, CLD: $D8, BRK: $00, NOP: $EA |
This table summarizes the grouping and provides representative opcodes; full mappings reveal mode-specific variants within each class.

Assembly Language and Code Examples

Assembly language for the MOS Technology 6502 uses a mnemonic-based syntax to represent machine instructions, with common assemblers including ca65 from the cc65 suite and dasm. Ca65 accepts standard 6502 syntax, where a line may include a label followed by a colon, an optional directive like .org to set the origin address, and instructions such as LDA #$42 to load the immediate hexadecimal value $42 into the accumulator. Dasm supports macro capabilities and targets the 6502 among other 8-bit processors, using similar syntax for directives and opcodes. A simple example demonstrates a loop that initializes the accumulator to 0 and adds 1 ten times, using the X register as a counter. The assembly code is:

```
        .org $8000
        LDA #0      ; Load accumulator with 0
        LDX #10     ; Load X with loop count
        CLC         ; Clear carry so the first ADC adds exactly 1
loop:   ADC #1      ; Add 1 to accumulator
        DEX         ; Decrement X
        BNE loop    ; Branch if not equal (X != 0)
```

Disassembly of the resulting machine code (starting at $8000) shows the opcodes: A9 00 (LDA #0), A2 0A (LDX #10), 18 (CLC), 69 01 (ADC #1), CA (DEX), D0 FB (BNE loop, relative branch of -5 bytes). The CLC is needed because ADC always adds the carry flag, whose state is otherwise unknown on entry.

Subroutines are invoked using the JSR (Jump to Subroutine) instruction, which pushes the return address minus one onto the stack, and returned from with RTS (Return from Subroutine), which pulls the address, increments it, and loads it into the program counter. For instance, to call a subroutine that increments a zero-page variable:

```
        JSR increment_var   ; Call the subroutine
        ; Continue here after return; the main program must not
        ; fall through into the subroutine below

increment_var:
        INC $20             ; Increment location $20
        RTS                 ; Return to the caller
```

This pair enables modular code by saving and restoring execution flow via the stack. Interrupt handling in 6502 assembly involves the vector at memory locations $FFFE–$FFFF, which holds the 16-bit address of the IRQ (Interrupt Request) or BRK (Break) handler routine, with the low byte at $FFFE and the high byte at $FFFF. The handler typically saves registers, processes the interrupt, restores state, and ends with RTI (Return from Interrupt) to pull the status flags and program counter from the stack. Best practices in 6502 assembly emphasize zero-page optimization for performance, as instructions accessing the first 256 bytes of memory ($00–$FF) are shorter and usually take fewer cycles than their absolute-addressing equivalents. Programmers allocate frequently used variables and temporaries to zero page to reduce instruction sizes and cycle counts: LDA $20 takes 2 bytes and 3 cycles, whereas loading the same variable from an absolute location such as $8020 with LDA $8020 takes 3 bytes and 4 cycles, a difference that adds up in loops and arithmetic routines.
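Reading a handler address out of the vector table is a simple little-endian fetch, sketched here in Python. The `VECTORS` table and `read_vector` helper are hypothetical; the NMI and RESET entries at $FFFA and $FFFC are included for completeness alongside the IRQ/BRK vector described above.

```python
# Addresses of the low byte of each hardware vector.
VECTORS = {"NMI": 0xFFFA, "RESET": 0xFFFC, "IRQ/BRK": 0xFFFE}


def read_vector(memory, name):
    """Return the 16-bit handler address stored at the named vector."""
    address = VECTORS[name]
    return memory[address] | (memory[address + 1] << 8)
```

With `$00` stored at `$FFFE` and `$C0` at `$FFFF`, `read_vector(memory, "IRQ/BRK")` returns `0xC000`.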

Hardware Design

Integrated Circuit Architecture

The MOS Technology 6502 employs an 8-bit datapath design, featuring an arithmetic logic unit (ALU) capable of performing addition, bitwise logic (AND, OR, XOR), and shifts on 8-bit data words, with subtraction carried out by adding the complement of the operand. The ALU receives inputs from the accumulator and other sources via an internal 8-bit data bus, with results directed back to registers or the bus for storage or output. This structure supports efficient execution of the processor's instructions by minimizing data movement overhead within the chip.

The datapath interconnects the ALU with a limited set of on-chip registers, including the 8-bit accumulator (A), index registers X and Y, an 8-bit stack pointer (SP), a 16-bit program counter (PC), and an 8-bit status register (P), all linked through a shared internal data bus and separate address generation logic. Data flows between these elements and external memory via the chip's 8-bit bidirectional data bus (D0-D7) and 16-bit address bus (A0-A15). This bus-oriented architecture balances simplicity and performance, using dynamic latches clocked by internal phases to propagate signals across the datapath.

The control unit is a hardwired finite state machine (FSM) that orchestrates instruction execution through approximately 24 timing states, decoding opcodes fetched during the initial cycle and generating control signals to sequence ALU operations, register loads, and bus transfers over subsequent cycles. These states ensure precise synchronization, with the FSM advancing based on the instruction type and addressing mode; instructions take 2 to 7 machine cycles. The design avoids microcode for reduced complexity, relying instead on a programmable logic array (PLA) to map opcode-state combinations to datapath controls.

Fabricated in NMOS technology, the 6502 integrates roughly 3,510 transistors on a die measuring 3.9 mm × 4.3 mm, contributing to its compact footprint and low cost. Operating at clock speeds of 1-2 MHz, it consumes approximately 450 mW at 1 MHz. The clock subsystem accepts a single-phase input signal, which is internally buffered and split into two non-overlapping phases (phi1 and phi2) to drive latching and avoid race conditions; broadly, phi1 is used for internal computation and address setup, while phi2 times the data transfer on the external bus.

Process Technology Evolution

The MOS Technology 6502 entered production in 1975 using a depletion-load NMOS fabrication process, which allowed for a single 5 V power supply and improved performance over earlier enhancement-mode NMOS designs like the Motorola 6800. This initial NMOS implementation, known as the "019" process developed at MOS Technology, featured approximately 3,510 transistors on a die measuring 3.9 mm × 4.3 mm and operated at a standard clock speed of 1 MHz, enabling efficient 8-bit processing for early microcomputers.

Subsequent NMOS iterations of the 6502 by MOS Technology and Commodore Semiconductor Group increased clock speeds for better performance in consumer systems, with variants reaching up to 2 MHz in products like the Commodore 128 and Atari systems, while maintaining compatibility with the original design. These NMOS versions consumed around 450 mW at 1 MHz, dissipating considerably more heat than the later CMOS parts.

The transition to CMOS began with the Western Design Center's (WDC) 65C02 in 1982, an enhanced drop-in replacement that retained pin compatibility while adopting a complementary metal-oxide-semiconductor process for drastically reduced power consumption, approximately 20 mW at 1 MHz, making it well suited to battery-powered and embedded applications. Early CMOS variants prioritized energy efficiency over raw speed, though later WDC parts such as the W65C02S reached clock speeds up to 14 MHz while keeping power under 10 mW at lower frequencies, highlighting CMOS's advantages in scalability and thermal management.

As of 2025, WDC continues to offer CMOS-based 6502 derivatives, such as the W65C02S, using processes of 1.2 μm or finer for embedded systems in automotive, industrial, and IoT devices, ensuring long-term availability with power consumption below 1 mW/MHz at reduced voltages.

Variants and Derivatives

Second-Source Manufacturers

To ensure a reliable supply chain for the MOS Technology 6502 microprocessor, MOS licensed its design to several second-source manufacturers who produced compatible clones. These licensees manufactured exact 8-bit replicas that maintained full software and hardware compatibility with the original, serving as drop-in replacements with only minor variations in operating speed and power requirements.

Rockwell International was an early second-source licensee, producing the R6500 series, which included the R6502 processor and supporting peripherals like the R6532 RAM/I/O/Timer. The R6500 family was employed in demanding environments, including military and aerospace applications, due to Rockwell's expertise in defense electronics and its emphasis on robust, cost-effective microcomputer systems. Rockwell's production helped lower overall 6502 costs and extended availability for industrial uses.

Synertek, another key licensee, developed the SY6500 series, encompassing the SY6502 microprocessor and compatible support chips, marketed as a totally software-compatible family for embedded and consumer systems. Synertek supplied these chips to major customers, including Atari for their 8-bit computers and gaming consoles, contributing to the widespread adoption in personal computing. The SY6500 line emphasized high-volume production and integration into development kits like the SYM-1 single-board computer.

Additional licensees included California Micro Devices (under GTE Microcircuits) and various international firms, though Japanese production was less prevalent and typically limited to specific regional markets. Most second-source manufacturing ended by the early 1990s as demand shifted to CMOS derivatives, but limited legacy production persisted for embedded and replacement applications into the late 1990s.

MOS 6507

The MOS 6507 was a lower-cost variant of the 6502 developed by MOS Technology, featuring a reduced 13-bit address bus (A0–A12) that limited addressable memory to 8 KB instead of the standard 64 KB. This reduction lowered the pin count to 28 pins from the original 40-pin package, reducing manufacturing costs and making it suitable for cost-sensitive consumer products. The 6507 retained the full 6502 instruction set, including support for binary-coded decimal (BCD) mode, ensuring complete software compatibility. It was primarily used in the Atari 2600 (originally Atari Video Computer System) video game console released in 1977, where it operated at approximately 1.19 MHz and supported the console's minimal memory configuration of up to 4 KB ROM and 128 bytes of RAM within its constrained address space.
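The effect of the narrower address bus can be illustrated with a short sketch. It assumes, as is commonly described, that the 6507 core still computes full 16-bit addresses and that only the low 13 bits reach the pins; the function and constant names are hypothetical and exist only for this illustration.

```python
# Sketch: how a 16-bit CPU address appears on a 13-bit external bus.
# Names are illustrative assumptions, not taken from any datasheet.
ADDRESS_MASK_6507 = 0x1FFF  # 13 address pins (A0-A12) give an 8 KB window


def bus_address_6507(cpu_address):
    """Return the address visible on the 13 external address pins."""
    return cpu_address & ADDRESS_MASK_6507


# The 6502 reset vector is read from $FFFC/$FFFD; with only 13 pins
# the same fetch is seen by external hardware at $1FFC/$1FFD.
reset_vector_on_pins = bus_address_6507(0xFFFC)
print(hex(reset_vector_on_pins))
```

Because the upper three address bits never leave the chip, every 8 KB block of the 64 KB logical space aliases the same physical locations, which is why unmodified 6502 code can still run.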

Enhanced and Extended Versions

The MOS 6510, introduced in 1982 for the Commodore 64, is a derivative of the 6502 featuring an integrated 8-bit bidirectional I/O port and tri-state address lines to enable dynamic memory banking between RAM and ROM. This design allowed the Commodore 64 to use a single 64 KB memory space more efficiently without additional hardware for address decoding. The MOS 8500, released in 1985 as an HMOS-II process variant of the 6510, maintained full compatibility while reducing power consumption and heat generation compared to the original NMOS implementation, and it became standard in later Commodore 64 revisions.

Western Design Center's (WDC) 65C02, launched in 1983 as a CMOS redesign, addressed the power consumption of the NMOS 6502 by operating at lower voltages and including enhancements such as the STP instruction to halt the clock for power saving and the WAI instruction to pause execution until an interrupt. It expanded the instruction set with bit manipulation operations like TSB (test and set bits) and TRB (test and reset bits), additional addressing modes for BIT, and relative branches on bit states (BBR and BBS) for more efficient conditional code. These additions improved code density and performance in embedded systems without altering the core 8-bit architecture.

For 16-bit capabilities, WDC introduced the 65816 in 1985, extending the 6502 lineage with a 16-bit accumulator, 16-bit index registers (X and Y), and a 24-bit address bus supporting up to 16 MB of memory, while retaining backward compatibility through an emulation mode that mimics the original 6502 behavior. Software-controlled flags switch between 8-bit and 16-bit operation, enabling hybrid applications that use the extended registers for faster arithmetic and larger addressing. This processor powered the Super Nintendo Entertainment System (SNES) via Ricoh's 5A22 variant, which integrated additional features such as DMA controllers.
Ricoh's 2A03, developed in 1982 for the Nintendo Entertainment System (NES), incorporated a 6502-compatible core but omitted binary-coded decimal (BCD) mode by disconnecting the relevant circuitry on the die, a modification that avoided patent issues. It also integrated an Audio Processing Unit (APU) with five sound channels: two pulse waves, a triangle wave, noise, and DPCM for sampled audio. Memory-mapped registers for joypad input and sprite DMA were embedded as well, streamlining the NES hardware design into a single 40-pin chip running at 1.79 MHz.

Hudson Soft's HuC6280, introduced in 1987 with the PC Engine (TurboGrafx-16), built on the 65C02 with an integrated Memory Management Unit (MMU) using eight mapping registers to expand the address space to 2 MB, far exceeding the 6502's 64 KB limit. Additional features included a 7-bit interval timer for interrupt-driven timing, and the CPU could switch between 1.79 MHz and 7.16 MHz to balance performance and power. These enhancements supported the console's more demanding graphics and memory requirements.

In 2025, 6502 derivatives continue in embedded and retro applications, with WDC producing low-power 65C02 and 65816 variants for IoT devices due to their static CMOS design and minimal instruction overhead. Modern projects like the 65uino single-board computer integrate the 65C02 with USB and peripherals for educational retro programming, while the Neo6502 pairs it with an RP2040 coprocessor for hybrid embedded systems handling USB and memory emulation. Tools such as llvm-mos enable contemporary C/C++ development on these cores, sustaining their use in niche low-power IoT sensors and vintage hardware revivals.
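The HuC6280's banked addressing described above can be sketched as follows. This is a minimal model under stated assumptions: the top three bits of a 16-bit logical address select one of the eight mapping registers, each register holds an 8-bit bank number, and each bank is 8 KB. The function name and the example register values are hypothetical.

```python
# Sketch of an eight-register MMU that widens 16-bit logical addresses
# to 21-bit physical addresses (2 MB). Assumed layout, for illustration.
PAGE_BITS = 13                      # each mapping register covers 8 KB
OFFSET_MASK = (1 << PAGE_BITS) - 1  # low 13 bits pass through unchanged


def translate(mapping_registers, logical_address):
    """Map a 16-bit logical address to a 21-bit physical address."""
    if len(mapping_registers) != 8:
        raise ValueError("expected exactly eight mapping registers")
    page = (logical_address >> PAGE_BITS) & 0x7   # which register to use
    offset = logical_address & OFFSET_MASK        # position inside the bank
    bank = mapping_registers[page] & 0xFF         # 8-bit bank number
    return (bank << PAGE_BITS) | offset


# Hypothetical register contents: logical page 7 points at bank $01.
registers = [0x00] * 7 + [0x01]
print(hex(translate(registers, 0xE123)))
```

With eight bits of bank number and thirteen bits of offset, the largest reachable physical address is $1FFFFF, which matches the 2 MB figure given in the text.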

Quirks and Limitations

Notable Bugs

The original MOS Technology 6502 microprocessor, particularly its NMOS implementations, exhibited several hardware bugs and quirks that affected instruction execution and flag status updates. These issues were present in early silicon masks produced primarily before the 1980s and were later mitigated in CMOS derivatives like the WDC 65C02.

One prominent bug involves the indirect jump instruction (JMP (indirect)) when the address vector straddles a 256-byte page boundary, specifically when the low byte of the vector address is $FF. In such cases, the processor fetches the high byte of the target address from the same page (e.g., for JMP ($xxFF), the high byte comes from $xx00 instead of $(xx+1)00), leading to an incorrect jump destination. This occurs because the 6502's address increment logic fails to carry over the page boundary during the fetch of the high byte. For example, JMP ($00FF) fetches the low byte from $00FF and the high byte from $0000 (due to the bug) instead of $0100, potentially causing jumps to unintended locations. This bug is characteristic of NMOS revisions and requires workarounds such as aligning vectors away from page boundaries.

In decimal mode (enabled by the SED instruction), the ADC (add with carry) and SBC (subtract with borrow) instructions perform binary arithmetic internally before adjusting the result to binary-coded decimal (BCD), but the status flags are updated based on the pre-adjustment binary values rather than the final BCD result.
Consequently, the negative (N), zero (Z), and overflow (V) flags do not reliably describe the decimal result. For instance, adding $99 and $01 in decimal mode leaves $00 in the accumulator with carry set, yet Z stays clear because the underlying binary sum $9A is non-zero. The V flag is likewise derived from signed overflow in the binary operation and has no meaningful BCD interpretation (adding $59 and $01 yields $60 with V clear, while other operand pairs can set V even though the decimal result is in range). Reproduction involves setting the D flag, performing ADC or SBC on BCD values that cause digit carries, and observing mismatched flags; this quirk affects arithmetic validation in BCD-dependent code and was not corrected until CMOS versions.

Certain undocumented opcodes, resulting from unused combinations in the 6502's opcode decode matrix, trigger a "JAM" or "kill" (KIL) condition that halts the processor indefinitely. These include opcodes $02, $12, $22, $32, $42, $52, $62, $72, $92, $B2, $D2, and $F2, which attempt invalid addressing modes and trap the CPU in an infinite fetch cycle (T1 state) with $FF on the data bus, ignoring interrupts until reset. The standard BRK opcode ($00) is a documented software interrupt and does not halt; some illegal opcodes, however, begin a partial BRK-like sequence that never completes, leading to the JAM state. The behavior can be reproduced by assembling and running one of these opcodes directly. It stems from unhandled cases in the NMOS decode logic, and the same opcodes are treated as NOPs in later CMOS variants.

These bugs primarily affected NMOS silicon, from Revision A (pre-1976 masks lacking full ROR support) through the NMOS variants used in 1970s systems like the KIM-1 and Apple II, before the 1980s transition to CMOS processes. Later revisions and second-source CMOS implementations resolved most issues, though software and emulators targeting NMOS hardware must reproduce these quirks.
Workarounds, such as avoiding boundary-aligned vectors or inserting flag-correcting instructions like EOR #$00 after decimal operations, were commonly employed in period software.
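The indirect-jump page-wrap bug described in this section can be modeled in a few lines. The sketch below is an illustration only: the function name, the `nmos_bug` switch, and the memory contents are hypothetical, and memory is represented as a flat 64 KB byte array.

```python
# Sketch: target address computed by JMP (vector), with and without the
# NMOS page-wrap bug. All names and test values are illustrative.
def jmp_indirect_target(memory, vector, nmos_bug=True):
    """Return the 16-bit address that JMP (vector) would load into PC."""
    low = memory[vector]
    if nmos_bug:
        # Only the low byte of the pointer is incremented, so $xxFF
        # wraps to $xx00 instead of advancing to the next page.
        high_address = (vector & 0xFF00) | ((vector + 1) & 0x00FF)
    else:
        # Corrected behavior: full 16-bit increment of the pointer.
        high_address = (vector + 1) & 0xFFFF
    high = memory[high_address]
    return (high << 8) | low


memory = bytearray(0x10000)
memory[0x02FF] = 0x34  # low byte of the intended target $1234
memory[0x0300] = 0x12  # intended high byte, on the next page
memory[0x0200] = 0x56  # byte an NMOS part reads instead

print(hex(jmp_indirect_target(memory, 0x02FF, nmos_bug=True)))
print(hex(jmp_indirect_target(memory, 0x02FF, nmos_bug=False)))
```

For any vector whose low byte is not $FF the two branches compute the same address, which is why keeping vectors off the last byte of a page is a sufficient workaround.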

Design Trade-offs and Workarounds

The MOS Technology 6502's design prioritized cost reduction and simplicity, omitting dedicated multiply and divide instructions to minimize transistor count and die size, which allowed the chip to be priced at $25 upon release, far below competitors like the Intel 8080. This trade-off enabled widespread adoption in early personal computers but necessitated software implementations for multiplication and division, such as table lookups or bit-shifting loops, which could consume dozens of cycles and additional bytes per operation compared to hardware support in later processors. While this increased computational overhead for numerical tasks, it aligned with the 6502's target market of resource-constrained systems where such operations were infrequent or could be optimized via precomputation.

The 6502's stack, limited to 256 bytes on a fixed page ($0100–$01FF), simplified the hardware by using an 8-bit stack pointer without needing a separate page register, reducing complexity and power consumption. This constraint curtails deep recursion or extensive subroutine nesting, potentially leading to stack overflows in complex programs, though it proved adequate for most 1970s-era applications like game loops or simple OS kernels that favored iterative designs over recursive ones. Programmers mitigated this by employing software-managed stacks in main memory or restricting call depth, techniques that added minimal overhead in flat address spaces but required careful management to avoid fragmentation.

Interrupt handling introduced further trade-offs, with the non-maskable interrupt (NMI) using edge-triggered detection to ensure responsiveness in critical scenarios like power failure detection, but this created potential race conditions if an edge occurred during the processor's fetch cycle, risking missed interrupts without external latching hardware.
Maskable IRQs, being level-sensitive, avoided some of these issues but still required precise timing to prevent nested interrupts from corrupting state. To address these, designers often incorporated external edge detectors or software polling as safeguards, balancing reliability against added circuitry cost.

A notable workaround arose from the indirect jump (JMP (addr)) instruction's behavior, where vectors spanning a page boundary fetched the high byte from the wrong page due to a hardware oversight in address incrementation, potentially causing jumps to invalid locations. Programmers circumvented this by aligning vectors within the same page or substituting absolute jumps, a practice that avoided the bug without performance penalty in most cases but increased code size slightly for affected routines.

Decimal mode, enabled via the SED instruction for BCD arithmetic, suffered from inconsistent flag behavior on the original NMOS 6502, where the negative (N), overflow (V), and zero (Z) flags did not reliably reflect the decimal result after ADC or SBC operations, complicating conditional branching in financial or display code. A common workaround involved appending an EOR #$00 (or equivalent) immediately after the arithmetic instruction to set N and Z from the accumulator's final value, adding two bytes and two cycles per operation but ensuring correct branching without altering the result. The carry (C) flag remained valid for decimal comparisons, allowing its use in loops or validations.

These design choices and their mitigations collectively imposed a performance toll through additional instructions and careful coding practices in affected software.
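The decimal-mode flag mismatch and the EOR #$00 workaround can be illustrated with a small model. This is a sketch under assumptions: it follows a commonly described account of NMOS behavior in which Z comes from the plain binary sum and N from an intermediate sum taken before the final high-digit correction, it handles only valid BCD operands, and it has not been checked against silicon. Both function names are hypothetical.

```python
# Sketch: decimal-mode ADC on an NMOS 6502, assuming valid BCD inputs.
# The flag rules are an assumed model, not a verified hardware trace.
def adc_decimal_nmos(a, b, carry_in):
    """Return accumulator, carry, and the N/Z flags as the CPU sets them."""
    zero = ((a + b + carry_in) & 0xFF) == 0     # Z follows the binary sum

    low = (a & 0x0F) + (b & 0x0F) + carry_in    # add the low digits
    if low >= 0x0A:                             # low digit exceeded 9
        low = ((low + 0x06) & 0x0F) + 0x10      # correct it, carry a ten
    total = (a & 0xF0) + (b & 0xF0) + low
    negative = bool(total & 0x80)               # N taken before final fix
    if total >= 0xA0:                           # high digit exceeded 9
        total += 0x60
    return {"a": total & 0xFF, "c": total >= 0x100,
            "z": zero, "n": negative}


def flags_after_eor_zero(accumulator):
    """N and Z as recomputed by EOR #$00, which leaves A unchanged."""
    return {"z": accumulator == 0, "n": bool(accumulator & 0x80)}


result = adc_decimal_nmos(0x99, 0x01, 0)   # decimal 99 + 1
print(result)
print(flags_after_eor_zero(result["a"]))
```

In this model, 99 + 1 leaves $00 in the accumulator with carry set, yet Z is clear and N is set until the EOR #$00 recomputes both from the final accumulator value.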
Later variants, such as the Western Design Center 65C02, addressed several of these issues. They fixed the indirect JMP bug by properly incrementing the page for high-byte fetches, improved decimal mode handling (for example, clearing the D flag during BRK/IRQ entry to prevent carryover into handlers), and added opcodes such as bit test instructions to reduce software overhead. These enhancements made the 65C02 more suitable for modern embedded uses without altering the core architecture's efficiency.
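Returning to the missing multiply instruction discussed at the start of this section, the bit-shifting approach can be sketched as below. The code models in Python the shift-and-add loop that a 6502 routine would typically build from shifts, rotates, and ADC; it is an illustration of the technique under that assumption, not a transcription of any particular period routine, and the function name is hypothetical.

```python
# Sketch: 8-bit by 8-bit unsigned multiply using shift-and-add,
# producing a 16-bit product. Names are illustrative.
def multiply_8x8(multiplicand, multiplier):
    """Multiply two unsigned bytes the way a software loop would."""
    product = 0
    shifted = multiplicand & 0xFF   # multiplicand, doubled each pass
    remaining = multiplier & 0xFF   # multiplier, consumed bit by bit
    for _ in range(8):              # one pass per multiplier bit
        if remaining & 1:           # low bit set: add the partial product
            product = (product + shifted) & 0xFFFF
        shifted = (shifted << 1) & 0xFFFF
        remaining >>= 1
    return product


print(multiply_8x8(13, 11))
```

The loop always runs eight passes, each involving a test, a conditional add, and two shifts, which gives a sense of why a software multiply costs far more than a single hardware instruction would.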
