AMD 10h
from Wikipedia
K10 / Family 10h
General information
Launched: 2007
Discontinued: 2012
Common manufacturer
Performance
Max. CPU clock rate: 1700 MHz to 3700 MHz
FSB speeds: 1000 MHz to 2000 MHz
Architecture and classification
Technology node: 65 nm to 32 nm
Instruction set: AMD64 (x86-64-v1)
Physical specifications
Sockets
Products, models, variants
Core names
History
Predecessor: K8 - Hammer
Successor: Bulldozer - Family 15h
Support status
iGPU unsupported

The AMD Family 10h, or K10, is a microprocessor microarchitecture by AMD based on the K8 microarchitecture.[1] The first third-generation Opteron products for servers were launched on September 10, 2007, and the Phenom processors for desktops followed on November 19, 2007, as the immediate successors to the K8 series of processors (Athlon 64, Opteron, 64-bit Sempron).

Nomenclature

AMD appears not to have used K-nomenclature (the "K" originally stood for "Kryptonite" in the K5 processor[2]) since the codename K8 for the Athlon 64 processor family: no K-nomenclature naming beyond K8 has appeared in official AMD documents or press releases since the beginning of 2005.

The name "K8L" was coined in 2005 by Charlie Demerjian, at the time a writer at The Inquirer,[3] and was used by the wider IT community as a convenient shorthand,[4] while official AMD documents termed the processor family "AMD Next Generation Processor Technology".[5]

The microarchitecture has also been referred to as Stars, as the codenames for the desktop line of processors were taken from stars or constellations (the initial Phenom models being codenamed Agena and Toliman).

In a video interview,[6] Giuseppe Amato confirmed that the codename is K10.

It was revealed, by The Inquirer itself, that the codename "K8L" referred to a low-power version of the K8 family, later named Turion 64, and that K10 was the official codename for the microarchitecture.[4]

AMD refers to it as Family 10h Processors, as it is the successor of the Family 0Fh Processors (codename K8). 10h and 0Fh refer to the main result of the CPUID x86 processor instruction. In hexadecimal numbering, 0Fh (h represents hexadecimal numbering) equals the decimal number 15, and 10h equals decimal 16. (The "K10h" form that sometimes pops up is an improper hybrid of the "K" code and Family identifier number.)
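
As a minimal sketch of how the family number falls out of CPUID, the snippet below decodes a processor signature, assuming the standard x86 layout of the EAX value returned by CPUID function 1 (stepping, base model, base family, extended model, extended family). The function name and the first sample value are illustrative assumptions; 00100F22h is the signature this page quotes for the B2 stepping.

```python
def decode_cpuid_signature(eax: int) -> dict:
    """Split a CPUID function 1 EAX value into family, model and stepping."""
    stepping = eax & 0xF
    base_model = (eax >> 4) & 0xF
    base_family = (eax >> 8) & 0xF
    ext_model = (eax >> 16) & 0xF
    ext_family = (eax >> 20) & 0xFF

    # The extended fields only contribute once the 4-bit base family is
    # saturated at 0Fh: family = base + extended, model = ext:base.
    if base_family == 0xF:
        family = base_family + ext_family
        model = (ext_model << 4) | base_model
    else:
        family = base_family
        model = base_model
    return {"family": family, "model": model, "stepping": stepping}


# 0x00000F00 is a made-up Family 0Fh signature (extended family 0);
# 0x00100F22 is the B2 value quoted on this page (extended family 01h).
for signature in (0x00000F00, 0x00100F22):
    fields = decode_cpuid_signature(signature)
    print(f"{signature:08X}h -> family {fields['family']:02X}h, "
          f"model {fields['model']:02X}h, stepping {fields['stepping']:X}h")
```

With the base family fixed at 0Fh, an extended family of 01h gives 0Fh + 01h = 10h, which is decimal 16.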

Schedule of launch and delivery

Timeline

Historical information

In 2003, AMD outlined the features for upcoming generations of microprocessors after the K8 family of processors in various events and analyst meetings, including the Microprocessor Forum 2003.[7] The outlined features to be deployed by the next-generation microprocessors are as follows:

In June 2006, AMD executive vice president Henri Richard, in an interview with DigiTimes, commented on upcoming processor developments:

Q: What is your broad perspective on the development of AMD processor technology over the next three to four years?

A: Well, as Dirk Meyer commented at our analysts meeting, we're not standing still. We've talked about the refresh of the current K8 architecture that will come in '07, with significant improvements in many different areas of the processor, including integer performance, floating point performance, memory bandwidth, interconnections and so on.

— AMD Executive Vice President, Henri Richard, Source: DigiTimes Interview with Henri Richard[8]


Live demonstrations

On November 30, 2006, AMD gave the first public live demonstration of its native quad-core chip, known as "Barcelona",[9] running Windows Server 2003 64-bit Edition. AMD claimed 70% performance scaling in real-world loads and better performance than the Intel Xeon 5355 processor codenamed Clovertown.[10]

On January 24, 2007, AMD Executive Vice President Randy Allen claimed that in live tests across a wide variety of workloads, "Barcelona" demonstrated a 40% performance advantage over comparable dual-processor (2P) quad-core Intel Xeon processors codenamed Clovertown.[11] The expected floating-point performance per core was approximately 1.8 times that of the K8 family at the same clock speed.[12]

On May 10, 2007, AMD held a private event demonstrating upcoming chipsets and the upcoming processors codenamed Agena FX. One demonstrated system was an AMD Quad FX platform with one Radeon HD 2900 XT graphics card on the upcoming RD790 chipset. The system was also shown converting a 720p video clip into another, undisclosed format in real time while all 8 cores were held at 100% load by other tasks.[13]

Sister microarchitecture

At the December 2006 analyst day, Executive Vice President Marty Seyer announced a new mobile core codenamed Griffin, to launch in 2008, based on a K8 design but with power-optimization technologies inherited from the K10 microarchitecture.

TLB bug

In November 2007, AMD stopped delivery of Barcelona processors after a bug was discovered in the translation lookaside buffer (TLB) of stepping B2 that could, in rare cases, lead to a race condition and thus a system lockup.[14] A BIOS or software patch worked around the bug by disabling caching of page tables, at the cost of a 5 to 20% performance penalty. Kernel patches that avoided almost all of this penalty were published for Linux. In April 2008, AMD brought the new stepping B3 to market, including a fix for the bug plus other minor enhancements.[15]

Features

Fabrication technology

AMD introduced the microprocessors on a 65 nm manufacturing process using silicon-on-insulator (SOI) technology, since the release of K10 coincided with the volume ramp of this process.[16]

Supported DRAM standards

The K8 family was known to be particularly sensitive to memory latency, since its design gains performance by minimizing latency through an on-die memory controller (integrated into the CPU); increased latency in the external modules negates the usefulness of the feature. DDR2 RAM introduces some additional latency over DDR RAM, since the DRAM is internally driven by a clock at one quarter of the external data frequency, as opposed to one half for DDR. However, since the command clock rate in DDR2 is doubled relative to DDR and other latency-reducing features (e.g. additive latency) have been introduced, comparisons based on CAS latency alone are not sufficient. For example, Socket AM2 processors are known to demonstrate similar performance using DDR2 SDRAM as Socket 939 processors that use DDR-400 SDRAM. K10 processors support DDR2 SDRAM rated up to DDR2-1066 (1066 MT/s).[17]
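
The clock relationships in this paragraph can be made concrete with a short worked example. It is only a sketch: the CAS values used (CL3 for DDR-400, CL5 for DDR2-800) are assumed typical timings rather than figures from this article, and the function names are made up for illustration.

```python
def internal_array_clock_mhz(data_rate_mts: float, generation: str) -> float:
    """Internal DRAM clock: half the data rate for DDR, a quarter for DDR2."""
    divisor = {"DDR": 2, "DDR2": 4}[generation]
    return data_rate_mts / divisor


def cas_latency_ns(data_rate_mts: float, cas_cycles: int) -> float:
    """Convert a CAS latency in command-clock cycles to nanoseconds.

    The command clock runs at half the data rate (two transfers per clock).
    """
    command_clock_mhz = data_rate_mts / 2
    return cas_cycles * 1000 / command_clock_mhz


# Assumed example modules: (name, generation, data rate in MT/s, CAS cycles).
modules = [("DDR-400", "DDR", 400, 3), ("DDR2-800", "DDR2", 800, 5)]
for name, generation, rate, cl in modules:
    print(f"{name}: internal clock {internal_array_clock_mhz(rate, generation):.0f} MHz, "
          f"CL{cl} = {cas_latency_ns(rate, cl):.1f} ns")
```

Under these assumed timings, the DDR2 module has the larger cycle count but a comparable absolute latency, which is why cycle counts alone make a poor comparison.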

While some desktop K10 processors are AM2+ supporting only DDR2, an AM3 K10 processor supports both DDR2 and DDR3. A few AM3 motherboards have both DDR2 and DDR3 slots (this does not mean that both types can be fitted at the same time), but for the most part they have only DDR3.

Lynx desktop processors only support DDR3, as they use the FM1 socket.

Microarchitecture characteristics

Figure: K10 architecture. K10 single core with overlay description, excluding the L2 cache array.

Characteristics of the microarchitecture include the following:[18]

  • Form factors
    • Socket AM2+ with DDR2 for the 65 nm Phenom and Athlon 7000 Series
    • Socket AM3 with either DDR2 or DDR3 for Semprons and the 45 nm Phenom II and Athlon II Series. They can also be used on AM3+ motherboards with DDR3. Note that, while all K10 Phenom processors are backwards compatible with Socket AM2+ and Socket AM2, some 45 nm Phenom II processors are only available for Socket AM2+. Lynx processors use neither AM2+ nor AM3.
    • Socket FM1 with DDR3 for Lynx processors.
    • Socket F with DDR2, DDR3 with Shanghai and later Opteron processors
  • Instruction set additions and extensions
    • New bit-manipulation instructions ABM: Leading Zero Count (LZCNT) and Population Count (POPCNT)
    • New SSE instructions named as SSE4a: combined mask-shift instructions (EXTRQ/INSERTQ) and scalar streaming store instructions (MOVNTSD/MOVNTSS). These instructions are not found in Intel's SSE4
    • Support for unaligned SSE load-operation instructions (which formerly required 16-byte alignment)[19]
  • Execution pipeline enhancements
    • 128-bit wide SSE units
    • Wider L1 data cache interface allowing for two 128-bit loads per cycle (as opposed to two 64-bit loads per cycle with K8)
    • Lower integer divide latency
    • 512-entry indirect branch predictor and a larger return stack (size doubled from K8) and branch target buffer
    • Side-Band Stack Optimizer, dedicated to performing increments and decrements of the stack pointer register
    • Fastpathed CALL and RET-Imm instructions (formerly microcoded) as well as MOVs from SIMD registers to general purpose registers
  • Integration of new technologies onto CPU die:
    • Four processor cores (Quad-core)
    • Split power planes for CPU core and memory controller/northbridge for more effective power management, first dubbed Dynamic Independent Core Engagement or D.I.C.E. by AMD and now known as Enhanced PowerNow! (also dubbed Independent Dynamic Core Technology), allowing the cores and northbridge (integrated memory controller) to scale power consumption up or down independently.[20]
    • Shutting down portions of the circuits in a core when not under load, named "CoolCore" Technology.
  • Improvements in the memory subsystem:
    • Improvements in access latency:
      • Support for re-ordering loads ahead of other loads and stores
      • More aggressive instruction prefetching, 32 bytes instruction prefetch as opposed to 16 bytes in K8
      • DRAM prefetcher for buffering reads
      • Buffered burst writeback to RAM in order to reduce contention
    • Changes in memory hierarchy:
      • Prefetch directly into L1 cache as opposed to L2 cache with K8 family
      • 32-way set associative L3 victim cache sized at least 2 MB, shared between processing cores on a single die (each with 512 KB of independent exclusive L2 cache), with a sharing-aware replacement policy.
      • Extensible L3 cache design, with 6 MB planned for 45 nm process node, with the chips codenamed Shanghai.
    • Changes in address space management:
      • Two 64-bit independent memory controllers, each with its own physical address space; this provides an opportunity to better utilize the available bandwidth in case of random memory accesses occurring in heavily multi-threaded environments. This approach is in contrast to the previous "interleaved" design, where the two 64-bit data channels were bound to a single common address space.
      • Larger Translation Lookaside Buffers; support for 1 GB page entries and a new 128-entry 2 MB page TLB
      • 48-bit memory addressing to allow for 256 TB memory subsystems[21]
      • Memory mirroring (alternatively mapped DIMM addressing),[22] data poisoning support and Enhanced RAS
      • AMD-V Nested Paging for improved MMU virtualization, claimed to decrease world-switch time by 25%.
  • Improvements in system interconnect:
    • HyperTransport retry support
    • Support for HyperTransport 3.0, with HyperTransport Link unganging which creates 8 point-to-point links per socket.
  • Platform-level enhancements with additional functionality:
    • Five p-states allowing for automatic clock rate modulation
    • Increased clock gating
    • Official support for coprocessors via HTX slots and vacant CPU sockets through HyperTransport: Torrenza initiative.
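
Two items from the list above lend themselves to a short illustration: the ABM instructions (LZCNT and POPCNT) and the 48-bit addressing limit. The sketch below emulates the result of each instruction in software rather than invoking it, and the helper names are assumptions made for this example.

```python
def popcnt(value: int, width: int = 64) -> int:
    """Emulate POPCNT: number of set bits in a width-bit operand."""
    mask = (1 << width) - 1
    return bin(value & mask).count("1")


def lzcnt(value: int, width: int = 64) -> int:
    """Emulate LZCNT: number of leading zero bits in a width-bit operand.

    A zero operand yields the operand width.
    """
    mask = (1 << width) - 1
    return width - (value & mask).bit_length()


def addressable_tib(address_bits: int) -> int:
    """Bytes addressable with the given number of address bits, in TiB."""
    return (1 << address_bits) // (1 << 40)


print(popcnt(0b1011))        # three bits are set
print(lzcnt(1, width=32))    # only bit 0 is set in a 32-bit operand
print(addressable_tib(48))   # 2**48 bytes expressed in binary terabytes
```

The last call shows where the 256 TB figure comes from: 2^48 bytes divided by 2^40 bytes per binary terabyte.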

Feature tables

CPUs

APUs

APU features table

Desktop

Phenom models

Agena (65 nm SOI, quad-core)

Toliman (65 nm SOI, tri-core)

Phenom II models

Thuban (45 nm SOI, hexa-core)

Zosma (45 nm SOI, quad-core)

Deneb (45 nm SOI, quad-core)

42 TWKR Limited Edition (45 nm SOI, quad-core)

AMD released a limited edition Deneb-based processor to extreme overclockers and partners. Fewer than 100 were manufactured.

The "42" officially represents four cores running at 2 GHz, but is also a reference to the answer to life, the universe, and everything from The Hitchhiker's Guide to the Galaxy.[25]

Propus (45 nm SOI, quad-core)

Heka (45 nm SOI, tri-core)

Callisto (45 nm SOI, dual-core)

Regor (45 nm SOI, dual-core)

Athlon X2 models

Kuma (65 nm SOI, dual-core)

Regor/Deneb (45 nm SOI, dual-core)

Athlon II Models

Zosma (45 nm SOI, quad-core)

Propus (45 nm SOI, quad-core)

Rana (45 nm SOI, tri-core)

Regor (45 nm SOI, dual-core)

Sargas (45 nm SOI, single-core)

Lynx (32 nm SOI, dual or quad-core)

Sempron models

Sargas (45 nm SOI, single-core)

Sempron X2 models

Regor (45 nm SOI, dual-core)

Lynx (32 nm SOI, dual-core)

Llano "APUs"

Lynx (32 nm SOI, dual or quad-core)

The first generation desktop APUs based on the K10 microarchitecture were released in 2011 (some models do not provide graphics capability, such as the Lynx Athlon II and Sempron X2).

  • Fabrication 32 nm on GlobalFoundries SOI process
  • Socket FM1
  • Die size: 228 mm², with 1.178 billion transistors[30][31]
  • AMD K10 cores with no L3 cache
  • GPU: TeraScale 2
  • All A and E series models feature Redwood-class integrated graphics on die (BeaverCreek for the dual-core variants and WinterPark for the quad-core variants). Sempron and Athlon models exclude integrated graphics.[32]
  • Support for up to four DIMMs of up to DDR3-1866 memory
  • 5 GT/s UMI
  • Integrated PCIe 2.0 controller
  • Select models support Turbo Core technology for faster CPU operation when the thermal specification permits
  • Select models support Hybrid Graphics technology to assist a discrete Radeon HD 6450, 6570, or 6670 graphics card. This is similar to the current Hybrid CrossFireX technology available in the AMD 700 and 800 chipset series
  • ISA extensions: MMX, Enhanced 3DNow!, SSE, SSE2, SSE3, SSE4a, ABM, NX bit, AMD64, Cool'n'Quiet, AMD-V
  • Models: Lynx desktop APUs and CPUs

Mobile

Turion II (Ultra) models

"Caspian" (45nm SOI, dual-core)

Turion II models

"Caspian" (45nm SOI, dual-core)

"Champlain" (45nm SOI, dual-core)

Athlon II models

"Caspian" (45nm SOI, dual-core)

"Champlain" (45nm SOI, dual-core)

Sempron models

"Caspian" (45nm SOI, single-core)

Turion II Neo models

"Geneva" (45nm SOI, dual-core)

Athlon II Neo models

"Geneva" (45nm SOI, dual-core)

"Geneva" (45nm SOI, single-core)

V models

"Geneva" (45nm SOI, single-core)

"Champlain" (45nm SOI, single-core)

Phenom II models

"Champlain" (45nm SOI, quad-core)

"Champlain" (45nm SOI, tri-core)

"Champlain" (45nm SOI, dual-core)

Llano APUs

"Sabine" (32nm SOI, dual or quad-core)

Server

There are two generations of K10-based processors for servers: Opteron 65 nm and 45 nm.

Successor

AMD discontinued further development of K10-based CPUs after Thuban, choosing to focus on Fusion products for mainstream desktops and laptops and Bulldozer-based products for the performance market. However, within the Fusion product family, APUs such as the first-generation A4, A6 and A8-series chips (Llano APUs) continued to use K10-derived CPU cores in conjunction with a Radeon graphics core. K10 and its derivatives were phased out of production with the introduction of Trinity-based APUs in 2012, which replaced the K10 cores in the APU with Bulldozer-derived cores.

Family 11h and 12h derivatives

Turion X2 Ultra Family 11h

The Family 11h microarchitecture was a mixture of K8 and K10 designs with lower power consumption, intended for laptops. It was marketed as Turion X2 Ultra and was later replaced by completely K10-based designs.[1]

Fusion Family 12h

The Family 12h microarchitecture is a derivative of the K10 design:[37][38]

  • Both CPU and GPU were re-used to avoid complexity and risk
  • Distinct Software and Physical integration makes Fusion (APU) microarchitectures different
  • Power-saving improvements including clock gating
  • Improvements to hardware pre-fetcher
  • Redesigned memory controller
  • 1 MB L2 cache per core
  • No L3 cache
  • Two new buses for on-die GPU to access memory (called Onion and Garlic interfaces)
    • AMD Fusion Compute Link (Onion) – interfaces to CPU cache and coherent system memory (see cache coherence)
    • Radeon Memory Bus (Garlic) – dedicated non-coherent interface connected directly to memory

Media discussions

Note: These media discussions are listed in ascending date of publication.

See also

References
from Grokipedia
The AMD Family 10h, also known as the K10 microarchitecture, is a family of 64-bit x86 microprocessors developed by Advanced Micro Devices (AMD) as a successor to the K8 architecture. It features multi-core designs with up to six cores per processor die, an integrated dual-channel memory controller supporting DDR2 and DDR3 SDRAM, and HyperTransport 3.0 technology for high-speed I/O and inter-processor communication. Introduced in September 2007 with the first quad-core Opteron processors, the family marked AMD's push into high-performance computing with innovations such as a shared L3 cache (up to 6 MB), support for SSE4a and AMD64 extensions, and advanced power management including multiple P-states and C-states for efficiency.

Key product lines under Family 10h encompassed server-oriented Opteron processors (such as the 4100 and 6100 series in quad- and hexa-core configurations), desktop-focused Phenom and Phenom II models (offering up to six cores with Socket AM2/AM2+/AM3 compatibility), and value-oriented Athlon II and Sempron variants for mainstream and budget systems. Mobile implementations included Turion II and Athlon II Neo for laptops, using packages such as S1g4. These processors supported features such as ECC memory with Chipkill protection, virtualization via AMD-V (SVM), and scalable multi-socket configurations of up to eight processors via NUMA-aware designs, targeting servers, workstations, and consumer PCs.

The architecture emphasized balanced performance through a 128-bit floating-point unit per core, 12-stage integer pipelines, and enhancements such as instruction-based sampling (IBS) for performance monitoring, though early revisions faced errata related to training and power states that were addressed via updates across revisions from 3.00 (2007) to 3.92 (2012). Family 10h processors were fabricated on 65 nm and 45 nm processes, with thermal design power (TDP) ranging from 25 W for mobile SKUs to 140 W for high-end desktop and server models, competing directly with Intel's Core 2 and early Nehalem architectures. Production continued into the early 2010s, bridging AMD's shift toward the Bulldozer (Family 15h) era.

Overview and Nomenclature

Introduction

The Family 10h, also known as the K10 microarchitecture, is a 64-bit x86 processor architecture developed by AMD as the successor to the K8 (Family 0Fh) microarchitecture. Introduced in 2007 with the launch of the quad-core "Barcelona" server processors, K10 marked AMD's shift toward native multi-core designs on a single die, building on the integrated memory controller and HyperTransport interconnect first pioneered in K8. This architecture powered both server and desktop processors, including the Phenom family, emphasizing scalability for workloads.

The primary goals of K10 were to deliver substantial improvements in integer and floating-point performance (up to 50% over prior generations) while enhancing power efficiency through innovations such as AMD CoolCore Technology and Dual Dynamic Power Management. These advancements enabled better handling of multi-threaded applications in data centers and systems, with configurations scaling to 6 cores in desktop processors and up to 12 cores in server variants using multi-chip modules. By integrating these features, K10 aimed to reduce latency and power consumption compared to the discrete-component designs prevalent at the time.

Key specifications of K10 include an on-die integrated memory controller supporting DDR2 (and DDR3 in later implementations) memory for lower-latency access, and the HyperTransport 2.0 interconnect (up to 2.0 GT/s), with later revisions supporting HyperTransport 3.0 (up to 5.2 GT/s) for inter-processor communication. In historical context, K10 was AMD's strategic response to Intel's Core 2 architecture, which had gained market traction in 2006; AMD positioned K10 to regain competitiveness in both performance benchmarks and energy-efficient multi-core processing for servers and desktops.

Naming Conventions

The AMD Family 10h processors were branded across consumer and server segments using established product lines to denote performance tiers and form factors. High-end desktop processors were marketed under the Phenom brand, mid-range desktop models under Athlon, entry-level desktop variants under Sempron, mobile processors under Turion II, and server-oriented chips under Opteron. These brands were officially trademarked by AMD to distinguish their K10-based offerings from prior generations.

Model numbering within these brands followed a consistent scheme emphasizing core count, generation, and features. Suffixes such as "X4" indicated quad-core configurations, while "X2", "X3", and "X6" denoted dual-core, triple-core, and hexa-core models, respectively; for example, an X4 model targeted mainstream quad-core desktop use. The "II" suffix marked second-generation implementations on the 45 nm process, as seen in the Phenom II, Athlon II, and Turion II lines. Unlocked-multiplier variants, which allow overclocking, were designated Black Edition (often abbreviated "BE") or marked with a star rating symbol (*), such as the Phenom II X4 965 Black Edition.

Family 10h revisions, or steppings, were identified via CPUID values and addressed specific hardware errata through silicon updates. Early steppings included B2 (CPUID 00100F22h) and B3 (00100F23h), which were affected by errata such as #254 (TLB livelock, mitigated via MSRC001_1023=1b) and #309 (concurrent L2/NB response issues, mitigated via MSRC001_1023=1b), as well as #263 (DQS distortion, also mitigated by a workaround). Later C2 steppings (e.g., RB-C2 at 00100F42h, BL-C2 at 00100F52h) fixed #254 and #309 but not #263. These revisions applied across Opteron, Phenom, Athlon, and Sempron processors, with errata details documented for developers to ensure compatibility.

Internally, AMD used codenames for Family 10h designs, many of them astronomical, mapping them to public brands based on target markets. Barcelona served as the codename for the initial quad-core server processor, released as the third-generation Opteron (e.g., 23xx series). Agena was the desktop counterpart, branded as the original Phenom quad-core. Deneb, a 45 nm shrink of Agena, underpinned Phenom II and Athlon II desktop models. Shanghai, another 45 nm evolution, powered updated Opteron processors with enhanced cache, while mobile implementations like Champlain fell under Turion II. These codenames facilitated development tracking before public branding.

Development and Release

Timeline

The AMD 10h family, also known as K10, was first publicly detailed at AMD's Analyst Day event on December 14, 2006, where the company revealed its roadmap for quad-core processors targeting server, desktop, and mobile segments, with initial shipments planned for late 2007. Originally targeted for a mid-2006 tape-out and broader availability by year-end, development was delayed into 2007, primarily because of design bugs encountered during validation and fabrication ramp-up on the 65 nm node.

The family's production debut came with the server-oriented Opteron processors codenamed Barcelona, launched on September 10, 2007, as AMD's first native quad-core x86 offerings for data centers, built on 65 nm silicon. Desktop variants followed closely with the Phenom processors introduced on November 19, 2007, also at 65 nm, though early B2-stepping units suffered from a critical translation lookaside buffer (TLB) erratum that could cause system lock-ups under specific memory access patterns, impacting initial shipments and higher clock-speed bins. This defect prompted a temporary BIOS workaround that reduced performance by up to 10% in affected workloads, and it delayed models such as the 2.4 GHz Phenom until revisions could be implemented.

AMD addressed the TLB issue in hardware with the B3 stepping revision, which began shipping on March 27, 2008, enabling higher-volume production and higher clock speeds without the prior penalties for both the Phenom and Barcelona lines. The family transitioned to the 45 nm node later that year, starting with the "Shanghai" Opteron processors launched on November 13, 2008, which featured a die shrink and a larger shared L3 cache for improved efficiency. Desktop evolution continued with the Phenom II series, released in December 2008 on 45 nm, featuring refined cores for better power efficiency and compatibility with DDR2 and DDR3 memory.

Launch Demonstrations

The K10 microarchitecture (Family 10h) made its initial public appearance with the server-oriented Barcelona processors. On September 10, 2007, AMD unveiled the Quad-Core Opteron at a premiere event, demonstrating its native quad-core design and integrated memory controller for improved performance in datacenter workloads. The demonstration highlighted an increase in performance of up to 50% compared to prior dual-core Opterons, emphasizing scalability for multi-socket systems.

Following the server debut, the desktop variant arrived two months later. AMD launched the Phenom X4 processors on November 19, 2007, as part of the Spider platform, with the Phenom X4 9600 showcased running at 2.3 GHz. This reveal included live benchmarks illustrating quad-core multitasking capabilities, such as simultaneous video encoding and gaming, to underscore the platform's enthusiast appeal alongside the 790FX chipset supporting multiple GPUs.

Server-focused demonstrations continued at the SC07 Supercomputing Conference in Reno, Nevada, from November 10-16, 2007. AMD presented Barcelona Opteron systems exhibiting quad-core scaling in high-performance computing tasks, including parallel simulations that showed up to 1.8 times the throughput of dual-core predecessors in memory-intensive applications. These exhibits targeted enterprise users, highlighting energy efficiency and Direct Connect Architecture for reduced latency in clustered environments.

The K10 lineup expanded with previews of the 45 nm Phenom II at CES 2008 on January 8, 2008. AMD demonstrated early engineering samples, including live overclocking sessions where a Phenom II X4 reached beyond 3 GHz on air cooling, showcasing improved thermal headroom and unlocked multipliers for enthusiasts. These sessions emphasized the die shrink's potential for higher clocks without proportional power increases, positioning Phenom II as a competitive refresh.

Early media access further shaped public perception through previews from two outlets in November 2007. One hands-on with the Phenom X4 9600 noted approximately 10-15% IPC gains over its K8-based predecessor in integer-heavy tasks like compression, though floating-point workloads showed more modest uplifts of 5-10%. The other echoed this, crediting the reworked core for elevated instructions per clock cycle, with benchmarks revealing 8-12% better single-threaded efficiency versus K8 equivalents at matched frequencies. These reviews focused on conceptual advances like shared L3 cache benefits, using representative synthetic tests to illustrate real-world multitasking improvements without exhaustive metrics.

Microarchitecture

Core and Execution Units

The AMD 10h microarchitecture, also known as K10, features a 12-stage integer pipeline designed for balanced performance in both single-threaded and multi-threaded workloads. This pipeline spans from instruction fetch to retirement and sustains up to three macro-operations per clock cycle through a three-wide superscalar design, with three parallel execution pipes each containing an arithmetic-logic unit (ALU) and an address-generation unit (AGU). Multiplication operations are restricted to one pipe with a three-cycle latency, while simpler ALU operations can issue across all pipes for higher throughput. The structure includes dedicated stages for fetch (32 bytes per cycle), decode, dispatch, schedule, execution, and retirement, contributing to a branch misprediction penalty of 12-13 cycles.

The floating-point unit (FPU) in the 10h core supports 128-bit SSE and SSE4a instructions, marking an upgrade from the prior K8 architecture's 64-bit paths. All FP execution units operate at 128-bit width, allowing single-cycle processing of full XMM register operations without splitting into multiple micro-operations, which improves throughput for vectorized workloads. The unit comprises three specialized pipelines: one for addition/subtraction (four-cycle latency, fully pipelined), one for multiplication/division (four-cycle latency for multiply, 11 cycles for division), and a miscellaneous unit for conversions and stores (two-cycle latency). This configuration enables simultaneous scalar and vector FP execution, with double-precision multiplies starting every other cycle at five-cycle latency.

Branch prediction in the 10h core employs a two-level adaptive mechanism with global history tracking, akin to early precursors of TAGE predictors, using an 8- or 12-bit global history register to index a 16K-entry pattern history table for improved accuracy on correlated branches. A dedicated branch target buffer (BTB) of 2048 entries supports direct branch targets, with an additional 512-entry buffer for indirect jumps, limiting throughput to one taken branch every two cycles. A loop predictor enhances performance for repetitive small loops (up to 64 iterations), detecting patterns with repeat counts of 9-13 and enabling two-cycle execution for loops under six macro-operations without cache boundary crossings. Overall accuracy benefits from meta-prediction to select between global and loop modes, though the long pipeline amplifies misprediction costs.

Multi-core integration in the 10h design connects up to six cores via an on-die crossbar interconnect, facilitating low-latency communication and shared access to a victim L3 cache of up to 6 MB (48-way associative with 64-byte lines in higher-end models). Each core retains private L1 and L2 caches, but the shared L3 acts as a unified victim cache to reduce inter-core data movement latency, with the crossbar enabling concurrent accesses from multiple cores to the memory channels. This setup supports quad-core configurations in server processors with 2 MB L3 and scales to six cores in desktop variants, prioritizing bandwidth over ultra-low latency compared to ring-based alternatives. The base 10h design does not include simultaneous multithreading (SMT), relying instead on its wide issue and multi-core scaling for parallelism; the succeeding Family 15h used clustered multithreading rather than SMT.
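
The victim-cache relationship between the private L2 and the shared L3 can be sketched with a toy model. This is a deliberately simplified illustration, not AMD's implementation: it assumes a single core, plain LRU replacement at both levels, and strict exclusivity, and every name and capacity in it is hypothetical.

```python
from collections import OrderedDict


class VictimHierarchy:
    """Toy model: one private L2 backed by an exclusive victim L3 (both LRU)."""

    def __init__(self, l2_lines: int, l3_lines: int) -> None:
        self.l2_lines = l2_lines
        self.l3_lines = l3_lines
        self.l2 = OrderedDict()  # oldest entry first
        self.l3 = OrderedDict()

    def access(self, line: int) -> str:
        """Touch a cache line and report where it was found."""
        if line in self.l2:
            self.l2.move_to_end(line)      # refresh LRU position
            return "L2 hit"
        if line in self.l3:
            del self.l3[line]              # promoted lines leave the L3
            source = "L3 hit"
        else:
            source = "memory"
        self._fill_l2(line)
        return source

    def _fill_l2(self, line: int) -> None:
        """Insert into L2; the displaced L2 line becomes an L3 victim."""
        if len(self.l2) >= self.l2_lines:
            victim, _ = self.l2.popitem(last=False)
            if len(self.l3) >= self.l3_lines:
                self.l3.popitem(last=False)  # oldest victim falls out to memory
            self.l3[victim] = True
        self.l2[line] = True


hierarchy = VictimHierarchy(l2_lines=2, l3_lines=4)
for line in (1, 2, 3, 1, 3, 2):
    print(line, hierarchy.access(line))
```

In this model the L3 is filled only by L2 evictions, so a line is never resident in both levels at once; the sharing-aware replacement policy mentioned earlier on this page is not modelled.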

Cache and Memory Subsystem

The AMD 10h features a three-level cache hierarchy designed to balance latency and capacity for multi-core workloads. Each core has dedicated L1 caches of 64 KB for instructions and 64 KB for data, both 2-way set-associative with 64-byte lines. The L2 cache is 512 KB per core, 16-way set-associative and exclusive, operating at full core clock speed to minimize latency for frequently accessed data. A shared L3 cache, ranging from 2 MB in original implementations to 6 MB in later variants, is a non-inclusive victim cache that is 32-way set-associative in 2 MB versions and 48-way in 6 MB versions, providing a unified pool for inter-core data sharing and reducing main memory accesses.

The integrated memory controller supports dual-channel DDR2 at speeds up to 1066 MT/s in initial 10h implementations, delivering a peak theoretical bandwidth of about 17 GB/s. Phenom II revisions upgraded to dual-channel DDR3 support at up to 1333 MT/s, increasing bandwidth to 21.3 GB/s while maintaining compatibility with unbuffered DIMMs up to 16 GB total capacity. This on-die controller reduces latency compared to external northbridge designs, with configurable interleaving modes (ganged or unganged) to optimize access patterns. The HyperTransport 3.0 interconnect, rated at 5.2 GT/s, supplements the memory subsystem by providing up to 10.4 GB/s per direction (20.8 GB/s aggregate bidirectional bandwidth per link) for I/O and multi-socket communication, ensuring scalable access in server configurations.

To enhance bandwidth efficiency, the L3 cache holds only blocks evicted from the L2 caches, capturing reused data without duplicating core-private contents. Hardware prefetchers in the core and memory controller further optimize performance by anticipating data needs: the L1/L2 prefetchers detect stride patterns up to four cache lines ahead, while the DRAM prefetcher can issue up to three requests per access and is configurable via model-specific registers for workload tuning. These features collectively improve hit rates in bandwidth-constrained scenarios, though they add minor overhead in random-access patterns.
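The bandwidth figures in this section follow from simple arithmetic on transfer rate and bus width. The sketch below reproduces them under two assumptions not stated in the paragraph itself: each DRAM channel is 64 bits (8 bytes) wide, as is standard for DDR2/DDR3 DIMMs, and the HyperTransport link is 16 bits wide per direction. The function names are illustrative only.

```python
def dram_peak_gbs(mt_per_s: float, channels: int = 2, bus_bytes: int = 8) -> float:
    """Peak DRAM bandwidth in GB/s (decimal): transfers/s * bytes per transfer * channels."""
    return mt_per_s * 1e6 * bus_bytes * channels / 1e9


def ht_peak_gbs(gt_per_s: float, link_bits: int = 16) -> float:
    """Peak HyperTransport bandwidth per direction in GB/s for a link of the given width."""
    return gt_per_s * link_bits / 8


print(f"DDR2-1066 dual-channel: {dram_peak_gbs(1066):.1f} GB/s")
print(f"DDR3-1333 dual-channel: {dram_peak_gbs(1333):.1f} GB/s")
print(f"HT 3.0 at 5.2 GT/s:     {ht_peak_gbs(5.2):.1f} GB/s per direction")
```

These are theoretical peaks; sustained bandwidth is lower once refresh, bank conflicts, and protocol overhead are taken into account.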

Integrated Components

The AMD Family 10h processors integrate HyperTransport 3.0 as the primary interconnect for I/O and inter-processor communication, featuring 16-bit links in each direction operating at a maximum clock speed of 2.6 GHz (5.2 GT/s signaling) to deliver an aggregate bandwidth of 20.8 GB/s in full-duplex mode (10.4 GB/s per direction). This configuration supports scalable multi-socket systems by allowing coherent linking between processors and external devices, such as chipsets, while maintaining low latency for data transfers.

Power management in AMD 10h is handled through Cool'n'Quiet 2.0 technology, an implementation of dynamic frequency and voltage scaling that adjusts core operating parameters in response to workload demands, thereby reducing power consumption during idle or light-load scenarios. This system enables desktop processors to operate within thermal design power (TDP) limits of up to 125 W, balancing performance with energy efficiency without requiring external intervention.

Virtualization support is provided via the AMD-V extensions, which include nested paging to accelerate address translation in virtual environments by using a secondary page table hierarchy managed by the hypervisor. Additionally, Secure Virtual Machine (SVM) functionality enhances security for virtualized workloads through features like interrupt virtualization and controlled access to host resources, enabling robust isolation for multiple guest operating systems.

While the northbridge functions for certain I/O operations remain off-die via the HyperTransport links, the architecture ensures on-chip coherency for multi-core operations, allowing efficient data consistency across cores without relying on external buses for intra-socket communication.
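Nested paging moves the translation work into hardware, but it lengthens the page walk on a TLB miss, because every guest page-table pointer is itself a guest-physical address that must be translated through the host tables. The sketch below counts memory references for such a two-dimensional walk using the standard textbook accounting; it is not taken from AMD documentation for this family, and the four-level default for both tables is an assumption matching long-mode paging.

```python
def nested_walk_refs(guest_levels: int = 4, host_levels: int = 4) -> int:
    """Memory references for one two-dimensional (nested) page walk.

    Each of the guest_levels guest table pointers, plus the final guest-physical
    address, needs a full host walk (host_levels references each). On top of
    that, the guest_levels guest page-table entries themselves must be read.
    """
    host_walks = (guest_levels + 1) * host_levels
    guest_reads = guest_levels
    return host_walks + guest_reads


print("native 4-level walk:", nested_walk_refs(4, 0), "references")
print("nested 4x4 walk:    ", nested_walk_refs(4, 4), "references")
```

The sixfold increase in worst-case references is why TLB reach and page-walk caching matter more under nested paging than on bare metal.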

Manufacturing Technology

Process Nodes

The Family 10h processors were initially manufactured using a 65 nm silicon-on-insulator (SOI) process at AMD's Fab 36 facility in Dresden, Germany. This node supported the launch of quad-core chips codenamed Agena for desktops and Barcelona for servers, with each die containing approximately 463 million transistors and measuring 285 mm². The 65 nm SOI technology provided a balance of performance and power efficiency for the era, leveraging AMD's established expertise in SOI to reduce parasitic capacitance and improve speed compared to bulk alternatives.

In late 2008 and 2009, AMD shifted production to a 45 nm SOI process for subsequent revisions, including the Deneb desktop and Shanghai server variants. This transition, still primarily at Fab 36, enabled significant density improvements, resulting in quad-core dies with about 758 million transistors and a smaller 258 mm² footprint, despite expanding the shared L3 cache from 2 MB to 6 MB. The finer node reduced leakage currents to less than one-third of the 65 nm levels, enhancing overall energy efficiency and allowing higher clock speeds within similar thermal envelopes.

Following AMD's spin-off of its manufacturing operations in 2009, the facility became GlobalFoundries' Fab 1 (formerly Fab 36), which handled ongoing 45 nm production for Family 10h and derivatives, contributing to lower per-unit costs over time. While the core Family 10h lineup remained on 65 nm and 45 nm nodes, later derivatives such as the Family 12h Llano APU adopted a 32 nm SOI process, marking an extension of the architecture on GlobalFoundries' more advanced lines.
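The density gain from the 65 nm to 45 nm transition can be checked directly from the transistor counts and die areas quoted above. This is a rough sketch only: the counts are approximate, and the 45 nm die carries proportionally more cache, which packs more densely than logic.

```python
def density_mtr_per_mm2(transistors_millions: float, area_mm2: float) -> float:
    """Average transistor density in millions of transistors per square millimetre."""
    return transistors_millions / area_mm2


d65 = density_mtr_per_mm2(463, 285)  # 65 nm quad-core die (figures from the text)
d45 = density_mtr_per_mm2(758, 258)  # 45 nm quad-core die (figures from the text)

print(f"65 nm: {d65:.2f} MTr/mm^2")
print(f"45 nm: {d45:.2f} MTr/mm^2")
print(f"density ratio: {d45 / d65:.2f}x")
```

An ideal shrink from 65 nm to 45 nm would give roughly (65/45)² ≈ 2.1x, so the observed ratio of about 1.8x is in the expected range for a real design.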

Socket Interfaces

The AMD Family 10h processors used several socket interfaces tailored to desktop, mobile, and server applications, each with specific pin configurations and electrical specifications to support the integrated memory controller and HyperTransport links.

Desktop implementations primarily employed the AM2, AM2+, and AM3 sockets, all based on a lidded micro-PGA (mPGA) ZIF design with a 1.27 mm pitch in a 31x31 array configuration (940 pins for AM2 and AM2+, 941 for AM3). These sockets supported unbuffered DDR2 or DDR3 memory, with core voltage ranging from 1.1 V to 1.55 V managed via VID signaling for dynamic power scaling across P-states. The AM2+ variant, also denoted AM2r2, introduced enhanced electrical tolerances for higher clock speeds while maintaining mechanical compatibility with prior AM2 infrastructure.

Mobile variants of Family 10h processors, such as the Turion II series, were designed for the S1g3 and S1g4 sockets, 638-pin lidded mPGA ZIF interfaces with SO-DIMM memory support and core voltages between 0.9 V and 1.3 V to prioritize power efficiency in portable environments. These sockets evolved from the earlier S1g1 and S1g2 designs, incorporating refined pin mappings for thermal sensing via THERMDA/THERMDC pins and up to two DDR channels operating at lower voltages like 1.8 V for DDR2. Low-power configurations adhered to similar pinouts but emphasized reduced drive strengths and timings to meet TDP constraints under 35 W.

Server-oriented Family 10h processors, including Opteron models, adopted LGA-based sockets for scalability in multi-socket systems. Socket F (1207-pin LGA at 1.10 mm pitch in a 35x35 array) supported single- and dual-socket setups with registered DDR2 RDIMMs, operating at core voltages of 1.1 V to 1.35 V with compatibility across the Fr1 through Fr6 package revisions. For higher-density configurations, Socket G34 provided a 1974-pin LGA interface (1.00 mm pitch in a 57x40 array), enabling up to four-socket platforms with DDR3 RDIMM/UDIMM support and independent HyperTransport 3.0 links per node, including dual-node capabilities in Revision D and later steppings. Socket C32, a 1207-pin LGA variant, targeted single- and dual-socket systems with unbuffered or registered DDR3 options.

Compatibility across sockets emphasized backward support. AM3 desktop processors, whose memory controllers handle both DDR2 and DDR3, could be installed in AM2+ motherboards (typically after a BIOS update) and run with DDR2, whereas AM2+ processors do not fit AM3 sockets. Mobile parts required a motherboard built for the matching S1 socket generation and memory type. Server sockets like F and C32 maintained cross-revision compatibility (e.g., Fr2 with Fr5 packages), while G34 focused on dedicated multi-socket scaling without direct backward ties to prior F-based systems. AM3 processors also remained usable in later AM3+ motherboards, though BIOS validation was essential for mixed-stepping environments.
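
| Socket Type | Pin Count | Package Type | Primary Use | Core Voltage Range |
| --- | --- | --- | --- | --- |
| AM2/AM2+/AM3 | 940 (941 for AM3) | mPGA ZIF | Desktop | 1.1–1.55 V |
| S1g3/S1g4 | 638 | mPGA ZIF | Mobile | 0.9–1.3 V |
| F/C32 | 1207 | LGA | Server (1–2 socket) | 1.1–1.35 V |
| G34 | 1974 | LGA | Server (2–4 socket) | 1.1–1.35 V |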

Consumer Processors

Desktop Models

The AMD Family 10h desktop processors included the premium Phenom and Phenom II lines, alongside more affordable Athlon II and Sempron offerings, all utilizing the K10 microarchitecture and targeting single-socket consumer systems on AM2+ or AM3 sockets. These models emphasized multi-core performance for tasks like gaming and content creation, with shared L3 cache in higher-end variants to improve data access efficiency. Thermal design power (TDP) ranged from 45 W to 140 W across the lineup, balancing performance and power efficiency for desktop environments. Launch prices spanned $50 to $300, positioning them competitively against Intel's Core 2 series.

Phenom Models

The initial Phenom desktop processors, introduced in late 2007, featured the Agena quad-core variant fabricated on a 65 nm node, with clock speeds from 1.8 GHz to 2.6 GHz, 512 KB L2 cache per core, and a shared 2 MB L3 cache. Toliman-based tri-core models, released in 2008 as a response to yields, operated at similar speeds but with one of the four cores disabled. These processors supported DDR2 on AM2+ sockets and had TDPs of 65 W to 140 W, with launch prices around $235 for top models like the Phenom X4 9950. Representative specifications are summarized below:
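
| Model | Cores | Base Clock (GHz) | L2 Cache (per core) | L3 Cache | TDP (W) | Launch Price (USD) |
| --- | --- | --- | --- | --- | --- | --- |
| Phenom X4 9150e (Agena) | 4 | 1.8 | 512 KB | 2 MB | 65 | ~$200 |
| Phenom X4 9950 (Agena) | 4 | 2.6 | 512 KB | 2 MB | 140 | $235 |
| Phenom X3 8750 (Toliman) | 3 | 2.4 | 512 KB | 2 MB | 95 | ~$150 |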

Phenom II Models

Launched in 2009, the Phenom II series shifted to a 45 nm process, improving efficiency and enabling higher clocks up to 3.7 GHz. The Deneb quad-core variant included a 6 MB shared L3 cache, while Thuban offered hexa-core configurations at 2.5–3.2 GHz for enhanced multitasking. Tri-core Heka and dual-core Callisto/Regor models provided cost-effective options by disabling cores, all on AM3 sockets supporting DDR3. TDPs ranged from 80 to 140 W, with Black Edition unlocked variants popular for overclocking. Launch prices started at $199 for quad-cores like the X4 965 and reached $295 for the X6 1090T. Key examples include:
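
| Model | Cores | Base Clock (GHz) | L2 Cache (per core) | L3 Cache | TDP (W) | Launch Price (USD) |
| --- | --- | --- | --- | --- | --- | --- |
| Phenom II X4 920 (Deneb) | 4 | 2.8 | 512 KB | 6 MB | 125 | $235 |
| Phenom II X6 1090T (Thuban) | 6 | 3.2 | 512 KB | 6 MB | 125 | $295 |
| Phenom II X3 740 (Heka) | 3 | 2.8 | 512 KB | 6 MB | 95 | $150 |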

Athlon II Models

The Athlon II desktop lineup, debuting in 2009 on a 45 nm process, targeted budget users with no L3 cache to reduce costs, focusing on AM3 sockets and DDR3 support. Quad-core Propus and Zosma variants ran at 2.5–3.2 GHz, dual-core Regor at up to 3.2 GHz, and tri-core Rana at 2.1–2.5 GHz, all with 512 KB to 1 MB L2 cache per core. TDPs were efficient at 65–95 W, making them suitable for value-oriented builds. Launch prices began at $99 for the Athlon II X4 620, appealing to entry-level quad-core buyers. Selected models:
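
| Model | Cores | Base Clock (GHz) | L2 Cache (per core) | L3 Cache | TDP (W) | Launch Price (USD) |
| --- | --- | --- | --- | --- | --- | --- |
| Athlon II X2 250u (Regor) | 2 | 1.6 | 512 KB | None | 25 | ~$50 |
| Athlon II X3 405e (Rana) | 3 | 2.6 | 512 KB | None | 65 | $76 |
| Athlon II X4 620 (Propus) | 4 | 2.6 | 512 KB | None | 95 | $100 |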

Sempron Models

Entry-level Sempron desktop processors in Family 10h, also on 45 nm since 2009, used Sargas for single-core models at 2.2–2.6 GHz and Regor for dual-core models up to 2.8 GHz, with 512 KB L2 cache and no L3. Designed for basic computing on AM3 sockets, they featured low TDPs of 45–65 W for energy-efficient systems. Launch prices hovered around $50, as with the Sempron 140. Examples:
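
| Model | Cores | Base Clock (GHz) | L2 Cache (per core) | L3 Cache | TDP (W) | Launch Price (USD) |
| --- | --- | --- | --- | --- | --- | --- |
| Sempron 130 (Sargas) | 1 | 2.6 | 512 KB | None | 45 | $50 |
| Sempron X2 210 (Regor) | 2 | 2.0 | 512 KB | None | 65 | $53 |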

Mobile Models

The AMD 10h mobile processors were designed for notebook applications, emphasizing power efficiency and thermal management to support extended battery life while delivering multi-core performance based on the K10 microarchitecture. These processors targeted mainstream and budget notebooks, using socket interfaces like S1g3 and S1g4 to enable compact, low-profile designs. Unlike desktop variants, mobile 10h models prioritized reduced thermal design power (TDP) ratings, typically ranging from 15 to 45 W, with integrated features such as HyperTransport 3.0 interconnects and DDR2/DDR3 memory controllers to balance performance and portability.

The Turion II Ultra series represented AMD's premium dual-core mobile offering within the 10h family, built on the Caspian core at 45 nm. These processors operated at clock speeds between 2.0 GHz and 2.5 GHz, with a standard TDP of 35 W, featuring 2 MB of shared L2 cache and support for SSE4a instructions to enhance multimedia tasks in laptops. For instance, the Turion II Ultra M600 ran at 2.4 GHz, while the M620 model reached 2.5 GHz, both using Socket S1g3 for compatibility with mid-range mobile platforms.

Phenom II Mobile processors extended the quad-core capabilities of the 10h lineup to mobile devices via the Champlain core, also on a 45 nm process, targeting performance-oriented notebooks with clock speeds from 1.8 GHz to 2.8 GHz and TDPs of 35 W to 45 W. These models included dual-, triple-, and quad-core configurations, each with 512 KB L2 cache per core but no shared L3 cache to optimize power draw, and they supported up to 8 GB of DDR3-1066 memory. Representative examples include the quad-core N930 at 2.0 GHz (35 W TDP) for balanced workloads and the dual-core N620 at 2.8 GHz (35 W TDP) for lighter tasks, all compatible with Socket S1g4.

Athlon II Mobile processors provided cost-effective dual-core options for entry-level laptops, drawing from Caspian and Champlain cores at 45 nm, with some variants targeting ultra-low-power single- and dual-core designs at 15 W to 25 W TDP. Clock speeds ranged from 1.6 GHz to 2.2 GHz, featuring 1 MB of L2 cache and HyperTransport 3.0 at 1.6 GHz for efficient data transfer in budget systems. The M300, a Caspian-based dual-core at 2.0 GHz (25 W TDP), exemplified mainstream use, while the lower-power P320 (Champlain) at 2.1 GHz (25 W TDP) suited thin-and-light notebooks, both using Socket S1g3 or S1g4.

Sempron and V-Series mobile processors served as single-core entry points in the 10h family, using Caspian and Champlain cores at 45 nm for basic tasks, with clock speeds of 1.0 GHz to 2.3 GHz and a consistent 15 W to 25 W TDP to minimize power draw. These models included 512 KB L2 cache and supported DDR2-800 memory, focusing on affordability for netbooks and low-end laptops. Examples include the V-Series V120 at 2.0 GHz (25 W TDP) and V140 (Champlain) at 2.3 GHz (25 W TDP), both on Socket S1g4, providing essential 64-bit processing without advanced multi-threading.

| Processor Line | Core Architecture | Core Count | Clock Speed Range | TDP Range | Socket | L2 Cache |
| --- | --- | --- | --- | --- | --- | --- |
| Turion II Ultra | Caspian | Dual | 2.0–2.5 GHz | 35 W | S1g3 | 2 MB shared |
| Phenom II Mobile | Champlain | Dual/Triple/Quad | 1.8–2.8 GHz | 35–45 W | S1g4 | 512 KB per core |
| Athlon II Mobile | Caspian/Champlain | Single/Dual | 1.6–2.2 GHz | 15–25 W | S1g3/S1g4 | 512 KB–1 MB |
| Sempron/V-Series | Caspian/Champlain | Single | 1.0–2.3 GHz | 15–25 W | S1g4 | 512 KB |

Server Processors

Opteron Quad-Core Models

The Quad-Core Opteron processors codenamed Barcelona marked AMD's entry into native quad-core server processing, launching on September 10, 2007. Built on a 65 nm node, these processors featured four cores with clock speeds ranging from 1.8 GHz to 2.5 GHz, 2 MB of shared L3 cache per die, and a thermal design power (TDP) of 95 W. Designed for Socket F (1207-pin), they supported dual- and multi-processor configurations up to eight sockets, enabling scalability for enterprise and high-performance computing (HPC) environments.

Key architectural features of Barcelona included AMD-Vi for I/O virtualization, which allowed direct device assignment to virtual machines, and HyperTransport 2.0 links operating at up to 2.0 GT/s for coherent multi-socket operation, reducing latency in shared-memory systems. In HPC workloads, Barcelona delivered notable SPECint performance improvements over prior dual-core Opterons, with up to 70% gains in integer-intensive tasks like scientific simulations, establishing a foundation for parallel processing in servers.

The Shanghai variant, introduced on November 13, 2008, refined the design at a 45 nm node, boosting clock speeds to 2.5–3.1 GHz while expanding the shared L3 cache to 6 MB and retaining the 95 W TDP envelope. It maintained Socket F compatibility and DDR2 memory support, now extending to 800 MT/s speeds for enhanced bandwidth in registered configurations. Shanghai achieved approximately 30% higher instructions per clock (IPC) than Barcelona through optimizations in branch prediction and cache efficiency, yielding better per-watt performance in server applications.

Shanghai inherited Barcelona's AMD-Vi and HyperTransport features, with the latter enabling low-latency coherency across up to eight sockets via probe filtering in HT Assist mode. Benchmarks in HPC scenarios showed Shanghai providing SPECint uplifts of 20–30% over Barcelona at equivalent clocks, particularly in integer-heavy workloads like database queries and modeling, while reducing idle power by up to 20%. These advancements positioned Shanghai as a competitive option for energy-efficient multi-socket servers before the shift to higher core counts.

Opteron Multi-Core Models

The AMD Opteron multi-core models in the Family 10h architecture extended the processor lineup beyond quad-core designs by introducing hexa-core dies and multi-chip module (MCM) configurations to achieve higher core counts for server workloads. These models targeted demanding enterprise environments, emphasizing scalability in multi-socket systems while maintaining compatibility with existing infrastructure where possible.

The Istanbul processor, introduced in 2009, served as the foundational hexa-core implementation on a 45 nm silicon-on-insulator (SOI) process. Each Istanbul die featured six cores operating at clock speeds ranging from 2.0 GHz to 2.8 GHz, with a shared 6 MB L3 cache and support for HyperTransport 3.0 links at up to 6.4 GT/s. Designed for Socket F, it included HT Assist technology to optimize cache coherency in multi-processor setups, enabling configurations from two to eight sockets with thermal design power (TDP) options of 55 W to 115 W. Istanbul processors delivered up to 40% performance uplift over prior quad-core models in server benchmarks, focusing on throughput in virtualized and database applications.

Building on the Istanbul die, the Magny-Cours series, launched in 2010, pioneered dual-die MCM packaging to scale core counts to eight or twelve per socket, addressing the need for greater parallelism in servers. The 12-core variant combined two 6-core Istanbul dies, while the 8-core version paired two quad-core dies, resulting in a total of 12 MB L3 cache (6 MB per die) and clock speeds from 1.7 GHz to 2.5 GHz. These processors used the new Socket G34 interface, supported DDR3-1333 memory across four channels per socket, and maintained a 115 W TDP for standard models, with lower-power HE variants at 85 W. Magny-Cours improved by up to 50% over Socket F predecessors, facilitating better handling of large datasets in enterprise servers.

Magny-Cours enhanced system scalability, supporting up to four sockets in rack and blade servers through its four HyperTransport 3.0 links per die, which enabled coherent interconnects across 48 cores in a single node. This design was particularly suited for dense blade environments, such as those from HP, where it powered 2P and 4P configurations for HPC clusters. Performance in large-scale computing benefited from NUMA-aware optimizations, including directory-based coherency via HT Assist, which reduced remote memory access latencies by caching snoop filters on-chip and minimized inter-die traffic in multi-socket topologies. Software guidelines for Family 10h recommended affinity scheduling and I/O pinning to leverage these NUMA features, yielding up to 30% efficiency gains in multi-threaded workloads on multi-socket systems.
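Affinity scheduling on such a system starts from knowing which cores share a die, since each die has its own memory controller and therefore forms one NUMA node. The sketch below maps a flat core index onto a four-socket, 48-core layout. The linear numbering (cores numbered die by die, socket by socket) is a hypothetical assumption for illustration; real firmware and operating systems may enumerate cores differently, so production code should query the OS topology instead.

```python
from typing import NamedTuple

CORES_PER_DIE = 6      # six-core dies, as described in the text
DIES_PER_SOCKET = 2    # dual-die MCM package


class CoreLocation(NamedTuple):
    socket: int
    die: int   # die within the socket; each die is one NUMA node
    core: int  # core within the die


def locate(core_id: int, sockets: int = 4) -> CoreLocation:
    """Map a flat core index to (socket, die, core) under linear enumeration."""
    total = sockets * DIES_PER_SOCKET * CORES_PER_DIE
    if not 0 <= core_id < total:
        raise ValueError(f"core_id must be in [0, {total})")
    node, core = divmod(core_id, CORES_PER_DIE)
    socket, die = divmod(node, DIES_PER_SOCKET)
    return CoreLocation(socket, die, core)


def same_numa_node(a: int, b: int) -> bool:
    """True when two cores sit on the same die and share local memory."""
    return locate(a)[:2] == locate(b)[:2]


print(locate(47))
print(same_numa_node(0, 5), same_numa_node(5, 6))
```

A scheduler following the affinity guidance would keep cooperating threads on cores for which `same_numa_node` is true and allocate their memory from that node.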

Derivatives

Family 11h

The AMD Family 11h processors, codenamed Griffin, constitute a mobile-optimized derivative of the K10 microarchitecture, blending select elements from the prior K8 design to enhance power efficiency for notebook applications. Introduced in June 2008, this family was exclusively designed for low-power dual-core mobile use, fabricated on a 65 nm silicon-on-insulator (SOI) process without an L3 cache. Each core features 64 KB L1 instruction and data caches, paired with 512 KB or 1 MB of dedicated L2 cache per core (16-way associative), along with the advanced branch prediction inherited from K10.

Key features include an integrated dual-channel DDR2 memory controller capable of speeds up to DDR2-800, enabling up to 12.8 GB/s of bandwidth in interleaved mode, and a single HyperTransport 3.0 interconnect running at 1.6 GHz for I/O connectivity. Virtualization is supported via AMD-V (SVM Revision 1), with nested paging available but disabled by default, alongside robust power management through up to eight P-states for fine-grained frequency and voltage scaling. Thermal design power (TDP) ratings range from 25 to 35 W, prioritizing battery life over peak performance in thin-and-light laptops. No single-core variants were produced in this family, distinguishing it from mainstream K10 offerings.

Notable models encompass the Turion X2 Ultra series, such as the ZM-85 operating at 2.3 GHz with a 35 W TDP and 2 MB total L2 cache, and the lower-clocked ZM-80 at 2.1 GHz sharing the same power envelope. Athlon X2 variants like the QL-65, clocked at 2.1 GHz with 35 W TDP, targeted value-oriented notebooks. These processors powered AMD's Puma platform, paired with the RS780M/SB700 chipset combination to deliver balanced performance against Intel's low-end Core 2 Duo mobile lineup, emphasizing integrated graphics and multimedia capabilities for everyday computing tasks.

Family 12h

The AMD Family 12h processors, codenamed Llano, extend the Family 10h lineage through refined K10.5 cores that incorporate instructions-per-clock (IPC) enhancements such as a larger reorder buffer, improved floating-point scheduling, and doubled L2 data translation lookaside buffer capacity compared to prior K10 implementations. These dual- or quad-core x86-64 designs are fabricated on a 32 nm silicon-on-insulator (SOI) process node with a die size of 227 mm² and approximately 1.45 billion transistors. Each core features 64 KB of L1 instruction cache and 64 KB of L1 data cache (both 2-way associative), paired with up to 1 MB of dedicated L2 cache per core (16-way associative), enabling a total of 4 MB L2 for quad-core variants without a shared L3 cache.

Central to the Family 12h's innovation is the integration of a Radeon HD 6000 series graphics processing unit (GPU) directly on the die, using a VLIW5 architecture with up to five SIMD units and 400 shader processors to deliver up to 480 GFLOPS of peak throughput. The GPU supports DirectX 11 features including tessellation and unified shaders, alongside OpenCL extensions for compute tasks, and incorporates the third-generation Unified Video Decoder (UVD 3.0) for hardware-accelerated H.264 and VC-1 decoding. Power efficiency is enhanced through core-specific power gating (CC6 state), dynamic GPU clock gating, and AMD Turbo Core technology, which reallocates thermal design power (TDP) budgets to boost single-threaded performance by up to 35% in low-threaded workloads.

The Llano APUs encompass A4, A6, and A8 series models with clock speeds ranging from 1.5 GHz to 3.0 GHz and TDP values between 35 W and 100 W, targeting mainstream desktop and mobile applications. Desktop examples include the quad-core A8-3850 (2.9 GHz base, up to 3.0 GHz Turbo Core, 100 W TDP, Radeon HD 6550D with 400 shaders) and the dual-core A4-3400 (2.7 GHz, 65 W TDP, Radeon HD 6410D with 160 shaders). The Sabine platform extends this to mobile devices with FS1 socket variants, such as the quad-core A8-3500M (1.5 GHz base, up to 2.4 GHz Turbo Core, 35 W TDP, Radeon HD 6620G). Desktop parts use the FM1 socket, with motherboards supporting DDR3 memory up to 1866 MT/s and multi-display outputs including HDMI, DisplayPort, and DVI.

Launched on June 14, 2011, the Family 12h bridged AMD's K10-era processors to its subsequent APU generations by prioritizing integration with on-chip graphics, enabling discrete-level visual performance in power-constrained form factors. Unlike earlier K10 desktop parts, Llano moved to the new FM1 socket rather than AM3.
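The 480 GFLOPS peak quoted for the 400-shader GPU can be reproduced from shader count and clock. The sketch below assumes a 600 MHz GPU clock and two single-precision operations per shader per cycle (one multiply-add); neither figure is stated in the text above, so both should be read as assumptions chosen to illustrate the calculation.

```python
def peak_gflops(shaders: int, clock_mhz: float, flops_per_cycle: int = 2) -> float:
    """Peak single-precision throughput: shaders * FLOPs per cycle * clock."""
    return shaders * flops_per_cycle * clock_mhz / 1000.0


# 400 shaders from the text; 600 MHz and 2 FLOPs/cycle are assumed values
print(f"400 shaders @ 600 MHz: {peak_gflops(400, 600):.0f} GFLOPS")
```

As with any peak figure, this assumes every shader issues a multiply-add on every cycle, which real workloads on a VLIW5 design rarely sustain.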

Issues and Legacy

Known Bugs

One of the most prominent hardware defects in early AMD Family 10h processors was Erratum 298, a flaw in the translation lookaside buffer (TLB) that affected the B2 stepping of both desktop Phenom and server Opteron Barcelona models released in 2007. The issue arose during operations involving nested or recursive updates to page translation table entries, where L2 evictions could lead to non-atomic modifications, resulting in machine check exceptions, loss of cache line coherency, or data corruption, often manifesting as system hangs or crashes.

To address the erratum before a hardware fix was available, AMD recommended a BIOS-level workaround that disabled the L2 TLB cache by setting specific model-specific registers (MSRC001_0015[HWCR:TlbCacheDis] = 1 and MSRC001_1023 = 1), with the change applied across all cores in multiprocessor systems. This software mitigation, also supported in operating systems like Linux via kernel patches, prevented the erratum from triggering but incurred a performance penalty of 5-20% in 64-bit and memory-intensive workloads, with averages around 14-20% in synthetic and application benchmarks due to increased TLB miss rates and page walk overhead. BIOS updates from motherboard vendors enabled users to toggle the workaround, though it was advised to keep it enabled on affected revisions to avoid instability.

The erratum was resolved in hardware with the B3 stepping for Phenom desktop processors, introduced in early 2008, eliminating the need for the workaround and restoring full performance. Server Barcelona models received similar fixes in later steppings, such as BL-B3 and subsequent revisions. Early production runs of Barcelona also encountered additional manufacturing-related bugs that contributed to initial low yields. The collective impact of these defects delayed Barcelona and Phenom shipments by several months, prompted multiple silicon revisions, and drew scrutiny from investors, though no direct consumer lawsuits materialized; instead, they accelerated AMD's shift to improved process nodes and designs in successor families.
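The workaround amounts to a read-modify-write of a control bit in the HWCR register on every core. The sketch below shows only the bit arithmetic and does not touch hardware; reading or writing MSRs requires ring 0 privileges. The MSR address comes from the register name quoted above, while the bit position of TlbCacheDis (bit 3) is an assumption that should be confirmed against AMD's revision guide before use.

```python
HWCR_MSR = 0xC001_0015   # HWCR address, from the register name MSRC001_0015
TLB_CACHE_DIS_BIT = 3    # ASSUMED bit position of HWCR[TlbCacheDis]


def with_bit_set(value: int, bit: int) -> int:
    """Return value with the given bit set, leaving all other bits unchanged."""
    return value | (1 << bit)


def bit_is_set(value: int, bit: int) -> bool:
    """Return True when the given bit of value is 1."""
    return bool((value >> bit) & 1)


# Example with a made-up starting register value, for illustration only
old_hwcr = 0x0100_0010
new_hwcr = with_bit_set(old_hwcr, TLB_CACHE_DIS_BIT)
print(f"MSR {HWCR_MSR:#010x}: {old_hwcr:#010x} -> {new_hwcr:#010x}")
print("workaround bit set:", bit_is_set(new_hwcr, TLB_CACHE_DIS_BIT))
```

Using an OR rather than writing a fixed value matters here, because HWCR holds many unrelated configuration bits that the workaround must preserve.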

Sinkclose Vulnerability

In 2024, a high-severity vulnerability known as Sinkclose (CVE-2023-31315) was disclosed affecting AMD processors, including Family 10h models, that implement System Management Mode (SMM). This flaw allows an attacker with ring 0 (kernel-level) privileges to bypass SMM locks and execute arbitrary code within SMM, potentially leading to persistent, undetectable malware that survives OS reinstalls and affects system integrity. Exploitation requires prior kernel access, making it more relevant for compromised servers or environments with malware. AMD has issued firmware mitigations (AMD-SB-7014) for supported platforms, but legacy Family 10h systems may lack updates, leaving them vulnerable as of November 2025.

Successors

The AMD Family 10h microarchitecture, known as K10, was directly succeeded by the Family 15h Bulldozer architecture, which debuted in October 2011 with the FX-series desktop processors and Opteron server chips. Bulldozer introduced a modular core design featuring shared frontends and floating-point units to boost multi-threaded performance, marking a shift from K10's traditional per-core approach, though it retained key elements from Family 10h such as the integrated dual-channel DDR3 memory controller for low-latency access. This continuity helped maintain AMD's advantage in memory subsystem efficiency during the transition.

For low-power applications, Family 10h's influence extended indirectly through the Family 14h Bobcat microarchitecture in 2011, a compact, low-power out-of-order core for netbooks and embedded systems, and further to the Family 16h Jaguar in 2013 and its Puma update. Bobcat served as a bridge for efficient, integrated CPU-GPU solutions, paving the way for Jaguar's quad-core scalability in consoles like the PlayStation 4 and Xbox One. These designs carried forward K10's integrated memory controller and multi-core foundations to target mobile and APU markets.

Family 10h's legacy underpinned AMD's aggressive push into multi-core processing, enabling the first monolithic quad-core x86 designs and influencing the development of Accelerated Processing Units (APUs) like the 2011 Llano series, which paired K10-derived "Stars" cores with Radeon graphics. This multi-core emphasis and APU integration helped AMD regain desktop market share to around 20% by late 2011 amid competition from Intel, despite transitional challenges. Production of Family 10h processors wound down with final shipments occurring around early 2012, fully supplanted by later Family 15h iterations such as Piledriver and Steamroller.
