Centaur Technology
from Wikipedia

Centaur Technology was an x86 CPU design company started in 1995 and subsequently a wholly owned subsidiary of VIA Technologies. In 2015, the documentary Rise of the Centaur covered the early history of the company.[1] The company was broken up in 2021.[2]

History

Centaur Technologies Inc. was founded in April 1995 by Glenn Henry, Terry Parks, Darius Gaskins, and Al Sato.[citation needed] The funding was provided by Integrated Device Technology, Inc (IDT). The business goal was to develop compatible x86 processors that were less expensive than Intel processors and consumed less power.[citation needed] There were two main elements of the plan:[citation needed]

  1. a new design, developed from scratch, of an x86 processor core optimized differently from Intel's cores;
  2. a novel management approach designed to achieve high productivity.

While funded by IDT, three different Centaur designs were shipped under the marketing name of WinChip.[citation needed] In September 1999, Centaur was purchased from IDT by VIA Technologies, a Taiwanese company. Since then, five designs have shipped with the marketing name of VIA C3, as well as a number of designs for the VIA C7 processor and their latest 64-bit CPU, the VIA Nano.[citation needed]

The VIA Nano design has been further refined and improved in chips produced by Zhaoxin (a VIA joint venture company).[citation needed]

In late 2019, Centaur announced the "World’s First High-Performance x86 SoC with Integrated AI Coprocessor", the CNS core.[3]

In November 2021, Intel recruited the majority of the employees of the Centaur Technology division from VIA, a deal worth $125 million, effectively acquiring the talent and know-how of the x86 division.[4][5] VIA retained the x86 licence and associated patents, and its Zhaoxin CPU joint-venture continues.[6]

Design methodology

Centaur's chips were historically much smaller than comparable x86 designs of their time, which made them cheaper to manufacture and less power-hungry.[citation needed] This made them attractive in the embedded marketplace.[citation needed]

Centaur's design philosophy was always centered on "sufficient" performance for tasks that its target market demands. Some of the design trade-offs made by the design team ran contrary to accepted wisdom.[citation needed]

Centaur/VIA was among the first to design processors with hardware acceleration for encryption, hashing, and random number generation, in the form of VIA PadLock, starting with a 2004 VIA C7 release.[citation needed] Around the same time, the NSC Geode LX added support for 128-bit AES. Intel and AMD followed with the AES-NI specification in 2008, the Intel SHA extensions in 2013, and RDRAND in 2015.[citation needed]

VIA C3

  • Because memory performance is the limiting factor in many benchmarks, VIA processors implement large primary caches, large TLBs, and aggressive prefetching, among other enhancements. While these features are not unique to VIA, memory access optimization is one area where features were not sacrificed to save die space. In fact, generous primary caches (128KB) have always been a distinctive hallmark of Centaur designs.[citation needed]
  • Generally, clock frequency is favored over increasing instructions per cycle. Complex features such as out-of-order instruction execution are deliberately not implemented, because they impact the ability to increase the clock rate, require a lot of extra die space and power, and have little impact on performance in several common application scenarios.[citation needed]
  • The pipeline is arranged to provide one-clock execution of the heavily used register–memory and memory–register forms of x86 instructions. Several frequently used instructions require fewer clock cycles than on other x86 processors.[citation needed]
  • Rarely used x86 instructions are implemented in microcode and emulated as combinations of other x86 instructions. This saves die space and contributes to low power consumption. The impact on the majority of real-world application scenarios is minimal.[citation needed]
  • These design principles derive from the original RISC advocates, who claimed that a smaller, better-optimized set of instructions can deliver faster overall CPU performance. The C3 design cannot be considered a pure RISC design, because it accepts the x86 instruction set, which is a CISC design.[citation needed]
  • In addition to x86, these processors support the undocumented Alternate Instruction Set.[citation needed]

VIA C7

  • The VIA C7 Esther (C5J) is an evolutionary step after the VIA C3 Nehemiah+ (C5P), in which Centaur followed its traditional approach of balancing performance against a constrained transistor/power budget.[citation needed]
  • The cornerstone of the VIA C3 series chips' design philosophy has been that even a relatively simple in-order scalar core can offer reasonable performance against a complex superscalar out-of-order core if supported by an efficient "front-end", i.e. prefetch, cache and branch prediction mechanisms.[citation needed]
  • In the case of VIA C7, the design team focused on further streamlining the "front-end" of the chip, i.e. cache size, associativity and throughput as well as the prefetch system.[7] At the same time, no significant changes to the execution core ("back-end") of the chip seem to have been made.
  • The VIA C7 further closes the performance gap with AMD/Intel chips, since its clock speed is not thermally constrained.[citation needed]

VIA Nano

  • VIA Nano Isaiah (CN) is a combination of a number of firsts from Centaur, including their first superscalar out-of-order CPU and their first 64-bit CPU.[citation needed]
  • The development of the VIA Nano focused on radically improving the performance side of the performance-per-watt equation while still maintaining a similar TDP to the VIA C7.[citation needed]

CNS core

Centaur announced a new x86-64 "CNS" CPU with AVX-512 support and an integrated AI coprocessor in late 2019.[3] The CNS CPU was cancelled in 2021 when VIA sold parts of its Centaur division to Intel.[4] CNS-based CPUs had up to 8 cores and ran at a 2 GHz base frequency. They used the same LGA2011 socket as Intel's LGA2011-3 CPUs but were not electrically compatible with Intel motherboards. The CNS CPU cores were made on the TSMC 16 nm node. Some of the advancements made on CNS were later used in some CPUs by Zhaoxin Semiconductor, a joint venture co-owned by VIA.[8]

The CHA SoC comprised eight CNS cores and a 4096-byte-wide deep learning coprocessor referred to as Ncore.[9] The deep learning accelerator was evaluated on the MLPerf Inference v0.5 benchmark.[10] Of its several results in that public benchmark submission, its MobileNetv1 Stream result was notable as the lowest-latency result among all closed-division entrants.

Comparative die size

Processor            Secondary cache (KB)   Die size, 130 nm (mm²)   Die size, 90 nm (mm²)   Die size, 65 nm (mm²)
VIA Nano 1000/2000   1024                   N/A                      N/A                     63.3
VIA C3 / VIA C7      64/128                 52                       30                      N/A
Athlon XP            256                    84                       N/A                     N/A
Athlon 64            512                    144                      84                      77
Pentium M            2048                   N/A                      84                      N/A
P4 Northwood         512                    146                      N/A                     N/A
P4 Prescott          1024                   N/A                      110                     N/A

Note: Even the 180 nm Duron Morgan core (106 mm²) with a 64 KB secondary cache, when shrunk down to a 130 nm process, would still have had a die size of 76 mm².[citation needed] The VIA x86 core is smaller and cheaper to produce.[citation needed] As can be seen in this table, almost four C7 cores could be manufactured in the same area as one P4 Prescott core on a 90 nm process.
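The "almost four C7 cores" comparison can be checked directly from the table. The following is a minimal sketch, assuming a pure area ratio (it ignores wafer-edge losses, yield, and scribe lines); the dictionary layout and the `area_ratio` helper are illustrative names, not part of any cited source.

```python
# Die sizes in mm^2, copied from the comparative table; None marks "N/A".
DIE_SIZE_MM2 = {
    "VIA Nano 1000/2000": {130: None, 90: None, 65: 63.3},
    "VIA C3 / VIA C7": {130: 52, 90: 30, 65: None},
    "Athlon XP": {130: 84, 90: None, 65: None},
    "Athlon 64": {130: 144, 90: 84, 65: 77},
    "Pentium M": {130: None, 90: 84, 65: None},
    "P4 Northwood": {130: 146, 90: None, 65: None},
    "P4 Prescott": {130: None, 90: 110, 65: None},
}


def area_ratio(larger: str, smaller: str, node_nm: int) -> float:
    """Return how many `smaller` dies cover the area of one `larger` die.

    This is a plain area ratio at a single process node; it is an
    assumption of this sketch that nothing else (yield, packaging) matters.
    """
    a = DIE_SIZE_MM2[larger][node_nm]
    b = DIE_SIZE_MM2[smaller][node_nm]
    if a is None or b is None:
        raise ValueError(f"no die size listed at {node_nm} nm for one of the parts")
    return a / b


# 90 nm: P4 Prescott (110 mm^2) versus VIA C7 (30 mm^2).
ratio_90 = area_ratio("P4 Prescott", "VIA C3 / VIA C7", 90)
print(f"{ratio_90:.2f}")  # 3.67

# 130 nm: P4 Northwood (146 mm^2) versus VIA C3 (52 mm^2).
ratio_130 = area_ratio("P4 Northwood", "VIA C3 / VIA C7", 130)
print(f"{ratio_130:.2f}")  # 2.81
```

At 90 nm the ratio is about 3.67, which is what the note rounds to "almost four".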

from Grokipedia
Centaur Technology, Inc. was a fabless company based in Austin, Texas, that specialized in designing low-power, x86-compatible microprocessors for personal computers and embedded systems. Founded in 1995, the company developed a series of processor architectures, including the WinChip, VIA C3, C7, and Nano (Isaiah) families, which were manufactured by foundry partners and integrated into products from major OEMs. The company was established by a team led by Glenn Henry, a former IBM Fellow and Dell executive, along with Terry Parks, Darius Gaskins, and Al Sato, with initial funding and support from Integrated Device Technology (IDT). In 1997, Centaur released its first product, the WinChip, a cost-effective alternative to Intel's Pentium processors aimed at the low-end market. IDT sold Centaur to VIA Technologies in 1999 for $51 million, after which it became a key subsidiary focused on CPU core development for VIA's product lines.

Under VIA ownership, Centaur continued innovating with designs emphasizing power efficiency and multimedia capabilities, such as the cores of the C3 series (introduced in 2001) and the Isaiah architecture in the Nano processors (2008), which supported x86-64 and virtualization. The company's later efforts included the CHA SoC and its CNS cores, aimed at server applications, which were in development until 2021. In November 2021, VIA Technologies agreed to transfer Centaur's x86 design team of about 50 engineers to Intel for $125 million, while retaining the company's licenses and patents; this effectively ended independent operations. Over its 26-year history, Centaur shipped millions of processor units, contributing to the x86 ecosystem as a niche player challenging Intel and AMD in efficiency-focused segments.

Company Background

Founding and Early Mission

Centaur Technology was established on April 1, 1995, in Austin, Texas, as a startup focused on x86 processor design. The company was founded by Glenn Henry, a former IBM Fellow and Senior Vice President at Dell Computer Corporation, along with Terry Parks, Darius Gaskins, and Al Sato, all of whom brought extensive experience in processor architecture from their prior roles. Initially operating from the founders' homes with a compact team of around four core members that quickly expanded to about 20-30 engineers, Centaur emphasized a lean approach to innovation, prioritizing efficient design processes over large-scale operations.

The founding team's initial mission was to prove that a small, agile group could create competitive x86-compatible microprocessors at significantly lower costs than industry leaders Intel and AMD, targeting the underserved low-end market, particularly in emerging regions. This vision stemmed from Henry's observations at Dell, where he identified a gap in affordable solutions for the vast majority of global PC users who could not afford high-end processors. By designing chips from the ground up without licensing existing designs, Centaur aimed to deliver simplicity, low power consumption, and full compatibility with existing x86 software ecosystems, thereby enabling broader access to personal computing without prohibitive expense.

Early funding was secured through a partnership with Integrated Device Technology (IDT), an American semiconductor firm, which provided an initial investment of approximately $15 million and established Centaur as its subsidiary to support fabrication and market entry. This limited-resource model allowed the company to focus on core engineering challenges, such as optimizing for cost efficiency and performance in resource-constrained environments, rather than expansive marketing or infrastructure. The emphasis on underserved markets and innovative, no-frills design laid the groundwork for Centaur's subsequent developments in affordable x86 solutions.

Acquisition by VIA Technologies

In September 1999, VIA Technologies, a Taiwanese semiconductor company specializing in chipsets, acquired Centaur Technology from Integrated Device Technology (IDT) for $51 million in cash, along with intellectual property related to the WinChip microprocessor and x86 design expertise. This transaction, announced definitively on September 16, 1999, following initial reports in August, marked VIA's strategic entry into the CPU market by absorbing Centaur's Austin, Texas-based design team. Key personnel, including Centaur's founder and president Glenn Henry, were retained in leadership roles to ensure continuity in x86 development.

The acquisition was driven by VIA's ambition to build in-house x86 processor capabilities, complementing its strong position in motherboard chipsets and enabling a vertically integrated push against Intel's dominance in the low-end PC segment. For Centaur, previously struggling as an IDT subsidiary with limited manufacturing scale, the deal provided access to VIA's established fabrication partnerships, notably with Taiwan Semiconductor Manufacturing Company (TSMC), which facilitated production of future designs without the constraints of independent fab arrangements. IDT, seeking to exit the competitive x86 microprocessor business, viewed the sale as a way to refocus on communications and networking technologies.

Post-acquisition, Centaur transitioned from an IDT subsidiary to VIA's dedicated x86 CPU design subsidiary, operating as a semi-autonomous unit while aligning with VIA's broader ecosystem. The Austin headquarters remained the core operational base, with minimal relocation of staff to preserve the team's expertise and culture. This integration shifted Centaur's priorities toward developing processors optimized for VIA's platforms, emphasizing compatibility, power efficiency, and cost-effectiveness to target embedded and budget PC markets. Early efforts focused on refining existing x86 architectures to integrate seamlessly with VIA chipsets, laying the groundwork for a unified product strategy.

Historical Milestones

WinChip Development

In 1995, Centaur Technology was established as a wholly owned subsidiary of Integrated Device Technology (IDT), with IDT providing funding and taking responsibility for fabrication and marketing while Centaur focused exclusively on processor design. This collaboration enabled Centaur to develop its inaugural x86-compatible processor, the WinChip series, targeting low-cost personal computers compatible with the Socket 7 platform. The partnership leveraged IDT's manufacturing expertise in CMOS processes to produce a design optimized for efficiency rather than high performance, aiming to undercut competitors like Intel's Pentium and AMD's K6 in price while maintaining compatibility with standard PC software and operating systems.

The original WinChip, designated as the C6, was introduced in late 1997 at clock speeds ranging from 180 MHz to 240 MHz, fabricated on a 0.35-micron process with a compact 88 mm² die containing 5.4 million transistors. Its design featured a RISC-inspired internal pipeline, consisting of five stages for integer operations, with an additional stage that decoded complex x86 instructions into simpler micro-operations for execution, enabling higher efficiency and lower power draw than contemporary superscalar designs. Key elements included integrated 32 KB instruction and data caches, an 80-bit floating-point unit (FPU) running in parallel with the integer unit, and full support for Intel's MMX multimedia extensions, though it lacked extensions such as 3DNow!. Power consumption was notably low for the era, typically around 10-13 W under full load at 3.3-3.52 V, making it suitable for budget desktops and early mobile systems without requiring exotic cooling.

In 1998, Centaur released the WinChip 2 (also known as C6+), an evolution built on a 0.25-micron process that shrank the die to 58 mm² while raising the transistor count to about 6 million. Clock speeds extended up to 266 MHz (with some models reaching 300 MHz in performance-rated variants), supported by enhancements such as superscalar MMX execution with dual units, branch prediction, and compatibility with Super7 motherboards running 100 MHz front-side buses. The design retained the RISC-like core and x86 translation layer for sustained efficiency, with power dissipation remaining competitive at 9-12 W in normal operation and dropping to under 4 W in low-power states like StopGrant. It also added support for AMD's 3DNow! extensions via dedicated units, broadening its appeal for graphics-intensive applications, while the integrated FPU saw minor accuracy tweaks for better compatibility.

The WinChip series achieved modest market success, with approximately 1 million units shipped cumulatively through 1999, primarily to OEMs and distributors for sub-$1,000 PCs, where its $30-50 pricing provided a cost edge over pricier rivals. Reviewers praised its energy efficiency and value for basic office tasks, noting strong single-threaded performance in some benchmarks due to the efficient pipeline and large on-die caches, but criticized limitations in floating-point workloads stemming from its in-order execution and absence of more advanced features. IDT's inconsistent marketing and distribution, which focused more on its core MIPS business, hindered broader adoption, contributing to the line's discontinuation in mid-1999, when IDT exited the x86 market and sold Centaur to VIA.

Expansion Under VIA

Following its acquisition by VIA in 1999, Centaur Technology's Austin, Texas-based engineering team became the core of VIA's x86 processor R&D efforts, supported by the parent company's resources in Taiwan and its manufacturing relationships. This integration allowed Centaur to scale its operations, growing from a modest group to a team of over 100 engineers by the mid-2000s, focused on low-power architectures. Under VIA, Centaur shifted its emphasis from general desktop processors to low-power embedded and mobile applications, targeting markets like thin clients and set-top boxes where energy efficiency and compact design were paramount. This pivot maintained full x86 compatibility to ensure seamless support for legacy software, enabling deployment in resource-constrained environments without compatibility trade-offs.

Key milestones included the launch of VIA's first branded CPU, the C3 series, which marked Centaur's initial product under the new ownership. Subsequent releases built on this foundation: the 2005 C7 processor, optimized for ultra-low power consumption and fabricated on IBM's 90 nm silicon-on-insulator process; and the 2008 Nano introduction, using Fujitsu's 65 nm node for improved performance in embedded systems. These advancements depended on foundry partnerships that gave Centaur access to current fabrication technologies.

Centaur's expansion faced significant challenges, including fierce competition from dominant players Intel and AMD, who held substantial market share and resources in both desktop and emerging low-power segments. Additionally, VIA's joint ventures in China encountered regulatory hurdles related to U.S. export controls on technology, complicating technology transfers and collaborations.

Sale to Intel

In November 2021, VIA Technologies entered into an agreement with Intel Corporation, under which Intel paid $125 million to recruit the x86 design team from Centaur Technology, VIA's wholly owned subsidiary, along with certain assets. The transaction involved the transfer of about 50 engineers based in Austin, Texas, who specialized in x86 processor development, but did not include the sale of Centaur as a company; VIA retained ownership of Centaur's licenses and patents unrelated to the transferred assets.

The motivations for the deal stemmed from strategic shifts at both companies. For VIA, the sale aligned with its pivot away from x86 CPU design toward embedded systems and ARM-based architectures, allowing it to streamline operations and focus on core competencies in system-on-chip solutions for industrial and IoT applications. Intel, facing intensifying competition from AMD and ARM-based processors in data centers and client markets, sought to augment its x86 expertise by acquiring seasoned talent experienced in efficient, low-power core designs, including the transfer of Centaur's CNS core design for potential use in future hybrid architectures.

The acquisition marked the dissolution of Centaur as an active VIA subsidiary, with its Austin facilities later shuttered and auctioned off by December 2021. The recruited team was integrated into Intel's processor design groups, contributing to ongoing x86 development efforts, though no specific Centaur-derived products have been publicly released since the deal. This transaction formed part of Intel's broader 2021-2022 talent acquisition strategy, which included thousands of hires to reinforce its position amid industry challenges.

Processor Portfolio

VIA C3 Series

The VIA C3 series, introduced in 2001, represented Centaur Technology's initial x86 processor family developed under VIA Technologies' ownership, targeting low-cost computing solutions with an emphasis on power efficiency. These processors evolved through several core variants, each refining fabrication processes and clock speeds while maintaining a compact design suitable for embedded and entry-level desktop applications. The series spanned from the Samuel core to the Nehemiah core, culminating in 2004, and was fabricated exclusively by TSMC.

The inaugural Samuel core (C5A), launched in 2001, used a 180 nm process and operated at clock speeds ranging from 533 MHz to 800 MHz, with a maximum of approximately 1 GHz in select models. This was followed by the Samuel 2 core (C5B) in 2002, which shrank to a 150 nm process, enabling higher frequencies up to 1.2 GHz while reducing power draw through architectural optimizations. The Ezra core (C5C) arrived in 2003 on a 130 nm process, supporting speeds from 800 MHz to 1.43 GHz and introducing minor enhancements for better thermal performance. The final Nehemiah core (C5XL/C5P), released in 2004, also on 130 nm, pushed clocks to a maximum of 2 GHz and added Streaming SIMD Extensions (SSE) for improved multimedia handling, marking a key upgrade in instruction set compatibility.

Architecturally, the VIA C3 series employed an in-order execution model with a 12-stage integer pipeline, prioritizing simplicity and low latency over more aggressive techniques in order to remain efficient at modest clock rates. Cache configuration was consistent across variants, featuring 64 KB of instruction cache and 64 KB of data cache in L1, paired with a 64 KB unified L2 victim cache operating at full core speed. The integrated x87 floating-point unit (FPU) ran at half core speed in the early Samuel and Samuel 2 cores, limiting its throughput for compute-intensive tasks, while SIMD capabilities were restricted to MMX and AMD 3DNow! extensions until Nehemiah's SSE addition. Power consumption emphasized efficiency, with typical thermal design power (TDP) ratings of 5-11 W in normal operation for desktop variants, dropping to under 1 W in sleep modes, though actual figures varied by core and speed (e.g., 8.5 W TDP for an 800 MHz Ezra).

All VIA C3 processors were manufactured by TSMC, with die sizes progressively shrinking from around 80 mm² in the Samuel core to 52 mm² (or 47 mm² in later steppings) for later cores, contributing to cost-effectiveness and lower heat output. These compact dies, combined with the 0.13-0.18 μm processes, enabled fanless operation in many deployments.

The VIA C3 series found primary use in budget desktop PCs, thin clients, and embedded appliances, where its low power profile, often under 10 W in typical loads, allowed for fanless designs and extended battery life in portable systems. Reception highlighted its strengths in power efficiency, making it viable for cost-sensitive markets, but critics noted significant lags in floating-point performance compared to contemporaries; for instance, an 800 MHz C3 delivered throughput roughly equivalent to that of a competing processor running at 500 MHz, due to the simpler pipeline and half-speed FPU. Despite these shortcomings, the series' focus on affordability and compatibility sustained its niche appeal through 2004.

VIA C7

The VIA C7 processor, introduced in May 2005, represented Centaur Technology's next-generation x86 design following the C3 series, emphasizing low power consumption and integrated security features for embedded applications. Built on the Esther core using IBM's 90 nm silicon-on-insulator (SOI) process, it delivered clock speeds ranging from 1.0 GHz to 2.0 GHz with a thermal design power (TDP) of 3 W to 25 W, depending on the model and configuration. The processor featured a compact die size of 30 mm² and supported symmetric multiprocessing (SMP) for multi-processor setups, though it remained a single-core design.

Key architectural enhancements included a 64 KiB instruction cache and 64 KiB data cache (both 4-way set associative with 64-byte lines) for L1, paired with a unified 128 KiB L2 cache (32-way set associative, exclusive to L1). The C7 incorporated a 16-stage pipeline with improved branch prediction via a 1K-entry branch target address cache (BTAC), enabling more efficient handling of conditional jumps compared to prior designs. It provided full hardware support for the MMX, SSE, SSE2, and SSE3 instruction sets, boosting multimedia processing capabilities over the C3 lineage.

A standout innovation was the integrated VIA PadLock engine, which offered hardware acceleration for AES encryption (up to 128-bit keys) and SHA-1 and SHA-256 hashing, a Montgomery multiplier for public-key cryptography, and a true random number generator (RNG). This on-chip security suite, first introduced in the C7, enhanced performance for encrypted data processing while maintaining ultra-low idle power draw of approximately 0.1 W.

Primarily targeted at embedded systems, single-board computers, thin clients, and early netbook platforms, the C7 excelled in power-sensitive environments with its 400 MHz front-side bus and versatile packaging options like nanoBGA2. Production of the C7 family tapered off around 2010 as VIA shifted focus to subsequent architectures.

VIA Nano (Isaiah Core)

The VIA Nano processor family, introduced in 2008, was based on the Isaiah core architecture developed by Centaur Technology, marking a significant advancement in low-power x86 computing. Isaiah implemented a 64-bit superscalar, out-of-order execution design with an 8-stage pipeline, enabling up to 7 instructions issued per cycle across 7 execution units, including 2 integer units, 2 vector/floating-point units, and 3 load/store units. The architecture supported the full x86-64 instruction set, along with extensions such as Intel VT virtualization, SSE4 (in later variants), macro- and micro-op fusion, memory disambiguation, store merging, and advanced branch prediction using 8 predictors. Fabricated initially on a 65 nm process by Fujitsu, Isaiah cores operated at clock speeds from 1.0 GHz to 2.0 GHz, with plans for a transition to 40 nm by TSMC in subsequent models. Each core featured 64 KB instruction and 64 KB data L1 caches (16-way associative and exclusive), plus a 1 MB exclusive L2 cache (16-way or 32-way associative in refined versions), contributing to efficient multimedia and general-purpose workloads.

Key features of the Isaiah-based VIA Nano emphasized power efficiency for mobile and embedded applications, with thermal design power (TDP) ratings ranging from 2.5 W to 25 W and idle power as low as 100 mW in ultra-low-voltage models. Integrated security elements like the VIA PadLock engine provided hardware acceleration for AES encryption, SHA-1/SHA-256 hashing, and a random number generator, alongside a secure execution mode with volatile secure memory. Power management innovations included the C6 deep sleep state, Adaptive PowerSaver technology, and a dual-PLL design for dynamic voltage and frequency scaling. The design prioritized compatibility, using a NanoBGA2 package pin-compatible with prior VIA processors, and supported an 800 MHz to 1333 MHz front-side bus.

The initial VIA Nano launch in 2008 featured single-core models such as the L-series (up to 1.8 GHz, 25 W TDP) and U-series (down to 1.0 GHz, 2.5 W TDP), targeted at netbooks and thin clients. The Nano 3000 series, released in 2010, refined the Isaiah architecture, adding SSE4.1 support and delivering up to 20% higher performance at 20% lower power compared to the original Nano, with variants like the 2.0 GHz L3100 (500 mW idle) and 1.0 GHz U3500 (100 mW idle). In 2011, VIA introduced multi-core capabilities with the 40 nm Nano X2 dual-core processors (e.g., 1.6 GHz L4650E, 25 W TDP) and QuadCore series (e.g., 1.2 GHz E-series, up to 4 cores on a ~132 mm² die), enabling configurations for tablets, industrial PCs, and embedded systems while maintaining the low-power envelope.

Performance-wise, the VIA Nano achieved notable instructions-per-clock (IPC) gains over the preceding 32-bit VIA C7, with clock-for-clock improvements ranging from 1.6x to 3.2x in application benchmarks, reflecting the shift to out-of-order execution and enhanced floating-point capabilities (up to 4 adds and 4 multiplies per clock). These processors excelled in power-constrained scenarios, supporting HD video playback and finding adoption in portable devices and rugged industrial computing, though they trailed contemporary competitors in some multi-threaded tasks due to lower clock speeds and core counts.

CNS Core and CHA SoC

The CNS core represented Centaur Technology's most advanced processor design, oriented toward server applications and featuring an out-of-order execution pipeline with significantly improved instructions per clock (IPC) compared to prior architectures, estimated at approximately twice that of the VIA Nano core. The design supported up to eight cores per chip, with base clock speeds ranging from 2.0 to 2.5 GHz depending on silicon binning, and included advanced instruction set extensions such as AVX-512 for vector processing (executed on 256-bit hardware as two micro-operations) and AES-NI for cryptographic acceleration. It incorporated a sophisticated branch predictor capable of handling 512 branches and 24-long patterns, along with enhanced prefetchers and a 32 KiB L1 instruction cache fetching up to 32 bytes per cycle, enabling competitive single-threaded performance in server workloads.

The CHA SoC integrated the CNS cores with a dedicated AI co-processor called NCORE, a 32,768-bit VLIW neural unit delivering up to 20 tera-operations per second in INT8 precision and 6.8 TFLOPs in bfloat16 for AI inference tasks. Fabricated on TSMC's 16 nm process with a die size of 194 mm², the SoC featured 16 MB of shared L3 cache, a quad-channel DDR4-3200 memory controller, and 44 PCIe 3.0 lanes for high-bandwidth connectivity, targeting server environments with a focus on power efficiency and integrated AI acceleration. Internal benchmarks demonstrated the CNS cores achieving performance levels comparable to Intel's Skylake processors in select integer and floating-point workloads, such as SPECint and certain AI inference tasks, though the design lagged in memory-bound scenarios due to the older process node.

Development of the CNS core and CHA SoC began around 2016 as part of Centaur's shift toward higher-performance server designs, culminating in a public announcement in late 2019 as the "world's first high-performance x86 SoC with integrated AI coprocessor." First silicon arrived in mid-2019 for validation, with internal testing confirming viability against contemporary competitors like Skylake-SP, but the project was halted in 2021 following VIA Technologies' transfer of Centaur's x86 design team and certain assets to Intel for $125 million. No commercial products based on CNS or CHA were released, as the deal redirected resources and terminated further development.
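The quoted NCORE figures can be sanity-checked with back-of-the-envelope arithmetic. This is only a sketch under stated assumptions: that every 8 bits of the 32,768-bit unit forms one INT8 multiply-accumulate lane, that a multiply-accumulate counts as two operations, and that the unit runs at 2.5 GHz (the top of the clock range quoted for the cores). None of these assumptions comes from the text, and the function names are illustrative.

```python
# Figures quoted in the text above.
NCORE_WIDTH_BITS = 32_768     # width of the NCORE unit
QUOTED_INT8_TOPS = 20.0       # "up to 20 tera-operations per second"
QUOTED_BF16_TFLOPS = 6.8      # bfloat16 figure

# Assumptions of this sketch (not stated in the text).
OPS_PER_MAC = 2               # one multiply-accumulate counted as two operations
CLOCK_HZ = 2.5e9              # top of the 2.0-2.5 GHz range quoted for the cores


def int8_lanes(width_bits: int) -> int:
    """Number of 8-bit lanes that fit in a datapath of the given width."""
    return width_bits // 8


def peak_tops(lanes: int, clock_hz: float, ops_per_lane: int = OPS_PER_MAC) -> float:
    """Peak tera-operations per second if every lane is busy every cycle."""
    return lanes * ops_per_lane * clock_hz / 1e12


lanes = int8_lanes(NCORE_WIDTH_BITS)
estimate = peak_tops(lanes, CLOCK_HZ)
int8_to_bf16 = QUOTED_INT8_TOPS / QUOTED_BF16_TFLOPS

print(lanes)                    # 4096
print(round(estimate, 2))       # 20.48
print(round(int8_to_bf16, 2))   # 2.94
```

Under these assumptions the estimate lands close to the quoted 20 TOPS, and the quoted bfloat16 rate is roughly one third of the INT8 rate.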

Design Approach

Methodology and Philosophy

Centaur Technology's design philosophy centered on reducing die size, power consumption, and manufacturing cost rather than maximizing raw peak performance, in order to serve mainstream and embedded markets underserved by dominant players like Intel. This approach drew on the founders' prior industry experience and inspired a focus on practical, efficient solutions rather than high-end complexity. In response to Intel's market dominance, Centaur targeted low-end PCs, embedded systems, and cost-sensitive applications, producing processors that achieved "fast enough" performance for 90% of workloads while maintaining full x86 binary compatibility to leverage existing software ecosystems.

The company's design process relied on an agile, small-team approach, typically involving 20 to 60 engineers per project to enable rapid iteration and low overhead, in contrast with larger competitors' teams of hundreds. This lean structure facilitated quick tape-outs, such as the first in just 13 months, supported by extensive simulation and formal verification tools like theorem proving and symbolic simulation to ensure reliability and compatibility without exhaustive physical prototyping. Centaur prioritized binary compatibility through rigorous testing, aiming for "nauseously compatible" execution of x86 code, which allowed its processors to drop into existing systems without software modifications.

Architecturally, early designs like the WinChip adopted simple, RISC-inspired internals for efficiency, paired with an x86 front end for instruction translation, avoiding the complexity of superscalar execution to minimize power and cost. This evolved in later processors, such as the VIA Nano (Isaiah core), which introduced out-of-order execution while retaining RISC-like micro-op decoding to balance performance gains with the core philosophy of efficiency. Centaur generally avoided licensing proprietary extensions, implementing standard x86 features independently until later adopting additional extensions in designs like the VIA Eden to meet embedded demands without compromising compatibility.

Core Architectures

Centaur Technology's early x86 core designs emphasized simplicity and power efficiency, beginning with the WinChip C6 core introduced in 1997. The C6 employed an in-order execution model with a RISC-based internal design, augmented by an x86 instruction decoder and translator that converted complex x86 instructions into simpler RISC-like micro-operations. It featured a five-stage pipeline (fetch, decode/translate, address generation, execute, writeback), enabling efficient handling of basic integer operations while prioritizing low power consumption and high clock speeds.

The VIA C3 series, launched in 2001 after Centaur's acquisition by VIA Technologies, built on this foundation with a core that adopted a 12-stage integer pipeline focused on integer workloads for embedded and low-power applications. This pipeline supported single-issue in-order execution, with large 64 KB L1 instruction and data caches to improve hit rates and reduce latency for integer-heavy tasks such as legacy software and basic multimedia processing. Branch prediction was basic, relying on static methods and a small return stack, but the design incorporated MMX support and aggressive clock scaling to reach competitive frequencies in power-constrained environments.

In the mid-2000s, the C7's Esther core marked a shift toward modest superscalar capabilities while maintaining Centaur's low-power ethos. Esther used a 16-stage pipeline with dual-issue in-order execution, allowing two integer or floating-point operations to issue together and improving throughput without the complexity of full out-of-order processing. Branch prediction was enhanced with a larger branch target buffer and hybrid mechanisms, reducing misprediction penalties in control-intensive code. The core also supported SSE instructions, enabling better handling of multimedia workloads. Its 128 KB L2 cache and integrated security features such as VIA PadLock further emphasized efficiency for mobile and embedded systems.
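The decode/translate step described above for the early cores, in which a complex x86 instruction becomes a few simple RISC-like micro-operations, can be illustrated with a toy sketch. This is not Centaur's actual translator: the function names, the `tmp0` temporary register, and the tuple encoding of micro-ops are all assumptions made for illustration.

```python
# Illustrative only: a toy x86-to-micro-op translator, not Centaur's real design.
# The names translate/is_mem, the "tmp0" temporary, and the tuple encoding
# (operation, destination, source) are hypothetical.
import re

MEM = re.compile(r"^\[.+\]$")  # operands written like "[ebx]" count as memory


def is_mem(operand: str) -> bool:
    """Return True if the operand is a bracketed memory reference."""
    return bool(MEM.match(operand))


def translate(op: str, dst: str, src: str) -> list:
    """Expand one two-operand x86-style instruction into simple micro-ops."""
    if is_mem(dst) and is_mem(src):
        raise ValueError("x86 ALU instructions take at most one memory operand")
    uops = []
    if is_mem(src):
        # Memory source: load it into a temporary, then operate on registers.
        uops.append(("load", "tmp0", src))
        src = "tmp0"
    if is_mem(dst):
        # Memory destination: read-modify-write becomes load, ALU op, store.
        uops.append(("load", "tmp0", dst))
        uops.append((op, "tmp0", src))
        uops.append(("store", dst, "tmp0"))
    else:
        uops.append((op, dst, src))
    return uops


if __name__ == "__main__":
    for uop in translate("add", "[ebx]", "eax"):
        print(uop)
```

A register-to-register instruction passes through as a single micro-op, while a read-modify-write on memory expands to three, which is why a simple in-order back end can still execute the full x86 instruction set.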
The VIA Nano's Isaiah core, released in 2008, represented Centaur's first foray into out-of-order execution, featuring a superscalar design with a three-wide x86 decoder that generated up to three fused micro-operations per cycle for dispatch to a reorder buffer estimated at 128 entries. This enabled execution across multiple ports, including integer ALUs, FP units, and load/store queues, with macro-fusion combining common instruction pairs (e.g., compare-and-branch) to boost IPC. Branch prediction employed eight specialized predictors across two stages, including a 4K-entry BTB and a loop detector, achieving low-latency resolution for the branches that are critical to x86 code. The architecture supported 64-bit extensions and targeted sub-5 W power in small-form-factor devices.

Centaur's later development, the CNS core within the CHA SoC announced around 2019, advanced to a wider execution model with a 12-stage pipeline supporting up to four-wide decode and dispatch, delivering Haswell-like IPC through duplicated ALUs and robust store forwarding (two loads and two stores per cycle). It incorporated AVX-512 vector units, splitting 512-bit operations into 256-bit micro-ops for compatibility with server-class workloads, while the integrated CHA design added an Ncore AI accelerator capable of 6.8 TFLOPS in bfloat16 precision for inference. The core emphasized scalability with private L2 caches and a shared 16 MB L3.

Throughout its evolution, Centaur's cores shared principles for easy integration into SoCs, with a consistent focus on low-latency branch handling via stacked predictors and efficient cache hierarchies suited to power-sensitive applications.
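The splitting of 512-bit vector operations into 256-bit micro-ops mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not a description of the CNS hardware: the `MicroOp` record, the `crack_vector_op` function, and the idea of tagging each micro-op with a bit offset are hypothetical, and `vaddps` is used only as a familiar example mnemonic.

```python
# Illustrative only: cracking a wide vector operation into narrower micro-ops.
# MicroOp and crack_vector_op are hypothetical names; the 256-bit default
# reflects the datapath width stated in the text above.
from dataclasses import dataclass


@dataclass(frozen=True)
class MicroOp:
    opcode: str
    offset_bits: int  # which slice of the architectural register this covers
    width_bits: int   # how many bits this micro-op processes


def crack_vector_op(opcode: str, width_bits: int, datapath_bits: int = 256) -> list:
    """Split one vector operation into micro-ops no wider than the datapath."""
    if width_bits <= datapath_bits:
        return [MicroOp(opcode, 0, width_bits)]
    if width_bits % datapath_bits != 0:
        raise ValueError("operation width must be a multiple of the datapath width")
    return [
        MicroOp(opcode, offset, datapath_bits)
        for offset in range(0, width_bits, datapath_bits)
    ]


if __name__ == "__main__":
    for uop in crack_vector_op("vaddps", 512):
        print(uop)
```

Under this scheme a 512-bit operation occupies two issue slots on a 256-bit datapath, so the wider instruction set remains available to software while peak vector throughput per cycle stays at the narrower width.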

Comparative Analysis

Centaur Technology's processor designs consistently emphasized compact die sizes compared to contemporaries from Intel and AMD, enabling lower manufacturing costs. For instance, the early WinChip C6 featured an 88 mm² die in 0.35-micron process technology, marginally smaller than the Pentium's approximately 90 mm² die in a comparable era, which helped reduce production expenses and supported affordability in budget systems. Later dual-core variants used two 66 mm² dies for a total of about 132 mm², close to the Core 2 Duo's 143 mm² Conroe die, while targeting low-power applications. The more recent CNS-based CHA SoC, an 8-core design on TSMC's 16 nm process, achieved a 194 mm² die size, balancing core count with efficiency for server workloads without excessive scaling.

In power efficiency, Centaur processors stood out in embedded and low-power segments, often consuming far less than Intel and AMD equivalents. The VIA C3 and C7 series operated at 1 to 5 W in ultra-low-voltage configurations, in contrast to contemporary mainstream processors rated from about 21 W to more than 50 W TDP, making them well suited to fanless systems and portable devices. The VIA Nano carried this low-power focus forward with a design optimized for idle and light loads.

Performance comparisons revealed persistent gaps in instructions per cycle (IPC) for Centaur's early designs relative to Intel's, though later iterations narrowed the divide. The VIA C3 achieved roughly 0.3 IPC against a Pentium III baseline of 1.0 at equivalent clocks, resulting in about one-third the overall throughput in compute-intensive tasks.

Overall, Centaur's designs excelled in cost/performance ratio for embedded applications, where smaller dies and low power enabled TSMC-based production at lower prices than Intel or AMD silicon, though they rarely competed on raw speed in desktop or high-end server markets.
This approach prioritized value in niche sectors like thin clients and industrial systems, leveraging efficient architectures for sustained viability.
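The arithmetic behind two of the comparisons above can be checked with a short sketch. It assumes the simple model that throughput scales with IPC multiplied by clock frequency; the 1000 MHz clock is an arbitrary placeholder (the text only says "equivalent clocks"), and the function names are hypothetical.

```python
# Illustrative arithmetic only, using figures quoted in the text above.
# The throughput model (IPC x clock) and the 1000 MHz clock are assumptions.

def relative_throughput(ipc: float, clock_mhz: float,
                        ref_ipc: float, ref_clock_mhz: float) -> float:
    """Throughput of one core relative to a reference core (IPC x clock model)."""
    return (ipc * clock_mhz) / (ref_ipc * ref_clock_mhz)


def total_die_area(die_mm2: float, dies: int) -> float:
    """Combined silicon area of a multi-die package."""
    return die_mm2 * dies


CLOCK_MHZ = 1000.0  # assumed; any equal clock for both parts gives the same ratio

c3_vs_p3 = relative_throughput(0.3, CLOCK_MHZ, 1.0, CLOCK_MHZ)
dual_die_area = total_die_area(66.0, 2)

if __name__ == "__main__":
    print(f"C3 throughput relative to Pentium III: {c3_vs_p3:.2f}")
    print(f"Dual-die area: {dual_die_area:.0f} mm^2 (Conroe: 143 mm^2)")
```

At equal clocks the ratio reduces to the IPC ratio alone, which is why a relative IPC of 0.3 corresponds to roughly one-third the throughput.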
