Graphcore
from Wikipedia

Graphcore Limited is a British semiconductor company that develops accelerators for AI and machine learning. It has introduced a massively parallel Intelligence Processing Unit (IPU) that holds the complete machine learning model inside the processor.[3]

Key Information

History


Graphcore was founded in 2016 by Simon Knowles and Nigel Toon.[4]

In the autumn of 2016, Graphcore secured a first funding round led by Robert Bosch Venture Capital. Other backers included Samsung, Amadeus Capital Partners, C4 Ventures, Draper Esprit, Foundation Capital, and Pitango.[5][6]

In July 2017, Graphcore secured Series B funding led by Atomico,[7] which was followed a few months later by $50 million in funding from Sequoia Capital.[8]

In December 2018, Graphcore closed its series D with $200 million raised at a $1.7 billion valuation, making the company a unicorn. Investors included Microsoft, Samsung and Dell Technologies.[9]

On 13 November 2019, Graphcore announced that their Graphcore C2 IPUs were available for preview on Microsoft Azure.[10]

Meta Platforms acquired the AI networking technology team from Graphcore in early 2023.[11]

In July 2024, SoftBank Group agreed to acquire Graphcore for around $500 million. The deal came under review by the investment security unit of the UK's business department.[12][13]

Products


In 2016, Graphcore announced the world's first graph toolchain designed for machine intelligence, called the Poplar Software Stack.[14][15][16]

In July 2017, Graphcore announced its first chip, called the Colossus GC2, a "16 nm massively parallel, mixed-precision floating point processor", that became available in 2018.[17][18] Packaged with two chips on a single PCI Express card, called the Graphcore C2 IPU (an Intelligence Processing Unit), it is stated to perform the same role as a GPU in conjunction with standard machine learning frameworks such as TensorFlow.[17] The device relies on scratchpad memory for its performance rather than traditional cache hierarchies.[19]

In July 2020, Graphcore presented its second-generation processor, called the GC200, built with TSMC's 7 nm FinFET manufacturing process. The GC200 is a 59-billion-transistor, 823 square-millimetre integrated circuit with 1,472 computational cores and 900 MB of local memory.[20] In 2022, Graphcore and TSMC presented the Bow IPU, a 3D package in which a GC200 die is bonded face to face to a power-delivery die, allowing a higher clock rate at lower core voltage.[21] Graphcore aims to build what it calls a Good machine, named after I. J. Good, able to run AI models with more parameters than the human brain has synapses.[21]

Release date | Product | Process node | Cores | Threads | Transistors | teraFLOPS (FP16)
July 2017 | Colossus™ MK1 - GC2 IPU | 16 nm TSMC | 1,216 | 7,296 | ? | ~100-125[22]
July 2020 | Colossus™ MK2 - GC200 IPU | 7 nm TSMC | 1,472 | 8,832 | 59 billion | ~250-280[23]
— | Colossus™ MK3 | — | — | — | — | ~500[24]

Both the older and newer chips run 6 threads per tile (for totals of 7,296 and 8,832 threads, respectively) in a MIMD (multiple instruction, multiple data) model, with distributed local memory as the only form of memory on the device apart from registers.[citation needed] The older GC2 chip has 256 KiB of memory per tile, while the newer GC200 has about 630 KiB per tile; tiles are grouped into islands of four,[25] which are in turn arranged into columns, with the lowest latency for accesses within a tile.[citation needed] The IPU uses IEEE FP16 arithmetic with stochastic rounding, as well as single-precision FP32 at lower performance.[26] Code and the data it works on locally must fit within a tile, but via message-passing a program can use all on-chip or off-chip memory, and the AI software stack makes this transparent, e.g. through PyTorch support.[citation needed]
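A quick back-of-the-envelope check, sketched in Python, reproduces the headline totals from the per-tile figures quoted above (the 624 KB/tile value for the GC200 comes from the architecture description later in this article; the paragraph above rounds it to about 630 KiB):

```python
# Reproducing the thread and memory totals from the per-tile figures above.
IPUS = {
    "GC2 (Mk1)":   {"tiles": 1216, "threads_per_tile": 6, "kib_per_tile": 256},
    "GC200 (Mk2)": {"tiles": 1472, "threads_per_tile": 6, "kib_per_tile": 624},
}

for name, spec in IPUS.items():
    total_threads = spec["tiles"] * spec["threads_per_tile"]
    total_mem_mib = spec["tiles"] * spec["kib_per_tile"] / 1024  # KiB -> MiB
    print(f"{name}: {total_threads} hardware threads, ~{total_mem_mib:.0f} MiB on-chip SRAM")

# GC2:   1216 * 6 = 7296 threads, 1216 * 256 KiB ≈ 304 MiB
# GC200: 1472 * 6 = 8832 threads, 1472 * 624 KiB ≈ 897 MiB (~900 MB as marketed)
```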


References

from Grokipedia

Graphcore Limited is a British semiconductor company founded in 2016 in Bristol, United Kingdom, by serial entrepreneurs Nigel Toon and Simon Knowles, specializing in the design and production of Intelligence Processing Units (IPUs), a type of parallel processor architected specifically for accelerating artificial intelligence and machine learning workloads.
The company's IPUs emphasize massive on-chip parallelism, with each unit featuring thousands of independent processing cores and integrated memory to handle complex AI models more efficiently than traditional GPUs for certain tasks, supported by the proprietary Poplar software stack for model training and inference.
Graphcore raised significant venture funding, including a $32 million Series A round led by Robert Bosch Venture Capital in 2016, achieving unicorn status amid the AI hardware boom, before being acquired by SoftBank Group as a wholly owned subsidiary to bolster its global AI compute capabilities.
In 2025, Graphcore announced plans to invest up to £1 billion over the next decade in India, establishing an AI Engineering Campus in Bengaluru to create 500 semiconductor jobs and expand research in AI infrastructure.

Founding and Early Development

Inception and Founders

Graphcore was founded on 14 November 2016 in Bristol, United Kingdom, by serial entrepreneurs Nigel Toon and Simon Knowles, who respectively assumed the roles of CEO and CTO. The company emerged from a stealth development phase that began around late 2013, with formal incorporation aimed at creating specialized processors to address the limitations of conventional GPUs and CPUs in machine intelligence workloads. The inception of Graphcore traces to January 2012, when Toon and Knowles met at the Marlborough Tavern in Bath to brainstorm opportunities following the exits from their prior ventures in semiconductor design.

Both founders brought extensive experience in processor innovation: Toon had served as CEO of two venture-backed firms, picoChip (acquired by Mindspeed in 2012) and XMOS, focusing on multicore and embedded processing technologies. Knowles, an engineer with over 40 years in the field, had co-founded and exited two fabless semiconductor companies, including Icera (acquired by Nvidia in 2011), and contributed to 14 production chips, including early domain-specific architectures.

This partnership leveraged Bristol's engineering heritage, rooted in decades of silicon innovation, to pioneer the Intelligence Processing Unit (IPU), a massively parallel processor optimized for AI inference and training through massive on-chip memory and parallelism. Initial seed funding, including early backers from the founders' networks, enabled prototyping amid a nascent competitive landscape dominated by general-purpose accelerators.

Initial Technology Focus and Prototyping

Graphcore's initial technology efforts concentrated on designing the Intelligence Processing Unit (IPU), a processor architecture optimized for machine intelligence applications, distinguishing it from graphics processing units (GPUs) by integrating the full machine learning model on-chip to minimize data transfer bottlenecks. Founded in 2016 by hardware engineers Nigel Toon and Simon Knowles—veterans of Icera, which they sold to Nvidia in 2011—the company targeted the inefficiencies of existing processors in managing AI's graph-like, probabilistic computations through a massively parallel, MIMD-based structure comprising thousands of lightweight processing threads. This approach prioritized low-precision arithmetic to accelerate inference and training tasks requiring rapid iteration over vast parameter spaces, rather than high-precision numerical simulations.

Prototyping commenced in 2016 following the company's incorporation in Bristol, United Kingdom, with seed investments enabling the fabrication of early IPU silicon to validate the architecture's scalability and performance for AI workloads. These prototypes emphasized on-chip memory hierarchies and interconnects to support synchronous parallelism across processing elements, addressing latency issues inherent in off-chip model storage on GPUs. By mid-2017, this work culminated in the announcement of the Colossus GC2, Graphcore's inaugural IPU—a 16 nm device with 1,216 independent processor tiles delivering mixed-precision floating-point operations at scale. Concurrently, the team co-developed the Poplar software stack to facilitate model mapping onto the hardware, ensuring prototypes could demonstrate end-to-end AI acceleration.

Core Technology

Intelligence Processing Unit Architecture

The Intelligence Processing Unit (IPU) employs a massively parallel MIMD architecture comprising thousands of independent processing tiles, each integrating compute and memory to minimize the data movement latency inherent in traditional von Neumann designs. Unlike GPUs, which rely on hierarchical caches and global DRAM, the IPU distributes on-chip SRAM directly within tiles, enabling explicit, high-bandwidth data exchange without implicit caching overhead. This tile-based structure supports bulk synchronous parallel (BSP) execution, sequencing compute phases with collective synchronization and exchange operations across the fabric. Each tile features a single multi-threaded processor core capable of running up to six worker threads alongside a supervisor thread for scheduling, with vectorized floating-point units and dedicated matrix multiply engines delivering 64 multiply-accumulate operations per cycle in half precision.

In the second-generation IPU (GC200), the chip integrates 1,472 such tiles, providing nearly 9,000 parallel threads and 900 MB of aggregate In-Processor-Memory (SRAM) at 624 KB per tile, yielding aggregate bandwidths exceeding 45 TB/s for local access with latencies around 3.75 ns at 1.6 GHz clock speeds. First-generation IPUs (MK1) featured 1,216 tiles with 304 MiB of total SRAM, scaling performance to 124.5 TFLOPS in mixed precision. The IPU's exchange hierarchy facilitates all-to-all communication via an on-chip exchange fabric with 7.7 TB/s throughput and sub-microsecond latencies for operations like gathers (0.8 µs across the IPU), enabling efficient handling of the irregular, graph-like data flows common in AI models. Off-tile scaling occurs through IPU-Links (64 GB/s bidirectional) and host interfaces, supporting multi-IPU clusters without relying on PCIe bottlenecks.

This contrasts with GPU SIMT models, where thread divergence and memory coalescing limit efficiency on non-uniform workloads; IPUs excel at fine-grained parallelism and small-batch inference by partitioning models across tiles with explicit messaging, achieving up to 3-4x speedups over GPUs in graph neural networks.
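As a rough illustration of the BSP pattern described above—independent local compute, a global barrier, then explicit data exchange—the following toy Python sketch models a handful of tiles. It is purely conceptual and does not use Graphcore's Poplar API; tile counts and data are placeholder values:

```python
# Toy model of one bulk synchronous parallel (BSP) superstep: every "tile"
# computes on purely local data, all tiles synchronize, then data moves in
# an explicit exchange phase. Conceptual sketch only, not the Poplar API.
from concurrent.futures import ThreadPoolExecutor

NUM_TILES = 8  # a real GC200 has 1,472 tiles

def local_compute(tile_id, local_data):
    # Compute phase: touch only this tile's local memory.
    return [x * x for x in local_data]

def superstep(tile_memories):
    # 1. Compute phase: all tiles run independently (MIMD).
    with ThreadPoolExecutor(max_workers=NUM_TILES) as pool:
        results = list(pool.map(local_compute, range(NUM_TILES), tile_memories))
    # 2. Sync phase: pool.map returning acts as the barrier; all tiles done.
    # 3. Exchange phase: explicit, scheduled data movement
    #    (here: each tile passes its result to its neighbour).
    return [results[(i - 1) % NUM_TILES] for i in range(NUM_TILES)]

tiles = [[tile_id, tile_id + 1] for tile_id in range(NUM_TILES)]
tiles = superstep(tiles)
print(tiles[0])  # tile 0 now holds tile 7's squared values: [49, 64]
```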

Key Innovations in Parallel Processing

Graphcore's Intelligence Processing Unit (IPU) introduces a tile-based architecture optimized for machine intelligence workloads, featuring 1,472 independent processing tiles per second-generation (MK2) IPU, each capable of executing multiple threads. This design enables nearly 9,000 concurrent independent program threads, supporting a multiple instruction, multiple data (MIMD) execution model where tiles operate with autonomous control flows, contrasting with the more rigid SIMD paradigms of traditional GPUs.

A core innovation lies in the bulk synchronous parallel (BSP) execution model, which structures computation into discrete phases of local tile processing, global synchronization, and inter-tile data exchange via an on-chip all-to-all fabric. This approach minimizes overhead in highly parallel AI tasks, such as graph-based computations, by enforcing synchronous execution across all tiles per step while allowing round-robin thread scheduling within tiles to hide latencies (sketched below). Complementing this, each tile integrates local SRAM (624 KB per tile, totaling approximately 900 MB of In-Processor-Memory across the IPU), which colocates compute and data to drastically reduce the memory access bottlenecks inherent in von Neumann architectures.

Further enhancements include specialized hardware for vectorized floating-point operations (e.g., FP16 and FP32 with matrix multiply-accumulate units performing 64 operations per cycle) and high-bandwidth collective communication primitives, enabling efficient scaling to pod-level systems interconnecting up to 64,000 IPUs. Microbenchmarking reveals that this parallelism yields superior throughput for irregular, data-intensive workloads, though performance is bounded by exchange fabric contention under unbalanced loads. These elements address the parallelism demands of large-scale models by prioritizing fine-grained, graph-oriented parallelism over sequential bottlenecks.
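The round-robin thread scheduling mentioned above can be sketched with a toy scheduler: six worker threads per tile take turns issuing, so a worker stalled on memory does not leave the pipeline idle. This is a conceptual model of the scheduling idea, not IPU hardware behaviour:

```python
# Toy round-robin scheduler: six "worker threads" per tile issue in turn.
from collections import deque

def worker(wid):
    # Each worker is an endless instruction stream; one op per scheduler turn.
    while True:
        yield f"w{wid}: op"

def run_tile(workers, cycles):
    ready = deque(range(len(workers)))
    trace = []
    for _ in range(cycles):
        w = ready.popleft()   # round-robin: next worker gets one issue slot
        next(workers[w])      # advance that worker's instruction stream
        ready.append(w)       # back of the queue; stalls are hidden by others
        trace.append(w)
    return trace

workers = [worker(i) for i in range(6)]
print(run_tile(workers, 12))  # [0, 1, 2, 3, 4, 5, 0, 1, 2, 3, 4, 5]
```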

Software Stack and Ecosystem

Graphcore's software stack is anchored by the Poplar SDK, a comprehensive toolchain co-designed with the Intelligence Processing Unit (IPU) to facilitate graph-based programming for machine intelligence workloads. Released as the world's first dedicated graph toolchain for the IPU, Poplar encompasses a graph compiler, runtime environment, and supporting libraries that map computational graphs onto IPU tiles, enabling fine-grained parallelism across thousands of processing elements. Developers can program directly in C++ or Python, expressing algorithms as directed acyclic graphs that leverage IPU-specific features like in-memory computation and bulk synchronous parallelism.

The SDK integrates with established frameworks to broaden accessibility. It provides IPU-enabled backends for PyTorch (including PyTorch Geometric for graph neural networks) and TensorFlow/Keras, allowing users to train and run inference on models with minimal code modification. PopART, a core component, supports ONNX import/export for model portability, while the PopLibs libraries deliver optimized, low-level operations such as tensor manipulations and custom kernels. These integrations have been updated iteratively, with Poplar SDK 3.1 (December 2022) adding PyTorch 1.13 support and enhanced sparse tensor handling.

Complementary tools enhance development and optimization. The PopVision suite includes the Graph Analyser for visualizing IPU graph execution, tile-level performance metrics, and memory usage, alongside the System Analyser for host-IPU interaction profiling. These enable debugging and tuning of large-scale models distributed across IPU-POD systems. The stack supports containerized environments through Docker Hub images, certified under Docker's Verified Publisher Program since November 2021, facilitating reproducible deployments.

The ecosystem fosters scalability via third-party integrations and community resources. Partnerships, such as UbiOps' IPU support added in July 2023, enable dynamic scaling of training jobs in cloud-like setups. Open-source contributions on GitHub, including PopLibs for reusable primitives, encourage custom extensions, though adoption has been critiqued for demanding expert-level tuning to achieve peak efficiency compared to GPU alternatives. Post-SoftBank acquisition in 2024, the stack remains centered on Poplar, with ongoing emphasis on large-model support like efficient fine-tuning of billion-parameter transformers.
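For the PyTorch path, Graphcore's PopTorch wrapper compiles a standard model for the IPU. The following is a minimal sketch along the lines of Graphcore's published examples; exact API details vary across Poplar SDK versions, and running it requires the SDK plus IPU hardware or Graphcore's emulation mode:

```python
# Illustrative sketch of the PyTorch route through Graphcore's stack
# (PopTorch ships with the Poplar SDK). Treat as indicative, not canonical.
import torch
import poptorch

# An ordinary PyTorch model; PopTorch compiles its graph for the IPU.
model = torch.nn.Sequential(
    torch.nn.Linear(128, 64),
    torch.nn.ReLU(),
    torch.nn.Linear(64, 10),
)

opts = poptorch.Options()
opts.deviceIterations(4)  # batches the IPU processes per host call

# Wrap for inference: compiles on first call, then runs on the IPU.
ipu_model = poptorch.inferenceModel(model, options=opts)
x = torch.randn(4, 128)   # first dim = batch size x device iterations
out = ipu_model(x)
print(out.shape)          # torch.Size([4, 10])
```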

Products and Hardware Offerings

IPU Generations and Evolution

Graphcore's first-generation Intelligence Processing Unit (IPU), prototyped in 2016 and commercially launched in 2018, introduced a novel architecture designed specifically for AI workloads, featuring thousands of independent processing tiles interconnected via a custom exchange fabric to hold entire models in on-chip memory, eschewing the data movement bottlenecks of traditional GPUs. This initial design emphasized synchronous parallelism across 1,216 tiles, each running multiple hardware threads, enabling high throughput for the graph-based computations central to deep learning.

In July 2020, Graphcore unveiled its second-generation GC200 IPU, integrated into systems like the IPU-Machine M2000, which roughly tripled on-chip memory to 900 MB per IPU and boosted compute density through refined tile interconnects and enhanced bulk memory management, delivering up to 250 teraFLOPS of 16-bit floating-point performance per unit while supporting scalable pods for exascale AI training. These advancements addressed limitations of the first generation by improving capacity for large models, with each IPU-Machine housing four IPUs connected via 100 GbE fabric for distributed processing, marking a shift toward production-scale deployments in data centers.

The evolution culminated in the Bow IPU, announced in March 2022 and entering shipment shortly thereafter, which applied TSMC's 3D wafer-on-wafer technology to stack the second-generation GC200 die face-to-face with a dedicated power-delivery die, enabling roughly 40% higher clock speeds, reduced power consumption, and denser integration without redesigning the underlying processor logic. Bow systems, such as the Bow Pod with four IPUs aggregating 5,888 cores and 1.4 petaFLOPS of AI compute, extended the architecture's efficiency for hyperscale applications, though adoption remained constrained by ecosystem maturity relative to GPU incumbents. This packaging innovation reflected Graphcore's focus on incremental hardware refinements amid competitive pressures, prior to its 2024 acquisition by SoftBank, which redirected resources toward integrated AI infrastructure rather than standalone generational leaps.

Scale-Up Systems like Colossus

Graphcore's scale-up systems, exemplified by configurations like the Colossus IPU clusters, enable datacenter-scale deployment of Intelligence Processing Units (IPUs) through rack-integrated IPU-POD architectures designed for efficient AI model training and inference. Introduced in December 2018, the initial rackscale IPU-POD utilized first-generation Colossus Mk1 IPUs to deliver over 16 petaFLOPS of mixed-precision compute per 42U rack, with systems of 32 such pods scaling to more than 0.5 exaFLOPS. These systems leverage IPU-Link interconnects for low-latency, high-bandwidth communication, minimizing data movement overhead compared to traditional GPU clusters reliant on PCIe or InfiniBand.

The second-generation systems, launched in July 2020, advanced with the IPU-Machine M2000—a 1U appliance housing four Colossus Mk2 GC200 IPUs, providing 1 petaFLOP of AI compute, up to 900 MB of in-processor memory per IPU, and support for up to 450 GB of exchange memory with 180 TB/s bandwidth. Rack-scale examples include the IPU-POD64, comprising 16 M2000 units for 64 IPUs, and the IPU-POD128 with 32 M2000 units for 128 IPUs, 8.2 TB of total exchange memory, and enhanced scale-out via 100 GbE fabrics. These configurations support disaggregated host-to-IPU ratios, allowing flexible integration with standard servers from partners such as Dell and HPE, and extend to datacenter-scale clusters of up to 64,000 IPUs.

Key features of these scale-up systems emphasize massive parallelism for large models, with the first-generation Colossus Mk1 supporting up to 4,096 IPUs and optimized topologies for graph-based workloads via the Poplar software stack. Power efficiency is highlighted in configurations like 16 Mk2 IPUs delivering 4 petaFLOPS at 7 kW in a 4U unit, though real-world deployment depends on cooling and interconnect density. By 2021, expanded POD designs like the POD128 facilitated training of models exceeding GPT-scale, with bandwidth exceeding 10 PB/s in projected ultra-scale systems.
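The pod configurations above follow directly from the per-unit figures; a short Python sketch, using only the numbers quoted in this section, makes the scaling arithmetic explicit:

```python
# Pod-level totals derived from the IPU-Machine M2000 figures quoted above:
# each 1U M2000 holds 4 GC200 IPUs and delivers ~1 PFLOP of FP16 AI compute.
IPUS_PER_M2000 = 4
PFLOPS_PER_M2000 = 1.0

pods = {"IPU-POD64": 16, "IPU-POD128": 32}  # M2000 units per pod
for name, m2000_units in pods.items():
    ipus = m2000_units * IPUS_PER_M2000
    pflops = m2000_units * PFLOPS_PER_M2000
    print(f"{name}: {m2000_units} x M2000 = {ipus} IPUs, ~{pflops:.0f} PFLOPS FP16")

# IPU-POD64:  16 x M2000 = 64 IPUs,  ~16 PFLOPS FP16
# IPU-POD128: 32 x M2000 = 128 IPUs, ~32 PFLOPS FP16
```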

Integration with Cloud and Software Tools

Graphcore's Poplar SDK serves as the primary software interface for its Intelligence Processing Units (IPUs), enabling seamless integration with popular machine learning frameworks such as TensorFlow (versions 1 and 2, with full support for TensorFlow XLA compilation) and PyTorch. This co-designed stack facilitates efficient mapping of computational graphs to IPU hardware, supporting features like in-processor memory streaming and parallel execution optimized for AI workloads. Developers can access pre-optimized models and datasets through partnerships, including Hugging Face's Transformers library adapted for IPU acceleration as of May 2022.

Containerization support enhances deployment flexibility, with official Poplar SDK images available on Docker Hub since November 2021, verified under Docker's Publisher Program. These images include tools for interacting with IPUs and running applications in isolated environments. Kubernetes integration is provided for orchestration in scale-up systems like IPU-PODs, allowing automated provisioning and management of IPU clusters alongside schedulers such as Slurm. Additional ecosystem expansions, such as UbiOps platform support added in July 2023, enable dynamic scaling of IPU jobs for training and inference.

For cloud deployment, Graphcore IPUs have been accessible via Microsoft Azure since the preview announced in late 2019, permitting users to provision IPU instances without on-premises hardware. Graphcore also partnered with G-Core Labs, which launched an IPU cloud service in June 2022, bundling Poplar SDK access for rapid prototyping and production-scale AI tasks. Partnerships with further infrastructure providers extend IPU usability in hybrid cloud environments, though adoption has remained limited compared to GPU-centric alternatives.

Funding Trajectory and Financial Challenges

Major Investment Rounds

Graphcore secured its Series B funding round of $30 million on July 20, 2017, led by Atomico with participation from investors including Samsung Catalyst Fund, Dell Technologies Capital, Amadeus Capital Partners, Foundation Capital, Pitango Venture Capital, C4 Ventures, and Robert Bosch Venture Capital. This round supported the development of its Intelligence Processing Unit (IPU) technology for machine intelligence applications. The company followed with a Series C round of $50 million in November 2017, led by Sequoia Capital and including Dell as a participant.

In December 2018, Graphcore closed a $200 million Series D round, achieving unicorn status with a post-money valuation of $1.7 billion; key investors included Microsoft, BMW i Ventures, Sofina, Merian Global Investors (now Chrysalis Investments), and Draper Esprit. This funding accelerated IPU production scaling and partnerships for AI hardware deployment. Graphcore extended its Series D with an additional $150 million raised on February 25, 2020, from investors including Baillie Gifford, Mayfair Equity Partners, and Chrysalis Investments, bringing the total for the round to approximately $350 million and elevating the valuation to $1.95 billion.

The final major venture round was Series E, closing at $222 million on December 29, 2020, led by the Ontario Teachers' Pension Plan Board with support from funds managed by Fidelity International and Baillie Gifford alongside existing backers, resulting in a $2.77 billion valuation. Across these rounds from 2017 to 2020, Graphcore raised over $700 million in total equity funding to fuel R&D and market expansion amid competition in AI accelerators.

Revenue Realities Versus Valuation Hype

Graphcore's valuation surged amid the AI hardware boom, reaching $2.77 billion in December 2020 following a $222 million funding round led by the Ontario Teachers' Pension Plan Board and others, positioning it as a high-profile challenger to Nvidia in specialized AI processing. This peak reflected investor enthusiasm for its Intelligence Processing Unit (IPU) technology, with earlier rounds including a $200 million Series D in 2018 that elevated it to unicorn status at approximately $1.7 billion. However, these valuations were driven more by speculative promise than operational traction, as the company invested heavily in R&D and scaling without commensurate commercial uptake.

In stark contrast, Graphcore's revenue remained negligible relative to its funding and hype. For the year ended December 31, 2022—the most recent full-year figures publicly available pre-acquisition—revenue totaled just $2.7 million, a 46% decline from 2021, amid broader market challenges in AI chip adoption beyond dominant GPU ecosystems. Pre-tax losses ballooned to $205 million that year, reflecting high operational burn rates from a headcount of around 500 and expansive hardware development, with cash reserves strained despite over $700 million raised cumulatively. These figures underscored a core disconnect: while Graphcore marketed IPUs as superior for certain workloads via massive on-chip memory and parallelism, customer inertia toward established CUDA software stacks limited deployments, resulting in revenue equal to mere fractions of a percent of its valuation.

The valuation-revenue mismatch culminated in SoftBank's 2024 acquisition for an estimated $500-600 million—less than a quarter of the 2020 peak—effectively a down round that wiped out significant gains and highlighted over-optimism in early-stage AI hardware bets. Pre-acquisition filings revealed ongoing struggles to convert pilot programs into scalable deployments, with growth stymied by ecosystem lock-in, prompting headcount reductions of over 20% by late 2022. This trajectory exemplifies how venture funding in AI semiconductors often prioritized technological novelty over proven market fit, leading to hype-fueled multiples unsupported by fundamentals.

Acquisition and Strategic Shifts

SoftBank Takeover in 2024

On July 11, 2024, SoftBank Group Corp. announced the acquisition of Graphcore, the UK-based developer of Intelligence Processing Units (IPUs) for AI workloads, converting it into a wholly owned subsidiary. The financial terms were not officially disclosed, though reports indicated a purchase price ranging from approximately $400 million to over $600 million, a sharp decline from Graphcore's peak valuation of $2.8 billion in 2020. This transaction followed months of speculation, as Graphcore had been seeking buyers since at least February 2024 amid competitive pressures in an AI chip market dominated by Nvidia and ongoing financial strains, including just $4 million in revenue for 2023 despite over $700 million in cumulative investment.

Graphcore's CEO Nigel Toon described the deal as a "positive outcome" that would enable accelerated development of next-generation AI compute infrastructure under SoftBank's resources, emphasizing continuity in operations and integration with SoftBank's broader AI ambitions, including synergies with its chip-design subsidiary Arm. SoftBank, led by Masayoshi Son, positioned the acquisition as part of its strategic push toward artificial general intelligence (AGI), leveraging Graphcore's IPU technology for scalable AI training and inference systems. The move marked SoftBank's second major semiconductor investment, following its 2016 purchase of Arm for $32 billion, and reflected a pattern of acquiring distressed AI hardware innovators to bolster its ecosystem amid global chip shortages and escalating demand for alternatives to GPU-centric architectures.

The acquisition faced no major regulatory hurdles and closed promptly, with Graphcore retaining its Bristol headquarters and commitment to UK-based R&D, though it highlighted broader challenges for European AI startups in scaling against US incumbents. Industry analysts noted that while Graphcore's MIMD-based IPUs offered theoretical advantages in certain parallel processing tasks over Nvidia's SIMD GPUs, persistent ecosystem lock-in and slower market adoption had eroded its standalone viability, making SoftBank's deep pockets essential for survival.

Post-Acquisition Expansions and Plans

Following its acquisition by SoftBank Group Corp. on July 11, 2024, Graphcore announced intentions to expand hiring in the United Kingdom and globally to bolster its engineering and research capabilities. This included a renewed recruitment drive starting in November 2024, targeting roles in AI hardware development and software optimization to align with SoftBank's broader artificial intelligence infrastructure goals.

A key post-acquisition initiative materialized in October 2025, when Graphcore, as a SoftBank subsidiary, committed £1 billion (approximately $1.3 billion) to AI infrastructure development in India over the next decade. The investment focuses on scaling AI chip research and engineering, including the establishment of an AI Engineering Campus in Bengaluru as Graphcore's first office in the country. This expansion aims to create up to 500 semiconductor-related jobs, emphasizing design, fabrication support, and integration of Intelligence Processing Units (IPUs) for AI workloads.

The plans integrate with SoftBank's global AI compute strategy, which includes multi-trillion-dollar commitments to advanced compute resources, positioning Graphcore's IPU technology as a complementary asset to GPU-dominant ecosystems. No further large-scale geographic expansions or product roadmap shifts have been publicly detailed as of October 2025, though the acquisition has enabled Graphcore to leverage SoftBank's resources for sustained R&D amid prior commercial challenges.

Competitive Landscape

Rivalry with Nvidia and GPU Dominance

Graphcore positioned its Intelligence Processing Units (IPUs) as a direct architectural alternative to Nvidia's graphics processing units (GPUs), emphasizing massive on-chip memory (up to 900 MB SRAM per IPU) and fine-grained parallelism tailored for AI and machine learning workloads, in contrast with Nvidia's reliance on high-bandwidth memory (HBM) and tensor cores. In benchmarks published by Graphcore in December 2020, the IPU-M2000 (four MK2 IPUs) demonstrated up to 60x higher throughput and 16x lower latency than a single Nvidia A100 GPU in specific low-latency AI tasks, such as BERT inference. Independent evaluations, including a 2021 study on cosmological simulations, showed mixed results: Graphcore's MK1 IPU outperformed Nvidia's V100 GPU in some deep learning scenarios but lagged in others due to software immaturity. These claims highlighted potential IPU advantages in memory-bound workloads, yet Graphcore's self-reported metrics often compared multi-IPU clusters to single GPUs, drawing skepticism over apples-to-oranges equivalency.

Nvidia maintained overwhelming dominance in the AI accelerator market, capturing an estimated 86% share of AI GPU deployments by 2025, driven by its CUDA software ecosystem, which locked in developers through optimized libraries, vast community support, and seamless integration with frameworks like PyTorch and TensorFlow. This moat proved insurmountable for Graphcore, whose Poplar SDK required significant porting effort from CUDA codebases, limiting adoption among enterprises reliant on Nvidia's mature tooling and scale. By 2023-2024, Graphcore's revenue remained under $100 million annually despite $700 million in funding, contrasting with Nvidia's trillions in market capitalization fueled by AI demand, as customers prioritized compatibility over raw hardware specs.

The rivalry underscored GPU dominance as a barrier to IPU penetration: while Graphcore targeted niches like sparse models or edge inference with claims of 11x better price-performance versus Nvidia's DGX A100 in its announcements, real-world scalability issues and Nvidia's iterative GPU advancements (e.g., the H100's leaps in tensor performance) eroded these edges. After the 2024 SoftBank acquisition, Graphcore pivoted toward hybrid IPU-GPU integrations, implicitly acknowledging Nvidia's entrenched position rather than seeking outright displacement. This dynamic reflected broader causal factors in AI hardware: software inertia and network effects favored incumbents, rendering even superior architectures secondary without equivalent developer mindshare.

Performance Benchmarks and Claims

Graphcore has asserted superior performance for its Intelligence Processing Units (IPUs) in specific AI workloads, particularly those benefiting from massive parallelism and sparsity handling via its MIMD architecture. In December 2020, the company claimed its IPU-M2000 system delivered up to 18x higher training throughput and 600x higher inference throughput over Nvidia A100 GPUs in select models like BERT and ResNet-50, based on in-house optimizations with the Poplar SDK. These assertions emphasized IPU advantages in sparsity handling and tile-based processing for irregular computations, contrasting with Nvidia's SIMT GPU approach.

Participation in standardized MLPerf training benchmarks provided more verifiable data. In MLPerf v1.1 (December 2021), Graphcore reported the fastest single-server BERT time-to-train at 10.6 minutes using an IPU-POD system, while its IPU-POD16 achieved 28.3 minutes for ResNet-50, edging the Nvidia A100's 29.1 minutes—gains attributed to software refinements in the Poplar and PopART frameworks. Earlier, in MLPerf v1.0 (June 2021), results were less favorable, with Graphcore's ResNet-50 time at 32.12 minutes versus Nvidia's 28.77 minutes on a DGX A100.
MLPerf Benchmark | Graphcore Configuration | Graphcore Time-to-Train | Nvidia Time-to-Train | Notes
ResNet-50 (v1.0) | IPU-POD (unspecified scale) | 32.12 minutes | 28.77 minutes | Closed division; Nvidia faster despite similar power envelopes.
ResNet-50 (v1.1) | IPU-POD16 | 28.3 minutes | 29.1 minutes | Edge for Graphcore via software gains; single-server closed division.
BERT (v1.1) | IPU-POD (single-server) | 10.6 minutes | Not directly compared (Nvidia multi-node faster overall) | Graphcore's claimed fastest single-server result.
Independent scrutiny reveals limitations in these claims. A 2021 SemiAnalysis evaluation of MLPerf v1.0 data compared 16 IPUs (totaling ~13,000 mm² of silicon on a 7 nm process) against 8 A100s (~6,600 mm²), finding inferior training performance, performance per dollar (a 1.3-1.6x deficit), and efficiency per mm² for Graphcore, despite matched power consumption (~6-7 kW per server)—issues linked to poor scaling beyond small pods and immature software versus Nvidia's mature ecosystem. Nvidia consistently led MLPerf overall, with up to 2.2x gains in subsequent rounds via ecosystem optimizations.

Later studies confirm mixed outcomes. A 2024 arXiv evaluation of IPUs alongside GPUs and other accelerators noted IPU strengths in flexible SIMD/SIMT mapping for diverse workloads but no broad throughput superiority in standard inference or training, where GPUs excelled in optimized scenarios. In graph algorithms, a 2024 paper found IPUs outperforming GPUs in heterogeneous parallel execution times due to independent core control. Absent consistent post-2021 MLPerf submissions, claims of IPU parity or edges remain confined to niche cases, undermined by Nvidia's dominance in scalable, general-purpose AI via software maturity and market inertia.
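The SemiAnalysis comparison reduces to simple arithmetic on the published figures. The sketch below reproduces the silicon-area totals and a crude per-mm² efficiency proxy (inverse time-to-train), under the assumption that time-to-train is the relevant throughput measure:

```python
# Reproducing the silicon-area comparison cited above (MLPerf v1.0, 2021):
# 16 IPUs vs 8 Nvidia A100s at roughly matched server power.
graphcore = {"chips": 16, "mm2_per_chip": 823, "resnet50_min": 32.12}
nvidia    = {"chips": 8,  "mm2_per_chip": 826, "resnet50_min": 28.77}

for name, s in (("Graphcore", graphcore), ("Nvidia", nvidia)):
    total_mm2 = s["chips"] * s["mm2_per_chip"]
    # Throughput proxy: inverse time-to-train, normalized per mm^2 of silicon.
    perf_per_mm2 = (1 / s["resnet50_min"]) / total_mm2
    print(f"{name}: {total_mm2} mm^2 total, {perf_per_mm2:.2e} (1/min)/mm^2")

# Graphcore: 13,168 mm^2 (~13,000) and a slower time-to-train, hence the
# lower efficiency per mm^2 that SemiAnalysis concluded above.
```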

Market Adoption Barriers

Graphcore's IPUs encountered substantial market adoption barriers stemming from the immaturity of its software ecosystem relative to Nvidia's CUDA platform, which boasts extensive libraries, frameworks, and developer familiarity accumulated over nearly two decades. Porting workloads to Graphcore's Poplar SDK often necessitated significant code refactoring and optimization, deterring enterprises reliant on established GPU-optimized tools like cuDNN and native CUDA implementations. This friction was compounded by Poplar's focus on IPU-specific features, such as fine-grained parallelism and sparsity handling, which provided advantages in niche inference tasks but lagged in seamless integration for broad AI pipelines.

Architectural divergence from conventional GPUs represented another key impediment, as the IPU's MIMD (multiple instruction, multiple data) design and on-chip memory model required developers to abandon GPU-centric mental models, leading to steeper learning curves and higher initial deployment costs. Early adopters reported challenges in achieving consistent performance across diverse workloads, particularly in large-scale training, where IPU scaling to thousands of units exposed bottlenecks in inter-chip communication and software maturity. Independent benchmarks occasionally highlighted IPU edges in memory-bound operations, but these were insufficient to overcome the ecosystem lock-in, with major hyperscalers prioritizing Nvidia's plug-and-play compatibility for trillion-parameter models.

Customer acquisition hurdles further stalled penetration, as Graphcore initially targeted research-oriented and edge AI segments, missing timely traction in the high-volume cloud and datacenter markets dominated by Nvidia-aligned partnerships. High-profile setbacks, including the 2023 loss of a strategic deal with Microsoft, eroded confidence among potential buyers wary of adoption risks without proven hyperscale viability. These dynamics manifested in tepid revenue—merely $2.7 million in 2022 against a prior $2.8 billion valuation—reflecting limited commercial deployments beyond pilot programs.

Controversies and Criticisms

In early 2024, Dutch cloud provider HyperAI filed a lawsuit against Graphcore, alleging breach of contract over a failed partnership to develop AI cloud services powered by Graphcore's Intelligence Processing Units (IPUs). The dispute stemmed from initial discussions in February 2021, when HyperAI approached Graphcore to integrate its Bow POD16 hardware into a cloud platform, paying €121,000 via a German intermediary for the hardware, licenses, and three years of support. By February 2022, the parties had agreed to collaborate toward a formal cloud partnership, but delays arose from misconfigurations in the shipped hardware (ordered in April 2022 and delivered in August 2022), pushing HyperAI's platform launch to December 2, 2022.

HyperAI claimed Graphcore abruptly withdrew three days after the launch, on December 5, 2022, reneged on exclusivity commitments, and denied the validity of the hardware sale despite delivery, rendering HyperAI's investment worthless and halting its operations. HyperAI CEO Andrew Foe attributed these actions to Graphcore's pivot to an exclusive European deal with G-Core Labs and internal issues like layoffs, describing the behavior as a betrayal that exhausted his personal savings. Graphcore, facing its own financial pressures—including 2022 revenue of $2.7 million (down 46% year-over-year) and losses of $204.6 million—responded by stating it "vigorously disputes HyperAI's meritless claims" and declined further comment on the pending litigation.

The case highlighted tensions in early AI hardware partnerships amid Graphcore's struggles to scale against competition from Nvidia, though no resolution has been publicly reported as of mid-2024, coinciding with Graphcore's acquisition by SoftBank. No other significant legal disputes with partners were identified in public records.

Management and Strategic Errors

Graphcore's management faced criticism for architectural decisions that prioritized a novel MIMD-based Intelligence Processing Unit (IPU) design, featuring massive on-chip SRAM but lacking high-bandwidth memory (HBM), rendering it ill-suited for the memory-intensive workloads, such as large language model training, that became prevalent after 2020. In 2021 benchmarks, systems with 16 IPUs, utilizing twice the silicon area of comparable A100 GPUs (823 mm² vs. 826 mm² per chip), underperformed in MLPerf training tasks such as ResNet-50 and BERT even after hand-tuning, while matching the power draw of an 8x A100 setup at a higher cost per unit of performance. This stemmed from scalability limitations and an underdeveloped software stack, contrasting with Nvidia's mature CUDA ecosystem, which executives including CEO Nigel Toon acknowledged required substantial investment but failed to match in adoption.

Commercially, leadership erred in pivoting repeatedly between targeting hyperscalers like Microsoft—losing a major 2021 deal due to buggy Poplar software and abrupt Zoom announcements without post-mortems—and smaller startups, leading to inventory mismanagement and sales confusion among staff. Partnership disputes exacerbated the issues: in 2023, cloud provider HyperAI accused Graphcore of reneging on a 2021 agreement by prioritizing an undisclosed exclusive with G-Core Labs, delaying POD16 system deliveries ordered in April 2022, and withdrawing support after layoffs in January 2023, prompting legal action. Such decisions contributed to talent drain, including key executives departing for Meta and elsewhere by 2023, amid low morale from unfulfilled hype as a "Nvidia rival."

Financially, overambitious pursuits like $120 million "brain-scale" supercomputer plans strained resources without commensurate revenue, yielding just $2.7 million in 2022 (a 46% drop from 2021) against $204.6 million in losses, necessitating layoffs that cut headcount from 620 to 418 by late 2023. The inability to secure pension fund backing or additional rounds despite a $2.8 billion peak valuation in 2020 culminated in the July 2024 SoftBank acquisition for approximately $500 million—below total funding raised—wiping out employee share value and signaling a validation failure for the core IPU technology against ecosystem lock-in. These missteps reflected broader executive shortcomings in aligning technical innovation with market realities dominated by Nvidia's execution.

Broader Impact

Applications in Specific AI Workloads

Graphcore's Intelligence Processing Units (IPUs) have been applied to natural language processing (NLP) tasks, enabling efficient training and inference of transformer models through integrations with frameworks like Hugging Face Optimum. In 2022, Graphcore expanded support for a broader range of NLP modalities and tasks, including text classification and generation, by optimizing pre-trained models for IPU execution. Providers such as NLP Cloud deployed IPU-hosted models for AI-as-a-service in 2023, leveraging partners like Gcore for scalable inference.

In computer vision workloads, IPUs facilitate accelerated image processing and model scaling, as demonstrated by Graphcore's 2021 implementation of EfficientNet on IPU-POD systems, achieving training completion in under two hours for large-scale datasets. The architecture supports higher-accuracy vision models by exploiting IPU parallelism for convolutional operations, outperforming GPUs in memory-bound scenarios according to independent evaluations.

Graph Neural Networks (GNNs), used in recommendation systems and related domains, benefit from the IPU's fine-grained parallelism and MIMD execution model, enabling breakthroughs in sparse graph computations. Applications extend to drug discovery, where GNNs model molecular interactions for target identification. Bioinformatics workloads, including DNA and protein sequence analysis, see significant speedups on IPUs; a 2023 study reported 10x gains over leading GPUs for these tasks, attributed to the IPU's high throughput in alignment algorithms. In drug discovery, biotech firm LabGenius utilized IPU-accelerated BERT models in 2022 to reduce experiment turnaround from months to weeks, enhancing candidate screening for cancer and inflammatory treatments. Genome assembly pipelines also leverage IPUs for faster alignment of protein and DNA molecules, as reported in published research.

IPUs support hybrid AI-HPC simulations by using machine learning surrogate models to replace compute-intensive bottlenecks, transforming traditional high-performance computing in fields like physics. In particle physics, early evaluations showed IPU potential for event reconstruction and simulation due to efficient handling of irregular data patterns. These applications highlight IPU strengths in workloads requiring massive parallelism and low-latency memory access, though adoption remains limited by ecosystem maturity compared to GPU alternatives.

Contributions Versus Overstated Promises

Graphcore's development of the Intelligence Processing Unit (IPU) represented a significant architectural departure in AI hardware, introducing a processor with up to 1,472 independent cores per chip, 900 MB of on-chip SRAM, and specialized support for sparse computations and irregular memory access patterns, which enabled more efficient handling of certain operations than GPU architectures reliant on high-bandwidth memory hierarchies. This design facilitated advancements in workloads like graph algorithms and surrogate modeling in high-performance computing (HPC), where IPUs demonstrated superior execution times over GPUs in heterogeneous environments. Additionally, Graphcore contributed to the open-source ecosystem by integrating IPU support into frameworks such as TensorFlow and PyTorch, enabling developers to port and optimize models for its hardware without full rewrites.

Despite these technical merits, Graphcore's assertions of broad superiority—such as claims of 11x price-performance gains over Nvidia's DGX A100 systems in scaled configurations—proved overstated in practice, as IPUs underperformed in large-scale training dominated by dense matrix operations, where Nvidia's mature ecosystem and software optimizations maintained dominance. Independent evaluations highlighted IPU strengths in niche tasks like skewed matrix multiplications but revealed limitations in general AI scaling, contributing to limited commercial traction beyond specialized applications.

The company's peak valuation of over $2.8 billion in 2020 contrasted sharply with its commercial trajectory, marked by revenue shortfalls and an inability to secure major contracts, ultimately leading to acquisition by SoftBank on July 11, 2024, for a reported $500 million—less than cumulative investor funding—amid struggles to compete in a GPU-centric market. This outcome underscored how Graphcore's hardware innovations, while pushing boundaries in parallelism and efficiency for targeted AI/HPC use cases, were hampered by ecosystem immaturity and a failure to disrupt entrenched incumbents, leaving early hype about revolutionizing AI compute unfulfilled.

References
