
ARC (processor)

from Wikipedia

ARC
Designer: ARC International PLC
Bits: 32-bit, 64-bit
Introduced: 1996
Version: ARCv3
Design: RISC
Type: Load–store
Encoding: Variable (16- and 32-bit)
Branching: Compare and branch
Endianness: Bi
Extensions: APEX user-defined instructions
Registers: 16 or 32 including SP; user can increase to 60

Argonaut RISC Core (ARC) is a family of 32-bit and 64-bit reduced instruction set computer (RISC) central processing units (CPUs) originally designed by ARC International.

ARC processors are configurable and extensible for a wide range of uses in system on a chip (SoC) devices, including storage, digital home, mobile, automotive, and Internet of things (IoT) applications. They have been licensed by more than 200 organizations and are shipped in more than 1.5 billion products per year.[1]

ARC processors employ the 16-/32-bit ARCompact compressed instruction set architecture (ISA), which provides good performance and code density for embedded and host SoC applications.

History

The ARC concept was developed initially within Argonaut Games through a series of 3D pipeline development projects starting with the Super FX chip for the Super Nintendo Entertainment System.

In 1995, Argonaut was split into Argonaut Technologies Limited (ATL), which had a variety of technology projects, and Argonaut Software Limited (ASL).

At the start of 1996, the General Manager of Argonaut, John Edelson, started reducing ATL projects such as BRender and motion capture and investing in the development of the ARC concept. In September 1996, Rick Clucas decided that the value of the ARC processor lay in other people using it rather than in Argonaut doing projects with it, and asked Bob Terwilliger to join as CEO; Clucas then took on the role of CTO.

In 1997, following investment by Apax Partners, ATL became ARC International and fully independent from Argonaut Games. Before its initial public offering on the London Stock Exchange, underwritten by Goldman Sachs and five other investment banks, the company acquired three related technology companies: MetaWare in Santa Cruz, California (development and modeling software),[2] VAutomation in Nashua, New Hampshire (peripheral semiconductor IP), and Precise Software in Nepean, Ontario (RTOS).

In 2009, ARC International was acquired by Virage Logic.[3] In 2010, Virage was acquired by Synopsys, and ARC processors became part of the Synopsys DesignWare series.[4]

In April 2020 Synopsys released the ARCv3 ISA with 64-bit support.[5]

In November 2023, Synopsys released the RISC-V compatible ARC-V processor IP as an extension of its ARC product line.[6]

In January 2026, Synopsys announced that it was selling its processor IP business, including its ARC product line, to GlobalFoundries.[7]

Design configuration

Designers can differentiate their products by using patented configuration technology to tailor each ARC processor instance to meet specific performance, power and area requirements.

Configuration of the ARC processors occurs at design time, using the ARChitect processor configurator.[8] The core was designed to be extensible, allowing designers to add their own custom instructions that can significantly increase performance or reduce power consumption.

Unlike most embedded microprocessors, extra instructions, registers, and functions can be added in a modular fashion. Customers analyse the task, break down the operations, and then choose the appropriate extensions, or develop their own, to create their own custom microprocessor. They might optimise for speed, energy efficiency, or code density. Extensions can include, for example, a memory management unit (MMU), a fast multiplier–accumulator, a Universal Serial Bus (USB) host, a Viterbi path decoder, or a user's proprietary RTL functions.

The processors are synthesizable and can be implemented in any foundry or process, and are supported by a complete suite of development tools.[9]

from Grokipedia
The ARC processor is a family of highly configurable 32- and 64-bit reduced instruction set computer (RISC) intellectual property (IP) cores developed by Synopsys for embedded applications in system-on-chip (SoC) designs, offering optimized performance, power, and area (PPA) efficiency.[1] Originating as the Argonaut RISC Core in the early 1990s, it has evolved from gaming hardware accelerators into one of the industry's leading licensable processor architectures, with billions of units shipped annually across diverse sectors.[2][1] The ARC processor traces its roots to Argonaut Technologies, founded by Jez San and Rick Clucas as a spin-off from Argonaut Games, where it was first developed in 1993 as the core for the Super FX coprocessor chip used in select Super Nintendo Entertainment System (SNES) cartridges to enable 3D graphics rendering.[2] In 1998, the technology was spun off into ARC International plc, a dedicated IP licensing company that went public on the London Stock Exchange in 2000 with a market valuation of approximately £1.1 billion.[2] The company expanded through acquisitions such as MetaWare in 1999, but faced challenges leading to its acquisition by Virage Logic in 2009 for £25.2 million, integrating ARC's configurable processors into Virage's broader IP portfolio.[3][4] Synopsys then acquired Virage Logic in 2010 for $315 million, incorporating the ARC family into its DesignWare IP solutions and advancing its development over the subsequent 15 years.[5] Under Synopsys, the ARC portfolio has grown to encompass multiple instruction set architectures (ISAs) and specialized families tailored for demanding embedded workloads. 
The classic ARC processors utilize proprietary ARCv2 and ARCv3 ISAs, featuring the ARC Processor eXtension (APEX) technology for custom instruction addition to meet specific application needs.[1] Key families include the ARC EM series (e.g., EM4, EM6, EM9D) for energy-efficient, low-power embedded control in IoT and automotive systems; the ARC HS series (e.g., HS4x, HS5x) for high-performance 32/64-bit processing in networking and multimedia; the ARC VPX series (e.g., VPX2, VPX5) for digital signal processing (DSP) in audio and vision applications; and the ARC EV series for embedded vision tasks.[6][7] In 2023, Synopsys introduced the ARC-V family, based on the open-standard RISC-V ISA, providing mid-range to ultra-low-power options with extensibility for AIoT, safety-critical automotive, and neural processing applications, while leveraging the growing RISC-V ecosystem, with general availability beginning in 2024.[8][9] The ARC processors are deployed in a wide array of applications, including consumer electronics, industrial automation, telecommunications, and functional safety-certified automotive systems compliant with ISO 26262 standards up to ASIL-D.[6] They are supported by a comprehensive ecosystem, including the MetaWare Development Toolkit for software optimization, free and open-source operating systems, middleware, and third-party integrations through the ARC Access Program, enabling efficient design, debugging, and deployment.[1] This configurability and broad compatibility have solidified ARC's position as the second-most shipped processor IP by volume in the embedded market.[1]

History

Origins in Gaming

Argonaut Software was founded in 1982 by teenage entrepreneur Jez San, initially focusing on game development and software tools for platforms like the Commodore 64 and Atari ST.[10] By the early 1990s, the company shifted toward hardware innovation, driven by San's interest in 3D graphics and the limitations of existing console architectures for rendering complex visuals.[2] This pivot led to the development of the Super FX chip in 1993, a coprocessor designed specifically for the Super Nintendo Entertainment System (SNES) to enable enhanced 3D rendering capabilities. The chip incorporated the first iteration of the Argonaut RISC Core (ARC), a programmable 16-bit RISC processor operating at an effective clock speed of 10.5 MHz (with an internal rate of 21 MHz halved by a bus divider), featuring a multi-stage pipeline for efficient instruction execution and no integrated cache to keep the design compact.[2][11] Targeted at accelerating polygon-based graphics by up to 200 times compared to software-only approaches on the SNES, the Super FX allowed developers to perform tasks like 3D mapping, sprite scaling, and pixel shading through custom code rather than fixed hardware functions.[10] The Super FX emerged from a close collaboration between Argonaut and Nintendo, who invested approximately $2 million in its development after being impressed by an early 3D demo from San's team.[10] This partnership resulted in the chip's debut in landmark titles, most notably Star Fox (1993 in Japan, 1994 internationally), where it powered real-time polygonal rendering in a rail shooter format, marking one of the first consumer console games to feature true 3D graphics.[2] The success of Star Fox and subsequent Super FX-enhanced games like Stunt Race FX demonstrated the ARC core's potential beyond fixed gaming hardware, paving the way for its evolution into a more flexible technology. 
In the mid-1990s, Argonaut explored further gaming applications of the ARC core, including a custom chip for Hasbro's planned VR console that integrated three ARC processors to handle immersive 3D environments, though the project was ultimately canceled.[2] This period highlighted the core's adaptability for specialized gaming silicon before Argonaut began transitioning it toward configurable intellectual property for broader embedded applications.

Development as Configurable IP

In 1998, ARC International was spun out from Argonaut Software as a standalone company focused on licensing 32-bit configurable RISC processor cores for embedded systems.[12] This transition marked a shift from fixed-function gaming hardware to flexible intellectual property (IP) that SoC designers could tailor to specific applications, emphasizing customization for performance, power, and area constraints.[13] In 1999, ARC acquired MetaWare, a developer of software tools including compilers, to bolster its ecosystem for processor development.[14]

A pivotal advancement came in 2000 with the introduction of the ARChitect tool, a graphical interface enabling users to configure ARC processors by adding custom instructions, peripherals, and extensions without deep hardware design expertise.[15] That same year, ARC International launched the ARC 600 family, featuring a load/store RISC architecture with configurable pipeline depths ranging from 4 to 8 stages to balance speed and efficiency in embedded tasks.[16] The company's initial public offering (IPO) on the London Stock Exchange in September 2000 raised significant capital, valuing ARC at over £1 billion and funding further IP development.[17] Partnerships with FPGA vendors, notably Xilinx in July 2000, enabled soft-core implementations of ARC processors in programmable logic, broadening accessibility for prototyping and low-volume production.[18]

By 2004, ARC introduced the ARC 700 family, building on the 600 series with integrated DSP extensions for signal processing workloads and support for mixed 16/32-bit instruction modes to optimize code density in resource-constrained environments.[19] These cores gained traction in consumer electronics, powering applications in set-top boxes for multimedia decoding and printers for efficient control logic, where their configurability delivered low power consumption (up to 75% reductions in some scenarios) and compact silicon area.[20][21] Early licensees, including Digeo for next-generation set-top boxes, highlighted ARC's role in accelerating time-to-market for power-sensitive devices.[20]

Acquisition by Synopsys and Expansion

ARC International faced significant financial challenges amid the 2008-2009 global economic downturn, with revenue declining 35% in the first half of 2009 to $12.3 million due to reduced semiconductor demand.[22] In December 2009, Virage Logic Corporation acquired ARC International plc in an all-cash transaction valued at approximately £25.2 million (about $42 million), integrating ARC's configurable processor IP into Virage's portfolio of embedded memory and logic libraries.[23] Shortly thereafter, in June 2010, Synopsys, Inc. completed its acquisition of Virage Logic for $315 million, bringing ARC processors under Synopsys' DesignWare IP umbrella and enabling broader distribution within the semiconductor design ecosystem.[5] This integration allowed Synopsys to leverage ARC's expertise in customizable RISC cores to complement its existing IP offerings, targeting embedded applications in consumer electronics, automotive, and networking.[24]

Following the acquisition, Synopsys accelerated ARC's evolution with the introduction of the ARCv2 instruction set architecture (ISA) in the early 2010s, emphasizing enhanced efficiency through improved vector processing and low-power optimizations for embedded systems.[25] A significant milestone came in 2020 with the launch of 64-bit support in the ARC HS series, featuring the ARCv3 ISA that delivered up to 3x performance gains for high-end embedded workloads while maintaining configurability.[26] Additionally, starting in 2016, Synopsys added AI and machine learning extensions to the ARC EV series, incorporating vector processing units and deep neural network accelerators optimized for embedded vision tasks, such as real-time image enhancement in automotive and consumer devices.[27]

Key events underscoring this expansion include the 2025 ARC Processor Virtual Summit, where Synopsys highlighted ecosystem advancements, including partnerships for software toolchains and safety-certified implementations.[28] By 2025, Synopsys reported that ARC processors had shipped in billions of system-on-chips (SoCs) annually across diverse markets, reflecting robust growth.[1] To address intensifying competition from ARM and RISC-V architectures, Synopsys emphasized ARC's extreme configurability, culminating in the 2023 launch of the ARC-V family based on open RISC-V standards, which extended customization options while supporting 32-bit and 64-bit variants for scalable deployments.[29] This strategic pivot reinforced ARC's position in power-efficient, application-specific processing amid rising demands for open-source compatibility.[30]

Architecture

Instruction Set Evolution

The instruction set architecture (ISA) of the ARC processor began with the ARCompact ISA, introduced by ARC International in 2001 with the ARCtangent-A5 processor core.[31] This proprietary ISA featured a mixed-length format combining 16-bit and 32-bit instructions, similar to ARM's Thumb compression scheme, to optimize code density for embedded systems by reducing memory requirements by up to 30%.[32] As a load/store RISC design, ARCompact included over 70 base instructions that could be freely intermixed, enabling efficient execution of common arithmetic, logical, and control-flow operations while maintaining a 32-bit register file and addressing model.[33] The architecture emphasized configurability, allowing custom instructions to be added without disrupting the core set, which supported its use in power-sensitive applications during the 2000s.[16]

In the 2010s, Synopsys advanced the ISA with ARCv2, first announced in 2013 alongside the ARC HS processor family.[34] This enhanced 32-bit ISA built on ARCompact's efficiency but introduced greater scalability, including SIMD extensions for parallel data processing and support for up to 128-bit vectors in DSP configurations through the ARCv2DSP variant, which added over 150 optimized instructions for signal processing tasks.[35] Key additions encompassed atomic operations for multi-threaded synchronization, conditional execution to reduce branching overhead, and improved code density (up to 18% better than prior generations), while preserving binary compatibility with ARCompact code.[25] These features targeted embedded data and signal processing, enabling higher performance in applications like audio and control systems without sacrificing power efficiency.[6]

The ARCv3 ISA, released in April 2020, marked a significant leap to a 32/64-bit foundation optimized for superscalar execution in high-end embedded environments.[36] It incorporated advanced branching instructions for better control flow prediction, dedicated cache management operations to handle larger memory hierarchies, and enhanced DSP capabilities including multiply-accumulate (MPY) instructions for secure and efficient computation in safety-critical domains.[37] Supporting 52-bit physical and 64-bit virtual address spaces, ARCv3 enabled access to expansive memory while integrating security features like pointer authentication to mitigate vulnerabilities.[36] This evolution emphasized functional safety up to ASIL-D and configurability for automotive and industrial uses, delivering up to 3x performance gains over ARCv2 in multi-core setups.[38]

In 2023, Synopsys shifted the ARC lineage toward openness with the ARC-V family, adopting the RISC-V ISA as its base (specifically the RV32I and RV64I integer instruction sets) to leverage the ecosystem's interoperability and extensibility.[29] Announced on November 7, 2023, ARC-V incorporates custom RISC-V extensions tailored to ARC's configurable heritage, ensuring compatibility with ratified RISC-V profiles for standardized vector, atomic, and floating-point operations.[9] This transition from proprietary compressed ISAs like ARCompact and ARCv2/3 to the open RISC-V foundation promotes broader software portability and community-driven innovation, while retaining ARC's microarchitectural strengths for low-power, high-efficiency embedded designs in IoT and automotive sectors.[8] In July 2025, Synopsys introduced the ARC-V RHX-100 series, featuring a dual-issue, 32-bit superscalar architecture optimized for real-time embedded applications.[39]
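
The code-density benefit of mixing 16-bit and 32-bit encodings can be estimated with simple arithmetic. The following is a minimal sketch, not a model of any real ARC toolchain: it assumes the instruction count is the same as in an all-32-bit encoding, and `short_fraction` (the share of instructions that fit a 16-bit form) is a hypothetical parameter introduced here for illustration.

```python
def mixed_code_size_bytes(n_instructions: int, short_fraction: float) -> float:
    """Program size when a share of instructions uses 16-bit encodings.

    Assumption: every instruction is either 2 bytes or 4 bytes long, and
    choosing the short form never requires extra instructions.
    """
    if not 0.0 <= short_fraction <= 1.0:
        raise ValueError("short_fraction must be between 0 and 1")
    short = n_instructions * short_fraction
    long_ = n_instructions - short
    return short * 2 + long_ * 4


def size_reduction(short_fraction: float) -> float:
    """Fractional saving relative to encoding everything in 32 bits."""
    mixed = mixed_code_size_bytes(1, short_fraction)
    return 1.0 - mixed / 4.0


if __name__ == "__main__":
    for share in (0.2, 0.4, 0.6):
        print(f"{share:.0%} short instructions -> {size_reduction(share):.0%} smaller")
```

Under these assumptions the saving is half the share of short instructions, so a reduction of about 30% would need roughly 60% of instructions to use the 16-bit form. Real programs can deviate from this because short encodings may restrict operands or registers.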

Core Microarchitecture

The core microarchitecture of ARC processors is designed for embedded applications, emphasizing configurability, low power, and performance efficiency through a balance of pipeline depth, memory organization, and specialized execution hardware. Early licensable ARC cores, such as those in the 600 family, feature 5-stage in-order pipelines, while later low-power variants like the EM series use 3-stage pipelines to support deterministic real-time processing with minimal latency.[40][41] For instance, the ARC 601 employs a 5-stage Harvard pipeline that enables single-cycle execution of most instructions while maintaining low power consumption suitable for battery-operated devices.[40] In contrast, higher-performance variants like the HS series introduce superscalar dual-issue pipelines with up to 10 stages, incorporating out-of-order execution elements to enhance instruction throughput and utilization of functional units without proportionally increasing power draw.[42][43] These pipelines support 32-bit or 64-bit operations, with features like zero-overhead loops in simpler cores to optimize control flow in embedded code.[40] The memory hierarchy in ARC processors is highly configurable to accommodate diverse system requirements, typically supporting both Harvard and von Neumann bus architectures for separated or unified instruction and data access. 
Level 1 (L1) caches are optional and scalable, with instruction and data caches up to 64 KB each in advanced configurations, paired with closely coupled memories (CCM) ranging from 512 bytes to 16 MB for low-latency access to critical code and data.[44][45] For virtual memory support, optional memory management units (MMU) and translation lookaside buffers (TLB) are available, particularly in HS-series cores, enabling symmetric multiprocessing (SMP) and page-based addressing with support for large physical address spaces up to 40 bits and page sizes to 16 MB.[46] In resource-constrained setups, such as EM-family processors, an overlay management unit with micro-TLBs provides efficient virtual-to-physical translation without full MMU overhead, acting as a two-level cache for page descriptors to minimize memory access penalties.[47] Execution units in ARC cores center on a 32-bit integer arithmetic logic unit (ALU) for general-purpose computations, augmented by a configurable barrel shifter for efficient bit manipulation and address calculations. 
Digital signal processing (DSP) workloads are accelerated via multiply-accumulate (MAC) units, with options for 16x16 or 32x32 multipliers and radix-4 dividers to handle fixed-point operations common in embedded audio and control applications.[40][44] Branch prediction mechanisms vary by core complexity: static prediction in simpler pipelines like the ARC 625D ensures predictability for real-time tasks, while dynamic methods in HS-series processors, including sophisticated predictors with early misprediction detection, improve control flow efficiency and reduce stalls in performance-critical code.[48][49] Register files support 26 to 54 general-purpose registers, extendable for vector or DSP extensions, allowing parallel execution of operations like loads, stores, and arithmetic in a single cycle where pipeline depth permits.[40] Power management features are integral to the microarchitecture, enabling fine-grained control to meet stringent embedded constraints. Clock gating is implemented at the pipeline and unit levels to disable unused logic, reducing dynamic power, while support for dynamic voltage and frequency scaling (DVFS) allows runtime adjustment based on workload demands.[50] Low-power modes, including sleep and wake states with instant-on retention, further extend battery life by preserving register and memory contents during idle periods without full system reset.[50] These techniques are balanced with performance features, such as selective pipeline flushing, to avoid excessive overhead. Security is embedded in later core designs through features akin to ARM TrustZone, providing hardware-isolated trusted execution environments (TEE) via ARC SecureShield technology. 
This enables partitioning of secure and non-secure worlds for sensitive operations like cryptographic processing, with memory isolation and fault detection to counter injection attacks.[51] Secure boot mechanisms, supported by the Enhanced Security Package in EM and HS cores, verify firmware integrity at startup using hardware root-of-trust elements, ensuring tamper-resistant initialization and protection against unauthorized code execution.[52][53] Optional error-correcting code (ECC) on memories adds reliability for mission-critical applications.[44]
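
The contrast between static and dynamic branch prediction mentioned in this section can be illustrated with a small simulation. The sketch below uses a textbook 2-bit saturating counter as the dynamic predictor; the source does not state which scheme any ARC core actually uses, so the class names, table size, and indexing are assumptions chosen only for illustration.

```python
class StaticNotTaken:
    """Static predictor: always predicts 'not taken' and never learns."""

    def predict(self, pc: int) -> bool:
        return False

    def update(self, pc: int, taken: bool) -> None:
        pass


class TwoBitPredictor:
    """Dynamic predictor: one 2-bit saturating counter per table entry."""

    def __init__(self, entries: int = 64) -> None:
        self.entries = entries
        self.counters = [1] * entries  # start at 'weakly not taken'

    def _index(self, pc: int) -> int:
        # Assumption: instructions are at least 16-bit aligned, so drop bit 0.
        return (pc >> 1) % self.entries

    def predict(self, pc: int) -> bool:
        return self.counters[self._index(pc)] >= 2

    def update(self, pc: int, taken: bool) -> None:
        i = self._index(pc)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)


def mispredictions(predictor, trace) -> int:
    """Count wrong predictions over a list of (pc, taken) pairs."""
    misses = 0
    for pc, taken in trace:
        if predictor.predict(pc) != taken:
            misses += 1
        predictor.update(pc, taken)
    return misses


# A loop-closing branch: taken 9 times, then falls through; run 3 times.
LOOP_TRACE = ([(0x100, True)] * 9 + [(0x100, False)]) * 3

if __name__ == "__main__":
    print("static :", mispredictions(StaticNotTaken(), LOOP_TRACE))
    print("dynamic:", mispredictions(TwoBitPredictor(), LOOP_TRACE))
```

On this trace the static predictor misses every taken branch, while the counter-based predictor misses only while warming up and at each loop exit. The static scheme costs more stalls but behaves identically on every run, which is the predictability property the text associates with real-time cores.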

Configurability Mechanisms

The configurability of ARC processors is facilitated primarily through Synopsys' ARChitect tool, a graphical user interface (GUI)-based configurator that allows designers to tailor processor cores to specific application requirements without extensive hardware redesign. ARChitect employs drag-and-drop and point-and-click interfaces to select and customize components from IP libraries, including core features such as pipeline stages, cache sizes, and bus interfaces. Designers can add custom instructions, extend register files, and integrate peripherals like UARTs, timers, and interrupt controllers, enabling optimization for power, performance, and area (PPA) metrics. This tool supports rapid iteration, generating a customized processor configuration in under five minutes.[54][55] Key extension types include custom opcodes for application-specific operations, coprocessor interfaces for offloading complex tasks, and hardware accelerators developed using Synopsys' ASIP Designer tool. ASIP Designer automates the synthesis of instruction-set extensions, allowing users to define new instructions in C/C++ and automatically generate corresponding hardware logic, software tools, and verification components. For instance, custom opcodes can implement domain-specific primitives, such as those for signal processing or security algorithms, while coprocessor interfaces enable integration with dedicated units like DSP blocks. These mechanisms build upon the baseline ARC instruction set architecture (ISA), ensuring extensions maintain forward compatibility through defined subsets.[56][57] The configuration process begins with feature selection in ARChitect, followed by automated register-transfer level (RTL) generation in Verilog or VHDL, including simulation testbenches, synthesis scripts, and documentation. 
The resulting RTL can then be simulated, verified, and synthesized for ASIC or FPGA implementation using Synopsys' ecosystem tools like VCS for simulation and Design Compiler for synthesis. This tailoring approach supports significant area reductions—up to two-thirds in some DSP configurations—by omitting unused features and optimizing for target workloads, while preserving software portability. Verification is ensured through Synopsys' integrated flows, including cycle-accurate models and compliance checks against ISA subsets to avoid compatibility issues.[54][58][59] Representative examples illustrate the practical impact: the ARC CryptoPack extension adds custom instructions for AES encryption, accelerating software implementations by integrating hardware support for key expansion and block cipher operations directly into the core, without requiring a separate ASIC block. Similarly, ASIP Designer can synthesize neural network primitives, such as matrix multiply-accumulate operations, to enhance AI inference efficiency in embedded systems. These customizations allow up to 10x performance gains for targeted algorithms while reducing overall system area and power compared to discrete accelerator approaches. Limitations include the need to adhere to ISA subset rules for binary compatibility across ARC families and reliance on Synopsys verification tools to mitigate integration risks.[60][56][57]
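
A gain of up to 10x for a targeted algorithm does not translate directly into a 10x faster application, because only part of the workload runs on the custom instructions. Amdahl's law gives a rough estimate of the overall effect. The sketch below is generic arithmetic rather than anything produced by ARChitect or ASIP Designer, and the parameter names are illustrative.

```python
def overall_speedup(accelerated_fraction: float, local_speedup: float) -> float:
    """Amdahl's law: whole-program speedup when only part is accelerated.

    accelerated_fraction: share of original run time spent in the code that
        the custom instruction speeds up (0.0 to 1.0).
    local_speedup: speedup of that code alone (for example 10.0).
    """
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must be between 0 and 1")
    if local_speedup <= 0:
        raise ValueError("local_speedup must be positive")
    remaining = 1.0 - accelerated_fraction
    return 1.0 / (remaining + accelerated_fraction / local_speedup)


if __name__ == "__main__":
    for share in (0.25, 0.5, 0.9):
        print(f"{share:.0%} of run time accelerated 10x -> "
              f"{overall_speedup(share, 10.0):.2f}x overall")
```

If a cipher or filter kernel accounts for half of the run time, a 10x custom instruction yields a little under 2x overall, which is why the text describes analysing the task before choosing extensions.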

Processor Families

ARC EM Series

The ARC EM Series comprises a family of 32-bit embedded processors developed by Synopsys, based on the ARCv2 instruction set architecture (ISA), targeting cost-sensitive and ultra-low-power applications such as microcontrollers and sensors. Introduced in the 2010s, the series began with the ARC EM4 and EM6 cores, which emphasize compact size, efficient code density, and minimal power usage for deeply embedded systems. These processors feature a low-latency 3-stage Harvard architecture pipeline, delivering up to 1.81 Dhrystone MIPS per MHz (DMIPS/MHz) and 4.18 CoreMark per MHz.[61][62]

Key features of the ARC EM Series include optional multipliers (32x32 or 16x16) for enhanced arithmetic capabilities and support for up to 2 MB of tightly coupled instruction and data memories (ICCM/DCCM) with single-cycle access. The base configurations lack on-chip caches to minimize area and power, though the EM6 variant offers optional instruction and data caches up to 64 KB for improved efficiency in memory-intensive tasks. DSP extensions are available in EMxD variants, such as the EM5D and EM7D, which incorporate over 150 dedicated instructions for fixed-point operations, vector/SIMD processing, and a power-efficient 32x32 multiply-accumulate (MUL/MAC) unit, enabling applications like audio processing and sensor fusion. For automotive use, the series includes Safety Enhancement Package (SEP) options certified as ASIL D-ready under ISO 26262, with features like error-correcting code (ECC) memory protection and lockstep execution in variants such as the dual-core EM22FS.[61][62][63]

The ARC EM Series is optimized for 32-bit operation, with typical clock frequencies of up to 300 MHz on process nodes such as 65 nm and the potential for higher frequencies on more advanced nodes. Power consumption is exceptionally low, often below 0.5 mW/MHz in dynamic operation, making it suitable for battery-powered devices.
Variants are categorized as EM4x for basic microcontroller-like tasks without caches and EM6x for configurations including multipliers and optional caches; the DSP-focused EMxD lineup (e.g., EM5D without cache, EM7D with cache, and higher-end EM9D/EM11D) extends this for signal processing workloads. While primarily single-core designs, the series supports scalable configurations for multi-core systems up to four cores through Synopsys' interconnect options, though this is less common than in higher-performance families. These processors are widely deployed in simple MCUs for IoT sensors, wearables, and automotive control units requiring real-time responsiveness with minimal overhead.[64][65][62]

ARC HS Series

The ARC HS Series comprises a family of high-performance, configurable 32-bit and 64-bit embedded processors designed for compute-intensive applications such as automotive systems, networking, and multimedia processing. Introduced by Synopsys in 2013 with the initial HS3x cores, the series has evolved to deliver superscalar architectures optimized for performance efficiency in system-on-chip (SoC) designs.[34] These processors leverage Synopsys' ARChitect tool for customization, including extensions via APEX technology for application-specific instructions.[66] The foundational HS3x cores (HS34, HS36, HS38), based on the ARCv2 instruction set architecture (ISA), feature a 10-stage pipeline and dual-issue superscalar execution, achieving up to 2.13 DMIPS/MHz and 4.15 CoreMark/MHz.[44] They support configurable closely coupled memories (CCM) for low-latency access and optional L1 caches up to 64 KB, with the HS38 variant adding an MMU for 40-bit addressing and up to 8 MB L2 cache to enable symmetric multiprocessing (SMP) Linux. Multicore configurations scale to dual or quad cores with L1 cache coherency via an interconnect fabric.[44] In 2017, the HS4x series (HS44, HS46, HS48) advanced the architecture with enhanced superscalar capabilities on the ARCv2 ISA, delivering 3.0 DMIPS/MHz and 5.2 CoreMark/MHz while operating up to 1.9 GHz on 16 nm processes.[67] These cores include branch prediction, optional IEEE 754-compliant floating-point units (FPU), and SIMD extensions, with L1 caches up to 64 KB and L2 up to 8 MB in the HS48 model. Multicore support extends to dual- and quad-core setups, emphasizing power efficiency for high-end embedded tasks. 
Safety-focused variants, such as the HS4xFS processors, achieve ISO 26262 ASIL-D certification through dual-core lockstep implementations and self-checking monitors.[68][69] The 2020 introduction of the HS5x (HS56, HS57D, HS58) and HS6x (HS66, HS68) cores marked a shift to the ARCv3 ISA, introducing 64-bit support in the HS6x models with 52-bit physical and 64-bit virtual addressing for up to 4.5 petabytes of memory.[26] These dual-issue superscalar designs maintain 3.0 DMIPS/MHz efficiency, with the HS5x focused on 32-bit workloads and optional ARCv3DSP extensions in the HS57D, while the HS6x emphasizes 64-bit performance. Caches scale to 64 KB L1 and 64 MB L2 (in HS58/HS68), complemented by branch predictors and FPUs. Multicore scalability reaches up to 12 cores interconnected via a coherent fabric supporting up to 16 hardware accelerators, delivering up to 5400 DMIPS per core at 1.8 GHz on advanced nodes. Compared to prior generations, the HS5x/HS6x provide up to 3x improvement in SPECint benchmarks, underscoring their suitability for demanding embedded workloads.[37][70]
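
Several figures quoted for the HS family follow from two simple relationships: peak Dhrystone throughput is the per-MHz rating multiplied by the clock frequency, and the addressable range is two raised to the number of address bits. The helper names in the sketch below are illustrative only.

```python
def peak_dmips(dmips_per_mhz: float, clock_mhz: float) -> float:
    """Peak Dhrystone MIPS for one core at a given clock frequency."""
    return dmips_per_mhz * clock_mhz


def addressable_bytes(address_bits: int) -> int:
    """Number of distinct byte addresses for a given address width."""
    if address_bits < 0:
        raise ValueError("address_bits must be non-negative")
    return 1 << address_bits


if __name__ == "__main__":
    # 3.0 DMIPS/MHz at 1.8 GHz (1800 MHz), as quoted for HS5x/HS6x.
    print(peak_dmips(3.0, 1800), "DMIPS")
    # 52-bit physical addressing, expressed in decimal petabytes.
    print(addressable_bytes(52) / 1e15, "PB")
    # 40-bit addressing (HS38 MMU), expressed in binary terabytes.
    print(addressable_bytes(40) / 2**40, "TiB")
```

The first two results line up with the 5400 DMIPS per core and roughly 4.5 petabytes stated above; the third shows that 40-bit addressing corresponds to 1 TiB.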

ARC VPX and EV Series

The ARC VPX series, introduced in the 2010s, comprises a family of 32-bit VLIW/SIMD digital signal processors based on the ARCv2 instruction set architecture, designed for high-performance signal processing in embedded applications such as audio, video, and IoT sensor fusion.[71] Models including the VPX2, VPX3, VPX5, and VPX6 support scalable SIMD vector widths of up to 512 bits, and up to 1024 bits in the VPX6 variant, enabling up to 32 multiply-accumulate (MAC) operations per cycle for compute-intensive workloads such as voice processing and multimedia encoding.[72] These processors incorporate an XY memory architecture to optimize data access patterns in DSP algorithms, reducing latency for complex filtering and transforms, while fault-tolerant variants such as the VPXxFS provide ASIL-B compliance for automotive applications like ADAS and radar processing.[73] The VPX family features a configurable pipeline typically spanning 5 to 7 stages, balancing throughput and power efficiency in low-power SoCs.[72]

The ARC EV series, launched in 2016 with the EV6x family and extended by the EV7x in 2019, targets embedded vision and machine learning tasks through a heterogeneous multicore architecture optimized for AI acceleration at the edge.[74] Each EV processor integrates 32-bit scalar units, 512-bit wide vector DSPs configurable for 8-, 16-, or 32-bit operations, and a scalable convolutional neural network (CNN) engine with up to 14,080 MACs for deep learning inference.[75] With support for frameworks like TensorFlow Lite via the MetaWare development toolkit, the EV series enables efficient deployment of vision models for applications such as object detection and image enhancement.[76] Functional safety variants, including the EV7xFS certified to ASIL-D, incorporate dual-core lockstep and error-correcting code (ECC) memory for reliable operation in automotive and industrial environments.[77] The architecture employs a 10-stage pipeline in its vision CPU cores, supporting frequencies up to 1.2 GHz and delivering peak performance of over 150 GOPS per vector unit at 800 MHz, scaling to 35 TOPS in multi-core configurations on 16 nm processes for edge AI tasks.[78][79]

In typical implementations, both VPX and EV processors are integrated into larger subsystems alongside ARC HS general-purpose cores to form hybrid compute platforms. The DSP and vector units handle specialized media and AI workloads while the HS cores manage control flow and system tasks, optimizing overall SoC efficiency for devices in automotive, consumer electronics, and smart sensors.[73]
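The throughput figures quoted above can be related to lane counts and MAC counts with simple arithmetic. The following is a rough sketch, not a vendor formula: it assumes each MAC counts as two operations (a multiply and an add), that every unit is busy on every cycle, and that one SIMD lane performs one MAC per cycle. The function names are illustrative only.

```python
def simd_lanes(vector_bits: int, element_bits: int) -> int:
    """Number of elements that fit in one vector register."""
    return vector_bits // element_bits


def peak_tops(mac_units: int, clock_hz: float, ops_per_mac: int = 2) -> float:
    """Theoretical peak in tera-operations per second (assumes full utilisation)."""
    return mac_units * ops_per_mac * clock_hz / 1e12


# A 512-bit vector of 16-bit elements has 32 lanes.
lanes_16bit = simd_lanes(512, 16)

# 14,080 MACs running at the 1.2 GHz ceiling quoted in the text.
tops_at_1_2ghz = peak_tops(14_080, 1.2e9)

print(lanes_16bit)               # 32
print(round(tops_at_1_2ghz, 1))  # 33.8
```

Under these assumptions the result lands close to, but slightly below, the quoted 35 TOPS, so the published figure presumably rests on a somewhat different clock or counting convention.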

ARC-V Series

The ARC-V series represents Synopsys' entry into the RISC-V ecosystem, announced on November 7, 2023, as a family of configurable processor IP cores designed for embedded applications requiring low power and high efficiency. Built on the open-standard RISC-V instruction set architecture (ISA), these processors support both 32-bit (RV32) and 64-bit (RV64) configurations with standard extension sets such as IMAC, enabling broad compatibility with the RISC-V software ecosystem. The series extends Synopsys' decades of ARC processor expertise to RISC-V, offering options for high-performance, real-time, and ultra-low-power use cases while maintaining interoperability with existing RISC-V tools, operating systems, and middleware.[29][9][8]

Key features of the ARC-V series include support for custom RISC-V extensions through Synopsys' ARChitect tool, which allows designers to add application-specific instructions, such as DSP enhancements, to optimize for particular workloads without deviating from RISC-V standards. Functional safety variants comply with ISO 26262 ASIL D requirements, making them suitable for safety-critical systems, and include hardware virtualization for multi-OS environments. The processors emphasize configurability in pipeline stages, cache sizes, and peripherals, with options for floating-point units (FPU) and trace/debug features like N-Trace. Multicore configurations are available in select variants, scaling up to 16 cores per cluster with coherent interconnect support via CHI ports in higher-end models.[80][81][82]

The series comprises three primary variants tailored to different power and performance profiles. The ARC-V RMX series targets ultra-low-power embedded applications with 32-bit cores featuring 3-stage (RMX-100) or 5-stage (RMX-500) pipelines, including DSP-enhanced versions (RMX-100D and RMX-500D) for signal processing tasks; these are optimized for minimal area and energy use in IoT devices. The ARC-V RHX series provides 32-bit real-time processors with dual-issue execution for deterministic performance in control systems, supporting up to 16 cores and optional RISC-V Vector (RVV) extensions. The ARC-V RPX series delivers high-performance 64-bit superscalar cores with dual-issue pipelines, scalable to 16 cores and targeted at host processing in complex SoCs. Variants such as the RPX-100 and RPX-110 offer large address spaces in RV64 configurations, with up to 64-bit virtual addressing (an exabyte-scale range) and physical addressing configurable up to 52 bits (a petabyte-scale range).[83][84][85]

Performance across the ARC-V series is geared toward efficient embedded operation, with the capability to reach multi-GHz clock speeds in advanced process nodes, though specific figures vary by configuration and silicon. The cores work with the standard RISC-V toolchain, including GNU tools, and with Synopsys' MetaWare Development Toolkit for software development. Adoption focuses on automotive, industrial IoT, and storage applications, with early support from partners like Infineon for integration into their SoC platforms, and compatibility with third-party IP to accelerate RISC-V-based designs. General availability began in 2024, with the RMX series launching in Q2 and the RHX and RPX series following later in the year. By 2025, the ARC-V ecosystem had expanded with support from partners like Green Hills Software for safety-certified real-time operating systems.[86][87][80][88]
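The address widths mentioned for the RPX series can be converted into addressable sizes with a short calculation. This is a minimal sketch using binary (IEC) prefixes; the helper name `addressable` is illustrative and not part of any Synopsys tool.

```python
UNITS = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"]


def addressable(bits: int) -> str:
    """Format 2**bits bytes using binary (IEC) prefixes."""
    size = 2 ** bits
    idx = 0
    # Divide by 1024 until the value fits the largest suitable unit.
    while size >= 1024 and idx < len(UNITS) - 1:
        size //= 1024
        idx += 1
    return f"{size} {UNITS[idx]}"


print(addressable(32))  # 4 GiB
print(addressable(52))  # 4 PiB
print(addressable(64))  # 16 EiB
```

A 52-bit physical address therefore covers petabytes, while only a full 64-bit space reaches the exabyte range.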

Applications and Implementations

Embedded and IoT Systems

The ARC EM series processors are particularly prominent in embedded and IoT systems because of their ultra-low power consumption, as little as 3 μW/MHz in advanced processes, which enables operation below 1 mW in constrained environments.[89][90] This low-power profile, combined with the series' extensibility through Synopsys' APEX technology, allows customization for wireless protocols such as Bluetooth and Zigbee, making the processors well suited to battery-operated, connected devices like sensors and smart home gadgets.[61][91]

These processors are used widely in consumer IoT applications, including wearables such as fitness trackers for sensor fusion and activity monitoring, as well as smart meters for energy management and remote data collection.[92] Synopsys reports billions of ARC-based chips shipped annually across various sectors, including IoT, reflecting their scalability and adoption in volume production.[1][93]

Key benefits include a very small die area of approximately 0.01 mm² in 28 nm high-performance mobile processes, which minimizes costs in area-constrained designs, alongside support for real-time operating systems like FreeRTOS to enable deterministic task handling in edge scenarios.[61][94] A notable example is their integration in development platforms comparable to ESP32 modules, such as the ARC EM Software Development Platform, which incorporates multi-protocol wireless modules (Wi-Fi, Bluetooth, Zigbee) for edge computing tasks like local data processing and connectivity in IoT nodes.[95][91]
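The link between the quoted 3 μW/MHz efficiency figure and the sub-milliwatt claim is a simple product of efficiency and clock rate. The sketch below assumes power scales linearly with frequency and ignores leakage, which real silicon does not; the function names are illustrative.

```python
def dynamic_power_mw(uw_per_mhz: float, clock_mhz: float) -> float:
    """Dynamic power in milliwatts from an efficiency figure in uW/MHz."""
    return uw_per_mhz * clock_mhz / 1000.0


def max_clock_mhz(uw_per_mhz: float, budget_mw: float) -> float:
    """Highest clock that stays within a power budget, ignoring leakage."""
    return budget_mw * 1000.0 / uw_per_mhz


print(dynamic_power_mw(3.0, 100.0))     # 0.3
print(round(max_clock_mhz(3.0, 1.0)))   # 333
```

At 3 μW/MHz, a core clocked at 100 MHz would draw about 0.3 mW, and the 1 mW budget would be reached only at roughly 333 MHz under these simplifying assumptions.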

Automotive and Industrial Uses

The Synopsys ARC Functional Safety (FS) processors, including the ARC EM22FS and HS4xFS families, are widely deployed in automotive electronic control units (ECUs) and infotainment systems, where they support ASIL D compliance under ISO 26262 standards through dual-core lockstep configurations that enable rapid fault detection.[96][68] These processors integrate hardware safety mechanisms such as error-correcting code (ECC) memory and parity protection to ensure reliability in safety-critical environments like advanced driver-assistance systems (ADAS).[97] For instance, Bosch Sensortec incorporates the ARC EM4 core in its BHI360 programmable IMU smart sensor system for sensor fusion tasks in automotive applications, leveraging the processor's low-power efficiency and up to 3.6 CoreMark/MHz performance.[98]

The ARC VPXxFS DSP family extends these capabilities to signal processing in automotive SoCs, achieving ASIL D systematic and up to ASIL C random hardware fault tolerance with minimal area and power overhead, making it suitable for tasks such as audio processing in electric vehicles (EVs).[99] This configurability allows integration into power-constrained EV systems for real-time audio enhancement and noise cancellation, balancing high performance with energy efficiency via its VLIW/SIMD architecture.[100]

In industrial applications, ARC FS processors facilitate fault-tolerant designs for control systems and robotics, supporting ASIL B and D equivalents for ISO 26262-compliant industrial SoCs that demand deterministic real-time operation.[101] Features like lockstep execution and ECC-protected memory enhance system reliability in harsh environments, such as factory automation where transient faults from electromagnetic interference must be mitigated.[102] For example, the ARC EV processor IP accelerates image processing in industrial robotics through simultaneous localization and mapping (SLAM) algorithms, as demonstrated by Kudan, achieving 40% faster performance for navigation in automated manufacturing settings.[103] The ARC DSP lines, including VPX variants, further support industrial signal processing for motor control and vision-based automation, optimizing for low power in battery-operated or edge-deployed robotic systems.[104]
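The dual-core lockstep mechanism described above can be illustrated with a small software model: two copies of the same computation receive identical inputs, and a checker compares their outputs at every step. This is a conceptual sketch only. Real lockstep hardware compares core outputs cycle by cycle in logic, and the `faulty` function and its bit flip are hypothetical stand-ins for a transient fault.

```python
from typing import Callable, Iterable, Optional


def run_lockstep(
    main_core: Callable[[int], int],
    shadow_core: Callable[[int], int],
    inputs: Iterable[int],
) -> Optional[int]:
    """Feed identical inputs to both cores and compare outputs step by step.

    Returns the index of the first mismatching step, or None if the
    two cores agree on every step.
    """
    for step, value in enumerate(inputs):
        if main_core(value) != shadow_core(value):
            return step  # a real system would signal a safety fault here
    return None


def healthy(x: int) -> int:
    """Reference computation, truncated to 32 bits."""
    return (x * 3 + 1) & 0xFFFFFFFF


def faulty(x: int) -> int:
    """Hypothetical transient fault: bit 4 of the result flips when x == 5."""
    result = healthy(x)
    return result ^ 0x10 if x == 5 else result


print(run_lockstep(healthy, healthy, range(8)))  # None
print(run_lockstep(healthy, faulty, range(8)))   # 5
```

The checker detects the divergence at the step where the fault occurs, which is the property that lets lockstep designs react quickly to random hardware faults.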

Multimedia and AI Processing

The ARC VPX and EV series processors are widely deployed in multimedia applications, particularly for high-throughput video processing in consumer devices such as set-top boxes and digital cameras. These processors support efficient H.265 (HEVC) decoding of high-resolution content up to 4K UHD, which is critical for modern streaming and surveillance systems.[71] Their scalable vector processing architecture, with vector lengths ranging from 128 bits in the VPX2 to 1024 bits in the VPX6, optimizes power, performance, and area (PPA) for embedded workloads, allowing designers to tailor the core to specific video codec requirements without excess overhead.[71]

In AI-driven tasks, the ARC EV6x processors perform edge inference for embedded vision, powering object detection in smart cameras through integration with convolutional neural networks (CNNs). The more advanced EV7x variant delivers up to 35 TOPS of performance in a 16 nm process, supporting complex deep learning models while maintaining low power consumption suitable for battery-operated devices.[105] This capability stems from the processors' optional deep neural network (DNN) accelerators, which scale up to 3,520 multiply-accumulate (MAC) units in the EV6x and up to 14,080 in the EV7x.[106]

Practical implementations highlight the versatility of ARC processors in multimedia and AI contexts. In storage systems, ARC EV cores are integrated into SSD controllers to accelerate AI-based data analytics and error correction, enhancing throughput in enterprise drives without dedicated ASICs.[107] For aerial applications, the Inuitive NU4100 SoC employs ARC EV processors for RGBD vision processing in drones, enabling real-time depth sensing and obstacle avoidance during flight.[108] These examples underscore the processors' role in fusing multimedia pipelines with AI inference at the edge.

A key benefit of ARC's configurable vector units is their ability to cut inference latency by up to a factor of two compared to general-purpose CPUs, achieved through hardware-software co-optimization via the MetaWare development toolkit, which supports standards like OpenCV and OpenVX for streamlined deployment.[109] This configurability minimizes area waste and power draw, making ARC processors well suited to resource-constrained environments where high-performance multimedia and AI must coexist efficiently.[71]

Development Tools

Software Development Kit

The Synopsys ARC MetaWare Development Toolkit serves as the primary software development kit (SDK) for programming and optimizing applications on ARC processors, encompassing a suite of tools, compilers, debuggers, libraries, and runtime software tailored for embedded systems.[110] It features an Eclipse-based integrated development environment (IDE) that facilitates project creation, code management, building, and debugging, with support for third-party plugins to extend functionality.[111][112] The toolkit includes ANSI-C compliant C compilers with ISO extensions and C++ compilers supporting partial specialization, the Standard Template Library, and LLVM-based optimizations, targeting ARC Classic processors (such as the HS, EM, and VPX series) as well as ARC-V RISC-V-based architectures.[110] Additionally, Synopsys provides a complementary free toolchain based on GCC and Clang for ARCv2, ARCv3, and ARC-V (RISC-V) processors, enabling broader accessibility for developers.[87]

The MetaWare debugger offers both graphical and command-line interfaces for application debugging, profiling, and performance tuning, integrated with the ARC nSIM instruction-set simulator to support early software development across all ARC processor variants without hardware.[110] It includes semantic and peripheral displays for enhanced visibility into processor states, as well as multicore debugging capabilities.

The toolkit's libraries provide ARC-specific optimizations, such as the MetaWare Vector DSP Library for signal processing functions on VPX processors and the Vector Linear Algebra Library implementing BLAS/LAPACK algorithms for compute-intensive tasks.[113] For AI and machine learning, the MetaWare Neural Network SDK (ARC NN library) compiles TensorFlow and ONNX models into fixed-point code for ARC NPX NPUs and VPX DSPs, while math libraries cover essential functions like trigonometry and logarithms.[113] Middleware components support networking protocols and AI inference, with vision processing libraries for tasks like image resizing and filtering as pre- or post-processing aids.[113]

Optimization features in the MetaWare compilers emphasize code efficiency for resource-constrained environments, including auto-vectorization to generate SIMD instructions for vectorizable loops on ARC VPX DSPs, and intrinsics for accessing custom instructions like DSP extensions in ARC EM processors.[72][114] An automatic overlay manager further reduces memory footprint by dynamically loading code segments. The toolkit supports operating systems such as Linux and uClinux for ARC processors, enabling development of Linux-based embedded applications with OS-aware debugging.[115][110]

The broader ecosystem is bolstered by the Synopsys ARC Access Program, which collaborates with third-party vendors to provide pre-ported tools and IP compatible with ARC processors, including Percepio's Tracealyzer for visual runtime diagnostics and tracing of embedded software behavior.[116][117] This program reduces development risk by offering tested solutions for RTOSes, middleware, and design services that integrate with ARC MetaWare to accelerate software deployment.[116]
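The conversion of floating-point models into fixed-point code, mentioned above for the Neural Network SDK, can be illustrated with the Q15 format that is common on 16-bit DSP datapaths. The source does not describe the SDK's actual quantization scheme, so the format choice and the helper names below are assumptions made purely for illustration.

```python
def to_q15(x: float) -> int:
    """Quantize a float in [-1.0, 1.0) to a signed 16-bit Q15 integer, saturating."""
    scaled = int(round(x * 32768))
    return max(-32768, min(32767, scaled))


def q15_mul(a: int, b: int) -> int:
    """Multiply two Q15 values and return a saturated Q15 result."""
    product = (a * b) >> 15  # drop the extra 15 fractional bits
    return max(-32768, min(32767, product))


def from_q15(q: int) -> float:
    """Convert a Q15 integer back to a float."""
    return q / 32768


half = to_q15(0.5)
quarter = q15_mul(half, half)

print(half, quarter, from_q15(quarter))  # 16384 8192 0.25
print(to_q15(1.0))                       # 32767 (saturated)
```

Integer arithmetic of this kind maps directly onto MAC units, which is why fixed-point representations are attractive for inference on DSP and NPU hardware.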

Design and Verification Tools

Synopsys provides ARChitect as a dedicated IP configuration tool for customizing ARC processor cores, allowing designers to select and tailor architectural features such as pipeline depth, cache configurations, and peripheral integrations to meet specific SoC requirements. The tool generates synthesizable RTL code, including Verilog, streamlining the hardware implementation process for ARC-based designs.[55]

For advanced customization, the ASIP Designer suite enables the synthesis of application-specific instruction-set processors (ASIPs), starting from high-level architectural descriptions in a domain-specific language. It automates the generation of both hardware RTL (in synthesizable Verilog or VHDL) and a corresponding software toolkit, including compilers and assemblers, to optimize performance for domain-specific workloads like signal processing or AI acceleration. This approach reduces design time by facilitating rapid exploration of custom instructions and microarchitectures.[118]

Verification of ARC processors relies on Synopsys' Verdi Automated Debug System, which offers waveform viewing, signal tracing, and protocol analysis to identify and resolve design issues during simulation and emulation. Complementing Verdi, TestWeaver provides automated test generation and execution for virtual prototypes, achieving high functional coverage with minimal manual specification through AI-driven scenario creation. For the ARC-V RISC-V-based processors announced in 2023, Synopsys integrates verification solutions from its 2023 acquisition of Imperas, including extensible virtual platforms and compliance test suites to validate custom extensions against the RISC-V ISA.[119][120][121]

Subsystem integration for ARC SoCs is supported through configurable ARC subsystems, which assemble processor cores with peripherals, memories, and interconnects using standard interfaces like AXI for high-performance data transfer. These pre-verified subsystems incorporate ARC-specific interconnect options to optimize latency and throughput, enabling scaling from embedded controllers to complex multi-core designs.[122]

The overall design flow for ARC processors is optimized for leading foundry processes, including TSMC's advanced nodes and Samsung Foundry's technologies, with full PDK support for synthesis and place-and-route. Power analysis is performed using PrimeTime, which delivers static and dynamic power estimation, signal integrity checks, and variation-aware optimization to ensure low-power operation in battery-constrained applications. This end-to-end flow from RTL generation to signoff accelerates tapeout while maintaining design reliability.[123][124]

References
