SM4 (cipher)
from Wikipedia
SM4
General
Designers: Data Assurance & Communication Security Center, Chinese Academy of Sciences
First published: 2006 (declassified; standardized March 21, 2012)[1]
Cipher detail
Key sizes: 128 bits
Block sizes: 128 bits
Structure: unbalanced Feistel network
Rounds: 32
Best public cryptanalysis: Linear and differential attacks against 22 rounds

ShāngMì 4 (SM4, 商密4) (formerly SMS4)[2] is a block cipher, standardised for commercial cryptography in China.[3] It is used in the Chinese National Standard for Wireless LAN WAPI (WLAN Authentication and Privacy Infrastructure), and with Transport Layer Security.[4]

SM4 was a cipher proposed for the IEEE 802.11i standard, but it has so far been rejected. One of the reasons for the rejection has been opposition to the WAPI fast-track proposal by the IEEE.[citation needed]

SM4 was published as ISO/IEC 18033-3/Amd 1 in 2021.

The SM4 algorithm was drafted by the Data Assurance & Communication Security Center, Chinese Academy of Sciences (CAS), and the Commercial Cryptography Testing Center, National Cryptography Administration. It was mainly developed by Lü Shuwang (Chinese: 吕述望). The algorithm was declassified in January 2006, and it became a national standard (GB/T 32907-2016) in August 2016.[5]

Cipher detail

The SM4 cipher has a key size and a block size of 128 bits each.[6][7] Encryption or decryption of one block of data is composed of 32 rounds. A non-linear key schedule is used to produce the round keys and the decryption uses the same round keys as for encryption, except that they are in reversed order.

Keys and key parameters

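The length of encryption keys is 128 bits, represented as $MK = (MK_0, MK_1, MK_2, MK_3)$, in which each $MK_i$ ($i = 0, 1, 2, 3$) is a 32-bit word. The round keys are represented by $(rk_0, rk_1, \ldots, rk_{31})$, where each $rk_i$ ($i = 0, \ldots, 31$) is a word. They are generated from the encryption key and the following parameters:

$FK = (FK_0, FK_1, FK_2, FK_3)$
$CK = (CK_0, CK_1, \ldots, CK_{31})$

$FK_i$ and $CK_i$ are words, used to generate the round keys.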

Round

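Each round output is computed from the four previous round outputs $X_i, X_{i+1}, X_{i+2}, X_{i+3}$ such that:

$X_{i+4} = X_i \oplus F(X_{i+1} \oplus X_{i+2} \oplus X_{i+3} \oplus rk_i)$

where $F$ is a substitution function composed of a non-linear transform (the S-box) and a linear transform $L$.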

S-box

SM4's S-box is fixed for 8-bit input and 8-bit output, noted as Sbox(). As with the Advanced Encryption Standard (AES), the S-box is based on the multiplicative inverse over GF(2^8). The affine transforms and polynomial bases are different from those of AES, but due to affine isomorphism it can be calculated efficiently given an AES S-box.[8]

History

On March 21, 2012, the Chinese government published the industrial standard "GM/T 0002-2012 SM4 Block Cipher Algorithm", officially renaming SMS4 to SM4.[2]

A description of SM4 in English is available as an Internet Draft. It contains a reference implementation in ANSI C.[9]

SM4 is part of the ARMv8.4-A expansion to the ARM architecture.[10] SM4 support for the RISC-V architecture was ratified in 2021 as the Zksed extension.[11]

SM4 is supported by Intel processors, starting from Arrow Lake S, Lunar Lake, Diamond Rapids and Clearwater Forest.[12]

from Grokipedia
SM4 is a symmetric block cipher standardized in China as GB/T 32907-2016 for commercial cryptographic applications requiring data confidentiality. It operates on 128-bit blocks with 128-bit keys and uses 32 rounds of an unbalanced Feistel network. In each round, three state words are XORed with the round key, passed through four parallel 8-bit S-box lookups for non-linearity and a rotation-based linear transformation for diffusion, and the result is XORed into the fourth word. Originally designated SMS4 and released in 2006 for securing the Chinese WLAN Authentication and Privacy Infrastructure (WAPI), the algorithm was redesignated SM4 in 2012 under the industry standard GM/T 0002-2012 and became a national standard in 2016, reflecting China's emphasis on indigenous cryptography independent of foreign designs like AES. SM4 has so far withstood published cryptanalysis: differential and linear attacks reach only reduced-round variants, and no practical break of the full 32 rounds has been demonstrated in peer-reviewed analyses. The cipher is deployed in hardware implementations in Chinese smart cards, VPNs, and other systems, underscoring its role in the ShangMi (SM) family of algorithms that prioritize national self-reliance in infrastructure.

Technical Specifications

Block and Key Parameters


SM4 is a symmetric block cipher standardized in GB/T 32907-2016, with a fixed block size of 128 bits and a key length of 128 bits. Both encryption and decryption consist of 32 rounds and use an unbalanced Feistel network structure.
The plaintext and ciphertext blocks are divided into four 32-bit words, typically denoted $X_0, X_1, X_2, X_3$ for processing. The user key, referred to as the master key $MK$, is likewise structured as four 32-bit words $MK_0, MK_1, MK_2, MK_3$. System parameters include the fixed 32-bit values $FK_0, FK_1, FK_2, FK_3$ used in key expansion, so that every implementation derives the same round keys from a given master key.
Parameter | Description | Value
Block size | Length of plaintext/ciphertext block | 128 bits
Key size | Length of user key | 128 bits
Number of rounds | Iterations in encryption/decryption | 32
Word size | Internal processing unit | 32 bits
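
The word split described above can be sketched in a few lines of Python. This is only an illustration: the helper names `block_to_words` and `words_to_block` are made up for this example, and the big-endian byte order is an assumption that follows common SM4 reference implementations rather than anything stated in this section.

```python
import struct

def block_to_words(block):
    """Split a 16-byte block into four 32-bit words (X0, X1, X2, X3).

    Assumption: big-endian byte order, most significant byte first.
    """
    if len(block) != 16:
        raise ValueError("SM4 blocks are exactly 16 bytes (128 bits)")
    return struct.unpack(">4I", block)

def words_to_block(words):
    """Inverse of block_to_words: pack four 32-bit words into 16 bytes."""
    return struct.pack(">4I", *words)

block = bytes.fromhex("0123456789abcdeffedcba9876543210")
words = block_to_words(block)
print([f"{w:08x}" for w in words])  # ['01234567', '89abcdef', 'fedcba98', '76543210']
```

The same helpers apply unchanged to the 128-bit master key, which is split into $MK_0, \ldots, MK_3$ in the same way.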

Overall Structure and Operations

SM4 is a symmetric-key block cipher that transforms 128-bit plaintext blocks into 128-bit ciphertext blocks under a 128-bit secret key. It employs an unbalanced Feistel network with 32 rounds of identical transformations; unlike a balanced Feistel cipher, each round applies the round transformation to a combination of three state words in order to update the fourth. The plaintext is divided into four consecutive 32-bit words, denoted $X_0, X_1, X_2, X_3$. In each round $i$ (where $i$ ranges from 0 to 31), the state is updated according to the recurrence $X_{i+4} = X_i \oplus T(X_{i+1} \oplus X_{i+2} \oplus X_{i+3} \oplus rk_i)$, with $T$ the mixing transformation at the core of the round function and $rk_i$ the 32-bit round subkey derived from the master key. This operation shifts the state words leftward while XORing the output of $T$ into the oldest word, maintaining the unbalanced partition in which one 32-bit word is updated from the remaining 96 bits XORed with the subkey. After the 32 rounds, the ciphertext is formed by concatenating $X_{35} \| X_{34} \| X_{33} \| X_{32}$. Decryption mirrors encryption exactly: it uses the same round function and structure but applies the round subkeys in reverse order, from $rk_{31}$ to $rk_0$, and the plaintext is again read out as $X_{35} \| X_{34} \| X_{33} \| X_{32}$ of the decryption run. Because only the key order changes, no additional swaps or rearrangements are required.

Round Function Details

The SM4 block cipher employs an unbalanced Feistel network structure with 32 rounds of iteration. The plaintext is divided into four 32-bit words denoted $X_0, X_1, X_2, X_3$. In the $i$-th round (where $i = 0$ to $31$), the state update is $X_{i+4} = F(X_i, X_{i+1}, X_{i+2}, X_{i+3}, rk_i)$, where $rk_i$ is the 32-bit round key for that round and $F$ is the round function. $F$ operates on inputs $X_0, X_1, X_2, X_3, rk$ as $F(X_0, X_1, X_2, X_3, rk) = X_0 \oplus T(X_1 \oplus X_2 \oplus X_3 \oplus rk)$, with $T$ composed of a nonlinear byte-wise substitution $\tau$ followed by a linear transformation $L$. This design introduces confusion and diffusion by XORing the three rightmost words with the round key before applying $T$. After the 32 rounds, the final state consists of the words $X_{32}, X_{33}, X_{34}, X_{35}$, and the ciphertext is formed by reversing their order: $Y_0 = X_{35}$, $Y_1 = X_{34}$, $Y_2 = X_{33}$, $Y_3 = X_{32}$. Decryption mirrors the process, using the same round function but applying the round keys in reverse sequence ($rk_{31}$ to $rk_0$) and performing the same final word reversal. Encryption and decryption are therefore nearly identical algorithms, differing only in round key order, which facilitates efficient implementation in hardware and software.
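
The encryption/decryption symmetry described above does not depend on what $T$ actually is, which a small sketch can make concrete. The code below is a structural illustration only: `toy_t` is a made-up stand-in for $T$ (it contains no S-box and is not SM4's transformation), and the round keys are arbitrary hypothetical values, so the output is not SM4 ciphertext.

```python
MASK32 = 0xFFFFFFFF

def toy_t(word):
    """Hypothetical stand-in for SM4's T. NOT the real transformation."""
    word &= MASK32
    return ((word * 0x9E3779B1) & MASK32) ^ (word >> 7)

def crypt_words(x, round_keys, t=toy_t):
    """Apply X[i+4] = X[i] ^ T(X[i+1] ^ X[i+2] ^ X[i+3] ^ rk[i]) for each key."""
    state = list(x)
    for rk in round_keys:
        new_word = state[0] ^ t(state[1] ^ state[2] ^ state[3] ^ rk)
        state = state[1:] + [new_word]  # shift left, append the new word
    # Final reversal: (Y0, Y1, Y2, Y3) = (X35, X34, X33, X32)
    return tuple(reversed(state))

# Arbitrary illustrative round keys; a real cipher derives them from the master key.
round_keys = [(0x01010101 * i) & MASK32 for i in range(32)]
plaintext = (0x01234567, 0x89ABCDEF, 0xFEDCBA98, 0x76543210)

ciphertext = crypt_words(plaintext, round_keys)
recovered = crypt_words(ciphertext, round_keys[::-1])  # same routine, keys reversed
assert recovered == plaintext
```

Because each round only XORs a function of three words into the fourth, the round can be undone by recomputing the same value, so $T$ itself never has to be inverted.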

S-box and Linear Transformations

The SM4 cipher uses a single fixed S-box for nonlinear substitution, mapping 8-bit inputs to 8-bit outputs via a 256-entry lookup table given in hexadecimal notation. This S-box, denoted $S(\cdot)$, has a nonlinearity of 112, a maximum differential probability of $2^{-6}$, and a linear approximation bias bounded by $2^{-4}$; it is bijective, complete, and satisfies the strict avalanche criterion, so that on average half of the output bits change when a single input bit is flipped. The S-box can be expressed algebraically as a degree-254 polynomial over GF(2^8), though practical implementations rely on the table to avoid computational overhead.

In the round function, the nonlinear transformation $\tau$ processes a 32-bit word by splitting it into four bytes and applying the S-box to each in parallel: if the input is $A = (a_0, a_1, a_2, a_3)$, then $\tau(A) = (S(a_0), S(a_1), S(a_2), S(a_3))$.

Diffusion follows via a linear transformation on the 32-bit output of $\tau$. The primary transformation $L$, used in the data round function, operates on a 32-bit input $B$ as $L(B) = B \oplus (B \ll 2) \oplus (B \ll 10) \oplus (B \ll 18) \oplus (B \ll 24)$, where $\ll$ denotes left cyclic rotation and $\oplus$ bitwise XOR. $L$ is invertible, with $L^{-1}(B) = B \oplus (B \ll 2) \oplus (B \ll 4) \oplus (B \ll 8) \oplus (B \ll 12) \oplus (B \ll 14) \oplus (B \ll 16) \oplus (B \ll 18) \oplus (B \ll 22) \oplus (B \ll 24) \oplus (B \ll 30)$. The round transformation is then $T = L \circ \tau$. A related transformation $L'$, used in key expansion, is the simpler $L'(B) = B \oplus (B \ll 13) \oplus (B \ll 23)$, yielding $T' = L' \circ \tau$. These operations, specified in the Chinese national standard GB/T 32907-2016, promote rapid mixing. Decryption never needs $L^{-1}$, because the Feistel structure makes decryption equivalent to encryption with reversed round keys.
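
The rotation-and-XOR transformations are easy to check numerically. The following is a minimal sketch, with helper names (`rotl32`, `xor_rotations`, `L_inv`) chosen for this example. The rotation set used for the inverse was obtained by inverting $L$ as a polynomial modulo $x^{32} + 1$ over GF(2); it is an assumption of this sketch that the round-trip assertion then checks on a few sample words.

```python
MASK32 = 0xFFFFFFFF

def rotl32(x, n):
    """Left cyclic rotation of a 32-bit word by n bits."""
    n %= 32
    return ((x << n) | (x >> (32 - n))) & MASK32

def xor_rotations(x, amounts):
    """XOR together the rotations of x by each amount (0 means x itself)."""
    out = 0
    for n in amounts:
        out ^= rotl32(x, n)
    return out

L_ROT = (0, 2, 10, 18, 24)        # data-path transformation L
L_PRIME_ROT = (0, 13, 23)         # key-schedule transformation L'
L_INV_ROT = (0, 2, 4, 8, 12, 14, 16, 18, 22, 24, 30)  # derived inverse of L

def L(b):
    return xor_rotations(b, L_ROT)

def L_prime(b):
    return xor_rotations(b, L_PRIME_ROT)

def L_inv(b):
    return xor_rotations(b, L_INV_ROT)

for sample in (0x00000001, 0x01234567, 0xDEADBEEF):
    assert L_inv(L(sample)) == sample  # round trip recovers the input

print(f"{L(1):08x}")        # 01040405
print(f"{L_prime(1):08x}")  # 00802001
```

A single set input bit spreads to five bit positions under $L$ and three under $L'$, which is the diffusion step that follows the byte-wise S-box layer.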

Key Schedule and Expansion

Key Generation Process

The SM4 key generation process, also known as the key expansion algorithm, derives 32 round keys $rk_0$ to $rk_{31}$, each 32 bits long, from a 128-bit master key $MK$. The master key is divided into four 32-bit words: $MK = (MK_0, MK_1, MK_2, MK_3)$. The expansion gives the round keys diffusion and nonlinearity across the 32 encryption rounds, using a structure analogous to the main cipher rounds but with distinct transformations.

Fixed system parameters $FK = (FK_0, FK_1, FK_2, FK_3)$ initialize the process, with values $FK_0 = \mathrm{A3B1BAC6}_{16}$, $FK_1 = \mathrm{56AA3350}_{16}$, $FK_2 = \mathrm{677D9197}_{16}$, and $FK_3 = \mathrm{B27022DC}_{16}$. Initial array elements are computed as $K_0 = MK_0 \oplus FK_0$, $K_1 = MK_1 \oplus FK_1$, $K_2 = MK_2 \oplus FK_2$, and $K_3 = MK_3 \oplus FK_3$. Additionally, 32 fixed parameters $CK = (CK_0, CK_1, \dots, CK_{31})$ are predefined, where each $CK_i$ is a 32-bit word whose bytes, listed from most significant to least significant, are $ck_{i,j} = (4i + j) \times 7 \pmod{256}$ for $j = 0$ to $3$; for example, $CK_0 = \mathrm{00070E15}_{16}$.

The expansion proceeds iteratively for $i = 0$ to $31$: $K_{i+4} = K_i \oplus T'(K_{i+1} \oplus K_{i+2} \oplus K_{i+3} \oplus CK_i)$, where $T'$ is a key schedule transformation defined as $T'(X) = L'(\tau(X))$. The function $\tau$ applies the SM4 S-box in parallel to each byte of the 32-bit input $X$, producing a nonlinear byte-wise output. The linear transformation $L'$ then operates on this result $B$ as $L'(B) = B \oplus (B \ll 13) \oplus (B \ll 23)$, using 32-bit word rotations (denoted $\ll$) and XOR operations. The round keys are assigned as $rk_i = K_{i+4}$ for $i = 0$ to $31$, yielding $rk_0 = K_4$ through $rk_{31} = K_{35}$; the initial words $K_0$ to $K_3$ are never used as round keys.
For decryption, the same round keys are applied in reverse order ($rk_{31}$ to $rk_0$) across the 32 rounds, relying on the cipher's involutory structure, in which encryption and decryption differ only in key sequencing. The key schedule's reliance on the S-box and rotations aims to resist related-key attacks, and its structural similarity to the data path facilitates efficient hardware implementation.
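
The byte formula for the $CK$ constants can be turned into a short generator. This sketch assumes the bytes are packed most significant first, which is the ordering consistent with the example value $CK_0 = \mathrm{00070E15}_{16}$; the function name `ck_constant` is illustrative.

```python
def ck_constant(i):
    """Build CK_i from bytes (4i + j) * 7 mod 256, j = 0..3, most significant first."""
    word = 0
    for j in range(4):
        word = (word << 8) | (((4 * i + j) * 7) % 256)
    return word

CK = [ck_constant(i) for i in range(32)]
print(f"{CK[0]:08x} {CK[1]:08x} {CK[31]:08x}")  # 00070e15 1c232a31 646b7279
```

Generating the constants on the fly like this trades a 128-byte table for a few multiplications, which is the same idea hardware designs use when they compute constants instead of storing them.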

Round Key Derivation

The round key derivation in SM4 begins with the 128-bit master key $MK = (MK_0, MK_1, MK_2, MK_3)$, where each $MK_i$ is a 32-bit word. These are XORed with fixed parameters $FK = (FK_0, FK_1, FK_2, FK_3)$, defined in hexadecimal as $FK_0 = \mathrm{A3B1BAC6}_{16}$, $FK_1 = \mathrm{56AA3350}_{16}$, $FK_2 = \mathrm{677D9197}_{16}$, and $FK_3 = \mathrm{B27022DC}_{16}$, to initialize the intermediate values $K_0 = MK_0 \oplus FK_0$, $K_1 = MK_1 \oplus FK_1$, $K_2 = MK_2 \oplus FK_2$, and $K_3 = MK_3 \oplus FK_3$. The round keys $rk_i$ for $i = 0$ to $31$ are then generated iteratively using $K_{i+4} = K_i \oplus T'(K_{i+1} \oplus K_{i+2} \oplus K_{i+3} \oplus CK_i)$, where $rk_i = K_{i+4}$ and the $CK_i$ are fixed 32-bit constants whose bytes follow $ck_{i,j} = (4i + j) \times 7 \pmod{256}$ for $j = 0$ to $3$ (e.g., $CK_0 = \mathrm{00070E15}_{16}$, $CK_{31} = \mathrm{646B7279}_{16}$). The transformation $T'$ is defined as $T'(X) = L'(\tau(X))$, with $\tau$ applying the SM4 S-box to each byte of the 32-bit input and $L'(B) = B \oplus (B \ll 13) \oplus (B \ll 23)$, where $\ll$ denotes left circular rotation by the specified number of bits. This nonlinear key schedule mirrors the structure of the main rounds but uses the distinct fixed parameters $FK$ and $CK$ to avoid symmetry between rounds, producing 32 round keys of 32 bits each for the 32 rounds. For decryption, the same round keys are applied in reverse order ($rk_{31}$ to $rk_0$), relying on the cipher's involutory design without requiring a separate decryption routine.
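
The derivation above can be sketched as a skeleton that takes the S-box as a parameter. This is a hedged illustration, not a reference implementation: the 256-entry S-box table is not reproduced in this article, so the demo passes an identity table (`IDENTITY_SBOX`) as a placeholder, which means the printed values are not real SM4 round keys. All function names are hypothetical; the $FK$ values, the $CK$ formula, and $L'$ follow the text.

```python
MASK32 = 0xFFFFFFFF
FK = (0xA3B1BAC6, 0x56AA3350, 0x677D9197, 0xB27022DC)

def rotl32(x, n):
    return ((x << n) | (x >> (32 - n))) & MASK32

def tau(word, sbox):
    """Apply the S-box table to each of the four bytes of a 32-bit word."""
    out = 0
    for shift in (24, 16, 8, 0):
        out |= sbox[(word >> shift) & 0xFF] << shift
    return out

def t_prime(word, sbox):
    """T'(X) = L'(tau(X)) with L'(B) = B ^ (B <<< 13) ^ (B <<< 23)."""
    b = tau(word, sbox)
    return b ^ rotl32(b, 13) ^ rotl32(b, 23)

def ck_constant(i):
    """CK_i from bytes (4i + j) * 7 mod 256, most significant byte first."""
    word = 0
    for j in range(4):
        word = (word << 8) | (((4 * i + j) * 7) % 256)
    return word

def expand_key(mk_words, sbox):
    """Return rk_0..rk_31 via K[i+4] = K[i] ^ T'(K[i+1] ^ K[i+2] ^ K[i+3] ^ CK_i)."""
    k = [mk ^ fk for mk, fk in zip(mk_words, FK)]  # K_0..K_3
    round_keys = []
    for i in range(32):
        nxt = k[i] ^ t_prime(k[i + 1] ^ k[i + 2] ^ k[i + 3] ^ ck_constant(i), sbox)
        k.append(nxt)
        round_keys.append(nxt)  # rk_i = K_{i+4}
    return round_keys

# Placeholder table for demonstration only; NOT the SM4 S-box.
IDENTITY_SBOX = list(range(256))
master_key = (0x01234567, 0x89ABCDEF, 0xFEDCBA98, 0x76543210)

enc_keys = expand_key(master_key, IDENTITY_SBOX)
dec_keys = enc_keys[::-1]  # decryption uses rk_31 down to rk_0
print(len(enc_keys), f"{enc_keys[0]:08x}")
```

To obtain real round keys, the placeholder would be replaced by the S-box table from GB/T 32907-2016; the rest of the skeleton stays the same.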

History and Development

Origins in Chinese Cryptographic Standards

The SM4 block cipher, originally designated SMS4, was developed specifically for the WLAN Authentication and Privacy Infrastructure (WAPI), China's national standard for securing wireless local area networks, codified as GB 15629.11-2003. WAPI was mandated by the Chinese government for all WLAN products sold domestically, aiming to establish a homegrown security protocol amid tensions over foreign standards like Wi-Fi Protected Access. The cipher's design emphasized a 128-bit block size and key length with 32 rounds of unbalanced Feistel operations, tailored for efficient hardware implementation in resource-constrained wireless environments. SMS4 was publicly disclosed on January 15, 2006, by the Office of State Commercial Cryptography Administration (OSCCA, now the State Cryptography Administration), marking its initial release as part of a commercial cryptographic algorithm suite independent of international standards such as AES. The disclosure provided the full specification, including the nonlinear and linear transformations, enabling global scrutiny while its use remained required in WAPI-compliant devices. The algorithm's origins reflect China's strategic push for cryptographic sovereignty; it was later adopted as SM4 under the national standard GB/T 32907-2016, published on August 25, 2016.

Standardization Process

The SM4 block cipher, initially designated SMS4, was developed by the Chinese government as a national cryptographic standard to support the WLAN Authentication and Privacy Infrastructure (WAPI), a security protocol defined in GB 15629.11-2003. Released publicly in January 2006 by the Office of State Commercial Cryptography Administration (OSCCA), SMS4 was specified for protecting data within WAPI-compliant devices, serving as an indigenous alternative to international ciphers like AES amid concerns over dependence on foreign technology. The publication included detailed specifications for public scrutiny and implementation. Adoption remained largely domestic: WAPI's mandatory requirements sparked international disputes, including the rejection of WAPI's integration into the IEEE 802.11i standards over control issues.

Following initial deployment in WAPI products, the cipher underwent formal standardization as a commercial cryptographic algorithm. On March 21, 2012, the OSCCA issued GM/T 0002-2012, renaming SMS4 to SM4 and establishing it as the official block cipher for non-classified commercial applications in China, with mandates for its use in government-approved products requiring cryptographic protection. This standard emphasized SM4's Feistel structure, 128-bit block and key sizes, and 32-round iteration, while requiring implementations to undergo testing by authorized labs to ensure compliance and resistance to known attacks.

SM4's status was further elevated in 2016, when it was promulgated as the national standard GB/T 32907-2016 by the Standardization Administration of China, solidifying its role as the primary symmetric cipher for widespread commercial and industrial use, including in smart cards, VPNs, and other systems. This progression from a WAPI-specific algorithm to a broad national standard reflected China's strategic emphasis on cryptographic self-reliance, with ongoing evaluations of its robustness against cryptanalytic advances, though implementation guidelines remain partially restricted to prevent unauthorized exports.

Initial Secrecy and Public Disclosure

SM4, initially designated SMS4, was developed by the Chinese State Cryptography Administration for securing the WLAN Authentication and Privacy Infrastructure (WAPI) standard, with its details kept confidential following WAPI's announcement in December 2003. The algorithm's specification remained classified to protect national cryptographic interests, limiting international scrutiny and contributing to disputes over WAPI's compatibility with global standards. In January 2006, the State Cryptography Administration declassified and publicly released the SMS4 algorithm to enable cryptanalytic evaluation and broader implementation, amid pressure from WAPI's mandatory adoption in China and ongoing debates. The disclosure revealed SMS4 as a 128-bit block cipher with a Feistel-like structure, prompting immediate academic analysis that found no attack on the full 32 rounds.

On March 21, 2012, the algorithm was formally standardized as SM4 under GM/T 0002-2012 by the Commercial Cryptography Administration of China, renaming it for commercial use and integrating it into national cryptographic guidelines while maintaining export controls on related technologies. This transition marked its evolution from a WAPI-specific cipher to a general-purpose standard, though full implementation details continued to require official authorization in sensitive applications.

Cryptanalysis and Security Evaluation

Resistance to Differential and Linear Attacks

SM4 employs an unbalanced Feistel structure with 32 rounds, incorporating a nonlinear S-box in the round function that provides strong confusion and diffusion properties. The design aims to thwart differential and linear cryptanalysis by ensuring a sufficient number of active S-boxes across multiple rounds. The cipher's designers targeted a security margin comparable to AES, with the S-box selected to maximize resistance against these attacks through low differential and linear probabilities.

In differential cryptanalysis, the maximum probability of a 1-round differential characteristic is bounded by $2^{-6.17}$, derived from the S-box's differential distribution table, leading to an expected minimum of 25-28 active S-boxes over the full 32 rounds under related-key settings, which exceeds the threshold needed to push characteristic probabilities below $2^{-128}$. Optimal 19-round differential characteristics achieve a probability upper bound of $2^{-123}$, but extending them to the full cipher requires infeasible data and computation, and no practical key-recovery attack is known. Reduced-round attacks, such as those on 22 rounds, remain theoretical and do not carry over to the full cipher because of the round count.

For linear cryptanalysis, the best approximations over 3-round iterations yield biases around $2^{-20}$ to $2^{-24}$, so a negligible full-round bias requires on the order of 40 active S-boxes; reported lower bounds for SM4 are 36-40 linearly active S-boxes across 32 rounds. Attacks on reduced variants include a 22-round linear key recovery with $2^{117}$ complexity and a 25-round improvement using refined statistics, both far from practical, with data and time requirements close to those of exhaustive search. No full-round linear attack has been demonstrated, though ongoing research continues to refine the bounds via mixed-integer linear programming without compromising the full cipher.
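
A rough worked bound may help make the active S-box argument concrete. It assumes, as a simplification that is not taken from the standard, that each active S-box contributes independently at most the S-box's maximum differential probability of $2^{-6}$, so a characteristic with $n$ active S-boxes has probability at most $2^{-6n}$:

```latex
\Pr[\text{characteristic}] \le \left(2^{-6}\right)^{n} = 2^{-6n},
\qquad
2^{-6n} < 2^{-128}
\iff n > \tfrac{128}{6} \approx 21.3
\iff n \ge 22 .
```

Under that assumption, any characteristic with at least 22 active S-boxes already falls below $2^{-128}$, which is why active S-box counts in the mid-20s or higher are read as leaving a margin.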

Side-Channel and Fault Injection Vulnerabilities

SM4 implementations are susceptible to side-channel attacks, particularly differential power analysis (DPA) and correlation power analysis (CPA), which exploit power consumption variations during S-box lookups and linear transformations to recover round keys with thousands of traces. Distributed CPA variants have been shown to reduce the required traces and computation time by partitioning power traces into subsets, enabling key recovery on SM4 hardware chips using standard oscilloscopes and correlation metrics. Deep learning-based side-channel analysis has also been demonstrated against masked SM4 implementations, classifying intermediate values from electromagnetic or power traces to bypass first-order protections, often requiring fewer than 1,000 traces for full key extraction.

Software implementations of SM4 face cache-timing vulnerabilities due to table lookups in the key expansion and round functions, where access patterns leak information via cache state differences observable across multiple executions. Hardware realizations without masking or threshold schemes remain vulnerable to second-order DPA, which targets multivariate leakage from non-linear operations like the S-box and can recover keys after 10^5 to 10^6 traces depending on noise levels.

Fault injection attacks exploit SM4's iterative structure. Differential fault analysis (DFA) allows key recovery by inducing random byte faults in the last few rounds and solving for round key differences using output differentials. A single random byte fault in the penultimate round suffices for DFA on SM4, enabling enumeration of the fault-affected byte and propagation to recover the full 128-bit key with 2^8 to 2^16 computations per candidate. Persistent fault analysis (PFA) targets T-table implementations, where one fault in the inverse linear transformation combined with differential equations leaks the entire key without additional faults.

Practical low-cost fault injection has recovered SM4 keys on commercial SoCs using voltage glitching or pulsed electromagnetic (EM) probes, inducing single-bit or byte faults in 2-4 rounds with success rates exceeding 50% per attempt after 10-20 injections. DFA variants on early rounds require 16-32 faults for full key recovery under controlled injection, assuming the attacker has access to plaintext-ciphertext pairs and can locate faults via internal collisions. These vulnerabilities highlight SM4's sensitivity to implementation faults, comparable to that of AES, given its fixed 32-round design and the lack of inherent fault detection in standard deployments.

Advanced and Theoretical Attacks

Advanced cryptanalytic efforts on SM4 have explored techniques beyond standard differential and linear cryptanalysis, including impossible differentials, boomerang and rectangle distinguishers, multiple linear approximations, and algebraic methods, but all successful attacks remain confined to reduced rounds, with complexities far beyond practical reach and no extension to the full 32 rounds. These approaches exploit structural properties of SM4's Feistel-like network and S-box, yet the cipher's design provides a substantial margin against practical key recovery: the attacks in this class reach at most 22 rounds, with data and time requirements around 2^{112} to 2^{124}.

Impossible differential attacks leverage input-output differences that cannot propagate through certain round combinations, allowing key candidates to be sieved. A 17-round impossible differential attack requires approximately 2^{103} chosen plaintexts, 2^{124} encryptions, and 2^{89} words of memory, improving slightly on prior 16-round variants that used similar but less efficient propagators. Extensions and verifications of 12-round impossible differentials have confirmed their validity but do not extend to higher rounds without increasing complexity prohibitively. These attacks cover roughly half of SM4's rounds, underscoring the resistance of the non-linear diffusion to longer propagators.

Boomerang and related attacks combine short differential trails in quartets to distinguish reduced SM4 from random permutations. Boomerang distinguishers and attacks have been applied to 18 rounds, outperforming single-trail differentials in round coverage but requiring chosen plaintexts on the order of 2^{100} or more, with time complexities approaching 2^{120}. Such methods exploit the cipher's unbalanced Feistel structure but falter beyond 18 rounds due to low-probability quartets and the accumulating effect of the 32 round keys.

Multiple linear cryptanalysis aggregates numerous low-bias approximations to amplify overall distinguishability.
One such attack targets 22 rounds using six 18-round characteristics with aggregate bias 2^{-56.14} and two with 2^{-57.28}, necessitating 2^{112} known plaintext-ciphertext pairs and roughly 2^{124.21} operations for key recovery. This extends linear coverage beyond single-trail limits but remains theoretical, as the data volume is infeasible and the attack does not reach the full cipher.

Algebraic attacks model SM4's S-box and linear layers as multivariate equations over GF(2) or GF(2^8), seeking low-degree solutions via Gröbner bases or SAT solvers. Applications to 20-round SM4, often combined with differentials, yield partial key bits but no full key recovery, with solving times scaling exponentially due to the non-linear F-function's resistance to algebraic simplification. These efforts highlight SM4's algebraic degree but confirm no advantage over exhaustive search in practical scenarios. Overall, the absence of attacks nearing 32 rounds affirms SM4's theoretical security against known advanced techniques.

Comparative Security with AES

SM4 and AES-128 both provide 128-bit block and key sizes, yielding equivalent brute-force security levels of approximately $2^{128}$ operations against exhaustive search. SM4 uses an unbalanced Feistel network with 32 rounds, incorporating a nonlinear S-box and linear transformation in its F-function, while AES-128 employs a substitution-permutation network (SPN) with 10 rounds, featuring byte-wise SubBytes, ShiftRows, MixColumns, and AddRoundKey operations. Both architectures achieve full diffusion and confusion; SM4's higher round count contributes to its margin against iterative attacks, though each SM4 round updates only one-quarter of the state.

Against differential cryptanalysis, SM4 resists attacks beyond 19 rounds based on optimized differential distinguishers, with probability bounds indicating that full-round attacks exceed $2^{128}$ complexity. AES-128 similarly withstands differential attacks, with no known differential attack on its full 10 rounds. Linear cryptanalysis evaluations find that SM4's linear approximations yield biases insufficient for full-round key recovery, comparable to AES, whose wide-trail design keeps linear trails well below exploitable levels at full strength. Both ciphers use S-boxes designed for low differential uniformity (4 in both cases; SM4 activates at most four S-boxes per round), and related-key and other variants reach only reduced rounds without practical full breaks. Algebraic attacks, which model rounds as multivariate equations over GF(2), suggest that SM4's structure imposes a high algebraic degree and nonlinearity, potentially requiring more variables for Gröbner basis solutions than AES-128's polynomial system, which would indicate relative robustness.

Side-channel analyses, such as differential power analysis, exploit implementation leaks in both ciphers. It has been suggested that SM4's interleaved key-data mixing in rounds may reduce Hamming weight correlations compared to AES's sequential key addition, though unprotected implementations of either cipher remain attackable. No full-round practical key recoveries exist for either cipher. AES has endured broader independent scrutiny since 2001, while SM4's evaluations, following the 2006 disclosure, predominantly stem from Chinese-led research with fewer Western verifications.

Implementations and Performance

Software Implementations

OpenSSL provides support for SM4 through its EVP interface, enabling symmetric encryption in modes such as CBC and XTS, with the latter added in version 3.2.0, released in November 2023. This integration allows developers to invoke SM4 for encryption and decryption via standard function calls, as used in distributions like openEuler. Bouncy Castle, a Java cryptography library, implements SM4 via the SM4Engine class, which handles 128-bit block and key processing based on the cipher's specification. wolfSSL incorporated SM4 into its wolfCrypt library in July 2023, extending support to TLS 1.3 and other embedded applications.

Software optimizations for SM4 use SIMD instructions and bit-slicing techniques to raise throughput. A bit-sliced implementation achieves 2,437 Mbps on x86 processors using AVX2 extensions. Another approach yields 2,580 Mbps on an Intel i7-7700HQ at 2.80 GHz, surpassing prior benchmarks by 43%. Constant-time implementations report 3.77 cycles per byte on x86 platforms with AES-NI and AVX2. The Linux kernel's crypto API includes SM4 accelerations via AVX and AES-NI instructions, delivering approximately 5x performance gains over baseline scalar code on modern Intel and AMD CPUs as of patches from June 2021. These optimizations prioritize side-channel resistance while maintaining compatibility with standard modes like ECB, CBC, and GCM.

Hardware and Optimized Designs

Hardware implementations of SM4 capitalize on its 32-round Feistel-like structure, fixed nonlinear S-box, and rotation-based linear transformations, enabling efficient parallelism through techniques such as pipelining and shared logic for key expansion and rounds. In ASIC designs using SMIC 18 nm technology, an optimized 8-bit iterative architecture (ULSM4) achieves 2.51 thousand gate equivalents (KGE), an 18% area reduction over unrolled counterparts, with an encryption throughput of 217.5 Mbps at 435 MHz and a decryption throughput of 149.7 Mbps. Key efficiencies stem from a single shared S-box for both the key schedule and the data path, on-the-fly key expansion that eliminates round key storage, and dynamic constant generation via equations instead of lookup tables.

Field-programmable gate array (FPGA) implementations balance area, throughput, and power for reconfigurable systems, with scalar (iterative) designs favoring intermittent workloads and pipelined variants suiting high-volume streams. On platforms like Cyclone V, scalar designs with 1-2 rounds per iteration consume around 1,058 logic elements (LEs) and 14.87 mW, yielding about 400 Mbps of throughput, while 8-16 round pipelined configurations reach up to 4 Gbps but require 14,860 LEs and higher power (163 mW), offering 40% better energy efficiency per block (e.g., 3,262 pJ/block). Commercial cores, such as CAST's SM4 IP, deliver up to 8 Gbps in ASICs and 2.6 Gbps in FPGAs with minimal area overhead.

For resource-constrained IoT applications, combined SM4-CCM modes emphasize low power and area, as in a 90 nm ASIC/FPGA design using online key expansion and a single SM4 core with nonlinear transform optimizations (NLT4), attaining 200 Mbps throughput, 14.6 KGE, and 1.625 mW consumption. Advanced techniques like split-and-join processing and off-peak staggering further adapt SM4 to ultra-low-resource environments by redistributing computations and minimizing peak resource demands.

Quantum and Emerging Implementations

Quantum circuit implementations of SM4 have been developed to evaluate its execution on quantum hardware and to quantify resources for potential quantum attacks. An optimized reversible circuit requires 260 qubits, the lowest reported for SM4 or similar block ciphers with 8-bit S-boxes, 128-bit plaintext, and 128-bit keys. This design incorporates composite field arithmetic for four S-box variants, serial subcircuit connections to minimize qubits, and parallel structures to balance depth and width. A 288-qubit trade-off variant with a Toffoli depth of 1,716 achieves a depth-times-width product of 494,208, compared with more than 82 million for prior implementations. Alternative constructions exploit SM4's Feistel network to reuse 32 state qubits across rounds and decompose linear transformations into fewer XOR operations (83 in one reported case), while the S-boxes use 14 auxiliary qubits without initial-state constraints. Parallelism in S-box evaluation trades qubit count (128 + 14n, where n is the number of parallel S-boxes) for reduced depth-times-width, enabling fault-tolerant adaptations via surface codes.

These circuits facilitate Grover-based exhaustive key searches, in which each oracle evaluation demands the full SM4 computation. In post-quantum contexts, SM4's 128-bit key yields approximately 64-bit security against Grover's algorithm, necessitating ~2^{64} oracle calls, each with a substantial Toffoli and CNOT gate cost according to the circuit metrics above. Evaluations show SM4's quantum resource profile (higher qubit and depth-width demands relative to AES-128) as marginally less attacker-friendly, though both resist practical quantum threats given current hardware limitations of noisy intermediate-scale systems with under 1,000 qubits and high error rates. Emerging adaptations recommend doubling key sizes or hybrid modes for 128-bit post-quantum security, as SM4 lacks native 256-bit keys.
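The resource figures in this section follow from simple arithmetic, which the sketch below reproduces. It checks the depth-times-width product quoted for the 288-qubit variant, tabulates the 128 + 14n qubit count for a few values of n, and estimates the number of Grover oracle calls. The (π/4)·2^(k/2) iteration count is the standard textbook estimate for a single target key and is an assumption added here; the text itself only states ~2^{64}. All function names are illustrative.

```python
import math


def depth_times_width(toffoli_depth: int, qubits: int) -> int:
    """Depth-times-width cost metric for a quantum circuit."""
    return toffoli_depth * qubits


def qubits_for_parallel_sboxes(n: int) -> int:
    """Qubit count 128 + 14n for n S-boxes evaluated in parallel."""
    if n < 1:
        raise ValueError("at least one S-box is required")
    return 128 + 14 * n


def grover_oracle_calls(key_bits: int) -> float:
    """Textbook estimate of Grover iterations for one k-bit target key."""
    return (math.pi / 4) * 2 ** (key_bits / 2)


if __name__ == "__main__":
    print(depth_times_width(1716, 288))  # 494208, matching the quoted figure

    for n in (1, 2, 4):
        print(n, qubits_for_parallel_sboxes(n))

    calls = grover_oracle_calls(128)
    print(f"about 2^{math.log2(calls):.2f} oracle calls")
```

The estimate lands slightly below 2^64 because of the π/4 factor, which is why the requirement is usually rounded to "about 2^64" full SM4 evaluations.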

Adoption, Applications, and Criticisms

Domestic Use in China

SM4 serves as the foundational block cipher in China's WLAN Authentication and Privacy Infrastructure (WAPI), the national standard for securing wireless local area networks (WLANs), where it provides confidentiality and integrity for data transmission in domestic environments. Adopted by the Chinese government in 2006 as a commercial cryptography standard, SM4 underpins WAPI's security mechanisms, and its use is mandated in certified WLAN equipment to align with national requirements for communications. This integration promotes indigenous cryptographic protocols over international alternatives like AES in government-approved networks.

The State Cryptography Administration (SCA), formerly the Office of State Commercial Cryptography Administration (OSCCA), authorizes SM4 for protecting both government data and commercial transactions within China, as outlined in standards such as GB/T 32907-2016 for applications in information security technology. Chinese regulators enforce SM4 alongside other ShangMi (SM) algorithms in several sectors, including financial systems (secure transactions), network protocol encryption, and automotive electronics (vehicle-to-vehicle communications). SM4's domestic deployment extends to adaptations of Transport Layer Security (TLS) and other protocols, enabling self-reliant implementations in enterprise and public sector systems to minimize dependence on foreign cryptography. Its specification in GM/T 0002-2012 further standardizes modes of operation for widespread use in symmetric encryption scenarios, supporting applications, including network protocols, in state-controlled and commercial domains.

International Adoption and Barriers

SM4 has achieved formal recognition in select international standards, facilitating limited use beyond China. In 2021, the International Organization for Standardization (ISO) incorporated SM4 into ISO/IEC 18033-3 via Amendment 1, listing it among approved block ciphers for encryption algorithms. In the same year, RFC 8998 defined ShangMi cipher suites incorporating SM4 for Transport Layer Security (TLS) 1.3, primarily to support interoperability in cross-border communications with Chinese networks. These inclusions enable SM4 in protocols requiring compatibility with Chinese commercial cryptography, such as secure data exchange in multinational supply chains or VPNs interfacing with state-mandated systems.

Hardware implementations have emerged internationally to meet niche demands. In September 2025, CAST announced a high-performance SM4 cipher IP core for integration into ASICs and FPGAs, optimized for throughput in embedded applications compliant with both Chinese GB/T 32907-2016 and ISO standards. Such offerings target products exported to or from China, where dual-cipher support (e.g., alongside AES) ensures regulatory adherence without full replacement of established algorithms.

Despite these developments, SM4's global adoption remains marginal, overshadowed by AES in most software ecosystems and protocols. Some major libraries have not prioritized native SM4 support, limiting its deployment to custom or vendor-specific extensions for China-facing services.

Key barriers stem from early controversies surrounding the cipher under its former name, SMS4, tied to the WLAN Authentication and Privacy Infrastructure (WAPI) standard. Proposed in 2003 for mandatory use in Chinese devices, WAPI, which relies on SMS4 for encryption, was rejected for fast-tracking into IEEE 802.11i due to undisclosed patents held by Chinese firms, requirements for authentication via government-approved servers, and incompatibility with existing infrastructure. These elements were criticized as erecting technical trade barriers, prompting U.S. and international pushback that viewed WAPI as protectionist rather than security-focused.

The algorithm's initial non-disclosure until 2006, mandated by China's Office of State Commercial Cryptography Administration, delayed independent analysis and fostered skepticism about potential weaknesses or backdoors. Even after publication and ISO standardization, inertia favors AES, which benefits from decades of public cryptanalysis, broader patent-free implementations, and dominance in Western-led standards like NIST FIPS 197. Regulatory hurdles in sensitive sectors such as defense further restrict SM4, as governments prioritize ciphers with transparent, non-state-affiliated origins.

Geopolitical and Trust Concerns

SM4, as a cryptographic standard originating from Chinese government-backed institutions, has elicited geopolitical concerns primarily due to its opaque development origins and the broader context of state-controlled cryptography in China. The algorithm was designed by a domestic team under the Commercial Cryptography Administration of China, with initial specifications released in 2006 following a period of classified evaluation, contrasting sharply with the transparent, multi-year open competition that produced AES through NIST's involvement of international cryptographers and public review. This closed process has prompted observations that establishing full trust in SM4's design integrity may require extended independent verification, as the limited early disclosure restricted diverse cryptanalytic scrutiny.

A pivotal illustration of these tensions arose during the 2004–2006 WAPI (WLAN Authentication and Privacy Infrastructure) dispute, where China mandated SM4-based encryption for all WLAN devices sold domestically, sparking a U.S.-China trade conflict. International bodies, including ISO and IEEE, rejected WAPI as a global standard, citing incompatibilities with existing protocols, proprietary control by a restricted Chinese consortium of 11 firms, and insufficient transparency in licensing and algorithm access, which fueled perceptions of protectionism and potential state oversight mechanisms. Chinese proponents countered with allegations of bias in Western standards processes, but the episode highlighted risks of compelled adoption of non-interoperable, nationally mandated cryptography.

While no evidence of algorithmic backdoors in SM4 has emerged from subsequent cryptanalysis, trust deficits persist owing to China's regulatory framework, including provisions under the 2017 Cybersecurity Law enabling demands for decryption keys or data access in commercial systems. This environment discourages widespread international reliance on SM4 beyond niche compliance needs, such as in products targeting the Chinese market, amid escalating U.S.-China tech decoupling and preferences for algorithms with proven, decentralized validation like AES. Geopolitical realities thus position SM4 as a vector for risk, where dependence on state-originated primitives could expose users to undisclosed policy-driven vulnerabilities or supply-chain manipulations.
