Hubbry Logo
SubtypingSubtypingMain
Open search
Subtyping
Community hub
Subtyping
logo
7 pages, 0 posts
0 subscribers
Be the first to start a discussion here.
Be the first to start a discussion here.
Subtyping
Subtyping
from Wikipedia

In programming language theory, subtyping (also called subtype polymorphism or inclusion polymorphism) is a form of type polymorphism. A subtype is a datatype that is related to another datatype (the supertype) by some notion of substitutability, meaning that program elements (typically subroutines or functions), written to operate on elements of the supertype, can also operate on elements of the subtype.

If S is a subtype of T, the subtyping relation (written as S <: T,  ST,[1] or  S ≤: T ) means that any term of type S can safely be used in any context where a term of type T is expected. The precise semantics of subtyping here crucially depends on the particulars of how "safely be used" and "any context" are defined by a given type formalism or programming language. The type system of a programming language essentially defines its own subtyping relation, which may well be trivial, should the language support no (or very little) conversion mechanisms.

Due to the subtyping relation, a term may belong to more than one type. Subtyping is therefore a form of type polymorphism. In object-oriented programming the term 'polymorphism' is commonly used to refer solely to this subtype polymorphism, while the techniques of parametric polymorphism would be considered generic programming.

Functional programming languages often allow the subtyping of records. Consequently, simply typed lambda calculus extended with record types is perhaps the simplest theoretical setting in which a useful notion of subtyping may be defined and studied.[2] Because the resulting calculus allows terms to have more than one type, it is no longer a "simple" type theory. Since functional programming languages, by definition, support function literals, which can also be stored in records, records types with subtyping provide some of the features of object-oriented programming. Typically, functional programming languages also provide some, usually restricted, form of parametric polymorphism. In a theoretical setting, it is desirable to study the interaction of the two features; a common theoretical setting is system F<:. Various calculi that attempt to capture the theoretical properties of object-oriented programming may be derived from system F<:.

The concept of subtyping is related to the linguistic notions of hyponymy and holonymy. It is also related to the concept of bounded quantification in mathematical logic (see Order-sorted logic). Subtyping should not be confused with the notion of (class or object) inheritance from object-oriented languages;[3] subtyping is a relation between types (interfaces in object-oriented parlance) whereas inheritance is a relation between implementations stemming from a language feature that allows new objects to be created from existing ones. In a number of object-oriented languages, subtyping is called interface inheritance, with inheritance referred to as implementation inheritance.

Origins

[edit]

The notion of subtyping in programming languages dates back to the 1960s; it was introduced in Simula derivatives. The first formal treatments of subtyping were given by John C. Reynolds in 1980 who used category theory to formalize implicit conversions, and Luca Cardelli (1985).[4]

The concept of subtyping has gained visibility (and synonymy with polymorphism in some circles) with the mainstream adoption of object-oriented programming. In this context, the principle of safe substitution is often called the Liskov substitution principle, after Barbara Liskov who popularized it in a keynote address at a conference on object-oriented programming in 1987. Because it must consider mutable objects, the ideal notion of subtyping defined by Liskov and Jeannette Wing, called behavioral subtyping is considerably stronger than what can be implemented in a type checker. (See § Function types below for details.)

Examples

[edit]
Example of subtypes: where bird is the supertype and all others are subtypes as denoted by the arrow in UML notation

A simple practical example of subtypes is shown in the diagram. The type "bird" has three subtypes "duck", "cuckoo" and "ostrich". Conceptually, each of these is a variety of the basic type "bird" that inherits many "bird" characteristics but has some specific differences. The UML notation is used in this diagram, with open-headed arrows showing the direction and type of the relationship between the supertype and its subtypes.

As a more practical example, a language might allow integer values to be used wherever floating point values are expected (Integer <: Float), or it might define a generic type Number as a common supertype of integers and the reals. In this second case, we only have Integer <: Number and Float <: Number, but Integer and Float are not subtypes of each other.

Programmers may take advantage of subtyping to write code in a more abstract manner than would be possible without it. Consider the following example:

function max (x as Number, y as Number) is
    if x < y then
        return y
    else
        return x
end

If integer and real are both subtypes of Number, and an operator of comparison with an arbitrary Number is defined for both types, then values of either type can be passed to this function. However, the very possibility of implementing such an operator highly constrains the Number type (for example, one can't compare an integer with a complex number), and actually only comparing integers with integers, and reals with reals, makes sense. Rewriting this function so that it would only accept 'x' and 'y' of the same type requires bounded polymorphism.

Subtyping enables a given type to be substituted for another type or abstraction. Subtyping is said to establish an is-a relationship between the subtype and some existing abstraction, either implicitly or explicitly, depending on language support. The relationship can be expressed explicitly via inheritance in languages that support inheritance as a subtyping mechanism.

C++

[edit]

The following C++ code establishes an explicit inheritance relationship between classes B and A, where B is both a subclass and a subtype of A, and can be used as an A wherever a B is specified (via a reference, a pointer or the object itself).

class A {
public:
    void methodOfA() const {
        // ...
    }
};

class B : public A { 
public:
    void methodOfB() const {
        // ...
    }
};

void functionOnA(const A& a) {
    a.methodOfA();
}

int main() {
   B b;
   functionOnA(b); // b can be substituted for an A.
}

[5]

Python

[edit]

The following python code establishes an explicit inheritance relationship between classes B and A, where B is both a subclass and a subtype of A, and can be used as an A wherever a B is required.

class A:
    def method_of_a(self) -> None:
        pass

class B(A):
    def method_of_b(self) -> None:
        pass

def function_on_a(a: A) -> None:
    a.method_of_a()

if __name__ == "__main__":
    b: B = B()
    function_on_a(b) # b can be substituted for an A.

The following example, type(a) is a "regular" type, and type(type(a)) is a metatype. While as distributed all types have the same metatype (PyType_Type, which is also its own metatype), this is not a requirement. The type of classic classes, known as types.ClassType, can also be considered a distinct metatype.[6]

a = 0
print(type(a))
# prints: <type 'int'>
print(type(type(a)))
# prints: <type 'type'>
print(type(type(type(a))))
# prints: <type 'type'>
print(type(type(type(type(a)))))
# prints: <type 'type'>

Java

[edit]

In Java, is-a relation between the type parameters of one class or interface and the type parameters of another are determined by the extends and implements clauses.

Using the Collections classes, ArrayList<E> implements List<E>, and List<E> extends Collection<E>. So ArrayList<String> is a subtype of List<String>, which is a subtype of Collection<String>. The subtyping relationship is preserved between the types automatically. When defining an interface, PayloadList, that associates an optional value of generic type P with each element, its declaration might look like:

interface PayloadList<E, P> extends List<E> {
    void setPayload(int index, P val);
    ...
}

The following parameterizations of PayloadList are subtypes of List<String>:

PayloadList<String, String>
PayloadList<String, Integer>
PayloadList<String, Exception>

Subsumption

[edit]

In type theory the concept of subsumption[7] is used to define or evaluate whether a type S is a subtype of type T.

A type is a set of values. The set can be described extensionally by listing all the values, or it can be described intensionally by stating the membership of the set by a predicate over a domain of possible values. In common programming languages enumeration types are defined extensionally by listing values. User-defined types like records (structs, interfaces) or classes are defined intensionally by an explicit type declaration or by using an existing value, which encodes type information, as a prototype to be copied or extended.

In discussing the concept of subsumption, the set of values of a type is indicated by writing its name in mathematical italics: T. The type, viewed as a predicate over a domain, is indicated by writing its name in bold: T. The conventional symbol <: means "is a subtype of", and :> means "is a supertype of".[citation needed]

  • A type T subsumes S if the set of values T which it defines, is a superset of the set S, so that every member of S is also a member of T.
  • A type may be subsumed by more than one type: the supertypes of S intersect at S.
  • If S <: T (and therefore ST ), then T, the predicate which circumscribes the set T, must be part of the predicate S (over the same domain) which defines S.
  • If S subsumes T, and T subsumes S, then the two types are equal (although they may not be the same type if the type system distinguishes types by name).

In terms of information specificity, a subtype is considered more specific than any one of its supertypes, because it holds at least as much information as each of them. This may increase the applicability, or relevance of the subtype (the number of situations where it can be accepted or introduced), as compared to its "more general" supertypes. The disadvantage of having this more detailed information is that it represents incorporated choices which reduce the prevalence of the subtype (the number of situations which are able to generate or produce it).

In the context of subsumption, the type definitions can be expressed using Set-builder notation, which uses a predicate to define a set. Predicates can be defined over a domain (set of possible values) D. Predicates are partial functions that compare values to selection criteria. For example: "is an integer value greater than or equal to 100 and less than 200?". If a value matches the criteria then the function returns the value. If not, the value is not selected, and nothing is returned. (List comprehensions are a form of this pattern used in many programming languages.)

If there are two predicates, which applies selection criteria for the type T, and which applies additional criteria for the type S, then sets for the two types can be defined:

The predicate is applied alongside as part of the compound predicate S defining S. The two predicates are conjoined, so both must be true for a value to be selected. The predicate subsumes the predicate T, so S <: T.

For example: there is a subfamily of cat species called Felinae, which is part of the family Felidae. The genus Felis, to which the domestic cat species Felis catus belongs, is part of that subfamily.

The conjunction of predicates has been expressed here through application of the second predicate over the domain of values conforming to the first predicate. Viewed as types, Felis <: Felinae <: Felidae.

If T subsumes S (T :> S) then a procedure, function or expression given a value as an operand (parameter value or term) will therefore be able to operate over that value as one of type T, because . In the example above, we could expect the function ofSubfamily to be applicable to values of all three types Felidae, Felinae and Felis.

Subtyping schemes

[edit]

Type theorists make a distinction between nominal subtyping, in which only types declared in a certain way may be subtypes of each other, and structural subtyping, in which the structure of two types determines whether or not one is a subtype of the other. The class-based object-oriented subtyping described above is nominal; a structural subtyping rule for an object-oriented language might say that if objects of type A can handle all of the messages that objects of type B can handle (that is, if they define all the same methods), then A is a subtype of B regardless of whether either inherits from the other. This so-called duck typing is common in dynamically typed object-oriented languages. Sound structural subtyping rules for types other than object types are also well known.[citation needed]

Implementations of programming languages with subtyping fall into two general classes: inclusive implementations, in which the representation of any value of type A also represents the same value at type B if A <: B, and coercive implementations, in which a value of type A can be automatically converted into one of type B. The subtyping induced by subclassing in an object-oriented language is usually inclusive; subtyping relations that relate integers and floating-point numbers, which are represented differently, are usually coercive.

In almost all type systems that define a subtyping relation, it is reflexive (meaning A <: A for any type A) and transitive (meaning that if A <: B and B <: C then A <: C). This makes it a preorder on types.

Record types

[edit]

Width and depth subtyping

[edit]

Types of records give rise to the concepts of width and depth subtyping. These express two different ways of obtaining a new type of record that allows the same operations as the original record type.

Recall that a record is a collection of (named) fields. Since a subtype is a type which allows all operations allowed on the original type, a record subtype should support the same operations on the fields as the original type supported.

One kind of way to achieve such support, called width subtyping, adds more fields to the record. More formally, every (named) field appearing in the width supertype will appear in the width subtype. Thus, any operation feasible on the supertype will be supported by the subtype.

The second method, called depth subtyping, replaces the various fields with their subtypes. That is, the fields of the subtype are subtypes of the fields of the supertype. Since any operation supported for a field in the supertype is supported for its subtype, any operation feasible on the record supertype is supported by the record subtype. Depth subtyping only makes sense for immutable records: for example, you can assign 1.5 to the 'x' field of a real point (a record with two real fields), but you can't do the same to the 'x' field of an integer point (which, however, is a deep subtype of the real point type) because 1.5 is not an integer (see Variance).

Subtyping of records can be defined in System F<:, which combines parametric polymorphism with subtyping of record types and is a theoretical basis for many functional programming languages that support both features.

Some systems also support subtyping of labeled disjoint union types (such as algebraic data types). The rule for width subtyping is reversed: every tag appearing in the width subtype must appear in the width supertype.

Function types

[edit]

If T1T2 is a function type, then a subtype of it is any function type S1S2 with the property that T1 <: S1 and S2 <: T2. This can be summarised using the following typing rule:

The parameter type of S1S2 is said to be contravariant because the subtyping relation is reversed for it, whereas the return type is covariant. Informally, this reversal occurs because the refined type is "more liberal" in the types it accepts and "more conservative" in the type it returns. This is what exactly works in Scala: a n-ary function is internally a class that inherits the trait (which can be seen as a general interface in Java-like languages), where are the parameter types, and is its return type; "−" before the type means the type is contravariant while "+" means covariant.

In languages that allow side effects, like most object-oriented languages, subtyping is generally not sufficient to guarantee that a function can be safely used in the context of another. Liskov's work in this area focused on behavioral subtyping, which besides the type system safety discussed in this article also requires that subtypes preserve all invariants guaranteed by the supertypes in some contract.[8] This definition of subtyping is generally undecidable, so it cannot be verified by a type checker.

The subtyping of mutable references is similar to the treatment of parameter values and return values. Write-only references (or sinks) are contravariant, like parameter values; read-only references (or sources) are covariant, like return values. Mutable references which act as both sources and sinks are invariant.

Relationship with inheritance

[edit]

Subtyping and inheritance are independent (orthogonal) relationships. They may coincide, but none is a special case of the other. In other words, between two types S and T, all combinations of subtyping and inheritance are possible:

  1. S is neither a subtype nor a derived type of T
  2. S is a subtype but is not a derived type of T
  3. S is not a subtype but is a derived type of T
  4. S is both a subtype and a derived type of T

The first case is illustrated by independent types, such as Boolean and Float.

The second case can be illustrated by the relationship between Int32 and Int64. In most object oriented programming languages, Int64 are unrelated by inheritance to Int32. However Int32 can be considered a subtype of Int64 since any 32 bit integer value can be promoted into a 64 bit integer value.

The third case is a consequence of function subtyping input contravariance. Assume a super class of type T having a method m returning an object of the same type (i.e. the type of m is TT, also note that the first parameter of m is this/self) and a derived class type S from T. By inheritance, the type of m in S is SS.[citation needed] In order for S to be a subtype of T the type of m in S must be a subtype of the type of m in T [citation needed], in other words: SS ≤: TT. By bottom-up application of the function subtyping rule, this means: S ≤: T and T ≤: S, which is only possible if S and T are the same. Since inheritance is an irreflexive relation, S can't be a subtype of T.

Subtyping and inheritance are compatible when all inherited fields and methods of the derived type have types which are subtypes of the corresponding fields and methods from the inherited type .[3]

Coercions

[edit]

In coercive subtyping systems, subtypes are defined by explicit type conversion functions from subtype to supertype. For each subtyping relationship (S <: T), a coercion function coerce: ST is provided, and any object s of type S is regarded as the object coerceST(s) of type T. A coercion function may be defined by composition: if S <: T and T <: U then s may be regarded as an object of type u under the compound coercion (coerceTUcoerceST). The type coercion from a type to itself coerceTT is the identity function idT.

Coercion functions for records and disjoint union subtypes may be defined componentwise; in the case of width-extended records, type coercion simply discards any components which are not defined in the supertype. The type coercion for function types may be given by f'(t) = coerceS2T2(f(coerceT1S1(t))), reflecting the contravariance of parameter values and covariance of return values.

The coercion function is uniquely determined given the subtype and supertype. Thus, when multiple subtyping relationships are defined, one must be careful to guarantee that all type coercions are coherent. For instance, if an integer such as 2 : int can be coerced to a floating point number (say, 2.0 : float), then it is not admissible to coerce 2.1 : float to 2 : int, because the compound coercion coercefloatfloat given by coerceintfloatcoercefloatint would then be distinct from the identity coercion idfloat.

See also

[edit]

Notes

[edit]

References

[edit]

Further reading

[edit]
Revisions and contributorsEdit on WikipediaRead on Wikipedia
from Grokipedia
Subtyping is a fundamental concept in and type systems, enabling a type SS to be treated as compatible with another type TT (its supertype) if SS is a subtype of TT, thereby allowing values of subtype SS to be safely used in contexts expecting TT without violating program correctness. This relation, often denoted S<:TS <: T, supports polymorphism by promoting code reuse and flexibility, particularly in object-oriented and functional programming paradigms. Originating in the Simula-67 language developed by Ole-Johan Dahl and Kristen Nygaard in 1967—inspired by Tony Hoare's ideas on records—subtyping marked the birth of key object-oriented features like inheritance and dynamic dispatch. In type theory, subtyping is formally defined through rules that ensure the subtype relation is reflexive (T<:TT <: T), transitive (if S<:US <: U and U<:TU <: T, then S<:TS <: T), and respects type constructors, such as products and functions. For product types, subtyping is covariant: S1×S2<:T1×T2S_1 \times S_2 <: T_1 \times T_2 if S1<:T1S_1 <: T_1 and S2<:T2S_2 <: T_2. For function types, it exhibits contravariance in the argument and covariance in the return type: S1S2<:T1T2S_1 \to S_2 <: T_1 \to T_2 if T1<:S1T_1 <: S_1 and S2<:T2S_2 <: T_2, preventing unsafe substitutions that could lead to runtime errors. These rules underpin the subsumption principle, where a term typed as SS can be retyped as TT if S<:TS <: T, preserving the progress and preservation properties of typed lambda calculi. Subtyping plays a central role in object-oriented programming, where it facilitates subtype polymorphism by allowing subclasses to override or extend superclass behaviors while maintaining substitutability. The Liskov Substitution Principle, articulated by Barbara Liskov, formalizes this by requiring that objects of a subtype must be substitutable for supertype objects without altering the program's desirable properties. In record types, common in structural typing systems, subtyping often follows width subtyping (where the subtype includes all fields of the supertype plus possibly additional fields) and depth subtyping (allowing more specific field types), enabling flexible data handling as seen in languages like OCaml or Scala. Theoretical advancements, including Luca Cardelli's work on record calculi and bounded quantification, have extended subtyping to more expressive type systems, influencing modern languages and libraries.

Fundamentals

Definition and Principles

In type theory and programming languages, subtyping is a relation between types where a type SS is considered a subtype of type TT, denoted S<:TS <: T, if a value of type SS can be safely used in any context that expects a value of type TT, thereby preserving the program's type safety. This relation ensures that substitutions do not introduce errors, allowing for implicit type compatibility without explicit conversions. The subtyping relation <<: isreflexive( is reflexive (S <: Sforanytypefor any typeS),transitive(if), transitive (if S <: TandandT <: U,then, then S <: U),andtypicallyantisymmetric(if), and typically antisymmetric (if S <: TandandT <: S,then, then S = T$), forming a partial order on the set of types. These properties enable a hierarchical structure for types, supporting systematic reasoning about compatibility. A foundational principle of subtyping is the preservation of type safety, encapsulated by the Liskov Substitution Principle (LSP), which states that if SS is a subtype of TT, then objects of type SS must be substitutable for objects of type TT such that any property provable about objects of TT remains true for objects of SS. This includes maintaining behavioral specifications, such as preconditions not strengthened, postconditions not weakened, invariants preserved, and history constraints satisfied across method calls and state transitions. The LSP ensures that subtyping does not violate program correctness, even in the presence of mutable state or concurrency. Subtyping motivates the design of flexible type systems by enabling polymorphism—where a single interface can work with multiple types—and promoting code reuse through hierarchical type families, all while preventing runtime type errors at compile time. This facilitates modular and extensible software without compromising safety.

Subsumption

Subsumption refers to the type-theoretic mechanism that allows a value of a subtype SS to be implicitly treated as an instance of its supertype TT during type checking, provided STS \leq T. This process enables the automatic conversion, or upcasting, of subtype values to supertype contexts without requiring explicit intervention from the programmer. Formally, subsumption is captured by the typing rule: Γa:AABΓa:B\frac{\Gamma \vdash a : A \quad A \leq B}{\Gamma \vdash a : B} where Γ\Gamma is the typing context, aa is a term, and the rule permits reassigning the type of aa from AA to any supertype BB. This rule relies on the subtyping relation being reflexive (AAA \leq A) and transitive (ABA \leq B and BCB \leq C imply ACA \leq C). In static type checking, subsumption facilitates the use of subtype values in positions expecting the supertype, promoting polymorphism and code reuse. For example, consider a type hierarchy where natural numbers (Nat\text{Nat}) are a subtype of integers (Int\text{Int}), i.e., NatInt\text{Nat} \leq \text{Int}. An expression computing a value of type Nat\text{Nat}, such as the successor of zero, can be directly assigned to a variable of type Int\text{Int} via subsumption, without altering the program's behavior. Similarly, in a hierarchy of physical objects, if BookPhy\text{Book} \leq \text{Phy}, a book value can be used in any context requiring a physical object. The safety of such substitutions is guaranteed by the Liskov Substitution Principle, which ensures that replacing supertype objects with subtype objects preserves program correctness. Subsumption differs from explicit casts, which involve programmer-specified type conversions that may introduce runtime checks or errors if invalid. In contrast, subsumption is purely implicit and resolved at compile time in static systems, avoiding such overhead but relying on the declared subtyping relation. However, this implicit nature can lead to issues, such as the loss of subtype-specific information; once subsumed to the supertype, access to methods or fields unique to the subtype is no longer available without an explicit downcast, potentially complicating program logic. Additionally, in systems emphasizing canonical forms, subsumption can violate uniqueness of typing, as a term may admit multiple types, undermining certain proof-theoretic properties like canonicity.

Historical Context

Origins

The concept of subtyping originated in the 1960s with the development of the Simula-67 programming language by and at the Norwegian Computing Center. Inspired by 's ideas on records, Simula-67 introduced class-based inheritance and dynamic dispatch, enabling subtype polymorphism as a core feature for simulation and object-oriented programming. This marked the first practical implementation of subtyping, allowing derived classes to be substituted for base classes while preserving program behavior. Subsequent theoretical roots emerged in the 1970s through work on parametric polymorphism and type abstraction, particularly John C. Reynolds' introduction of the polymorphic lambda calculus (System F) in 1974, which laid foundational ideas for relating types hierarchically and emphasizing intrinsic semantics independent of external interpretations. Reynolds further formalized subtyping in 1980 using category theory to model implicit conversions and generic operators. His explorations enabled the conceptualization of type relations that built upon early practical uses. Subtyping drew significant influence from mathematical foundations in set theory and domain theory, where it parallels subset relations and partial orders on domains. In set theory, subtypes correspond to subsets, preserving operations and inclusions in a way that ensures safe substitutions. Domain theory, developed by Dana Scott in the early 1970s, provided a framework for modeling computable functions and data as complete partial orders, influencing subtyping by formalizing type inclusions as monotonic embeddings between domains. Early motivations for subtyping arose from the limitations of rigid typing systems in languages like Pascal, which enforced strict type distinctions without mechanisms for flexible reuse or extension, hindering modular program design. Pascal's strong but inflexible typing, designed for teaching structured programming in the 1970s, often required explicit conversions or redundant code for related data types, prompting researchers to seek more expressive type relations. A key milestone in the 1980s involved the formalization of subtyping in programming language systems such as ML and Ada, which integrated polymorphic features with type hierarchies for practical use. ML's type system, evolving from Robin Milner's work in the 1970s, incorporated parametric polymorphism that provided a basis for later type relations, including those involving subtyping in extended variants. Ada, released in 1983, introduced generic packages and type derivations that enabled subtyping for safer interoperability in large-scale software. This era's advancements were crystallized in the 1985 paper by Luca Cardelli and Peter Wegner, which provided a unified model of types, data abstraction, and polymorphism, explicitly framing subtyping as a core mechanism for type-safe extensibility in λ-calculus-based systems.

Evolution in Programming Languages

In the 1990s, subtyping concepts were prominently integrated into object-oriented programming languages, with a strong emphasis on nominal subtyping to enforce explicit type hierarchies and support inheritance-based polymorphism. Java, released in 1995 by Sun Microsystems, adopted nominal subtyping as a core feature of its type system, where subtype relationships are determined by explicit declarations rather than structural compatibility, enabling safe substitution in object-oriented designs. Similarly, C#, introduced by Microsoft in 2000 as part of the .NET Framework, built on this approach, using nominal subtyping to manage class and interface hierarchies while incorporating variance annotations for generics in later versions. These languages popularized subtyping in mainstream enterprise software, prioritizing compile-time checks for type safety in large-scale applications. The 2000s marked a shift toward dynamic languages that embraced more implicit forms of subtyping through duck typing, allowing behavioral compatibility without formal declarations. Python, evolving from its 1991 origins, exemplified this trend by relying on duck typing—where objects are treated as subtypes based on their ability to respond to method calls rather than explicit inheritance—fostering flexible, runtime-resolved polymorphism in scripting and web development. This implicit subtyping behavior aligned with Python's philosophy of practicality, enabling rapid prototyping while avoiding the rigidity of nominal systems, and influenced the growth of dynamic languages in data science and automation. Recent developments up to 2025 have introduced hybrid and more flexible subtyping mechanisms to address modern challenges in systems programming and safety. Rust, stable since its 1.0 release in 2015, employs traits to achieve structural-like subtyping for generic code, allowing types to satisfy interfaces based on implemented behaviors while maintaining nominal conformance for ownership and borrowing rules. Swift, launched in 2014 by Apple, advanced protocol-oriented programming, where protocols enable subtyping through explicit conformance but support structural composition for value types, enhancing modularity in iOS and macOS ecosystems. Additionally, dependently typed languages like Idris have influenced subtyping by embedding value-dependent constraints into types, enabling finer-grained proofs of subtype relations in functional contexts since its early versions around 2011. Overall trends reflect a move toward more flexible subtyping to accommodate concurrency and functional paradigms, with languages incorporating variance and effect systems to ensure thread safety without sacrificing expressiveness. For instance, subtyping in concurrent settings now often includes linear types or session types to track resource usage, as seen in extensions to and dependent systems, promoting safer parallelism in multicore environments. This evolution prioritizes composability in functional designs, reducing boilerplate while preserving guarantees against race conditions.

Subtyping Paradigms

Structural Subtyping

Structural subtyping determines type compatibility based on the internal structure of types rather than their names or declarations. A type SS is a subtype of type TT (denoted S<:TS <: T) if the components of SS align with or extend those of TT in a way that ensures values of SS can safely substitute for values of TT. This paradigm applies orthogonally to various type constructors, such as records, where compatibility arises from matching fields and their types. For record types, structural subtyping incorporates width and depth rules to establish the subtype relation. Width subtyping permits a record with additional fields to serve as a subtype of a record possessing only a subset of those fields, as the extra fields can be ignored in contexts expecting the supertype. Depth subtyping allows subtyping when records share the same fields but the field types in the subtype are themselves subtypes of the corresponding types in the supertype, enabling more precise typing without violating compatibility. These rules combine such that a record type RS={li:τiiI}R_S = \{ l_i : \tau_i \mid i \in I \} is a subtype of RT={lj:σjjJ}R_T = \{ l_j : \sigma_j \mid j \in J \} if JIJ \subseteq I (width condition) and jJ, τj<:σj\forall j \in J, \ \tau_j <: \sigma_j (depth condition). The formal inference rules for structural subtyping are typically presented in a type theory framework using judgments for typing and subtyping. For instance, the subtyping judgment ES<:TE \vdash S <: T holds under an environment EE if the structural conditions are met, with base cases for atomic types (e.g., reflexivity: EA<:AE \vdash A <: A) and inductive rules for constructors like records as described above. Transitivity ensures that if S<:TS <: T and T<:UT <: U, then S<:US <: U. Structural subtyping offers advantages in flexibility, particularly for anonymous or ad-hoc types where compatibility is inferred from usage patterns rather than predefined hierarchies, reducing the need for explicit type annotations. This approach is prevalent in functional languages, where it facilitates reusable, polymorphic functions over structurally compatible types without nominal constraints.

Nominal Subtyping

Nominal subtyping is a typing discipline in which the compatibility and subtype relationships between types are determined by their explicitly declared names and hierarchies, rather than by the internal structure of the types themselves. This approach requires programmers to define subtype-supertype links through direct declarations, typically within class or interface systems, ensuring that subtype relationships are only established when such naming conventions are explicitly specified. For example, a declaration such as class ColorPoint extends Point explicitly establishes ColorPoint as a subtype of Point, allowing values of type ColorPoint to be used wherever Point is expected, based solely on the named inheritance relation. The rules governing nominal subtyping mandate these explicit links, often using keywords like extends or implements, which form the basis for subtype checking and enforce a controlled hierarchy of types. This declaration-based mechanism contrasts with approaches that infer compatibility implicitly, providing a clear delineation of intended type relationships. A primary advantage of nominal subtyping lies in its promotion of clarity and control, particularly in large-scale codebases, where explicit declarations act as verifiable documentation of design intent and facilitate comprehensible error messages by referencing programmer-defined type names. By requiring overt specification of subtype relations, it prevents unintended subtyping that might occur due to coincidental similarities, thereby improving modularity, stability, and the ability to assign blame in component-based systems. Formally, nominal subtyping constructs a partial order on types derived from these declarations, often represented as a directed acyclic graph (DAG) with types as nodes and inheritance edges connecting subtypes to supertypes, ensuring the relation is reflexive, transitive, and antisymmetric. To maintain decidability and avoid cycles that could lead to infinite recursion in subtyping judgments, systems impose acyclicity constraints, such as bounding type heights or halting checks upon detecting revisited types during traversal. This structure supports variance in generic types while preserving the explicit nature of the subtype lattice.

Data Structures

Record Types

Record types, also known as labeled product types, are composite data structures in type theory that consist of a finite set of labeled fields, each associated with a specific type. For example, a record type might be defined as {name: String, age: Int}, where name holds a string value and age holds an integer value. These types are fundamental in languages that support structured data, enabling the representation of object-like entities with named components. In subtyping relations for record types, a record type R1R_1 is a subtype of another record type R2R_2 (denoted R1<:R2R_1 <: R_2) if R1R_1 includes all the fields of R2R_2 with types that are themselves subtypes or equal to those in R2R_2. This rule ensures that values of the subtype can be safely used wherever the supertype is expected, as the subtype provides at least the required fields with compatible types. Such subtyping is typically structural, relying on the composition of fields rather than explicit declarations. Record subtyping finds practical application in languages that model object-like data through structural typing, such as , where object types behave as records and allow subtypes to supply additional properties beyond those specified in an interface. For instance, an object with fields {name: string, age: number} can be assigned to a variable expecting {name: string}, as the extra field does not violate compatibility. In , while plain records are nominally typed without direct subtyping, object types leverage row polymorphism to achieve similar extensibility for record-like structures. Challenges in record subtyping arise particularly with field mutability and presence. If fields are mutable, subtyping must often be restricted to invariance to prevent unsafe modifications, such as assigning a value to an extra field in a subtype that alters behavior unexpected by the supertype. Field presence introduces issues when subtypes include optional or additional fields, potentially leading to runtime errors if code assumes a fixed structure, necessitating careful handling in type systems to maintain safety. Width and depth variations extend these basic rules for records by allowing subtypes to broaden or deepen field sets, respectively.

Width and Depth Subtyping

Width subtyping for record types permits a record with additional fields to be considered a subtype of a record with a subset of those fields, provided the common fields match exactly in name and type. This rule ensures that any operation valid on the supertype—such as accessing the shared fields—remains valid on the subtype, as the extra fields in the subtype can simply be ignored. For example, the record type {x: Int, y: Int} is a subtype of {x: Int}, allowing a value with both fields to be used wherever a value with only x is expected. Depth subtyping extends this by allowing subtyping relations within the types of corresponding fields, maintaining structural compatibility at nested levels. In this case, the field labels must match, but the type of each field in the subtype can be a subtype of the corresponding type in the supertype. For instance, if Int is a subtype of Num, then {pt: {x: Int}} is a subtype of {pt: {x: Num}}, as the nested structure preserves the required operations through covariance in the field types. The combined formal rule for record subtyping integrates width and depth aspects: under typing context Γ\Gamma, a record type {li:TiiI}\{ l_i : T_i \mid i \in I \} is a subtype of {li:SiiJ}\{ l_i : S_i \mid i \in J \} if JIJ \subseteq I (width condition, allowing extra fields in the subtype) and for all iJi \in J, Ti<:SiT_i <: S_i (depth condition, requiring covariant subtyping in common fields). This rule, covariant in the depth dimension, ensures safe substitution while permitting structural flexibility. A key limitation of width subtyping arises when dealing with optional fields, as the basic rule assumes all specified fields in the supertype are present and required in the subtype's common structure; handling truly optional or absent fields requires extensions like to avoid type errors from unexpected field presence or absence.

Function and Higher-Order Types

Function Subtyping

Function subtyping governs the compatibility between function types in typed languages, ensuring that a function of one type can safely substitute for a function of another type while preserving type safety. For function types of the form τσ\tau \to \sigma and τσ\tau' \to \sigma', the subtyping relation (τσ)<:(τσ)(\tau \to \sigma) <: (\tau' \to \sigma') holds if τ<:τ\tau' <: \tau (contravariance in the argument type τ\tau) and σ<:σ\sigma <: \sigma' (covariance in the return type σ\sigma). This rule, known as the S-Arrow rule, is formally stated as: if T1<:S1T_1 <: S_1 and S2<:T2S_2 <: T_2, then S1S2<:T1T2S_1 \to S_2 <: T_1 \to T_2. The contravariance in argument types ensures that a subtype function can accept a wider range of inputs than the supertype requires, allowing safe passage of arguments that may be subtypes of the expected parameter type. Covariance in return types guarantees that the subtype function's output, being a subtype of the expected return, can be used in any context anticipating the supertype's result. Together, these variance annotations prevent type errors during function application and higher-order function passing, such as when functions are treated as first-class values in lambda calculi. Consider a type hierarchy where Horse <:<: Animal. The function type Animal \to Horse is then a subtype of Animal \to Animal, as the return type satisfies Horse <:<: **Animal) under covariance. Similarly, Animal \to Animal <:<: Horse \to **Animal), since the argument types satisfy Horse <:<: **Animal) under contravariance, allowing the wider-accepting function to substitute for the narrower one. In typed lambda calculi, this enables subtyping for anonymous functions, such as λx:\lambda x: Animal. x (of type Animal \to **Animal)) being usable where a Horse \to **Animal) is expected. The typing rule for function application under subtyping accommodates these variances: Γe1:STΓe2:UU<:SΓe1e2:T(T-APP)\frac{\Gamma \vdash e_1 : S \to T \quad \Gamma \vdash e_2 : U \quad U <: S}{\Gamma \vdash e_1 \, e_2 : T} \quad (\text{T-APP}) This rule permits the argument e2e_2 to have a subtype UU of the parameter type SS, ensuring the application remains well-typed while leveraging subtyping flexibility.

Subtyping for Other Higher-Order Types

Subtyping for generic types extends the principles of parametric polymorphism by incorporating type parameters that respect subtype relations, enabling more flexible reuse of code across hierarchies. In systems with bounded quantification, a type constructor like α<:T.τ\forall \alpha <: T. \tau allows the parameter α\alpha to range over subtypes of TT, with the body τ\tau potentially depending on α\alpha. A key rule ensures that α<:T.αα<:α.αα\forall \alpha <: T. \alpha \to \alpha <: \forall \alpha. \alpha \to \alpha, permitting a more constrained universal quantifier to subtype an unconstrained one, as the former can be safely used wherever the latter is expected. This formulation, central to F-bounded polymorphism, supports self-referential bounds such as α<:Comparable<α>.α\forall \alpha <: Comparable<\alpha>. \alpha, which is crucial for typing methods that operate on subtypes of a class, as seen in languages like where generic classes declare variance at the site of definition to control or contravariance of parameters. For instance, a generic list List<T>List<T> subtypes Collection<S>Collection<S> if T<:ST <: S under covariant declaration-site variance, avoiding the pitfalls of use-site variance that can lead to undecidability in subtyping checks. Intersection types introduce a mechanism for combining multiple type behaviors into a single subtype, where TU<:TT \cap U <: T and TU<:UT \cap U <: U, allowing a value to inherit properties from both supertypes simultaneously. This enables precise typing of functions that must satisfy multiple contracts, such as a method that is both iterable and serializable, without resorting to nominal multiple inheritance. Union types complement this by broadening applicability, though their subtyping is more subtle: T<:TUT <: T \cup U and U<:TUU <: T \cup U, facilitating type widening for flexible polymorphism. Integrated frameworks ensure these operations compose intuitively with other type constructors, preserving decidability through restrictions like avoiding recursive intersections. In practice, intersection types support bounded polymorphism by refining universal quantifiers, as in α<:TU.τ\forall \alpha <: T \cap U. \tau, which constrains parameters to satisfy multiple bounds. Advanced applications of subtyping appear in dependent types, where subtypes can depend on values or terms, complicating but enriching expressiveness; for example, subtyping rules must ensure that dependent refinements like {x:Nx>0}<:N\{x : \mathbb{N} \mid x > 0\} <: \mathbb{N} preserve logical implications in the indices. Decidability is achieved in restricted systems by algorithmic subtyping that handles path dependencies without full higher-order unification. Similarly, for computational effects, subtyping on effect annotations allows refinement, such as an IO effect with read-only permissions subtyping general IO, enabling modular reasoning about side effects in monadic structures like Haskell's IO monad. Effect systems with coercive subtyping reconstruct precise effect information without merging, supporting subtype polymorphism for effects like state or exceptions. These extensions maintain type safety by ensuring that subtype effects imply supertype behaviors, facilitating optimization and verification in effectful programs.

Language Implementations

C++

In C++, subtyping is primarily nominal and implemented through class inheritance, where the subtype relationship is explicitly declared rather than inferred from type structure. Public inheritance establishes an "is-a" relationship, allowing a derived class to be used in contexts expecting the base class. For instance, declaring class Dog : public Animal {} means that Dog is a subtype of Animal, enabling substitution while preserving binary compatibility through shared memory layouts and virtual function tables (vtables). This approach aligns with nominal typing, as the subtyping depends on the named inheritance declaration, not the structural compatibility of members. Virtual functions enable runtime polymorphism, supporting dynamic dispatch that leverages subtyping for flexible behavior. When a function is declared virtual in the base class, derived classes can override it, and calls through base pointers or references resolve to the derived implementation at runtime via the vtable. For example, in class Animal { public: virtual void speak() { /* base */ } }; class Dog : public Animal { public: void speak() override { /* bark */ } };, invoking Animal* p = new Dog(); p->speak(); executes the Dog version. This mechanism ensures subtype polymorphism, allowing heterogeneous collections of subtypes to be processed uniformly without knowing exact types at compile time. Templates introduce elements of structural typing through techniques like Substitution Failure Is Not An Error (SFINAE), which enables compile-time selection based on type traits without explicit nominal declarations. SFINAE occurs when template argument substitution fails silently, allowing overload resolution to choose alternatives that match a type's structure, such as checking for member functions or types. For example, a template can detect if a type has an iterator member via template <typename T> auto has_iterator(...) -> std::true_type;, using SFINAE to enable/disable specializations. However, this remains auxiliary to C++'s nominal core, as templates do not create subtype hierarchies but rather enable with structural constraints. C++ lacks built-in support for structural subtyping, requiring explicit for any subtype relationship, which contrasts with languages offering implicit compatibility. , while supported, introduces complexities such as the problem, where ambiguous base access arises from shared ancestry (e.g., two bases deriving from a common grandbase). This is mitigated by , as in class D : virtual public B1, virtual public B2 {}, but it can lead to increased overhead and fragile hierarchies. These limitations emphasize C++'s reliance on careful design to avoid runtime errors or in subtype scenarios.

Java

Java employs a nominal subtyping system, where subtype relationships between types are established explicitly through declarations in class and interface definitions, rather than by comparing their structural properties. This approach ensures that subtyping is predictable and tied to the programmer's intent, as defined in the . In , class hierarchies are formed using the extends keyword, which allows a class to inherit from a single superclass, establishing a subtype relationship. For example, if class Cat extends Animal {}, then Cat is a subtype of Animal, enabling automatic upcasting where a Cat object can be assigned to an Animal reference without explicit . This single-inheritance model for classes prevents ambiguities but limits direct inheritance to one path, with Object serving as the ultimate superclass for all classes. Interface hierarchies, declared with the implements keyword, allow a class to implement multiple interfaces, creating subtype relationships with each. For instance, class Cat implements [Pet](/page/Pet), Playable {} makes Cat a subtype of both Pet and Playable, supporting mixin-like composition for behaviors without full implementation inheritance. Interfaces themselves can extend multiple other interfaces, further enriching the type hierarchy. Subtyping rules in facilitate safe polymorphism through automatic upcasting along the hierarchy, but generics introduce additional constraints to maintain . A parameterized type like List<Integer> is not a subtype of List<Number> even if Integer extends Number, to prevent operations that could violate type invariants. Wildcards address this by enabling flexible subtyping; for example, List<? extends Animal> accepts List<Cat> or List<Dog>, allowing read-only access to elements as Animal while prohibiting additions. Bounded type parameters further refine subtyping in generics for safer usage. Upper bounds, specified as <T extends Number>, restrict T to Number or its subtypes, enabling methods to invoke common operations like doubleValue() on the parameter. Lower bounds, using super wildcards like List<? super Integer>, permit writing to the list with Integer or narrower types but limit reading to Object. These bounds ensure covariance and contravariance align with subtyping directions, preventing runtime errors. A key feature of Java's subtyping is the prohibition of multiple class inheritance to avoid the problem and method resolution ambiguities, while multiple interface implementation provides a form of behavioral subtyping without implementation conflicts, as interfaces declare abstract methods without bodies until Java 8. This design balances expressiveness with simplicity in the .

Python

Python's approach to subtyping is predominantly dynamic and relies on , a concept where the suitability of an object for a particular context is determined by the presence of certain methods and properties rather than explicit type declarations. This implicit subtyping allows any object that supports the required interface—such as iteration in a loop—to be treated as a subtype of a sequence without formal . For example, a custom class implementing __iter__ can be used interchangeably with built-in lists or tuples in iterable contexts, embodying the principle "if it walks like a and quacks like a duck, then it is a duck." This structural compatibility promotes flexibility in Python's runtime behavior, enabling polymorphic usage without rigid class hierarchies. For explicit nominal subtyping, Python provides built-in functions like isinstance() and issubclass() to check class-based relationships. isinstance(obj, Class) returns True if obj is an instance of Class or any of its subclasses, supporting inheritance hierarchies such as class Bird(Animal): pass where Bird is a subtype of Animal. Similarly, issubclass(SubClass, SuperClass) verifies if SubClass inherits from SuperClass, either directly or indirectly, facilitating runtime type verification in scenarios requiring declared subtype relations. These mechanisms complement by allowing developers to enforce nominal checks when needed, such as in validation logic. Since Python 3.8, introduced in 2019, the typing module supports structural subtyping through Protocol classes, enabling static type checkers like mypy to infer compatibility based on method signatures without inheritance. A protocol defines an interface, such as class SupportsClose(Protocol): def close(self) -> None: ..., and any class implementing close()—even without inheriting from the protocol—satisfies the subtype relation at type-checking time. This feature bridges dynamic duck typing with static analysis, allowing for "static duck typing" where tools verify structural compatibility pre-runtime. In cases of multiple inheritance, Python employs the Method Resolution Order (MRO) to linearize the class hierarchy, ensuring a consistent order for attribute and method lookup. The MRO uses C3 linearization, which merges the linearizations of superclasses while preserving local precedence and monotonicity, as formalized in Python's implementation since version 2.3. For instance, in a class class D(B, C): pass where B and C both inherit from A, the MRO for D is [D, B, C, A, object], preventing ambiguities by resolving methods from B before C. This subtyping rule supports complex hierarchies while avoiding the diamond problem pitfalls seen in earlier languages.

Theoretical Connections

Relationship with Inheritance

In object-oriented programming, inheritance serves as a common mechanism to implement nominal subtyping, where a subclass is declared to conform to its superclass based on the hierarchy, enabling the subclass to inherit and potentially override methods while being treated as the supertype in type checks. This pairing allows for polymorphic substitution, as the subtype relationship is established nominally through explicit class declarations rather than . However, subtyping fundamentally concerns behavioral compatibility, ensuring that a subtype object can replace a supertype object without altering the program's observable , whereas focuses on and incremental extension of classes. does not inherently guarantee subtyping; for instance, modifications to a superclass can produce subclasses that reuse but fail to maintain the expected behavioral contract, leading to type insecurity or restricted flexibility in typed languages. Subtyping can also be achieved independently of , such as through interfaces that enforce a common behavioral interface without sharing implementation details. One key issue in relying on for subtyping is the fragile base class problem, where seemingly innocuous changes to a base class—such as adding or modifying methods—can unexpectedly break the behavioral guarantees for derived classes, violating the and compromising . This fragility arises because tightly couples the implementation of the base to its subclasses, making evolution of the hierarchy prone to unintended side effects across the codebase. As an alternative to for achieving subtyping, employs , where a class composes instances of other classes and forwards method calls to them, enabling behavioral reuse and polymorphism without the hierarchical dependencies or fragility of . This approach promotes looser and greater , allowing subtyping relations to be defined through explicit patterns rather than implicit class hierarchies.

Coercions and Conversions

In type theory and programming languages, coercions provide a mechanism for converting values from one type to a compatible type, typically a supertype, through a transformation that may alter the value's representation while aiming to preserve its semantics. Unlike pure subsumption, which reinterprets a value of a subtype as belonging to a supertype without any change to the value itself, coercions involve active modification, such as embedding an integer's exact value into a floating-point format. For instance, coercing an integer like 2 from type Int to Float results in the floating-point number 2.0, allowing seamless use in operations expecting floats without explicit rewriting. This approach, formalized in coercive subtyping frameworks, treats subtyping as an abbreviation where a coercion function defines the subtype-supertype relationship. Coercions extend traditional structural subtyping by enabling conversions like numeric promotions, where narrower types (e.g., Int) are promoted to wider ones (e.g., Float) to facilitate arithmetic operations, but they introduce risks of information loss, such as precision reduction when coercing in the reverse direction from Float to Int. In coercive subtyping, the subtyping relation A <: B holds if there exists a unique coercion c: A → B that is definable within the type theory and composable along subtype chains, ensuring the converted value behaves equivalently to a native supertype value under the theory's semantics. These coercions must satisfy preservation conditions, including acyclicity to prevent infinite reductions and uniqueness to avoid ambiguity in type checking. Coercions are classified as implicit or explicit based on their application. Implicit coercions are automatically inserted by the during evaluation or checking, such as promoting an Int operand in a mixed-type , streamlining code without programmer intervention. Explicit coercions, often via cast operators, require deliberate specification by the developer, providing control but increasing verbosity. Regarding , safe coercions guarantee lossless or semantics-preserving transformations, adhering to syntactic restrictions for unambiguous insertion, whereas unsafe coercions permit potential data alteration or errors, as seen in truncating conversions, and are typically restricted in strict type systems to maintain overall .

References

Add your contribution
Related Hubs
User Avatar
No comments yet.