Library (computing)

In computing, a library is a collection of resources that can be used during software development to implement a computer program. Commonly, a library consists of executable code such as compiled functions and classes, or a library can be a collection of source code. A resource library may contain data such as images and text.
A library can be used by multiple, independent consumers (programs and other libraries). This differs from resources defined in a program, which can usually only be used by that program. When a consumer uses a library resource, it gains the value of the library without having to implement it itself. Libraries encourage software reuse in a modular fashion. Libraries can use other libraries, resulting in a hierarchy of libraries in a program.
When writing code that uses a library, a programmer only needs to know how to use it – its application programming interface (API) – not its internal details. For example, a program could use a library that abstracts a complicated system call so that the programmer can use the system feature without spending time learning the intricacies of the underlying system call.
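As a minimal illustration in C, a program can use the C standard I/O library through its documented API alone, without knowing how that library invokes the platform's underlying system calls:

```c
#include <stdio.h>  /* the library's API: declares FILE, fopen, fprintf, fclose */

int main(void) {
    /* The stdio library abstracts the platform's system calls
       (open/write on POSIX systems, other mechanisms elsewhere);
       the caller needs only the documented interface. */
    FILE *f = fopen("out.txt", "w");
    if (f == NULL)
        return 1;
    fprintf(f, "hello\n");
    fclose(f);
    return 0;
}
```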
History
The idea of a computer library dates back to the first computers created by Charles Babbage. An 1888 paper on his Analytical Engine suggested that computer operations could be punched on separate cards from numerical input. If these operation punch cards were saved for reuse then "by degrees the engine would have a library of its own."[1]

In 1947 Goldstine and von Neumann speculated that it would be useful to create a "library" of subroutines for their work on the IAS machine, an early computer that was not yet operational at that time.[2] They envisioned a physical library of magnetic wire recordings, with each wire storing reusable computer code.[3]
Inspired by von Neumann, Wilkes and his team constructed EDSAC. A filing cabinet of punched tape held the subroutine library for this computer.[4] Programs for EDSAC consisted of a main program and a sequence of subroutines copied from the subroutine library.[5] In 1951 the team published the first textbook on programming, The Preparation of Programs for an Electronic Digital Computer, which detailed the creation and the purpose of the library.[6]
COBOL included "primitive capabilities for a library system" in 1959,[7] but Jean Sammet described them as "inadequate library facilities" in retrospect.[8]
JOVIAL has a Communication Pool (COMPOOL), roughly a library of header files.
Another major contributor to the modern library concept came in the form of the subprogram innovation of FORTRAN. FORTRAN subprograms can be compiled independently of each other, but the compiler lacked a linker, so prior to the introduction of modules in Fortran 90, type checking between FORTRAN[NB 1] subprograms was impossible.[9]
By the mid-1960s, copy and macro libraries for assemblers were common. Starting with the popularity of the IBM System/360, libraries containing other types of text elements, e.g., system parameters, also became common.
In IBM's OS/360 and its successors this is called a partitioned data set.
The first object-oriented programming language, Simula, developed in 1965, supported adding classes to libraries via its compiler.[10][11]
Linking
The linking (or binding) process resolves references known as symbols (or links) by searching for them in various locations, including configured libraries. If a linker (or binder) cannot find a symbol, it fails; depending on the tool and its configuration, multiple matches for the same symbol may or may not cause failure.
Static linking is linking at build time, such that the library executable code is included in the program. Dynamic linking is linking at run time; it involves building the program with information that supports run-time linking to a dynamic link library (DLL). For dynamic linking, a compatible DLL file must be available to the program at run time, but for static linking, the program is standalone.
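As an illustrative sketch, assuming a GNU toolchain, the same C source can be linked either way; the choice is made entirely at link time, not in the code:

```c
/* main.c - identical source for both builds.
 *
 * Static linking (library code copied into the executable):
 *   gcc -static main.c -lm -o prog_static
 *
 * Dynamic linking (the default; libm is resolved at run time):
 *   gcc main.c -lm -o prog_dynamic
 */
#include <math.h>
#include <stdio.h>

int main(void) {
    printf("cbrt(27) = %f\n", cbrt(27.0));
    return 0;
}
```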
Smart linking is performed by a build tool that excludes unused code in the linking process. For example, a program that only uses integers for arithmetic, or does no arithmetic operations at all, can exclude floating-point library routines. This can lead to smaller program file size and reduced memory usage.
Relocation
Some references in a program or library module are stored in a relative or symbolic form which cannot be resolved until all code and libraries are assigned final static addresses. Relocation is the process of adjusting these references, and is done either by the linker or the loader. In general, relocation cannot be done to individual libraries themselves because the addresses in memory may vary depending on the program using them and other libraries they are combined with. Position-independent code avoids references to absolute addresses and therefore does not require relocation.
Categories
Executable
An executable library consists of code that has been converted from source code into machine code or an intermediate form such as bytecode. A linker allows for using library objects by associating each reference with an address at which the object is located. For example, in C, a library function is invoked via C's normal function call syntax and semantics.[12]
A variant is a library containing compiled code (object code in IBM's nomenclature) in a form that cannot be loaded by the OS but that can be read by the linker.
Static
A static library is an executable library that is linked into a program at build time by a linker (or an equivalent build tool).[13][14] This process, and the resulting stand-alone file, is known as a static build of the program. A static build may not need any further relocation if virtual memory is used and no address space layout randomization is desired.[15]
A static library is sometimes called an archive on Unix-like systems.
Dynamic
A dynamic library is linked when the program is run – either at load time or at runtime. Dynamic libraries were introduced after static libraries to support additional software deployment flexibility.
Sources
A source library consists of source code, not compiled code.
Shared
A shared library is a library that contains executable code designed to be used by multiple computer programs or other libraries at runtime, with only one copy of that code in memory, shared by all programs using the code.[16][17][18]
Object
Although generally an obsolete technology today, an object library exposes resources for object-oriented programming (OOP) and a distributed object is a remote object library. Examples include: COM/DCOM, SOM/DSOM, DOE, PDO and various CORBA-based systems.
Object library technology was developed because, as OOP became popular, it became apparent that OOP runtime binding required information that contemporary libraries did not provide. In addition to the names and entry points of the code located within, OOP binding also requires a list of dependencies, because, due to inheritance, the full definition of a method may be in different places. Further, this requires more than listing that one library requires the services of another: in OOP, the libraries themselves may not be known at compile time, and may vary from system to system.
Remote object technology was developed in parallel to support multi-tier programs, in which a user interface application running on a personal computer (PC) uses the services of a mainframe or minicomputer, such as data storage and processing. For instance, a program on a PC might send messages to a minicomputer via remote procedure call (RPC) to retrieve relatively small samples from a relatively large dataset. Distributed object technology developed in response to this need.
Class
A class library contains classes that can be used to create objects. In Java, for example, classes are contained in JAR files and objects are created at runtime from the classes. However, in Smalltalk, a class library is the starting point for a system image that includes the entire state of the environment, classes and all instantiated objects. Most class libraries are stored in a package repository (such as Maven Central for Java). Client code explicitly specifies dependencies on external libraries in build configuration files (such as a Maven POM in Java).
Remote
A remote library runs on another computer and its assets are accessed via remote procedure call (RPC) over a network. This distributed architecture minimizes the installation and support burden of the library on each consuming system and ensures consistent versioning. A significant downside is that each library call entails significantly more overhead than a call to a local library.
Runtime
A runtime library provides access to the runtime environment that is available to a program – tailored to the host platform.
Language standard
Many modern programming languages specify a standard library that provides a base level of functionality for the language environment.
Code generation
A code generation library has a high-level API for generating or transforming bytecode, e.g. for Java. Such libraries are used by aspect-oriented programming, by some data access frameworks, and in testing to generate dynamic proxy objects. They are also used to intercept field access.[19]
File naming
Unix-like
On most modern Unix-like systems, library files are stored in directories such as /lib, /usr/lib and /usr/local/lib. A filename typically starts with lib, and ends with .a for a static library (archive) or .so for a shared object (dynamically linked library). For example, libfoo.a and libfoo.so.
Often, symbolic link files are used to manage versioning of a library by providing a link file, named without a version, that links to a file named with a version. For example, libfoo.so.2 might be version 2 of library foo, and a link file named libfoo.so provides a version-independent name that programs link against. The link file could be changed to refer to version 3 (libfoo.so.3) so that consuming programs use version 3 without having to be changed themselves.
Files with the extension .la are libtool archives; they are not usable by the system itself.
macOS
The macOS system inherits static library conventions from BSD, with the library stored in a .a file. It uses either .so or .dylib for dynamic libraries. Most libraries in macOS, however, consist of "frameworks", placed inside special directories called "bundles" which wrap the library's required files and metadata. For example, a framework called Abc would be implemented in a bundle called Abc.framework, with Abc.framework/Abc being either the dynamically linked library file or a symlink to the dynamically linked library file in Abc.framework/Versions/Current/Abc.
Windows
Often, a Windows dynamic-link library (DLL) has the file extension .dll,[20] although sometimes different extensions are used to indicate general content, e.g. .ocx for an OLE library.
A .lib file can be either a static library or contain the information needed to build an application that consumes the associated DLL. In the latter case, the associated DLL file must be present at runtime.
See also
- Code reuse – Using existing code in new software
- Object file – File containing relocatable format machine code
- Plug-in – Software component that extends the functionality of existing software
- Prelink, also known as prebinding
- Runtime library – Access to a program's runtime environment
- Visual Component Library – Object Pascal framework for Windows (VCL)
- Component Library for Cross Platform (CLX)
- C standard library – Standard library for the C programming language
- Java Class Library – Core Java libraries
- Framework Class Library – Standard library of Microsoft's .NET Framework
- Generic programming – Style of computer programming (used by the C++ Standard Library)
- soname – Field of data in a shared object file
- Method stub – Short and simple version of a method
- List of open source code libraries
Notes
- ^ It was possible earlier between, e.g., Ada subprograms.
References
- ^ Babbage, H. P. (1888-09-12). "The Analytical Engine". Proceedings of the British Association. Bath.
- ^ Goldstine, Herman H. (2008-12-31). The Computer from Pascal to von Neumann. Princeton: Princeton University Press. doi:10.1515/9781400820139. ISBN 978-1-4008-2013-9.
- ^ Goldstine, Herman; von Neumann, John (1947). Planning and coding of problems for an electronic computing instrument (Report). Institute for Advanced Study. pp. 3, 21–22. OCLC 26239859: "it will probably be very important to develop an extensive 'library' of subroutines".
- ^ Wilkes, M. V. (1951). "The EDSAC Computer". 1951 International Workshop on Managing Requirements Knowledge. 1951 International Workshop on Managing Requirements Knowledge. IEEE. p. 79. doi:10.1109/afips.1951.13.
- ^ Campbell-Kelly, Martin (September 2011). "In Praise of 'Wilkes, Wheeler, and Gill'". Communications of the ACM. 54 (9): 25–27. doi:10.1145/1995376.1995386. S2CID 20261972.
- ^ Wilkes, Maurice; Wheeler, David; Gill, Stanley (1951). The Preparation of Programs for an Electronic Digital Computer. Addison-Wesley. pp. 45, 80–91, 100. OCLC 641145988.
- ^ Wexelblat, Richard (1981). History of Programming Languages. ACM Monograph Series. New York, NY: Academic Press (A subsidiary of Harcourt Brace). p. 274. ISBN 0-12-745040-8.
- ^ Wexelblat, op. cit., p. 258
- ^ Wilson, Leslie B.; Clark, Robert G. (1988). Comparative Programming Languages. Wokingham, England: Addison-Wesley. p. 126. ISBN 0-201-18483-4.
- ^ Wilson and Clark, op. cit., p. 52
- ^ Wexelblat, op. cit., p. 716
- ^ Deshpande, Prasad (2013). Metamorphic Detection Using Function Call Graph Analysis (Thesis). San Jose State University Library. doi:10.31979/etd.t9xm-ahsc.
- ^ "Static Libraries". TLDP. Archived from the original on 2013-07-03. Retrieved 2013-10-03.
- ^ Kaminsky, Dan (2008). "Chapter 3 - Portable Executable and Executable and Linking Formats". Reverse Engineering Code with IDA Pro. Elsevier. pp. 37–66. doi:10.1016/b978-1-59749-237-9.00003-x. ISBN 978-1-59749-237-9. Retrieved 2021-05-27.
- ^ Collberg, Christian; Hartman, John H.; Babu, Sridivya; Udupa, Sharath K. (2003). SLINKY: Static Linking Reloaded. USENIX '05. Department of Computer Science, University of Arizona. Archived from the original on 2016-03-23. Retrieved 2016-03-17.
- ^ Levine, John R. (2000). "9. Shared Libraries". Linkers and Loaders. ISBN 1-55860-496-0.
- ^ UNIX System V/386 Release 3.2 Programmers Guide, Vol. 1 (PDF). Prentice Hall. 1989. p. 8-2. ISBN 0-13-944877-2.
- ^ "Shared Libraries in SunOS" (PDF). pp. 1, 3.
- ^ "Code Generation Library". Source Forge. Archived from the original on 2010-01-12. Retrieved 2010-03-03.
Byte Code Generation Library is high level API to generate and transform JAVA byte code. It is used by AOP, testing, data access frameworks to generate dynamic proxy objects and intercept field access.
- ^ Bresnahan, Christine; Blum, Richard (2015-04-27). LPIC-1 Linux Professional Institute Certification Study Guide: Exam 101-400 and Exam 102-400. John Wiley & Sons (published 2015). p. 82. ISBN 9781119021186. Archived from the original on 2015-09-24. Retrieved 2015-09-03: "Linux shared libraries are similar to the dynamic link libraries (DLLs) of Windows. Windows DLLs are usually identified by .dll filename extensions."
Further reading
- Levine, John R. (2000) [October 1999]. "Chapter 9: Shared Libraries & Chapter 10: Dynamic Linking and Loading". Linkers and Loaders. The Morgan Kaufmann Series in Software Engineering and Programming (1 ed.). San Francisco, USA: Morgan Kaufmann. ISBN 1-55860-496-0. OCLC 42413382. Archived from the original on 2012-12-05. Retrieved 2020-01-12. Code: [1][2][dead link] Errata: [3]
- Article Beginner's Guide to Linkers by David Drysdale
- Article Faster C++ program startups by improving runtime linking efficiency by Léon Bottou and John Ryland
- How to Create Program Libraries by Baris Simsek
- BFD - the Binary File Descriptor Library
- 1st Library-Centric Software Design Workshop LCSD'05 Archived 2019-08-28 at the Wayback Machine at OOPSLA'05
- 2nd Library-Centric Software Design Workshop LCSD'06 at OOPSLA'06
- How to create shared library by Ulrich Drepper (with much background info)
- Anatomy of Linux dynamic libraries at IBM.com
Fundamentals
Definition
In computing, a software library is a collection of pre-compiled routines, functions, classes, or data structures designed to perform common tasks and enable reuse across multiple programs.[7] These components encapsulate tested and optimized code, allowing developers to integrate functionality without rewriting it from scratch.[8] Key characteristics of software libraries include modularity, which organizes code into independent, self-contained units; reusability, permitting the same components to be employed in diverse applications; and abstraction, which hides implementation details from the calling code while exposing only necessary interfaces.[9] This design promotes efficient software development by reducing redundancy and enhancing maintainability.[4]

Unlike standalone executables, which are complete programs that can run independently to perform specific operations, libraries serve as modular components that must be linked into applications during compilation or runtime to provide their functionality.[10] For instance, mathematics libraries such as Apache Commons Math provide routines for statistical computations and linear algebra, while graphical user interface (GUI) libraries like those in Java's Swing toolkit offer reusable components for building interactive interfaces.[11][12]

Purpose and Benefits
Software libraries primarily enable code reuse, preventing duplication of effort by allowing developers to incorporate pre-implemented functionalities into new programs rather than writing them from scratch each time. This approach accelerates development by providing access to thoroughly tested components that have already been refined through widespread use, thereby reducing the time required to build and validate software. Furthermore, libraries standardize common operations—such as data processing or input/output handling—across different applications, promoting consistency and easing integration in larger systems.[13][14]

The benefits of libraries extend to enhanced efficiency and quality in software engineering. They significantly cut development time and costs by eliminating the need to "reinvent the wheel" for routine tasks, allowing teams to focus on unique aspects of their projects. Reliability improves as libraries often contain mature, debugged code that has undergone extensive testing in diverse environments, lowering the risk of errors in dependent applications. For dynamic libraries, maintenance becomes more straightforward since updates or fixes to the library can propagate to all programs using it without requiring recompilation, avoiding repetitive modifications across multiple codebases and enabling centralized improvements.[13][15][14][5]

While libraries offer these advantages, they also involve trade-offs, particularly in dependency management, where handling versions, compatibility issues, and potential conflicts can add complexity to builds and deployments. However, this overhead is frequently offset by the modularity libraries introduce, which supports cleaner separation of concerns and more adaptable software designs. For example, a developer might use Python's Requests library to manage HTTP protocols and handle connections abstractly, avoiding the need to implement low-level socket programming manually for each network-enabled application.[13]

History
Origins
The origins of libraries in computing emerged during the 1940s and 1950s, as programmers developed subroutine libraries in assembly language to enable code reuse on early electronic computers such as ENIAC and UNIVAC. On ENIAC, completed in 1945, the team of programmers, including Betty Snyder Holberton, utilized subroutines to extend the machine's flexibility beyond its initial wiring for specific tasks like ballistic trajectory calculations, allowing repetitive operations to be modularized and invoked as needed.[16] This approach addressed the limitations of ENIAC's plugboard-based programming, where reconfiguration for new problems was labor-intensive. Similarly, on UNIVAC I, delivered in 1951, subroutines formed the basis of early software organization, with programmers manually incorporating reusable routines for data processing and arithmetic operations.[17]

A key milestone in the 1950s was the development of linking loaders, particularly in IBM systems, which automated the integration of reusable code blocks and reduced manual intervention. Grace Hopper's A-0 system, implemented in 1952 for the UNIVAC, functioned as an early compiler and linker that automatically selected and assembled subroutines from a library based on symbolic instructions, marking a shift toward automated code linking.[18] For IBM's 704 computer, introduced in 1954, the linking loader enabled the combination of relocatable object modules and library routines at load time, supporting modular programming by resolving addresses and external references dynamically.[19] These loaders, among the first full-featured examples, facilitated the creation of subroutine libraries that could be shared across programs, significantly improving efficiency on vacuum-tube mainframes.[19]

The influence of mathematical subroutine libraries became prominent with the release of FORTRAN in 1957, designed for scientific computing on the IBM 704. FORTRAN's system included a library of precompiled mathematical subroutines, such as those for trigonometric functions (e.g., SINF) and absolute values (e.g., ABSF), stored in relocatable binary form on the master tape for easy incorporation into user programs via function calls.[20] These routines, supporting fixed- and floating-point operations with up to eight decimal digits of precision, were passed arguments through registers or common storage and returned results accordingly, enabling complex numerical computations without rewriting basic algorithms.[20] This standardization drew from earlier mathematical libraries but integrated them seamlessly into high-level code, promoting reusability in fields like physics and engineering.

Early challenges in these foundational libraries included manual assembly and a lack of standardization, which hampered portability and reliability. On ENIAC, subroutines required physical rewiring or switch settings for each invocation, making maintenance error-prone and time-consuming.[16] Even with UNIVAC and early IBM systems, programmers often had to hand-code and explicitly include subroutines in assembly listings, without uniform formats for relocation or symbol resolution, leading to redundant efforts and compatibility issues across machines.[19] These limitations underscored the need for automated tools like linking loaders, setting the stage for more robust library systems.

Evolution
The evolution of computing libraries began in the 1960s and 1970s with the rise of high-level programming languages that facilitated modular code reuse through object code libraries and associated mechanisms like header files. The development of the C programming language in 1972 at Bell Labs marked a pivotal advancement, as it introduced structured approaches to compiling code into reusable object files that could be archived into libraries, enabling efficient linking for Unix-based systems.[21] Header files in C further supported this by allowing declarations of library functions to be shared across source files, promoting portability and reducing redundancy in early system programming.[22] During this era, Unix environments relied primarily on static object code libraries, such as those for mathematical functions, which were integrated at compile time to build robust utilities and operating system components.[23]

In the 1980s and 1990s, libraries advanced toward dynamic linking and object-oriented paradigms, enhancing runtime flexibility and application development efficiency. Microsoft introduced dynamic-link libraries (DLLs) with Windows 1.0 in 1985, allowing code and resources to be shared across multiple applications at runtime, which reduced memory usage and simplified updates in the burgeoning Windows ecosystem.[24] This built on earlier Unix concepts but adapted them for graphical user interfaces and broader commercial software. By the early 1990s, object-oriented class libraries emerged, exemplified by Microsoft's Foundation Classes (MFC), first released in 1992 with Microsoft C/C++ 7.0, which provided C++ wrappers for Windows APIs, streamlining GUI and event-driven programming through inheritance and encapsulation.[25] These innovations shifted library design from mere code repositories to comprehensive frameworks that supported complex, reusable abstractions.

The 2000s saw the proliferation of open-source ecosystems, democratizing library access and fostering collaborative development across platforms. The GNU Project's libraries, particularly the GNU C Library (glibc), evolved significantly during this period, integrating with Linux distributions to provide standardized interfaces for system calls, threading, and internationalization, underpinning millions of open-source applications.[26] Package managers like npm, introduced in 2010 for Node.js, revolutionized dependency management by enabling declarative installation of JavaScript libraries from centralized registries, accelerating web development and reducing version conflicts in large-scale projects. Cross-platform standards, such as those refined in POSIX.1-2008 and emerging .NET frameworks, further promoted interoperability, allowing libraries to function seamlessly across Unix-like systems, Windows, and Java virtual machines without extensive rewrites.

Recent trends from the 2010s to 2025 have integrated libraries with containerization and portable runtimes, addressing deployment challenges in distributed environments. Docker, launched in 2013, transformed library usage by encapsulating dependencies within lightweight containers, ensuring consistent runtime behavior across development, testing, and production while mitigating "works on my machine" issues through isolated library versions.[27] This approach has influenced library design, with many now optimized for container-friendly formats that minimize image sizes and enhance security via layered builds. Concurrently, WebAssembly (Wasm) modules have emerged as a versatile library format, compiling languages like C++, Rust, and Go into efficient, sandboxed binaries that run near-natively in browsers and edge computing setups, with advancements in the WebAssembly System Interface (WASI) by 2025 enabling secure, cross-platform library sharing beyond the web.

Types
Static Libraries
Static libraries, also referred to as archive libraries, consist of collections of precompiled object files bundled together into a single archive file. On Unix-like systems, these archives are created using the ar utility and conventionally named with a .a extension, such as libexample.a, where the lib prefix and .a suffix are standard conventions followed by the linker. On Windows platforms, static libraries use the .lib extension and are generated by tools like the Microsoft Visual Studio librarian.[28][29]
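A minimal sketch of creating and consuming such an archive, with hypothetical file and function names and a GNU toolchain assumed:

```c
/* util.c - one member of the (hypothetical) static library libutil.a.
 *
 * Typical build steps on a Unix-like system:
 *   gcc -c util.c -o util.o       # compile to an object file
 *   ar rcs libutil.a util.o       # archive it into a static library
 *   gcc main.c -L. -lutil -o app  # copy the needed objects in at link time
 */
int double_it(int x) { return 2 * x; }
```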
During the compilation and linking phase, the linker processes a static library by scanning its archive for object files that define symbols referenced by the program. Only the necessary object files are extracted and their machine code is directly embedded into the final executable, resolving all dependencies at build time without leaving any unresolved external references. This results in a standalone binary that incorporates the library code verbatim.[28][30]
The primary advantages of static libraries include the elimination of runtime dependencies, enabling executables to run on any compatible system without requiring additional library installations or version matching, which simplifies deployment and enhances portability. Execution can also be marginally faster due to the absence of dynamic loading overhead and function indirection. However, drawbacks include significantly larger executable sizes from embedding full library code, potential code duplication across multiple programs using the same library, and the need to recompile all dependent applications whenever the library is updated or bug-fixed.[30][29]
A representative example is the GNU C library's math library, libm.a, which provides static implementations of mathematical functions such as sin(), cos(), and sqrt(). When a C program includes <math.h> and invokes these functions, the linker incorporates the relevant object code from libm.a into the executable, ensuring all arithmetic operations are self-contained.[28]
Unlike dynamic libraries, static libraries provide fixed code integration at compile time, avoiding runtime flexibility but guaranteeing consistency.[30]
Dynamic Libraries
Dynamic libraries, also known as dynamically linked libraries, are collections of executable code and data that are loaded into memory by the operating system's loader at program execution time, enabling multiple processes to share the same library instance for efficiency.[24][31] This runtime loading contrasts with static libraries, which embed code directly into the executable during compilation.[32] A key feature of dynamic libraries is delayed binding, where symbol resolution—mapping function names to their actual addresses—occurs only when the code is first executed, rather than at load time, which reduces initial startup overhead through mechanisms like procedure linkage tables (PLTs).[32] They also support versioned loading, allowing the system to select specific library versions (e.g., via version numbers in filenames) to match application requirements without recompilation.[32]

The primary advantages include smaller executable file sizes, as the library code is not duplicated in each program, leading to reduced disk storage needs.[24] Memory efficiency is achieved by loading the library once into shared memory space, accessible by multiple applications simultaneously, which optimizes resource usage in multitasking environments.[31] Updates to the library can be applied centrally without modifying or redistributing individual executables, facilitating maintenance and bug fixes across systems.[32]

However, dynamic libraries introduce challenges such as dependency issues, where programs may fail if required libraries are missing or incompatible with the system's versions.[33] A notable disadvantage is "DLL Hell," a conflict arising when installing one application overwrites or replaces a shared library version needed by another, causing unexpected failures due to mismatched entry points or interfaces.[33] Additionally, runtime binding can impose a slight performance penalty, estimated at 5-15%, from generating position-independent code and resolving symbols on demand.[32]

Examples of dynamic libraries include Windows DLL files, such as those implementing the Windows API (e.g., kernel32.dll), which are loaded explicitly or implicitly at runtime to provide system services.[24] On Linux, shared object (.so) files like libpthread.so.0 serve similar purposes, often used for plugins that extend application functionality without rebuilding the core program.[32]

Shared Libraries
Shared libraries, also known as dynamic shared objects (DSOs) in Unix-like systems, are executable files containing code and data that can be loaded into memory once and mapped into the virtual address space of multiple processes simultaneously, allowing concurrent use by different applications without duplication.[23] These libraries typically have filenames ending in .so (shared object) on Linux and similar systems, and they are designed to provide reusable functions, such as those in the C standard library, that multiple programs can access at runtime.[34]
The loading and unloading of shared libraries are managed by the operating system's dynamic linker, which uses mechanisms like reference counting to track usage across processes. When a process requires a shared library, the linker loads it via functions such as dlopen(), increments a reference count for each dependent module, and performs necessary relocations to integrate it into the process's address space; inter-process compatibility is ensured by sharing read-only text segments while keeping data segments private per process. Unloading occurs through dlclose(), which decrements the reference count, and the library is only removed from memory when the count reaches zero and no other dependencies remain, preventing premature deallocation.[35][36]
Shared libraries offer significant advantages in resource conservation, as a single instance in physical memory serves multiple processes, reducing overall RAM usage and disk space compared to embedding code in each executable. This sharing also facilitates easier updates to library code without recompiling dependent applications, promoting efficiency in large-scale systems. However, they introduce disadvantages such as version conflicts, where incompatible changes in library versions can break applications (often termed "DLL hell" in Windows contexts, with analogous issues in Unix), requiring careful versioning schemes like SONAMEs to mitigate. Additionally, security risks arise from shared writable and executable memory segments, which can enable code injection attacks if not protected by features like RELRO (relocation read-only).[23][34][36]
A prominent example is libc.so, the GNU C Library shared object, which provides essential functions like printf() and malloc() and is loaded once into memory to serve all C programs on a system, exemplifying how shared libraries conserve resources in everyday computing environments.[23]
Object Libraries
Object libraries serve as an intermediate artifact in the software build process, consisting of collections of compiled object files—typically with extensions like .o on Unix-like systems or .obj on Windows—that remain unlinked and contain machine code modules derived from source files.[37] These files are generated by compilers such as GCC or Clang after the assembly stage, preserving unresolved symbols and relocation information for subsequent linking.[38] Unlike source code or fully linked executables, object libraries facilitate modular development by allowing separate compilation of individual modules without immediate resolution of external dependencies.[39]
In build pipelines, object libraries act as inputs to linkers and archivers, enabling the creation of static or dynamic libraries as well as executables. For instance, tools like the GNU archiver (ar) can package multiple object files into an archive file, which then serves as a unit for the linker to extract and incorporate only the necessary modules during final assembly.[40] This approach supports large-scale projects by decoupling compilation from linking, allowing build systems such as Make or CMake to manage dependencies efficiently. In CMake, for example, an object library target compiles sources without producing a linkable artifact, permitting those objects to be reused across multiple downstream targets via generator expressions like $<TARGET_OBJECTS:libname>.[37]
A primary advantage of object libraries is support for incremental compilation, where only modified source files need recompilation, significantly reducing build times in expansive codebases compared to full recompilations.[41] This modularity also promotes code reuse and team collaboration, as developers can compile and share object modules independently before integration. However, a key disadvantage is that object libraries are not directly executable; they require further processing by a linker to resolve symbols and generate runnable binaries, limiting their standalone utility.[38]
For example, the GNU Binutils ar tool can create an object library archive named libfoo.a from several object files using the command ar rcs libfoo.a file1.o file2.o file3.o, where r adds or replaces members, c ensures creation if absent, and s generates an index of symbols for accelerated linking.[40] This archive then provides the unlinked object modules as input for subsequent linking steps.
Runtime Libraries
Runtime libraries are collections of low-level routines and functions that support the execution of programs by managing essential runtime operations, such as initialization, resource allocation, and error management, often tailored to specific compilers or platforms.[42] In the context of C programming, the runtime library includes components like crt0, which serves as the startup code responsible for setting up the program's execution environment before invoking the main function.[43] These libraries bridge the gap between compiled code and the underlying operating system, ensuring that programs can perform necessary tasks without direct hardware interaction.[44]

Key components of runtime libraries typically encompass startup and initialization code, which prepares the stack, initializes global variables, and handles program termination; standard input/output routines for file and console operations; memory allocation functions like malloc and free; and error handling mechanisms, including exception support for languages like C++.[42] For instance, in GCC's libgcc, these include arithmetic operations, exception handling routines, and basic memory operations such as memcpy, all implemented to support operations not feasible inline.[43] In Microsoft's C runtime library, similar features are provided through modules like vcruntime.lib for exception handling and the Universal CRT for memory management and I/O, enabling robust execution in multi-threaded environments.[42]

The primary importance of runtime libraries lies in their role in enhancing software portability by abstracting platform-specific details, allowing programs to execute consistently across different operating systems and hardware architectures without extensive modifications.[45] This abstraction layer simplifies cross-platform development, as the library handles variations in system calls and resource management, thereby reducing the need for application-level adaptations.[46] For example, by standardizing runtime behaviors, these libraries ensure that applications compiled on one environment can run reliably on another, promoting broader compatibility.[47]

A prominent example is the Java Runtime Environment (JRE), which includes libraries essential for supporting the Java Virtual Machine (JVM) during program execution, such as those for garbage collection, threading, and security management.[48] The JRE provides the necessary runtime components to load, verify, and execute Java bytecode, abstracting OS differences to enable "write once, run anywhere" portability across diverse platforms like Windows, Linux, and macOS.[49] This setup ensures that Java applications rely on the JRE's libraries for core execution needs, including memory allocation and exception propagation within the JVM.[48]

Standard Libraries
Standard libraries in computing refer to the official collections of functions, types, macros, and modules that are specified and mandated by a programming language's international standard, ensuring a consistent set of core functionalities across compliant implementations. These libraries form an integral part of the language specification, providing essential building blocks for tasks such as input/output operations, data manipulation, and mathematical computations, without which the language would lack portability and standardization.[50][51]

The C Standard Library, as defined in the ISO/IEC 9899 specification, exemplifies this concept through its inclusion of headers like <stdio.h> for stream-based input and output, <string.h> for string handling functions such as strlen and strcpy, and <math.h> for mathematical operations including sin, cos, and pow.[50] This library has evolved with standard revisions; for instance, the C11 edition (ISO/IEC 9899:2011) introduced enhancements like improved support for Unicode multibyte characters and atomic operations in <stdatomic.h>, expanding coverage to concurrent programming needs, while the latest C23 edition (ISO/IEC 9899:2024, published October 2024) adds features such as a built-in bool type, bit-precise integer types, and improved attributes for better code annotation and diagnostics.[50][52] Similarly, the C++ Standard Library, outlined in ISO/IEC 14882, builds upon the C library while adding object-oriented and generic programming support, such as the Standard Template Library (STL) components for containers (e.g., std::vector) and algorithms (e.g., std::sort), with the C++17 edition (ISO/IEC 14882:2017) incorporating features like the filesystem library in <filesystem> for directory and file operations, and the current C++23 edition (ISO/IEC 14882:2024, published October 2024) introducing standard library modules for better encapsulation, enhanced coroutines, and improvements to ranges and concepts.[51][53]
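As a small illustrative sketch (assuming a C11 compiler; built with something like gcc -std=c11 example.c -lm), two of the standard headers named above can be exercised together:

```c
/* example.c - exercises <math.h> and C11's <stdatomic.h>. */
#include <math.h>
#include <stdatomic.h>
#include <stdio.h>

int main(void) {
    atomic_int counter = 0;         /* C11 atomic integer type */
    atomic_fetch_add(&counter, 1);  /* standard atomic read-modify-write */
    printf("2^10 = %.0f, counter = %d\n",
           pow(2.0, 10.0), atomic_load(&counter));
    return 0;
}
```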
In dynamically typed languages like Python, the standard library comprises a suite of built-in modules that interface with the operating system and provide standardized solutions for common tasks, as specified in the Python language reference (version 3.14 as of October 2025). Key examples include the os module for platform-independent path manipulations and process interactions, and the sys module for accessing interpreter-specific variables and command-line arguments.[54] These modules cover areas such as file I/O via io, mathematical functions in math, and string processing with built-in methods, ensuring developers have immediate access to robust, cross-platform tools.[54]
The primary role of standard libraries is to guarantee interoperability, allowing code written for one compliant compiler or interpreter to function predictably on another, while establishing a baseline of features that promote code reusability and reduce development overhead.[50][51][55] By mandating these implementations, language standards foster an ecosystem where baseline functionality is uniform, though actual runtime support may vary by platform.[55]
Specialized Libraries
Class libraries represent collections of object-oriented classes, interfaces, and other types designed to provide reusable components for software development. In the .NET ecosystem, the Framework Class Library (FCL) serves as a primary example, offering a comprehensive set of namespaces and types that encapsulate system functionality, including data types, interfaces, and utilities for tasks such as input/output, networking, and security.[56] These libraries promote modularity by allowing developers to assemble applications from pre-built, extensible components, reducing redundancy and enhancing maintainability through inheritance and polymorphism.[57]

Remote libraries facilitate distributed computing by enabling communication between components across different machines or processes, often through generated stubs that abstract network interactions. In systems like CORBA (Common Object Request Broker Architecture), Interface Definition Language (IDL) files define remote object interfaces, from which client and server stubs are automatically generated to handle method invocations as if they were local calls, managing marshaling, location transparency, and error handling.[58] Similarly, Java Remote Method Invocation (RMI) employs stubs as proxy objects that serialize parameters and forward requests to remote servers, supporting object-oriented features like polymorphism in distributed environments without requiring explicit socket programming.[59] These libraries are essential for building scalable, heterogeneous systems where services are invoked remotely, such as in enterprise applications or microservices architectures.

Code generation libraries automate the creation of source code during the build process, targeting specific domains like parsing or protocol implementation to streamline development. ANTLR (ANother Tool for Language Recognition), a prominent parser generator, takes grammar specifications in files (e.g., .g4 format) and produces lexer and parser code in languages like Java, C++, or Python at compile time, enabling efficient processing of structured inputs such as programming languages or configuration files.[60] This approach ensures high performance by generating tailored, recursive-descent parsers that can construct abstract syntax trees, avoiding the need for manual coding of lexical analysis and syntax rules.[60]

Beyond these, specialized libraries include header-only variants, prevalent in C++ for template-heavy implementations, where all code resides in header files to allow the compiler to instantiate templates inline without separate compilation or linking steps. This design minimizes deployment complexity and optimizes for generic programming, as seen in many components of the Boost C++ Libraries, which distribute source code for user compilation while keeping template-based modules header-only to leverage C++'s type system fully.[61] Source-based libraries like Boost extend this by providing portable, peer-reviewed code that users build from source, ensuring compatibility across platforms and compilers without precompiled binaries.[62]

Linking
Static Linking
Static linking is a process in which the linker incorporates the necessary code from static libraries directly into the final executable file during the build phase, resulting in a self-contained program that does not rely on external libraries at runtime.[63] This approach ensures that all required library functions and data are resolved and embedded at compile time, producing a larger but standalone binary.[28]

The linker, such as the GNU linker (ld), combines multiple input object files—generated from source code compilation—with static libraries (typically in .a archive format) to form a single executable.[63] During this phase, external symbol references in the object files are matched to their corresponding definitions within the static libraries, a process known as symbol resolution.[64] The linker scans the symbol tables of the inputs to identify undefined symbols and locates their implementations, ensuring all dependencies are satisfied before generating the output.

The linking process involves several key steps: first, the linker processes the command-line inputs in order, starting with the program's object files and then the specified static libraries via options like -l.[64] It scans each static library archive sequentially, examining its internal object files to find those that provide definitions for currently unresolved symbols.[63] Only the necessary object files are extracted from the archive and incorporated, avoiding inclusion of unused code to optimize the executable size; this selective extraction repeats as new undefined symbols arise from previously pulled objects.[63] Dependencies between libraries are resolved by iterating through the archives multiple times if required, often using constructs like --start-group and --end-group to handle circular references.[64] Once all symbols are resolved, the linker merges the code, data, and other sections from the selected objects into a cohesive executable layout.[63]

A common example of invoking static linking is with the GCC compiler using the -static flag, as in gcc -static main.c -o program -lm, which directs the linker to embed the math library (libm.a) and other dependencies directly into the resulting program executable.[65] This contrasts with dynamic linking, where library resolution occurs at runtime.[28]
Dynamic Linking
Dynamic linking, also referred to as shared linking, is a mechanism in which the symbols referenced by an executable are resolved at load time or during execution, rather than being fully integrated at compile time. This process begins when the operating system's loader identifies a dynamically linked executable and invokes a runtime linker, such as ld.so on Linux systems, to map the necessary shared libraries into the process's virtual address space. The runtime linker recursively loads dependencies listed in the executable's dynamic section (e.g., via DT_NEEDED entries in ELF files), applies initial relocations to itself, and prepares stubs—such as the Procedure Linkage Table (PLT)—for deferred symbol resolution. This approach enables memory sharing among multiple processes and facilitates updates to libraries without recompiling applications.[66][67][34]
Dynamic linking supports two primary modes: load-time (eager) binding and run-time (lazy) binding. In eager binding, all external symbols are resolved immediately upon program startup, often enforced by environment variables like LD_BIND_NOW on Linux or linker flags such as -z now, ensuring complete symbol resolution before execution proceeds but potentially increasing startup latency. Lazy binding, the default in many systems, defers resolution until a symbol is first invoked; for instance, a PLT stub initially redirects calls to the runtime linker, which then updates the Global Offset Table (GOT) with the actual function address for subsequent direct calls, optimizing initial load times especially for large applications with unused library functions. Unlike static linking, which embeds library code directly into the executable at build time, dynamic linking promotes efficiency through shared memory usage across processes.[67][34]
On Unix-like systems, explicit control over run-time linking is provided by the Dynamic Loading API, including functions like dlopen() to load a shared object (e.g., a .so file) and return a handle, and dlsym() to retrieve the address of a specific symbol within that object. These tools support flags such as RTLD_LAZY for deferred binding or RTLD_NOW for immediate resolution, allowing programs to load libraries on demand without prior knowledge at compile time. Similarly, on Windows, run-time dynamic linking uses LoadLibrary() (or LoadLibraryEx()) to load a DLL into the process and increment its reference count, followed by GetProcAddress() to obtain function pointers, enabling flexible module loading independent of import libraries used in load-time scenarios.[35][68][69]
A common application of dynamic linking is in plugin architectures, where an application uses dlopen() to load extension modules (e.g., .so files) at runtime based on user input or configuration, resolving their symbols via dlsym() to extend functionality without requiring the core program to be rebuilt. This is exemplified in systems like web browsers loading renderer plugins or media players incorporating codec libraries, promoting modularity and reducing binary size. The reference count maintained by the runtime linker ensures libraries remain loaded only as long as needed, with unloading via dlclose() on Unix or when the count reaches zero on Windows.[34][69]
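A minimal sketch of this pattern in C, with hypothetical plugin and symbol names ("plugin.so", "plugin_init"); on older glibc the -ldl link flag is needed, while newer glibc versions fold these functions into libc itself:

```c
/* plugin_host.c - loads an extension module at run time via the
 * POSIX dynamic loading API described above.
 * Build sketch: gcc plugin_host.c -ldl -o host
 */
#include <dlfcn.h>
#include <stdio.h>

int main(void) {
    void *handle = dlopen("./plugin.so", RTLD_LAZY);  /* deferred binding */
    if (handle == NULL) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return 1;
    }
    /* Look up a symbol by name; the cast from the returned void* to a
       function pointer is the conventional idiom on POSIX systems. */
    int (*plugin_init)(void) = (int (*)(void))dlsym(handle, "plugin_init");
    if (plugin_init != NULL)
        printf("plugin_init returned %d\n", plugin_init());
    dlclose(handle);  /* decrements the library's reference count */
    return 0;
}
```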
Relocation
Relocation Process
In computing, relocation is the process of modifying the addresses embedded in the machine code and data of a relocatable object, such as a dynamic library, to account for its actual loading position in memory. This adjustment ensures that references to symbols—such as functions, variables, or other code segments—resolve correctly, regardless of where the library is placed by the operating system's loader. For dynamic libraries, relocation typically occurs at load time, performed by the runtime linker (also known as the dynamic loader), which processes relocation entries stored in the library's executable format, such as the Executable and Linkable Format (ELF) used in Unix-like systems.[70][71]

The relocation process begins after the dynamic linker has loaded the library's segments into memory and determined its base address, which may vary across executions to enhance security through address space layout randomization (ASLR). The linker then examines the relocation sections, such as .rel.dyn or .rela.dyn in ELF files, which contain an array of relocation entries. Each entry specifies an offset within the library (where the adjustment is needed), a type indicating the relocation kind (e.g., absolute address update or PC-relative adjustment), and an associated symbol from the library's symbol table. For instance, in ELF, an Elf32_Rel entry includes r_offset (the byte offset or virtual address to patch) and r_info (encoding the symbol index and relocation type), while Elf32_Rela variants add an explicit r_addend constant. The linker computes the target value by adding the symbol's resolved address (obtained via symbol lookup), any addend, and the library's base address, then stores the result at the specified offset—effectively patching the code or data in place.[70][72]
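For reference, glibc's <elf.h> declares the record layout described above; this sketch fills in and decodes the 64-bit form of such an entry, with illustrative values:

```c
#include <elf.h>    /* Elf64_Rela and the ELF64_R_* macros (glibc) */
#include <stdio.h>

int main(void) {
    /* One relocation entry: where to patch (r_offset), which symbol and
       relocation type (packed into r_info), plus a constant addend. */
    Elf64_Rela rela = {
        .r_offset = 0x1000,                       /* location to adjust */
        .r_info   = ELF64_R_INFO(3, R_X86_64_64), /* symbol 3, 64-bit absolute */
        .r_addend = 0,                            /* constant added to the result */
    };
    printf("symbol index = %lu, relocation type = %lu\n",
           (unsigned long)ELF64_R_SYM(rela.r_info),
           (unsigned long)ELF64_R_TYPE(rela.r_info));
    return 0;
}
```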
To handle external references efficiently, dynamic libraries employ indirection mechanisms like the Global Offset Table (GOT) for data symbols and the Procedure Linkage Table (PLT) for function calls. The GOT is a writable array of pointers initialized during relocation; the linker updates its entries with the actual addresses of global variables or data from other modules, allowing the library's code to access them indirectly without further patching. Similarly, the PLT serves as a trampoline for unresolved function calls: initial entries point to the dynamic linker, which performs lazy binding—resolving and updating the PLT only on the first invocation of the function—while subsequent calls jump directly to the target. This lazy approach defers non-essential relocations until needed, reducing startup overhead, though immediate binding (all relocations at load time) can be enforced for debugging or security. In ELF-based systems, symbol lookup during relocation prioritizes the main executable, followed by dependencies in load order, respecting visibility rules like global (search all objects) or local (within the same dependency group).[34][71][72]
Relocation types vary by architecture; for example, on x86, R_386_32 adds a 32-bit absolute address, while R_386_PC32 computes a PC-relative offset for branches. Consecutive relocations at the same offset are composed into a single adjustment to avoid intermediate computations. Post-relocation, the linker executes any initialization routines (e.g., in the .init section) before transferring control to the program. This process enables dynamic libraries to be shared across processes without duplication, but it introduces overhead from load-time computations and potential vulnerabilities if writable sections like the GOT are exploited. Modern systems mitigate this via techniques like RELRO (Relocation Read-Only), which protects relocated sections after processing.[70]
Position-Independent Code
Position-Independent Code (PIC) is a programming technique that enables executable code to operate correctly regardless of its absolute memory location, relying on relative addressing modes and runtime-resolved references rather than fixed addresses. In formats like ELF, PIC achieves this through mechanisms such as the Global Offset Table (GOT), which stores absolute addresses for data accesses, and the Procedure Linkage Table (PLT), which handles indirect jumps to external functions, allowing the code itself to remain unmodified and relocatable at load time. This approach ensures that shared libraries can be mapped to arbitrary addresses without requiring per-process relocations of the text segment, preserving its read-only and sharable nature.[73][74]

The primary advantages of PIC include enhanced security through compatibility with Address Space Layout Randomization (ASLR), which randomizes load addresses to thwart memory-based exploits, and improved efficiency in library sharing across multiple processes by eliminating the need for private copies or runtime text relocations. It also reduces memory footprint and swap usage, as the shared text segment avoids duplication. However, PIC incurs a slight performance overhead due to the added indirections—such as extra instructions for GOT/PLT accesses—which can make data loads and function calls marginally slower compared to position-dependent code, particularly in performance-critical applications.[75][74][76]

PIC is implemented via compiler options, such as the -fPIC flag in GCC, which generates code without Global Offset Table size limitations and defines macros like __PIC__ to 2 for conditional compilation. This flag is essential for building shared libraries on platforms supporting dynamic linking, as it ensures compatibility with the runtime loader. For instance, most modern dynamic libraries in Unix-like systems, including those in glibc and other standard distributions, are compiled as PIC to facilitate secure and efficient deployment.[76]
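A minimal sketch, with hypothetical names and a GNU toolchain assumed, of compiling a source file as PIC and linking it into a versioned shared object:

```c
/* mylib.c - compiled as PIC so the resulting shared object can be
 * mapped at any base address without patching its text segment.
 *
 *   gcc -fPIC -c mylib.c -o mylib.o
 *   gcc -shared -Wl,-soname,libmylib.so.1 mylib.o -o libmylib.so.1.0
 */
int add(int a, int b) { return a + b; }
```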
Platform Conventions
Unix-like Systems
In Unix-like systems, libraries follow standardized naming conventions to facilitate identification and linking. Static libraries are typically archived in files with the extension .a, prefixed with lib, such as libexample.a, which bundles object files for inclusion during the linking phase of compilation.[77] Dynamic libraries, known as shared objects, use the .so extension, also prefixed with lib, and include version information in the filename to manage compatibility, for example libexample.so.1.2.3, where the numbers represent the major, minor, and release versions, respectively.[78] This versioning scheme, supported by tools like GNU Libtool, allows multiple versions to coexist, enabling backward compatibility while introducing new interfaces.[79]
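A minimal sketch of this scheme, with hypothetical file names: the comment block shows a typical GNU/Linux build in which the soname carries only the major version, so consumers keep working across compatible 1.x.y releases.

/* example.c -- a hypothetical versioned shared library. A typical build:
 *   gcc -fPIC -shared -Wl,-soname,libexample.so.1 example.c -o libexample.so.1.2.3
 *   ln -s libexample.so.1.2.3 libexample.so.1   (runtime name: major version)
 *   ln -s libexample.so.1 libexample.so         (development name for -lexample)
 * The dynamic linker binds to the soname libexample.so.1, so any
 * compatible 1.x.y release can replace the real file without relinking. */
int example_add(int a, int b) {
    return a + b;
}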
Several command-line tools are integral to managing libraries in Unix-like environments. The ar utility creates and maintains static library archives, combining object files into a single .a file.[77] The GNU linker ld resolves symbols and combines object files, static libraries, and shared objects into executables or further libraries, supporting options for both static and dynamic linking.[80] For inspecting dependencies, the ldd command lists the shared libraries required by an executable or another shared object, revealing the dynamic linker's resolution paths and versions at runtime.[81]
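The sketch below ties these tools together for a hypothetical static library: ar packages the object file, the compiler driver invokes ld to link against it, and ldd inspects the result.

/* main.c -- hypothetical consumer of a static archive. Typical steps:
 *   cc -c example.c                  (produce example.o)
 *   ar rcs libexample.a example.o    (create the archive)
 *   cc main.c -L. -lexample -o main  (link; the driver runs ld)
 *   ldd ./main                       (lists shared dependencies only;
 *                                     libexample.a is embedded already) */
#include <stdio.h>

int example_add(int a, int b);   /* provided by libexample.a */

int main(void) {
    printf("2 + 3 = %d\n", example_add(2, 3));
    return 0;
}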
These conventions align with POSIX standards, which promote portability across Unix-like systems by defining a core set of system interfaces, including standard libraries like the C library (libc), ensuring applications can be compiled and run consistently without platform-specific modifications.[82] POSIX compliance, as outlined in IEEE 1003.1, emphasizes header files, function prototypes, and behaviors for library functions to enable source-level portability.[82]
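A brief sketch of what source-level portability looks like in practice: the feature test macro below requests the IEEE 1003.1-2008 interfaces before any header is included, so the same file should compile on any conforming system.

/* posix_demo.c -- minimal sketch of POSIX feature-test usage. */
#define _POSIX_C_SOURCE 200809L
#include <unistd.h>
#include <stdio.h>

int main(void) {
    /* sysconf() and _SC_PAGESIZE are POSIX interfaces from <unistd.h>. */
    long page = sysconf(_SC_PAGESIZE);
    printf("page size: %ld bytes\n", page);
    return 0;
}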
A representative example of library organization is the /usr/lib directory, which holds architecture-dependent libraries for user-installed applications, often structured with subdirectories for specific architectures (e.g., /usr/lib/x86_64-linux-gnu) to separate 32-bit and 64-bit variants, adhering to the Filesystem Hierarchy Standard (FHS).[83] Essential system libraries may reside in /lib, but /usr/lib typically contains development libraries and shared objects for broader software ecosystems.[83]
macOS
In macOS, libraries are primarily formatted using the Mach-O executable file format, which serves as the native binary structure for executables, dynamic libraries, and static archives. Dynamic libraries, known as .dylib files, are shared libraries loaded at runtime by the dynamic linker, allowing multiple applications to share the same code in memory and reducing redundancy. Static libraries, on the other hand, use the .a extension and consist of archived Mach-O object files that are linked directly into the executable during compilation, embedding the library code permanently into the application binary.[84][85][86]
Naming conventions for macOS libraries follow a structured pattern to facilitate identification and loading. Dynamic libraries typically adopt the form lib<library_name>.dylib, such as libz.dylib for the zlib compression library, and are installed in system directories like /usr/lib or /System/Library/Frameworks. For higher-level abstractions, macOS employs framework bundles, which package dynamic libraries (.dylib) along with headers, resources, and metadata into a directory structure (e.g., Foundation.framework), promoting modularity and version control through umbrella frameworks that encompass related subframeworks. These bundles are located in /System/Library/Frameworks for system-provided libraries, enabling developers to link against comprehensive APIs without managing individual .dylib files directly.[87][88]
Key tools support the inspection, loading, and management of these libraries. The otool utility examines Mach-O binaries, revealing dependencies (via otool -L), symbols, and sections, which aids in debugging linking issues or verifying library integrity. The dynamic linker, dyld (located at /usr/lib/dyld), handles runtime loading of .dylib files, resolving symbols and enforcing compatibility versions to prevent mismatches between library revisions. Developers can interact with dyld programmatically using functions from /usr/include/dlfcn.h, such as dlopen for loading libraries and dlsym for symbol resolution.[85][84]
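A minimal sketch of the dlfcn.h interface on macOS, assuming the system zlib is present as /usr/lib/libz.dylib (its zlibVersion function takes no arguments and returns a version string):

/* dylib_demo.c -- hypothetical runtime loading of a .dylib via dyld. */
#include <dlfcn.h>
#include <stdio.h>

int main(void) {
    void *handle = dlopen("/usr/lib/libz.dylib", RTLD_NOW);
    if (!handle) {
        fprintf(stderr, "dlopen: %s\n", dlerror());
        return 1;
    }
    /* dlsym() asks dyld to resolve a symbol in the loaded image. */
    const char *(*zlib_version)(void) =
        (const char *(*)(void))dlsym(handle, "zlibVersion");
    if (zlib_version)
        printf("zlib %s\n", zlib_version());
    dlclose(handle);
    return 0;
}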
A distinctive feature of macOS libraries is support for universal binaries, which encapsulate multiple architectures (e.g., x86_64 for Intel and arm64 for Apple silicon) within a single file, enabling seamless execution across hardware without recompilation. Both .dylib and .a libraries can be built as universal binaries using the lipo tool to merge architecture-specific variants, ensuring backward compatibility and simplifying distribution for developers targeting diverse Mac systems. This multi-architecture capability stems from macOS's support for hardware transitions such as the shift to Apple silicon, with verification possible via lipo -info or otool -f.[89][86]
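Within a universal binary, each architecture slice is compiled separately, so per-architecture code can be selected with predefined macros, as in the hypothetical sketch below; building with clang -arch x86_64 -arch arm64 produces both slices in one file.

/* arch_demo.c -- minimal sketch of per-slice code in a universal binary.
 * Build (illustrative): clang -arch x86_64 -arch arm64 arch_demo.c */
#include <stdio.h>

int main(void) {
#if defined(__arm64__)
    puts("arm64 (Apple silicon) slice");
#elif defined(__x86_64__)
    puts("x86_64 (Intel) slice");
#else
    puts("other architecture");
#endif
    return 0;
}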
Windows
In Windows, libraries follow the Portable Executable (PE) format, which is based on the Common Object File Format (COFF), enabling both executable files and shared libraries to share a common structure for loading and execution.[90][91] Dynamic libraries are distributed as files with the .dll extension, containing executable code, data, and resources that can be loaded at runtime by applications or other libraries.[24] Static libraries and import libraries use the .lib extension; static .lib files archive object code for direct linking into executables, while import .lib files provide stub information for resolving references to functions exported by DLLs during the build process.[92][93] DLL naming conventions typically append ".dll" to the library name, such as "example.dll", to indicate its dynamic nature and facilitate system identification during loading.[94]
To mitigate DLL hell (conflicts arising from version mismatches), Windows supports side-by-side (SxS) assemblies, where DLLs are packaged with XML manifests that specify assembly identity, including name, version, public key token, and dependencies, allowing multiple versions to coexist without overwriting system files.[95][96] These manifests are embedded in the DLL or provided externally and are used by the application verifier and fusion loader to bind to the correct assembly version at runtime.[97]
The Microsoft Visual C++ (MSVC) linker, link.exe, is the primary tool for creating both static libraries and DLLs by combining object files (.obj) and libraries into PE-format outputs, supporting options for export definitions and manifest generation.[98] For analysis, dumpbin.exe examines PE/COFF files, displaying details such as exports, imports, sections, and dependencies in DLLs and .lib files, aiding developers in debugging linking issues or verifying binary contents.[99][100]
A distinctive feature of Windows libraries, particularly those implementing Component Object Model (COM) interfaces, is self-registration via the registry; DLLs expose a DllRegisterServer function that, when invoked by regsvr32.exe, adds entries under HKEY_CLASSES_ROOT with class IDs (CLSIDs), paths to the DLL, and threading models to enable discovery and instantiation by COM clients.[101][102] This registry-based mechanism contrasts with Unix-like systems' reliance on file paths and environment variables, providing centralized metadata for binary reuse across applications.[103]
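A minimal sketch of a Windows DLL under the MSVC toolchain: the exported function below would be recorded in the DLL's export table, and the illustrated build line also emits an import .lib for consumers to link against (file and function names are hypothetical).

/* dllmain.c -- build (illustrative): cl /LD dllmain.c /link /OUT:example.dll */
#include <windows.h>

/* __declspec(dllexport) places the symbol in the DLL's export table. */
__declspec(dllexport) int add_numbers(int a, int b) {
    return a + b;
}

/* Optional entry point the loader calls on attach/detach events. */
BOOL WINAPI DllMain(HINSTANCE hinst, DWORD reason, LPVOID reserved) {
    (void)hinst; (void)reason; (void)reserved;
    return TRUE;
}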