from Wikipedia

A digital image is an image composed of picture elements, also known as pixels, each holding a finite, discrete numeric value that represents its intensity or gray level; formally, the image is the output of a two-dimensional function whose inputs are the spatial coordinates x and y.[1] An image can be of vector or raster type. By itself, the term "digital image" usually refers to raster images or bitmapped images (as opposed to vector images).[2]

Raster


Raster images have a finite set of digital values, called picture elements or pixels.[3] The digital image contains a fixed number of rows and columns of pixels.[4] Pixels are the smallest individual element in an image, holding quantized values that represent the brightness of a given color at any specific point.

Typically, the pixels are stored in computer memory as a raster image or raster map, a two-dimensional array of small integers. These values are often transmitted or stored in a compressed form.
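As a minimal illustration of this in-memory layout, the Python sketch below builds a small grayscale raster as a two-dimensional array of 8-bit integers and stores it in compressed form with the standard zlib library; the 4×4 gradient values are invented for the example.

```python
# A minimal sketch of a raster image held in memory as a 2-D array of
# small integers (8-bit grayscale), then stored in a compressed form.
# The 4x4 gradient values here are illustrative, not from any real image.
import zlib

rows, cols = 4, 4
raster = [[(r * cols + c) * 16 for c in range(cols)] for r in range(rows)]

# Flatten to bytes (row-major order) and compress for storage/transmission.
raw = bytes(value for row in raster for value in row)
compressed = zlib.compress(raw)

print(raster)
print(f"raw: {len(raw)} bytes, compressed: {len(compressed)} bytes")
```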

Raster images can be created by a variety of input devices and techniques, such as digital cameras, scanners, coordinate-measuring machines, seismographic profiling, airborne radar, and more. They can also be synthesized from arbitrary non-image data, such as mathematical functions or three-dimensional geometric models; the latter being a major sub-area of computer graphics. The field of digital image processing is the study of algorithms for their transformation.

Raster file formats


Most users come into contact with raster images through digital cameras, which use any of several image file formats.

Some digital cameras give access to almost all the data captured by the camera, using a raw image format. The Universal Photographic Digital Imaging Guidelines (UPDIG) suggest these formats be used when possible, since raw files produce the best-quality images. These file formats allow the photographer and the processing agent the greatest level of control and accuracy for output. Their use is inhibited by the prevalence of proprietary information (trade secrets) among some camera makers, but there have been initiatives such as OpenRAW to influence manufacturers to release these records publicly. An alternative may be Digital Negative (DNG), a proprietary Adobe product described as "the public, archival format for digital camera raw data".[5] Although this format is not yet universally accepted, support for it is growing, and increasingly professional archivists and conservationists, working for respected organizations, variously suggest or recommend DNG for archival purposes.[6][7][8][9][10][11][12][13]

Vector


Vector images are generated from mathematical geometry (vectors). In mathematical terms, a vector consists of both a magnitude, or length, and a direction.

Often, both raster and vector elements will be combined in one image; for example, in the case of a billboard with text (vector) and photographs (raster).

Examples of vector file types are EPS, PDF, and AI.

Image viewing


Image viewer software displays images. Web browsers can display standard internet image formats, including JPEG, GIF, and PNG. Some can also show the SVG format, a W3C standard. In the past, when Internet connections were slow, it was common to provide a "preview" image that would load and appear on a website before being replaced by the main image, to give a preliminary impression. Now that connections are fast enough, such preview images are seldom used.

Some scientific images can be very large (for instance, the 46 gigapixel size image of the Milky Way, about 194 GB in size).[14] Such images are difficult to download and are usually browsed online through more complex web interfaces.

Some viewers offer a slideshow utility to display a sequence of images.

History

[Image: the first scan done on the SEAC in 1957, and the SEAC scanner]

Early digital fax machines such as the Bartlane cable picture transmission system preceded digital cameras and computers by decades. The first picture to be scanned, stored, and recreated in digital pixels was displayed on the Standards Eastern Automatic Computer (SEAC) at NIST.[15] The advancement of digital imagery continued in the early 1960s, alongside development of the space program and in medical research. Projects at the Jet Propulsion Laboratory, MIT, Bell Labs and the University of Maryland, among others, used digital images to advance satellite imagery, wirephoto standards conversion, medical imaging, videophone technology, character recognition, and photo enhancement.[16]

Rapid advances in digital imaging began with the introduction of MOS integrated circuits in the 1960s and microprocessors in the early 1970s, alongside progress in related computer memory storage, display technologies, and data compression algorithms.

The invention of computerized axial tomography (CAT scanning), using x-rays to produce a digital image of a "slice" through a three-dimensional object, was of great importance to medical diagnostics. As well as origination of digital images, digitization of analog images allowed the enhancement and restoration of archaeological artifacts and began to be used in fields as diverse as nuclear medicine, astronomy, law enforcement, defence and industry.[17]

Advances in microprocessor technology paved the way for the development and marketing of charge-coupled devices (CCDs) for use in a wide range of image capture devices and gradually displaced the use of analog film and tape in photography and videography towards the end of the 20th century. The computing power necessary to process digital image capture also allowed computer-generated digital images to achieve a level of refinement close to photorealism.[18]

Digital image sensors


The first semiconductor image sensor was the CCD, developed by Willard S. Boyle and George E. Smith at Bell Labs in 1969.[19] While researching MOS technology, they realized that an electric charge was the analogy of the magnetic bubble and that it could be stored on a tiny MOS capacitor. As it was fairly straightforward to fabricate a series of MOS capacitors in a row, they connected a suitable voltage to them so that the charge could be stepped along from one to the next.[20] The CCD is a semiconductor circuit that was later used in the first digital video cameras for television broadcasting.[21]

Early CCD sensors suffered from shutter lag. This was largely resolved with the invention of the pinned photodiode (PPD).[22] It was invented by Nobukazu Teranishi, Hiromitsu Shiraki and Yasuo Ishihara at NEC in 1980.[22][23] It was a photodetector structure with low lag, low noise, high quantum efficiency and low dark current.[22] In 1987, the PPD began to be incorporated into most CCD devices, becoming a fixture in consumer electronic video cameras and then digital still cameras. Since then, the PPD has been used in nearly all CCD sensors and then CMOS sensors.[22]

The NMOS active-pixel sensor (APS) was invented by Olympus in Japan during the mid-1980s. This was enabled by advances in MOS semiconductor device fabrication, with MOSFET scaling reaching smaller micron and then sub-micron levels.[24][25] The NMOS APS was fabricated by Tsutomu Nakamura's team at Olympus in 1985.[26] The CMOS active-pixel sensor (CMOS sensor) was later developed by Eric Fossum's team at the NASA Jet Propulsion Laboratory in 1993.[22] By 2007, sales of CMOS sensors had surpassed CCD sensors.[27]

Digital image compression


An important development in digital image compression technology was the discrete cosine transform (DCT), a lossy compression technique first proposed by Nasir Ahmed in 1972.[28] DCT compression is used in JPEG, which was introduced by the Joint Photographic Experts Group in 1992.[29] JPEG compresses images down to much smaller file sizes, and has become the most widely used image file format on the Internet.[30]

Mosaic


In digital imaging, a mosaic is a combination of non-overlapping images arranged in some tessellation. Gigapixel images are an example of such digital image mosaics. Satellite imagery is often mosaicked to cover regions of the Earth.

Interactive viewing is provided by virtual-reality photography.

from Grokipedia
A digital image is defined as a two-dimensional function f(x, y), where x and y denote spatial coordinates and the value of f at any point represents the intensity or gray level there, with all values being finite, discrete quantities forming an array of picture elements known as pixels. This representation arises from the digitization of a continuous analog image through two primary processes: sampling, which discretizes the spatial coordinates into a grid of positions, and quantization, which maps continuous intensity values to a finite set of discrete levels, typically using L = 2^k gray levels where k is the number of bits per pixel. For instance, an 8-bit image employs 256 levels ranging from 0 (black) to 255 (white), while color images often use 24 bits across red, green, and blue channels to yield over 16 million possible colors.

The structure of a digital image is fundamentally a matrix of pixels, with each element a[m, n] holding an integer value corresponding to its position in rows (m) and columns (n), originating typically at the top-left corner, where coordinates increase rightward and downward. Pixels capture localized intensity information, derived from sensor readings that average light over finite areas, and the overall image size is specified by dimensions such as M × N pixels, common values being 256, 512, or 1024 in each direction. Key properties include spatial resolution, which measures the smallest discernible detail based on sampling density and must satisfy the Nyquist criterion (sampling at least twice the highest spatial frequency to avoid aliasing), and bit depth, which determines the dynamic range and number of distinguishable gray levels—for example, 8 bits provide 48 dB of range, while 12 bits extend to 72 dB. Aspect ratios, such as 4:3 for standard video or 16:9 for high-definition, further define the geometric proportions.

Digital images underpin a broad array of applications in science and industry, including medical imaging such as computerized axial tomography (CAT) scans for diagnostic visualization since the 1970s, remote sensing for satellite photo analysis, and industrial inspection for quality control. In astronomy, they enable enhancement and analysis of celestial data, and in many other fields they support tasks such as media processing. Fundamental processing steps—ranging from acquisition and enhancement to segmentation and recognition—facilitate these uses, with digital formats like JPEG for compression and PNG for lossless storage ensuring efficient handling across domains.
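The sampling and quantization steps described above can be sketched in a few lines of Python; the sine-based continuous function, the 8×8 grid, and the 8-bit depth are illustrative assumptions rather than values taken from the text.

```python
# A small sketch of sampling and quantization: a continuous 2-D function
# f(x, y) is sampled on an M x N grid and its values quantized to
# L = 2**k discrete gray levels. The function and grid size are
# illustrative assumptions.
import math

M, N = 8, 8          # spatial samples (rows, columns)
k = 8                # bits per pixel
L = 2 ** k           # number of gray levels (256 for 8-bit)

def f(x, y):
    """Continuous intensity in [0, 1] at spatial coordinates (x, y)."""
    return 0.5 + 0.5 * math.sin(2 * math.pi * x) * math.cos(2 * math.pi * y)

# Sample on a regular grid, then quantize each value to an integer level.
image = [[min(L - 1, int(f(m / M, n / N) * L)) for n in range(N)]
         for m in range(M)]

print(image[0])      # first row of quantized gray levels in [0, 255]
```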

Fundamentals

Definition and Properties

A digital image is a numeric representation, typically in binary form, of a two-dimensional image, composed of a finite set of digital values that capture visual information through discrete spatial and intensity samples. This representation arises from the digitization of a continuous analog image via two principal processes: sampling, which discretizes the spatial coordinates into a grid of points, and quantization, which discretizes the amplitude or intensity values into a finite number of levels. As a result, a digital image is fundamentally a matrix of numerical entries, where each entry corresponds to the intensity at a specific location.

Key properties of digital images stem from their discrete nature, which imposes finite resolution and limits the representation to a grid-based structure. Spatial sampling divides the image plane into a regular grid of picture elements (pixels), typically arranged in M rows and N columns, determining the overall image dimensions and thus the spatial resolution. Quantization further discretizes the continuous range of intensities into L discrete levels, often where L = 2^k and k is the bit depth per pixel, enabling a range of shades or colors (e.g., an 8-bit depth yields 256 grayscale levels). This bit depth directly influences the precision of intensity representation, with higher values allowing finer gradations but increasing storage requirements. Unlike analog images, which are continuous and susceptible to noise accumulation and degradation during copying or transmission, digital images are stored and processed electronically as exact numerical data, permitting perfect replication without loss of quality.

Basic metrics of a digital image include its dimensions, expressed as width × height in pixels, which quantify the total number of pixels and thus the overall resolution. The aspect ratio, calculated as the ratio of width to height (e.g., 16:9 for widescreen formats), describes the proportional shape and affects display compatibility. File size implications arise from these metrics, as an uncompressed image requires approximately M × N × k bits of storage, scaling with resolution and bit depth to impact transmission and archival efficiency.
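A short Python sketch of the M × N × k storage estimate mentioned above; the 1920×1080, 24-bit example values are assumptions chosen only for illustration.

```python
# Rough storage estimate for an uncompressed image, following the
# M x N x k bits rule of thumb above. Example dimensions and bit depth
# are illustrative assumptions; headers and padding are ignored.
def uncompressed_bytes(width: int, height: int, bits_per_pixel: int) -> int:
    """Approximate storage for raw pixel data."""
    return width * height * bits_per_pixel // 8

# A 24-bit (three 8-bit channels) image at 1920 x 1080:
size = uncompressed_bytes(1920, 1080, 24)
print(f"{size} bytes (~{size / 2**20:.1f} MiB)")
```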

Pixels, Resolution, and Color

A pixel, short for picture element, is the smallest addressable element in a digital image, typically represented as a square or rectangular unit that holds a single intensity or color value. This value corresponds to the sampled intensity of light at a specific point in the original scene, forming the fundamental building block of the image's structure. To ensure accurate representation without artifacts, the sampling of pixels must adhere to the Nyquist-Shannon sampling theorem, which requires capturing at least twice the highest spatial frequency present in the scene to reconstruct the image faithfully.

Resolution in digital images quantifies the detail level, primarily through pixel density, which measures the number of pixels per unit length, often expressed as pixels per inch (ppi) or dots per inch (dpi). Higher resolution allows finer details but increases file size and computational demands; for example, a 300 dpi image provides sharper output for print than the 72 dpi suited for web display. Optical resolution refers to the lens or sensor's inherent ability to resolve fine details, while digital resolution is limited by the pixel grid, with the effective resolution being the lower of the two. For dynamic digital images like video frames, temporal resolution may also apply, indicating the number of frames per second needed to capture motion smoothly.

Color in digital images is represented through models that define how hues, intensities, and shades are encoded. The RGB model is an additive system used for displays, where red, green, and blue primaries are combined in varying intensities to produce a wide gamut of colors, with full white achieved by maximum levels of all three. In contrast, the CMYK model is subtractive and optimized for printing, employing cyan, magenta, yellow, and key (black) inks that absorb specific wavelengths from white light to create colors on paper. The HSV (hue, saturation, value) model, also known as HSB (hue, saturation, brightness), aligns more closely with human perception by separating color into hue (the dominant color, measured in degrees from 0 to 360), saturation (color purity from 0% gray to 100% vivid), and value (brightness from 0% to 100% full intensity). Grayscale images simplify to a single channel representing intensity levels without color, ideal for applications where hue is irrelevant.

Digital images store color via channels, with bit depth determining the precision of each channel's values. In the common 8-bit per channel configuration for RGB images—totaling 24 bits per pixel—each red (R), green (G), and blue (B) component ranges from 0 to 255, enabling up to 16.7 million distinct colors (2^24). This can be expressed mathematically as a color tuple: (R, G, B) \quad \text{where} \quad 0 \leq R, G, B \leq 255 for an 8-bit RGB pixel. An optional alpha channel adds transparency information, typically also 8 bits, where values from 0 (fully transparent) to 255 (fully opaque) control how the pixel blends with underlying layers, essential for compositing in graphics software. Higher bit depths, such as 16 bits per channel, expand dynamic range for professional editing but are less common in standard displays.
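The channel arithmetic above can be illustrated with the standard library: the sketch below converts an assumed 8-bit RGB pixel to HSV with colorsys and alpha-blends a semi-transparent pixel over a white background; the specific color values are invented for the example.

```python
# A short sketch of the color representations above: an 8-bit-per-channel
# RGB pixel, its HSV equivalent (via the standard library), and alpha
# blending of a semi-transparent pixel over a background.
import colorsys

r, g, b = 200, 30, 30                      # 8-bit RGB, each in 0..255

# RGB -> HSV; colorsys works on floats in [0, 1] and returns hue in [0, 1).
h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
print(f"hue {h * 360:.0f} deg, saturation {s:.2f}, value {v:.2f}")

# Alpha compositing ("over" operator) of the pixel on a white background.
alpha = 128 / 255                          # 8-bit alpha: 0 transparent .. 255 opaque
background = (255, 255, 255)
blended = tuple(round(alpha * c + (1 - alpha) * bc)
                for c, bc in zip((r, g, b), background))
print(blended)
```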

Representation Methods

Raster Graphics

Raster graphics, also known as bitmap images, represent digital images as a two-dimensional array of pixels arranged in a grid, where each pixel stores a specific intensity or color value corresponding to its spatial coordinates. This structure allows for the precise depiction of visual details at the discrete level defined by the image's resolution, with the overall dimensions typically expressed as M rows by N columns.

These representations excel in capturing complex, photorealistic scenes, such as photographs, where fine details and continuous gradients are essential, as the pixel grid enables the simulation of smooth tonal variations through techniques like dithering. Dithering creates the illusion of intermediate tones by spatially distributing limited color values, making it particularly effective for rendering subtle shades in images with constrained palettes. However, raster graphics have notable limitations: enlarging the image beyond its native resolution results in pixelation, where individual pixels become visible and degrade sharpness, and high-resolution files demand substantial storage; for instance, a 1024×1024 8-bit grayscale image requires about 1 MB.

Raster graphics find primary applications in digital photography for preserving intricate details in captured scenes, web images where photorealistic elements enhance user engagement, and video frames that form the basis of motion sequences in multimedia. A key challenge in their rendering is aliasing, a sampling artifact that produces jagged edges or "jaggies" on diagonal or curved lines due to insufficient resolution relative to the scene's spatial frequencies. To address this, anti-aliasing methods such as low-pass filtering or bilinear interpolation smooth transitions by averaging pixel values, reducing visible distortions without altering core image content.
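The aliasing and averaging behavior described above can be sketched by supersampling a hard diagonal edge; the grid size, supersampling factor, and edge function in this Python example are illustrative assumptions.

```python
# A minimal sketch of anti-aliasing by supersampling: a hard diagonal
# edge is rendered on a grid 4x finer than the target raster, then each
# output pixel averages its sub-samples, softening the "jaggies".
SIZE, SS = 8, 4                      # output raster size, supersampling factor

def covered(x: float, y: float) -> bool:
    """True if the sample point lies below the diagonal edge y > x."""
    return y > x

aliased = [[255 if covered(c + 0.5, r + 0.5) else 0 for c in range(SIZE)]
           for r in range(SIZE)]

antialiased = []
for r in range(SIZE):
    row = []
    for c in range(SIZE):
        # Average SS x SS sub-samples inside the pixel footprint.
        hits = sum(covered(c + (i + 0.5) / SS, r + (j + 0.5) / SS)
                   for i in range(SS) for j in range(SS))
        row.append(round(255 * hits / SS**2))
    antialiased.append(row)

print(aliased[3])        # only 0 or 255: a hard, jagged edge
print(antialiased[3])    # intermediate gray values soften the edge
```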

Vector Graphics

Vector graphics represent images through mathematical descriptions of geometric shapes rather than discrete pixels, enabling precise and scalable depictions suitable for illustrations, logos, and technical drawings. These graphics are constructed from basic primitives such as lines, polygons, and splines, which are defined by coordinates and parameters rather than a grid of color values. Unlike raster graphics, which degrade in quality when enlarged due to interpolation, vector formats maintain sharpness at any scale because they rely on parametric equations to regenerate the image dynamically.

The core structure of vector graphics consists of paths—sequences of connected points that outline shapes—along with curves and attributes like fill colors, stroke widths, and gradients applied to those paths. Curves are typically modeled using parametric polynomials, with Bézier curves being a prominent example due to their flexibility in creating smooth contours. A cubic Bézier curve, the most common variant, is defined by four control points: two endpoints (P₀ and P₃) through which the curve passes, and two interior control points (P₁ and P₂) that influence the curve's direction and tangency without lying on the curve itself. The parametric equation for a cubic Bézier curve is given by:

\mathbf{P}(t) = (1-t)^3 \mathbf{P}_0 + 3(1-t)^2 t\, \mathbf{P}_1 + 3(1-t) t^2\, \mathbf{P}_2 + t^3 \mathbf{P}_3, \quad t \in [0,1]

This formulation allows for intuitive editing by adjusting control points, ensuring the curve remains smooth and continuous. Primitives like straight lines (defined by endpoints) and polygons (closed paths of line segments) form the foundation, while splines such as Bézier or B-spline curves handle complex contours; for display, these mathematical definitions are rasterized—converted to pixels—by rendering engines in real time.

A key advantage of vector graphics is their infinite scalability without quality loss, as the underlying mathematics ensures crisp edges regardless of output resolution, making them ideal for applications from print to web display. File sizes are often smaller for simple illustrations since only shape parameters are stored, not vast pixel arrays, and individual components remain editable, facilitating design workflows. Affine transformations, such as scaling, rotation, or translation, can be applied efficiently through matrix operations on control points, preserving geometric integrity. However, vector graphics struggle with photorealistic scenes requiring continuous tone variations, as rendering complex fills, textures, or gradients demands significant computational resources and may still appear less natural than raster equivalents.
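The cubic Bézier formula above translates directly into code; in the Python sketch below the four control points are arbitrary illustrative values.

```python
# Direct evaluation of the cubic Bezier formula above, treating control
# points as (x, y) tuples. The four control points are illustrative.
def cubic_bezier(p0, p1, p2, p3, t: float):
    """Point on the curve at parameter t in [0, 1]."""
    u = 1.0 - t
    coeffs = (u**3, 3 * u**2 * t, 3 * u * t**2, t**3)
    return tuple(sum(c * p[i] for c, p in zip(coeffs, (p0, p1, p2, p3)))
                 for i in range(2))

P0, P1, P2, P3 = (0, 0), (1, 2), (3, 2), (4, 0)
for t in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(t, cubic_bezier(P0, P1, P2, P3, t))   # starts at P0, ends at P3
```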

Storage and Formats

Raster File Formats

Raster file formats store digital images as grids of pixels, each containing color and intensity values, enabling the representation of complex visual data through pixel-array structures. These formats vary in compression techniques, color support, and additional features to suit different applications, from simple icons to high-resolution photographs.

The BMP (Bitmap Image File) format, developed by Microsoft, is an uncompressed raster format that stores pixel data directly without loss of information, resulting in large file sizes but preserving exact image quality. It supports various color depths, from 1-bit up to 32-bit with alpha channels for transparency in modern implementations. BMP files consist of a file header followed by a bitmap information header and a raw pixel array, making them straightforward for Windows-based applications.

JPEG (Joint Photographic Experts Group), defined by the ISO/IEC 10918-1 standard, employs lossy compression optimized for photographic images, achieving significant file size reduction by discarding less perceptible details through the discrete cosine transform and quantization. It supports full-color images in RGB or YCbCr spaces, with a baseline mode for sequential encoding and a progressive mode for gradual image refinement during display. This format excels in balancing image quality and storage efficiency for continuous-tone images but introduces artifacts at high compression levels.

PNG (Portable Network Graphics), specified in ISO/IEC 15948 and the W3C recommendation, provides lossless compression using the DEFLATE algorithm, ensuring no loss of data while supporting progressive display. It accommodates truecolor, grayscale, and indexed-color modes with palettes up to 256 entries, and includes alpha channel support for variable transparency, enabling seamless compositing. PNG files are structured as a series of chunks for metadata such as gamma information and text annotations, making the format ideal for web graphics requiring precision.

TIFF (Tagged Image File Format), outlined in the TIFF 6.0 specification, offers high flexibility through a tag-based structure that allows embedding metadata, multiple pages, and various compression options such as LZW or PackBits. It supports multiple images or pages via image file directories (IFDs), extensive color spaces including CMYK for print, and high bit depths for professional workflows. Widely adopted in scanning and prepress work due to its robustness and extensibility, TIFF serves as an archival master format in professional imaging industries.

WebP, developed by Google and based on the VP8 video codec (RFC 6386), is a modern raster format supporting both lossy and lossless compression, as well as transparency and animation. It achieves better compression efficiency than JPEG and PNG, making it suitable for web images, with widespread browser support as of 2025. WebP files include features like alpha channels and lossless modes for illustrations.

GIF (Graphics Interchange Format), version 89a as per the W3C specification, limits images to 256 colors via a palette, using LZW compression to minimize file sizes for simple graphics. It uniquely supports animation through sequenced frames with inter-frame delays and disposal methods, alongside basic transparency via a single transparent palette entry. GIF's block-based structure facilitates streaming, though its color constraints make it unsuitable for photographs.

Common use cases for these formats include JPEG for web-optimized photographs due to its compression efficiency, PNG for logos and illustrations needing transparency without quality loss, and GIF for short animations or icons with limited palettes. BMP suits internal Windows processing where file size is not a concern, while TIFF is preferred in professional scanning and printing pipelines for its metadata richness.
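To make the BMP layout described above concrete, the following Python sketch writes a small uncompressed 24-bit BMP using only the standard library; the gradient pixel data and the output filename are assumptions chosen for illustration.

```python
# A minimal sketch of writing an uncompressed 24-bit BMP: a 14-byte file
# header, a 40-byte BITMAPINFOHEADER, then raw pixel rows stored
# bottom-up in BGR order, each padded to a multiple of 4 bytes.
import struct

def write_bmp(path: str, pixels) -> None:
    """pixels[row][col] = (R, G, B); row 0 is the top of the image."""
    height, width = len(pixels), len(pixels[0])
    row_padding = (4 - (width * 3) % 4) % 4
    image_size = (width * 3 + row_padding) * height
    file_size = 14 + 40 + image_size

    with open(path, "wb") as f:
        # BITMAPFILEHEADER: signature, file size, reserved, pixel-data offset.
        f.write(struct.pack("<2sIHHI", b"BM", file_size, 0, 0, 54))
        # BITMAPINFOHEADER: size, width, height, planes, bpp, compression,
        # image size, x/y pixels-per-metre, palette counts.
        f.write(struct.pack("<IiiHHIIiiII", 40, width, height, 1, 24,
                            0, image_size, 2835, 2835, 0, 0))
        for row in reversed(pixels):          # bottom-up row order
            for r, g, b in row:
                f.write(bytes((b, g, r)))     # BGR byte order
            f.write(b"\x00" * row_padding)

gradient = [[(x * 8, y * 8, 128) for x in range(32)] for y in range(32)]
write_bmp("gradient.bmp", gradient)
```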

Vector File Formats

Vector file formats encode images using mathematical primitives such as paths, curves, and shapes, allowing for infinite scalability and resolution independence. These formats are essential for applications requiring precise, editable illustrations, such as logos, diagrams, and technical drawings. Common standards include open formats like SVG and PDF, alongside proprietary ones like AI, each optimized for specific use cases in web, print, and design workflows.

Scalable Vector Graphics (SVG) is an XML-based format developed by the World Wide Web Consortium (W3C) for describing two-dimensional vector and mixed vector/raster graphics. It excels in web applications due to its scalability across different display resolutions and its integration with HTML or other XML languages. SVG supports scripting for interactivity and declarative animations, making it suitable for dynamic content like charts and icons. The current specification, SVG 2, builds on SVG 1.1 and is a W3C Candidate Recommendation.

Encapsulated PostScript (EPS) is a vector format based on Adobe's PostScript language, designed for high-quality professional printing and production. It is printer-friendly and resolution-independent, allowing scaling from small formats like business cards to large ones like billboards without quality loss. EPS can combine vector elements with raster data, including images and specific linescreen settings, and it served as an early industry standard for integrating graphics into text-based designs. Developed by Adobe in the late 1980s, it remains compatible with tools like Adobe Illustrator and most printers.

Portable Document Format (PDF), standardized as ISO 32000-2:2020, is widely used for document exchange and often incorporates vector graphics for illustrations and layouts. It ensures portability across environments, enabling consistent viewing and interaction independent of software or hardware. PDF supports embedding of vector content, fonts, and metadata, making it ideal for professional documents like reports and brochures that require precise rendering. As an open ISO standard, it facilitates broad interoperability for vector-based printing and sharing.

Adobe Illustrator (AI) is the native file format of the Adobe Illustrator software, optimized for creating and editing complex vector artwork. It stores detailed information such as layers, transparency effects, and multiple artboards, allowing full editability within Illustrator. While proprietary, AI is widely used in design industries for scalable graphics like logos and posters due to its small file sizes and rich feature set. However, it requires Adobe software for complete access, limiting editing in non-Adobe tools.

Standards bodies like the W3C govern SVG to promote open web graphics, while the International Organization for Standardization (ISO) maintains PDF for reliable document portability. Interoperability is enhanced by open formats such as SVG and PDF, which are supported across diverse software and platforms. In contrast, proprietary formats like AI and older EPS files can face compatibility challenges, often requiring conversion to PDF or SVG for broader use; EPS offers wider historical support, but AI provides more detailed editing capabilities within Adobe ecosystems.
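As a small illustration of how a vector format describes shapes as text rather than pixels, the Python sketch below writes a minimal SVG file containing a circle and a cubic Bézier path; the dimensions, colors, and filename are illustrative.

```python
# A small sketch of vector image data as text: the SVG below draws a
# circle and a cubic Bezier path. Sizes, colors, and the output
# filename are illustrative assumptions.
svg = """<svg xmlns="http://www.w3.org/2000/svg" width="200" height="100">
  <circle cx="50" cy="50" r="30" fill="steelblue"/>
  <path d="M 100 80 C 120 10, 160 10, 180 80" stroke="black" fill="none"/>
</svg>
"""

with open("example.svg", "w", encoding="utf-8") as f:
    f.write(svg)
```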

Acquisition Techniques

Digital Image Sensors

Digital image sensors capture light through an array of photosites, each consisting of a photodiode that converts photons into electrons via the photoelectric effect. These electrons accumulate as charge proportional to the incident light intensity, forming the basis for pixel values in a digital image. To enable color imaging, a color filter array, such as the Bayer filter invented by Bryce Bayer at Eastman Kodak in 1976, is overlaid on the sensor. The Bayer pattern arranges red, green, and blue filters in an RGGB mosaic, with green filters twice as prevalent to match human visual sensitivity, allowing reconstruction of full-color data from the single-color samples at each photosite.

Charge-coupled device (CCD) sensors, invented in 1969 by Willard Boyle and George Smith at Bell Laboratories, were the first widely adopted solid-state imagers. In CCDs, light-generated charges are stored in potential wells beneath MOS capacitors and transferred serially across the array to a single output via clocked voltage pulses, enabling high-quality imaging with minimal noise through correlated double sampling. This architecture provided superior sensitivity—up to 100 times that of film—making CCDs ideal for early digital single-lens reflex (DSLR) cameras in the 1990s and for scientific applications like astronomy, where they remain preferred for their low readout noise.

Complementary metal-oxide-semiconductor (CMOS) sensors emerged as a competitive alternative in the 1990s, with active-pixel sensor (APS) designs invented at NASA's Jet Propulsion Laboratory in 1993, incorporating amplifiers at each pixel for on-chip signal amplification. CMOS offers advantages over CCDs, including faster readout speeds due to parallel pixel access, lower power consumption from standard CMOS fabrication, and seamless integration of analog-to-digital conversion and control circuitry, reducing system complexity and cost. By the 2000s, CMOS dominated consumer markets, powering most smartphone cameras and compact devices with their scalability to high resolutions.

The evolution of digital image sensors began with 1970s CCD prototypes, such as Fairchild's early devices with resolutions under 0.1 megapixels, transitioning to video-capable interline-transfer CCDs in the 1980s. The 1990s saw a CMOS resurgence, with pinned photodiode technology improving charge transfer efficiency in both types. By the early 2000s, megapixel sensors became standard—such as Kodak's 1.3-megapixel CCD in 2001—with advancements enabling higher resolutions, including 10+ megapixels in smartphones by the mid-2000s, starting with the first 10 MP model in 2006, driven by backside illumination and scaling laws that balanced resolution with performance.

Key performance metrics for image sensors include sensor size, dynamic range, and noise. Sensor size, measured by physical dimensions like full-frame (approximately 36×24 mm, akin to 35mm film) versus crop formats (e.g., APS-C at about 23.6×15.6 mm), influences light-gathering capacity and noise; larger sensors reduce noise by accommodating bigger photosites with higher full-well capacities, up to 300,000 electrons in examples like back-illuminated CCDs. Dynamic range quantifies the span—from darkest shadows to brightest highlights—over which the sensor maintains a good signal-to-noise ratio (SNR), typically 60–120 dB in modern devices, essential for high-contrast scenes. Noise sources include thermal noise (dark current, doubling every 8–10°C and dominant in long exposures) and readout noise (2–20 electrons per pixel, minimized in cooled scientific CCDs), impacting low-light performance and overall image fidelity.
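A naive sketch of recovering color from an RGGB Bayer mosaic as described above: each 2×2 tile contributes one red sample, two averaged green samples, and one blue sample, yielding a quarter-resolution RGB image (real demosaicing interpolates to full resolution). The random array stands in for raw sensor data.

```python
# Naive color extraction from an RGGB Bayer mosaic, assuming the pattern
# R G / G B repeating from the top-left corner.
import numpy as np

rng = np.random.default_rng(0)
mosaic = rng.integers(0, 256, size=(8, 8), dtype=np.uint16)  # stand-in raw data

r = mosaic[0::2, 0::2]                              # R sites (even row, even col)
g = (mosaic[0::2, 1::2] + mosaic[1::2, 0::2]) / 2   # two G sites per 2x2 tile
b = mosaic[1::2, 1::2]                              # B sites (odd row, odd col)

rgb = np.stack([r, g, b], axis=-1).astype(np.uint8)
print(rgb.shape)   # (4, 4, 3): quarter-resolution color image
```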

Scanning and Digitization

Scanning and digitization involve converting analog images, such as printed photographs, documents, or film negatives, into digital representations through optical capture and analog-to-digital conversion. This process is essential for preserving analog materials in digital archives, enabling long-term storage and accessibility without further degradation of the originals. Unlike direct digital capture from sensors, scanning targets existing analog materials, requiring careful handling to maintain fidelity.

Flatbed scanners are the most common devices for digitizing reflective materials like documents and photographs. They employ a linear array of CCD sensors that move across the scanning bed beneath a glass platen, illuminating the subject with LED or fluorescent light and capturing reflected light line by line. These CCD arrays typically consist of three parallel lines of pixels, one each for the red, green, and blue channels, with pixel sizes around 2–4 μm to achieve resolutions up to 2400 dpi. This configuration allows single-pass scanning for color images, making flatbed scanners versatile for everyday and moderate-volume archival tasks.

Drum scanners, historically used for high-end applications, rotate the analog medium—such as mounted film transparencies or prints—around a transparent cylinder while a photomultiplier tube (PMT) assembly reads transmitted or reflected light. PMTs, which are highly sensitive vacuum tubes that amplify photon signals into electrical currents, provide superior dynamic range (up to 4.0 optical density) and resolutions exceeding 8000 dpi, outperforming CCD-based systems in capturing subtle tonal gradations in professional prints or films. However, their mechanical complexity, need for wet mounting of media, and slower operation limit their use in modern digitization workflows, where they are generally not recommended due to handling risks.

The process begins with optical sampling, where continuous analog light intensities are spatially divided into discrete pixels based on the scanner's resolution, measured in dots per inch (dpi) or pixels per inch (ppi). For instance, 300–400 ppi is standard for books and documents in archival settings, while films may require 1000–4000 ppi to resolve fine details. Quantization follows, converting these sampled analog values into discrete digital levels, typically 8–16 bits per channel, to represent intensity or color. Software then enhances the output by estimating pixel values between samples, such as via bilinear or bicubic interpolation, to achieve higher apparent resolutions without additional hardware. These steps, grounded in fundamental image processing principles, ensure the digital image approximates the analog source while introducing minimal artifacts.

Applications of scanning and digitization are prominent in cultural preservation, such as archiving motion picture films and rare books to create searchable digital collections. For films, specialized film or planetary scanners capture negatives at high ppi to retain fine detail, supporting restoration efforts at archival institutions. Book digitization often integrates optical character recognition (OCR), where post-scan software analyzes pixel patterns to extract text, enabling full-text search in digitized volumes and facilitating access for researchers. This OCR integration, applied after scanning, converts raster images into editable formats while preserving layout metadata.

Challenges in scanning include moiré patterns, which arise from interference between the scanner's sampling grid and periodic structures in printed originals, such as halftone images, producing unwanted wavy or dotted overlays. These artifacts can be mitigated by adjusting the resolution to exceed the halftone frequency or by applying descreening filters during capture. Dust and scratches pose another issue, appearing as dark spots on scans; infrared (IR) channels in multi-spectral scanners detect these defects, since dust scatters IR light differently from film bases, allowing software to clone surrounding pixels for removal without altering master files. Such techniques are vital for clean archival masters, though manual cleaning of originals remains the primary prevention method.
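The interpolation step mentioned earlier in this section, estimating values between scanned samples, can be sketched with a simple bilinear resampler; the tiny 2×2 source grid and the 5×5 target size are illustrative assumptions.

```python
# A minimal sketch of bilinear interpolation: each output value is a
# weighted average of the four surrounding source samples.
def bilinear_sample(img, x: float, y: float) -> float:
    """Interpolate a grayscale value at fractional coordinates (x, y)."""
    h, w = len(img), len(img[0])
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
    bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
    return top * (1 - fy) + bottom * fy

def resize_bilinear(img, new_w: int, new_h: int):
    h, w = len(img), len(img[0])
    return [[round(bilinear_sample(img, x * (w - 1) / (new_w - 1),
                                   y * (h - 1) / (new_h - 1)))
             for x in range(new_w)] for y in range(new_h)]

source = [[0, 64], [128, 255]]              # 2x2 grayscale sample
print(resize_bilinear(source, 5, 5))        # upsampled to 5x5
```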

Processing and Analysis

Compression Methods

Digital image compression techniques aim to minimize file sizes while preserving essential visual information, facilitating efficient storage, transmission, and processing of raster-based images. These methods exploit redundancies in image signals, such as spatial correlations between neighboring pixels or perceptual irrelevancies in human vision, to achieve reduction without altering the fundamental representation of the content. Compression algorithms are broadly classified into lossless and lossy categories, each balancing efficiency with fidelity to the original data.

Lossless compression ensures exact reconstruction of the original image, making it suitable for applications requiring pixel-perfect accuracy, such as medical imaging or archival storage. Huffman coding, a variable-length entropy code, assigns shorter binary codes to more frequent pixel values or transform coefficients, reducing overall bit usage based on symbol probabilities. Developed by David Huffman, this method achieves compression close to the theoretical minimum for a given source. Run-length encoding (RLE) is a simple technique that replaces sequences of identical pixels—common in binary or low-color images—with a single value and a count of repetitions, effectively compressing uniform regions like skies or backgrounds. Lempel-Ziv-Welch (LZW) extends dictionary-based compression by building a dynamic dictionary of recurring patterns during encoding, enabling adaptive reduction of redundancy in raster scans; it underpins formats like GIF and TIFF for reversible data packing.

Lossy compression discards less perceptible data to attain higher ratios, often at the cost of minor quality degradation, and is prevalent in web and consumer imaging. The JPEG standard employs the discrete cosine transform (DCT) on 8×8 blocks to concentrate energy into low-frequency coefficients, followed by quantization that rounds less significant values to zero based on psycho-visual models. This process leverages the human visual system's reduced sensitivity to high frequencies and fine spatial details, allowing substantial size reduction—typically 10:1 or more—while introducing irreversible approximations. The two-dimensional DCT for an 8×8 block is defined as:

F(u,v) = \frac{1}{4} C(u)\, C(v) \sum_{x=0}^{7} \sum_{y=0}^{7} f(x,y) \cos\left[\frac{(2x+1)u\pi}{16}\right] \cos\left[\frac{(2y+1)v\pi}{16}\right]

where f(x,y) represents the input pixel intensities, F(u,v) are the transformed coefficients, and C(w) equals 1/\sqrt{2} for w = 0 and 1 otherwise. The foundational DCT algorithm was introduced by Ahmed, Natarajan, and Rao for efficient signal decorrelation in compression pipelines.

Beyond these core approaches, alternative methods include fractal compression, which models images as self-similar iterated function systems to approximate complex textures with compact affine transformations, as pioneered by Barnsley and Hurd for resolution-independent encoding. Wavelet-based techniques, as in the JPEG 2000 standard, decompose images into multi-resolution subbands using discrete wavelet transforms, enabling scalable and region-of-interest compression superior to DCT in artifact reduction for high-fidelity needs. For raster images, the Portable Network Graphics (PNG) format integrates DEFLATE, a combination of LZ77 sliding-window matching and Huffman coding, to provide versatile lossless compression adaptable to varying image complexities.

As of 2025, emerging AI-based compression methods leverage neural networks and large language models to achieve superior performance. For example, learned image compression techniques use end-to-end neural models for both lossless and lossy encoding, often surpassing traditional methods in rate-distortion efficiency. Innovations such as LMCompress employ large models for lossless compression, setting new benchmarks by exploiting semantic understanding of images.

Key trade-offs in compression involve balancing ratio gains against potential degradation: lossless methods like LZW yield modest reductions (2:1 to 3:1 for typical images) without artifacts but falter on high-entropy content, while lossy DCT-based schemes achieve 20:1 or higher at the expense of visible distortions such as blocking in uniform areas or ringing near edges, particularly at aggressive quantization levels. These compromises guide selection based on application requirements, with psycho-visual tuning mitigating perceptible losses in lossy paradigms.
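The 8×8 DCT formula above can be implemented directly, if inefficiently; the Python sketch below applies it to an assumed flat block, for which only the DC coefficient is nonzero, which is what makes the subsequent quantization step so effective.

```python
# Direct, unoptimized implementation of the 8x8 forward DCT formula
# above (with the JPEG normalization factors). The flat input block is
# an illustrative assumption.
import math

def dct_8x8(block):
    """block: 8x8 list of intensities; returns 8x8 list of coefficients."""
    def c(w):                       # normalization: 1/sqrt(2) when w == 0
        return 1 / math.sqrt(2) if w == 0 else 1.0
    out = [[0.0] * 8 for _ in range(8)]
    for u in range(8):
        for v in range(8):
            s = sum(block[x][y]
                    * math.cos((2 * x + 1) * u * math.pi / 16)
                    * math.cos((2 * y + 1) * v * math.pi / 16)
                    for x in range(8) for y in range(8))
            out[u][v] = 0.25 * c(u) * c(v) * s
    return out

flat = [[128] * 8 for _ in range(8)]
coeffs = dct_8x8(flat)
print(round(coeffs[0][0], 1))   # DC term: 8 * 128 = 1024.0
print(round(coeffs[0][1], 1))   # AC terms are ~0 for a flat block
```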

Viewing, Editing, and Display

Digital images are viewed through a combination of software applications and hardware displays that render pixel data into visible output. Software viewers, such as image editors and dedicated browsers, interpret file formats and apply rendering algorithms to display images on screen, often incorporating zoom, pan, and metadata overlays for user interaction. Hardware displays like LCD and OLED panels process this data via backlighting and pixel modulation; LCDs use liquid crystals to control light transmission from a backlight, while OLEDs emit light directly from organic compounds for deeper blacks and higher contrast. Gamma correction is essential in these displays to compensate for the non-linear human perception of brightness, mapping input intensities to output levels—typically using a gamma value of 2.2 for sRGB content on LCDs—to ensure smooth tonal reproduction and avoid washed-out or crushed shadows.

Editing digital images involves fundamental operations that manipulate pixel arrays for refinement or adaptation. Cropping removes unwanted portions by defining a rectangular subset of the image, preserving aspect ratios or enforcing specific dimensions for composition. Resizing scales the image by interpolating pixel values; bicubic interpolation, a common method, uses a cubic polynomial to estimate new pixel intensities from a 4×4 neighborhood, providing smoother results than bilinear or nearest-neighbor approaches by reducing aliasing and blurring artifacts. Filters apply spatial transformations via convolution, where a kernel matrix slides over the image, computing weighted sums of neighboring pixels to produce effects like blurring or sharpening. For a Gaussian blur, a kernel such as

\begin{bmatrix} 1/16 & 2/16 & 1/16 \\ 2/16 & 4/16 & 2/16 \\ 1/16 & 2/16 & 1/16 \end{bmatrix}

is convolved with the image, replacing each pixel with a weighted average of itself and its neighbors to smooth fine detail and noise.
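The convolution operation and the 3×3 Gaussian kernel above can be sketched in plain Python; the single-bright-pixel input and the clamped border handling are illustrative choices.

```python
# Spatial convolution with the 3x3 Gaussian kernel shown above, using
# clamped ("replicate") borders. The impulse input makes the blur easy
# to inspect.
KERNEL = [[1 / 16, 2 / 16, 1 / 16],
          [2 / 16, 4 / 16, 2 / 16],
          [1 / 16, 2 / 16, 1 / 16]]

def convolve(img, kernel):
    h, w = len(img), len(img[0])
    k = len(kernel) // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy in range(-k, k + 1):
                for dx in range(-k, k + 1):
                    # Clamp coordinates at the image border.
                    sy = min(max(y + dy, 0), h - 1)
                    sx = min(max(x + dx, 0), w - 1)
                    acc += img[sy][sx] * kernel[dy + k][dx + k]
            out[y][x] = acc
    return out

impulse = [[0] * 5 for _ in range(5)]
impulse[2][2] = 255                       # single bright pixel
blurred = convolve(impulse, KERNEL)
print([round(v, 1) for v in blurred[2]])  # energy spread to neighbors
```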