Computer program

Source code for a computer program written in the JavaScript language. It demonstrates the appendChild method, which adds a new child node to an existing parent node and is commonly used to dynamically modify the structure of an HTML document.

A computer program is a sequence or set[a] of instructions in a programming language for a computer to execute. It is one component of software, which also includes documentation and other intangible components.[1]

A computer program in its human-readable form is called source code. Source code needs another computer program to execute because computers can only execute their native machine instructions. Therefore, source code may be translated to machine instructions using a compiler written for the language. (Assembly language programs are translated using an assembler.) The resulting file is called an executable. Alternatively, source code may execute within an interpreter written for the language.[2]

If the executable is requested for execution,[b] then the operating system loads it into memory[3] and starts a process.[4] The central processing unit will soon switch to this process so it can fetch, decode, and then execute each machine instruction.[5]
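
The fetch, decode, and execute steps can be sketched as a small interpreter loop. The following C program is only an illustration under stated assumptions: it invents a hypothetical machine with four made-up opcodes and steps through a tiny hard-coded program, one instruction at a time.

#include <stdio.h>

/* Hypothetical opcodes, invented for this illustration only. */
enum { OP_HALT = 0, OP_LOAD = 1, OP_ADD = 2, OP_PRINT = 3 };

int main( void )
{
    /* A tiny "machine program": load 2, add 3, print, halt. */
    int program[] = { OP_LOAD, 2, OP_ADD, 3, OP_PRINT, OP_HALT };
    int pc = 0;          /* program counter */
    int accumulator = 0; /* single register */

    for ( ;; )
    {
        int instruction = program[pc++];    /* fetch  */
        switch ( instruction )              /* decode */
        {                                   /* execute */
        case OP_LOAD:  accumulator  = program[pc++]; break;
        case OP_ADD:   accumulator += program[pc++]; break;
        case OP_PRINT: printf( "%d\n", accumulator ); break;
        case OP_HALT:  return 0;
        }
    }
}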

If the source code is requested for execution, then the operating system loads the corresponding interpreter into memory and starts a process. The interpreter then loads the source code into memory to translate and execute each statement. Running the source code is slower than running an executable.[6][c] Moreover, the interpreter must be installed on the computer.

Example computer program


The "Hello, World!" program is used to illustrate a language's basic syntax. The syntax of the language BASIC (1964) was intentionally limited to make the language easy to learn.[7] For example, variables are not declared before being used.[8] Also, variables are automatically initialized to zero.[8] Here is an example computer program, in Basic, to average a list of numbers:[9]

10 INPUT "How many numbers to average?", A
20 FOR I = 1 TO A
30 INPUT "Enter number:", B
40 LET C = C + B
50 NEXT I
60 LET D = C/A
70 PRINT "The average is", D
80 END

Once the mechanics of basic computer programming are learned, more sophisticated and powerful languages are available to build large computer systems.[10]

History


Improvements in software development are the result of improvements in computer hardware. At each stage in hardware's history, the task of computer programming changed dramatically.

Analytical Engine

Lovelace's description from Note G

In 1837, Jacquard's loom inspired Charles Babbage to attempt to build the Analytical Engine.[11] The names of the components of the calculating device were borrowed from the textile industry. In the textile industry, yarn was brought from the store to be milled. The device had a store which consisted of memory to hold 1,000 numbers of 50 decimal digits each.[12] Numbers from the store were transferred to the mill for processing. The engine was programmed using two sets of perforated cards. One set directed the operation and the other set inputted the variables.[11][13] However, the thousands of cogged wheels and gears never fully worked together.[14]

Ada Lovelace worked for Charles Babbage to create a description of the Analytical Engine (1843).[15] The description contained Note G which completely detailed a method for calculating Bernoulli numbers using the Analytical Engine. This note is recognized by some historians as the world's first computer program.[14]

Universal Turing machine


In 1936, Alan Turing introduced the Universal Turing machine, a theoretical device that can model every computation.[16] It is a finite-state machine that has an infinitely long read/write tape. The machine can move the tape back and forth, changing its contents as it performs an algorithm. The machine starts in the initial state, goes through a sequence of steps, and halts when it encounters the halt state.[17] All present-day computers are Turing complete.[18]
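
The tape-and-state idea can be sketched in a few lines of C. The following is a minimal illustration, not a faithful model: it invents a machine with one non-halting state that inverts each symbol and moves right, halting when the head runs off a bounded tape, whereas a true Turing machine has an unbounded tape and a full transition table.

#include <stdio.h>

#define TAPE_LEN 8

enum { STATE_FLIP, STATE_HALT };    /* states of a made-up machine */

int main( void )
{
    int tape[TAPE_LEN] = { 1, 0, 1, 1, 0, 0, 1, 0 };
    int head  = 0;
    int state = STATE_FLIP;

    /* Transition rule: in STATE_FLIP, invert the symbol and move right. */
    while ( state != STATE_HALT )
    {
        tape[head] = !tape[head];
        head++;
        if ( head == TAPE_LEN )
            state = STATE_HALT;
    }

    for ( int i = 0; i < TAPE_LEN; i++ )
        printf( "%d", tape[i] );
    printf( "\n" );
    return 0;
}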

ENIAC

Glenn A. Beck changing a tube in ENIAC

The Electronic Numerical Integrator And Computer (ENIAC) was built between July 1943 and Fall 1945. It was a Turing complete, general-purpose computer that used 17,468 vacuum tubes to create the circuits. At its core, it was a series of Pascalines wired together.[19] Its 40 units weighed 30 tons, occupied 1,800 square feet (167 m2), and consumed $650 per hour (in 1940s currency) in electricity when idle.[19] It had 20 base-10 accumulators. Programming the ENIAC took up to two months.[19] Three function tables were on wheels and needed to be rolled to fixed function panels. Function tables were connected to function panels by plugging heavy black cables into plugboards. Each function table had 728 rotating knobs. Programming the ENIAC also involved setting some of the 3,000 switches. Debugging a program took a week.[20] It ran from 1947 until 1955 at Aberdeen Proving Ground, calculating hydrogen bomb parameters, predicting weather patterns, and producing firing tables to aim artillery guns.[21]

Stored-program computers


Instead of plugging in cords and turning switches, a stored-program computer loads its instructions into memory just like it loads its data into memory.[22] As a result, the computer could be programmed quickly and perform calculations at very fast speeds.[23] Presper Eckert and John Mauchly built the ENIAC. The two engineers introduced the stored-program concept in a three-page memo dated February 1944.[24] Later, in September 1944, John von Neumann began working on the ENIAC project. On June 30, 1945, von Neumann published the First Draft of a Report on the EDVAC, which equated the structures of the computer with the structures of the human brain.[23] The design became known as the von Neumann architecture. The architecture was simultaneously deployed in the construction of the EDVAC and EDSAC computers in 1949.[25][26]

The IBM System/360 (1964) was a family of computers, each having the same instruction set architecture. The Model 20 was the smallest and least expensive. Customers could upgrade and retain the same application software.[27] The Model 195 was the largest and most expensive. Each System/360 model featured multiprogramming,[27] having multiple processes in memory at once. When one process was waiting for input/output, another could compute.

IBM planned for each model to be programmed using PL/1.[28] A committee was formed that included COBOL, Fortran and ALGOL programmers. The purpose was to develop a language that was comprehensive, easy to use, extendible, and would replace Cobol and Fortran.[28] The result was a large and complex language that took a long time to compile.[29]

Switches for manual input on a Data General Nova 3, manufactured in the mid-1970s

Computers manufactured until the 1970s had front-panel switches for manual programming.[30] The computer program was written on paper for reference. An instruction was represented by a configuration of on/off settings. After setting the configuration, an execute button was pressed. This process was then repeated. Computer programs also were automatically inputted via paper tape, punched cards or magnetic-tape. After the medium was loaded, the starting address was set via switches, and the execute button was pressed.[30]

Very Large Scale Integration

A VLSI integrated-circuit die

A major milestone in software development was the invention of the Very Large Scale Integration (VLSI) circuit (1964).

Robert Noyce, co-founder of Fairchild Semiconductor (1957) and Intel (1968), achieved a technological improvement that refined the production of field-effect transistors (1963).[31] The goal is to alter the electrical resistivity and conductivity of a semiconductor junction. First, naturally occurring silicate minerals are converted into polysilicon rods using the Siemens process.[32] The Czochralski process then converts the rods into a monocrystalline silicon boule crystal.[33] The crystal is then thinly sliced to form a wafer substrate. The planar process of photolithography then integrates unipolar transistors, capacitors, diodes, and resistors onto the wafer to build a matrix of metal–oxide–semiconductor (MOS) transistors.[34][35] The MOS transistor is the primary component in integrated circuit chips.[31]

Originally, integrated circuit chips had their function set during manufacturing. During the 1960s, controlling the electrical flow migrated to programming a matrix of read-only memory (ROM). The matrix resembled a two-dimensional array of fuses. The process to embed instructions onto the matrix was to burn out the unneeded connections. There were so many connections that firmware programmers wrote a computer program on another chip to oversee the burning. The technology became known as Programmable ROM. In 1971, Intel installed the computer program onto the chip and named it the Intel 4004 microprocessor.[36]

IBM's System/360 (1964) CPU was not a microprocessor.

The terms microprocessor and central processing unit (CPU) are now used interchangeably. However, CPUs predate microprocessors. For example, the IBM System/360 (1964) had a CPU made from circuit boards containing discrete components on ceramic substrates.[37]

x86 series

The original IBM Personal Computer (1981) used an Intel 8088 microprocessor.

In 1978, the modern software development environment began when Intel upgraded the Intel 8080 to the Intel 8086. Intel simplified the Intel 8086 to manufacture the cheaper Intel 8088.[38] IBM embraced the Intel 8088 when they entered the personal computer market (1981). As consumer demand for personal computers increased, so did Intel's microprocessor development. This succession of development is known as the x86 series. The x86 assembly language is a family of backward-compatible machine instructions. Machine instructions created in earlier microprocessors were retained throughout microprocessor upgrades. This enabled consumers to purchase new computers without having to purchase new application software. The major categories of x86 instructions include data transfer, arithmetic, logic, and control-transfer operations.[d]

Changing programming environment

The DEC VT100 (1978) was a widely used computer terminal.

VLSI circuits enabled the programming environment to advance from a computer terminal (until the 1990s) to a graphical user interface (GUI) computer. Computer terminals limited programmers to a single shell running in a command-line environment. During the 1970s, full-screen source code editing became possible through a text-based user interface. Regardless of the technology available, the goal is to program in a programming language.

Programming paradigms and languages


Programming language features exist to provide building blocks to be combined to express programming ideals.[39] Ideally, a programming language should:[39]

  • express ideas directly in the code.
  • express independent ideas independently.
  • express relationships among ideas directly in the code.
  • combine ideas freely.
  • combine ideas only where combinations make sense.
  • express simple ideas simply.

The style by which a programming language provides these building blocks may be categorized into programming paradigms.[40] For example, different paradigms may be distinguished by whether programs are written as sequences of commands (imperative), as descriptions of the desired result (declarative), or as operations on objects (object-oriented).[40]

Each of these programming styles has contributed to the synthesis of different programming languages.[40]

A programming language is a set of keywords, symbols, identifiers, and rules by which programmers can communicate instructions to the computer.[41] They follow a set of rules called a syntax.[41]

Programming languages get their basis from formal languages.[42] The purpose of defining a solution in terms of its formal language is to generate an algorithm to solve the underlying problem.[42] An algorithm is a sequence of simple instructions that solve a problem.[43]
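
To make the notion concrete, here is the averaging algorithm of the earlier BASIC example restated as a short C sketch; the input values are arbitrary sample data chosen for the illustration.

#include <stdio.h>

/* Average a list of numbers, mirroring the earlier BASIC example. */
int main( void )
{
    double numbers[] = { 2.0, 4.0, 6.0, 8.0 };    /* sample input */
    int    count     = sizeof numbers / sizeof numbers[0];
    double sum       = 0.0;

    for ( int i = 0; i < count; i++ )
        sum = sum + numbers[i];

    printf( "The average is %f\n", sum / count );
    return 0;
}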

Generations of programming language

Machine language monitor on a W65C816S microprocessor

The evolution of programming languages began when the EDSAC (1949) used the first stored computer program in its von Neumann architecture.[44] Programming the EDSAC was in the first generation of programming language,[45] in which instructions were written directly in machine code. The second generation of programming language is assembly language:

  • The basic structure of an assembly language statement is a label, operation, operand, and comment.[48]
  • Labels allow the programmer to work with variable names. The assembler will later translate labels into physical memory addresses.
  • Operations allow the programmer to work with mnemonics. The assembler will later translate mnemonics into instruction numbers.
  • Operands tell the assembler which data the operation will process.
  • Comments allow the programmer to articulate a narrative because the instructions alone are vague.
The key characteristic of an assembly language program is that it forms a one-to-one mapping to its corresponding machine language target.[49]

Imperative languages

A computer program written in an imperative language

Imperative languages specify a sequential algorithm using declarations, expressions, and statements:[53]

  • A declaration introduces a variable name to the computer program and assigns it to a datatype[54] – for example: var x: integer;
  • An expression yields a value – for example: 2 + 2 yields 4
  • A statement might assign an expression to a variable or use the value of a variable to alter the program's control flow – for example: x := 2 + 2; if x = 4 then do_something();
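
As a sketch, the same three elements look like this in C syntax; do_something() is a placeholder function invented for the example.

#include <stdio.h>

void do_something( void ) { printf( "x is 4\n" ); }    /* placeholder action */

int main( void )
{
    int x;              /* declaration: introduces x and its datatype     */
    x = 2 + 2;          /* statement: assigns the expression 2 + 2 to x   */
    if ( x == 4 )       /* statement: uses x to alter the control flow    */
        do_something();
    return 0;
}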

Fortran


FORTRAN (1958) was unveiled as "The IBM Mathematical FORmula TRANslating system". It was designed for scientific calculations, without string handling facilities. Along with declarations, expressions, and statements, it supported arrays, subroutines, and "do" loops.

It succeeded because:

  • programming and debugging costs were below computer running costs.
  • it was supported by IBM.
  • applications at the time were scientific.[55]

However, non-IBM vendors also wrote Fortran compilers, but with a syntax that would likely fail IBM's compiler.[55] The American National Standards Institute (ANSI) developed the first Fortran standard in 1966. In 1978, Fortran 77 became the standard until 1991, when Fortran 90 superseded it with further enhancements.

COBOL


COBOL (1959) stands for "COmmon Business Oriented Language". Fortran manipulated symbols. It was soon realized that symbols did not need to be numbers, so strings were introduced.[56] The US Department of Defense influenced COBOL's development, with Grace Hopper being a major contributor. The statements were English-like and verbose. The goal was to design a language so managers could read the programs. However, the lack of structured statements hindered this goal.[57]

COBOL's development was tightly controlled, so dialects that would have required ANSI standardization did not emerge. As a consequence, the language was not changed for 15 years, until 1974. The 1990s version did make consequential changes, like adding object-oriented programming.[57]

Algol


ALGOL (1960) stands for "ALGOrithmic Language". It had a profound influence on programming language design.[58] Emerging from a committee of European and American programming language experts, it used standard mathematical notation and had a readable, structured design. Algol was the first language to define its syntax using the Backus–Naur form.[58] This led to syntax-directed compilers. It added features such as block structure and recursion.

Algol's direct descendants include Pascal, Modula-2, Ada, Delphi and Oberon on one branch. On another branch the descendants include C, C++ and Java.[58]

Basic


BASIC (1964) stands for "Beginner's All-Purpose Symbolic Instruction Code". It was developed at Dartmouth College for all of their students to learn.[9] If a student did not go on to a more powerful language, the student would still remember Basic.[9] A Basic interpreter was installed in the microcomputers manufactured in the late 1970s. As the microcomputer industry grew, so did the language.[9]

Basic pioneered the interactive session.[9] It offered operating system commands within its environment:

  • The 'new' command created an empty slate.
  • Statements were evaluated immediately.
  • Statements could be programmed by preceding them with line numbers.[h]
  • The 'list' command displayed the program.
  • The 'run' command executed the program.

However, the Basic syntax was too simple for large programs.[9] Recent dialects added structure and object-oriented extensions. Microsoft's Visual Basic is still widely used and produces a graphical user interface.[8]

C


C programming language (1973) got its name because the language BCPL was replaced with B, and AT&T Bell Labs called the next version "C". Its purpose was to write the UNIX operating system.[51] C is a relatively small language, making it easy to write compilers. Its growth mirrored the hardware growth in the 1980s.[51] It also grew because it has the facilities of assembly language but uses a high-level syntax. It added advanced features like arithmetic on pointers, pointers to functions, and bit operations.

Computer memory map

C allows the programmer to control which region of memory data is stored in. Global variables and static variables require the fewest clock cycles to store. The stack is automatically used for the standard variable declarations. Heap memory is returned to a pointer variable from the malloc() function. (A short sketch of the three regions follows the list below.)

  • The global and static data region is located just above the program region. (The program region is technically called the text region. It is where machine instructions are stored.)
  • The global and static data region is technically two regions.[59] One region is called the initialized data segment, where variables declared with default values are stored. The other region is called the block started by symbol (BSS) segment, where variables declared without default values are stored.
  • Variables stored in the global and static data region have their addresses set at compile time. They retain their values throughout the life of the process.
  • The global and static region stores the global variables that are declared on top of (outside) the main() function.[60] Global variables are visible to main() and every other function in the source code.
  • On the other hand, variable declarations inside of main(), other functions, or within { } block delimiters are local variables. Local variables also include formal parameter variables. Parameter variables are enclosed within the parentheses of a function definition.[61] Parameters provide an interface to the function.
  • Local variables declared using the static prefix are also stored in the global and static data region.[59] Unlike global variables, static variables are only visible within the function or block. Static variables always retain their value. An example usage would be the function int increment_counter(){static int counter = 0; counter++; return counter;}[i]
  • The stack region is a contiguous block of memory located near the top memory address.[62] Variables placed in the stack are populated from top to bottom.[j][62] A stack pointer is a special-purpose register that keeps track of the last memory address populated.[62] Variables are placed into the stack via the assembly language PUSH instruction. Therefore, the addresses of these variables are set during runtime. The method for stack variables to lose their scope is via the POP instruction.
  • Local variables declared without the static prefix, including formal parameter variables,[63] are called automatic variables[60] and are stored in the stack.[59] They are visible inside the function or block and lose their scope upon exiting the function or block.
  • The heap region is located below the stack.[59] It is populated from the bottom to the top. The operating system manages the heap using a heap pointer and a list of allocated memory blocks.[64] Like the stack, the addresses of heap variables are set during runtime. An out of memory error occurs when the heap pointer and the stack pointer meet.
  • C provides the malloc() library function to allocate heap memory.[k][65] Populating the heap with data is an additional copy function.[l] Variables stored in the heap are economically passed to functions using pointers. Without pointers, the entire block of data would have to be passed to the function via the stack.
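
As mentioned above, here is a minimal C sketch of the three data regions: the global variable and the static variable are stored in the global and static data region, the automatic variable is stored in the stack, and malloc() returns a block from the heap.

#include <stdio.h>
#include <stdlib.h>

int global_count = 0;                   /* global and static data region */

int main( void )
{
    static int calls = 0;               /* also the global and static data region */
    int local = 5;                      /* automatic variable in the stack */
    int *heap_value = malloc( sizeof ( int ) );     /* heap region */

    if ( heap_value == NULL )
        return 1;

    *heap_value = local + global_count + ++calls;
    printf( "%d\n", *heap_value );

    free( heap_value );                 /* return the heap block */
    return 0;
}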

C++


In the 1970s, software engineers needed language support to break large projects down into modules.[66] One obvious feature was to decompose large projects physically into separate files. A less obvious feature was to decompose large projects logically into abstract data types.[66] At the time, languages supported concrete (scalar) datatypes like integer numbers, floating-point numbers, and strings of characters. Abstract datatypes are structures of concrete datatypes, with a new name assigned. For example, a list of integers could be called integer_list.

In object-oriented jargon, abstract datatypes are called classes. However, a class is only a definition; no memory is allocated. When memory is allocated to a class and bound to an identifier, it is called an object.[67]

Object-oriented imperative languages developed by combining the need for classes and the need for safe functional programming.[68] A function, in an object-oriented language, is assigned to a class. An assigned function is then referred to as a method, member function, or operation. Object-oriented programming is executing operations on objects.[69]

Object-oriented languages support a syntax to model subset/superset relationships. In set theory, an element of a subset inherits all the attributes contained in the superset. For example, a student is a person. Therefore, the set of students is a subset of the set of persons. As a result, students inherit all the attributes common to all persons. Additionally, students have unique attributes that other people do not have. Object-oriented languages model subset/superset relationships using inheritance.[70] Object-oriented programming became the dominant language paradigm by the late 1990s.[66]

C++ (1985) was originally called "C with Classes".[71] It was designed to expand C's capabilities by adding the object-oriented facilities of the language Simula.[72]

An object-oriented module is composed of two files. The definitions file is called the header file. Here is a C++ header file for the GRADE class in a simple school application:

// grade.h
// -------

// Used to allow multiple source files to include
// this header file without duplication errors.
// ----------------------------------------------
#ifndef GRADE_H
#define GRADE_H

class GRADE {
public:
    // This is the constructor operation.
    // ----------------------------------
    GRADE ( const char letter );

    // This is a class variable.
    // -------------------------
    char letter;

    // This is a member operation.
    // ---------------------------
    int grade_numeric( const char letter );

    // This is a class variable.
    // -------------------------
    int numeric;
};
#endif

A constructor operation is a function with the same name as the class name.[73] It is executed when the calling operation executes the new statement.

A module's other file is the source file. Here is a C++ source file for the GRADE class in a simple school application:

// grade.cpp
// ---------
#include "grade.h"

GRADE::GRADE( const char letter )
{
    // Reference the object using the keyword 'this'.
    // ----------------------------------------------
    this->letter = letter;

    // This is Temporal Cohesion
    // -------------------------
    this->numeric = grade_numeric( letter );
}

int GRADE::grade_numeric( const char letter )
{
    if ( ( letter == 'A' || letter == 'a' ) )
        return 4;
    else
    if ( ( letter == 'B' || letter == 'b' ) )
        return 3;
    else
    if ( ( letter == 'C' || letter == 'c' ) )
        return 2;
    else
    if ( ( letter == 'D' || letter == 'd' ) )
        return 1;
    else
    if ( ( letter == 'F' || letter == 'f' ) )
        return 0;
    else
        return -1;
}

Here is a C++ header file for the PERSON class in a simple school application:

// person.h
// --------
#ifndef PERSON_H
#define PERSON_H

class PERSON {
public:
    PERSON ( const char *name );
    const char *name;
};
#endif

Here is a C++ source file for the PERSON class in a simple school application:

// person.cpp
// ----------
#include "person.h"

PERSON::PERSON ( const char *name )
{
    this->name = name;
}

Here is a C++ header file for the STUDENT class in a simple school application:

// student.h
// ---------
#ifndef STUDENT_H
#define STUDENT_H

#include "person.h"
#include "grade.h"

// A STUDENT is a subset of PERSON.
// --------------------------------
class STUDENT : public PERSON{
public:
    STUDENT ( const char *name );
    GRADE *grade;
};
#endif

Here is a C++ source file for the STUDENT class in a simple school application:

// student.cpp
// -----------
#include "student.h"
#include "person.h"

STUDENT::STUDENT ( const char *name ):
    // Execute the constructor of the PERSON superclass.
    // -------------------------------------------------
    PERSON( name )
{
    // Nothing else to do.
    // -------------------
}

Here is a driver program for demonstration:

// student_dvr.cpp
// ---------------
#include <iostream>
#include "student.h"

int main( void )
{
    STUDENT *student = new STUDENT( "The Student" );
    student->grade = new GRADE( 'a' );

    std::cout
        // Notice student inherits PERSON's name
        << student->name
        << ": Numeric grade = "
        << student->grade->numeric
        << "\n";
	return 0;
}

Here is a makefile to compile everything:

# makefile
# --------
all: student_dvr

clean:
    rm student_dvr *.o

student_dvr: student_dvr.cpp grade.o student.o person.o
    c++ student_dvr.cpp grade.o student.o person.o -o student_dvr

grade.o: grade.cpp grade.h
    c++ -c grade.cpp

student.o: student.cpp student.h
    c++ -c student.cpp

person.o: person.cpp person.h
    c++ -c person.cpp

Declarative languages


Imperative languages have one major criticism: assigning an expression to a non-local variable may produce an unintended side effect.[74] Declarative languages generally omit the assignment statement and the control flow. They describe what computation should be performed and not how to compute it. Two broad categories of declarative languages are functional languages and logical languages.

The principle behind a functional language is to use lambda calculus as a guide for a well-defined semantics.[75] In mathematics, a function is a rule that maps elements of a domain to elements of a range. Consider the function:

times_10(x) = 10 * x

The expression 10 * x is mapped by the function times_10() to a range of values. One value happens to be 20. This occurs when x is 2. So, the application of the function is mathematically written as:

times_10(2) = 20

A functional language compiler will not store this value in a variable. Instead, it will push the value onto the computer's stack before setting the program counter back to the calling function. The calling function will then pop the value from the stack.[76]

Imperative languages do support functions. Therefore, functional programming can be achieved in an imperative language, if the programmer uses discipline. However, a functional language will force this discipline onto the programmer through its syntax. Functional languages have a syntax tailored to emphasize the what.[77]

A functional program is developed with a set of primitive functions followed by a single driver function.[74] Consider the snippet:

function max( a, b ){/* code omitted */}

function min( a, b ){/* code omitted */}

function range( a, b, c ) {

return max( a, max( b, c ) ) - min( a, min( b, c ) );

}

The primitives are max() and min(). The driver function is range(). Executing:

put( range( 10, 4, 7) ); will output 6.

Functional languages are used in computer science research to explore new language features.[78] Moreover, their lack of side effects has made them popular in parallel programming and concurrent programming.[79] However, application developers prefer the object-oriented features of imperative languages.[79]

Lisp


Lisp (1958) stands for "LISt Processor".[80] It is tailored to process lists. A full structure of the data is formed by building lists of lists. In memory, a tree data structure is built. Internally, the tree structure lends nicely for recursive functions.[81] The syntax to build a tree is to enclose the space-separated elements within parenthesis. The following is a list of three elements. The first two elements are themselves lists of two elements:

((A B) (HELLO WORLD) 94)

Lisp has functions to extract and reconstruct elements.[82] The function head() returns a list containing the first element in the list. The function tail() returns a list containing everything but the first element. The function cons() returns a list that is the concatenation of other lists. Therefore, the following expression will return the list x:

cons(head(x), tail(x))

One drawback of Lisp is that when many functions are nested, the parentheses may look confusing.[77] Modern Lisp environments help ensure parentheses match. As an aside, Lisp does support the imperative language operations of the assignment statement and goto loops.[83] Also, Lisp is not concerned with the datatype of the elements at compile time.[84] Instead, it assigns (and may reassign) the datatypes at runtime. Assigning the datatype at runtime is called dynamic binding.[85] Whereas dynamic binding increases the language's flexibility, programming errors may linger until late in the software development process.[85]
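
The list-of-lists structure can be pictured as chained cons cells: each cell holds one element plus a pointer to the rest of the list. The following rough C sketch restricts elements to integers for brevity; the names cons(), head(), and tail() mirror the text rather than any particular Lisp implementation, and error handling and memory release are omitted.

#include <stdio.h>
#include <stdlib.h>

/* A cons cell: one element plus a pointer to the rest of the list. */
typedef struct cell
{
    int value;
    struct cell *rest;
} CELL;

CELL *cons( int value, CELL *rest )     /* prepend an element */
{
    CELL *cell = malloc( sizeof ( CELL ) );
    cell->value = value;
    cell->rest  = rest;
    return cell;
}

int   head( CELL *list ) { return list->value; }    /* first element   */
CELL *tail( CELL *list ) { return list->rest;  }    /* everything else */

int main( void )
{
    CELL *x = cons( 1, cons( 2, cons( 3, NULL ) ) );    /* the list (1 2 3)       */
    CELL *y = cons( head( x ), tail( x ) );             /* rebuilds an equal list */

    printf( "%d %d\n", head( y ), head( tail( y ) ) );  /* prints: 1 2 */
    return 0;
}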

Writing large, reliable, and readable Lisp programs requires forethought. If properly planned, the program may be much shorter than an equivalent imperative language program.[77] Lisp is widely used in artificial intelligence. However, its usage has been accepted only because it has imperative language operations, making unintended side-effects possible.[79]

ML


ML (1973)[86] stands for "Meta Language". ML checks to make sure only data of the same type are compared with one another.[87] For example, this function has one input parameter (an integer) and returns an integer:

fun times_10(n : int) : int = 10 * n;

ML is not parenthesis-eccentric like Lisp. The following is an application of times_10():

times_10 2

It returns "20 : int". (Both the results and the datatype are returned.)

Like Lisp, ML is tailored to process lists. Unlike Lisp, each element is the same datatype.[88] Moreover, ML assigns the datatype of an element at compile time. Assigning the datatype at compile time is called static binding. Static binding increases reliability because the compiler checks the context of variables before they are used.[89]

Prolog


Prolog (1972) stands for "PROgramming in LOGic". It is a logic programming language, based on formal logic. The language was developed by Alain Colmerauer and Philippe Roussel in Marseille, France. It is an implementation of Selective Linear Definite clause resolution, pioneered by Robert Kowalski and others at the University of Edinburgh.[90]

The building blocks of a Prolog program are facts and rules. Here is a simple example:

cat(tom).                        % tom is a cat
mouse(jerry).                    % jerry is a mouse

animal(X) :- cat(X).             % each cat is an animal
animal(X) :- mouse(X).           % each mouse is an animal

big(X)   :- cat(X).              % each cat is big
small(X) :- mouse(X).            % each mouse is small

eat(X,Y) :- mouse(X), cheese(Y). % each mouse eats each cheese
eat(X,Y) :- big(X),   small(Y).  % each big animal eats each small animal

After all the facts and rules are entered, a question can be asked:

Will Tom eat Jerry?
?- eat(tom,jerry).
true

The following example shows how Prolog will convert a letter grade to its numeric value:

numeric_grade('A', 4).
numeric_grade('B', 3).
numeric_grade('C', 2).
numeric_grade('D', 1).
numeric_grade('F', 0).
numeric_grade(X, -1) :- X \= 'A', X \= 'B', X \= 'C', X \= 'D', X \= 'F'.
grade('The Student', 'A').
?- grade('The Student', X), numeric_grade(X, Y).
X = 'A',
Y = 4

Here is a comprehensive example:[91]

1) All dragons billow fire, or equivalently, a thing billows fire if the thing is a dragon:

billows_fire(X) :-
    is_a_dragon(X).

2) A creature billows fire if one of its parents billows fire:

billows_fire(X) :-
    is_a_creature(X),
    is_a_parent_of(Y,X),
    billows_fire(Y).

3) A thing X is a parent of a thing Y if X is the mother of Y or X is the father of Y:

is_a_parent_of(X, Y):- is_the_mother_of(X, Y).
is_a_parent_of(X, Y):- is_the_father_of(X, Y).

4) A thing is a creature if the thing is a dragon:

is_a_creature(X) :-
    is_a_dragon(X).

5) Norberta is a dragon, and Puff is a creature. Norberta is the mother of Puff.

is_a_dragon(norberta).
is_a_creature(puff).
is_the_mother_of(norberta, puff).

Rule (2) is a recursive (inductive) definition. It can be understood declaratively, without the need to understand how it is executed.

Rule (3) shows how functions are represented by using relations. Here, the mother and father functions ensure that every individual has only one mother and only one father.

Prolog is an untyped language. Nonetheless, inheritance can be represented by using predicates. Rule (4) asserts that a creature is a superclass of a dragon.

Questions are answered using backward reasoning. Given the question:

 ?- billows_fire(X).

Prolog generates two answers:

X = norberta
X = puff

Practical applications for Prolog are knowledge representation and problem solving in artificial intelligence.

Object-oriented programming


Object-oriented programming is a programming method to execute operations (functions) on objects.[92] The basic idea is to group the characteristics of a phenomenon into an object container and give the container a name. The operations on the phenomenon are also grouped into the container.[92] Object-oriented programming developed by combining the need for containers and the need for safe functional programming.[93] This programming method need not be confined to an object-oriented language.[94] In an object-oriented language, an object container is called a class. In a non-object-oriented language, a data structure (which is also known as a record) may become an object container. To turn a data structure into an object container, operations need to be written specifically for the structure. The resulting structure is called an abstract datatype.[95] However, inheritance will be missing. Nonetheless, this shortcoming can be overcome.

Here is a C programming language header file for the GRADE abstract datatype in a simple school application:

/* grade.h */
/* ------- */

/* Used to allow multiple source files to include */
/* this header file without duplication errors.   */
/* ---------------------------------------------- */
#ifndef GRADE_H
#define GRADE_H

typedef struct
{
    char letter;
} GRADE;

/* Constructor */
/* ----------- */
GRADE *grade_new( char letter );

int grade_numeric( char letter );
#endif

The grade_new() function performs the same algorithm as the C++ constructor operation.

Here is a C programming language source file for the GRADE abstract datatype in a simple school application:

/* grade.c */
/* ------- */
#include "grade.h"

GRADE *grade_new( char letter )
{
    GRADE *grade;

    /* Allocate heap memory */
    /* -------------------- */
    if ( ! ( grade = calloc( 1, sizeof ( GRADE ) ) ) )
    {
        fprintf(stderr,
                "ERROR in %s/%s/%d: calloc() returned empty.\n",
                __FILE__,
                __FUNCTION__,
                __LINE__ );
        exit( 1 );
    }

    grade->letter = letter;
    return grade;
}

int grade_numeric( char letter )
{
    if ( ( letter == 'A' || letter == 'a' ) )
        return 4;
    else
    if ( ( letter == 'B' || letter == 'b' ) )
        return 3;
    else
    if ( ( letter == 'C' || letter == 'c' ) )
        return 2;
    else
    if ( ( letter == 'D' || letter == 'd' ) )
        return 1;
    else
    if ( ( letter == 'F' || letter == 'f' ) )
        return 0;
    else
        return -1;
}

In the constructor, the function calloc() is used instead of malloc() because each memory cell will be set to zero.
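
The zero-initialization can be made explicit: calloc( 1, size ) behaves like malloc( size ) followed by clearing the block. The alternative constructor below, grade_new_with_malloc(), is a hypothetical sketch added only to show that equivalence; it is not part of the application above.

#include <stdlib.h>
#include <string.h>

#include "grade.h"

/* Roughly equivalent to calloc( 1, sizeof ( GRADE ) ):
   allocate the block, then set every byte to zero. */
GRADE *grade_new_with_malloc( char letter )
{
    GRADE *grade = malloc( sizeof ( GRADE ) );

    if ( grade == NULL )
        return NULL;

    memset( grade, 0, sizeof ( GRADE ) );   /* what calloc() does for us */
    grade->letter = letter;
    return grade;
}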

Here is a C programming language header file for the PERSON abstract datatype in a simple school application:

/* person.h */
/* -------- */
#ifndef PERSON_H
#define PERSON_H

typedef struct
{
    char *name;
} PERSON;

/* Constructor */
/* ----------- */
PERSON *person_new( char *name );
#endif

Here is a C programming language source file for the PERSON abstract datatype in a simple school application:

/* person.c */
/* -------- */
#include "person.h"

PERSON *person_new( char *name )
{
    PERSON *person;

    if ( ! ( person = calloc( 1, sizeof ( PERSON ) ) ) )
    {
        fprintf(stderr,
                "ERROR in %s/%s/%d: calloc() returned empty.\n",
                __FILE__,
                __FUNCTION__,
                __LINE__ );
        exit( 1 );
    }

    person->name = name;
    return person;
}

Here is a C programming language header file for the STUDENT abstract datatype in a simple school application:

/* student.h */
/* --------- */
#ifndef STUDENT_H
#define STUDENT_H

#include "person.h"
#include "grade.h"

typedef struct
{
    /* A STUDENT is a subset of PERSON. */
    /* -------------------------------- */
    PERSON *person;

    GRADE *grade;
} STUDENT;

/* Constructor */
/* ----------- */
STUDENT *student_new( char *name );
#endif

Here is a C programming language source file for the STUDENT abstract datatype in a simple school application:

/* student.c */
/* --------- */
#include "student.h"
#include "person.h"

STUDENT *student_new( char *name )
{
    STUDENT *student;

    if ( ! ( student = calloc( 1, sizeof ( STUDENT ) ) ) )
    {
        fprintf(stderr,
                "ERROR in %s/%s/%d: calloc() returned empty.\n",
                __FILE__,
                __FUNCTION__,
                __LINE__ );
        exit( 1 );
    }

    /* Execute the constructor of the PERSON superclass. */
    /* ------------------------------------------------- */
    student->person = person_new( name );
    return student;
}

Here is a driver program for demonstration:

/* student_dvr.c */
/* ------------- */
#include <stdio.h>
#include "student.h"

int main( void )
{
    STUDENT *student = student_new( "The Student" );
    student->grade = grade_new( 'a' );

    printf( "%s: Numeric grade = %d\n",
            /* Whereas a subset exists, inheritance does not. */
            student->person->name,
            /* Functional programming is executing functions just-in-time (JIT) */
            grade_numeric( student->grade->letter ) );

	return 0;
}

Here is a makefile to compile everything:

# makefile
# --------
all: student_dvr

clean:
    rm student_dvr *.o

student_dvr: student_dvr.c grade.o student.o person.o
    gcc student_dvr.c grade.o student.o person.o -o student_dvr

grade.o: grade.c grade.h
    gcc -c grade.c

student.o: student.c student.h
    gcc -c student.c

person.o: person.c person.h
    gcc -c person.c

The formal strategy to build object-oriented objects is to:[96]

  • Identify the objects. Most likely these will be nouns.
  • Identify each object's attributes. What helps to describe the object?
  • Identify each object's actions. Most likely these will be verbs.
  • Identify the relationships from object to object. Most likely these will be verbs.

For example:

  • A person is a human identified by a name.
  • A grade is an achievement identified by a letter.
  • A student is a person who earns a grade.

Syntax and semantics

Production rules consist of a set of terminals and non-terminals.

The syntax of a computer program is a list of production rules which form its grammar.[97] A programming language's grammar correctly places its declarations, expressions, and statements.[98] Complementing the syntax of a language are its semantics. The semantics describe the meanings attached to various syntactic constructs.[99] A syntactic construct may need a semantic description because a production rule may have an invalid interpretation.[100] Also, different languages might have the same syntax; however, their behaviors may be different.

The syntax of a language is formally described by listing the production rules. Whereas the syntax of a natural language is extremely complicated, a subset of the English language can have this production rule listing:[101]

  1. a sentence is made up of a noun-phrase followed by a verb-phrase;
  2. a noun-phrase is made up of an article followed by an adjective followed by a noun;
  3. a verb-phrase is made up of a verb followed by a noun-phrase;
  4. an article is 'the';
  5. an adjective is 'big' or
  6. an adjective is 'small';
  7. a noun is 'cat' or
  8. a noun is 'mouse';
  9. a verb is 'eats';

The words in bold-face are known as non-terminals. The words in 'single quotes' are known as terminals.[102]

From this production rule listing, complete sentences may be formed using a series of replacements.[103] The process is to replace non-terminals with either a valid non-terminal or a valid terminal. The replacement process repeats until only terminals remain. One valid sentence is:

  • sentence
  • noun-phrase verb-phrase
  • article adjective noun verb-phrase
  • the adjective noun verb-phrase
  • the big noun verb-phrase
  • the big cat verb-phrase
  • the big cat verb noun-phrase
  • the big cat eats noun-phrase
  • the big cat eats article adjective noun
  • the big cat eats the adjective noun
  • the big cat eats the small noun
  • the big cat eats the small mouse

However, another combination results in an invalid sentence:

  • the small mouse eats the big cat

Therefore, a semantic is necessary to correctly describe the meaning of an eat activity.

One production rule listing method is called the Backus–Naur form (BNF).[104] BNF describes the syntax of a language and itself has a syntax. This recursive definition is an example of a metalanguage.[99] The syntax of BNF includes:

  • ::= which translates to is made up of a[n] when a non-terminal is to its right. It translates to is when a terminal is to its right.
  • | which translates to or.
  • < and > which surround non-terminals.

Using BNF, a subset of the English language can have this production rule listing:

<sentence> ::= <noun-phrase><verb-phrase>
<noun-phrase> ::= <article><adjective><noun>
<verb-phrase> ::= <verb><noun-phrase>
<article> ::= the
<adjective> ::= big | small
<noun> ::= cat | mouse
<verb> ::= eats

Using BNF, a signed-integer has the production rule listing:[105]

<signed-integer> ::= <sign><integer>
<sign> ::= + | -
<integer> ::= <digit> | <digit><integer>
<digit> ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

Notice the recursive production rule:

<integer> ::= <digit> | <digit><integer>

This allows for an infinite number of possibilities. Therefore, a semantic is necessary to describe a limitation of the number of digits.

Notice the leading zero possibility in the production rules:

<integer> ::= <digit> | <digit><integer>
<digit> ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

Therefore, a semantic is necessary to describe that leading zeros need to be ignored.
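
The production rules above map naturally onto a recursive-descent recognizer. The following C sketch accepts strings that match <signed-integer>; the recursive <integer> rule is implemented as a loop, and the digit-count and leading-zero semantics discussed above are deliberately not enforced.

#include <ctype.h>
#include <stdio.h>

/* Recognizer for:  <signed-integer> ::= <sign><integer>
                    <sign>           ::= + | -
                    <integer>        ::= <digit> | <digit><integer>   */
static const char *s;    /* current position in the input */

static int match_integer( void )
{
    if ( ! isdigit( (unsigned char) *s ) )    /* <integer> needs at least one digit */
        return 0;
    while ( isdigit( (unsigned char) *s ) )   /* the recursive rule, as a loop      */
        s++;
    return 1;
}

static int match_signed_integer( const char *input )
{
    s = input;
    if ( *s != '+' && *s != '-' )             /* <sign> */
        return 0;
    s++;
    return match_integer() && *s == '\0';
}

int main( void )
{
    printf( "%d\n", match_signed_integer( "-042" ) );   /* 1: matches        */
    printf( "%d\n", match_signed_integer( "42" ) );     /* 0: missing <sign> */
    return 0;
}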

Two formal methods are available to describe semantics. They are denotational semantics and axiomatic semantics.[106]

Software engineering and computer programming

Prior to programming languages, Betty Jennings and Fran Bilas programmed the ENIAC by moving cables and setting switches.

Software engineering is a variety of techniques to produce quality computer programs.[107] Computer programming is the process of writing or editing source code. In a formal environment, a systems analyst will gather information from managers about all the organization's processes to automate. This professional then prepares a detailed plan for the new or modified system.[108] The plan is analogous to an architect's blueprint.[108]

Performance objectives


The systems analyst has the objective to deliver the right information to the right person at the right time.[109] The critical factors to achieve this objective are:[109]

  1. The quality of the output. Is the output useful for decision-making?
  2. The accuracy of the output. Does it reflect the true situation?
  3. The format of the output. Is the output easily understood?
  4. The speed of the output. Time-sensitive information is important when communicating with the customer in real time.

Cost objectives


Achieving performance objectives should be balanced with all of the costs, including:[110]

  1. Development costs.
  2. Uniqueness costs. A reusable system may be expensive. However, it might be preferred over a limited-use system.
  3. Hardware costs.
  4. Operating costs.

Applying a systems development process will mitigate the axiom: the later in the process an error is detected, the more expensive it is to correct.[111]

Waterfall model


The waterfall model is an implementation of a systems development process.[112] As the waterfall label implies, the basic phases overlap each other:[113]

  1. The investigation phase is to understand the underlying problem.
  2. The analysis phase is to understand the possible solutions.
  3. The design phase is to plan the best solution.
  4. The implementation phase is to program the best solution.
  5. The maintenance phase lasts throughout the life of the system. Changes to the system after it is deployed may be necessary.[114] Faults may exist, including specification faults, design faults, or coding faults. Improvements may be necessary. Adaption may be necessary to react to a changing environment.

Computer programmer


A computer programmer is a specialist responsible for writing or modifying the source code to implement the detailed plan.[108] A programming team is likely to be needed because most systems are too large to be completed by a single programmer.[115] However, adding programmers to a project may not shorten the completion time. Instead, it may lower the quality of the system.[115] To be effective, program modules need to be defined and distributed to team members.[115] Also, team members must interact with one another in a meaningful and effective way.[115]

Computer programmers may be programming in the small: programming within a single module.[116] Chances are a module will execute modules located in other source code files. Therefore, computer programmers may be programming in the large: programming modules so they will effectively couple with each other.[116] Programming-in-the-large includes contributing to the application programming interface (API).

Program modules


Modular programming is a technique to refine imperative language programs. Refined programs may reduce the software size, separate responsibilities, and thereby mitigate software aging. A program module is a sequence of statements that are bounded within a block and together identified by a name.[117] Modules have a function, context, and logic:[118]

  • The function of a module is what it does.
  • The context of a module is the set of elements being performed upon.
  • The logic of a module is how it performs the function.

The module's name should be derived first by its function, then by its context. Its logic should not be part of the name.[118] For example, function compute_square_root( x ) or function compute_square_root_integer( i : integer ) are appropriate module names. However, function compute_square_root_by_division( x ) is not.

The degree of interaction within a module is its level of cohesion.[118] Cohesion is a judgment of the relationship between a module's name and its function. The degree of interaction between modules is the level of coupling.[119] Coupling is a judgment of the relationship between a module's context and the elements being performed upon.

Cohesion


The levels of cohesion from worst to best are:[120]

  • Coincidental Cohesion: A module has coincidental cohesion if it performs multiple functions, and the functions are completely unrelated. For example, function read_sales_record_print_next_line_convert_to_float(). Coincidental cohesion occurs in practice if management enforces silly rules. For example, "Every module will have between 35 and 50 executable statements."[120]
  • Logical Cohesion: A module has logical cohesion if it has available a series of functions, but only one of them is executed. For example, function perform_arithmetic( perform_addition, a, b ).
  • Temporal Cohesion: A module has temporal cohesion if it performs functions related to time. One example, function initialize_variables_and_open_files(). Another example, stage_one(), stage_two(), ...
  • Procedural Cohesion: A module has procedural cohesion if it performs multiple loosely related functions. For example, function read_part_number_update_employee_record().
  • Communicational Cohesion: A module has communicational cohesion if it performs multiple closely related functions. For example, function read_part_number_update_sales_record().
  • Informational Cohesion: A module has informational cohesion if it performs multiple functions, but each function has its own entry and exit points. Moreover, the functions share the same data structure. Object-oriented classes work at this level.
  • Functional Cohesion: a module has functional cohesion if it achieves a single goal working only on local variables. Moreover, it may be reusable in other contexts.

Coupling


The levels of coupling from worst to best are:[119]

  • Content Coupling: A module has content coupling if it modifies a local variable of another function. COBOL used to do this with the alter verb.
  • Common Coupling: A module has common coupling if it modifies a global variable.
  • Control Coupling: A module has control coupling if another module can modify its control flow. For example, perform_arithmetic( perform_addition, a, b ). Instead, control should be on the makeup of the returned object.
  • Stamp Coupling: A module has stamp coupling if an element of a data structure passed as a parameter is modified. Object-oriented classes work at this level.
  • Data Coupling: A module has data coupling if all of its input parameters are needed and none of them are modified. Moreover, the result of the function is returned as a single object.
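
A short C sketch contrasts the two ends of the scale: perform_arithmetic() exhibits control coupling because the caller's flag dictates its control flow, while add() exhibits data coupling because every parameter is needed, none is modified, and the result is returned as a single value.

#include <stdio.h>

/* Control coupling: the caller's flag dictates this module's control flow. */
int perform_arithmetic( int perform_addition, int a, int b )
{
    return perform_addition ? a + b : a - b;
}

/* Data coupling: every parameter is needed, none is modified,
   and the result is returned as a single value. */
int add( int a, int b )
{
    return a + b;
}

int main( void )
{
    printf( "%d %d\n", perform_arithmetic( 1, 2, 3 ), add( 2, 3 ) );    /* 5 5 */
    return 0;
}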

Data flow analysis

A sample function-level data-flow diagram

Data flow analysis is a design method used to achieve modules of functional cohesion and data coupling.[121] The input to the method is a data-flow diagram. A data-flow diagram is a set of ovals representing modules. Each module's name is displayed inside its oval. Modules may be at the executable level or the function level.

The diagram also has arrows connecting modules to each other. Arrows pointing into modules represent a set of inputs. Each module should have only one arrow pointing out from it to represent its single output object. (Optionally, an additional exception arrow points out.) A daisy chain of ovals will convey an entire algorithm. The input modules should start the diagram. The input modules should connect to the transform modules. The transform modules should connect to the output modules.[122]

Functional categories

A diagram showing that the user interacts with the application software. The application software interacts with the operating system, which interacts with the hardware.

Computer programs may be categorized along functional lines. The main functional categories are application software and system software. System software includes the operating system, which couples computer hardware with application software.[123] The purpose of the operating system is to provide an environment where application software executes in a convenient and efficient manner.[123] Both application software and system software execute utility programs. At the hardware level, a microcode program controls the circuits throughout the central processing unit.

Application software


Application software is the key to unlocking the potential of the computer system.[124] Enterprise application software bundles accounting, personnel, customer, and vendor applications. Examples include enterprise resource planning, customer relationship management, and supply chain management software.

Enterprise applications may be developed in-house as a one-of-a-kind proprietary software.[125] Alternatively, they may be purchased as off-the-shelf software. Purchased software may be modified to provide custom software. If the application is customized, then either the company's resources are used or the resources are outsourced. Outsourced software development may be from the original software vendor or a third-party developer.[126]

The potential advantages of in-house software are that features and reports may be developed exactly to specification.[127] Management may also be involved in the development process, which offers a level of control.[128] Management may decide to counteract a competitor's new initiative or implement a customer or vendor requirement.[129] A merger or acquisition may necessitate enterprise software changes. The potential disadvantages of in-house software are that time and resource costs may be extensive.[125] Furthermore, risks concerning features and performance may be looming.

The potential advantages of off-the-shelf software are that upfront costs are identifiable, the basic needs should be fulfilled, and its performance and reliability have a track record.[125] The potential disadvantages of off-the-shelf software are that it may have unnecessary features that confuse end users, it may lack features the enterprise needs, and the data flow may not match the enterprise's work processes.[125]

Application service provider


One approach to economically obtaining a customized enterprise application is through an application service provider.[130] Specialty companies provide hardware, custom software, and end-user support. They may speed the development of new applications because they possess skilled information system staff. The biggest advantage is it frees in-house resources from staffing and managing complex computer projects.[130] Many application service providers target small, fast-growing companies with limited information system resources.[130] On the other hand, larger companies with major systems will likely have their technical infrastructure in place. One risk is having to trust an external organization with sensitive information. Another risk is having to trust the provider's infrastructure reliability.[130]

Operating system

Diagrams: program vs. process vs. thread; scheduling, preemption, and context switching

An operating system is the low-level software that supports a computer's basic functions, such as scheduling processes and controlling peripherals.[123]

In the 1950s, the programmer, who was also the operator, would write a program and run it. After the program finished executing, the output may have been printed, or it may have been punched onto paper tape or cards for later processing.[30] More often than not the program did not work. The programmer then looked at the console lights and fiddled with the console switches. If less fortunate, a memory printout was made for further study. In the 1960s, programmers reduced the amount of wasted time by automating the operator's job. A program called an operating system was kept in the computer at all times.[131]

The term operating system may refer to two levels of software.[132] The operating system may refer to the kernel program that manages the processes, memory, and devices. More broadly, the operating system may refer to the entire package of the central software. The package includes a kernel program, command-line interpreter, graphical user interface, utility programs, and editor.[132]

Kernel Program

A kernel connects the application software to the hardware of a computer.

The kernel's main purpose is to manage the limited resources of a computer:

Physical memory is scattered around RAM and the hard disk. Virtual memory is one continuous block.
  • When the kernel initially loads an executable into memory, it divides the address space logically into regions.[134] The kernel maintains a master-region table and many per-process-region (pregion) tables—one for each running process.[134] These tables constitute the virtual address space. The master-region table is used to determine where its contents are located in physical memory. The pregion tables allow each process to have its own program (text) pregion, data pregion, and stack pregion.
  • The program pregion stores machine instructions. Since machine instructions do not change, the program pregion may be shared by many processes of the same executable.[134]
  • To save time and memory, the kernel may load only blocks of execution instructions from the disk drive, not the entire executable file.[133]
  • The kernel is responsible for translating virtual addresses into physical addresses. The kernel may request data from the memory controller and, instead, receive a page fault.[135] If so, the kernel accesses the memory management unit to populate the physical data region and translate the address.[136]
  • The kernel allocates memory from the heap upon request by a process.[65] When the process is finished with the memory, the process may request that it be freed. If the process exits without requesting that all allocated memory be freed, then the kernel performs garbage collection to free the memory.
  • The kernel also ensures that a process only accesses its own memory, and not that of the kernel or other processes.[133]
  • The kernel program should perform file system management.[133] The kernel has instructions to create, retrieve, update, and delete files.
  • The kernel program should perform device management.[133] The kernel provides programs to standardize and simplify the interface to the mouse, keyboard, disk drives, printers, and other devices. Moreover, the kernel should arbitrate access to a device if two processes request it at the same time.
  • The kernel program should perform network management.[137] The kernel transmits and receives packets on behalf of processes. One key service is to find an efficient route to the target system.
  • The kernel program should provide system level functions for programmers to use.[138]
    • Programmers access files through a relatively simple interface that in turn executes a relatively complicated low-level I/O interface. The low-level interface includes file creation, file descriptors, file seeking, physical reading, and physical writing.
    • Programmers create processes through a relatively simple interface that in turn executes a relatively complicated low-level interface.
    • Programmers perform date/time arithmetic through a relatively simple interface that in turn executes a relatively complicated low-level time interface.[139]
  • The kernel program should provide a communication channel between executing processes.[140] For a large software system, it may be desirable to engineer the system into smaller processes. Processes may communicate with one another by sending and receiving signals; a minimal sketch of one such kernel-provided channel follows this list.
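
The following is a minimal Python sketch (POSIX-only, not drawn from the cited sources) of one kernel-provided communication channel: the parent asks the kernel for a pipe and a second process, and the child writes a short message that the parent reads back. The message text is illustrative.

import os

# Minimal sketch of kernel-provided interprocess communication (POSIX-only).
# The kernel supplies both the pipe and the second process.
read_fd, write_fd = os.pipe()      # ask the kernel for a one-way channel
pid = os.fork()                    # ask the kernel to start a second process

if pid == 0:                       # child process
    os.close(read_fd)
    os.write(write_fd, b"hello from the child")
    os.close(write_fd)
    os._exit(0)
else:                              # parent process
    os.close(write_fd)
    message = os.read(read_fd, 1024)
    os.close(read_fd)
    os.wait()                      # reap the child
    print("parent received:", message.decode())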

Originally, operating systems were programmed in assembly; however, modern operating systems are typically written in higher-level languages like C, Objective-C, and Swift.[m]

Utility program

[edit]

A utility is a program that aids system administration and software execution. An operating system typically provides utilities to check hardware such as storage, memory, speakers, and printers.[141] A utility may optimize the performance of a storage device. System utilities monitor hardware and network performance and may trigger an alert when a metric is outside the nominal range.[142] A utility may compress files to reduce storage space and network transmission time.[141] A utility may sort and merge data sets[142] or detect computer viruses.[142]

Microcode program

[edit]

A microcode program is the bottom-level interpreter[n] that controls the datapath of software-driven computers.[144] (Advances in hardware have migrated these operations to hardware execution circuits.)[144] Microcode instructions allow the programmer to more easily implement the digital logic level[145]—the computer's real hardware. The digital logic level is the boundary between computer science and computer engineering.[146]

A logic gate is a tiny transistor that can return one of two signals: on or off.[147]

  • Having one transistor forms the NOT gate.
  • Connecting two transistors in series forms the NAND gate.
  • Connecting two transistors in parallel forms the NOR gate.
  • Connecting a NOT gate to a NAND gate forms the AND gate.
  • Connecting a NOT gate to a NOR gate forms the OR gate.

These five gates form the building blocks of binary algebra—the digital logic functions of the computer.
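
A small Python sketch (illustrative only, not from the cited source) can model these compositions: NOT, NAND, and NOR are treated as primitives, and AND and OR are formed by connecting a NOT gate to a NAND or NOR gate, exactly as the list above describes.

def not_gate(a: bool) -> bool:             # primitive: one transistor
    return not a

def nand_gate(a: bool, b: bool) -> bool:   # primitive: two transistors in series
    return not (a and b)

def nor_gate(a: bool, b: bool) -> bool:    # primitive: two transistors in parallel
    return not (a or b)

def and_gate(a: bool, b: bool) -> bool:    # a NOT gate connected to a NAND gate
    return not_gate(nand_gate(a, b))

def or_gate(a: bool, b: bool) -> bool:     # a NOT gate connected to a NOR gate
    return not_gate(nor_gate(a, b))

for a in (False, True):
    for b in (False, True):
        print(a, b, and_gate(a, b), or_gate(a, b))   # truth tables for AND and OR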

Microcode instructions are mnemonics programmers may use to execute digital logic functions instead of forming them in binary algebra. They are stored in a central processing unit's (CPU) control store.[148] These hardware-level instructions move data throughout the data path.

The micro-instruction cycle begins when the microsequencer uses its microprogram counter to fetch the next machine instruction from random-access memory.[149] The next step is to decode the machine instruction by selecting the proper output line to the hardware module.[150] The final step is to execute the instruction using the hardware module's set of gates.
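
The cycle can be illustrated with a toy fetch-decode-execute loop in Python. This is a deliberately simplified, hypothetical model (a short instruction list, a program counter, and a single accumulator), not real microcode.

# Toy fetch-decode-execute loop: "memory" holds (opcode, operand) pairs,
# the program counter plays the role of the microprogram counter, and a
# chain of comparisons stands in for the decoder's output lines.
memory = [("LOAD", 7), ("ADD", 5), ("STORE", None), ("HALT", None)]
accumulator = 0
stored = None
pc = 0

running = True
while running:
    opcode, operand = memory[pc]   # fetch the next instruction
    pc += 1
    if opcode == "LOAD":           # decode selects a handler; execute runs it
        accumulator = operand
    elif opcode == "ADD":
        accumulator += operand
    elif opcode == "STORE":
        stored = accumulator
    elif opcode == "HALT":
        running = False

print("stored value:", stored)     # 12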

A symbolic representation of an ALU

Instructions to perform arithmetic are passed through an arithmetic logic unit (ALU).[151] The ALU has circuits to perform elementary operations to add, shift, and compare integers. By combining and looping the elementary operations through the ALU, the CPU performs its complex arithmetic.

Microcode instructions move data between the CPU and the memory controller. Memory controller microcode instructions manipulate two registers. The memory address register is used to access each memory cell's address. The memory data register is used to set and read each cell's contents.[152]
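
A hypothetical Python sketch of the two registers: the memory address register (MAR) selects a cell and the memory data register (MDR) carries the value being written to or read from it. The tiny ram list and the helper functions are illustrative, not part of any cited design.

ram = [0] * 16                 # a tiny illustrative memory of 16 cells

def write_cell(address, value):
    mar, mdr = address, value  # load the address and data registers
    ram[mar] = mdr             # store MDR at the cell MAR points to

def read_cell(address):
    mar = address              # load the address register
    mdr = ram[mar]             # copy the selected cell into MDR
    return mdr

write_cell(3, 42)
print(read_cell(3))            # 42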

Notes

[edit]

References

[edit]
from Grokipedia
A computer program is a syntactic unit that conforms to the rules of a particular programming language and is composed of declarations, statements, or equivalent syntactical units, intended to direct a computer system to execute specific operations or computations. These instructions are typically written by human programmers, often with assistance from artificial intelligence tools, and can range from simple algorithms to complex systems that manage hardware resources or process vast datasets. In essence, a computer program translates human intent into machine-executable actions, forming the core of all software functionality. The origins of computer programs trace back to the 19th century with Charles Babbage's designs for mechanical computing devices, such as the Analytical Engine, which envisioned programmable operations using punched cards for input. In 1843, Ada Lovelace, collaborating with Babbage, authored extensive notes on the Analytical Engine that included what is widely recognized as the first algorithm intended for a general-purpose computer—a method to compute Bernoulli numbers—establishing her as the world's first computer programmer. Late 19th-century developments, like the punched-card systems used in tabulating machines by Herman Hollerith for the 1890 U.S. Census, further advanced the concept of stored instructions, laying groundwork for modern programming. The mid-20th century marked a pivotal shift with electronic computers, such as ENIAC in 1945, where programming involved manual rewiring or switch settings, evolving rapidly to stored-program architectures exemplified by John von Neumann's 1945 report on EDVAC. In contemporary computing, computer programs underpin virtually every digital technology, from operating systems like Linux that orchestrate hardware interactions to applications such as web browsers and machine learning models. They are developed using diverse programming languages—high-level ones like Python for readability and low-level ones like assembly for direct hardware control—and are either compiled into machine code for efficiency or interpreted for flexibility. The creation of programs follows structured processes, including design, coding, testing, and maintenance, often employing tools like integrated development environments (IDEs) to enhance productivity. As software permeates society, programs enable innovations in fields like artificial intelligence, cybersecurity, and scientific simulation, while raising challenges in reliability, security, and ethical use.

Definition and Fundamentals

Core Definition

A computer program is a syntactic unit that conforms to the rules of a particular programming language and is composed of declarations, statements, or equivalent syntactical units, intended to direct a computer system to execute specific operations or computations. This set of ordered operations enables the machine to execute tasks ranging from simple calculations to complex simulations, forming the executable core of computational processes. Unlike hardware, which consists of the physical components of a computing system such as processors and memory devices, a computer program is non-physical and intangible, serving as the directives that control hardware behavior. A program implements algorithms, which are abstract, language-independent procedures for solving problems, by providing a concrete representation in a specific programming language. Data structures refer to specialized formats for organizing and storing data within a program, supporting but distinct from the instructional logic that manipulates them. While software encompasses programs along with associated data, documentation, and configurations, a program specifically denotes the instructional sequence itself. Programs typically exhibit properties such as finiteness (terminating after a finite number of steps), determinism (producing the same outputs for the same inputs under identical conditions), and executability (capable of being run on compatible hardware, either directly or after compilation). These ensure reliable computation, though some programs may incorporate non-determinism for specific purposes like simulation or security. The term "computer program" evolved from early 20th-century references to "computing machines" for mechanical tabulation, gaining prominence in the mid-1940s with the advent of electronic stored-program computers, where instructions were held in modifiable memory, distinguishing modern digital usage from prior conceptual precursors.

Basic Components

A computer program consists of fundamental building blocks that enable it to process information: instructions that perform operations, control structures that dictate the order of execution, and data elements that store and represent information. These components interact to form a coherent sequence of actions, allowing the program to achieve its intended purpose through systematic manipulation of data. Instructions form the core operational units of a program, specifying actions such as arithmetic computations (e.g., addition or multiplication), logical operations (e.g., comparisons or boolean evaluations), and input/output tasks (e.g., reading from a device or writing to a display). Arithmetic instructions handle numerical manipulations, logical instructions evaluate conditions or perform bitwise operations, and I/O instructions facilitate interaction with external systems or users. These instructions are executed by the computer's central processing unit, which interprets and carries them out sequentially or as directed. Control structures govern the flow of execution, determining which instructions run and in what order based on program logic. The primary types include sequence, where instructions execute one after another in a linear fashion; selection, such as conditional statements (e.g., if-then-else) that branch execution depending on whether a condition evaluates to true or false; and repetition, such as loops (e.g., while or for) that repeat a block of instructions until a specified condition is met. These structures ensure that programs can adapt to varying inputs and conditions without rigid linearity. Data elements provide the information that instructions operate upon, including variables, which are named storage locations whose values can change during execution; constants, which are fixed values that remain unchanged; and data types that define the nature of the data, such as integers for whole numbers, strings for textual sequences, or booleans for true/false states. Variables and constants maintain the program's state, representing the current configuration of data at any point in execution. The program's state evolves as instructions modify these elements, influencing control structures and directing the overall execution flow toward completion or halting conditions, such as reaching the end of the instruction sequence or satisfying a termination criterion.

Simple Examples

Simple examples of computer programs can be illustrated using pseudocode, a high-level description that outlines the logic of a program without adhering to a specific programming language's syntax. This approach highlights fundamental operations such as input, processing, and output in a clear, step-by-step manner. A basic example is a program to add two numbers. The pseudocode for this task is as follows:

BEGIN
    INPUT num1
    INPUT num2
    result ← num1 + num2
    OUTPUT result
END

This example demonstrates a sequential structure where the program first accepts two numerical inputs from the user (input phase), performs arithmetic addition (processing phase), and then displays the sum (output phase). Execution proceeds linearly: upon running, it prompts for num1 and num2, computes the result immediately after both are provided, and terminates after outputting the value, ensuring a finite and predictable flow. Another fundamental example involves a loop to print numbers from 1 to 10, introducing repetition to handle multiple iterations efficiently. The pseudocode is:

FOR i ← 1 TO 10
    OUTPUT i
END FOR

Here, the loop initializes a counter i to 1, checks the condition i ≤ 10 before each iteration, outputs the current value of i, increments i by 1, and repeats until the condition fails after i = 10. This structure processes a fixed number of steps (input is the loop bounds, processing involves counting and output, and the final output is the sequence 1 through 10), showcasing how loops automate repetitive tasks while maintaining control through a termination condition. Common pitfalls in such basic programs often arise from errors in control flow, such as infinite loops, where the termination condition never becomes false. For instance, consider this flawed pseudocode intended to process positive numbers but lacking proper decrement:

i ← 1
WHILE i > 0
    OUTPUT i
    // Missing: i ← i - 1
END WHILE

In execution, i starts at 1 and the condition i > 0 holds true indefinitely since i is not updated inside the loop, causing endless output of 1 and preventing the program from terminating. This error underscores the need for careful management of loop variables to ensure progression toward the exit condition.
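
For comparison, here is a direct Python rendering of the three pseudocode sketches above (illustrative; the variable names simply follow the pseudocode, and the third loop includes the missing update so that it terminates).

# 1. Add two numbers: input, processing, output.
num1 = float(input("Enter num1: "))
num2 = float(input("Enter num2: "))
result = num1 + num2
print(result)

# 2. Print the numbers 1 through 10 with a counted loop.
for i in range(1, 11):
    print(i)

# 3. The flawed WHILE loop with the missing update restored.
i = 1
while i > 0:
    print(i)
    i = i - 1   # without this line, i never changes and the loop never ends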

Historical Development

Pre-Digital Concepts

The precursors to modern computer programs emerged in the 19th and early 20th centuries through mechanical designs and theoretical models that formalized the concepts of instructions, computation, and universality. These ideas laid the groundwork for programmable machines and abstract notions of computability, predating electronic implementation. Charles Babbage proposed the Analytical Engine in 1837 as a general-purpose mechanical computer capable of performing any calculation through a sequence of programmable operations. The design featured an arithmetic unit for executing operations like addition and multiplication, a central store for holding numbers and intermediate results, and a control flow mechanism to sequence instructions. Programs for the Analytical Engine were to be encoded on punched cards, inspired by Jacquard looms, where holes represented specific instructions or data, allowing for loops, conditional branching, and reusable subroutines. This punched-card system represented an early form of stored instructions, enabling the machine to follow a predefined sequence rather than performing fixed calculations. Ada Lovelace, collaborating with Babbage, expanded on these ideas in her 1843 notes accompanying a translation of Luigi Menabrea's article on the Analytical Engine. In Note G, she detailed the first published algorithm intended for a machine: a method to compute Bernoulli numbers using the engine's operations, including a step-by-step table of card instructions for division, multiplication, and variable manipulation. Lovelace recognized the engine's potential beyond numerical computation, envisioning it as capable of manipulating symbols like those in music or graphics, thereby articulating the generality of programmable machines. Her work emphasized that the machine's output depended on the input instructions, foreshadowing the programmable nature of software. In the 1930s, theoretical foundations for computation shifted to abstract models independent of physical machinery. Alan Turing introduced the universal Turing machine in his 1936 paper "On Computable Numbers, with an Application to the Entscheidungsproblem," defining an idealized device that could simulate any algorithmic process on a tape using a finite set of states and symbols. This machine reads and writes symbols, moves left or right, and alters its state based on a table of rules, embodying the concept of a program as a finite description of computable functions. The universal variant reads an encoded description of another Turing machine's rules on its tape, effectively running any program, thus establishing computability as the execution of arbitrary instructions. Concurrently, Alonzo Church developed lambda calculus in the 1930s as a formal system for expressing functions and computation through abstraction and application. Introduced in papers from 1932 onward and refined in 1936, it uses lambda abstractions to define anonymous functions, such as λx.x for the identity function, allowing computation via substitution rules like beta-reduction. Church proved lambda calculus equivalent to Turing machines in expressive power, providing an alternative foundation for recursion and effective calculability without mechanical specifics. This functional approach influenced later understandings of programs as compositions of higher-order functions.

Early Electronic Computers

The development of early electronic computers marked a pivotal transition from mechanical and electromechanical devices to fully electronic systems capable of high-speed computation, though programming remained manual and hardware-dependent. One of the earliest examples was the Colossus, designed by engineer Tommy Flowers at the British General Post Office Research Station and completed in December 1943. This machine was built specifically for cryptanalytic tasks during World War II, aiding in the decryption of German Lorenz cipher (Tunny) messages sent between high command. Colossus was semi-programmable, configured by operators using switches and plug panels to set Boolean functions, counters, and connections, allowing reconfiguration for variations in the cryptanalysis process without altering its core wiring. It employed approximately 1,600 vacuum tubes in its Mark I version and processed paper tape inputs at speeds up to 5,000 characters per second, significantly accelerating code-breaking efforts that contributed to Allied intelligence successes. In the United States, the ENIAC (Electronic Numerical Integrator and Computer), completed in 1945 by John Mauchly and J. Presper Eckert at the University of Pennsylvania's Moore School of Electrical Engineering under U.S. Army contract, represented the first general-purpose electronic digital computer. Initially developed to compute artillery firing tables for the Ballistic Research Laboratory, ENIAC performed complex ballistics calculations that would have taken weeks on mechanical calculators, using 18,000 vacuum tubes to execute up to 5,000 additions per second. Like Colossus, ENIAC was programmed manually via plugboards, cables, and switches, where operators physically rewired panels to define the flow of data and operations among its 40 functional units, akin to setting up a telephone switchboard. This approach enabled flexibility for numerical problems but required meticulous planning using flowcharts and diagrams to avoid errors in the intricate wiring. The wiring-based programming of these machines imposed significant limitations, primarily due to the absence of internal program storage, which forced complete hardware reconfiguration for each new task. Reprogramming ENIAC, for instance, often necessitated shutting down the system and manually adjusting thousands of plugs and switches, a process that could consume days or even weeks of labor by a team of technicians. Such reconfiguration times hindered rapid iteration and scalability, as changes risked introducing faults in the physical connections, and the machines' vast size—ENIAC occupied 1,800 square feet and weighed 30 tons—exacerbated maintenance challenges. These constraints highlighted the need for more efficient programming methods, setting the stage for conceptual advances. In June 1945, John von Neumann drafted a report on the proposed EDVAC (Electronic Discrete Variable Automatic Computer), outlining initial ideas for a stored-program architecture where instructions and data would reside in the same modifiable memory, addressing the reconfiguration bottlenecks of machines like ENIAC. This influential document, circulated among computing pioneers, synthesized wartime experiences and theoretical foundations, including influences from Alan Turing's earlier universal machine concepts, to propose a unified framework for electronic computing. 
Although EDVAC itself was not completed until 1952, von Neumann's report catalyzed the shift toward programmable systems that could alter their own instructions electronically, fundamentally transforming computer program design.

Stored-Program Era

The stored-program era marked a pivotal transition in computer programming, beginning in the late 1940s, where instructions for computation were stored in the same electronic memory as data, allowing machines to execute and modify programs dynamically without hardware reconfiguration. This innovation addressed the limitations of earlier electronic computers, which relied on fixed wiring or manual plugboards for task-specific setups, by treating programs as modifiable data accessible by the central processing unit (CPU). The concept enabled general-purpose computing, where a single machine could perform diverse calculations by simply loading different instruction sets into memory. The first demonstration of this approach occurred with the Manchester Baby, also known as the Small-Scale Experimental Machine (SSEM), developed at the University of Manchester. Completed in June 1948, it successfully executed its initial stored program on June 21, 1948, which found the highest proper factor of 2^18. Built by Frederic C. Williams and Tom Kilburn using Williams-Kilburn tube memory for both data and instructions, the Baby featured a rudimentary CPU capable of basic arithmetic and control operations, with programs entered via switches and stored in its 32-word memory. This prototype, though severely limited by its small scale, proved the feasibility of electronic stored-program execution. The theoretical foundation for this era was formalized in John von Neumann's 1945 "First Draft of a Report on the EDVAC," which outlined an architecture comprising a CPU for processing, a single main memory for storing both programs and data (treated uniformly as binary sequences), and input/output mechanisms for external interaction. In this design, the CPU fetches instructions sequentially from memory, decodes them, and operates on data from the same addressable space, enabling programs to be loaded, altered, or even self-modified during execution. This unified memory model, often termed the von Neumann architecture, became the blueprint for subsequent computers, emphasizing sequential instruction processing and random access to memory contents. Building on these advances, the Electronic Delay Storage Automatic Calculator (EDSAC) at the University of Cambridge emerged as the first practical stored-program computer in 1949. Designed by Maurice Wilkes and operational from May 6, 1949, EDSAC used mercury delay-line memory to hold up to 1,024 17-bit words for instructions and data, supporting subroutine libraries for reusable code segments. It ran its debut program to compute a table of squares, demonstrating reliable operation for scientific calculations and establishing a routine computing service. EDSAC's initial subroutines, documented in Wilkes' 1951 book Preparation of Programs for an Electronic Digital Computer, facilitated modular programming. The stored-program paradigm profoundly enhanced software reusability and debugging. By storing instructions in modifiable memory, programs could be easily swapped or reused across sessions without rewiring, promoting the development of libraries and general-purpose applications that accelerated scientific and engineering computations. Debugging benefited from direct memory inspection and alteration, allowing programmers to trace errors, insert test instructions, or patch code in real time, which reduced development cycles compared to hardware-dependent methods.
This flexibility laid the groundwork for modern software engineering practices.

Integrated Circuit Revolution

The integrated circuit (IC) revolution began building on the 1947 invention of the transistor at Bell Laboratories, but gained momentum in the late 1950s with the development of ICs that allowed multiple transistors and components to be fabricated on a single semiconductor chip. In 1958, Jack Kilby at Texas Instruments demonstrated the first IC prototype by integrating transistors, resistors, and capacitors on a germanium substrate, proving the feasibility of monolithic construction. Independently, in 1959, Robert Noyce at Fairchild Semiconductor patented a practical monolithic IC using silicon and the planar process, which enabled high-volume manufacturing by interconnecting components with aluminum lines over an oxide layer. These innovations dramatically reduced the size, cost, and power consumption of electronic circuits, paving the way for more sophisticated computer programs by supporting greater computational density. A pivotal observation came in 1965 when Gordon Moore, then director of research at Fairchild Semiconductor, formulated what became known as Moore's Law in his article "Cramming More Components onto Integrated Circuits." Moore predicted that the number of transistors on an IC would double annually for at least a decade, driven by improvements in manufacturing techniques, thereby exponentially increasing processing power while decreasing costs. This foresight spurred the semiconductor industry to invest heavily in scaling, with the law later revised in 1975 to doublings every two years, but its original projection accurately captured the trajectory that amplified the capabilities of stored-program computers from the previous era. In the 1970s, advancements in Very Large Scale Integration (VLSI)—which involved fabricating thousands to millions of transistors on a single chip—enabled the creation of microprocessors, single-chip central processing units that revolutionized programming by integrating control logic, arithmetic units, and memory interfaces. A landmark example was the Intel 4004, released in 1971 as the world's first commercially available microprocessor, containing 2,300 transistors on a 10-micrometer process and operating at 740 kHz to power a calculator. VLSI techniques, refined through nMOS technology, allowed such chips to handle complex instructions efficiently, supporting the development of more intricate software for embedded systems and early personal computers. The x86 architecture, introduced with the Intel 8086 microprocessor in 1978, exemplified the ongoing evolution of IC-based processors, featuring a 16-bit data path and 29,000 transistors that became the foundation for personal computing. Subsequent iterations, such as the 80286 (1982), 80386 (1985), and later Pentium and Core series processors into the 21st century, maintained backward compatibility through multiple operating modes, ensuring that binary executables compiled for earlier x86 chips could run on newer ones with minimal modification. This design choice enhanced program portability, allowing software ecosystems to persist across hardware generations and fostering widespread adoption of standardized applications. As IC density and processor speeds surged under Moore's Law, the 1960s and 1970s witnessed a shift toward high-level programming languages, which abstracted hardware details and improved developer productivity over low-level assembly code. 
Enhanced compiler technology, made viable by faster hardware, efficiently translated languages like FORTRAN (evolving since 1957) and emerging ones such as C (1972) into optimized machine code, reducing the need for programmers to manage transistor-level operations directly. This transition enabled the creation of larger, more reliable programs for diverse applications, from scientific simulations to operating systems, as the underlying hardware's raw power compensated for the interpretive overhead of high-level constructs.

Programming Languages and Paradigms

Generations of Languages

The concept of generations in programming languages refers to the progressive levels of abstraction from hardware, beginning with direct machine instructions and evolving toward more human-readable and problem-oriented constructs. This classification, first formalized in the 1970s, highlights how each generation built upon the previous to reduce programming complexity and improve productivity. First-generation languages (1GL), emerging in the 1940s, consisted of machine code written in binary form—sequences of 0s and 1s that directly corresponded to a computer's instruction set and were executed without translation. These languages were tied to specific hardware architectures, such as those in early electronic computers like the ENIAC, making programming labor-intensive and error-prone as programmers had to manage low-level details like memory addresses manually. Second-generation languages (2GL), introduced in the 1950s, marked an improvement through assembly languages that used mnemonic codes and symbolic names to represent machine instructions, which were then translated into binary by an assembler. Examples include early assemblers for computers like the IBM 701, allowing programmers to work with human-readable abbreviations like "ADD" instead of binary opcodes, though still requiring knowledge of the underlying hardware. Third-generation languages (3GL), developed from the late 1950s onward, introduced high-level abstractions closer to natural language and mathematical notation, compiled or interpreted into machine code to enable portable, procedural programming. Fortran, released in 1957 by IBM, exemplified this generation as the first widely used high-level language for scientific computing, featuring structured control flow and variables that hid hardware specifics. Fourth-generation languages (4GL), arising in the 1970s, focused on domain-specific tasks with even higher abstraction, often non-procedural and oriented toward data manipulation rather than step-by-step instructions. SQL, developed in 1974 at IBM as SEQUEL for relational database querying, became a cornerstone example, allowing users to specify what data to retrieve without detailing how the operations were performed. Fifth-generation languages (5GL), also from the 1970s, emphasized logic-based and AI-driven paradigms where programs define rules and goals rather than explicit procedures, aiming to support knowledge representation and inference. Prolog, created in 1972 by Alain Colmerauer and Philippe Roussel at the University of Aix-Marseille, with theoretical contributions from Robert Kowalski at the University of Edinburgh, pioneered this approach using first-order logic for declarative programming in artificial intelligence applications.

Imperative Languages

Imperative programming is a programming paradigm that uses statements to change a program's state, focusing on describing how a computation is performed through explicit sequences of commands, including assignments, loops, and conditional branches. This approach relies on mutable variables and direct control over the flow of execution, enabling programmers to specify the step-by-step operations needed to achieve a result. Fortran, developed by IBM in 1957 as the first high-level programming language, exemplifies imperative programming tailored for scientific and engineering computations. It introduced fixed-form syntax where statements were formatted in specific columns, facilitating numerical calculations through imperative constructs like DO loops for iteration and assignment statements for variable updates. Fortran's design emphasized efficiency in mathematical operations, becoming a cornerstone for simulations and data analysis in research environments. COBOL, released in 1959 by the U.S. Department of Defense through a consortium including IBM and others, targeted business data processing with an imperative style using verbose, English-like syntax to make programs readable for non-technical users. Its structure featured imperative control flow via PERFORM statements for procedures and IF-THEN-ELSE for decisions, along with data division sections to manage records and files explicitly. COBOL's focus on sequential processing of business transactions ensured its widespread adoption in financial and administrative systems. C, created by Dennis Ritchie at Bell Labs in 1972, advanced imperative programming for systems-level tasks, providing low-level access through pointers and manual memory management while abstracting hardware details. Programmers use imperative constructs such as while loops, for iterations, and assignment operators to manipulate memory addresses directly, enabling efficient operating system and compiler development. C's portability and performance made it foundational for Unix and subsequent software infrastructure. Building on C's imperative foundation, C++ emerged in 1985 under Bjarne Stroustrup at Bell Labs, initially as "C with Classes" to extend systems programming with object-oriented features while retaining core imperative control structures like loops and conditionals. This evolution preserved explicit state management through assignments and pointers but introduced mechanisms for abstraction, influencing modern software development without altering the paradigm's step-by-step essence. As a third-generation language, C++ exemplifies how imperative principles scaled to complex applications.

Declarative Languages

Declarative programming is a paradigm in which programs describe the desired results or properties of the computation, leaving the details of control flow and execution strategy to the underlying system, such as an interpreter or compiler. This contrasts with imperative approaches by emphasizing what should be achieved rather than how to achieve it step by step, often leveraging mathematical logic or functional specifications to enable automatic optimization and reasoning about the program. Lisp, developed in 1958 by John McCarthy at MIT, pioneered declarative elements through its focus on list processing and symbolic computation, where programs manipulate symbolic expressions as data, treating code and data uniformly via s-expressions. This homoiconic design allows declarative specification of computations on nested lists, supporting recursive definitions that abstract away low-level operations. Lisp introduced automatic garbage collection in 1959, a declarative memory management mechanism that relieves programmers from explicit allocation and deallocation, enabling higher-level focus on logic. Prolog, created in 1972 by Alain Colmerauer and Philippe Roussel at the University of Aix-Marseille, with theoretical contributions from Robert Kowalski at the University of Edinburgh, embodies declarative logic programming by allowing users to define facts, rules, and queries in first-order logic, with the system handling inference through unification—matching terms to bind variables—and backtracking to explore solution spaces automatically. Programs in Prolog specify relationships and constraints declaratively, and the interpreter acts as a theorem prover to derive results, making it ideal for knowledge representation and automated reasoning without procedural control. ML (Meta Language), introduced in 1973 by Robin Milner at the University of Edinburgh as part of the LCF theorem-proving system, exemplifies declarative functional programming with strong static typing and pattern matching, where functions are defined by their input-output mappings and case analysis on data structures. Its polymorphic Hindley-Milner type inference system automatically deduces types declaratively, ensuring safety without explicit annotations and allowing concise specifications of generic algorithms. This enables programmers to focus on mathematical function definitions, with the compiler managing evaluation order and optimizations.

Object-Oriented and Functional Paradigms

Object-oriented programming (OOP) emerged as a paradigm in the 1970s, pioneered by Alan Kay and his team at Xerox PARC through the development of Smalltalk, which introduced a model where programs are composed of interacting objects that encapsulate data and behavior. In this approach, objects communicate via message passing, enabling flexible and extensible systems. The core principles of OOP include encapsulation, which bundles data and methods within objects to restrict direct access and promote data integrity; inheritance, allowing new classes to derive properties and behaviors from existing ones for code reuse; and polymorphism, enabling objects of different classes to be treated uniformly through overridden methods. These principles foster modularity by organizing code into self-contained units, simplifying maintenance and scalability in large software systems. Functional programming, in contrast, emphasizes computation as the evaluation of mathematical functions, avoiding mutable state and side effects to ensure predictability and composability. Key concepts include pure functions, which produce the same output for the same input without modifying external state; immutability, where data structures cannot be altered after creation, reducing errors from unintended changes; and higher-order functions, which accept or return other functions to enable abstraction and reuse. Haskell, first defined in a 1990 report by a committee including Paul Hudak and Philip Wadler, exemplifies a purely functional language with lazy evaluation and strong typing, supporting these concepts in practical applications. Hybrid languages integrate OOP and functional paradigms to leverage strengths from both, such as Scala's 2004 release by Martin Odersky, which combines object-oriented features like classes and inheritance with functional elements including first-class functions and immutability on the JVM. This fusion allows developers to model complex systems with modular objects while benefiting from functional purity for reliable concurrency. OOP's modularity aids in decomposing programs into reusable components, enhancing extensibility in evolving projects, while functional programming's immutability facilitates safe parallelism by eliminating race conditions, enabling efficient execution on multicore systems without synchronization overhead.
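
A compact Python sketch of both styles (all class and function names hypothetical): the classes illustrate encapsulation, inheritance, and polymorphism, while the final function is pure and uses a higher-order map over an immutable tuple.

class Shape:
    def area(self):                      # shared interface
        raise NotImplementedError

class Square(Shape):                     # inheritance
    def __init__(self, side):
        self._side = side                # encapsulated state

    def area(self):                      # polymorphic override
        return self._side * self._side

class Circle(Shape):
    def __init__(self, radius):
        self._radius = radius

    def area(self):
        return 3.14159 * self._radius ** 2

shapes = (Square(2), Circle(1))          # immutable tuple of objects
print([s.area() for s in shapes])        # each object answers the same call differently

def total_area(figures):                 # pure function: no mutation, no side effects
    return sum(map(lambda s: s.area(), figures))   # higher-order use of map

print(total_area(shapes))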

Program Structure and Elements

Syntax and Semantics

In programming languages, syntax refers to the set of rules that define the valid structure and form of programs, specifying how symbols, keywords, and tokens must be arranged to form well-formed expressions and statements. These rules ensure that a program can be parsed by a compiler or interpreter without structural violations. Syntax is typically divided into lexical rules, which govern the formation of basic tokens such as identifiers, numbers, and operators from individual characters, and grammatical rules, which describe how these tokens combine into higher-level constructs like statements and blocks. Grammars provide a formal way to specify syntax, often using context-free grammars (CFGs) that generate valid program strings through production rules. A seminal notation for expressing such grammars is Backus-Naur Form (BNF), introduced by John Backus and Peter Naur in the context of the ALGOL 60 language specification. BNF uses a simple recursive structure with non-terminal symbols (represented by angle brackets) on the left of production rules and sequences of terminals or non-terminals on the right, as in the rule <expression> ::= <term> | <expression> + <term>, allowing hierarchical descriptions of language structure without ambiguity in most cases. Semantics, in contrast, concerns the meaning of syntactically valid programs, defining what computations they perform and what results they produce. Operational semantics describes this meaning by specifying the execution steps of a program, often through transition rules that model how states evolve, such as in small-step or big-step evaluations. This approach, formalized by Gordon Plotkin in his structural operational semantics (SOS), uses inference rules to define transitions like program configurations reducing to new configurations, enabling precise simulation of language behavior. Denotational semantics assigns mathematical objects, such as functions or domains, to language constructs to capture their meaning compositionally, independent of execution details. Developed by Dana Scott and Christopher Strachey, this method maps programs to elements in a semantic domain, where, for example, an expression denotes a function from environments to values, facilitating proofs of equivalence and correctness. Type systems classify values and expressions according to their expected behavior, enforcing rules to prevent invalid operations and enhance program reliability. Static typing checks and infers types at compile time, catching mismatches early, as in languages where variable declarations specify types explicitly or via inference. Dynamic typing defers these checks to runtime, allowing more flexibility but potentially leading to delayed errors. Independently, strong typing prohibits implicit conversions between incompatible types, requiring explicit casts to avoid unintended behavior, while weak typing permits automatic coercions, which can simplify code but introduce subtle bugs. For instance, strong typing might reject adding an integer to a string without conversion, whereas weak typing could implicitly convert the integer to a string for concatenation. Common errors in programs include syntax violations, detected during parsing, such as mismatched parentheses or invalid keywords, which prevent compilation or interpretation. Semantic errors, however, arise from meaningful but incorrect interpretations, like type mismatches or undefined variables, which may compile but yield wrong results or crashes at runtime. 
These distinctions guide debugging, with syntax issues resolvable via structural fixes and semantic ones requiring logical corrections.
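
A short Python illustration of this distinction (a hedged example, not tied to any particular cited language): the first snippet is rejected by the parser before anything runs, while the second parses but fails at run time with a type error under Python's dynamic, strong typing.

# Syntax error: detected when the code is parsed, before execution.
source = "print('unbalanced'"            # missing closing parenthesis
try:
    compile(source, "<example>", "exec")
except SyntaxError as err:
    print("syntax error caught at parse time:", err.msg)

# Semantic (type) error: syntactically valid, rejected at run time.
try:
    total = 1 + "2"                      # strong typing: no implicit coercion
except TypeError as err:
    print("semantic error caught at run time:", err)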

Modularity and Modules

In computer programming, modularity refers to the practice of dividing a program into self-contained units known as modules, each encapsulating specific functionality and exposing it through well-defined interfaces such as functions, classes, or procedures. This approach allows developers to manage complexity by isolating related code, making the overall system easier to understand and modify. Modules serve as building blocks that can be developed, compiled, and maintained independently, while their interfaces specify how they interact with the rest of the program. The primary benefits of modularity include enhanced reusability, where a single module can be incorporated into multiple programs or projects without duplication, and improved testing isolation, which permits thorough examination of individual modules in isolation from the larger system. These advantages reduce development time and errors, as changes to one module do not necessitate widespread revisions elsewhere. For instance, reusable modules facilitate parallel development by teams and simplify debugging by localizing issues to specific units. Techniques for implementing modularity vary across programming languages but commonly involve libraries—collections of pre-compiled modules—and namespaces, which organize code to avoid naming conflicts by creating distinct scopes. In Python, for example, modules are typically defined in separate files that can be imported into other scripts, each maintaining its own private namespace for variables, functions, and classes. This structure supports the creation of extensible programs where external libraries, like NumPy for numerical computations, can be seamlessly integrated. Central to effective modularity is the information hiding principle, which emphasizes concealing a module's internal implementation details while revealing only the essential interface to users. Introduced by David Parnas, this principle ensures that modifications to a module's internals do not affect dependent code, thereby enhancing system flexibility and long-term maintainability. This aligns briefly with encapsulation in object-oriented paradigms, where data and methods are bundled within classes.
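
A minimal sketch of this mechanism in Python, assuming a hypothetical module file named geometry.py sits next to the importing script; the module's names are reached only through its namespace, keeping its internals behind a small interface.

# Contents of the hypothetical module geometry.py:
#
#     PI = 3.14159
#     def circle_area(radius):
#         return PI * radius * radius

import geometry                       # load the module into its own namespace

print(geometry.circle_area(2.0))      # access functions through the namespace
print(geometry.PI)                    # qualified names avoid naming conflicts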

Cohesion and Coupling

In software engineering, cohesion refers to the degree to which the elements of a module work together to achieve a single, well-defined purpose, while coupling measures the degree of interdependence between modules. High cohesion within modules promotes focused functionality, making them easier to understand, maintain, and reuse, whereas low coupling between modules minimizes ripple effects from changes, enhancing overall system flexibility. Cohesion is classified into several types, ordered from lowest to highest quality. Coincidental cohesion, the lowest form, occurs when module elements perform unrelated tasks grouped arbitrarily, leading to maintenance challenges due to lack of logical unity. In contrast, functional cohesion represents the highest level, where all elements contribute directly to a single, specific task, such as a module dedicated solely to computing interest rates on loans. Coupling types range from tight to loose, with tighter forms increasing complexity. Content coupling, the tightest, happens when one module directly accesses or modifies another's internal data, creating strong dependencies that complicate testing and updates. Loose coupling, exemplified by data coupling, involves modules interacting only through simple parameter passing without shared globals or control flags, allowing independent development and evolution. The ideal design balances high cohesion with low coupling, as this combination supports scalability by enabling modules to be added, removed, or modified with minimal impact on the system. For instance, refactoring a monolithic program—such as splitting a single file handling both data validation and user interface rendering into separate, functionally cohesive modules connected via data parameters—reduces coupling and improves maintainability without altering core logic.
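
A small illustrative refactoring in that spirit (all names hypothetical): the first pair of functions is tightly coupled through a shared global record, while the second pair communicates only through parameters and return values (data coupling), with each function focused on a single task.

# Tightly coupled: both functions reach into the same global record.
record = {"name": "  Ada ", "age": "36"}

def validate_global():
    record["name"] = record["name"].strip()
    record["age"] = int(record["age"])

def render_global():
    return f'{record["name"]} ({record["age"]})'

# Loosely coupled: each function receives and returns plain data.
def validate(raw):
    return {"name": raw["name"].strip(), "age": int(raw["age"])}

def render(clean):
    return f'{clean["name"]} ({clean["age"]})'

validate_global()
print(render_global())                                    # Ada (36)
print(render(validate({"name": "  Ada ", "age": "36"})))  # Ada (36)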

Software Development and Engineering

Development Models

The development of computer programs follows structured methodologies that guide teams through phases of planning, creation, testing, and deployment. These models have evolved from linear, sequential processes suited to well-defined projects to iterative, adaptive approaches that accommodate changing requirements and emphasize collaboration. Key models include the Waterfall model, Spiral model, Agile methodologies, and DevOps practices, each addressing different aspects of risk, flexibility, and efficiency in software engineering. The Waterfall model, introduced by Winston W. Royce in 1970, represents a traditional, sequential approach to program development. It consists of distinct phases executed in order: system requirements analysis to define needs, software requirements specification to detail functional and non-functional aspects, preliminary and detailed design to architect the program structure, coding and implementation to write the source code, testing and integration to verify functionality, and finally deployment and maintenance. Progress flows downward like a waterfall, with each phase producing deliverables that inform the next, and revisions generally discouraged once a phase is complete to maintain discipline in large-scale projects. This model assumes stable requirements and is effective for projects with clear upfront specifications, though it can lead to challenges if changes arise late. In contrast, the Spiral model, proposed by Barry Boehm in 1986, introduces iteration and risk management to address the limitations of purely linear processes. It structures development as a series of spirals, each cycle encompassing four quadrants: determining objectives and constraints, evaluating alternatives and identifying risks, developing and verifying a prototype or portion of the program, and planning the next iteration. Risk analysis is central, allowing teams to prototype high-risk elements early and refine based on feedback, making it suitable for complex, uncertain projects like large-scale systems where uncertainties could derail progress. Boehm emphasized that this risk-driven approach combines elements of prototyping and Waterfall, enabling progressive elaboration while mitigating potential failures through repeated evaluation. Agile methodologies, formalized in the 2001 Manifesto for Agile Software Development, shift toward flexibility and customer collaboration in program creation. The Manifesto outlines four core values—individuals and interactions over processes and tools, working software over comprehensive documentation, customer collaboration over contract negotiation, and responding to change over following a plan—supported by 12 principles that prioritize delivering valuable software early and continuously. Development occurs in short iterations called sprints, typically 1-4 weeks, where teams break requirements into user stories—concise descriptions of features from the end-user perspective—and incrementally build, test, and refine the program based on feedback. This iterative nature fosters adaptability, with practices like daily stand-ups and retrospectives ensuring alignment, and has been widely adopted for its ability to handle evolving needs in dynamic environments. DevOps practices, which gained prominence in the 2010s, integrate development and operations to streamline program lifecycle management beyond Agile's focus on coding. 
Originating from efforts to bridge silos between developers and IT operations, DevOps emphasizes automation and continuous processes, particularly continuous integration—where code changes are frequently merged and automatically tested—and continuous delivery, which ensures code is always in a deployable state for rapid releases. Jez Humble and David Farley detailed these in their 2010 work, advocating for deployment pipelines that automate building, testing, and staging to reduce errors and accelerate feedback loops. This integration supports high-velocity development, enabling organizations to deploy updates multiple times daily while maintaining reliability, and often incorporates infrastructure as code for scalable operations.

Performance and Optimization

Performance and optimization in computer programs focus on achieving efficiency in execution speed and resource consumption, ensuring that programs meet functional requirements while operating within practical constraints. Key metrics for evaluating performance include time complexity, which measures the number of computational operations an algorithm requires as a function of input size, and space complexity, which quantifies the amount of memory needed during execution, both typically expressed using Big O notation. Time complexity assesses how runtime scales with larger inputs—for instance, a linear search has O(n) time complexity, while binary search achieves O(log n)—allowing developers to predict scalability. Space complexity, distinct from time, evaluates auxiliary memory usage beyond the input; a non-recursive in-place quicksort can achieve O(1) auxiliary space, whereas the standard recursive version uses O(log n) on average, and merge sort requires O(n). These metrics, rooted in theoretical models like the Turing machine, provide a foundation for comparing algorithm efficiency without relying on specific hardware. Optimization techniques aim to improve these metrics through targeted interventions, such as selecting algorithms with superior complexity profiles. For example, replacing a quadratic O(n²) sorting algorithm with an O(n log n) one like heapsort can dramatically reduce execution time for large datasets, a practice emphasized in algorithm design literature. Profiling identifies performance bottlenecks by measuring execution times and resource usage at the code level, enabling developers to focus efforts on high-impact areas; quality-of-service profiling, for instance, highlights computation hotspots in parallel systems to guide optimizations. Caching enhances performance by storing frequently accessed data in fast memory, reducing retrieval latency—techniques like refreshable compact caching can boost runtime by up to 22.8% while minimizing memory overhead through compression and periodic eviction. Automated algorithm selection further refines this by using machine learning to choose the best solver for specific problem instances, improving overall efficiency in optimization tasks. However, optimizations often involve trade-offs with code readability and maintainability, as complex implementations for speed can obscure logic and increase debugging difficulty. Conflicts arise in industrial applications where performance enhancements, such as intricate data structures, compromise modularity, yet guidelines like modular design rules and modern execution platforms (e.g., just-in-time compilers) help balance these by preserving maintainability without sacrificing efficiency. Donald Knuth famously warned that "premature optimization is the root of all evil," advocating measurement-driven improvements over speculative tweaks to avoid unnecessary complexity—about 97% of code requires no such focus. These trade-offs extend to development costs, where intensive optimization may elevate initial effort but yield long-term savings in runtime resources. In modern contexts, particularly mobile and cloud computing, energy efficiency has emerged as a critical optimization goal alongside traditional metrics. Task offloading to edge servers in mobile systems reduces energy consumption more effectively than cloud offloading due to lower latency, with optimization studies showing significant power savings for real-time applications. 
Edge computing frameworks enable real-time energy management by preprocessing data at the edge, achieving up to 23.6% efficiency gains through load forecasting and resource coordination in distributed environments. These approaches address the growing demands of battery-constrained devices and sustainable data centers, integrating with profiling and caching to minimize power draw without compromising performance.
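
Two self-contained Python illustrations of these ideas (not drawn from the cited studies): replacing a linear O(n) search with an O(log n) binary search on sorted data, and caching repeated computations with functools.lru_cache.

import bisect
import timeit
from functools import lru_cache

data = list(range(1_000_000))            # already sorted

def linear_search(xs, target):           # O(n): scans every element
    for i, x in enumerate(xs):
        if x == target:
            return i
    return -1

def binary_search(xs, target):           # O(log n): requires sorted input
    i = bisect.bisect_left(xs, target)
    return i if i < len(xs) and xs[i] == target else -1

print(timeit.timeit(lambda: linear_search(data, 999_999), number=10))
print(timeit.timeit(lambda: binary_search(data, 999_999), number=10))

@lru_cache(maxsize=None)                 # cache: repeated calls are looked up, not recomputed
def fib(n):
    return n if n < 2 else fib(n - 1) + fib(n - 2)

print(fib(80))                           # fast because intermediate results are cached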

Programmer Role and Practices

Programmers, often referred to as software developers or engineers, play a central role in the creation and upkeep of computer programs by translating conceptual designs into functional code. Their primary responsibilities include analyzing user requirements, writing and implementing code using programming languages, and debugging issues to ensure reliability and performance. For instance, they design algorithms, develop prototypes, and integrate components to build applications that meet specified needs. Collaboration is integral to their work, involving coordination with designers, testers, and stakeholders to align software with broader project goals and incorporate feedback throughout development. This teamwork often occurs in agile environments where iterative reviews help refine code and resolve conflicts. To facilitate these responsibilities, programmers rely on specialized tools that streamline workflows and enhance productivity. Integrated development environments (IDEs), such as Microsoft Visual Studio and Eclipse, provide a unified platform combining code editors, compilers, debuggers, and testing tools, enabling real-time syntax checking, autocompletion, and error identification. Version control systems like Git, created in 2005 by Linus Torvalds to manage the Linux kernel's development after the withdrawal of BitKeeper's free access, allow multiple programmers to track changes, merge contributions, and revert modifications without overwriting others' work. Git's distributed nature supports non-linear development, making it essential for collaborative projects by handling branching and merging efficiently. Other tools, including Visual Studio Code, further extend these capabilities with extensions for language-specific support and integrated terminals. Economic considerations shape programmer practices, with a focus on optimizing development time to control upfront costs and budgeting for ongoing maintenance, which often dominates lifecycle expenses. Studies indicate that maintenance activities—such as updates, bug fixes, and adaptations to new environments—can consume up to 90% of a software product's total costs, far exceeding initial development outlays. Programmers must balance rapid prototyping to shorten development cycles, typically measured in weeks or months depending on project scale, against long-term sustainability to avoid escalating maintenance budgets that arise from poor initial design or inadequate testing. Professional practices emphasize quality assurance through structured processes like code reviews and rigorous documentation. Code reviews, a cornerstone of modern software engineering, involve peers examining code changes for defects, adherence to standards, and design improvements before integration, often using tools like GitHub pull requests to facilitate asynchronous feedback and reduce post-release bugs by over twofold compared to unreviewed code. This practice promotes knowledge sharing and collective ownership while being less resource-intensive than alternatives like pair programming. Documentation standards require programmers to maintain clear records of code functionality, APIs, and decisions, as mandated by ethical guidelines, to support maintenance, onboarding, and compliance; inadequate documentation can increase technical debt and hinder scalability. These practices, including thorough debugging and testing, ensure software reliability and ethical responsibility in professional settings.

Analysis techniques

Analysis techniques in computer programming are systematic methods for examining programs to detect errors, optimize performance, and establish correctness. They are often built into compilers or applied as standalone tools, and they range from inspecting code structure without running it to observing behavior at runtime, allowing developers to identify issues early and improve reliability. Key approaches include static and dynamic analysis, data flow analysis, formal verification, and supporting tools such as linters and debuggers.

Static analysis inspects program code without executing it, typically during compilation, to detect problems such as type mismatches, unused variables, or security vulnerabilities before runtime. Dynamic analysis, in contrast, evaluates the program as it executes, often using test cases or instrumentation to observe behavior, memory usage, and interactions that static methods might miss. Compilers rely heavily on static analysis for early error detection, while dynamic techniques are essential for uncovering runtime-specific issues such as race conditions.

Data flow analysis is a foundational static technique that tracks how data moves through a program by modeling variable definitions, uses, and propagation across control flow paths, primarily to support optimizations such as dead code elimination and constant propagation. The method represents a program as a control flow graph and applies lattice-based frameworks to compute facts such as reaching definitions or live variables at each point, allowing compilers to eliminate redundant computation. Pioneered in the late 1960s and early 1970s, it underlies many modern compiler optimizations.

Formal verification provides mathematical proofs that a program satisfies specified properties, going beyond empirical testing to guarantee the absence of certain classes of errors in critical systems. Model checking, a prominent automated technique developed in the early 1980s, exhaustively explores the states of a finite-state model to verify temporal logic properties, detecting deadlocks or safety violations and producing counterexamples when a property fails; it has been widely adopted in both hardware and software design.

Linters are static analysis tools that scan source code for stylistic inconsistencies, likely bugs, and coding-standard violations; they originate from the 1978 Lint program developed for C at Bell Labs to flag suspicious and non-portable constructs that the compiler itself did not check. Debuggers support dynamic analysis by letting programmers execute code step by step, set breakpoints, inspect variables, and trace execution paths at runtime; examples include GDB, which has supported multiple languages since 1986, and language-specific tools such as pdb for Python.
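As a minimal sketch of what such static checks report, the hypothetical Python function below contains an unused variable, a compile-time constant, and unreachable code: the kinds of issues a linter or data-flow analysis can flag without ever running the program.

def shipping_cost(weight_kg):
    base_rate = 4.50            # live variable: used in the expression below
    handling = 1.25             # dead store: assigned but never read (linter warning)
    TAX = 0.0                   # constant propagation would substitute 0.0 here
    total = base_rate * weight_kg + TAX * weight_kg
    return total
    print("done")               # unreachable statement after return (dead code)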

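Dynamic analysis with a debugger can be sketched just as briefly. In the hypothetical function below, the built-in breakpoint() call (Python 3.7+) pauses execution and opens the standard pdb debugger so variables can be inspected interactively.

def parse_record(line):
    fields = line.split(",")
    breakpoint()                 # pauses here and opens pdb by default
    return {"name": fields[0], "score": int(fields[1])}

# At the (Pdb) prompt, commands such as `p fields`, `next`, and `continue`
# inspect variables and step through execution one statement at a time.
parse_record("ada,42")
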
Categories of programs

Application programs

Application programs, also known as application software, are computer programs designed to perform specific tasks for end users, supporting productivity, entertainment, and other personal and professional functions through direct interaction with the user. They differ from system software in focusing on user-oriented tasks rather than managing hardware or underlying operations. Common categories include productivity tools, multimedia editors, and educational software, all intended to simplify complex tasks through intuitive interfaces.

Early examples include word processors such as Microsoft Word, first released on October 25, 1983, for Xenix systems, which helped popularize mouse-driven, WYSIWYG-style document editing. Another landmark is the NCSA Mosaic web browser, launched on April 22, 1993, which popularized web browsing by displaying inline images alongside hyperlinked text and paved the way for modern internet access. These standalone applications were typically distributed on physical media or as downloads and required local installation on personal computers.

Delivery models later evolved from standalone installation to cloud-based software as a service (SaaS), in which software is hosted remotely and accessed over the internet, reducing maintenance burdens for users. The shift gained momentum in the late 1990s with pioneers such as Salesforce, and a key milestone was the 2006 introduction of Google Docs, a SaaS platform for collaborative document editing that removed the need for local storage and enabled real-time sharing, marking a broader move toward subscription-based, scalable access in place of perpetual licenses.

The evolution continued with mobile application programs distributed through app stores as lightweight, device-specific software. Apple's iOS App Store, launched on July 10, 2008, with roughly 500 applications, transformed delivery by allowing downloads and updates directly to smartphones and extended application programs to touch-based, on-the-go use cases such as navigation and social networking. Mobile apps are now a dominant form of application software, blending standalone functionality with cloud integration for portability and user engagement.

System programs

System programs comprise the foundational software that manages computer hardware and system resources, with operating systems as the primary example. An operating system coordinates the use of hardware among application programs and users, providing services such as resource allocation and hardware abstraction. This coordination enables multitasking, in which multiple processes share the CPU through time-sharing, allowing concurrent execution and efficient use of resources.

At the heart of an operating system lies the kernel, which implements core functions including process scheduling, memory management, and inter-process communication. In process scheduling, the kernel selects which process receives CPU time and for how long, balancing throughput and responsiveness. Memory management handles the allocation, protection, and deallocation of memory for processes, preventing unauthorized access and making efficient use of physical and virtual memory.

Prominent examples include the Unix operating system, initially developed in 1969 at Bell Labs on the PDP-7 and refined into a multi-user, time-sharing system that emphasized simplicity and modularity in resource management. Similarly, the Linux kernel, created by Linus Torvalds in 1991 as a free, Unix-like kernel for personal computers, has grown into a widely used foundation for multitasking environments on diverse hardware platforms.

Device drivers act as modular extensions to the kernel, providing the interface to specific hardware such as peripherals and storage devices. Drivers translate high-level operating-system requests into device-specific operations, so the kernel can manage hardware without embedding device-specific code directly. Because they can be loaded dynamically, device drivers extend the system while preserving the integrity of the kernel's core.
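The time-sharing behavior described above can be illustrated with a toy simulation. The sketch below models round-robin scheduling in Python, with invented process names and burst times; it is an illustration of the idea only, not how a real kernel is implemented.

from collections import deque

def round_robin(bursts, quantum):
    """Simulate round-robin scheduling: each process runs for at most
    `quantum` time units before the CPU is handed to the next one."""
    ready = deque(bursts.items())          # (name, remaining time) pairs
    timeline = []
    while ready:
        name, remaining = ready.popleft()
        slice_used = min(quantum, remaining)
        timeline.append((name, slice_used))
        if remaining > slice_used:         # not finished: requeue at the back
            ready.append((name, remaining - slice_used))
    return timeline

# Hypothetical CPU bursts, in arbitrary time units.
print(round_robin({"editor": 5, "compiler": 3, "backup": 2}, quantum=2))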

Utility and low-level programs

Utility programs, often simply called utilities, are system software that analyzes, configures, optimizes, and maintains computer hardware and software to keep a system operating safely and efficiently. They are typically bundled with operating systems or installed separately and support overall system functionality rather than implementing end-user applications. Common examples include file managers, which provide graphical or command-line interfaces for creating, renaming, deleting, and organizing files and folders while managing permissions and storage locations. Compilers act as utilities by translating source code in high-level languages into machine-readable object code or executables, enabling other programs to be built and deployed. Backup tools automate data preservation and recovery, creating incremental or full copies on external drives, cloud storage, or mirrored systems to mitigate hardware failure, malware, and user error; they are frequently paired with compression utilities such as gzip, developed by Jean-loup Gailly and Mark Adler and first released on October 31, 1992, as part of the GNU Project to replace the Unix compress tool with more efficient data archiving and transmission.

Among utility programs, compilers and interpreters play central roles in program execution by bridging high-level code and hardware. A compiler translates an entire source program ahead of execution, producing an independent executable in machine code that runs efficiently without further translation, at the cost of an initial compilation step. An interpreter instead translates and executes code statement by statement at runtime, offering immediate feedback and easier debugging but generally slower performance because translation is repeated during execution. Many modern language implementations combine the two approaches, compiling source code to platform-independent bytecode that is then interpreted or just-in-time compiled.

Low-level programs operate closer to the hardware, with microcode as a foundational example of an intermediary layer inside central processing units (CPUs). Microcode consists of sequences of low-level instructions stored in the CPU's control store, typically read-only memory or flash, that implement higher-level machine instructions by steering the processor's internal circuits, such as registers and arithmetic logic units. This abstraction lets CPU designers implement complex instruction set computing (CISC) architectures on simpler, RISC-like hardware primitives and allows bug fixes, performance improvements, and feature additions to be delivered as updates without changing the silicon. In modern x86 processors, for instance, microcode translates user-visible instructions into internal operations, and microcode updates are loaded early in the boot process by the system's firmware.

Firmware is another class of low-level program, embedded directly in hardware devices to provide control and initialization routines that persist across power cycles. Stored in non-volatile memory such as EPROM or flash, firmware runs automatically at device startup to configure components, test their integrity, and hand off control to higher-level software. A prominent example is the Basic Input/Output System (BIOS), firmware on computer motherboards that initializes hardware during boot, detecting peripherals and loading the operating system, while also offering runtime services such as interrupt handling for device interactions. Modern replacements such as the Unified Extensible Firmware Interface (UEFI) extend BIOS capabilities with modular drivers and secure boot features to improve compatibility and security in contemporary systems.
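Returning to the gzip utility mentioned above, Python's standard gzip module reads and writes the same DEFLATE-based format as the command-line tool; the short sketch below, using a hypothetical file name, shows how a program might compress and re-read text data.

import gzip

# Write text through a gzip stream, then read it back; the resulting
# file is compatible with the command-line gzip/gunzip utilities.
with gzip.open("notes.txt.gz", "wt", encoding="utf-8") as f:
    f.write("Backups are cheaper than data loss.\n" * 100)

with gzip.open("notes.txt.gz", "rt", encoding="utf-8") as f:
    print(f.readline().strip())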
