Comma-separated values

Comma-separated values (CSV) is a plain-text data format that uses commas to separate values and newlines to separate records. CSV stores tabular data (numbers and text), where each line typically represents one data record. Each record consists of the same number of fields, separated by commas. If the field delimiter itself may appear within a field, that field can be surrounded with quotation marks. A CSV file is a file containing data in CSV format.
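The quoting rule can be illustrated with a minimal sketch using Python's standard csv module and its default dialect. The sample rows are invented for illustration. The doubling of embedded quotation marks shown here is that dialect's behavior and is not described in the text above.

```python
import csv
import io

# Illustrative data (made up): one field contains a comma,
# another contains quotation marks.
rows = [
    ["name", "city", "note"],
    ["Ada", "London", "likes tea, not coffee"],
    ["Bob", "Paris", 'said "bonjour"'],
]

# Write the rows as CSV text into an in-memory buffer.
buffer = io.StringIO()
csv.writer(buffer).writerows(rows)
text = buffer.getvalue()

# Read the same text back; quoted fields are restored intact.
buffer.seek(0)
parsed = list(csv.reader(buffer))

print(text)
print(parsed == rows)
```

Only the fields that need it are quoted; the reader recovers the original three fields per record even though one of them contains the delimiter.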

CSV is widespread in data applications and is supported by a wide variety of software, including common spreadsheet applications such as Microsoft Excel. Benefits cited in favor of CSV include human readability and the simplicity of the format.

The CSV file format was formalized in the 2005 technical standard RFC 4180, which defines the MIME type "text/csv" for the handling of text-based fields.

Comma-separated values is a data format that predates personal computers by more than a decade: the IBM Fortran (level H extended) compiler under OS/360 supported list-directed ("free form") input/output, with commas between values, in 1972. List-directed input/output was defined in FORTRAN 77, approved in 1978. List-directed input used commas or spaces for delimiters, so unquoted character strings could not contain commas or spaces.

The term "comma-separated value" and the "CSV" abbreviation were in use by 1983. The manual for the Osborne Executive computer, which bundled the SuperCalc spreadsheet, documents the CSV quoting convention that allows strings to contain embedded commas.

Comma-separated value lists were easier to type (for example onto punched cards) than fixed-column-aligned data, and they were less prone to producing incorrect results if a value was punched one column off from its intended location.
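The one-column-off problem can be sketched in a few lines. The record layout (two five-column integer fields) and the helper names parse_fixed and parse_commas are hypothetical, chosen only to illustrate the point; they do not reproduce any historical program.

```python
def parse_fixed(card):
    """Hypothetical fixed-column layout: quantity in columns 0-4, price in columns 5-9."""
    return int(card[0:5]), int(card[5:10])


def parse_commas(card):
    """Comma-delimited layout: the position of a value within the line does not matter."""
    quantity, price = card.split(",")
    return int(quantity), int(price)


aligned = "   42  199"   # both values in their intended columns
shifted = "   42   199"  # price typed one column too far to the right

print(parse_fixed(aligned))     # (42, 199)
print(parse_fixed(shifted))     # (42, 19): last digit falls outside the field
print(parse_commas("42,199"))   # (42, 199)
print(parse_commas("42, 199"))  # (42, 199): the extra space is harmless
```

In the fixed-column case the misaligned value is silently truncated, whereas the comma-delimited reader yields the same result regardless of alignment.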

Comma-separated files are used to interchange database information between machines of different architectures. The plain-text character of CSV files largely avoids incompatibilities such as byte order and word size. The files are largely human-readable, so it is easier to deal with them in the absence of perfect documentation or communication.
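As a rough illustration of the byte-order issue, the sketch below packs an arbitrarily chosen integer as a 4-byte binary value in both byte orders using Python's struct module, then compares that with a plain-text encoding of the same number.

```python
import struct

value = 1025  # arbitrary example value (0x00000401)

# The same integer as 4 binary bytes in each byte order.
big_endian = struct.pack(">i", value)     # b'\x00\x00\x04\x01'
little_endian = struct.pack("<i", value)  # b'\x01\x04\x00\x00'

# A little-endian reader that receives big-endian bytes gets a different number.
misread = struct.unpack("<i", big_endian)[0]

# As text, the digits are the same bytes on any architecture.
as_text = str(value).encode("ascii")      # b'1025'
recovered = int(as_text.decode("ascii"))

print(big_endian, little_endian, misread)
print(as_text, recovered)
```

The binary form needs both sides to agree on byte order and integer width; the text form only needs agreement on the character encoding.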

The main standardization initiative, which turned a fuzzy de facto definition into a more precise de jure one, came in 2005 with RFC 4180, which defines CSV as a MIME content type. Later, in 2013, some of RFC 4180's deficiencies were addressed by a W3C recommendation.
