Bencode

Bencode (pronounced like Bee-encode) is the encoding used by the peer-to-peer file sharing system BitTorrent for storing and transmitting loosely structured data.^[1]

It supports four different types of values:
- byte strings,
- integers,
- lists, and
- dictionaries (associative arrays).

Bencoding is most commonly used in torrent files, and as such is part of the BitTorrent specification. These metadata files are simply bencoded dictionaries.

Bencoding is simple and (because numbers are encoded as text in decimal notation) is unaffected by endianness, which is important for a cross-platform application like BitTorrent. It is also fairly flexible, as long as applications ignore unexpected dictionary keys, so that new ones can be added without creating incompatibilities.

Encoding Algorithm

Bencode uses ASCII characters as delimiters and digits to encode data structures in a simple and compact format.

Integers are encoded as i<base10 integer>e.
- The integer is encoded in base 10 and may be negative (indicated by a leading hyphen-minus).
- Leading zeros are not allowed unless the integer is zero.
- Examples:
  - Zero is encoded as i0e.
  - The number 42 is encoded as i42e.
  - Negative forty-two is encoded as i-42e.
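
The integer rule can be sketched in a few lines of Python. The helper name `encode_int` is illustrative and not taken from any specification or library. The sketch relies on Python's `str()` producing a base-10 form with no leading zeros and no negative zero, so only the delimiters need to be added.

```python
def encode_int(value: int) -> bytes:
    """Bencode an integer as i<base-10 digits>e (illustrative helper)."""
    # bool is a subclass of int in Python; reject it so True is not encoded as i1e.
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError("expected an int")
    return b"i" + str(value).encode("ascii") + b"e"


print(encode_int(0))    # b'i0e'
print(encode_int(42))   # b'i42e'
print(encode_int(-42))  # b'i-42e'
```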

Byte Strings are encoded as <length>:<contents>.
- The length is the number of bytes in the string, encoded in base 10.
- A colon (:) separates the length and the contents.
- The contents are the exact number of bytes specified by the length.
- The contents are a sequence of raw bytes, not a textual string; bencode attaches no character encoding to them.
- Examples:
  - An empty string is encoded as 0:.
  - The string "bencode" is encoded as 7:bencode.
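
As a minimal sketch of the byte string rule (the name `encode_bytes` is hypothetical), the length prefix is simply the byte count of the contents. The last example assumes UTF-8 for converting text to bytes, which is a choice made for illustration and not something bencode itself mandates.

```python
def encode_bytes(data: bytes) -> bytes:
    """Bencode a byte string as <length>:<contents> (illustrative helper)."""
    return str(len(data)).encode("ascii") + b":" + data


print(encode_bytes(b""))         # b'0:'
print(encode_bytes(b"bencode"))  # b'7:bencode'

# Text must be converted to bytes first; the prefix counts bytes, not characters.
text = "né"                                # 2 characters
print(encode_bytes(text.encode("utf-8")))  # b'3:n\xc3\xa9' (3 bytes in UTF-8)
```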

Lists are encoded as l<elements>e.
- Begins with l and ends with e.
- Elements are bencoded values concatenated without delimiters.
- Examples:
  - An empty list is encoded as le.
  - A list containing the string "bencode" and the integer -20 is encoded as l7:bencodei-20ee.

Dictionaries are encoded as d<pairs>e.
- Begins with d and ends with e.
- Contains key-value pairs.
- Keys are byte strings and must appear in lexicographical order of their raw bytes.
- Each key is immediately followed by its value, which can be any bencoded type.
- Examples:
  - An empty dictionary is encoded as de.
  - A dictionary with keys "wiki" → "bencode" and "meaning" → 42 is encoded as d7:meaningi42e4:wiki7:bencodee.

There are no restrictions on the types of values stored within lists and dictionaries; they may contain other lists and dictionaries, allowing for arbitrarily complex data structures.
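
Because containers may nest freely, an encoder is naturally recursive. The following is a sketch under stated assumptions, not a reference implementation: the function name `bencode` is illustrative, values are mapped from Python `int`, `bytes`, `list`, and `dict`, and dictionary keys are required to be `bytes` already.

```python
def bencode(value) -> bytes:
    """Recursively bencode ints, bytes, lists, and dicts with bytes keys."""
    if isinstance(value, bool):
        raise TypeError("bool is not a bencode type")
    if isinstance(value, int):
        return b"i" + str(value).encode("ascii") + b"e"
    if isinstance(value, bytes):
        return str(len(value)).encode("ascii") + b":" + value
    if isinstance(value, list):
        return b"l" + b"".join(bencode(item) for item in value) + b"e"
    if isinstance(value, dict):
        for key in value:
            if not isinstance(key, bytes):
                raise TypeError("dictionary keys must be byte strings")
        out = [b"d"]
        # Python compares bytes objects byte by byte, which yields the required key order.
        for key in sorted(value):
            out.append(bencode(key))
            out.append(bencode(value[key]))
        out.append(b"e")
        return b"".join(out)
    raise TypeError(f"cannot bencode {type(value).__name__}")


print(bencode([b"bencode", -20]))                      # b'l7:bencodei-20ee'
print(bencode({b"wiki": b"bencode", b"meaning": 42}))  # b'd7:meaningi42e4:wiki7:bencodee'
print(bencode({b"a": [1, {b"b": []}]}))                # b'd1:ali1ed1:bleeee'
```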

Bencode defines only byte string types, rather than any particular character encoding for storing text. Downstream applications and data format specifications that use bencode are free to specify whichever encoding they prefer for encoding text into bencoded byte strings.

Types of errors in Bencode

The following errors can occur in ill-formed bencode:

Null root value.
Non-singular root item.
Invalid type encountered (character not 'i', 'l', 'd', or '0'-'9').
Missing 'e' terminator for 'i', 'l', or 'd' types.
Integer errors:
1. Contains non-digit characters.
2. Has a leading zero.
3. Is negative zero.
Byte string errors:
1. Negative length.
2. Length not followed by ':'.
3. Unexpected EOF before completing string.
4. Length specified in units of codepoints (characters) rather than bytes.
Dictionary errors:
1. Key is not a string.
2. Duplicate keys.
3. Keys not sorted.
4. Keys sorted by codepoint in a particular character encoding, rather than lexicographically by raw byte value.
5. Missing value for a key.
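
A strict decoder can check for most of these conditions while parsing. The sketch below is one possible arrangement, not the behaviour of any particular library; the names `BencodeError`, `bdecode`, and `_parse` are hypothetical. It treats a negative length as an invalid type byte, since `-` cannot start a value, and it does not reject leading zeros in string lengths.

```python
class BencodeError(ValueError):
    """Raised for malformed input (hypothetical name used by this sketch)."""


def bdecode(data: bytes):
    """Decode exactly one bencoded value from data."""
    if not data:
        raise BencodeError("null root value")
    value, pos = _parse(data, 0)
    if pos != len(data):
        raise BencodeError("non-singular root item")
    return value


def _parse(data: bytes, pos: int):
    """Return (value, index just past the value) for the value starting at pos."""
    lead = data[pos:pos + 1]
    if lead == b"i":
        end = data.find(b"e", pos)
        if end == -1:
            raise BencodeError("missing 'e' terminator for integer")
        digits = data[pos + 1:end]
        body = digits[1:] if digits.startswith(b"-") else digits
        if not body.isdigit():
            raise BencodeError("integer contains non-digit characters")
        if body.startswith(b"0") and digits != b"0":
            raise BencodeError("leading zero or negative zero")
        return int(digits), end + 1
    if lead.isdigit():
        colon = data.find(b":", pos)
        if colon == -1 or not data[pos:colon].isdigit():
            raise BencodeError("length not followed by ':'")
        start = colon + 1
        stop = start + int(data[pos:colon])
        if stop > len(data):
            raise BencodeError("unexpected EOF before completing string")
        return data[start:stop], stop
    if lead == b"l":
        items, pos = [], pos + 1
        while data[pos:pos + 1] != b"e":
            if pos >= len(data):
                raise BencodeError("missing 'e' terminator for list")
            item, pos = _parse(data, pos)
            items.append(item)
        return items, pos + 1
    if lead == b"d":
        result, last_key, pos = {}, None, pos + 1
        while data[pos:pos + 1] != b"e":
            if pos >= len(data):
                raise BencodeError("missing 'e' terminator for dictionary")
            if not data[pos:pos + 1].isdigit():
                raise BencodeError("dictionary key is not a string")
            key, pos = _parse(data, pos)
            if last_key is not None and key == last_key:
                raise BencodeError("duplicate keys")
            if last_key is not None and key < last_key:
                raise BencodeError("keys not sorted")
            if data[pos:pos + 1] in (b"", b"e"):
                raise BencodeError("missing value for a key")
            result[key], pos = _parse(data, pos)
            last_key = key
        return result, pos + 1
    raise BencodeError("invalid type encountered")


print(bdecode(b"d7:meaningi42e4:wiki7:bencodee"))  # {b'meaning': 42, b'wiki': b'bencode'}
```

A length given in characters instead of bytes cannot be detected directly; in a parser like this it would typically surface as one of the other errors, because parsing resumes at the wrong offset.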

Features

Bencode is a very specialized kind of binary coding with some unique properties:

- For each possible (complex) value, there is only a single valid bencoding; i.e. there is a bijection between values and their encodings. This has the advantage that applications may compare bencoded values by comparing their encoded forms, eliminating the need to decode the values.
- Bencoding serves a similar purpose to data languages like JSON and YAML, allowing complex yet loosely structured data to be stored in a platform-independent way. Complex data can thus be stored as a single linear sequence of bytes.
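
The single-encoding property can be illustrated with a short Python sketch. The compact `bencode` function here is a hypothetical helper written for this example, and it assumes dictionary keys are already `bytes`. Two dictionaries built in different insertion orders encode to identical bytes, so they can be compared, or hashed, without decoding.

```python
import hashlib


def bencode(value) -> bytes:
    """Compact illustrative encoder for ints, bytes, lists, and dicts with bytes keys."""
    if isinstance(value, int) and not isinstance(value, bool):
        return b"i" + str(value).encode("ascii") + b"e"
    if isinstance(value, bytes):
        return str(len(value)).encode("ascii") + b":" + value
    if isinstance(value, list):
        return b"l" + b"".join(bencode(item) for item in value) + b"e"
    if isinstance(value, dict):
        # Sorting the bytes keys fixes the order, whatever the insertion order was.
        pairs = (bencode(key) + bencode(value[key]) for key in sorted(value))
        return b"d" + b"".join(pairs) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


first = {b"wiki": b"bencode", b"meaning": 42}
second = {b"meaning": 42, b"wiki": b"bencode"}  # same pairs, different insertion order

print(bencode(first) == bencode(second))  # True
print(hashlib.sha1(bencode(first)).digest() == hashlib.sha1(bencode(second)).digest())  # True
```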

Drawbacks

Bencode is not considered a human-readable encoding format. While the delimiters and length prefixes can be decoded by hand, bencoded values often contain binary data, so manual decoding is error-prone. Editing bencoded files in a text editor is unsafe for the same reason: the editor may alter the binary data or leave length prefixes out of step with their contents, so a hex editor or specialised bencode editor should be used.

Bencode does not store any metadata about the size of list or dictionary structures, requiring all preceding elements to be read sequentially in order to reach a particular field. As such, bencode may not be suitable for large data structures where random access to fields is required.
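
The cost of sequential access can be seen in a sketch of a field lookup that works directly on the encoded bytes. The names `skip_value` and `find_value_span` are hypothetical, and the code assumes well-formed input with a dictionary at the root. To reach a key, every earlier key and value has to be walked over, including the full contents of any nested list or dictionary.

```python
def skip_value(data: bytes, pos: int) -> int:
    """Return the index just past the bencoded value starting at pos."""
    lead = data[pos:pos + 1]
    if lead == b"i":
        return data.index(b"e", pos) + 1
    if lead in (b"l", b"d"):
        pos += 1
        # No element count is stored, so each element must be skipped in turn.
        while data[pos:pos + 1] != b"e":
            pos = skip_value(data, pos)
        return pos + 1
    colon = data.index(b":", pos)
    return colon + 1 + int(data[pos:colon])


def find_value_span(data: bytes, wanted: bytes):
    """Return (start, end) of the encoded value for a top-level key, or None."""
    if data[:1] != b"d":
        raise ValueError("root value is not a dictionary")
    pos = 1
    while data[pos:pos + 1] != b"e":
        colon = data.index(b":", pos)
        key_start = colon + 1
        key_end = key_start + int(data[pos:colon])
        value_end = skip_value(data, key_end)
        if data[key_start:key_end] == wanted:
            return key_end, value_end
        pos = value_end
    return None


data = b"d7:meaningi42e4:wiki7:bencodee"
start, end = find_value_span(data, b"wiki")
print(data[start:end])  # b'7:bencode'
```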

References

[1] The BitTorrent Protocol Specification. Archived 2019-07-26 at the Wayback Machine. BitTorrent.org. Retrieved 8 October 2018.

External links

Bencoding specification
File_Bittorrent2 - Another PHP Bencode/decode implementation
The original BitTorrent implementation in Python as standalone package
Torrent File Editor cross-platform GUI editor for BEncode files
bencode-tools - a C library for manipulating bencoded data, and an XML-schema-like validator for bencode messages in Python
Bento - Bencode library in Elixir.
Beecoder - a file stream parser that encodes and decodes the "B-encode" data format in Java using the java.io.* stream API.
Bencode parsing in Java
Bencode library in Scala
Bencode parsing in C
There are numerous Perl implementations on CPAN
