Hubbry Logo
search
logo
1532991

Left-child right-sibling binary tree

logo
Community Hub0 Subscribers
Read side by side
from Wikipedia
6-ary tree represented as a binary tree

Every multi-way or k-ary tree structure studied in computer science admits a representation as a binary tree, which goes by various names including child-sibling representation,[1] left-child, right-sibling binary tree,[2] doubly chained tree or filial-heir chain.[3]

In a binary tree that represents a multi-way tree T, each node corresponds to a node in T and has two pointers: one to the node's first child, and one to its next sibling in T. The children of a node thus form a singly-linked list. To find a node n's k'th child, one needs to traverse this list:

procedure kth-child(n, k):
    child ← n.child
    while k ≠ 0 and child ≠ nil:
        child ← child.next-sibling
        k ← k − 1
    return child  // may return nil
A trie implemented as a doubly chained tree: vertical arrows are child pointers, dashed horizontal arrows are next-sibling pointers. Tries are edge-labeled, and in this representation the edge labels become node labels on the binary nodes.

The process of converting from a k-ary tree to an LC-RS binary tree is sometimes called the Knuth transform.[4] To form a binary tree from an arbitrary k-ary tree by this method, the root of the original tree is made the root of the binary tree. Then, starting with the root, each node's leftmost child in the original tree is made its left child in the binary tree, and its nearest sibling to the right in the original tree is made its right child in the binary tree.

Doubly chained trees were described by Edward H. Sussenguth in 1963.[5]

Processing a k-ary tree to LC-RS binary tree, every node is linked and aligned with the left child, and the next nearest is a sibling. For example, we have a ternary tree below:

                  1
                 /|\
                / | \
               /  |  \
              2   3   4
             / \      |
            5   6     7
                     / \
                    8   9

We can re-write it by putting the left child node to one level below its parents and by putting the sibling next to the child at the same level – it will be linear (same line).

                  1
                 /
                /
               /
              2---3---4
             /       /
            5---6   7
                   /
                  8---9

We can transform this tree to a binary tree by turning each sibling 45° clockwise.[6]

                1
               /
              2
             / \
            5   3
             \   \
              6   4
                 /
                7
               /
              8
               \
                9

Use cases

[edit]

The LCRS representation is more space-efficient than a traditional multiway tree, but comes at the cost that looking up a node's children by index becomes slower. Therefore, the LCRS representation is preferable if

  1. Memory efficiency is a concern, and/or
  2. Random access of a node's children is not required.

Case (1) applies when large multi-way trees are necessary, especially when the trees contains a large set of data. For example, if storing a phylogenetic tree, the LCRS representation might be suitable.

Case (2) arises in specialized data structures in which the tree structure is being used in very specific ways. For example, many types of heap data structures that use multi-way trees can be space optimized by using the LCRS representation. (Examples include Fibonacci heaps, pairing heaps and weak heaps.) The main reason for this is that in heap data structures, the most common operations tend to be

  1. Remove the root of a tree and process each of its children, or
  2. Join two trees together by making one tree a child of the other.

Operation (1) it is very efficient. In LCRS representation, it organizes the tree to have a right child because it does not have a sibling, so it is easy to remove the root.

Operation (2) it is also efficient. It is easy to join two trees together.[7]

References

[edit]
Revisions and contributorsEdit on WikipediaRead on Wikipedia
from Grokipedia
The left-child right-sibling binary tree (LCRS tree) is a representation of a multi-way tree, also known as an n-ary or k-ary tree, using a binary tree structure where each node contains exactly two pointers: one to its leftmost child and one to its immediate right sibling.[1] This approach transforms the general tree into a binary form by treating the children of any node as a linked list connected via right-sibling pointers, with the left pointer linking to the first child in that list.[1] The transformation, known as the Knuth transform, maps the root and subtrees identically while chaining siblings horizontally.[1] In this structure, finding the k-th child of a node requires descending to the left child and then traversing k-1 right-sibling links, which supports ordered children but incurs linear time for operations involving multiple children.[2] Each node typically stores its data value along with the two pointers, and an optional parent pointer can be added for full navigation, though the core representation uses only the child and sibling links.[2] This method is particularly useful for representing forests (collections of trees) by linking roots as siblings.[2] The primary advantage of the LCRS representation is its memory efficiency, requiring a fixed two pointers per node regardless of the number of children, which contrasts with array-based child lists that scale with arity and can waste space for nodes with few children.[2] It simplifies dynamic memory management and facilitates operations like subtree attachment by adjusting just three pointers, making it suitable for applications such as binomial heaps, file systems, or phylogenetic trees where memory constraints are significant and random child access is infrequent.[3][4][5] However, it trades off traversal speed for this efficiency, as sibling searches are sequential rather than direct.[2] The concept originates from discussions in Donald Knuth's The Art of Computer Programming, where it is presented as a standard technique for tree representation in algorithms.[1]

Fundamentals

Definition

The left-child right-sibling binary tree is a binary tree representation of a general (multiway or n-ary) tree, in which each node corresponds to a node in the original tree, the left pointer references the node's first child, and the right pointer references the next sibling of the node. This mapping transforms the ordered children of each general tree node into a right-linked chain, allowing the entire structure to be encoded using only binary pointers without additional fields for multiple children. This representation, sometimes referred to as the Knuth transform when applied to convert k-ary trees, was introduced by Donald E. Knuth in The Art of Computer Programming, Volume 1 (1968), as an efficient way to simulate arbitrary trees using binary tree operations and storage. Knuth described it in the context of fundamental algorithms for tree traversal and manipulation, highlighting its utility in languages or systems limited to binary linking. A key distinction of this approach from other binary encodings of general trees, such as those using left-child/next-child or parent pointers, lies in its explicit use of right pointers for sibling traversal, which preserves the linear order of children while simplifying child access through the left chain.[6]

Node Structure

In the left-child right-sibling binary tree representation, each node typically contains a data payload along with two pointers: a left pointer referencing the node's first child from the original general tree, and a right pointer referencing the node's next sibling at the same level.[1] This design encodes the hierarchical and sibling relationships of multi-way trees using a binary structure, where the left pointer simulates descent to children and the right pointer simulates horizontal traversal among siblings. A common programmatic declaration of such a node in pseudocode or languages like C appears as follows:
struct Node {
    int data;              // Payload or key value
    struct Node* left_child;  // Pointer to first [child](/page/Child)
    struct Node* right_sibling; // Pointer to next sibling
};
This structure is flexible for implementation in various programming languages, with the data field holding the node's value and the pointers enabling navigation.[1] Leaf nodes, which have no children, set their left_child pointer to null, while nodes without siblings set their right_sibling pointer to null; nodes lacking both children and siblings have both pointers as null.[1] This null-handling ensures efficient memory usage and prevents invalid traversals, maintaining the integrity of the tree representation during operations like insertion or deletion.

Tree Representation

Converting General Trees

The conversion of a general tree, which may have nodes with arbitrary numbers of children, to a left-child right-sibling (LCRS) binary tree follows a recursive algorithm that maps each node's children into a binary structure. In this representation, the left pointer of a binary node points to the LCRS representation of its first child in the original general tree, while the right pointer links to the LCRS representation of the next sibling, forming a chain for all children of the same parent. This approach, referred to as the Knuth transform[7], enables efficient storage and traversal of general trees using only two pointers per node.[2] The step-by-step algorithm proceeds as follows:
  1. If the general tree node is null or a leaf (no children), create a corresponding binary node with the node's value and no left or right pointers, then return it.
  2. For a non-leaf node, create a binary node with the same value.
  3. Recursively apply the algorithm to each child of the general node, producing a list of binary subtrees in the order of the children.
  4. If children exist, assign the first binary subtree as the left child of the new binary node.
  5. Chain the remaining binary subtrees sequentially using right pointers: set the right pointer of the last node in the current chain to the next binary subtree, updating the chain end iteratively.
  6. Return the completed binary node, repeating the process for all subtrees to cover the entire general tree.
This recursive chaining ensures that sibling relationships are preserved horizontally via right pointers, while parent-child descent is handled vertically via left pointers.[2] The following pseudocode illustrates the conversion function, assuming a general tree node has a value and an ordered list of children, and a binary node has left and right pointers:
function convertToLCRS(generalNode):
    if generalNode is null:
        return null
    binaryNode = new BinaryNode(generalNode.value)
    if generalNode.children is empty:
        return binaryNode
    // Recursively convert children
    binaryChildren = []
    for each [child](/page/Child) in generalNode.children:
        binaryChildren.[append](/page/Append)(convertToLCRS([child](/page/Child)))
    // Chain the binary children
    if binaryChildren is not empty:
        binaryNode.left = binaryChildren[0]
        current = binaryNode.left
        for i = 1 to length(binaryChildren) - 1:
            current.right = binaryChildren[i]
            current = current.right
    return binaryNode
To convert an entire general tree, invoke convertToLCRS(root). This implementation iterates over the children list once per node, achieving O(n time complexity where n is the number of nodes, as each node and edge is processed constantly. The resulting LCRS binary tree preserves the original general tree's hierarchy and child order identically: the depth from root to any leaf remains unchanged due to recursive left-child mappings mirroring the descent paths, and the left-to-right sequence of siblings is maintained through the right-pointer chain, allowing recovery of the original structure via traversal.

Example Representations

Consider a simple general tree consisting of a root node labeled A with three children B, C, and D, where B has two additional children E and F. This structure is commonly used to illustrate the left-child right-sibling representation, which transforms the n-ary tree into a binary tree using only two pointers per node: left for the first child and right for the next sibling.[2] In the converted binary form, node A's left pointer connects to B (its first and only direct child link), while A's right pointer is null as A has no siblings. Node B's left pointer connects to E (B's first child), and B's right pointer connects to C (B's next sibling). Node C's left pointer is null (no children), and its right pointer connects to D (C's next sibling). Node D's left and right pointers are both null. Node E's left pointer is null, and its right pointer connects to F (E's next sibling). Node F's left and right pointers are both null. These assignments preserve the original tree's hierarchy by chaining siblings horizontally via right pointers and descending into subtrees vertically via left pointers.[8] The pointer structure can be represented textually as follows:
NodeLeft PointerRight Pointer
ABnull
BEC
CnullD
Dnullnull
EnullF
Fnullnull
A common pitfall in visualizing this representation arises when interpreting the right-pointer chains as part of the descent hierarchy, rather than recognizing them as linear sibling sequences that must be distinguished from the subtree explorations initiated by left pointers; this distinction is crucial for correctly reconstructing the original general tree during traversals.[2]

Properties and Operations

Space and Time Complexity

The left-child right-sibling binary tree representation requires O(n space for a tree with n nodes, as each node stores exactly two pointers—one to its left child (the first child in the original general tree) and one to its right sibling—independent of the node's degree or arity.[9] This fixed overhead per node provides consistent storage efficiency, typically using two pointer fields plus data, which is more compact than child-list representations that allocate variable pointers proportional to each node's number of children.[9] In contrast to array-based encodings for general trees with assumed fixed maximum arity, this binary form eliminates wasted slots for absent children, avoiding unnecessary space allocation in sparse structures.[9] Insertion and deletion of children occur in O(1) average time through direct updates to the sibling chain, assuming the parent or target position is accessible.[9] Broader tree operations, such as searching for a node via a specified path, take O(path length + total siblings traversed along the path) time, which is O(n) in the worst case due to sequential pointer following along the structure.[9] This representation balances these efficiencies for dynamic multiway trees without arity constraints.

Traversal Methods

Traversal methods for left-child right-sibling binary trees adapt standard binary tree algorithms to navigate the structure where the left pointer represents the first child and the right pointer represents the next sibling, enabling efficient processing of general trees. The binary pre-order traversal visits the node first, then recurses on the left child to process all descendants, and finally recurses on the right sibling to continue with peer nodes at the same level. This order ensures a depth-first exploration that aligns with pre-order semantics for the underlying general tree. Pseudocode for pre-order traversal:
procedure PreOrder(node):
    if node is null:
        return
    visit(node)
    PreOrder(node.left)  // Recurse on children first
    PreOrder(node.right) // Then siblings
The binary in-order traversal recurses on the left (children), visits the node, then recurses on the right (siblings). This corresponds to the post-order of the general tree. Pseudocode for post-order (general tree) via binary in-order:
procedure PostOrder(node):
    if node is null:
        return
    PostOrder(node.left)  // Recurse on children first
    visit(node)
    PostOrder(node.right) // Then siblings
Depth-first search (DFS) in left-child right-sibling trees uses the above recursive procedures, with sibling iteration achieved through the right-pointer recursion after handling the child subtree. An iterative DFS can employ a stack, pushing the right sibling onto the stack after pushing the left child to preserve the order of processing.[10] Breadth-first search (BFS) employs a queue to process nodes level by level. Upon dequeuing a node, it is visited, and its children are enqueued by iterating over the sibling chain starting from the left child. This loop explicitly handles the right pointers to include all children in left-to-right order before continuing. Pseudocode for BFS, highlighting the sibling loop:
procedure BFS([root](/page/Root)):
    if [root](/page/Root) is null:
        return
    queue = empty queue
    queue.enqueue([root](/page/Root))
    while queue is not empty:
        node = queue.dequeue()
        visit(node)
        current = node.left
        while current is not null:
            queue.enqueue(current)
            current = current.right  // Iterate over siblings

Comparisons

With First-Child Next-Sibling

The left-child right-sibling binary tree representation is also known as the first-child next-sibling representation. In this encoding for general trees in binary form, each node's left pointer references its first (or leftmost) child, while the right pointer links to the next sibling in the parent's child list.[2] This setup transforms the multi-way branches of a general tree into a binary structure where descending via left pointers follows the child hierarchy and right pointers traverse horizontal sibling relations. This representation maintains space efficiency, utilizing exactly two pointers per node—typically left and right—beyond the node's data payload, allowing arbitrary-arity trees to be stored without arrays of variable-sized child lists. It requires no additional fields for degree or child counts. The form often aligns well with established binary tree conventions that emphasize left subtrees as primary (e.g., in search trees or heap implementations), potentially simplifying adaptations of standard binary traversal or balancing algorithms.[2]

With Standard Binary Trees

Standard binary trees are inherently limited to a maximum of two children per node—one left and one right—making them inadequate for directly representing general trees where nodes can have an arbitrary number of children. To simulate higher arity in standard binary trees, additional techniques such as threading (adding parent or sibling pointers) or separate encoding of child lists are required, which increase complexity and space overhead.[2] In contrast, the left-child right-sibling representation leverages the binary tree structure more flexibly by designating the left pointer to the first child and the right pointer to the next sibling, thereby encoding arbitrary arity without needing extra data structures or encoding schemes. This approach utilizes standard binary tree primitives like insertion and traversal while naturally accommodating varying degrees of branching, resulting in a more straightforward and space-efficient simulation of general trees compared to augmented standard binary representations.[2][11] Standard binary trees are best suited for scenarios involving balanced, two-child structures, such as binary search trees or complete binary heaps, where the fixed arity aligns with algorithmic requirements for efficiency in operations like searching or priority queuing. The left-child right-sibling method, however, excels in representing irregular general trees with unbounded children, including file system hierarchies or abstract syntax trees, where the sibling chaining simplifies handling of dynamic, multi-child nodes without the rigidity of binary limits.[2]

Applications

Use Cases in Data Structures

The left-child right-sibling binary tree representation can be used to model file systems and directory hierarchies, where directories contain multiple subdirectories and files. In this structure, a directory node points to its first subdirectory or file via the left-child pointer, while subsequent siblings are linked via right-sibling pointers, enabling traversal of hierarchical structures without variable-length child arrays. This approach aligns with the organization of operating system file trees, such as those in Unix-like systems, where the root directory serves as the tree's root and nested folders form multi-way branches.[12][13] In binomial heaps, a priority queue data structure, LCRS is used to represent each binomial tree. Each node points to its leftmost child and right sibling, facilitating efficient operations like union of heaps by linking sibling chains, as described in standard algorithms texts. This representation allows O(log n) time for insert and extract-min, with O(1) union.[14] In XML and HTML parsing, the left-child right-sibling representation facilitates modeling the Document Object Model (DOM) as a general tree with elements having arbitrary numbers of child nodes, such as nested tags or attributes. Each element node links to its first child (e.g., the opening tag's content) through the left pointer and to sibling elements (e.g., adjacent tags) via the right pointer, supporting operations like tree traversal for rendering or querying without the overhead of dynamic arrays for children. This binary encoding is particularly useful in transformer-based models for processing DOM trees, where it converts multi-way structures into a binary form compatible with sequence-based neural architectures.[15][16] For expression trees in compilers, the left-child right-sibling binary tree handles n-ary operators and operands, representing complex expressions with variable arity, such as function calls or arithmetic operations with multiple arguments. The operator node points left to its first operand or subexpression and right to the next sibling operand, preserving precedence and associativity while simplifying the structure to a binary tree suitable for code generation and optimization passes. This representation is integrated into abstract syntax trees (ASTs) during intermediate code generation, enabling efficient evaluation and transformation of expressions in compiler pipelines.[17][16]

Implementations in Programming

The left-child right-sibling (LCRS) binary tree representation is commonly implemented in programming languages using pointer or reference-based node structures, where each node typically holds the data value, a pointer to its left child (first child in the general tree), and a pointer to its right sibling (next sibling in the general tree).[18] In C/C++, implementations leverage raw pointers for dynamic memory allocation, requiring manual management to avoid memory leaks. A basic node structure and functions for creation, insertion (as child or sibling), and deletion can be defined as follows:
#include <iostream>
using namespace std;

struct Node {
    int data;
    Node* child;  // Left child pointer
    Node* sibling;  // Right sibling pointer
    Node(int val) : data(val), child(nullptr), sibling(nullptr) {}
};

// Create a new node
Node* createNode(int data) {
    return new Node(data);
}

// Insert as child of a given parent (appends to end)
void insertChild(Node* parent, int data) {
    Node* newNode = createNode(data);
    if (parent->child == nullptr) {
        parent->child = newNode;
    } else {
        Node* temp = parent->child;
        while (temp->sibling != nullptr) {
            temp = temp->sibling;
        }
        temp->sibling = newNode;
    }
}

// Insert as sibling after a given node
void insertSibling(Node* node, int data) {
    Node* newNode = createNode(data);
    newNode->sibling = node->sibling;
    node->sibling = newNode;
}

// Delete entire tree recursively
void deleteTree(Node* node) {
    if (node == nullptr) return;
    deleteTree(node->child);  // Delete child subtree
    deleteTree(node->sibling);  // Delete sibling chain
    delete node;
}

// Example usage: Traverse and print (preorder)
void printTree(Node* root) {
    if (root == nullptr) return;
    cout << root->[data](/page/Data) << " ";
    printTree(root->child);
    printTree(root->sibling);
}

int main() {
    Node* root = createNode(1);
    insertChild(root, 2);
    insertChild(root, 3);
    insertSibling(root->child, 4);  // Sibling of 2 (after 2)
    printTree(root);  // Output: 1 2 4 3 
    deleteTree(root);
    return 0;
}
This C++ example demonstrates node creation via dynamic allocation, insertion by appending to the child list or inserting after a sibling (O(n) in worst case for traversal to end), and recursive deletion of the entire tree by processing subtrees and sibling chains before freeing the node; manual deallocation is essential to prevent leaks.[18][2] Adaptations in higher-level languages like Java and Python use object references instead of raw pointers, simplifying memory management through automatic garbage collection. In Java, the node class employs references, and operations mirror the C++ logic but without explicit deletion—setting references to null allows the garbage collector to reclaim memory for unreachable nodes. Similarly, in Python, classes with attributes for child and sibling references handle insertions via method chaining, with the interpreter's garbage collector managing deallocation when reference counts drop to zero; however, for large trees, explicit cleanup may be needed to prompt collection.[18] For library support, the Boost.Graph library in C++ provides adjacency list models that can simulate LCRS representations for general trees by mapping child lists to sibling chains, enabling efficient graph-based tree operations. In other languages, tree libraries such as the anytree package in Python allow custom node implementations that emulate LCRS for n-ary tree simulation, facilitating applications like hierarchical data parsing.
User Avatar
No comments yet.