Exploring Modern Data Structures in C
15 mins read

Exploring Modern Data Structures in C

Discuss the implementation and applications of advanced data structures like self-balancing trees, skip lists, and trie structures in the context of C programming.

1. Self-Balancing Trees:

Self-balancing trees are a type of binary search tree (BST) data structure that automatically maintains a balanced structure during insertions and deletions of elements. The primary goal of self-balancing trees is to ensure that the height of the tree remains relatively small, which guarantees efficient search, insertion, and deletion operations with a time complexity of O(log n), where n is the number of nodes in the tree.

In a conventional binary search tree, the height of the tree can become skewed, leading to a worst-case scenario where the tree degenerates into a linked list, resulting in search operations taking linear time (O(n)). Self-balancing trees address this issue by enforcing specific rules and performing rotations whenever necessary to rebalance the tree.

There are several types of self-balancing trees, each with its own set of rules and rotation operations. One of the most well-known self-balancing trees is the AVL tree.

AVL Trees:

AVL trees are a type of self-balancing binary search tree named after their inventors Adelson-Velsky and Landis. In an AVL tree, the heights of the left and right subtrees of any node differ by at most one. If, at any time during insertion or deletion, this balance property is violated, rotations are performed to restore the balance.

Balancing operations in AVL trees include:

  1. Rotation Operations:
    • Left Rotation (LL Rotation): Restores balance when the left subtree of a node is taller by more than one level than the right subtree.
    • Right Rotation (RR Rotation): Restores balance when the right subtree of a node is taller by more than one level than the left subtree.
    • Left-Right Rotation (LR Rotation): A combination of left and right rotations to restore balance.
    • Right-Left Rotation (RL Rotation): A combination of right and left rotations to restore balance.
  2. Insertion: When inserting a new node, the AVL tree checks the balance factor of each node from the inserted node up to the root. If the balance factor is violated, appropriate rotations are performed to rebalance the tree.
  3. Deletion: Similar to insertion, deletion may cause the balance factor of nodes to be violated. The tree checks for balance violations during deletion and performs rotations to maintain balance.

Diagram:

AVL-TREE

In this AVL tree, the balance factor of each node (difference in heights between left and right subtrees) is maintained within the range of -1 to 1. If any insertion or deletion operation causes a node’s balance factor to be outside this range, rotation operations are performed to restore balance.

The advantage of self-balancing trees, such as AVL trees, is that they ensure the worst-case time complexity of search, insertion, and deletion operations remains logarithmic, making them suitable for applications where efficient dynamic data management is crucial.

It’s important to note that self-balancing trees come with a trade-off in terms of increased complexity and overhead due to the additional rotation operations. In certain scenarios, other self-balancing trees like Red-Black trees or B-trees might be more suitable depending on specific use cases and performance requirements.

Syntax:

struct Node {
    int key;
    struct Node* left;
    struct Node* right;
    int height; // AVL balance factor
};
//Sample C Program (Insertion and Traversal)
#include <stdio.h>
#include <stdlib.h>

struct Node {
    int key;
    struct Node* left;
    struct Node* right;
    int height;
};

int getHeight(struct Node* node) {
    if (node == NULL) return 0;
    return node->height;
}

int getBalance(struct Node* node) {
    if (node == NULL) return 0;
    return getHeight(node->left) - getHeight(node->right);
}

struct Node* newNode(int key) {
    struct Node* node = (struct Node*)malloc(sizeof(struct Node));
    node->key = key;
    node->left = node->right = NULL;
    node->height = 1;
    return node;
}

struct Node* insert(struct Node* node, int key) {
    if (node == NULL) return newNode(key);

    if (key < node->key)
        node->left = insert(node->left, key);
    else if (key > node->key)
        node->right = insert(node->right, key);
    else
        return node;

    node->height = 1 + max(getHeight(node->left), getHeight(node->right));

    int balance = getBalance(node);

    // Perform rotations if the tree becomes unbalanced

    // Left Left Case
    if (balance > 1 && key < node->left->key)
        return rightRotate(node);

    // Right Right Case
    if (balance < -1 && key > node->right->key)
        return leftRotate(node);

    // Left Right Case
    if (balance > 1 && key > node->left->key) {
        node->left = leftRotate(node->left);
        return rightRotate(node);
    }

    // Right Left Case
    if (balance < -1 && key < node->right->key) {
        node->right = rightRotate(node->right);
        return leftRotate(node);
    }

    return node;
}

void inOrder(struct Node* root) {
    if (root != NULL) {
        inOrder(root->left);
        printf("%d ", root->key);
        inOrder(root->right);
    }
}

int main() {
    struct Node* root = NULL;

    root = insert(root, 10);
    root = insert(root, 5);
    root = insert(root, 15);
    root = insert(root, 2);
    root = insert(root, 8);
    root = insert(root, 20);

    printf("Inorder traversal of the AVL tree: ");
    inOrder(root);

    return 0;
}
Output:

Inorder traversal of the AVL tree: 2 5 8 10 15 20

Explanation:

The sample program demonstrates the insertion and in-order traversal of an AVL tree. The struct Node represents each node in the tree with key values and height for balance factor calculation. The insert function performs balanced insertion and rotation operations when needed to maintain the AVL property. The inOrder function traverses the tree in ascending order.

2. Skip Lists:

A skip list is a data structure that provides an efficient way to perform search, insertion, and deletion operations on a sorted collection of elements, typically numbers or keys. It is designed to improve the average-case time complexity of these operations while remaining relatively easy to implement. Skip lists are similar in concept to balanced trees, but they achieve balancing through a randomized structure rather than strict balancing rules.

Basic Structure and Idea: At its core, a skip list consists of multiple levels of linked lists, where each level contains a subset of the elements from the level below. The bottom level is a standard sorted linked list that contains all the elements. As we move up the levels, the number of elements included decreases exponentially.

In the context of skip lists, “skipping” refers to the fact that during a search operation, we can quickly eliminate a large portion of the list by moving across levels, effectively “skipping” elements that are not likely to be part of the search target.

Key Components:

  • Nodes: Each element in the skip list is represented by a node, containing the element’s value and references to the next nodes on the same level and on the level below.
  • Header Nodes: Each level has a header node that acts as a sentinel and helps with navigation. The header nodes point to the first node on the level.
  • Bottom Level: The bottom level is a standard sorted linked list containing all elements. It serves as the base for all other levels.
  • Tower Building: A tower is formed by nodes that “jump” across levels. For example, if node A on level 0 also exists on level 2, it forms a tower with nodes on levels 0, 1, and 2.

Operations:

  1. Search: To search for an element, we start from the top-left corner (header node of the top level) and move right until we find an element greater than or equal to the target. Then, we drop down a level and continue the search within the sublist.
  2. Insertion: To insert an element, we perform a search to find the appropriate position, and then we insert the new node into the bottom level. With a randomized algorithm, we may choose to promote the new node to higher levels, creating “towers.”
  3. Deletion: Similar to insertion, we perform a search to find the element to be deleted. Once found, we remove the element from the bottom level and potentially unlink it from higher levels.

Advantages of Skip Lists:

  • Skip lists provide average-case time complexity of O(log n) for search, insertion, and deletion operations, which is comparable to many balanced tree structures.
  • Skip lists are simpler to implement than some other balanced trees, such as AVL trees or red-black trees.
  • The randomized nature of skip lists helps maintain balance without complex balancing rules.

Drawbacks of Skip Lists:

  • The worst-case time complexity of search, insertion, and deletion operations can be O(n), although this is unlikely in practice.
  • Skip lists may consume more memory compared to some other data structures.

Diagram:

Syntax:

struct Node {
    int key;
    struct Node* right;
    struct Node* down;
};
//Sample C Program (Search Operation):

#include <stdio.h>
#include <stdlib.h>

struct Node {
    int key;
    struct Node* right;
    struct Node* down;
};

struct Node* createNode(int key) {
    struct Node* newNode = (struct Node*)malloc(sizeof(struct Node));
    newNode->key = key;
    newNode->right = newNode->down = NULL;
    return newNode;
}

struct Node* search(struct Node* head, int target) {
    struct Node* current = head;

    while (current != NULL) {
        while (current->right != NULL && current->right->key <= target) {
            current = current->right;
        }

        if (current->key == target) {
            return current;
        }

        current = current->down;
    }

    return NULL;
}

int main() {
    // Create a sample skip list
    struct Node* head = createNode(-1);
    head->right = createNode(3);
    head->right->right = createNode(6);
    head->right->right->right = createNode(9);

    head->down = createNode(0);
    head->down->right = createNode(1);
    head->down->right->right = createNode(3);
    head->down->right->right->right = createNode(6);
    head->down->right->right->right->right = createNode(9);
    head->down->right->right->right->right->right = createNode(11);

    int target = 6;
    struct Node* result = search(head, target);

    if (result != NULL) {
        printf("Element %d found in the skip list.\n", target);
    } else {
        printf("Element %d not found in the skip list.\n", target);
    }

    return 0;
}
Output:

Element 6 found in the skip list.

Explanation:

The sample program demonstrates the search operation in a skip list. The struct Node represents each node in the skip list with the key value, right pointer to the next node in the same level, and down pointer to the next node in the level below. The search function uses a two-pointer approach to navigate through the layers of the skip list to find the target element efficiently.

3. Trie Structures:

Diagram:

A Trie (pronounced “try”), also known as a prefix tree, is a tree-like data structure that is used to store and manage a dynamic set of strings, typically words from a language or keys from a dictionary. It is particularly efficient for tasks involving string-related operations like searching, insertion, and deletion. The primary advantage of a trie is its ability to provide fast and efficient retrieval of data based on prefixes.

Basic Concepts:

  1. Nodes and Edges: A trie consists of nodes connected by edges. Each node represents a character in a string, and each edge represents a link between characters.
  2. Root Node: The top-level node, often referred to as the root, has no incoming edges and represents an empty string or a null character.
  3. Child Nodes: Nodes that follow the root represent the first character of the strings in the trie.
  4. End-of-Word Flag: Each node may have an associated boolean flag indicating whether it marks the end of a valid word.

Advantages:

  1. Efficient Prefix Searching: Tries are extremely efficient for searching strings with a common prefix. It allows you to quickly identify all strings that share a given prefix.
  2. Space Efficiency: While tries can consume more memory than other data structures, they are memory-efficient when storing large sets of strings with common prefixes, as the common prefixes are shared.
  3. Auto-Complete and Predictive Text: Tries are widely used in applications like auto-complete and predictive text, where users are provided with suggestions based on the characters they have typed.
  4. Dictionary Implementations: Tries are suitable for implementing dictionaries, spell checkers, and word-based games like Scrabble.

Operations:

  1. Insertion: To insert a word into a trie, you start from the root and follow edges corresponding to each character of the word. If a required edge doesn’t exist, you create a new node. At the end of the word, you mark the final node with an end-of-word flag.
  2. Search: Searching for a word involves traversing the trie by following edges that correspond to each character. If you encounter a null link or finish traversing before reaching the end of the word, the word is not present.
  3. Deletion: To delete a word, you start at the root and traverse the trie as in a search operation. After reaching the node representing the last character of the word, you remove the end-of-word flag and recursively remove nodes if they don’t contribute to other words.

Complexity:

The time complexity of inserting, searching, and deleting words in a trie is O(L), where L is the length of the word. This makes tries efficient for string-related operations. However, the space complexity can be relatively higher compared to other data structures, especially if the strings have long common prefixes.

Syntax:
struct TrieNode {
    struct TrieNode* children[26]; // One for each lowercase letter
    bool isEndOfWord;
};
//Sample C Program (Insertion and Search)

#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>

struct TrieNode {
    struct TrieNode* children[26];
    bool isEndOfWord;
};

struct TrieNode* createNode() {
    struct TrieNode* newNode = (struct TrieNode*)malloc(sizeof(struct TrieNode));
    for (int i = 0; i < 26; ++i) {
        newNode->children[i] = NULL;
    }
    newNode->isEndOfWord = false;
    return newNode;
}

void insert(struct TrieNode* root, const char* word) {
    struct TrieNode* current = root;
    for (int i = 0; word[i] != '\0'; ++i) {
        int index = word[i] - 'a';
        if (current->children[index] == NULL) {
            current->children[index] = createNode();
        }
        current = current->children[index];
    }
    current->isEndOfWord = true;
}

bool search(struct TrieNode* root, const char* word) {
    struct TrieNode* current = root;
    for (int i = 0; word[i] != '\0'; ++i) {
        int index = word[i] - 'a';
        if (current->children[index] == NULL) {
            return false;
        }
        current = current->children[index];
    }
    return (current != NULL && current->isEndOfWord);
}

int main() {
    struct TrieNode* root = createNode();
    insert(root, "bat");
    insert(root, "batman");
    insert(root, "box");
    insert(root, "toe");

    if (search(root, "bat")) {
        printf("Word 'bat' found in the trie.\n");
    } else {
        printf("Word 'bat' not found in the trie.\n");
    }

    if (search(root, "boxer")) {
        printf("Word 'boxer' found in the trie.\n");
    } else {
        printf("Word 'boxer' not found in the trie.\n");
    }

    return 0;
}
Output:
Word 'bat' found in the trie.
Word 'boxer' not found in the trie.

Explanation:

The sample program demonstrates the insertion and search operations in a trie structure. The struct TrieNode represents each node in the trie with an array of children pointers for each lowercase letter and a boolean flag isEndOfWord indicating the end of a valid word. The insert function inserts a word by traversing the trie and creating nodes as needed. The search function checks whether a given word exists in the trie.

Leave a Reply

Your email address will not be published. Required fields are marked *