Introduction to Structs in C: Managing Complex Data Structures

The C language allows us to build custom data structures to better manage our data. While arrays are very useful and convenient, they have shortfalls that limit their usage. An array can only store a collection of elements of the same type; real-world objects, however, are more complex than that. They have attributes of different types.

Take, for instance, a primary school class of 50 pupils where, for each pupil, you need to record the name, age, and sex. An array would not be suitable. For starters, we have two different data types, i.e., strings and an integer. Additionally, an array has a fixed length that is set when it is declared. If we admit an extra pupil midway through the term, the program would need to be modified to accommodate just one more pupil.

One solution would be to record all the attributes separately. So, if we have a pupil named Peter, who is male and aged 7, we would write:

const char *name = "Peter";
const char *sex = "Male";
int age = 7;

However, repeating this for 50 pupils would be tedious, bulky, and error-prone, not to mention the need to update any associated functions that we might have. Say we have a function to display a pupil:

#include <stdio.h>

/* Strings are passed as pointers to char, not as single chars. */
void display_pupil(const char *name, const char *sex, int age)
{
    printf("%s is %s and is %d years old\n", name, sex, age);
}

int main()
{
    display_pupil("Peter", "male", 7);
    ...
}

The number of arguments to be passed is already large for a single pupil, which makes the code bulky and error-prone. We need a data structure that records all the attributes of a single pupil in one unit. The struct keyword in C allows us to do exactly that: it lets us define our own partitioned bucket in which all this data can be stored together.

To make a struct, we use the struct keyword, like this:

struct pupil_details {
    const char *name;
    const char *sex;
    int age;
};

We now have a data structure that can hold all of a pupil's attributes. We can initialize a struct pupil_details variable much as we would an array (we will see other ways to initialize later):

struct pupil_details peter = {"Peter", "male", 7};

Now this is more like a real-life description of Peter. He is a 7-year-old boy! The name, along with the other attributes, is captured in one unit.

When we initialize a struct this way, we need to make sure that the individual values are in the order in which the members are defined in the struct. The following initialization would be wrong, because the integer 7 lands in sex and the string lands in age:

struct pupil_details {
    const char *name;
    const char *sex;
    int age;
};
struct pupil_details peter = {"Peter", 7, "male"};

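We also need to supply the brace-enclosed list of values in the same statement that declares the struct variable. A bare brace list cannot be assigned to a variable that has already been declared. For instance, the following would be wrong:

struct pupil_details peter;
peter = {"Peter", "male", 7};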
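When the values are not known at the point of declaration, there are two ways that do work: assigning the members one at a time, or (since C99) assigning a compound literal, which is a brace list prefixed with the struct type in parentheses. The following is a minimal sketch; the helper names make_pupil_memberwise and make_pupil_literal are made up for illustration.

```c
struct pupil_details {
    const char *name;
    const char *sex;
    int age;
};

/* Hypothetical helper: declare first, then fill in each member separately. */
struct pupil_details make_pupil_memberwise(void)
{
    struct pupil_details p;
    p.name = "Peter";
    p.sex = "male";
    p.age = 7;
    return p;
}

/* Hypothetical helper: declare first, then assign a C99 compound literal. */
struct pupil_details make_pupil_literal(void)
{
    struct pupil_details p;
    p = (struct pupil_details){"Peter", "male", 7};
    return p;
}
```

Both helpers return a struct with the same contents; the compound literal form is shorter, while the member-by-member form makes each field explicit.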

Now that we have our struct, each of the 50 pupils can have a struct representing them. The functions would also be modified to take a struct (we shall talk about the dot operator in a later section):

void display_student(struct pupil_details student)
{
    printf("Name: %s\n", student.name);
    printf("Sex: %s\n", student.sex);
    printf("Age: %d\n", student.age);
}

int main()
{
    struct pupil_details amina = {"Amina", "Female", 7};
    display_student(amina);

    return 0;
}

The output of this would be:

Name: Amina
Sex: Female
Age: 7

Overall, this makes the code easier to manage and more concise. Another advantage is that we can add fields to our struct without changing the signatures of the functions that take it. Let us add the height and weight of every pupil to our struct:

struct pupil_details {
    const char *name;
    const char *sex;
    int age;
    int weight;
    int height;
};
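To illustrate the point, here is a sketch in which display_student is exactly as it was for the three-field struct, yet still compiles and works with the five-field version. The helper make_amina is hypothetical; it reuses the old three-value initializer, and C sets the members that are left out (weight and height) to zero.

```c
#include <stdio.h>

struct pupil_details {
    const char *name;
    const char *sex;
    int age;
    int weight;   /* newly added member */
    int height;   /* newly added member */
};

/* Unchanged: the function takes the whole struct, so its signature stays the same. */
void display_student(struct pupil_details student)
{
    printf("Name: %s\n", student.name);
    printf("Sex: %s\n", student.sex);
    printf("Age: %d\n", student.age);
}

/* Hypothetical helper: only three values are given, so weight and height become 0. */
struct pupil_details make_amina(void)
{
    struct pupil_details amina = {"Amina", "Female", 7};
    return amina;
}
```

Some compilers can warn about the missing initializers when extra warnings are enabled, which is a useful reminder to fill in the new fields.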

Structs can be nested, meaning we can have a struct within a struct. We could group height and weight:

struct physical {
    int height;
    int weight;
};

struct pupil_details {
    const char *name;
    const char *sex;
    int age;
    struct physical size;
};

We initialize it as we saw earlier, except that the values for the inner struct go in their own pair of braces:

struct pupil_details peter = {"Peter", "Male", 7, {100, 25}};
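Members of the inner struct are reached by chaining the dot operator, for example peter.size.height. The sketch below shows this; the helpers display_size and size_of_pupil are hypothetical names used only for illustration.

```c
#include <stdio.h>

struct physical {
    int height;
    int weight;
};

struct pupil_details {
    const char *name;
    const char *sex;
    int age;
    struct physical size;
};

/* Hypothetical helper: reads the inner members by chaining dots. */
void display_size(struct pupil_details p)
{
    printf("%s: height %d, weight %d\n", p.name, p.size.height, p.size.weight);
}

/* Hypothetical helper: the inner struct can also be copied out as one unit. */
struct physical size_of_pupil(struct pupil_details p)
{
    return p.size;
}
```

Grouping related fields this way means a function that only cares about physical measurements can take a struct physical and ignore the rest of the pupil record.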

Typedef

We have seen that structs help us make our code neater and less error-prone. However, thus far, we have had to use the struct keyword both when we define a struct and when we declare variables of that type. This seems like double work to me. The designers of C thought this through and gave us a way around it: typedef.

To simplify the usage of a struct and avoid repeating the struct keyword, we can utilize the typedef keyword in C. By using typedef, we can create an alias for our struct, allowing us to refer to the struct using the alias instead of the full struct declaration.

Here's an example demonstrating the usage of typedef:

typedef struct pupil_details {
    const char *name;
    const char *sex;
    int age;
    int weight;
    int height;
} pupil;

Now, instead of:

struct pupil_details peter = {"Peter", "Male", 7, 25, 100};

we simply write:

pupil peter = {"Peter", "Male", 7, 25, 100};
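The alias can be used anywhere a type name is expected, including function parameters, and the longer struct pupil_details spelling remains valid alongside it. Here is a small sketch; the function is_older is a hypothetical example, not something from the code above.

```c
typedef struct pupil_details {
    const char *name;
    const char *sex;
    int age;
    int weight;
    int height;
} pupil;

/* Hypothetical helper: the alias keeps the parameter list short.
   Returns 1 if a is older than b, otherwise 0. */
int is_older(pupil a, pupil b)
{
    return a.age > b.age;
}

/* The original name still works and refers to the very same type. */
int age_of(struct pupil_details p)
{
    return p.age;
}
```

Because pupil and struct pupil_details name the same type, a variable declared with one spelling can be passed to a function declared with the other.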

Accessing elements in a struct

To access items in a struct, we use the dot operator ".", as shown here:

struct pupil_details peter = {"Peter", "male", 7};
printf("Name = %s\n", peter.name);

This brings us to another way to initialize a struct, called the designated initializer style (available since C99), in which each value is labelled with the member it belongs to.

struct point {
    int x;
    int y;
};

struct point p = {.y = 1, .x = 2};

Notice that with this method, we don't need to follow the order in which the members appear in the struct definition.
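The same style applies to our pupil struct. In the sketch below, the members are named in a different order from the definition, and the ones that are not mentioned (weight and height) are set to zero. The helper make_peter is a hypothetical name used for illustration.

```c
struct pupil_details {
    const char *name;
    const char *sex;
    int age;
    int weight;
    int height;
};

/* Hypothetical helper: designated initializers in arbitrary order.
   weight and height are not named, so they are initialized to 0. */
struct pupil_details make_peter(void)
{
    struct pupil_details peter = {.age = 7, .name = "Peter", .sex = "male"};
    return peter;
}
```

Designated initializers also guard against the ordering mistake shown earlier, since each value is tied to a member name rather than a position.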

Let us now take this a little further. One of our pupils, Amina, is celebrating her birthday and is now a year older. We have to increase her age by one. So we have this function.

#include <stdio.h>

struct pupil {
    const char* name;
    const char* sex;
    int age;
};

void update_age(struct pupil p)
{
    p.age += 1;
    printf("%s is now %d\n", p.name, p.age);
}

int main() {
    struct pupil amina = {"Amina", "Female", 7};
    update_age(amina);
    printf("%s is now %d\n", amina.name, amina.age);

    return 0;
}

When we run this, we get some conflicting results:

Amina is now 8
Amina is now 7

Strange! Why? This is because we passed the struct amina to our function by value. When an argument is passed by value, the function receives a copy of the data and works on that copy, leaving the original unchanged. The age was therefore updated inside the function, but the copy was discarded when the function returned. To have our original data changed, we need to pass a pointer to the struct, that is, its address (this is commonly called passing by reference). The function would thus be:

void update_age(struct pupil* p)
{
    (*p).age += 1;
    printf("%s is now %d\n", (*p).name, (*p).age);
}

When we pass a pointer to a struct, we access a member using (*p).age. The parentheses are essential: the dot operator binds more tightly than *, so without them *p.age would be read as *(p.age), which means something else entirely. They are cumbersome too. Thankfully, the makers of C thought about this and provided a shorthand. The expression p->age means the same as (*p).age, and it is neater and less error-prone. We can now update the code as:

#include <stdio.h>

struct pupil {
    const char* name;
    const char* sex;
    int age;
};

void update_age(struct pupil* p)
{
    p->age += 1;
    printf("%s is now %d\n", p->name, p->age);
}

int main()
{
    struct pupil amina = {"Amina", "Female", 7};
    update_age(&amina);
    printf("%s is now %d\n", amina.name, amina.age);

    return 0;
}

Now the output is:

Amina is now 8
Amina is now 8
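Going back to the class of pupils we started with, structs and arrays combine naturally: an array can hold one struct per pupil, and a pointer to each element can be handed to update_age. The following is only a sketch; the helper age_whole_class and its parameter names are hypothetical, and update_age is reduced to the increment to keep the example short.

```c
#include <stddef.h>

struct pupil {
    const char* name;
    const char* sex;
    int age;
};

/* Same idea as above, without the printf: modifies the caller's struct. */
void update_age(struct pupil* p)
{
    p->age += 1;
}

/* Hypothetical helper: adds one year to every pupil in an array.
   An array argument arrives as a pointer to its first element,
   so the changes are made to the caller's array, not to a copy. */
void age_whole_class(struct pupil* pupils, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        update_age(&pupils[i]);
    }
}
```

Note that count has to be passed separately, because a pointer to the first element carries no information about how many elements follow it.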

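That concludes this brief article on structs. As a recap, we looked at the limitations of arrays and introduced structs as a solution to store and manage complex data. We then saw how to initialize and nest structs, shorten their names with typedef, access members with the dot and arrow operators, and pass a pointer to a function so that it can modify the original struct.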
