C Strings Are Weird: A Practical Guide

1. What a string really is

In many languages, a string is a built-in object with a known length and lots of behavior. In C, a string is much more primitive. It is just a sequence of characters stored in memory, ending with a special byte whose value is zero.

char word[] = "cat";

That creates an array with four bytes:

'c'   'a'   't'   '\0'

The last byte, '\0', is what tells C, “the string ends here.” Without it, most string functions have no idea where to stop reading.

A C string is not “an array of chars” in the general sense. It is specifically an array of chars that contains a terminating '\0'.
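
To make the byte count concrete, here is a minimal sketch of the arithmetic. The helper name storage_needed is hypothetical and exists only for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: bytes needed to store s, terminator included. */
size_t storage_needed(const char *s) {
    return strlen(s) + 1; /* visible characters plus the '\0' */
}
```

For char word[] = "cat", storage_needed(word) is 4, which matches sizeof(word), while strlen(word) is 3.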

2. The null terminator is the whole game

If you remember one thing from this article, remember this: C string functions depend on the null terminator. They do not carry a separate length field around.

#include <stdio.h>

int main(void) {
    char name[6] = {'A', 'l', 'i', 'c', 'e', '\0'};
    printf("%s\n", name);
    return 0;
}

This prints Alice. But if you forget the terminator:

char broken[5] = {'A', 'l', 'i', 'c', 'e'};

then printf("%s", broken) keeps reading memory past the end of the array until it happens to find a zero byte somewhere else. That is undefined behavior: you may get garbage output, a crash, or output that looks correct by accident.

C does not automatically protect you from unterminated strings. The program may compile, run, and fail in confusing ways later.
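
If you know a buffer's capacity, you can at least check for a terminator without reading past the end. The sketch below assumes the caller passes the true capacity; the name is_terminated is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: true if buf holds a '\0' within its first cap bytes.
   memchr never looks beyond cap bytes, so this is safe on unterminated data. */
bool is_terminated(const char *buf, size_t cap) {
    return memchr(buf, '\0', cap) != NULL;
}
```

With the arrays above, is_terminated(name, sizeof(name)) is true and is_terminated(broken, sizeof(broken)) is false.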

3. Arrays vs pointers: similar, but not the same

This is one of the biggest sources of confusion in C. These look similar:

char a[] = "hello";
char *p = "hello";

But they are not equivalent.

char a[] = "hello";

Creates an array containing a copy of the characters: 'h' 'e' 'l' 'l' 'o' '\0'.

The array itself is storage you own in that scope.

char *p = "hello";

Creates a pointer to a string literal.

The pointer variable is writable, but the literal data should be treated as read-only.

#include <stdio.h>

int main(void) {
    char a[] = "hello";
    char *p = "hello";

    a[0] = 'H';   // OK
    // p[0] = 'H'; // Undefined behavior: do not modify string literals

    printf("%s\n", a);
    printf("%s\n", p);
    return 0;
}

Arrays also behave differently with sizeof:

char a[] = "hello";
char *p = "hello";

printf("%zu\n", sizeof(a)); // 6
printf("%zu\n", sizeof(p)); // size of pointer, usually 8 or 4

That weirdness catches a lot of people. The array knows its storage size at compile time. The pointer only knows it points somewhere.
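
The same difference shows up when an array is passed to a function, because the argument arrives as a pointer. A small sketch, with a hypothetical function name:

```c
#include <stddef.h>

/* Hypothetical helper: inside the function, s is only a pointer,
   so sizeof reports the pointer size, not the caller's array size. */
size_t size_seen_by_callee(const char *s) {
    return sizeof(s);
}
```

Calling size_seen_by_callee(a) with char a[100] returns sizeof(char *), not 100. This is why functions that write into a buffer usually take its size as a separate parameter.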

4. String literals are special

A string literal is the quoted text in your source code:

"hello world"

It lives in static storage, and in practice it should be treated as read-only. Even though old code sometimes uses char * for string literals, modifying them is undefined behavior.

char *msg = "hi";
// msg[0] = 'H'; // Don't do this

const char *safe = "hi";

Prefer const char * when pointing to string literals. It communicates intent and prevents accidental writes.

5. Copying, comparing, and measuring strings

C does not let you assign one character array to another after declaration. This surprises people coming from other languages.

char a[20] = "cat";
char b[20];

// b = a; // Invalid for arrays

Instead, use string functions from <string.h>.

Length

#include <string.h>

size_t len = strlen("hello"); // 5

strlen counts characters until '\0'. It does not count the terminator itself.

Copy

#include <string.h>

char src[] = "orange";
char dst[20];

strcpy(dst, src);

This works only if dst is large enough. If it is too small, you overflow the buffer.
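
One way to guard against that, sketched here with a hypothetical helper named copy_if_fits, is to compare the required size with the destination capacity before copying anything.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy src into dst only if it fits, terminator included.
   Returns false and leaves dst untouched when dst_size is too small. */
bool copy_if_fits(char *dst, size_t dst_size, const char *src) {
    size_t needed = strlen(src) + 1;
    if (needed > dst_size) {
        return false;
    }
    memcpy(dst, src, needed); /* copies the '\0' as well */
    return true;
}
```

With char dst[20], copy_if_fits(dst, sizeof(dst), src) succeeds for "orange". With a 4-byte destination it returns false instead of overflowing.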

Compare

if (strcmp("cat", "cat") == 0) {
    // equal
}

Do not use == to compare string contents.

char a[] = "cat";
char b[] = "cat";

if (a == b) {
    // false: compares addresses, not contents
}

== compares where two pointers point. strcmp compares the bytes inside the strings.

Task         Function  Important detail
Find length  strlen    Needs a valid null-terminated string
Copy         strcpy    Destination must have enough space
Compare      strcmp    Returns 0 when equal
Concatenate  strcat    Destination must already contain a valid string and extra room
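
Concatenation has no example above, so here is a minimal sketch of the two strcat requirements. It assumes dst already holds a valid string; the name append_if_fits is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: append src to the string already in dst,
   but only if the combined result fits in dst_size bytes. */
bool append_if_fits(char *dst, size_t dst_size, const char *src) {
    size_t used = strlen(dst);          /* dst must already be a valid string */
    size_t extra = strlen(src);
    if (used + extra + 1 > dst_size) {  /* +1 for the terminator */
        return false;
    }
    strcat(dst, src);
    return true;
}
```

Starting from char buf[8] = "ab", appending "cde" succeeds and leaves "abcde". Appending "xyz" after that is refused because the result would need 9 bytes.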

6. Reading input safely is harder than it should be

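One reason strings feel weird in C is that input functions can be tricky. The old gets function had no way to know how large its buffer was, so any overly long input line overflowed it. It was so unsafe that it was removed from the standard library in C11.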

A much better option is fgets:

#include <stdio.h>

int main(void) {
    char name[32];

    printf("Enter your name: ");
    if (fgets(name, sizeof(name), stdin)) {
        printf("You typed: %s", name);
    }

    return 0;
}

fgets reads at most sizeof(name) - 1 characters and, when it succeeds, always null-terminates the result. That is good. The weird part is that it stores the trailing newline in the buffer whenever the whole line fits.

#include <stdio.h>
#include <string.h>

int main(void) {
    char name[32];

    if (fgets(name, sizeof(name), stdin)) {
        name[strcspn(name, "\n")] = '\0';
        printf("Cleaned input: %s\n", name);
    }

    return 0;
}

That strcspn trick is one of the most useful string patterns in everyday C code.
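
The same trick can also tell you whether the line was cut short. This is a sketch with a hypothetical helper name, strip_newline, that reports whether a newline was actually present.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: remove the first '\n' from line, if any.
   Returns true when a newline was found, meaning a complete line was read. */
bool strip_newline(char *line) {
    size_t end = strcspn(line, "\n");
    bool found = (line[end] == '\n');
    line[end] = '\0';
    return found;
}
```

A false result after fgets means either the input line was longer than the buffer or the input ended without a final newline. Which of the two happened is something the caller still has to decide.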

7. Common string pitfalls

Off-by-one errors

If you want to store 5 visible characters, you need room for 6 bytes total.

char ok[6] = "hello";   // fits: h e l l o \0
// char bad[5] = "hello"; // compiles in C, but leaves no room for '\0', so it is not a valid string

Using uninitialized memory as a string

char buffer[20];
printf("%s\n", buffer); // Wrong: buffer does not contain a valid string yet
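
A simple remedy, assuming an empty string is an acceptable starting value, is to give the buffer a terminator before anything reads it. The helper name reset_string is hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper: make buf a valid empty string.
   Does nothing when cap is 0, since there is no byte to write. */
void reset_string(char *buf, size_t cap) {
    if (cap > 0) {
        buf[0] = '\0';
    }
}
```

At the point of declaration, char buffer[20] = ""; does the same job and zero-fills the rest of the array as well.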

Forgetting that strncpy is weird too

Many people reach for strncpy as a safe version of strcpy, but it has odd behavior: it may not null-terminate the destination if the source is too long.

char dst[5];
strncpy(dst, "abcdef", sizeof(dst));
// dst is not null-terminated here: all 5 bytes were filled with "abcde"

If you use it, terminate manually:

strncpy(dst, "abcdef", sizeof(dst) - 1);
dst[sizeof(dst) - 1] = '\0';
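
Wrapped in a function, the pattern might look like the sketch below. The name copy_truncated is hypothetical, and the code assumes dst_size is at least 1.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy at most dst_size - 1 characters and always terminate.
   Assumes dst_size is at least 1. */
void copy_truncated(char *dst, size_t dst_size, const char *src) {
    strncpy(dst, src, dst_size - 1); /* pads with '\0' if src is shorter */
    dst[dst_size - 1] = '\0';        /* covers the case where src is longer */
}
```

With char dst[5], copying "abcdef" leaves "abcd". Copying a short source such as "hi" also fills the remaining bytes with zeros, which is another strncpy quirk: it always writes the full count.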

Printing a non-string with %s

char c = 'A';
printf("%s\n", &c); // Wrong: not a null-terminated string

The format specifier %s assumes the pointer refers to a valid null-terminated string. If it does not, the behavior is undefined.
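
For a single character, %c is the right specifier. For bytes that are not null-terminated, a precision such as %.*s caps how many bytes are read. The sketch below uses snprintf so the result can be inspected; the helper name format_bytes is hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: format at most n bytes from data into out.
   data need not be null-terminated as long as it holds at least n bytes. */
int format_bytes(char *out, size_t out_size, const char *data, int n) {
    return snprintf(out, out_size, "%.*s", n, data);
}
```

format_bytes(out, sizeof(out), &c, 1) produces "A" without reading past the single char.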

8. Safer patterns you should actually use

Pattern 1: always size buffers intentionally

#define NAME_SIZE 32
char name[NAME_SIZE];

Pattern 2: prefer fgets for line input

if (fgets(name, sizeof(name), stdin)) {
    name[strcspn(name, "\n")] = '\0';
}

Pattern 3: track capacity, not just current contents

When you write helper functions, pass both the destination buffer and its size.

#include <stdio.h>

void greet(char *dst, size_t dst_size, const char *name) {
    snprintf(dst, dst_size, "Hello, %s!", name);
}

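snprintf never writes more than dst_size bytes and null-terminates the result whenever dst_size is greater than zero, which makes it much nicer than building strings manually.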
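
The return value of snprintf also tells you whether the output was cut short. This is a sketch of a checked variant; the name greet_checked is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical variant of greet: returns true only if the whole greeting fit. */
bool greet_checked(char *dst, size_t dst_size, const char *name) {
    int n = snprintf(dst, dst_size, "Hello, %s!", name);
    /* n is the length the full output would have had, or negative on error. */
    return n >= 0 && (size_t)n < dst_size;
}
```

With an 8-byte buffer and the name "Ann", the function returns false and the buffer holds the truncated but still terminated text "Hello, ".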

Pattern 4: use const char * for read-only text

void print_message(const char *msg) {
    printf("%s\n", msg);
}

Pattern 5: compare contents, never addresses, unless you truly mean addresses

if (strcmp(input, "quit") == 0) {
    // exit program
}

A lot of safe string handling in C comes down to three habits: know your buffer size, ensure null termination, and use the right library function for the job.

9. Mini example: normalize a username

Here is a small example that reads a username, trims the newline, rejects empty input, and prints a lowercased version.

#include <stdio.h>
#include <string.h>
#include <ctype.h>

#define USER_SIZE 32

int main(void) {
    char user[USER_SIZE];

    printf("Enter username: ");

    if (!fgets(user, sizeof(user), stdin)) {
        return 1;
    }

    user[strcspn(user, "\n")] = '\0';

    if (strlen(user) == 0) {
        printf("Username cannot be empty.\n");
        return 1;
    }

    for (size_t i = 0; user[i] != '\0'; i++) {
        user[i] = (char)tolower((unsigned char)user[i]);
    }

    printf("Normalized username: %s\n", user);
    return 0;
}

This example shows several good habits together:

  • Use a fixed-size buffer with a named constant.
  • Use fgets instead of unsafe input functions.
  • Remove the newline explicitly.
  • Use the null terminator as the loop condition.
  • Cast to unsigned char before calling tolower.
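
If you want to reuse or test the same steps without reading from stdin, they can be pulled into a function. This is one possible sketch; the name normalize_username and its return convention are assumptions made for illustration.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: strip the newline and lowercase line in place.
   Returns false if nothing is left after stripping. */
bool normalize_username(char *line) {
    line[strcspn(line, "\n")] = '\0';
    if (line[0] == '\0') {
        return false;
    }
    for (size_t i = 0; line[i] != '\0'; i++) {
        line[i] = (char)tolower((unsigned char)line[i]);
    }
    return true;
}
```

Given the buffer "AliCe\n", the function leaves "alice" and returns true. Given "\n" alone, it returns false.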