1. What a string really is
In many languages, a string is a built-in object with a known length and lots of behavior. In C, a string is much more primitive. It is just a sequence of characters stored in memory, ending with a special byte whose value is zero.
char word[] = "cat";
That creates an array with four bytes:
'c' 'a' 't' '\0'
The last byte, '\0', is what tells C, “the string ends here.”
Without it, most string functions have no idea where to stop reading.
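To make the role of the terminator concrete, here is a minimal sketch of how a length-counting routine can work. The name count_until_nul is hypothetical and used only for illustration; the standard library's strlen does the same job.

```c
#include <stddef.h>

/* Hypothetical helper: counts the bytes that come before the terminating '\0'. */
size_t count_until_nul(const char *s) {
    size_t n = 0;
    while (s[n] != '\0') {   /* the zero byte is the only stop signal */
        n++;
    }
    return n;
}
```

Called on "cat", this returns 3: the loop visits 'c', 'a', 't' and stops at the fourth byte.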
2. The null terminator is the whole game
If you remember one thing from this article, remember this: C string functions depend on the null terminator. They do not carry a separate length field around.
#include <stdio.h>

int main(void) {
    char name[6] = {'A', 'l', 'i', 'c', 'e', '\0'};
    printf("%s\n", name);
    return 0;
}
This prints Alice. But if you forget the terminator:
char broken[5] = {'A', 'l', 'i', 'c', 'e'};
then printf("%s", broken) keeps reading past the end of the array until it happens to hit a zero byte somewhere else. That is undefined behavior: you may get garbage output, a crash, or output that only looks correct by accident.
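When you know a buffer's size, you can check for a terminator without reading past the end. The sketch below assumes a hypothetical helper named has_terminator; it relies on the standard memchr function, which never looks beyond the byte count you give it.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if buf contains a '\0' somewhere in its
   first size bytes, 0 otherwise. It never reads beyond buf[size - 1]. */
int has_terminator(const char *buf, size_t size) {
    return memchr(buf, '\0', size) != NULL;
}
```

For the name array above this returns 1, and for the five-byte broken array it returns 0.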
3. Arrays vs pointers: similar, but not the same
This is one of the biggest sources of confusion in C. These look similar:
char a[] = "hello";
char *p = "hello";
But they are not equivalent.
char a[] = "hello";
Creates an array containing a copy of the characters:
'h' 'e' 'l' 'l' 'o' '\0'.
The array itself is storage you own in that scope.
char *p = "hello";
Creates a pointer to a string literal.
The pointer variable is writable, but the literal data should be treated as read-only.
#include <stdio.h>

int main(void) {
    char a[] = "hello";
    char *p = "hello";
    a[0] = 'H'; // OK
    // p[0] = 'H'; // Undefined behavior: do not modify string literals
    printf("%s\n", a);
    printf("%s\n", p);
    return 0;
}
Arrays also behave differently with sizeof:
char a[] = "hello";
char *p = "hello";
printf("%zu\n", sizeof(a)); // 6
printf("%zu\n", sizeof(p)); // size of pointer, usually 8 or 4
That weirdness catches a lot of people. The array knows its storage size at compile time. The pointer only knows it points somewhere.
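A related trap is that sizeof on an array reports its storage, not the length of the text inside it. The following sketch uses a hypothetical function named measure and an arbitrary 32-byte array to show the two numbers side by side.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: reports the capacity and the current text length
   of a local array through the two output pointers. */
void measure(size_t *capacity, size_t *length) {
    char a[32] = "hello";   /* 32 bytes of storage, 5 visible characters */
    *capacity = sizeof a;   /* 32: storage size, fixed at compile time */
    *length = strlen(a);    /* 5: bytes before the first '\0' */
}
```

Use sizeof when you need to know how much room you have, and strlen when you need to know how much of it is in use.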
4. String literals are special
A string literal is the quoted text in your source code:
"hello world"
It lives in static storage, and in practice it should be treated as read-only.
Even though old code sometimes uses char * for string literals, modifying them is undefined behavior.
char *msg = "hi";
// msg[0] = 'H'; // Don't do this
const char *safe = "hi";
Prefer const char * when pointing to string literals.
It communicates intent and prevents accidental writes.
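If you do need a modified version of literal text, one option is to copy it into a buffer you own and change the copy. This is a minimal sketch; capitalize_copy is a hypothetical name, and it assumes truncating an overly long source is acceptable.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: copies src into dst (truncating if it does not fit),
   then uppercases the first character of the copy. src is never modified. */
void capitalize_copy(char *dst, size_t dst_size, const char *src) {
    if (dst_size == 0) {
        return;                             /* nowhere to write */
    }
    snprintf(dst, dst_size, "%s", src);     /* the copy is ours to modify */
    dst[0] = (char)toupper((unsigned char)dst[0]);
}
```

Because the parameter is const char *, the compiler will reject any attempt inside the function to write through src.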
5. Copying, comparing, and measuring strings
C does not let you assign one character array to another after declaration. This surprises people coming from other languages.
char a[20] = "cat";
char b[20];
// b = a; // Invalid for arrays
Instead, use string functions from <string.h>.
Length
#include <string.h>
size_t len = strlen("hello"); // 5
strlen counts characters until '\0'. It does not count the terminator itself.
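Because strlen leaves out the terminator, any buffer sized from it needs one extra byte. Here is a sketch of that rule in a heap-allocating copy; the function name duplicate is hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap copy of s, or NULL if allocation fails.
   The caller is responsible for calling free() on the result. */
char *duplicate(const char *s) {
    size_t len = strlen(s);          /* visible characters only */
    char *copy = malloc(len + 1);    /* +1 for the terminator */
    if (copy == NULL) {
        return NULL;
    }
    memcpy(copy, s, len + 1);        /* copies the '\0' as well */
    return copy;
}
```

Forgetting the + 1 is one of the most common ways to end up with a buffer that is one byte too small.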
Copy
#include <string.h>
char src[] = "orange";
char dst[20];
strcpy(dst, src);
This works only if dst is large enough. If it is too small, you overflow the buffer.
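One way to guard against that overflow is to check the size before copying. The sketch below assumes a hypothetical helper called copy_checked that refuses to copy when the source does not fit.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies src into dst only if it fits, terminator included.
   Returns 0 on success, -1 if dst is too small (dst is left untouched). */
int copy_checked(char *dst, size_t dst_size, const char *src) {
    if (strlen(src) + 1 > dst_size) {
        return -1;
    }
    strcpy(dst, src);   /* safe: the size was checked above */
    return 0;
}
```

With a 20-byte destination, copying "orange" succeeds; with a 4-byte destination, the function returns -1 instead of writing out of bounds.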
Compare
if (strcmp("cat", "cat") == 0) {
    // equal
}
Do not use == to compare string contents.
char a[] = "cat";
char b[] = "cat";
if (a == b) {
    // false: compares addresses, not contents
}
== compares where two pointers point. strcmp compares the bytes inside the strings.
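strcmp also tells you about ordering: it returns a negative value when the first string sorts before the second, and a positive value when it sorts after. A small sketch, using the hypothetical helper names same_text and sorts_before:

```c
#include <string.h>

/* Hypothetical helper: 1 if both strings hold the same bytes, 0 otherwise. */
int same_text(const char *a, const char *b) {
    return strcmp(a, b) == 0;
}

/* Hypothetical helper: 1 if a sorts before b in byte order, 0 otherwise. */
int sorts_before(const char *a, const char *b) {
    return strcmp(a, b) < 0;
}
```

Only the sign of the result is meaningful, so compare it against 0 rather than against specific values such as -1 or 1.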
| Task | Function | Important detail |
|---|---|---|
| Find length | strlen | Needs a valid null-terminated string |
| Copy | strcpy | Destination must have enough space |
| Compare | strcmp | Returns 0 when equal |
| Concatenate | strcat | Destination must already contain a valid string and have extra room |
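The strcat requirements in the table can be checked up front. This is a sketch under the assumption that dst already holds a valid string shorter than dst_size; append_checked is a hypothetical name.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: appends suffix to dst only if the result fits.
   Assumes dst already contains a valid null-terminated string.
   Returns 0 on success, -1 if there is not enough room (dst is unchanged). */
int append_checked(char *dst, size_t dst_size, const char *suffix) {
    size_t used = strlen(dst);                    /* bytes already in use */
    if (used + strlen(suffix) + 1 > dst_size) {   /* +1 for the terminator */
        return -1;
    }
    strcat(dst, suffix);
    return 0;
}
```

Starting from a 12-byte buffer holding "foo", appending "bar" succeeds and yields "foobar", while appending a further six characters is refused.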
6. Reading input safely is harder than it should be
One reason strings feel weird in C is that input functions can be tricky.
The old gets function was so unsafe that C11 removed it from the standard library.
A much better option is fgets:
#include <stdio.h>

int main(void) {
    char name[32];
    printf("Enter your name: ");
    if (fgets(name, sizeof(name), stdin)) {
        printf("You typed: %s", name);
    }
    return 0;
}
fgets reads at most sizeof(name) - 1 characters and, on success, null-terminates the result. That is good. The weird part is that it keeps the trailing newline whenever the whole line fits in the buffer.
#include <stdio.h>
#include <string.h>

int main(void) {
    char name[32];
    if (fgets(name, sizeof(name), stdin)) {
        name[strcspn(name, "\n")] = '\0';
        printf("Cleaned input: %s\n", name);
    }
    return 0;
}
That strcspn trick is one of the most useful string patterns in everyday C code.
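The reason the trick is safe is that strcspn returns the index of the first newline, or the length of the string if there is none, so the write always lands on either the '\n' or the existing '\0'. The sketch below spells that out in a hypothetical helper named strip_newline, which also reports whether a newline was found.

```c
#include <string.h>

/* Hypothetical helper: removes the first '\n' in buf, if any.
   Returns 1 if a newline was removed, 0 if none was present. */
int strip_newline(char *buf) {
    size_t i = strcspn(buf, "\n");   /* index of first '\n', or strlen(buf) */
    if (buf[i] == '\n') {
        buf[i] = '\0';
        return 1;
    }
    return 0;                        /* buf[i] was already '\0' */
}
```

After fgets, a return value of 0 usually means the line was longer than the buffer or the input ended without a newline.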
7. Common string pitfalls
Off-by-one errors
If you want to store 5 visible characters, you need room for 6 bytes total.
char ok[6] = "hello"; // fits: h e l l o \0
// char bad[5] = "hello"; // legal C, but there is no room for '\0', so bad is not a valid string
Using uninitialized memory as a string
char buffer[20];
printf("%s\n", buffer); // Wrong: buffer does not contain a valid string yet
Forgetting that strncpy is weird too
Many people reach for strncpy as a safe version of strcpy, but it has odd behavior: it does not null-terminate the destination when the source has at least as many characters as the count you pass.
char dst[5];
strncpy(dst, "abcdef", sizeof(dst));
// dst is not null-terminated here: all 5 bytes were filled with "abcde"
If you use it, terminate manually:
strncpy(dst, "abcdef", sizeof(dst) - 1);
dst[sizeof(dst) - 1] = '\0';
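Wrapping those two lines in a function keeps the manual termination from being forgotten. This is a minimal sketch; copy_truncate is a hypothetical name, and it assumes dst_size is at least 1.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies as much of src as fits and always terminates dst.
   Requires dst_size >= 1. Returns 1 if src was truncated, 0 if it fit. */
int copy_truncate(char *dst, size_t dst_size, const char *src) {
    strncpy(dst, src, dst_size - 1);   /* leaves the last byte for '\0' */
    dst[dst_size - 1] = '\0';          /* strncpy skips this when src is too long */
    return strlen(src) >= dst_size;
}
```

With a 5-byte destination and the source "abcdef", dst ends up holding "abcd" and the function returns 1 to signal the truncation.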
Printing a non-string with %s
char c = 'A';
printf("%s\n", &c); // Wrong: not a null-terminated string
%s assumes the pointer refers to a valid null-terminated string.
If it does not, the behavior is undefined.
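For a single character, %c is the right conversion. For a block of bytes with a known length but no terminator, a precision on %s limits how many bytes are read. The sketch below uses a hypothetical helper named format_bytes to show the precision form.

```c
#include <stdio.h>

/* Hypothetical helper: formats at most n bytes from data into out.
   data does not need to be null-terminated as long as it holds n bytes.
   Returns the value reported by snprintf. */
int format_bytes(char *out, size_t out_size, const char *data, int n) {
    return snprintf(out, out_size, "%.*s", n, data);
}
```

The `*` in "%.*s" takes the precision from the int argument that precedes the pointer, so the call never reads past data[n - 1].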
8. Safer patterns you should actually use
Pattern 1: always size buffers intentionally
#define NAME_SIZE 32
char name[NAME_SIZE];
Pattern 2: prefer fgets for line input
if (fgets(name, sizeof(name), stdin)) {
    name[strcspn(name, "\n")] = '\0';
}
Pattern 3: track capacity, not just current contents
When you write helper functions, pass both the destination buffer and its size.
#include <stdio.h>

void greet(char *dst, size_t dst_size, const char *name) {
    snprintf(dst, dst_size, "Hello, %s!", name);
}
snprintf is often much nicer than building strings manually.
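snprintf also reports how many characters the full result would have needed, which makes truncation easy to detect. Here is a sketch of a checked variant of the helper; greet_checked is a hypothetical name.

```c
#include <stdio.h>

/* Hypothetical helper: writes "Hello, <name>!" into dst.
   Returns 0 if the whole message fit, -1 on truncation or an encoding error. */
int greet_checked(char *dst, size_t dst_size, const char *name) {
    int n = snprintf(dst, dst_size, "Hello, %s!", name);
    if (n < 0 || (size_t)n >= dst_size) {
        return -1;   /* n counts the characters needed, excluding the '\0' */
    }
    return 0;
}
```

Even when it returns -1 because of truncation, dst still holds a valid null-terminated string as long as dst_size is nonzero.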
Pattern 4: use const char * for read-only text
void print_message(const char *msg) {
    printf("%s\n", msg);
}
Pattern 5: compare contents, never addresses, unless you truly mean addresses
if (strcmp(input, "quit") == 0) {
    // exit program
}
9. Mini example: normalize a username
Here is a small example that reads a username, trims the newline, rejects empty input, and prints a lowercase, normalized version.
#include <stdio.h>
#include <string.h>
#include <ctype.h>

#define USER_SIZE 32

int main(void) {
    char user[USER_SIZE];
    printf("Enter username: ");
    if (!fgets(user, sizeof(user), stdin)) {
        return 1;
    }
    user[strcspn(user, "\n")] = '\0';
    if (strlen(user) == 0) {
        printf("Username cannot be empty.\n");
        return 1;
    }
    for (size_t i = 0; user[i] != '\0'; i++) {
        user[i] = (char)tolower((unsigned char)user[i]);
    }
    printf("Normalized username: %s\n", user);
    return 0;
}
This example shows several good habits together:
- Use a fixed-size buffer with a named constant.
- Use fgets instead of unsafe input functions.
- Remove the newline explicitly.
- Use the null terminator as the loop condition.
- Cast to unsigned char before calling tolower.