C Language - String Arrays, Pointers, and Safe Copying
1. C Language Strings
- In C, a string is a contiguous sequence of characters (char) in memory, terminated by a Null character ('\0').
- The Null character is represented as '\0' or 0x00.
- There is no dedicated data type for representing strings; string handling is performed through pointers and arrays.
#include <stdio.h>
#include <string.h>

int main(void)
{
    char str[16];   /* uninitialized: contents are indeterminate until written */

    strcpy(str, "012345678901234");   /* 15 characters + '\0' = 16 bytes, an exact fit */
    printf("%s,len=%zu,size=%zu\n", str, strlen(str), sizeof(str));
    return 0;
}

※ Local variables declared without an initializer, such as char str[16], hold indeterminate values (garbage values) until they are written.
※ %zu is the format specifier for size_t, the type produced by the sizeof operator and returned by strlen(). Using it instead of %ld keeps the code portable across platforms.
output
012345678901234,len=15,size=16

The size of char str[16] is 16 bytes. Because a Null character must be included at the end of the string, it can store a string of at most 15 characters.
"012345678901234\0"
←─────15──────→ (strlen)
←─────16───────→ (sizeof)
- The strlen() function returns the length of a string by counting characters until '\0' appears.
- The sizeof operator, on the other hand, yields the number of bytes a variable occupies in memory, regardless of the string's contents.
※ This example does not validate the input length: strcpy() keeps writing past the end of the array if the source is longer than 15 characters. This is a Buffer Overflow, which can corrupt memory or make the program malfunction.
1.1. String Initialization
- String initialization is the process of initially assigning characters to an array or pointer.
- In C, where a string is stored and whether it can be modified depend on how it is initialized, so you must understand the differences between the approaches.
#include <stdio.h>
#include <string.h>

int main(void)
{
    char str1[32] = "01234567890123456789";
    char str2[] = "01234567890123456789";
    char *str3 = "01234567890123456789";

    printf("str1: %s,len=%zu,size=%zu\n", str1, strlen(str1), sizeof(str1));
    printf("str2: %s,len=%zu,size=%zu\n", str2, strlen(str2), sizeof(str2));
    printf("str3: %s,len=%zu,size=%zu\n", str3, strlen(str3), sizeof(str3));
    return 0;
}

- char str1[32] = "...": explicitly specifies a 32-byte array size → the remaining space is filled with '\0'
- char str2[] = "...": omits the array size → the compiler automatically sets it to string length + 1 ('\0')
- char *str3 = "...": declares the string variable as a pointer → it points to a string literal (read-only memory)
output
str1: 01234567890123456789,len=20,size=32
str2: 01234567890123456789,len=20,size=21
str3: 01234567890123456789,len=20,size=8

- strlen(): string length (length before the Null character) → 20 for all three variables
- sizeof():
  - str1 → 32 (entire array size)
  - str2 → 21 (20 characters + '\0')
  - str3 → 8 (size of the pointer itself on a 64-bit system)
1.2. String Literals
- A string literal must be treated as read-only data. Its type in C is an array of char, not const char, but the C standard makes any attempt to modify it undefined behavior, and most platforms store literals in read-only memory.
- This allows the compiler to keep only one copy when the same literal is used in multiple places, which improves memory efficiency, and it protects the literal from being modified through programmer error.
- const char str2[]: explicitly declares the array as const → an attempt to modify it causes a compilation error
- char *str3: points to a literal through a non-const pointer → an attempt to modify it compiles but typically causes a runtime error (segmentation fault)
1.3. Safe String Copy
- Copying strings is a frequent task in C, but if the length of the input string is not validated, a Buffer Overflow can occur.
- As mentioned earlier, functions like strcpy() copy strings without considering the target buffer size, so if a string longer than the copy destination array is passed, it can cause memory corruption or program malfunction.
- To prevent this, when copying a string, you must always consider the size of the destination buffer and limit the copy length while ensuring Null termination.
#include <stdio.h>
#include <string.h>

/* Copies src into dest, truncating if necessary, and always Null-terminates.
   Returns 0 on success (including truncation), -1 on invalid arguments. */
int safe_strcpy(char *dest, size_t dest_size, const char *src)
{
    if (dest == NULL || src == NULL) {
        return -1; // error: NULL pointer
    }
    if (dest_size == 0) {
        return -1; // error: destination size is 0
    }

    size_t src_len = strlen(src);
    size_t copy_len = 0;

    if (src_len < dest_size) {
        copy_len = src_len;
    } else {
        /* source too large: truncate, leaving room for '\0' */
        copy_len = dest_size - 1;
    }

    strncpy(dest, src, copy_len);
    dest[copy_len] = '\0';
    return 0; // success
}

int main(void)
{
    char str[16] = {0x00, };

    /* Manual approach: bounded copy plus explicit termination */
    strncpy(str, "01234567890123456789", sizeof(str) - 1);
    str[sizeof(str) - 1] = '\0';
    printf("%s,len=%zu,size=%zu\n", str, strlen(str), sizeof(str));

    /* Same result through the helper */
    if (0 != safe_strcpy(str, sizeof(str), "01234567890123456789")) {
        printf("string copy failed\n");
    } else {
        printf("%s,len=%zu,size=%zu\n", str, strlen(str), sizeof(str));
    }
    return 0;
}

※ char str[16] = {0x00, } zero-initializes the entire character array, so the buffer starts as an empty string and Null termination is guaranteed from the start.
output
012345678901234,len=15,size=16
012345678901234,len=15,size=16

- When using strncpy(), care must be taken: if you do not explicitly add a Null character after copying, the string may not terminate correctly.
  - This is because strncpy() copies at most the specified number of characters and does not append a Null character when the source is at least that long.
- To handle this safely at the function level, the safe_strcpy() function implemented here performs the copy as follows:
  - First validates the pointers and the buffer size.
  - Limits the copy length based on the buffer size.
  - After copying, explicitly adds a Null character ('\0') to ensure string termination.
※ When handling strings in C, skipping buffer size validation or Null termination can lead to fatal errors at any time. Safe string handling is a key factor in program stability and security.