However, if you’re doing C, C++ or assembly code, or if you implement a new external module in your favorite programming language, you will need to manage your dynamic memory allocation yourself.
What’s dynamic allocation? Why do I need malloc?
Well, in every application, when you create a new variable (often called declaring a variable), you need memory to store it. Modern computers run more than one application at a time, so each application has to tell your OS (here, Linux) how much memory it needs. When you write this kind of code:
#include <stdlib.h>

#define DISK_SPACE_ARRAY_LENGTH 7

void getFreeDiskSpace(int statsList[], size_t listLength) {
    /* Placeholder: the real statistics would be written to statsList here. */
    return;
}

int main(void) {
    /* Contains the free disk space of the last 7 days. */
    int freeDiskSpace[DISK_SPACE_ARRAY_LENGTH] = {0};

    getFreeDiskSpace(freeDiskSpace, DISK_SPACE_ARRAY_LENGTH);
    return EXIT_SUCCESS;
}
The freeDiskSpace array needs memory, so normally you would have to ask Linux for it. However, since it’s obvious from the source code that you need an array of 7 int, the compiler takes care of it automatically and reserves the space on the stack. This basically means that the storage is destroyed when you return from the function where the variable is declared. That’s why you can’t do this:
#include <stdlib.h>

#define DISK_SPACE_ARRAY_LENGTH 7

int* getFreeDiskSpace(void) {
    int statsList[DISK_SPACE_ARRAY_LENGTH] = {0};

    /* WHY ARE WE DOING THAT?! statsList will be DESTROYED! */
    return statsList;
}

int main(void) {
    /* Contains the free disk space of the last 7 days. */
    int *freeDiskSpace = NULL;

    freeDiskSpace = getFreeDiskSpace();
    return EXIT_SUCCESS;
}
Do you see the problem more easily now? Next, say you want to concatenate two strings. In Python and JavaScript, you would simply join them with the + operator.
But as you know, C doesn’t work like this. To build a URL, for example, you need to concatenate two strings, such as the domain name and the URL path. C has strcat, but it only works if the destination array has enough room for the result.
You’ll be tempted to compute the length of the new string with strlen, and you would be right. But then, how would you ask Linux to reserve this unknown amount of memory? The compiler can’t help you: the exact amount of space you need is only known at runtime. That’s exactly where you need dynamic allocation, and malloc.
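To make the limitation concrete, here is a minimal sketch of the fixed-size approach. The 32-byte buffer and the joinFixed name are assumptions picked for this illustration; the point is only that a buffer sized at compile time forces you to reject any input that doesn’t fit.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical size, chosen at compile time for this illustration. */
#define URL_BUFFER_SIZE 32

/*
 * Joins domain and path into out, which must hold URL_BUFFER_SIZE bytes.
 * Returns 0 on success, -1 when the result (plus its '\0') does not fit.
 */
int joinFixed(char out[URL_BUFFER_SIZE], const char* domain, const char* path) {
    size_t needed = 0;

    if (out == NULL || domain == NULL || path == NULL) {
        return -1;
    }

    /* The real length is only known here, at runtime. */
    needed = strlen(domain) + strlen(path) + 1;
    if (needed > URL_BUFFER_SIZE) {
        /* A fixed array can't grow, so all we can do is refuse. */
        return -1;
    }

    strcpy(out, domain);
    strcat(out, path);
    return 0;
}
```

A short URL fits, but a longer path makes the function fail, and picking a bigger constant only moves the problem further away.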
Writing my first C function using malloc
Before writing code, a little explanation: malloc allows you to allocate a specific number of bytes for your application to use. It’s really simple: you call malloc with the number of bytes you need, and it returns a pointer to the new memory area that has been reserved for you.
You have only 3 responsibilities:
- Check if malloc returns NULL. That happens when Linux does not have enough memory to provide.
- Free your variables once they are no longer used. Otherwise you’ll waste memory (a memory leak), which can eventually slow down your application.
- Never use the memory zone after you have freed the variable.
If you follow these rules, all will go well and dynamic allocation will solve many problems for you. Because you choose when to free the memory, you can also safely return a variable allocated with malloc. Just don’t forget to free it!
If you wonder how to free a variable, it’s with the free function. Call it with the same pointer that malloc returned, and the memory is freed.
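As a quick illustration, here is how the earlier getFreeDiskSpace function could be written so that returning the array becomes safe. This is only a sketch: the zeros are placeholder values, since the original example never fills in real statistics.

```c
#include <stdlib.h>

#define DISK_SPACE_ARRAY_LENGTH 7

/*
 * Returns a heap-allocated array of DISK_SPACE_ARRAY_LENGTH ints, or NULL
 * if malloc fails. The caller must call free on the result once it is no
 * longer used.
 */
int* getFreeDiskSpace(void) {
    size_t i = 0;
    int* statsList = malloc(sizeof(int) * DISK_SPACE_ARRAY_LENGTH);

    /* Rule 1: malloc may return NULL, so check before using the pointer. */
    if (statsList == NULL) {
        return NULL;
    }

    /* malloc does not initialize the memory; write placeholder values. */
    for (i = 0; i < DISK_SPACE_ARRAY_LENGTH; i++) {
        statsList[i] = 0;
    }

    /* The memory outlives this function: it stays valid until free. */
    return statsList;
}
```

The caller checks the result against NULL, uses the array, then calls free on it and stops using the pointer afterwards.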
Let me show you with the concat example:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * When calling this function, don't forget to check if the return value is NULL.
 * If it's not NULL, you must call free on the returned pointer once the value
 * is no longer used.
 */
char* getUrl(const char* const baseUrl, const char* const toolPath) {
    size_t finalUrlLen = 0;
    char* finalUrl = NULL;

    /* Safety check. */
    if (baseUrl == NULL || toolPath == NULL) {
        return NULL;
    }

    finalUrlLen = strlen(baseUrl) + strlen(toolPath);

    /* Don't forget the terminating '\0', hence the + 1. */
    finalUrl = malloc(sizeof(char) * (finalUrlLen + 1));

    /* Following malloc rules... */
    if (finalUrl == NULL) {
        return NULL;
    }

    strcpy(finalUrl, baseUrl);
    strcat(finalUrl, toolPath);
    return finalUrl;
}

int main(void) {
    char* googleImages = NULL;

    googleImages = getUrl("https://www.google.com", "/imghp");
    if (googleImages == NULL) {
        return EXIT_FAILURE;
    }

    puts("Tool URL:");
    puts(googleImages);

    /* It's no longer needed, free it. */
    free(googleImages);
    googleImages = NULL;
    return EXIT_SUCCESS;
}
So there you have a practical example of dynamic allocation. First, I avoid pitfalls such as passing the return value of getUrl straight to puts. I also take the time to document the fact that the return value must be freed properly. And I check for NULL values everywhere, so anything unexpected is caught safely instead of crashing the application.
Finally, I take extra care to free the variable and then set the pointer to NULL. That removes the temptation to use, even by mistake, the now-freed memory zone. But as you can see, it’s easy to free a variable.
You may notice that I used sizeof in the malloc call. It tells you how many bytes a char uses and clarifies the intent of the code, so it’s more readable. sizeof(char) is always equal to 1, but if you use an array of int instead, it works exactly the same way. For example, if you need to reserve 45 int, just call malloc(sizeof(int) * 45).
This way, you quickly see how much you want to allocate, which is why I always recommend using it.
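Here is a small sketch of that idea applied to an int array. The makeNumberedInts helper is hypothetical, made up for this illustration; it also guards against a count so large that the multiplication would overflow, and it assumes the count fits in an int.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: allocates `count` ints and numbers them 1..count.
 * Returns NULL if count is 0, if the size would overflow, or if malloc
 * fails. The caller must call free on the result.
 */
int* makeNumberedInts(size_t count) {
    int* values = NULL;
    size_t i = 0;

    if (count == 0 || count > SIZE_MAX / sizeof(int)) {
        return NULL;
    }

    /* Room for `count` ints, whatever the size of an int is here. */
    values = malloc(sizeof(int) * count);
    if (values == NULL) {
        return NULL;
    }

    for (i = 0; i < count; i++) {
        values[i] = (int) (i + 1);
    }
    return values;
}
```

Calling makeNumberedInts(45) reserves room for 45 int without you ever hard-coding the size of an int.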
How does malloc work under the hood?
malloc and free are, in fact, functions from the C standard library, available to every C program, that talk to Linux on your behalf. They also make dynamic allocation easier because Linux itself doesn’t let you allocate memory in arbitrary sizes.
Linux provides two ways to get more memory: sbrk and mmap. Both have limitations, and one of them is that memory is only handed out in relatively big amounts, such as 4,096 bytes or 8,192 bytes. You can’t get just the few dozen bytes I needed in the example, and you can’t get exactly 5,894 bytes either.
There is a reason for this: Linux needs to keep a table recording which application has reserved which memory zone. This table uses space as well, so if every byte needed its own row, a big share of memory would go to the table itself. That’s why memory is split into big blocks of, for example, 4,096 bytes. Much like you can’t buy two and a half oranges at a grocery store, you can’t ask for half a block.
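To put numbers on this, here is a minimal sketch that rounds a request up to whole blocks (usually called pages). The helper names are mine, and the 4,096 fallback is an assumption; sysconf(_SC_PAGESIZE) is the POSIX way to ask the system for its real block size.

```c
#include <stddef.h>
#include <unistd.h>

/* Rounds `bytes` up to a whole number of blocks of `pageSize` bytes. */
size_t roundUpToPages(size_t bytes, size_t pageSize) {
    size_t pages = (bytes + pageSize - 1) / pageSize;

    return pages * pageSize;
}

/* Asks the system for its block size; assumes 4096 if it can't tell. */
size_t getPageSize(void) {
    long pageSize = sysconf(_SC_PAGESIZE);

    return pageSize > 0 ? (size_t) pageSize : 4096;
}
```

With 4,096-byte blocks, a 50-byte request costs a full 4,096 bytes and a 5,894-byte request costs 8,192 bytes.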
So malloc takes these big blocks and gives you a little slice of them whenever you call it. Likewise, if you free a few variables, but not enough to justify releasing a whole block, malloc may keep the blocks and recycle the freed memory zones when you call malloc again. This has the benefit of making malloc faster. However, memory kept by malloc can’t be used by any other application, even though your program isn’t actually using it.
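The slicing idea can be sketched with a toy allocator. To be clear, this is not how a real malloc is implemented: it is a deliberately naive illustration (it needs C11, and the sliceAlloc name is made up) that hands out pieces of one 4,096-byte block and never recycles anything.

```c
#include <stddef.h>

#define BLOCK_SIZE 4096

/* One big block, standing in for memory obtained from Linux. */
static _Alignas(max_align_t) unsigned char block[BLOCK_SIZE];
static size_t usedBytes = 0;

/* Returns a slice of at least `bytes` bytes, or NULL if the block is full. */
void* sliceAlloc(size_t bytes) {
    const size_t alignment = _Alignof(max_align_t);
    size_t rounded = 0;
    void* slice = NULL;

    if (bytes == 0 || bytes > BLOCK_SIZE) {
        return NULL;
    }

    /* Round up so that every slice stays suitably aligned. */
    rounded = (bytes + alignment - 1) / alignment * alignment;
    if (rounded > BLOCK_SIZE - usedBytes) {
        return NULL;
    }

    /* Hand out the next unused part of the block. */
    slice = &block[usedBytes];
    usedBytes += rounded;
    return slice;
}
```

Two calls to sliceAlloc(50) return two different zones inside the same block. A real malloc also tracks freed slices so it can hand them out again.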
But malloc is smart: if you ask it for 16 MiB or some other big amount, it will probably ask Linux for full blocks dedicated to this one big variable by using mmap. This way, when you call free, those blocks can be given back to Linux right away, which avoids that waste of space. Don’t worry, malloc does a much better job at recycling than we humans do with our garbage!
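If you’re curious what such a request looks like, here is a rough sketch of asking Linux for dedicated blocks directly with mmap, then giving them back with munmap. The reserveBig and releaseBig names are made up, and this only illustrates the kind of call malloc can make on your behalf; in normal code you should keep using malloc and free.

```c
#define _DEFAULT_SOURCE /* Makes glibc expose MAP_ANONYMOUS. */
#include <stddef.h>
#include <sys/mman.h>

/*
 * Reserves `bytes` of zero-filled memory straight from Linux.
 * Returns NULL on failure.
 */
void* reserveBig(size_t bytes) {
    void* zone = mmap(NULL, bytes, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    return zone == MAP_FAILED ? NULL : zone;
}

/* Gives the blocks back to Linux. Returns 0 on success, -1 on failure. */
int releaseBig(void* zone, size_t bytes) {
    return munmap(zone, bytes);
}
```

Unlike free, munmap needs the size again, which is one of the bookkeeping details malloc handles for you.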
Conclusion
I think you now understand better how all of that works. Of course, dynamic allocation is a big topic, and one could write a full book about it, but this article should make you comfortable with the concept, both in general and through practical programming advice.