Mike Higginbottom

I was recently digging into getaddrinfo() and as part of that I created a little sandbox to play with the code. getaddrinfo() returns a list of struct addrinfo which represent the various potential sockets you could connect to. In a real-world program you would essentially pick one of these, connect to it, have a bit of a chit chat and then disconnect. Finally, you need to call freeaddrinfo() to get glibc to free up the memory it allocated to these structures. But for the purposes of this sandbox I just wanted to call it, look at the returned list and then clean up. Like so:

#include <string.h>
#include <stdio.h>
#include <stdlib.h>
#include <netdb.h>
#include <arpa/inet.h>
#include <unistd.h>

int main (int argc, char **argv) {
  char *node = "google.com";
  char *service = "80";

  struct addrinfo hints;
  memset(&hints, 0, sizeof hints);

  struct addrinfo *res = NULL;

  setvbuf(stdout, NULL, _IONBF, 0);

  int ret = getaddrinfo(node, service, &hints, &res);
  printf("getaddrinfo() returned: %d\n", ret);
  if (ret != 0) {
    fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(ret));
  }
  else
  {
    printf("Returned struct addrinfo instances for node: %s and service: %s:\n\n", node, service);
    struct addrinfo *ai;
    for (ai = res; ai != NULL; ai = ai->ai_next) {
      // printAddrInfo(ai);
    }
  }
  freeaddrinfo(res);
 
  return 0;
}

If you want to build this and have a play with it using GDB for example, you’ll need to build your own version of glibc and possibly implement some version of printAddrInfo() to dump the returned structs to the screen. I might throw all this up online in another article and a GitHub repo at some point. But… not today’s problem.

Being a nosey little sod, I wondered how these two functions behave if an error occurs while building and populating this list. More specifically, do you need to call freeaddrinfo() if the call to getaddrinfo() fails for some reason? You know, cos either: possible memory leak or possible double free. Both of which are no bueno. So I went down the rabbit hole. And part way down I ended up looking at the source for freeaddrinfo(). The signature for this is void freeaddrinfo(struct addrinfo *ai) and the definition of struct addrinfo looks like this:

struct addrinfo {
  int              ai_flags;
  int              ai_family;
  int              ai_socktype;
  int              ai_protocol;
  socklen_t        ai_addrlen;
  struct sockaddr *ai_addr;
  char            *ai_canonname;
  struct addrinfo *ai_next;
};

My expectation was that freeaddrinfo() would step through the input list, grab a copy of ai_next, free up the memory allocated to the members ai_addr, ai_canonname and the struct itself, then step on to do the same to the stored next list item. But the code looks like this:

void freeaddrinfo (struct addrinfo *ai)
{
  struct addrinfo *p;

  while (ai != NULL)
    {
      p = ai;
      ai = ai->ai_next;
      free (p->ai_canonname);
      free (p);
    }
}

Waidda minit… Where’s the call to free ai_addr? Now obviously (well probably) this is not going to be a bug. Bitter and long experience has taught me that pretty much every time I identify a cast iron definite bug, the reality is I’m just demonstrating idiocy. Again.

First things first then, let’s make sure it’s not a bug. I ran it through Valgrind which reported no memory leaks. As expected, looks like I’m being an idiot. Next things next, in order to work out if everything that’s being allocated is being freed you need to work out what’s being allocated. And a quick search in glibc/src/nss/getaddrinfo.c finds a call to malloc() that indicates the allocation is happening in generate_addrinfo() at line 1087. I’ve stripped the code down a bit here to focus on the essential idea.

struct addrinfo *ai;
ai = malloc (sizeof (struct addrinfo) + socklen);
if (ai == NULL)
    return -EAI_MEMORY;

We’re allocating memory here for ai which is a pointer to a struct addrinfo but we’re ‘over-allocating’ in the call to malloc() by an additional number of bytes equal to socklen. What’s that then? A few lines before the malloc() call we have:

if (family == AF_INET6)
    socklen = sizeof (struct sockaddr_in6);
else
    socklen = sizeof (struct sockaddr_in);

Essentially, the size of the structure that ai_addr points to depends on the address family: it’s bigger for IPv6 than IPv4, mainly because of the address size but other things too. So, we malloc enough space for the struct addrinfo and the struct sockaddr_in6 or struct sockaddr_in in one shot.

Then we point ai_addr at the memory immediately following the struct addrinfo within this larger allocated block. Like so:

ai->ai_addr = (void *) (ai + 1);

So, if we go right back to the start and look at the code for freeaddrinfo() we can see that there’s no need to explicitly call free() on ai_addr because it’s just a pointer into the chunk of memory allocated to ai itself. The memory used by ai_addr is freed by the call to free(p).
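To see the trick in isolation, here is a small sketch that uses a made-up struct record in place of struct addrinfo. Everything in it (struct record, record_new(), record_list_free()) is hypothetical and only mirrors the shape of the glibc code: one malloc() covers the header plus the trailing data, the pointer member is aimed just past the header, and one free() per node releases both.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct addrinfo: a header that points at
   trailing data living in the same allocation. */
struct record {
  size_t         payload_len;
  unsigned char *payload;   /* points into this record's own block */
  struct record *next;
};

/* Allocate header and payload with a single malloc().
   Assumes data points at len readable bytes and len > 0. */
struct record *record_new(const void *data, size_t len) {
  struct record *r = malloc(sizeof *r + len);
  if (r == NULL)
    return NULL;

  r->payload_len = len;
  r->payload = (unsigned char *) (r + 1);  /* first byte past the header */
  r->next = NULL;
  memcpy(r->payload, data, len);
  return r;
}

/* One free() per node releases the header and its payload together. */
void record_list_free(struct record *r) {
  while (r != NULL) {
    struct record *p = r;
    r = r->next;
    free(p);
  }
}
```

Calling free(p->payload) here would be a bug rather than a tidy-up, because payload was never returned by malloc() on its own.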

So, I’m happy once again. Apart from the niggling question “Why?” It seems a little bit opaque to do it this way. Why not just conventionally and separately allocate socklen bytes to ai_addr and then explicitly call free(p->ai_addr) in freeaddrinfo()? I think that might be tomorrow’s little puzzle. Oh yeah, and as for the original question about getaddrinfo() failing and producing a memory leak or a double free? Again, an article for another day.
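For comparison, here is roughly what the conventional two-allocation version could look like, again with a made-up struct entry standing in for struct addrinfo. None of this is glibc code and it doesn't claim to answer the “Why?”; it is only a sketch of the alternative, with hypothetical names throughout.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical node whose data lives in its own, separate allocation. */
struct entry {
  size_t         data_len;
  unsigned char *data;      /* separately malloc()ed */
  struct entry  *next;
};

/* Allocate the node and its data with two malloc() calls.
   Assumes src points at len readable bytes and len > 0. */
struct entry *entry_new(const void *src, size_t len) {
  struct entry *e = malloc(sizeof *e);
  if (e == NULL)
    return NULL;

  e->data = malloc(len);
  if (e->data == NULL) {    /* second failure path: undo the first allocation */
    free(e);
    return NULL;
  }
  memcpy(e->data, src, len);
  e->data_len = len;
  e->next = NULL;
  return e;
}

/* Each node now needs two free() calls, in the right order. */
void entry_list_free(struct entry *e) {
  while (e != NULL) {
    struct entry *p = e;
    e = e->next;
    free(p->data);
    free(p);
  }
}
```

The visible differences are a second malloc() per node, an extra failure path to unwind, and a second free() in the cleanup loop. Whether any of that is the actual motivation in glibc is not something this sketch can tell you.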

Free Your addrinfo List With This One Weird Trick
