I just can’t skip the theatricals, so I’ll open up this article with them as well. It all really started with a rather simple-stupid repair item I kept in a backlog. We have some network calls that occasionally fail and we agreed that it’d be useful to have the IP address along with the hostname at the time of failure. It’d help some of our future root cause analysis. Simple.
Amusingly enough, this is exactly one of the situations where you start realizing how much complexity is hidden by higher level languages. For example, in most languages that I randomly picked (Python, C#, PHP), there is a function called gethostbyname() which takes a hostname (e.g. bing.com) and returns one or more IP addresses that the hostname resolves to.
Here’s a Python example:
socket.gethostbyname(hostname)
And here’s the description from Python’s docs: Translate a host name to IPv4 address format. The IPv4 address is returned as a string, such as ‘100.50.200.5’. If the host name is an IPv4 address itself it is returned unchanged. See gethostbyname_ex() for a more complete interface. gethostbyname() does not support IPv6 name resolution, and getaddrinfo() should be used instead for IPv4/v6 dual stack support.
If you observe closely and if you are one of the folks who are actually reading the docs (are you? are you?!?!), you might catch an issue with this one. But I’d presume that you, just like myself, are not in the business of reading the docs. You google what you need (e.g. “How to resolve Host to IP in Python”) and you just pick the first thing that pops up. You test, it works and life’s good. No second thought given. Until it bites your ass, but we’ll get to that.
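To make the limitation in that doc snippet concrete, here is a minimal Python sketch of the original backlog item: given a hostname, collect the addresses worth logging. The helper names ipv4_only and all_addresses are made up for this illustration; the only real APIs used are socket.gethostbyname() and socket.getaddrinfo() from the standard library.

```python
import socket


def ipv4_only(host):
    """The 'first search result' approach: exactly one IPv4 address."""
    return socket.gethostbyname(host)


def all_addresses(host):
    """Every distinct address getaddrinfo() reports, IPv4 and IPv6 alike."""
    # Asking for SOCK_STREAM only avoids one duplicate entry per socket type.
    infos = socket.getaddrinfo(host, None, type=socket.SOCK_STREAM)
    seen = []
    for _family, _type, _proto, _canonname, sockaddr in infos:
        address = sockaddr[0]  # first item of the sockaddr tuple is the address
        if address not in seen:
            seen.append(address)
    return seen
```

Calling all_addresses("bing.com") at the time of a failure would give you something to put in the log next to the hostname, whereas ipv4_only() silently drops any IPv6 answer and any additional IPv4 ones.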
Let’s talk about C++ and Win32 API now. Yes, there is a GetHostByName() defined and yes it takes one parameter – host name, and returns some hostent structure which contains the info I need (i.e. IP address):
hostent* gethostbyname(const char* name);
HOWEVER, there’s a huge deprecation warning below that states: Note The gethostbyname function has been deprecated by the introduction of the getaddrinfo function. Developers creating Windows Sockets 2 applications are urged to use the getaddrinfo function instead of gethostbyname.
As you can imagine, if it weren’t for this warning, this article wouldn’t exist. But “luckily”, it’s not that simple. And granted, it’s not that simple because the question is not that simple in the first place. The issue is that most languages will hide that from you. But we’ll get to that.
How hard is it to actually get an IP address that hostname resolves to?
These were my exact thoughts a month ago. And hell, looking back now I realize how clueless I was. Assuming that “how to get an IP address of a hostname” is a simple question. Haha.
Going back to Microsoft’s recommendation to use GetAddrInfo() instead of GetHostByName(), I eagerly clicked on it, expecting an interface similar to GetHostByName(). I mean, I fully expected to have a single parameter – host name, and that the return value would be some voodoo-struct that’d contain IPs. Easy peasy. Except that this is the function signature:
INT WSAAPI getaddrinfo(
[in, optional] PCSTR pNodeName,
[in, optional] PCSTR pServiceName,
[in, optional] const ADDRINFOA *pHints,
[out] PADDRINFOA *ppResult
);
Well that escalated quickly. And was about to make my life miserable, to a point I couldn’t even comprehend at the time.
How (and why?) the hell do you go from a function that takes a SINGLE parameter to a function that has FOUR mysteriously named parameters? Do notice that there’s not a SINGLE one called “host name” (or anything similar).
Let’s read one by one:
- pNodeName: A pointer to a NULL-terminated ANSI string that contains a host (node) name or a numeric host address string. For the Internet protocol, the numeric host address string is a dotted-decimal IPv4 address or an IPv6 hex address.
Ok, this seems to be a host name. So far so good. But why the hell is it called “Node Name”? Why not “Host Name”? And why the heck does it accept IPv4 and IPv6 addresses as well? And what does it even return if you supply them? I guess that’s a question for my future self …
- pServiceName: A pointer to a NULL-terminated ANSI string that contains either a service name or port number represented as a string. A service name is a string alias for a port number. For example, “http” is an alias for port 80 defined by the Internet Engineering Task Force (IETF) as the default port used by web servers for the HTTP protocol. Possible values for the pServiceName parameter when a port number is not specified are listed in the following file: %WINDIR%\system32\drivers\etc\services
Ok so this seems to be a port specifier. Why the hell do I even need to specify a port here? I just want to resolve a host to an IP address. Right? Right?!?!
- pHints: A pointer to an addrinfo structure that provides hints about the type of socket the caller supports. The ai_addrlen, ai_canonname, ai_addr, and ai_next members of the addrinfo structure pointed to by the pHints parameter must be zero or NULL. Otherwise the GetAddrInfoEx function will fail with WSANO_RECOVERY. See the Remarks for more details.
This one got me. What in the world are hints? What am I supposed to hint here? Hey, I just want to get a freaking IP address! How hard is that?! Damn …
- ppResult: A pointer to a linked list of one or more addrinfo structures that contains response information about the host.
Ok this one is clear. It’s where I should search for an answer to my question. It also hints at the fact that someone else is doing the allocation here, which pretty much means that I’ll have to do some freeing later on. But that’s the least of my concerns now.
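The same four pieces show up in Python’s socket.getaddrinfo(), which wraps the platform’s getaddrinfo: host plays the role of pNodeName, port plays pServiceName, the family/type/proto/flags arguments are the hints, and the returned list stands in for ppResult. Below is a minimal sketch, assuming a made-up helper called lookup, that flattens each result into (family, address, port). It also answers the “what does it return for a numeric address” question: the address simply comes back parsed.

```python
import socket


def lookup(node, service=None, family=socket.AF_UNSPEC, socktype=0):
    """Call getaddrinfo() and flatten each result into (family, address, port)."""
    # node    ~ pNodeName    (host name or numeric address string)
    # service ~ pServiceName (service name such as "http", or a port number)
    # family / socktype ~ fields of the pHints structure
    results = socket.getaddrinfo(node, service, family=family, type=socktype)
    flattened = []
    for fam, _type, _proto, _canonname, sockaddr in results:
        entry = (socket.AddressFamily(fam).name, sockaddr[0], sockaddr[1])
        if entry not in flattened:
            flattened.append(entry)
    return flattened
```

For example, lookup("127.0.0.1", "8080", socket.AF_INET, socket.SOCK_STREAM) hands back the literal address together with port 8080, ready to be passed to connect(). Passing a service name such as "http" instead relies on the local services file mentioned in the docs above.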
Naturally, this whole thing leads to a logical question – why in the world is it so damn complex to “just resolve host to IP?”. And the answer is actually really simple and I’ll talk more about it later. The brief version is – because there’s no simple answer. There can be one IPv4 address. Or multiple. There can also be an IPv6 address; or more of them. And there can be a plethora of other things that you should likely take into consideration (e.g. IP addresses aren’t the only type of addresses out there!).
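One way to picture the “no simple answer” part is to group whatever getaddrinfo() returns by address family. The sketch below is only an illustration: group_by_family is a made-up helper, and SAMPLE is hand-written data using documentation-only addresses (192.0.2.x and 2001:db8::/32), not a real lookup result.

```python
import socket
from collections import defaultdict

# Hand-written stand-in for a getaddrinfo() result:
# (family, socktype, proto, canonname, sockaddr) tuples.
SAMPLE = [
    (socket.AF_INET, socket.SOCK_STREAM, 6, "", ("192.0.2.10", 0)),
    (socket.AF_INET, socket.SOCK_STREAM, 6, "", ("192.0.2.11", 0)),
    (socket.AF_INET6, socket.SOCK_STREAM, 6, "", ("2001:db8::1", 0, 0, 0)),
    (socket.AF_INET, socket.SOCK_DGRAM, 17, "", ("192.0.2.10", 0)),
]


def group_by_family(infos):
    """Map each address family name to its distinct addresses, in order."""
    grouped = defaultdict(list)
    for family, _type, _proto, _canonname, sockaddr in infos:
        name = socket.AddressFamily(family).name
        if sockaddr[0] not in grouped[name]:
            grouped[name].append(sockaddr[0])
    return dict(grouped)
```

Feeding it a live result, e.g. group_by_family(socket.getaddrinfo("bing.com", None)), shows at a glance whether “the IP address” of a host is really one address, several, or a mix of IPv4 and IPv6.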
More importantly though, my next question was – what the heck does GetAddrInfo() do at all?
What the heck does GetAddrInfo() do at all
Amusingly enough, it turns out I wasn’t the only one asking this question. There’s a guy who also wrote a full-blown blog post about it and his conclusion was that it definitely does an awful lot! Way more than anyone sane would have anticipated. But he was checking on Linux and I’m curious about Windows, so I braced myself to explore this further.
Here’s a definition from Microsoft docs: The getaddrinfo function provides protocol-independent translation from an ANSI host name to an address.
This is great. It’s great because it tells you everything if you know what you are looking for, but it tells you absolutely nothing if you are a newcomer to this.
Linux’s man pages definitely have a waaay better description: Given node and service, which identify an Internet host and a service, getaddrinfo() returns one or more addrinfo structures, each of which contains an Internet address that can be specified in a call to bind(2) or connect(2). The getaddrinfo() function combines the functionality provided by the gethostbyname(3) and getservbyname(3) functions into a single interface, but unlike the latter functions, getaddrinfo() is reentrant and allows programs to eliminate IPv4-versus-IPv6 dependencies.
Okay that gives a clue. Like a way better one. So GetAddrInfo() isn’t really about resolving a host to an IP, but it’s more about returning you something that you can bind() to and/or connect() to. That’s interesting. Probably useful as well.
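That “something you can connect() to” idea is easiest to see as a loop: ask getaddrinfo() for candidates, then try each one until a connection succeeds. Here is a minimal sketch in Python, assuming a made-up helper named connect_first; the standard library’s socket.create_connection() follows the same general pattern, so treat this as an illustration rather than something you need to write yourself.

```python
import socket


def connect_first(host, port, timeout=2.0):
    """Try every address getaddrinfo() returns; return the first connected socket."""
    last_error = None
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    for family, socktype, proto, _canonname, sockaddr in infos:
        # Each result carries everything socket() and connect() need,
        # so the caller never has to care whether it is IPv4 or IPv6.
        sock = socket.socket(family, socktype, proto)
        sock.settimeout(timeout)
        try:
            sock.connect(sockaddr)
            return sock
        except OSError as exc:
            last_error = exc
            sock.close()
    raise last_error or OSError("getaddrinfo returned no usable addresses")
```

This is also where the port parameter starts to make sense: the results are ready-made socket addresses, not bare IPs.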
Now comes a dramatic moment. ** drums ** The moment where my curiosity goes for an intercourse with my naivety, to produce a beautiful child made of absolute stupidity. This is when I decide to go for a, wait for it, ** drums **, a “weekend project of exploring how GetAddrInfo() works under the hood and making a blog post about it”:
Mind you, this was published almost a month ago, and the closer I think I am to uncovering the “truth”, the more I learn how many light years I am from actually understanding the whole picture. But we’ll get to that.
GetAddrInfo() under the hood
This is what I thought would be “under the hood”. Probably the same analogy would be if I were to tell you that I have no clue how the car works at all, but I’m gonna pull up the hood, observe what I see there and write a blog post about it. Easy. Well, everything’s easy if you are clueless enough. But I digress.
Anyway, I came up with a brilliant idea – write the simplest piece of code and observe all the calls that are made from there. For whatever reason I expected something simple to happen. Here’s the essence of the code I wrote:
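#include <winsock2.h>
#include <ws2tcpip.h>
#include <stdio.h>
#pragma comment(lib, "Ws2_32.lib")
int __cdecl main(int argc, char** argv) {
    WSADATA wsaData;
    PADDRINFOW result = NULL;
    int iResult = WSAStartup(MAKEWORD(2, 2), &wsaData);
    if (iResult != 0) {
        printf("WSAStartup failed: %d\n", iResult);
        return 1;
    }
    // Only the host name and the result pointer are supplied.
    int dwRetval = GetAddrInfoW(L"bing.com", nullptr, nullptr, &result);
    if (dwRetval == 0) FreeAddrInfoW(result);  // the list is allocated by Winsock
    WSACleanup();
    return dwRetval;
}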
Obviously I’ve trimmed down lots of it and left only the interesting parts. I’ve also passed null for pretty much everything except the host name and the result.
Now I’d like you to stop and think for a sec. How many calls would you expect this simple code to make? My best guess would be one or two UDP calls to DNS server and possibly one call to check in “hosts” file. That’d be about it.
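For reference, this is roughly what one of those naively expected UDP calls would carry. The sketch below builds a standard DNS query packet by hand; build_dns_query is a made-up helper for illustration, it sends nothing, and it is not a claim about what Windows puts on the wire. The “one or two” in the guess maps nicely to one query for an A record (IPv4) and one for an AAAA record (IPv6).

```python
import struct

QTYPE_A = 1      # IPv4 address record
QTYPE_AAAA = 28  # IPv6 address record


def build_dns_query(hostname, qtype, query_id):
    """Build the bytes of a single-question DNS query (no network I/O)."""
    # Header: ID, flags (0x0100 = recursion desired), 1 question, 0 other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Question name: each label prefixed by its length, ended by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    # Question tail: record type, then class 1 (IN, Internet).
    return header + qname + struct.pack("!HH", qtype, 1)
```

Sending build_dns_query("bing.com", QTYPE_A, 0x1234) to a DNS server on UDP port 53 and parsing the reply is, in essence, the whole job I had in mind.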
0:000> wt -l 2 -nc -nw
...
...
1677 instructions were executed in 1676 events (0 from other threads)
Function Name Invocations
WS2_32!AppendAddrInfo 1
WS2_32!ConvertIp6StringToAddress 1
WS2_32!Dns_Ip6LiteralNameToAddressW 1
WS2_32!Dns_IsNameInDomainW 1
WS2_32!EventWriteWinsockGaiComplete 1
WS2_32!EventWriteWinsockGaiStart 1
WS2_32!GetAddrInfoW 1
WS2_32!GetIp4Address 1
WS2_32!IN6_IS_ADDR_V4MAPPED 1
WS2_32!IsEmailName 1
WS2_32!IsProtoRunning 2
WS2_32!LookupAddressForName 1
WS2_32!QueryDns 1
WS2_32!SortIPAddrs 1
WS2_32!_security_check_cookie 5
WS2_32!memset 2
ntdll!EtwEventEnabled 2
ntdll!RtlDebugFreeHeap 4
ntdll!RtlFreeHeap 8
ntdll!RtlIpv4StringToAddressW 1
ntdll!RtlIpv6StringToAddressExW 1
ntdll!RtlSetLastWin32Error 1
ntdll!RtlpFreeHeap 4
ntdll!RtlpFreeHeapInternal 4
ntdll!_security_check_cookie 1
ntdll!memset 2
ntdll!memset$thunk$772440563353939046 2
0 system calls were executed
That’s likely a bit more than what I expected. And mind you, I had this limited to a depth of 2, because going above that produces MASSIVE output.
As can be observed, GetAddrInfo() seems to do A LOT. Like, an awful lot. And I didn’t even specify hints or anything, so who knows where’d this all go if I did try to limit the output.
Throwing ProcMon on it
Inspired by that guy’s blog post on getaddrinfo() on Linux, I decided to give a shot at throwing ProcMon on top of this thing. So, what I pretty much did is set a debugger breakpoint at the line where GetAddrInfo() is about to be executed, and then captured everything that happened with ProcMon:
I’ve captured ProcMon traces between the above two lines only (i.e. I captured only what happens during the GetAddrInfoW() call) and here’s what I observed:
Total of 480 events, believe it or not. I mean, it’s not like I’ve been going around and debugging what a bunch of Windows APIs do, but I surely didn’t expect this many things going on. And yet – here they are.
One thing that you can’t see here though, which definitely caught my eye is that there wasn’t a single network call made. Here’s a filtered list of Network things only:
I found that to be a bit strange, given that I fully expected the process to issue at least one call to a DNS server. And trust me – I cleaned my cache upfront, so it wasn’t that. What I eventually learned, and what I will be writing about at length in Part 2, is that all the Network calls are actually made from ANOTHER process: svchost.exe, which executes Network Service. Hence the call somehow gets from my process to another one, but I haven’t figured that part out yet. But again, I will write more about this in Part 2. Hope you liked it so far!