This content originally appeared on DEV Community and was authored by Elliot Brenya sarfo
Hello, guys. Today we are doing something different. I have been digging into DNS and how it works on Linux. So yeah, enjoy.
When we type a server name or a website domain name into a browser, perform a ping, or launch any remote application, the operating system must convert the specified names into IP addresses. This process is called domain name resolution. At first glance, it may seem quite transparent, but there is a multi-layered mechanism behind it.
This article is the first in a series on the low-level architecture of name resolution. We’ll talk about how this process works in Linux at the level of the kernel, the various C libraries, and system calls.
Many people know that name resolution in Linux is not just a "DNS call", but a chain of libraries, configuration entries, and calls that depends on the implementation of a particular application, the libraries it uses, and the system settings.
However, engineers still have questions. For example, is it necessary to restart the application after the DNS server address has changed? In addition, to diagnose errors, timeouts, and other name resolution problems in an application or in the system, it is important to understand how this entire chain works, from getaddrinfo() to resolv.conf. In this part, we will go through it layer by layer and build up the fundamentals in a short and accessible form.
The tip of the iceberg
Almost all modern Linux applications, from curl to systemd, use the getaddrinfo() function from the standard C library (glibc or musl). It does the main work of translating a domain name into an IP address (A, AAAA records) depending on the settings and request.
In addition to DNS queries, it also handles other types of data, such as service names, for example, converting the network service name “http” to port 80 using /etc/services. This makes it a versatile tool for network applications.
The getaddrinfo() function returns a list of addrinfo structures, each of which contains an IP address, socket type, protocol, and other parameters. This allows applications to select the most appropriate address to connect to.
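An example of using getaddrinfo() in C:
#include <stdio.h>
#include <string.h>
#include <netdb.h>
#include <sys/socket.h>

int main(void) {
    struct addrinfo hints, *res, *ai;
    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;     /* IPv4 or IPv6 */
    hints.ai_socktype = SOCK_STREAM; /* TCP */
    int err = getaddrinfo("example.com", "http", &hints, &res);
    if (err != 0) {
        fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(err));
        return 1;
    }
    for (ai = res; ai != NULL; ai = ai->ai_next) {
        char host[64]; /* numeric form of each returned address */
        if (getnameinfo(ai->ai_addr, ai->ai_addrlen, host, sizeof host, NULL, 0, NI_NUMERICHOST) == 0)
            printf("%s\n", host);
    }
    freeaddrinfo(res);
    return 0;
}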
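The service-name side of getaddrinfo() described above can be isolated in a small sketch. The helper name resolve_tcp_port is hypothetical, introduced only for illustration. It passes NULL as the host, so no host name lookup takes place; a name such as "http" would be looked up through the services database (usually /etc/services), while a numeric string is simply converted.

```c
#define _POSIX_C_SOURCE 200809L
#include <string.h>
#include <netdb.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical helper: resolve a service name ("http") or a numeric
 * string ("8080") to a TCP port number.
 * Returns the port in host byte order, or -1 on failure. */
int resolve_tcp_port(const char *service)
{
    struct addrinfo hints, *res;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_INET;       /* one family is enough to read the port */
    hints.ai_socktype = SOCK_STREAM; /* TCP */
    hints.ai_flags = AI_PASSIVE;     /* host is NULL: no host name lookup */

    if (getaddrinfo(NULL, service, &hints, &res) != 0)
        return -1;

    /* With AF_INET the returned address is a struct sockaddr_in. */
    int port = ntohs(((struct sockaddr_in *)res->ai_addr)->sin_port);
    freeaddrinfo(res);
    return port;
}
```

Calling resolve_tcp_port("http") should give 80 on a system whose services database contains the usual entry; on a stripped-down container without /etc/services it may fail, which is itself a useful reminder that this lookup is configuration-driven.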
And getaddrinfo() is just the tip of the iceberg. To get an IP address, it goes through a chain of internal mechanisms determined by the system configuration. One of these mechanisms is NSS (Name Service Switch).
NSS
NSS is built on loadable modules: dynamic libraries that implement the glibc NSS API, such as libnss_dns.so, libnss_files.so, libnss_myhostname.so and others. They work as plugins, are loaded by glibc at runtime, and each is responsible for a specific method of resolving names. The order and set of sources used for name resolution are specified in the configuration file /etc/nsswitch.conf.
Example of nsswitch.conf content:
# /etc/nsswitch.conf
passwd: files systemd
group: files systemd
shadow: files
gshadow: files
hosts: files dns myhostname
networks: files
protocols: db files
services: db files
ethers: db files
rpc: db files
netgroup: nis
For example, the line hosts: files dns says that the local /etc/hosts file is checked for a match first, and if the files module returns a result, subsequent modules such as dns (which performs a DNS lookup) are not called.
Accordingly, if the hosts line in nsswitch.conf does not include a mention of the dns module, then the resolv.conf configuration file, which contains settings for accessing DNS sources, will be ignored, and the DNS request will not be generated.
NSS can also use the mdns (for Zeroconf/Avahi), nis (in older systems) and myhostname modules.
The myhostname module is part of systemd and is used to resolve the local hostname. It is not always present on minimalist systems such as Alpine Linux.
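To make the role of the hosts line concrete, here is a minimal sketch that extracts the ordered source list from nsswitch.conf-style text. The function name hosts_sources is hypothetical, and this is not how glibc parses the file internally; action items in square brackets (for example [NOTFOUND=return]) are simply skipped.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Extract the source names from the first "hosts:" line of
 * nsswitch.conf-style text. Bracketed action items are skipped.
 * Returns the number of sources written to out (at most max). */
int hosts_sources(const char *conf, char out[][32], int max)
{
    int count = 0;
    const char *line = conf;

    while (*line != '\0' && count == 0) {
        const char *end = strchr(line, '\n');
        size_t len = end ? (size_t)(end - line) : strlen(line);
        const char *p = line;
        const char *stop = line + len;

        while (p < stop && isspace((unsigned char)*p))
            p++;
        if ((size_t)(stop - p) >= 6 && strncmp(p, "hosts:", 6) == 0) {
            p += 6;
            while (p < stop && count < max) {
                while (p < stop && isspace((unsigned char)*p))
                    p++;
                if (p >= stop || *p == '#')
                    break;              /* end of line or trailing comment */
                if (*p == '[') {        /* skip an action item up to ']' */
                    while (p < stop && *p != ']')
                        p++;
                    if (p < stop)
                        p++;
                    continue;
                }
                const char *tok = p;
                while (p < stop && !isspace((unsigned char)*p) && *p != '[')
                    p++;
                snprintf(out[count++], 32, "%.*s", (int)(p - tok), tok);
            }
        }
        line = end ? end + 1 : stop;
    }
    return count;
}
```

For the example file shown earlier, the result would be files, dns, myhostname in that order; if dns is missing from the list, no DNS query is generated through this path.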
Libraries
The following libraries are key to the Linux ecosystem and provide applications with a specific set of functions, including domain name resolution.
Glibc is the most widely used implementation of the C standard library, implementing high-level functions such as getaddrinfo(). It interacts with the NSS (Name Service Switch) to determine name resolution sources (e.g. /etc/hosts, DNS) and uses the libresolv library to perform DNS queries.
Glibc ultimately uses system calls such as sendto, sendmmsg and recvfrom to send and receive DNS messages over UDP or TCP. It is used in most Linux distributions (Ubuntu, Debian, Fedora, etc.).
Musl is an alternative C standard library designed with minimalism, performance, and POSIX compliance in mind. It is used in lightweight distributions such as Alpine Linux.
Musl implements domain name resolution directly, without using NSS, reading /etc/hosts and /etc/resolv.conf by itself and sending DNS queries without using external libraries like libresolv. However, musl has limitations in supporting some resolv.conf parameters, such as rotate or complex search.
Libresolv.so is the part of glibc that implements low-level DNS handling and provides functions such as res_query() and res_send(). It can also be called directly by applications that want to perform DNS queries themselves, bypassing the standard name resolution mechanisms, much as tools like nslookup do.
Libresolv is used by glibc to perform DNS queries when NSS specifies that DNS should be accessed. It reads /etc/resolv.conf, constructs DNS packets, and sends them to the specified servers over UDP or TCP.
It’s worth noting that some applications, such as those written in Go, may bypass glibc/musl entirely and use their own DNS resolvers.
How resolv.conf is processed
The file /etc/resolv.conf contains the main DNS client settings, namely: list of servers, parameters, search domains. For example:
nameserver 192.168.1.1
search dev.local
options timeout:2 attempts:3
Glibc, through libresolv, reads and parses this file itself when it needs to perform a DNS query.
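As an illustration of what parsing this file involves, the sketch below reads the three directive types from resolv.conf-style text. The struct and function names are hypothetical, and the code is deliberately simpler than the real glibc parser. The defaults used here (timeout 5 seconds, 2 attempts, at most 3 nameservers) are the documented glibc values.

```c
#include <stdio.h>
#include <string.h>

struct resolv_settings {
    char nameserver[3][64]; /* glibc uses at most 3 nameserver entries */
    int  nameserver_count;
    char search[256];       /* raw search list, space separated */
    int  timeout;           /* seconds; glibc default is 5 */
    int  attempts;          /* glibc default is 2 */
};

/* Fill cfg from resolv.conf-style text, one directive per line. */
void parse_resolv_conf(const char *text, struct resolv_settings *cfg)
{
    memset(cfg, 0, sizeof *cfg);
    cfg->timeout = 5;
    cfg->attempts = 2;

    const char *line = text;
    while (*line != '\0') {
        const char *end = strchr(line, '\n');
        size_t len = end ? (size_t)(end - line) : strlen(line);
        char buf[512];
        char addr[64];

        snprintf(buf, sizeof buf, "%.*s", (int)len, line);

        if (sscanf(buf, "nameserver %63s", addr) == 1) {
            if (cfg->nameserver_count < 3)
                strcpy(cfg->nameserver[cfg->nameserver_count++], addr);
        } else if (strncmp(buf, "search", 6) == 0 &&
                   (buf[6] == ' ' || buf[6] == '\t')) {
            snprintf(cfg->search, sizeof cfg->search, "%s", buf + 7);
        } else if (strncmp(buf, "options", 7) == 0) {
            const char *opt;
            if ((opt = strstr(buf, "timeout:")) != NULL)
                sscanf(opt, "timeout:%d", &cfg->timeout);
            if ((opt = strstr(buf, "attempts:")) != NULL)
                sscanf(opt, "attempts:%d", &cfg->attempts);
        }
        line = end ? end + 1 : line + len;
    }
}
```

Feeding it the three-line example above yields one nameserver (192.168.1.1), the search list dev.local, a timeout of 2 and 3 attempts.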
Important points and limitations:
options like rotate, ndots, timeout and attempts affect the behavior of the request;
the rotate option is used to cycle through servers from the nameserver list, but it is not supported in musl;
search is used for name completion: for example, if the name db01 is not an FQDN, the domains from the search directive are appended to it in turn.
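The interaction between search and ndots is easier to see in code. The following is a simplified sketch of the candidate order, assuming the commonly documented rule: a name with at least ndots dots is tried as-is first, otherwise the search domains are tried first and the bare name last, and a trailing dot disables the search list. The function name search_candidates is hypothetical, and real resolvers have further special cases.

```c
#include <stdio.h>
#include <string.h>

/* Build the list of names a resolver would try for `name`, given the
 * search domains and the ndots value. Returns the number of candidates
 * written to out (at most max). */
int search_candidates(const char *name, const char *const *domains,
                      int ndomains, int ndots, char out[][256], int max)
{
    int count = 0, dots = 0;
    size_t len = strlen(name);

    for (const char *p = name; *p; p++)
        if (*p == '.')
            dots++;

    /* A trailing dot marks the name as fully qualified: no search list. */
    if (len > 0 && name[len - 1] == '.') {
        if (max > 0)
            snprintf(out[count++], 256, "%s", name);
        return count;
    }

    int as_is_first = dots >= ndots;
    if (as_is_first && count < max)
        snprintf(out[count++], 256, "%s", name);

    for (int i = 0; i < ndomains && count < max; i++)
        snprintf(out[count++], 256, "%s.%s", name, domains[i]);

    if (!as_is_first && count < max)
        snprintf(out[count++], 256, "%s", name);

    return count;
}
```

With search dev.local and ndots 1, the name db01 produces db01.dev.local followed by db01, while db01.example.com is tried as-is before any search domain is appended.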
It is important to note that the resolv.conf file can be dynamically modified by the DHCP client, NetworkManager, or the resolvconf utility, which can cause confusion when troubleshooting DNS issues. We will discuss this in a later article.
What does res_query() do?
This is a function from libresolv that is called internally during name resolution. It builds a DNS packet itself and sends it to the DNS servers specified in resolv.conf. Programs that bypass getaddrinfo() can call it directly, and diagnostic utilities like nslookup work at the same level, talking to the DNS servers directly.
The function sends DNS queries using res_send() over UDP, and switches to TCP if necessary, for example when a response is truncated because it does not fit into a UDP message (classically, responses larger than 512 bytes).
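To show what "building a DNS packet" means in practice, here is a minimal sketch that encodes a single-question query in the RFC 1035 wire format: a 12-byte header, the name as length-prefixed labels, then the query type and class. The function name build_dns_query is hypothetical; this is not the libresolv implementation, and it leaves out options such as EDNS0.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Encode a DNS query for `name` into buf (RFC 1035 wire format).
 * qtype is 1 for A, 28 for AAAA.
 * Returns the packet length, or 0 on error. */
size_t build_dns_query(uint16_t id, const char *name, uint16_t qtype,
                       uint8_t *buf, size_t buflen)
{
    size_t namelen = strlen(name);

    /* header (12) + encoded name (at most namelen + 2) + QTYPE/QCLASS (4) */
    if (namelen == 0 || namelen > 253 || buflen < 12 + namelen + 2 + 4)
        return 0;

    memset(buf, 0, 12);
    buf[0] = (uint8_t)(id >> 8);
    buf[1] = (uint8_t)(id & 0xff);
    buf[2] = 0x01;                 /* flags: RD (recursion desired) */
    buf[5] = 1;                    /* QDCOUNT = 1 */

    size_t pos = 12;
    const char *label = name;
    while (*label) {
        const char *dot = strchr(label, '.');
        size_t n = dot ? (size_t)(dot - label) : strlen(label);
        if (n == 0 || n > 63)
            return 0;              /* empty or oversized label */
        buf[pos++] = (uint8_t)n;   /* length byte, then the label itself */
        memcpy(buf + pos, label, n);
        pos += n;
        label += n + (dot ? 1 : 0);
    }
    buf[pos++] = 0;                /* root label terminates the name */
    buf[pos++] = (uint8_t)(qtype >> 8);
    buf[pos++] = (uint8_t)(qtype & 0xff);
    buf[pos++] = 0;
    buf[pos++] = 1;                /* QCLASS = IN */
    return pos;
}
```

For the name download.astralinux.ru this produces a 40-byte packet: 12 bytes of header, 24 bytes of encoded name, and 4 bytes of type and class.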
Important: when using res_query() you will not get information from /etc/hosts, NSS or other sources. This is a pure DNS query. Therefore, dig or nslookup may get one result, and, for example, ping or curl a completely different one.
res_query() is a low-level legacy interface and is not recommended for new code. For more convenient and secure work with DNS, it is better to give preference to getaddrinfo() or libraries such as c-ares or libunbound.
c-ares is a lightweight library for asynchronous DNS queries, often used in high-load applications (e.g. curl and Node.js).
libunbound (from the Unbound project) is a more powerful library with DNSSEC support and flexible query customization.
Request implementation order and priorities
Here is a typical name resolution order on Linux using glibc and NSS:
The application calls getaddrinfo();
getaddrinfo() accesses the NSS system and follows the order specified in nsswitch.conf;
if the files module is specified first, the name is searched for in the /etc/hosts file;
if the dns module is enabled, NSS calls libnss_dns.so, which calls functions from libresolv;
libresolv forms a DNS query via res_query() and sends it via res_send() to the DNS server addresses specified in resolv.conf, then receives and returns the IP address.
Simplified scheme of name resolution in Linux via glibc. Illustrates the basic path, but it is possible to use other sources in NSS. The order of sources (files/dns) is configured in /etc/nsswitch.conf. On modern systems, the DNS cache (systemd-resolved, nscd) may also be used.
Important: if a name is found at one of the steps, for example in hosts, subsequent sources are not used.
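The files step of this chain can be sketched as a plain text search. The function below, with the hypothetical name hosts_lookup, scans /etc/hosts-style text for a name and returns the address from the first matching line. It is an illustration of the idea, not the libnss_files code; in this sketch names are compared case-insensitively and comments starting with # are ignored.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Case-insensitive comparison of a token of length n with name. */
static int name_equals(const char *tok, size_t n, const char *name)
{
    if (strlen(name) != n)
        return 0;
    for (size_t i = 0; i < n; i++)
        if (tolower((unsigned char)tok[i]) != tolower((unsigned char)name[i]))
            return 0;
    return 1;
}

/* Look up `name` in /etc/hosts-style text. On a match, copy the address
 * into addr and return 1; otherwise return 0. */
int hosts_lookup(const char *hosts, const char *name,
                 char *addr, size_t addrlen)
{
    const char *p = hosts;

    while (*p != '\0') {
        const char *eol = p + strcspn(p, "\n");

        p += strspn(p, " \t");
        const char *ip = p;                      /* first field: address */
        size_t iplen = strcspn(p, " \t#\n");
        p += iplen;

        for (;;) {                               /* remaining fields: names */
            p += strspn(p, " \t");
            size_t n = strcspn(p, " \t#\n");
            if (n == 0)
                break;                           /* comment or end of line */
            if (iplen > 0 && name_equals(p, n, name)) {
                snprintf(addr, addrlen, "%.*s", (int)iplen, ip);
                return 1;
            }
            p += n;
        }
        p = (*eol == '\n') ? eol + 1 : eol;
    }
    return 0;
}
```

A resolver following hosts: files dns would only move on to the DNS query when a lookup like this returns no match.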
On minimalist systems like Alpine Linux with musl, the order may be different, since musl does not use NSS and implements DNS queries directly by reading /etc/hosts and resolv.conf itself.
Some applications and languages (e.g. Go, Java, Node.js) may use their own DNS resolvers or caches, bypassing part of this chain and sometimes ignoring some of the system settings.
As an example, let’s analyze the operation of the curl utility.
Command:
strace -f -e trace=network curl -s download.astralinux.ru > /dev/null
strace:
socket(AF_INET6, SOCK_DGRAM, IPPROTO_IP) = 3
socketpair(AF_UNIX, SOCK_STREAM, 0, [3, 4]) = 0
socketpair(AF_UNIX, SOCK_STREAM, 0, [5, 6]) = 0
strace: Process 283163 attached
[pid 283163] socket(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 7
[pid 283163] connect(7, {sa_family=AF_UNIX, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
[pid 283163] socket(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 7
[pid 283163] connect(7, {sa_family=AF_UNIX, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
[pid 283163] socket(AF_INET, SOCK_DGRAM|SOCK_CLOEXEC|SOCK_NONBLOCK, IPPROTO_IP) = 7
[pid 283163] connect(7, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("172.24.31.107")}, 16) = 0
[pid 283163] sendmmsg(7, [{msg_hdr={msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\250\207\1\0\0\1\0\0\0\0\0\0\10download\nastralinux"..., iov_len=40}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, msg_len=40}, {msg_hdr={msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\240\215\1\0\0\1\0\0\0\0\0\0\10download\nastralinux"..., iov_len=40}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, msg_len=40}], 2, MSG_NOSIGNAL) = 2
[pid 283163] recvfrom(7, "\250\207\201\200\0\1\0\1\0\0\0\0\10download\nastralinux"..., 2048, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("172.24.31.107")}, [28->16]) = 56
[pid 283163] recvfrom(7, "\240\215\201\200\0\1\0\0\0\1\0\0\10download\nastralinux"..., 65536, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("172.24.31.107")}, [28->16]) = 114
[pid 283163] sendto(6, "\1", 1, MSG_NOSIGNAL, NULL, 0) = 1
[pid 283163] +++ exited with 0 +++
socket(AF_INET, SOCK_STREAM, IPPROTO_TCP) = 5
setsockopt(5, SOL_TCP, TCP_NODELAY, [1], 4) = 0
setsockopt(5, SOL_SOCKET, SO_KEEPALIVE, [1], 4) = 0
setsockopt(5, SOL_TCP, TCP_KEEPIDLE, [60], 4) = 0
setsockopt(5, SOL_TCP, TCP_KEEPINTVL, [60], 4) = 0
connect(5, {sa_family=AF_INET, sin_port=htons(80), sin_addr=inet_addr("130.193.50.59")}, 16) = -1 EINPROGRESS (Operation now in progress)
getsockopt(5, SOL_SOCKET, SO_ERROR, [0], [4]) = 0
getpeername(5, {sa_family=AF_INET, sin_port=htons(80), sin_addr=inet_addr("130.193.50.59")}, [128->16]) = 0
getsockname(5, {sa_family=AF_INET, sin_port=htons(48488), sin_addr=inet_addr("172.24.31.241")}, [128->16]) = 0
sendto(5, "GET / HTTP/1.1\r\nHost: download.a"..., 86, MSG_NOSIGNAL, NULL, 0) = 86
recvfrom(5, "HTTP/1.1 200 OK\r\nServer: nginx/1"..., 102400, 0, NULL, NULL) = 1617
What do we see in this strace?
- Trying to use NSCD (Name Service Cache Daemon)
connect(…, "/var/run/nscd/socket", …) = -1 ENOENT
This means that glibc first tries to use the name cache from nscd if it is running. It is not running on this system (the socket does not exist), so resolution continues.
- Call socket() and connect() to the DNS server
socket(AF_INET, SOCK_DGRAM|…, IPPROTO_IP) = 7
connect(7, …, sin_addr=inet_addr("172.24.31.107")…)
This creates a UDP socket to contact the DNS server specified in /etc/resolv.conf.
- Call sendmmsg() – Send DNS queries
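sendmmsg(7, [ { "download.astralinux.ru" }, { "download.astralinux.ru" } ], …) = 2
Name resolution requests are sent here: two queries for the same name in a single call (the return value is 2). With glibc these are normally the A and AAAA lookups.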
- DNS response
recvfrom(...) = 56
recvfrom(...) = 114
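Now the IP address is known. The return value of each recvfrom() is the size of a DNS response in bytes.
56 is the size of the response whose header reports one answer record: the A record with the IPv4 address.
114 is the size of the response to the second query. Its header reports zero answers and one authority record, which is what a server typically returns when the name exists but has no record of the requested type (most likely AAAA here).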
- TCP connection over IP
connect(5, ..., sin_addr=inet_addr("130.193.50.59"))
Here curl itself establishes a TCP connection to the IP address returned to it by getaddrinfo().
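The first 12 bytes of each response in the strace output are the DNS header, printed by strace as octal escapes. A minimal sketch of decoding that header is shown below; the struct and function names are hypothetical. In hexadecimal the two headers are a8 87 81 80 00 01 00 01 00 00 00 00 and a0 8d 81 80 00 01 00 00 00 01 00 00.

```c
#include <stddef.h>
#include <stdint.h>

struct dns_header {
    uint16_t id;
    uint16_t flags;
    uint16_t qdcount; /* questions */
    uint16_t ancount; /* answer records */
    uint16_t nscount; /* authority records */
    uint16_t arcount; /* additional records */
};

/* Read one big-endian 16-bit field. */
static uint16_t be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Decode the fixed 12-byte DNS header.
 * Returns 0 on success, -1 if the buffer is too short. */
int parse_dns_header(const uint8_t *pkt, size_t len, struct dns_header *h)
{
    if (len < 12)
        return -1;
    h->id      = be16(pkt);
    h->flags   = be16(pkt + 2);
    h->qdcount = be16(pkt + 4);
    h->ancount = be16(pkt + 6);
    h->nscount = be16(pkt + 8);
    h->arcount = be16(pkt + 10);
    return 0;
}

/* QR is the top bit of the flags word: 1 means this is a response. */
int dns_is_response(const struct dns_header *h)
{
    return (h->flags & 0x8000) != 0;
}

/* RCODE is the low 4 bits of the flags word: 0 means no error. */
int dns_rcode(const struct dns_header *h)
{
    return h->flags & 0x000f;
}
```

Decoding the two headers gives one answer record for the 56-byte response and zero answers plus one authority record for the 114-byte response, both with RCODE 0.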
So when we call curl, we don’t see the DNS queries directly – they are made by the glibc library inside the getaddrinfo() call. But strace allows us to see indirect signs:
The calls will include an attempt to connect to nscd, a connect() call to the DNS server, sending a UDP packet via sendmmsg(), and then a standard TCP over IP connection:
connect(7, {AF_INET, 172.24.31.107:53}) = 0
sendmmsg(7, [{ "download.astralinux.ru" }]) = 2
recvfrom(7, ...) = ...
connect(5, {130.193.50.59:80}) = -1 EINPROGRESS
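It is important to note that the behavior of getaddrinfo() may depend on the libc implementation and on the services around it. For example, with glibc the results may come from a cache such as nscd or systemd-resolved rather than from a fresh DNS query, which affects performance and data freshness.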
Brief summary and key points
Name resolution in Linux is not necessarily a query to a DNS server. The chain may include /etc/hosts, NSS modules, and other sources.
NSS and nsswitch.conf define the order and sources of name resolution.
glibc uses NSS and can take results from a cache (nscd, systemd-resolved); musl implements DNS resolution directly, with limited support for resolv.conf options.
resolv.conf controls the resolver settings, but can be changed dynamically.
getaddrinfo() is the primary interface for name resolution, handling both DNS and other sources.
Different programming languages (Go, Java, Python with dns.resolver, Node.js) may use their own DNS query mechanisms.
In the next part, we will complete the picture with a general idea of how DNS record caching works, a key mechanism that directly affects the performance, reliability, and behavior of applications when IP addresses change.