dnsdist

dnsdist is a highly DNS-, DoS- and abuse-aware loadbalancer. Its goal in life is to route traffic to the best server, delivering top performance to legitimate users while shunting or blocking abusive traffic.

dnsdist is dynamic, in the sense that its configuration can be changed at runtime, and that its statistics can be queried from a console-like interface.

Compiling

dnsdist depends on boost, Lua or LuaJIT and a pretty recent C++ compiler (g++ 4.8 or higher, clang 3.5 or higher). It can optionally use libsodium for encrypted communications with its client, protobuf for remote logging and re2 for regular expression matching.

Should dnsdist be run on a system with systemd, it is highly recommended to have the systemd header files (libsystemd-dev on Debian and systemd-devel on CentOS) installed to have dnsdist support systemd-notify.

To compile on CentOS 6 / RHEL6, use this script to install a working compiler:

wget -O /etc/yum.repos.d/slc6-devtoolset.repo http://linuxsoft.cern.ch/cern/devtoolset/slc6-devtoolset.repo
yum install devtoolset-2
scl enable devtoolset-2 bash
./configure
make

To build on OS X, ./configure LIBEDIT_LIBS='-L/usr/lib -ledit' LIBEDIT_CFLAGS=-I/usr/include/editline

To build on OpenBSD, ./configure CXX=eg++ CPP=ecpp LIBEDIT_LIBS='-ledit -lcurses' LIBEDIT_CFLAGS=' '

On other recent platforms, installing a Lua and the system C++ compiler should be enough.

dnsdist can drop privileges using the --uid and --gid command line switches to ensure it does not run with root privileges after binding its listening sockets. It is highly recommended to create a system user and group for dnsdist. Note that most packaged versions of dnsdist already create this user.

Packaged

We build packages for dnsdist on our repositories. In addition dnsdist has been packaged for FreeBSD.

Examples

The absolute minimum configuration:

# dnsdist 2001:4860:4860::8888 8.8.8.8

This will listen on 0.0.0.0:53 and forward queries to the two listed IP addresses, with a sensible load balancing policy.

Here is a more complete configuration:

$ cat /etc/dnsdist.conf
newServer({address="2001:4860:4860::8888", qps=1})
newServer({address="2001:4860:4860::8844", qps=1})
newServer({address="2620:0:ccc::2", qps=10})
newServer({address="2620:0:ccd::2", name="dns1", qps=10})
newServer("192.168.1.2")
setServerPolicy(firstAvailable) -- first server within its QPS limit

$ dnsdist --local=0.0.0.0:5200
Marking downstream [2001:4860:4860::8888]:53 as 'up'
Marking downstream [2001:4860:4860::8844]:53 as 'up'
Marking downstream [2620:0:ccc::2]:53 as 'up'
Marking downstream [2620:0:ccd::2]:53 as 'up'
Marking downstream 192.168.1.2:53 as 'up'
Listening on 0.0.0.0:5200
> 

We can now send queries to port 5200, and get answers:

$ dig -t aaaa powerdns.com @127.0.0.1 -p 5200 +short
2001:888:2000:1d::2

Note that dnsdist offered us a prompt above, and on it we can get some statistics:

> showServers()
#   Address                   State     Qps    Qlim Ord Wt    Queries   Drops Drate   Lat Pools
0   [2001:4860:4860::8888]:53    up     0.0       1   1  1          1       0   0.0   0.0
1   [2001:4860:4860::8844]:53    up     0.0       1   1  1          0       0   0.0   0.0
2   [2620:0:ccc::2]:53           up     0.0      10   1  1          0       0   0.0   0.0
3   [2620:0:ccd::2]:53           up     0.0      10   1  1          0       0   0.0   0.0
4   192.168.1.2:53               up     0.0       0   1  1          0       0   0.0   0.0
All                                     0.0                         1       0     

Here we also see our configuration. 5 downstream servers have been configured, of which the first 4 have a QPS limit (of 1, 1, 10 and 10 queries per second, respectively). The final server has a limit of 0, which means unlimited, and we can easily test this:

$ for a in {0..1000}; do dig powerdns.com @127.0.0.1 -p 5200 +noall > /dev/null; done
> showServers()
#   Address                   State     Qps    Qlim Ord Wt    Queries   Drops Drate   Lat Pools
0   [2001:4860:4860::8888]:53    up     1.0       1   1  1          7       0   0.0   1.6
1   [2001:4860:4860::8844]:53    up     1.0       1   1  1          6       0   0.0   0.6
2   [2620:0:ccc::2]:53           up    10.3      10   1  1         64       0   0.0   2.4
3   [2620:0:ccd::2]:53           up    10.3      10   1  1         63       0   0.0   2.4
4   192.168.1.2:53               up   125.8       0   1  1        671       0   0.0   0.4
All                                   145.0                       811       0     

Note that the first 4 servers were all limited to near their configured QPS, and that our final server was taking up most of the traffic. No queries were dropped, and all servers remain up.

To force a server down, try:

> getServer(0):setDown()
> showServers()
#   Address                   State     Qps    Qlim Ord Wt    Queries   Drops Drate   Lat Pools
0   [2001:4860:4860::8888]:53  DOWN     0.0       1   1  1          8       0   0.0   0.0 
...

The 'DOWN' in all caps means it was forced down. A lower case 'down' would have meant that dnsdist itself had concluded the server was down. Similarly, setUp() forces a server to be up, and setAuto() returns it to the default availability probing.
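The three availability overrides can be tried from the console in sequence. A short sketch, assuming server index 0 as reported by showServers():

```lua
-- Console commands; the index comes from the first column of showServers().
getServer(0):setDown()  -- forced down, shown as 'DOWN'
getServer(0):setUp()    -- forced up, regardless of health checks
getServer(0):setAuto()  -- back to the default availability probing
showServers()           -- inspect the resulting state
```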

To change the QPS for a server:

> getServer(0):setQPS(1000)

By default, the availability of a downstream server is checked by regularly sending an A query for "a.root-servers.net.". A different query type and target can be specified by passing, respectively, the checkType and checkName parameters to newServer. The default behavior is to consider any valid response with an RCODE different from ServFail as valid. If the mustResolve parameter of newServer is set to true, a response will only be considered valid if its RCODE differs from NXDomain, ServFail and Refused. The number of health check failures before a server is considered down is configurable via the maxCheckFailures parameter, defaulting to 1. The CD flag can be set on the query by setting setCD to true.

newServer({address="192.0.2.1", checkType="AAAA", checkName="a.root-servers.net.", mustResolve=true})
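The remaining health-check parameters named above can be passed in the same table. A sketch with illustrative values, assuming a second backend at the documentation address 192.0.2.2:

```lua
newServer({
  address="192.0.2.2",            -- hypothetical backend
  checkType="A",
  checkName="a.root-servers.net.",
  maxCheckFailures=3,             -- failed checks before the server is considered down
  setCD=true                      -- set the CD flag on the health-check query
})
```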

In order to provide the downstream server with the address of the real client, or at least the one talking to dnsdist, the useClientSubnet parameter can be used when declaring a new server. This parameter indicates whether an EDNS Client Subnet option should be added to the request. If the incoming request already contains an EDNS Client Subnet value, it will not be overridden unless setECSOverride() is set to true. The default source prefix-length is 24 for IPv4 and 56 for IPv6, meaning that for a query received from 192.0.2.42, the EDNS Client Subnet value sent to the backend will be 192.0.2.0/24. This can be changed with:

> setECSSourcePrefixV4(24)
> setECSSourcePrefixV6(56)

In addition to the global settings, rules and Lua bindings can alter this behavior per query.

In effect this means that for the EDNS Client Subnet option to be added to the request, useClientSubnet should be set to true for the backend used (it defaults to false) and ECS should not have been disabled by calling DisableECSAction() or setting dq.useECS to false (it defaults to true).
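Put together, the ECS settings described in this section might be combined as follows. This is a sketch only: the domain example.org. is a placeholder, and the rule is built with makeRule() and addAction(), both described later in this document.

```lua
-- Add ECS for this backend (useClientSubnet defaults to false).
newServer({address="192.0.2.1", useClientSubnet=true})

-- Replace an ECS value already present in the incoming query.
setECSOverride(true)

-- Source prefix lengths used to build the ECS value.
setECSSourcePrefixV4(24)
setECSSourcePrefixV6(56)

-- Illustrative rule: never add ECS for queries under example.org.
addAction(makeRule("example.org."), DisableECSAction())
```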

TCP timeouts

By default, a 2 second timeout is enforced on the TCP connection from the client, meaning that a connection will be closed if the query cannot be read in less than 2 seconds or if the answer cannot be sent in less than 2s. This can be configured with:

> setTCPRecvTimeout(5)
> setTCPSendTimeout(5)

The same kind of timeouts are enforced on the TCP connections to the downstream servers. The default value of 30 seconds can be modified by passing the tcpRecvTimeout and tcpSendTimeout parameters to newServer, with an additional tcpConnectTimeout parameter controlling the connection timeout (5s by default). If the TCP connection to a downstream server fails, dnsdist will try to establish a new one up to retries times before giving up.

newServer({address="192.0.2.1", tcpConnectTimeout=5, tcpRecvTimeout=10, tcpSendTimeout=10, retries=5})

Source address

In multi-homed setups, it can be useful to be able to select the source address or the outgoing interface used by dnsdist to contact a downstream server. This can be done by using the source parameter:

newServer({address="192.0.2.1", source="192.0.2.127"})
newServer({address="192.0.2.1", source="eth1"})
newServer({address="192.0.2.1", source="192.0.2.127@eth1"})

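The supported values for source are, as in the three examples above: an IPv4 or IPv6 address, an interface name, or an address followed by '@' and an interface name.

Specifying the interface name is only supported on systems having IP_PKTINFO.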

Configuration management

At startup, configuration is read from the command line and the configuration file. The config can also be inspected and changed from the console. Sadly, our architecture does not allow us to serialize the running configuration for you. However, we do try to offer the next best thing: delta().

delta() shows all commands entered that changed the configuration. So adding a new downstream server with newServer() would show up, but showServers() or even delta() itself would not.

It is suggested to study the output of delta() carefully before appending it to your configuration file.

> setACL("192.0.2.0/24")
> showACL()
192.0.2.0/24
> delta()
-- Wed Dec 23 2015 15:15:35 CET
setACL("192.0.2.0/24")
> addACL("127.0.0.1/8")
> showACL()
192.0.2.0/24
127.0.0.1/8
> delta()
-- Wed Dec 23 2015 15:15:35 CET
setACL("192.0.2.0/24")
-- Wed Dec 23 2015 15:15:44 CET
addACL("127.0.0.1/8")
>

Webserver

To visually interact with dnsdist, try adding:

webserver("127.0.0.1:8083", "supersecretpassword", "supersecretAPIkey")

to the configuration, and point your browser at http://127.0.0.1:8083 and log in with any username, and that password. Enjoy!

By default, our web server sends some security-related headers:

You can override those headers, or add custom headers, by using the last parameter to webserver(). For example, to remove the X-Frame-Options header and add an X-Custom one:

webserver("127.0.0.1:8080", "supersecret", "apikey", {["X-Frame-Options"]= "", ["X-Custom"]="custom"})

Server pools

Now for some cool stuff. Let's say we know we're getting a whole bunch of traffic for a domain used in DoS attacks, for example 'sh43354.cn'. We can do two things with this kind of traffic. Either we block it outright, like this:

> addDomainBlock("sh43354.cn.")

Or we configure a server pool dedicated to receiving the nasty stuff:

> newServer({address="192.168.1.3", pool="abuse"})
> addPoolRule({"sh43354.cn.", "ezdns.it."}, "abuse")

The wonderful thing about this last solution is that it can also be used for things where a domain might possibly be legit, but it is still causing load on the system and slowing down the Internet for everyone. With such an abuse server, 'bad traffic' still gets a chance of an answer, but without impacting the rest of the world (too much).

We can similarly add clients to the abuse server:

> addPoolRule({"192.168.12.0/24", "192.168.13.14"}, "abuse")

To define a pool that should receive only a QPS-limited amount of traffic, do:

> addQPSPoolRule("com.", 10000, "gtld-cluster")

Traffic exceeding the QPS limit will not match that rule, and subsequent rules will apply normally.
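A QPS-limited pool needs at least one server assigned to it. A minimal sketch, assuming two hypothetical backends at documentation addresses:

```lua
-- Backends that make up the limited pool (addresses are placeholders).
newServer({address="192.0.2.10", pool="gtld-cluster"})
newServer({address="192.0.2.11", pool="gtld-cluster"})

-- Up to 10000 qps of com. traffic goes to that pool; queries beyond
-- the limit do not match this rule and fall through to later rules.
addQPSPoolRule("com.", 10000, "gtld-cluster")
```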

Both addDomainBlock and addPoolRule end up in the list of Rules and Actions (for which see below).

Servers can be added to or removed from pools with:

> getServer(7):addPool("abuse")
> getServer(4):rmPool("abuse")

Rules

Rules can be inspected with showRules(), and can be deleted with rmRule(). Rules are evaluated in order, and this order can be changed with mvRule(from, to) (see below for exact semantics).

Rules have selectors and actions. Current selectors are:

Special rules are:

Current actions are:

Current response actions are:

Rules can be added via:

Response rules can be added via:

Cache Hit Response rules, triggered on a cache hit, can be added via:

A DNS rule can be:

Some specific actions do not stop the processing when they match, contrary to all other actions:

A convenience function makeRule() is supplied which will make a NetmaskGroupRule for you or a SuffixMatchNodeRule depending on how you call it. makeRule("0.0.0.0/0") will for example match all IPv4 traffic, makeRule({"be","nl","lu"}) will match all Benelux DNS traffic.
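The selector returned by makeRule() can be handed to addAction() like any other rule. A sketch with illustrative values (the netmask and the delay are placeholders):

```lua
-- A table of suffixes yields a SuffixMatchNodeRule.
addAction(makeRule({"be", "nl", "lu"}), DelayAction(100))

-- A netmask yields a NetmaskGroupRule.
addAction(makeRule("192.0.2.0/24"), DropAction())
```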

All the current rules can be removed at once with:

> clearRules()

It is also possible to replace the current rules by a list of new ones in a single operation with setRules():

> setRules( { newRuleAction(TCPRule(), AllowAction()), newRuleAction(AllRule(), DropAction()) } )

More power

More powerful things can be achieved by defining a function called blockFilter() in the configuration file, which can decide to drop traffic for any reason it wants. If you return 'true' from there, the query will get blocked.

A demo on how to do this and many other things can be found on https://github.com/powerdns/pdns/blob/master/pdns/dnsdistconf.lua and the exact definition of blockFilter() is at the end of this document.

ANY or whatever to TC

The blockFilter() also gets passed a read/writable copy of the DNS Header, via dq.dh. If you invoke setQR(1) on that, dnsdist knows you turned the packet into a response, and will send the answer directly to the original client.

If you also called setTC(1), this will tell the remote client to move to TCP, and in this way you can implement ANY-to-TCP even for downstream servers that lack this feature.

Note that calling addAnyTCRule() achieves the same thing, without involving Lua.
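For comparison, the Lua route described above might look roughly like this. It is a sketch under assumptions: blockFilter() is taken to receive the same dq object used by the other Lua hooks in this document, and the header setters are called with booleans where the text above writes setQR(1) and setTC(1).

```lua
function blockFilter(dq)
  if dq.qtype == 255 then   -- 255 is the ANY query type
    dq.dh:setQR(true)       -- turn the packet into a response
    dq.dh:setTC(true)       -- truncated: the client should retry over TCP
  end
  return false              -- returning true would block the query
end
```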

Rules for traffic exceeding QPS limits

Traffic that exceeds a QPS limit, in total or per IP (subnet) can be matched by a rule.

For example:

addDelay(MaxQPSIPRule(5, 32, 48), 100)

This measures traffic per IPv4 address and per /48 of IPv6, and if traffic for such an address (range) exceeds 5 qps, it gets delayed by 100ms.

As another example:

addAction(MaxQPSIPRule(5), NoRecurseAction())

This strips the Recursion Desired (RD) bit from any traffic per IPv4 or IPv6 /64 that exceeds 5 qps. This means any of those traffic bins is allowed to make a recursor do 'work' for only 5 qps.

If this is not enough, try:

addAction(MaxQPSIPRule(5), DropAction())
-- or
addAction(MaxQPSIPRule(5), TCAction())

This will respectively drop traffic exceeding that 5 QPS limit per IP or range, or return it with TC=1, forcing clients to fall back to TCP.

To turn this per IP or range limit into a global limit, use NotRule(MaxQPSRule(5000)) instead of MaxQPSIPRule.
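Written out, the global variant of the drop rule would be along these lines (5000 is the illustrative limit from the sentence above):

```lua
-- Drop whatever exceeds 5000 qps in total, across all clients.
addAction(NotRule(MaxQPSRule(5000)), DropAction())
```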

TeeAction

This action sends off a copy of a UDP query to another server, and keeps statistics on the responses received. Sample use:

> addAction(AllRule(), TeeAction("192.168.1.54"))
> getAction(0):printStats()
refuseds    0
nxdomains   0
noerrors    0
servfails   0
recv-errors 0
tcp-drops   0
responses   0
other-rcode 0
send-errors 0
queries 0

It is also possible to share a TeeAction between several rules. Statistics will be combined in that case.
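Sharing amounts to creating the action once and passing the same object to several rules. A sketch, assuming these are the first two rules in the list and using placeholder selectors:

```lua
tee = TeeAction("192.168.1.54")

-- Both rules feed the same TeeAction, so its statistics are combined.
addAction(makeRule("example.com."), tee)
addAction(makeRule("192.0.2.0/24"), tee)

-- Rule 0 holds the shared action.
getAction(0):printStats()
```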

Lua actions in rules

While we can pass every packet through the blockFilter() function, it is also possible to configure dnsdist to only hand off some packets for Lua inspection. If you think Lua is too slow for your query load, or if you are doing heavy processing in Lua, this may make sense.

To select specific packets for Lua attention, use addLuaAction(x, func), where x is either a netmask, or a domain suffix, or a table of netmasks or a table of domain suffixes. This is identical to how addPoolRule() selects.

The function should look like this:

function luarule(dq)
        if(dq.qtype==35) -- NAPTR
        then
                return DNSAction.Pool, "abuse" -- send to abuse pool
        else
                return DNSAction.None, ""      -- no action
        end
end

Valid return values for LuaAction functions are:

The same feature exists to hand off some responses for Lua inspection, using addLuaResponseAction(x, func).
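A function of the shape shown above is attached to a selector with addLuaAction(). The following sketch uses a hypothetical handler name and placeholder client ranges:

```lua
-- Hypothetical handler: send TXT queries to the "abuse" pool.
function txtToAbuse(dq)
  if dq.qtype == 16 then            -- 16 is the TXT query type
    return DNSAction.Pool, "abuse"
  end
  return DNSAction.None, ""         -- leave everything else alone
end

-- Only queries from these clients are handed to Lua.
addLuaAction({"192.168.12.0/24", "192.168.13.14"}, txtToAbuse)
```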

DNSSEC

To provide DNSSEC service from a separate pool, try:

newServer({address="2001:888:2000:1d::2", pool="dnssec"})
newServer({address="2a01:4f8:110:4389::2", pool="dnssec"})
setDNSSECPool("dnssec")
topRule()

This routes all queries that have the DNSSEC OK (DO) or CD bit set to the "dnssec" pool. The final topRule() command moves this rule to the top, so it gets evaluated first.

Regular Expressions

RegexRule() matches a regular expression on the query name, and it works like this:

addAction(RegexRule("[0-9]{5,}"), DelayAction(750)) -- milliseconds
addAction(RegexRule("[0-9]{4,}\\.cn$"), DropAction())

This delays any query for a domain name with 5 or more consecutive digits in it. The second rule drops anything with 4 or more consecutive digits directly before the .cn suffix.

Note that the query name is presented without a trailing dot to the regex. The regex is applied case insensitively.

Alternatively, if compiled in, RE2Rule provides similar functionality, but against libre2.
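A sketch of the RE2 variant, assuming dnsdist was built with re2 and that RE2Rule() takes the pattern as its single argument like RegexRule() does. Matching details such as case sensitivity may differ between the two engines.

```lua
addAction(RE2Rule("[0-9]{4,}\\.cn$"), DropAction())
```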

Inspecting live traffic

This is still much in flux, but for now, try:

For example:

> grepq("127.0.0.1/24")
Time    Client                                          Server       ID    Name                      Type  Lat.   TC RD AA Rcode
-11.9   127.0.0.1:52599                                              16127 nxdomain.powerdns.com.    A               RD    Question
-11.7   127.0.0.1:52599                                 127.0.0.1:53 16127 nxdomain.powerdns.com.    A     175.6     RD    Non-Existent domain
> grepq("powerdns.com")
Time    Client                                          Server       ID    Name                      Type  Lat.   TC RD AA Rcode
-38.7   127.0.0.1:52599                                              16127 nxdomain.powerdns.com.    A               RD    Question
-38.6   127.0.0.1:52599                                 127.0.0.1:53 16127 nxdomain.powerdns.com.    A     175.6     RD    Non-Existent domain

Live histogram of latency

> showResponseLatency()
Average response latency: 78.84 msec
   msec 
   0.10 
   0.20 .
   0.40 **********************
   0.80 ***********
   1.60 .
   3.20 
   6.40 .
  12.80 *
  25.60 *
  51.20 *
 102.40 **********************************************************************
 204.80 *************************
 409.60 **
 819.20 :
1638.40 .

Where : stands for 'half a star' and . for 'less than half a star, but something was there'.

Per domain or subnet QPS limiting

If certain domains or source addresses are generating onerous amounts of traffic, you can put ceilings on the amount of traffic you are willing to forward:

> addQPSLimit("h4xorbooter.xyz.", 10)
> addQPSLimit({"130.161.0.0/16", "145.14.0.0/16"} , 20)
> addQPSLimit({"nl.", "be."}, 1)
> showRules()
#     Matches Rule                                               Action
0           0 h4xorbooter.xyz.                                   qps limit to 10
1           0 130.161.0.0/16, 145.14.0.0/16                      qps limit to 20
2           0 nl., be.                                           qps limit to 1

To delete a limit (or a rule in general):

> rmRule(1)
> showRules()
#     Matches Rule                                               Action
0           0 h4xorbooter.xyz.                                   qps limit to 10
1           0 nl., be.                                           qps limit to 1

Delaying answers

Sometimes, runaway scripts will hammer your servers with back-to-back queries. While it is possible to drop such packets, this may paradoxically lead to more traffic.

An attractive middle ground is to delay answers to such back-to-back queries, causing a slowdown on the side of the source of the traffic.

To do so, use:

> addDelay("yourdomain.in.ua.", 500)
> addDelay({"65.55.37.0/24"}, 500)

This will delay responses for questions to the mentioned domain, or coming from the configured subnet, by half a second.

Like the QPSLimits and other rules, the delaying instructions can be inspected or edited using showRules(), rmRule(), topRule(), mvRule() etc.

Dynamic load balancing

The default load balancing policy is called leastOutstanding, which means we pick the server with the fewest queries 'in the air' (and within those, the one with the lowest order, and within those, the one with the lowest latency).

Another policy, firstAvailable, picks the first server that has not exceeded its QPS limit. If all servers are above their QPS limit, a server is selected based on the leastOutstanding policy. For now this is the only policy using the QPS limit.

A further policy, wrandom, assigns queries randomly, but based on the weight parameter passed to newServer. whashed is a similar weighted policy, but assigns questions with identical hash to identical servers, allowing for better cache concentration ('sticky queries').
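A weighted setup could be declared as follows. The addresses and weights are placeholders; with these values the first server should receive roughly three out of four queries under wrandom.

```lua
newServer({address="192.0.2.1", weight=3})
newServer({address="192.0.2.2", weight=1})

setServerPolicy(wrandom)   -- or whashed for 'sticky queries'
```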

If you don't like the default policies you can create your own, like this for example:

counter=0
function luaroundrobin(servers, dq)
     counter=counter+1
     return servers[1+(counter % #servers)]
end

setServerPolicyLua("luaroundrobin", luaroundrobin)

Incidentally, this is similar to setting: setServerPolicy(roundrobin) which uses the C++ based roundrobin policy.

Lua server policies

If the built in rules do not suffice to pick a server pool, full flexibility is available from Lua. For example:

newServer("192.168.1.2")
newServer({address="8.8.4.4", pool="numbered"})

function splitSetup(servers, dq)
         if(string.match(dq.qname:toString(), "%d"))
         then
                print("numbered pool")
                return leastOutstanding.policy(getPoolServers("numbered"), dq)
         else
                print("standard pool")
                return leastOutstanding.policy(servers, dq)
         end
end

setServerPolicyLua("splitsetup", splitSetup)

This will forward queries containing a number to the pool of "numbered" servers, and will apply the default load balancing policy to all other queries.

Dynamic Rule Generation

To set dynamic rules, based on recent traffic, define a function called maintenance() in Lua. It will get called every second, and from this function you can set rules to block traffic based on statistics. More exactly, the thread handling the maintenance() function will sleep for one second between each invocation, so if the function takes several seconds to complete it will not be invoked exactly every second.

As an example:

function maintenance()
    addDynBlocks(exceedQRate(20, 10), "Exceeded query rate", 60)
end

This will dynamically block all hosts that exceeded 20 queries/s as measured over the past 10 seconds, and the dynamic block will last for 60 seconds.

Dynamic blocks in force are displayed with showDynBlocks() and can be cleared with clearDynBlocks(). The full set of exceed functions is listed in the table of all functions below. They return a table whose key is a ComboAddress object, representing the client's source address, and whose value is an integer representing the number of queries matching the corresponding condition (for example the qtype for exceedQTypeRate(), rcode for exceedServFails()).
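Several conditions can be checked from the same maintenance() function. In this sketch the argument order of exceedServFails() is assumed to mirror exceedQRate() (rate, then number of seconds); check it against the function table before relying on it.

```lua
function maintenance()
  -- Block clients above 20 queries/s over the past 10 seconds.
  addDynBlocks(exceedQRate(20, 10), "Exceeded query rate", 60)
  -- Assumed signature: block clients triggering more than 5 ServFails/s.
  addDynBlocks(exceedServFails(5, 10), "Exceeded ServFail rate", 60)
end
```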

Dynamic blocks drop matched queries by default, but this behavior can be changed with setDynBlocksAction(). For example, to send a REFUSED code instead of dropping the query:

setDynBlocksAction(DNSAction.Refused)

Running it for real

First run on the command line, and generate a key:

# dnsdist
> makeKey()
setKey("sepuCcHcQnSAZgNbNPCCpDWbujZ5esZJmrt/wh6ldkQ=")

Now add this setKey line to dnsdist.conf, and also add:

controlSocket("0.0.0.0") -- or add portnumber too

Then start dnsdist as a daemon, and then connect to it:

# dnsdist --daemon
# dnsdist --client
> 

Please note that, without libsodium support, 'makeKey()' will return setKey("plaintext") and the communication between the client and the server will not be encrypted.
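The resulting dnsdist.conf might then contain something like the following. The key is the sample output shown above and should be replaced by your own makeKey() result; the loopback address and the port number are assumptions chosen for illustration.

```lua
-- Key printed by makeKey(); generate your own, do not reuse this one.
setKey("sepuCcHcQnSAZgNbNPCCpDWbujZ5esZJmrt/wh6ldkQ=")

-- Console socket, restricted to loopback here; the port is illustrative.
controlSocket("127.0.0.1:5199")

newServer("192.168.1.2")
```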

Some versions of libedit, notably the CentOS 6 one, may require the following addition to ~/.editrc in order to support searching through the history:

bind "^R" em-inc-search-prev

ACL, who can use dnsdist

For safety reasons, by default only private networks can use dnsdist, see below how to query and change the ACL:

> showACL()
127.0.0.0/8
10.0.0.0/8
(...)
::1/128
fc00::/7
fe80::/10
> addACL("130.161.0.0/16")
> setACL({"::/0"}) -- resets the list to this array
> showACL()
::/0

Caching

dnsdist implements a simple but effective packet cache, not enabled by default. It is enabled per-pool, but the same cache can be shared between several pools. The first step is to define a cache, then to assign that cache to the chosen pool, the default one being represented by the empty string:

pc = newPacketCache(10000, 86400, 0, 60, 60)
getPool(""):setCache(pc)

The first parameter (10000) is the maximum number of entries stored in the cache, and is the only one required. All the other parameters are optional and in seconds. The second one (86400) is the maximum lifetime of an entry in the cache, the third one (0) is the minimum TTL an entry should have to be considered for insertion in the cache, the fourth one (60) is the TTL used for a Server Failure or a Refused response. The last one (60) is the TTL that will be used when a stale cache entry is returned.

For performance reasons the cache will pre-allocate buckets based on the maximum number of entries, so be careful to set the first parameter to a reasonable value. Something along the lines of a dozen bytes per pre-allocated entry can be expected on 64-bit. That does not mean that the memory is completely allocated up-front, the final memory usage depending mostly on the size of cached responses and therefore varying during the cache's lifetime. Assuming an average response size of 512 bytes, a cache size of 10000000 entries on a 64-bit host with 8GB of dedicated RAM would be a safe choice.

The setStaleCacheEntriesTTL(n) directive can be used to allow dnsdist to use expired entries from the cache when no backend is available. Only entries that have expired for less than n seconds will be used, and the returned TTL can be set when creating a new cache with newPacketCache().
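A sketch combining cache sharing and stale entries, assuming a second pool named "abuse" and an illustrative stale window of one hour:

```lua
newServer({address="192.168.1.3", pool="abuse"})

-- One cache, assigned to both the default pool and the "abuse" pool.
pc = newPacketCache(10000, 86400, 0, 60, 60)
getPool(""):setCache(pc)
getPool("abuse"):setCache(pc)

-- When no backend is available, serve entries that expired less than
-- 3600 seconds ago, with the stale TTL given to newPacketCache() (60).
setStaleCacheEntriesTTL(3600)
```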

A reference to the cache assigned to a specific pool can be retrieved with:

getPool("poolname"):getCache()

And removed with:

getPool("poolname"):unsetCache()

Cache usage stats (hits, misses, deferred inserts and lookups, collisions) can be displayed by using the printStats() method:

getPool("poolname"):getCache():printStats()

Expired cached entries can be removed from a cache using the purgeExpired(n) method, which will remove expired entries from the cache until at most n entries remain in the cache. For example, to remove all expired entries:

getPool("poolname"):getCache():purgeExpired(0)

Specific entries can also be removed using the expungeByName(DNSName [, qtype=ANY]) method.

getPool("poolname"):getCache():expungeByName(newDNSName("powerdns.com"), dnsdist.A)

Finally, the expunge(n) method will remove all entries until at most n entries remain in the cache:

getPool("poolname"):getCache():expunge(0)

Performance tuning

First, a few words about dnsdist architecture:

The maximum number of threads in the TCP pool is controlled by the setMaxTCPClientThreads() directive, and defaults to 10. This number can be increased to handle a large number of simultaneous TCP connections. If all the TCP threads are busy, new TCP connections are queued while they wait to be picked up. The maximum number of queued connections can be configured with setMaxTCPQueuedConnections() and defaults to 1000. Any value larger than 0 will cause new connections to be dropped if there are already too many queued. By default, every TCP worker thread has its own queue, and the incoming TCP connections are dispatched to TCP workers on a round-robin basis. This might cause issues if some connections are taking a very long time, since incoming ones will be waiting until the TCP worker they have been assigned to has finished handling its current query, while other TCP workers might be available. The experimental setTCPUseSinglePipe(true) directive can be used so that all the incoming TCP connections are put into a single queue and handled by the first TCP worker available.
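In a configuration file the TCP directives above could be set like this. The values are illustrative, not recommendations:

```lua
setMaxTCPClientThreads(20)        -- default is 10
setMaxTCPQueuedConnections(2000)  -- default is 1000
setTCPUseSinglePipe(true)         -- experimental: one shared queue for all TCP workers
```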

When dispatching UDP queries to backend servers, dnsdist keeps track of at most n outstanding queries for each backend. This number n can be tuned by the setMaxUDPOutstanding() directive, defaulting to 10240, with a maximum value of 65535. Large installations are advised to increase the default value at the cost of a slightly increased memory usage.

Most of the query processing is done in C++ for maximum performance, but some operations are executed in Lua for maximum flexibility:

While Lua is fast, its use should be restricted to what is strictly necessary in order to achieve maximum performance. It might also be worth considering using LuaJIT instead of Lua. When Lua inspection is needed, the best course of action is to restrict the queries sent to Lua inspection by using addLuaAction() instead of inspecting all queries in the blockFilter() function.

dnsdist design choices mean that the processing of UDP queries is done by only one thread per local bind. This is great to keep lock contention to a low level, but might not be optimal for setups using a lot of processing power, caused for example by a large number of complicated rules. To be able to use more CPU cores for UDP queries processing, it is possible to use the reuseport parameter of the addLocal() and setLocal() directives to be able to add several identical local binds to dnsdist:

addLocal("192.0.2.1:53", true, true)
addLocal("192.0.2.1:53", true, true)
addLocal("192.0.2.1:53", true, true)
addLocal("192.0.2.1:53", true, true)

dnsdist will then add four identical local binds as if they were different IPs or ports, start four threads to handle incoming queries and let the kernel load balance those randomly to the threads, thus using four CPU cores for rules processing. Note that this requires SO_REUSEPORT support in the underlying operating system (added for example in Linux 3.9). Please also be aware that doing so will increase lock contention and might therefore not scale linearly. This is especially true for Lua-intensive setups, because Lua processing in dnsdist is serialized by a unique lock for all threads.

Another possibility is to use the reuseport option to run several dnsdist processes in parallel on the same host, thus avoiding the lock contention issue at the cost of having to deal with the fact that the different processes will not share information, like statistics or DDoS offenders.

The UDP threads handling the responses from the backends do not use a lot of CPU, but if needed it is also possible to add the same backend several times to the dnsdist configuration to distribute the load over several responder threads.

newServer({address="192.0.2.127:53", name="Backend1"})
newServer({address="192.0.2.127:53", name="Backend2"})
newServer({address="192.0.2.127:53", name="Backend3"})
newServer({address="192.0.2.127:53", name="Backend4"})

Carbon/Graphite/Metronome

To emit metrics to Graphite, or any other software supporting the Carbon protocol, use:

carbonServer('ip-address-of-carbon-server', 'ourname', 30)

Where 'ourname' can be used to override your hostname, and '30' is the reporting interval in seconds. The last two arguments can be omitted. The latest version of PowerDNS Metronome comes with attractive graphs for dnsdist by default.

Query counters

When using carbonServer, it is also possible to send per-record statistics of the amount of queries by using setQueryCount(true). With query counting enabled, dnsdist will increase a counter for every unique record, or follow the behaviour you define in a custom Lua function registered with setQueryCountFilter(func). This filter can decide whether to keep count on a query at all or rewrite for which query the counter will be increased. An example of a QueryCountFilter would be:

function filter(dq)
  -- keep the name local so it does not leak into the global Lua state
  local qname = dq.qname:toString()

  -- don't count PTRs at all ('%.' matches a literal dot, '%.?' tolerates a trailing dot)
  if(qname:match('in%-addr%.arpa%.?$')) then
    return false, ""
  end

  -- count these queries as if they were queried without leading www.
  if(qname:match('^www%.')) then
    qname = qname:gsub('^www%.', '')
  end

  -- count queries by default
  return true, qname
end

setQueryCountFilter(filter)
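The pattern logic of such a filter can be checked outside dnsdist by running it against plain strings. The sketch below is only an illustration: countKey is a hypothetical helper name, and the sample names assume that the string form of a query name may end with a trailing dot. In Lua patterns an unescaped '.' matches any character, so literal dots are written as '%.'.

```lua
-- Hypothetical helper mirroring the filter logic on plain strings.
local function countKey(qname)
  -- skip reverse lookups; '%.?' tolerates an optional trailing dot
  if qname:match('in%-addr%.arpa%.?$') then
    return false, ""
  end
  -- strip a leading "www." label; gsub also returns a match count,
  -- so only its first result (the string) is kept
  local stripped = qname:gsub('^www%.', '')
  return true, stripped
end

print(countKey("www.example.com."))
print(countKey("4.3.2.1.in-addr.arpa."))
```

Because the dot is escaped, a name such as wwwxexample.com. is left untouched rather than being truncated.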

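As the example shows, a QueryCountFilter function returns two values: a boolean telling dnsdist whether to count the query, and the name under which it should be counted (an empty string when the query is not counted).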

Note that the query counters are buffered and flushed each time statistics are sent to the carbon server. The current content of the buffer can be inspected with getQueryCounters(). If you decide to enable query counting without carbonServer, make sure you clear the counters yourself by calling clearQueryCounters() from maintenance().
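A minimal sketch of clearing the counters without carbonServer could look like the following. It assumes that dnsdist calls maintenance() about once per second; the ticks variable and the 60-second period are illustrative choices, not values taken from dnsdist.

```lua
setQueryCount(true)

-- Assumption: maintenance() runs roughly once per second,
-- so the counters are cleared about every 60 seconds.
ticks = 0
function maintenance()
  ticks = ticks + 1
  if ticks >= 60 then
    clearQueryCounters()
    ticks = 0
  end
end
```

If the configuration already defines maintenance(), for example for dynamic blocks, the counting logic would go inside that existing function rather than in a second definition.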

DNSCrypt

dnsdist, when compiled with --enable-dnscrypt, can be used as a DNSCrypt server, uncurving queries before forwarding them to downstream servers and curving responses back. To make dnsdist listen for incoming DNSCrypt queries on 127.0.0.1 port 8443, with a provider name of "2.providername", using a resolver certificate and associated key stored respectively in the resolver.cert and resolver.key files, the addDNSCryptBind() directive can be used:

addDNSCryptBind("127.0.0.1:8443", "2.providername", "/path/to/resolver.cert", "/path/to/resolver.key")

To generate the provider and resolver certificates and keys, you can simply do:

> generateDNSCryptProviderKeys("/path/to/providerPublic.key", "/path/to/providerPrivate.key")
Provider fingerprint is: E1D7:2108:9A59:BF8D:F101:16FA:ED5E:EA6A:9F6C:C78F:7F91:AF6B:027E:62F4:69C3:B1AA
> generateDNSCryptCertificate("/path/to/providerPrivate.key", "/path/to/resolver.cert", "/path/to/resolver.key", serial, validFrom, validUntil)

Ideally, the certificates and keys should be generated on dedicated offline hardware and not on the resolver. The resolver key should be rotated regularly and should never touch persistent storage: store it in a tmpfs, with no swap configured.
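As a sketch of how the last three arguments of generateDNSCryptCertificate() might be filled in, the following assumes that serial is an integer and that validFrom and validUntil are UNIX timestamps. The one-day validity and the serial value are purely illustrative.

```lua
-- Assumptions: integer serial, UNIX timestamps for the validity window.
serial = 1                        -- illustrative value
validFrom = os.time()             -- now
validUntil = validFrom + 86400    -- one day later, illustrative

generateDNSCryptCertificate("/path/to/providerPrivate.key", "/path/to/resolver.cert", "/path/to/resolver.key", serial, validFrom, validUntil)
```

A short validity window fits the advice to rotate the resolver key regularly, with each new certificate being given a higher serial than the previous one.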

You can display the currently configured DNSCrypt binds with:

> showDNSCryptBinds()
#   Address              Provider Name        Serial   Validity              P. Serial P. Validity
0   127.0.0.1:8443       2.name               14       2016-04-10 08:14:15   0         -

If you forgot to write down the provider fingerprint value after generating the provider keys, you can use printDNSCryptProviderFingerprint() to retrieve it later:

> printDNSCryptProviderFingerprint("/path/to/providerPublic.key")
Provider fingerprint is: E1D7:2108:9A59:BF8D:F101:16FA:ED5E:EA6A:9F6C:C78F:7F91:AF6B:027E:62F4:69C3:B1AA

AXFR, IXFR and NOTIFY

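When dnsdist is deployed in front of a master authoritative server, it might receive AXFR or IXFR queries destined to this master. Two issues can arise in this kind of setup. First, if the master is part of a pool of servers, dnsdist may direct the initial SOA query to a different server than the AXFR/IXFR query that follows, which causes problems when the servers are not perfectly synchronised. Second, if the master restricts AXFR/IXFR based on the source address, it will see the address of dnsdist rather than that of the requestor.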

The first issue can be solved by routing SOA, AXFR and IXFR requests explicitly to the master:

> newServer({address="192.168.1.2", name="master", pool={"master", "otherpool"}})
> addAction(OrRule({QTypeRule(dnsdist.SOA), QTypeRule(dnsdist.AXFR), QTypeRule(dnsdist.IXFR)}), PoolAction("master"))

The second one might require allowing AXFR/IXFR from the dnsdist source address on the master and moving the source address check to dnsdist's side:

> addAction(AndRule({OrRule({QTypeRule(dnsdist.AXFR), QTypeRule(dnsdist.IXFR)}), NotRule(makeRule("192.168.1.0/24"))}), RCodeAction(dnsdist.REFUSED))

When dnsdist is deployed in front of slaves, however, an issue might arise with NOTIFY queries, because the slave will receive a notification coming from the dnsdist address rather than from the master's. One way to fix this issue is to allow NOTIFY from the dnsdist address on the slave side (for example with PowerDNS's trusted-notification-proxy) and move the address check to dnsdist's side:

> addAction(AndRule({OpcodeRule(DNSOpcode.Notify), NotRule(makeRule("192.168.1.0/24"))}), RCodeAction(dnsdist.REFUSED))

eBPF Socket Filtering

dnsdist can use eBPF socket filtering on recent Linux kernels (4.1+) built with eBPF support (CONFIG_BPF, CONFIG_BPF_SYSCALL, ideally CONFIG_BPF_JIT). This feature might require raising the memory limit associated with a socket, via the sysctl setting net.core.optmem_max. When an eBPF program is attached to a socket, the size of the program is checked against this limit, and the default value might not be enough. Large map sizes might also require an increase of RLIMIT_MEMLOCK.

This feature allows dnsdist to ask the kernel to discard incoming packets in kernel space instead of copying them to userspace just to be dropped, which is a lot faster.

The BPF filter can be used to block incoming queries manually:

> bpf = newBPFFilter(1024, 1024, 1024)
> bpf:attachToAllBinds()
> bpf:block(newCA("2001:DB8::42"))
> bpf:blockQName(newDNSName("evildomain.com"), 255)
> bpf:getStats()
[2001:DB8::42]: 0
evildomain.com. 255: 0
> bpf:unblock(newCA("2001:DB8::42"))
> bpf:unblockQName(newDNSName("evildomain.com"), 255)
> bpf:getStats()
>

The blockQName() method can be used to block queries based on the exact qname supplied, in a case-insensitive way, and an optional qtype. Using the 255 (ANY) qtype will block all queries for the qname, regardless of the qtype. Contrary to source address filtering, qname filtering only works over UDP. TCP qname filtering can be done the usual way:

> addAction(AndRule({TCPRule(true), makeRule("evildomain.com")}), DropAction())

The attachToAllBinds() method attaches the filter to every existing bind at runtime, but it's also possible to define a default BPF filter at configuration time, so it's automatically attached to every bind:

bpf = newBPFFilter(1024, 1024, 1024)
setDefaultBPFFilter(bpf)

Finally, it's also possible to attach it to specific binds at runtime:

> bpf = newBPFFilter(1024, 1024, 1024)
> showBinds()
#   Address              Protocol  Queries
0   [::]:53              UDP       0
1   [::]:53              TCP       0
> bd = getBind(0)
> bd:attachFilter(bpf)

dnsdist also supports adding dynamic, expiring blocks to a BPF filter:

bpf = newBPFFilter(1024, 1024, 1024)
setDefaultBPFFilter(bpf)
dbpf = newDynBPFFilter(bpf)
function maintenance()
        addBPFFilterDynBlocks(exceedQRate(20, 10), dbpf, 60)
        dbpf:purgeExpired()
end

This will dynamically block all hosts that exceeded 20 queries/s as measured over the past 10 seconds, and the dynamic block will last for 60 seconds.

This feature has been successfully tested on Arch Linux, Arch Linux ARM, Fedora Core 23 and Ubuntu Xenial.

SNMP support

dnsdist supports exporting statistics and sending traps over SNMP when compiled with Net SNMP support, acting as an AgentX subagent. SNMP support is enabled via the snmpAgent(enableTraps [, masterSocket]) directive, where enableTraps is a boolean indicating whether traps should be sent and masterSocket is an optional string specifying how to connect to the master agent. The default for this last parameter is to use a Unix socket, but other options are available, such as TCP: tcp:localhost:705
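For illustration, a configuration enabling the SNMP agent with traps and connecting to the master agent over TCP might look like this. The address is the one quoted above and would need to match the actual snmpd setup.

```lua
-- Enable SNMP support with traps, reaching the master agent over TCP
-- instead of the default Unix socket.
snmpAgent(true, "tcp:localhost:705")
```

Leaving out the second argument, as in snmpAgent(true), keeps the default Unix socket.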

By default, the only traps sent when enableTraps is set to true are backend status change notifications, but traps can also be sent:

Net SNMP snmpd doesn't accept subagent connections by default, so to use the SNMP features of dnsdist the following line should be added to the snmpd.conf configuration file:

master agentx

In addition, the permissions on the resulting socket might need to be adjusted so that the dnsdist user can write to it. This can be done with the following line in snmpd.conf (assuming dnsdist is running as dnsdist:dnsdist):

agentxperms 0700 0700 dnsdist dnsdist

In order to allow the retrieval of statistics via SNMP, snmpd's access control has to be configured. A very simple SNMPv2c setup only needs the configuration of a read-only community in snmpd.conf:

rocommunity dnsdist42

snmpd also supports a more secure SNMPv3 setup, using for example the createUser and rouser directives:

createUser myuser SHA "my auth key" AES "my enc key"
rouser myuser

snmpd can be instructed to send SNMPv2 traps to a remote SNMP trap receiver by adding the following directive to the snmpd.conf configuration file:

trap2sink 192.0.2.1

The description of dnsdist's SNMP MIB is available in DNSDIST-MIB.txt.

All functions and types

Within dnsdist several core object types exist:

The existence of most of these objects can be ignored unless you plan to write your own hooks and policies, but it helps to understand expressions like:

> getServer(0).order=12         -- set order of server 0 to 12
> getServer(0):addPool("abuse") -- add this server to the abuse pool

The '.' means 'order' is a data member, while the ':' means addPool is a member function.
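The difference between the two notations is plain Lua and can be tried outside dnsdist. In the sketch below, server is a hypothetical table standing in for a dnsdist server object; it is not the real object, only an illustration of the syntax.

```lua
-- Hypothetical stand-in for a dnsdist server object, for illustration only.
local server = { order = 1, pools = {} }

-- A member function receives the object itself as its first argument.
function server.addPool(self, name)
  self.pools[#self.pools + 1] = name
end

server.order = 12                -- '.' reads or writes a data member
server:addPool("abuse")          -- ':' passes 'server' implicitly as 'self'
server.addPool(server, "other")  -- the same kind of call written with '.'
```

Calling a member function with '.' and forgetting to pass the object is a common mistake, which is why the ':' form is used for functions such as addPool.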

Here are all functions:

All hooks

dnsdist can call Lua per packet if so configured, and will do so with the following hooks: