[MaraDNS list] Deadwood timing out on lookup

Sam Trenholme strenholme.usenet at gmail.com
Sat Aug 6 03:33:14 EDT 2011


OK, so the issue is that we're sometimes getting a SERVFAIL response
from upstream when using upstream_servers (instead of root_servers),
and Deadwood doesn't appear to be handling these kinds of packets well.

Very good.  This gives me enough information to play around with some
tests and start trying to fix that.  That's the good news.  The bad
news is that, as I pointed out earlier today here on the mailing list,
because I'm not getting the necessary sponsorship for my
MaraDNS/Deadwood work, the next day I'm going to look at the code is
on September 5, 2011--one month from today (I just released MaraDNS
2.0.03 today and, at my current level of sponsorship, only work on
MaraDNS/Deadwood once a month).

In the meantime, some ideas:

* If you want to just cache what comes from upstream, try using Deadwood
2.3.07, the supported "tiny" non-recursive version of Deadwood:

http://maradns.org/deadwood/tiny/

In particular, this release has a parameter called "deliver_all" which
may work with your network and its issues.

* Do full recursion with Deadwood.  This is most easily done by simply
commenting out the upstream_servers lines in the dwood3rc; Deadwood
will then contact the root nameservers and follow the referrals until
it gets a usable answer to send to the client.
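
As a rough sketch of that second option: I haven't seen the actual
dwood3rc, so the lines below are an assumption about what it contains,
using the two resolver addresses mentioned in this thread.

```
# Forwarding disabled: both upstream_servers lines are commented out.
# (Hypothetical contents; the real file may list different addresses.)
# upstream_servers = {}
# upstream_servers["."]="8.8.8.8, 4.2.2.2"
```

With those lines commented out and no root_servers set, Deadwood
should fall back to its built-in list of root nameservers.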

While I do a lot of SQA tests with upstream_servers before releasing
Deadwood, I don't have one which recreates this particular network
condition.  I can write code to simulate this condition (or something
similar, such as a DNS server which always returns SERVFAIL) and
see how Deadwood handles it.
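
An "always SERVFAIL" server of that kind could be sketched roughly as
follows.  This is a minimal Python sketch, not part of the Deadwood SQA
tests; the names question_end, build_servfail and serve, and the port
5353, are placeholders made up for illustration.  It assumes each query
carries a single, uncompressed question and simply echoes the query ID
and question back with RCODE 2.

```python
import socket
import struct

RCODE_SERVFAIL = 2  # "Server failure" in the DNS header RCODE field


def question_end(packet: bytes) -> int:
    """Return the offset just past the first question in a DNS packet."""
    pos = 12  # the fixed DNS header is 12 bytes long
    while True:
        if pos >= len(packet):
            raise ValueError("truncated question name")
        length = packet[pos]
        pos += 1
        if length == 0:  # the zero-length root label ends the name
            break
        if length & 0xC0:
            raise ValueError("compressed or extended label not handled")
        pos += length
    pos += 4  # QTYPE and QCLASS, two bytes each
    if pos > len(packet):
        raise ValueError("truncated question")
    return pos


def build_servfail(query: bytes) -> bytes:
    """Build a SERVFAIL reply echoing the query's ID and question."""
    if len(query) < 12:
        raise ValueError("packet shorter than a DNS header")
    query_id, flags = struct.unpack("!HH", query[:4])
    # QR=1 (response), keep the query's opcode and RD bit, RA=1, RCODE=2
    reply_flags = 0x8000 | (flags & 0x7900) | 0x0080 | RCODE_SERVFAIL
    end = question_end(query)
    # One question, no answer, authority, or additional records
    header = struct.pack("!HHHHHH", query_id, reply_flags, 1, 0, 0, 0)
    return header + query[12:end]


def serve(host: str = "127.0.0.1", port: int = 5353) -> None:
    """Answer every UDP query on host:port with SERVFAIL (hypothetical)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind((host, port))
        while True:
            query, client = sock.recvfrom(4096)
            try:
                sock.sendto(build_servfail(query), client)
            except ValueError:
                continue  # ignore packets this sketch cannot parse
```

Calling serve() and then running "dig @127.0.0.1 -p 5353 -t txt
example.com" should report status: SERVFAIL.  To put it in front of
Deadwood, it would need to listen on whatever address and port the
upstream_servers entry points at.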

Thank you for your detailed bug report and I apologize that I'm just
not getting the sponsorship needed right now to justify immediately
looking at and fixing this issue.

And, yes, I do fix issues like this once I allocate time to look at
them; for example:

http://afj2.vk.tj/

Here, I fixed an issue on June 10, 2011 which was reported on June 2,
2011 (this was back when I worked on MaraDNS/Deadwood one day every
two weeks).

- Sam

2011/8/6 Steve Fatula <compconsultant at yahoo.com>:
> Here's something really weird: when I added verbose_level = 1000, the original dig I gave you worked just fine. When I removed the verbose_level line, it failed again. Both times I did a service restart. So I repeated the experiment: I added the line again and restarted the service, and it worked again. Then I deleted the line from dwood3rc and restarted the service, and it failed again. The third time, however, it failed even with the line present. So different starts give different results.
>
> That is very strange!
>
> Here is the verbose_level = 1000 query that failed:
>
> Aug  6 02:07:42 host2 /usr/local/sbin/Deadwood: Got DNS query for \002\070\062\003\061\063\066\003\062\063\062\003\062\060\064\004list\005dnswl\003org\000\000\020
> Aug  6 02:07:42 host2 /usr/local/sbin/Deadwood: Looking in cache for query \002\070\062\003\061\063\066\003\062\063\062\003\062\060\064\004list\005dnswl\003org\000\000\020
> Aug  6 02:07:42 host2 /usr/local/sbin/Deadwood: Nothing found for \002\070\062\003\061\063\066\003\062\063\062\003\062\060\064\004list\005dnswl\003org\000\000\020
> Aug  6 02:07:42 host2 /usr/local/sbin/Deadwood: Making connection to IP 8.8.8.8
>
> And it ends there. If I change the query to specifically go to 8.8.8.8 as follows:
>
> dig @8.8.8.8 -t txt 82.136.232.204.list.dnswl.org
>
>
> I get this:
>
> ; <<>> DiG 9.3.6-P1-RedHat-9.3.6-16.P1.el5 <<>> @8.8.8.8 -t txt 82.136.232.204.list.dnswl.org
> ; (1 server found)
> ;; global options:  printcmd
> ;; Got answer:
> ;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 24969
> ;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
>
> ;; QUESTION SECTION:
> ;82.136.232.204.list.dnswl.org. IN TXT
>
> ;; Query time: 1308 msec
> ;; SERVER: 8.8.8.8#53(8.8.8.8)
> ;; WHEN: Sat Aug  6 02:09:46 2011
> ;; MSG SIZE  rcvd: 47
>
>
> Note the SERVFAIL. My guess is that Deadwood is not handling that properly. 4.2.2.2 is working fine, so perhaps there is some sort of problem with 8.8.8.8 right now. HOWEVER, I suspect that the SERVFAIL is what is causing trouble for Deadwood.
>
> Not sure where I got this, but the init file I am using is as follows:
>
> #!/bin/bash
> # MaraDNS       This shell script takes care of starting and stopping MaraDNS
> # chkconfig: - 55 45
> # description: MaraDNS is a secure Domain Name Server (DNS)
> # probe: true
>
> # Copyright 2005-2006 Sam Trenholme
>
> # TERMS
>
> # Redistribution and use in source and binary forms, with or without
> # modification, are permitted provided that the following conditions
> # are met:
>
> # 1. Redistributions of source code must retain the above copyright
> #    notice, this list of conditions and the following disclaimer.
> # 2. Redistributions in binary form must reproduce the above copyright
> #    notice, this list of conditions and the following disclaimer in the
> #    documentation and/or other materials provided with the distribution.
>
> # This software is provided 'as is' with no guarantees of correctness or
> # fitness for purpose.
>
> # This is a script which stops and starts the MaraDNS process
> # The first line points to bash because I don't have a true Solaris /bin/sh
> # to test this against.
>
> # The following is a pointer to the MaraDNS program
> if [ -x "/usr/sbin/Deadwood" ] ; then
>     MARADNS="/usr/sbin/Deadwood"
> elif [ -x "/usr/sbin/Deadwood.authonly" ] ; then
>     MARADNS="/usr/sbin/Deadwood.authonly"
> elif [ -x "/usr/local/sbin/Deadwood" ] ; then
>     MARADNS="/usr/local/sbin/Deadwood"
> elif [ -x "/usr/local/sbin/Deadwood.authonly" ] ; then
>     MARADNS="/usr/local/sbin/Deadwood.authonly"
> else
>     echo unable to find Deadwood
>     exit 1
> fi
>
> # The following is a pointer to the duende daemonizer
> if [ -x "/usr/sbin/duende" ] ; then
>     DUENDE="/usr/sbin/duende"
> elif [ -x "/usr/local/sbin/duende" ] ; then
>     DUENDE="/usr/local/sbin/duende"
> elif [ -x "/usr/local/bin/duende" ] ; then
>     DUENDE="/usr/local/bin/duende"
> elif [ -x "/usr/bin/duende" ] ; then
>     DUENDE="/usr/bin/duende"
> else
>     echo unable to find duende
>     exit 1
> fi
>
> # The following is the directory we place MaraDNS log entries in
> LOGDIR="/var/log"
>
> # The following is a list of all dwood3rc files which we will load or
> # unload;
> # Simple case: Only one MaraDNS process, using the /etc/dwood3rc file
> MARARCS="/etc/dwood3rc"
> # Case two: Three MaraDNS processes, one using /etc/dwood3rc.1, the second one
> # using /etc/dwood3rc.2, and the third one using /etc/dwood3rc.3
> # (this is not as essential as it was in the 1.0 days; MaraDNS can now bind
> #  to multiple IPs)
> #MARARCS="/etc/dwood3rc.1 /etc/dwood3rc.2 /etc/dwood3rc.3"
>
> # Show usage information if this script is invoked with no arguments
> if [ $# -lt 1 ] ; then
>     echo Usage: $0 \(start\|stop\|restart\)
>     exit 1
> fi
>
> # If invoked as stop or restart, kill *all* MaraDNS processes
> if [ $1 = "stop" -o $1 = "restart" ] ; then
>     echo Sending all MaraDNS processes the TERM signal
>     ps -ef | awk '{print $2":"$8}' | grep Deadwood | grep -v $$ | \
>       cut -f1 -d: | xargs kill > /dev/null 2>&1
>     echo waiting 1 second
>     sleep 1
>     echo Sending all MaraDNS processes the KILL signal
>     ps -e | awk '{print $1":"$NF}' | grep Deadwood | grep -v $$ | \
>       cut -f1 -d: | xargs kill -9 > /dev/null 2>&1
>     echo MaraDNS should have been stopped
>     if [ $1 = "stop" ] ; then
>         exit 0
>     fi
> fi
>
> # If invoked as start or restart, start the MaraDNS processes
> if [ $1 = "start" -o $1 = "restart" ] ; then
>     echo Starting all Deadwood processes
>     for a in $MARARCS ; do
>         echo Starting Deadwood process which uses Mararc file $a
>         # Duende syslogs MaraDNS' output messages and daemonizes MaraDNS
>         $DUENDE $MARADNS -f $a
>     done
>     exit 0
> fi
>
>
> Steve
>

