From 63b84acdf5bd7971a7da3137a6aa609c71205625 Mon Sep 17 00:00:00 2001
From: Hans Dedecker <dedeckeh@gmail.com>
Date: Tue, 27 Jun 2017 22:08:47 +0100
Subject: [PATCH] Try other servers if first returns REFUSED when
 --strict-order active.

If a DNS server replies REFUSED to a query in strict-order mode, no failover
to the next DNS server is triggered, because the failover logic only covers
non-strict mode.
As a result, the client receives the REFUSED reply without dnsmasq first
falling back to the secondary DNS server(s).

Make failover work in strict-order mode as well when REFUSED is returned:
delete the strict-order check and rely only on forwardall being equal to 0,
which is the case in non-strict mode when a single server has been contacted,
or when strict-order mode has been configured.
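
The effect on the retry condition can be sketched in isolation.  This is a
simplified model, not the dnsmasq source: the function names and the plain
parameters standing in for option_bool(OPT_ORDER), forward->forwardall and
the FREC_HAS_EXTRADATA flag are assumptions made for illustration.

```c
#include <stdbool.h>

#define REFUSED 5 /* RCODE value 5 in RFC 1035 */

/* Old gate: a REFUSED reply never triggers a retry in strict-order mode. */
static bool retry_refused_old(int rcode, bool strict_order,
                              int forwardall, bool has_extradata)
{
  return rcode == REFUSED && !strict_order &&
         forwardall == 0 && !has_extradata;
}

/* New gate: the strict-order test is gone, only forwardall == 0 matters. */
static bool retry_refused_new(int rcode, int forwardall, bool has_extradata)
{
  return rcode == REFUSED && forwardall == 0 && !has_extradata;
}
```

With strict order enabled and forwardall equal to 0, the old gate rejects a
retry while the new one allows it; in every other case the two agree.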

(cherry picked from commit 9396752c115b3ab733fa476b30da73237e12e7ba)

Stop treating SERVFAIL as a successful response from upstream servers.

This effectively reverts most of 51967f9807 ("SERVFAIL is an expected
error return, don't try all servers.") and 4ace25c5d6 ("Treat REFUSED (not
SERVFAIL) as an unsuccessful upstream response").

With the current behaviour, as soon as dnsmasq receives a SERVFAIL from an
upstream server, it stops trying to resolve the query and simply returns
SERVFAIL to the client.  With this commit, dnsmasq will instead try to
query other upstream servers upon receiving a SERVFAIL response.
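
The condition changed by the second hunk of this patch decides whether a
reply is processed or dropped while replies from other servers are still
awaited.  A minimal sketch of that decision follows, assuming (as the text
above implies) that forwardall is nonzero only when the query went to
several servers.  The function name accept_reply and its signature are
hypothetical and not part of dnsmasq.

```c
#include <stdbool.h>

#define SERVFAIL 2 /* RCODE value 2 in RFC 1035 */
#define REFUSED  5 /* RCODE value 5 in RFC 1035 */

/* Returns true when the reply should be processed, false when it should be
   dropped in the hope of a better reply from another server.
   *forwardall is decremented whenever it is nonzero, mirroring the
   "--forward->forwardall" side effect of the patched condition. */
static bool accept_reply(int rcode, int *forwardall)
{
  if (*forwardall == 0)
    return true; /* nothing else outstanding: take what we got */
  if (--*forwardall == 1)
    return true; /* last outstanding reply: take it even if it is an error */
  return rcode != REFUSED && rcode != SERVFAIL;
}
```

Before this patch the last line only tested for REFUSED, so a SERVFAIL was
accepted as soon as it arrived.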

According to RFC 1034 and 1035, the semantics of SERVFAIL are those of a
temporary error condition.  Recursive resolvers are expected to encounter
network or resource issues from time to time, and will respond with
SERVFAIL in this case.  Similarly, if a validating DNSSEC resolver [RFC
4033] encounters issues when checking signatures (unknown signing
algorithm, missing signatures, expired signatures because of a wrong
system clock, etc.), it will respond with SERVFAIL.

Note that all those behaviours are entirely different from a negative
response, which would provide a definite indication that the requested
name does not exist.  In our case, if an upstream server responds with
SERVFAIL, another upstream server may well provide a positive answer for
the same query.

Thus, this commit will increase robustness whenever some upstream servers
encounter temporary issues or are misconfigured.

Quoting RFC 1034, Section 4.3.1. "Queries and responses":

    If recursive service is requested and available, the recursive response
    to a query will be one of the following:

       - The answer to the query, possibly preface by one or more CNAME
         RRs that specify aliases encountered on the way to an answer.

       - A name error indicating that the name does not exist.  This
         may include CNAME RRs that indicate that the original query
         name was an alias for a name which does not exist.

       - A temporary error indication.

Here is Section 5.2.3. of RFC 1034, "Temporary failures":

    In a less than perfect world, all resolvers will occasionally be unable
    to resolve a particular request.  This condition can be caused by a
    resolver which becomes separated from the rest of the network due to a
    link failure or gateway problem, or less often by coincident failure or
    unavailability of all servers for a particular domain.

And finally, RFC 1035 specifies RCODE 2 for this usage, which is now more
widely known as SERVFAIL (RFC 1035, Section 4.1.1. "Header section format"):

    RCODE           Response code - this 4 bit field is set as part of
                    responses.  The values have the following
                    interpretation:
                    (...)

                    2               Server failure - The name server was
                                    unable to process this query due to a
                                    problem with the name server.
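
For reference, the 4-bit RCODE field quoted above occupies the low nibble
of the fourth byte of the 12-byte DNS header.  The helper below is a
minimal sketch of reading it from a raw message; the name dns_rcode is
hypothetical (dnsmasq itself uses its RCODE(header) macro, as seen in the
diff).

```c
#include <stddef.h>
#include <stdint.h>

#define SERVFAIL 2 /* RCODE value 2 in RFC 1035 */

/* Returns the RCODE of a raw DNS message, or -1 if the buffer is shorter
   than the fixed 12-byte header. */
static int dns_rcode(const uint8_t *msg, size_t len)
{
  if (len < 12)
    return -1;
  /* Byte 3 holds RA (1 bit), Z (3 bits) and RCODE (4 bits). */
  return msg[3] & 0x0F;
}
```

A response whose fourth header byte is 0x82 (RA set, RCODE 2) is thus
reported as SERVFAIL.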

For the DNSSEC-related usage of SERVFAIL, here is RFC 4033
Section 5. "Scope of the DNSSEC Document Set and Last Hop Issues":

    A validating resolver can determine the following 4 states:
    (...)

    Insecure: The validating resolver has a trust anchor, a chain of
       trust, and, at some delegation point, signed proof of the
       non-existence of a DS record.  This indicates that subsequent
       branches in the tree are provably insecure.  A validating resolver
       may have a local policy to mark parts of the domain space as
       insecure.

    Bogus: The validating resolver has a trust anchor and a secure
       delegation indicating that subsidiary data is signed, but the
       response fails to validate for some reason: missing signatures,
       expired signatures, signatures with unsupported algorithms, data
       missing that the relevant NSEC RR says should be present, and so
       forth.
    (...)

    This specification only defines how security-aware name servers can
    signal non-validating stub resolvers that data was found to be bogus
    (using RCODE=2, "Server Failure"; see [RFC4035]).

Notice the difference between a definite negative answer ("Insecure"
state), and an indefinite error condition ("Bogus" state).  The second
type of error may be specific to a recursive resolver, for instance
because its system clock has been incorrectly set, or because it does not
implement newer cryptographic primitives.  Another recursive resolver may
succeed for the same query.

Other specifications prescribe behaviour similar to the one implemented
by this commit.

For instance, RFC 2136 specifies the behaviour of a "requestor" that wants
to update a zone using the DNS UPDATE mechanism.  The requestor tries to
contact all authoritative name servers for the zone, with the following
behaviour specified in RFC 2136, Section 4:

    4.6. If a response is received whose RCODE is SERVFAIL or NOTIMP, or
    if no response is received within an implementation dependent timeout
    period, or if an ICMP error is received indicating that the server's
    port is unreachable, then the requestor will delete the unusable
    server from its internal name server list and try the next one,
    repeating until the name server list is empty.  If the requestor runs
    out of servers to try, an appropriate error will be returned to the
    requestor's caller.
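
The loop described in that section can be sketched as follows.  This is an
illustration of the RFC text only, not code from dnsmasq or from any
DNS UPDATE client: query_fn, try_servers and the demo_* names are
hypothetical, and the sketch skips unusable servers instead of deleting
them from a list.

```c
#include <stddef.h>

#define NOERROR  0 /* RCODE values from RFC 1035 */
#define SERVFAIL 2
#define NOTIMP   4

/* Hypothetical callback: sends the request to server idx and returns the
   RCODE of its response, or -1 for a timeout or an ICMP port unreachable. */
typedef int (*query_fn)(size_t idx);

/* Tries each server in order.  Returns the index of the first server that
   gave a usable response, or -1 once the list is exhausted. */
static int try_servers(size_t nservers, query_fn query)
{
  for (size_t i = 0; i < nservers; i++)
    {
      int rcode = query(i);
      if (rcode == -1 || rcode == SERVFAIL || rcode == NOTIMP)
        continue; /* unusable server: move on to the next one */
      return (int)i;
    }
  return -1;
}

/* Canned replies for a quick check: SERVFAIL, then a timeout, then success. */
static const int demo_rcodes[] = { SERVFAIL, -1, NOERROR };

static int demo_query(size_t idx)
{
  return demo_rcodes[idx];
}
```

Calling try_servers(3, demo_query) selects the third server, while limiting
the list to the first two servers makes the call report failure.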

(cherry picked from commit 68f6312d4bae30b78daafcd6f51dc441b8685b1e)
---
 src/forward.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/forward.c b/src/forward.c
index 245c448..1bbb264 100644
--- a/src/forward.c
+++ b/src/forward.c
@@ -794,7 +794,6 @@ void reply_query(int fd, int family, time_t now)
   /* Note: if we send extra options in the EDNS0 header, we can't recreate
      the query from the reply. */
   if (RCODE(header) == REFUSED &&
-      !option_bool(OPT_ORDER) &&
       forward->forwardall == 0 &&
       !(forward->flags & FREC_HAS_EXTRADATA))
     /* for broken servers, attempt to send to another one. */
@@ -859,7 +858,8 @@ void reply_query(int fd, int family, time_t now)
      we get a good reply from another server. Kill it when we've
      had replies from all to avoid filling the forwarding table when
      everything is broken */
-  if (forward->forwardall == 0 || --forward->forwardall == 1 || RCODE(header) != REFUSED)
+  if (forward->forwardall == 0 || --forward->forwardall == 1 ||
+      (RCODE(header) != REFUSED && RCODE(header) != SERVFAIL))
     {
       int check_rebind = 0, no_cache_dnssec = 0, cache_secure = 0, bogusanswer = 0;
       
-- 
2.21.1