\include{parameters}
\usetheme{AFNIC}
\usepackage[english]{babel}
\usepackage[latin1]{inputenc}
\usepackage{bortzmeyer-utils}

\title{A versatile platform for DNS metrics with its application to IPv6}
\author{Stéphane Bortzmeyer\\AFNIC\\\texttt{bortzmeyer@nic.fr}}
\date{RIPE 57 - Dubai - October 2008}

%\setlength{\parskip}{1ex plus 0.5ex minus 0.2ex} 
% \setlength{\parskip}{15pt} 
\setlength{\parskip}{15pt plus 10pt minus 10pt} 

\AtBeginSection[]
{
   \begin{frame}
       \frametitle{Where are we in the talk?}
       \begin{block}{}
       {\tableofcontents[currentsection]}
       \end{block}
   \end{frame}
}

\begin{document}

\maketitle

\begin{frame}
  \titlepage
\end{frame}

\section{General presentation}

\begin{frame}
\frametitle{What is AFNIC}
AFNIC is the registry
for the TLD
\dns{.fr} (France).

51 employees, 1.2 million domain names and a quite recent R\&D department.
\end{frame}

\begin{frame}
\frametitle{Motivation}
A DNS registry has a lot of information it does not use. 

Our marketing and technical
teams ask for all sorts of things (``How
many of our domains are used for e-mail only?'') for which
we \emph{may} have the answer.
\end{frame}

\begin{frame}
\frametitle{More specific motivation}
\begin{block}{Getting information about the deployment of new
techniques like IPv6}{We focus on things that we can obtain from the
DNS because we are a domain name registry.}\end{block}
\only<2->{Possible surveys: IPv6, SPF, DNSSEC, EDNS0, Zonecheck\ldots
Let's build a multi-purpose platform for that!}
\end{frame}

\begin{frame}
\frametitle{Other aims}
\begin{enumerate}
\item \emph{Versatile}, able to run many different surveys (most known tools
handle only one survey),
\item Works unattended (from cron, for instance), for periodic runs,
\item Stores raw results, not just aggregates, for long-term analysis,
\item Designed to be distributable.
\end{enumerate}
\end{frame}

\begin{frame}
\frametitle{What we can learn from the DNS (and beyond)}
\begin{itemize}
\item<1->What we send \emph{out}: active DNS queries sent to domain
name servers.
\item<2->What comes \emph{in}: DNS queries received by authoritative name
servers, passively monitored (``Who knocks at the door and what are
they asking for?'').
\end{itemize}
\only<3->{We will work on both, study the long-term evolution and
publish results.}
\end{frame}

\section{Measurements based on passive observations}

\begin{frame}
\frametitle{Passive observation of queries}
[Warning, not yet started.]

It will work by passive monitoring of the \dns{.fr} name servers. We
are talking about long-term monitoring, not just the quick glance that
DSC offers.

The idea is to address the needs of R\&D and marketing, not
just the needs of the NOC.

\only<2->{It will work mostly by port mirroring.}
\end{frame}

\begin{frame}
\frametitle{Expected uses of the passive measurements}
It will allow us to survey things like:
\begin{itemize}
\item<2-> Percentage of servers without SPR (Source Port Randomisation,
see \dns{.at} publications).
\item<3-> Percentage of requests made over IPv6 transport (unlike DSC, we will
be able to study long-term trends).
\item<4-> Percentage of requests with EDNS0 or the DO (DNSSEC OK) bit.
\item<5-> Top N domains for which there is an NXDOMAIN reply.
\item<6-> But the list is open\ldots
\end{itemize}
\end{frame}

\section{Measurements based on active queries}

\begin{frame}
\frametitle{Active queries}
This is my main subject.

\only<2->{This is the realm of our \emph{DNSwitness} program.}

\only<3->{Announced here for the first time.}
\end{frame}

\begin{frame}
\frametitle{Related work}
\begin{itemize}
\item Patrick Maigron's measurements on IPv6 penetration \url{http://www-public.it-sudparis.eu/~maigron/}
\item JPRS, the \dns{.jp} registry, has long made detailed measurements of
IPv6 use (not yet published, see \url{http://v6metric.inetcore.com/en/index.html})
\item The \dns{iis.se} ``engine'', part of their dnscheck tools, allows scanning the
entire zone to test that every subdomain is properly configured \url{http://opensource.iis.se/trac/dnscheck/wiki/Engine}
\item And many others
\end{itemize}
\end{frame}

\begin{frame}
\frametitle{How it works}
DNSwitness mostly works by
asking the DNS. It loads a list of delegated zones and
queries them for various records.

\only<2->{But it can also perform other queries: HTTP and SMTP tests,
running Zonecheck\ldots}
\end{frame}


\begin{frame}[fragile]
\frametitle{The first algorithm}
Crude version of DNSwitness (everyone at a TLD registry has written such a script at least
once). Here, to test SPF records:
\begin{info}
for domain in $(cat $DOMAINS); do
    echo $domain
    dig +short TXT $domain | grep "v=spf1"
done
\end{info}
\only<2->{Problems: does not scale, a few broken domains can slow it down
terribly, unstructured output, difficult to extend to more complex surveys.}
\end{frame}

\begin{frame}
\frametitle{The architecture}
DNSwitness is composed of a generic core, which handles:
\begin{itemize}
\item zone file
parsing,
\item and parallel querying of the zones,
\end{itemize}
and of a module, which
performs the actual queries.
\end{frame}

\begin{frame}
\frametitle{Modules}
Thus, surveying the use of DNSSEC requires
a DNSSEC module (which will presumably ask for DNSKEY records).

\only<2->{Surveying IPv6 deployment requires an IPv6 module (which will, for
instance, ask for AAAA records for \computer{www.\$DOMAIN} and the like).}

\only<3->{Not all techniques are amenable to DNS active querying: for
instance, DKIM is not easy because we do not know the selectors.}
\end{frame}

\begin{frame}
\frametitle{Using it}
\begin{block}{Warning about the traffic}{DNSwitness can generate a lot
of DNS requests. You may need to warn the name server administrators. As
of today, DNSwitness uses a caching resolver, to limit the strain on
the network.}
\end{block}
\begin{block}<2->
{UUID}
{To tell the runs apart in the database, every run generates a
unique identifier (a UUID) and stores it.}
\end{block}

\end{frame}

\begin{frame}[fragile]
\frametitle{Options, arguments, \ldots}
Among the interesting options: run on only a random sample of the zone.

Complete usage instructions depend on the module. For instance:
\begin{info}
 time dnswitness --num_threads=15000  \
        --debug=1 --module Dnssec fr.db --num_tasks=20 
\end{info}
\end{frame}
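
\begin{frame}[fragile]
\frametitle{Sketch: sampling the zone}
A minimal sketch of the sampling idea, not the actual DNSwitness
code: \computer{sample\_zone} is a hypothetical helper and the domain
names are made up.
\begin{info}
import random

def sample_zone(domains, size, seed=None):
    """Return `size` distinct domains chosen at random."""
    rng = random.Random(seed)   # a seed makes a run repeatable
    return rng.sample(domains, min(size, len(domains)))

zone = ["example%d.fr" % i for i in range(1000)]
subset = sample_zone(zone, 50, seed=57)
print(len(subset), subset[:3])
\end{info}
\end{frame}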

\begin{frame}[fragile]
\frametitle{Reading the results}
Querying of the database depends on the module. Here, for DNSSEC:
\begin{info}
SELECT domain,dnskey FROM Tests WHERE uuid='f72c33a6-7c3c-44e2-b743-7e67edf98f6c';

SELECT count(domain) FROM Tests WHERE uuid='f72c33a6-7c3c-44e2-b743-7e67edf98f6c' 
                                  AND nsec;
 
\end{info}
\end{frame}
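
\begin{frame}[fragile]
\frametitle{Sketch: tagging a run with its UUID}
A sketch of how a UUID can tag the rows of a run and select them
later. In-memory SQLite stands in for PostgreSQL so that the example is
self-contained; the table layout and \computer{store\_run} are
assumptions.
\begin{info}
import sqlite3
import uuid

db = sqlite3.connect(":memory:")   # stand-in for PostgreSQL
db.execute("CREATE TABLE Tests (uuid TEXT, domain TEXT,"
           " dnskey INTEGER, nsec INTEGER)")

def store_run(results):
    """Store one run; return the UUID that identifies it."""
    run_id = str(uuid.uuid4())
    db.executemany("INSERT INTO Tests VALUES (?, ?, ?, ?)",
                   [(run_id, d, k, n) for d, k, n in results])
    return run_id

first = store_run([("a.example", 1, 1), ("b.example", 1, 0)])
second = store_run([("a.example", 0, 0)])
signed = db.execute("SELECT count(domain) FROM Tests"
                    " WHERE uuid = ? AND nsec", (first,)).fetchone()[0]
print(first, signed)
\end{info}
\end{frame}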

\begin{frame}
\frametitle{Implementation}
\begin{itemize}
\item Written in Python,
\item The generic core and the querying module are separated,
\item Most modules store the results in a PostgreSQL database (we
provide a helper library for that),
\item Uses the DNS library dnspython from Nominum.
\end{itemize}
Everything works fine on small zones. 

Larger zones may put a serious
strain on the machine and on some virtual resources (lack of file
descriptors, hardwired limits of \computer{select()} on Linux\ldots).
\end{frame}

\begin{frame}
\frametitle{Parallelism}
To avoid being stopped by a broken domain, DNSwitness
is \emph{parallel}.

N threads are run to perform the queries.

For \dns{.fr} (1.2 million domains), the optimal number of threads is around 15,000. The results
are obtained in a few hours.
\end{frame}
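
\begin{frame}[fragile]
\frametitle{Sketch: a pool of worker threads}
A minimal sketch of the idea, assuming a standard thread pool; it is
not the DNSwitness code. \computer{fake\_query} is a hypothetical
stand-in for a module's DNS query.
\begin{info}
import time
from concurrent.futures import ThreadPoolExecutor

def fake_query(domain):
    """Hypothetical stand-in for a module's DNS query."""
    time.sleep(0.5 if domain == "broken.example" else 0.01)
    return domain, len(domain)

def run(domains, num_threads):
    # Each domain is handled by one of num_threads workers, so a
    # slow domain blocks a single worker, not the whole run.
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        return dict(pool.map(fake_query, domains))

names = ["broken.example"] + ["d%d.example" % i for i in range(40)]
start = time.time()
out = run(names, num_threads=8)
print(len(out), "results in %.2f s" % (time.time() - start))
\end{info}
\end{frame}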

\begin{frame}
\frametitle{Developing a module}
Several modules are shipped with DNSwitness.

Should you want to develop one, you will mostly need to write:
\begin{enumerate}
\item A class Result, with a method to store the result,
\item A class Plugin, with a method for the queries.
\end{enumerate}

A Utils package is provided to help the module authors.

\end{frame}

\begin{frame}[fragile]
\frametitle{The example module}
\begin{info}
""" DNSwitness *dummy* module to illustrate what needs to be put in a
module. This module mostly prints things, that's all. """

# Base modules referenced below (assumed importable as-is)
import BaseResult
import BasePlugin

class DummyResult(BaseResult.Result):

    def store(self, uuid):
        # Called to save the result for one domain in the run `uuid`
        print("Dummy storage of data for %s" % self.domain)

class Plugin(BasePlugin.Plugin):

    def query(self, zone, nameservers):
        result = DummyResult()
        result.universe = 42 # Here would go the DNS query
        return result

\end{info}
\end{frame}

\section{Preliminary Results}

\begin{frame}
\frametitle{Actual results}
The data presented here were retrieved from \dns{.fr} zones (17
October 2008).

No long-term studies yet: the program is too recent.

The resolver used was Unbound; the machine was a dual-Opteron PC
running Debian/Linux.
\end{frame}

\begin{frame}
\frametitle{DNSSEC in ``\texttt{.fr}''}

Four hours for the run.

49 domains have a key.

But only 37 are actually signed (the others perhaps because of an error, such as
serving the unsigned version of the zone file).

Side note: \dns{.fr} is not signed; one domain in \dns{.fr} is in the
ISC DLV.
\end{frame}

\begin{frame}
\frametitle{SPF in .FR}
[RFC 4408]

188108 domains have SPF (15 \%). 

But there are only 4350 different records:

\begin{itemize}
\item Popular records like \computer{v=spf1 a mx ?all}
\item One big hoster added SPF for all its domains\ldots
\end{itemize}

\end{frame}
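
\begin{frame}[fragile]
\frametitle{Sketch: counting different SPF records}
A sketch of how different records can be counted once the TXT values
have been collected. The sample data is invented; only the record
\computer{v=spf1 a mx ?all} comes from the survey.
\begin{info}
from collections import Counter

# Invented sample: domain -> SPF record found (None if absent)
found = {
    "a.example": "v=spf1 a mx ?all",
    "b.example": "v=spf1 a mx ?all",
    "c.example": "v=spf1 mx -all",
    "d.example": None,
}

records = Counter(r for r in found.values() if r is not None)
with_spf = sum(records.values())
print("%d domains with SPF, %d different records"
      % (with_spf, len(records)))
for record, count in records.most_common(1):
    print(count, record)
\end{info}
\end{frame}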

\begin{frame}
\frametitle{IPv6 in .FR}
We measure several things:

\begin{itemize}
\item Presence of AAAA records for NS and MX
\item Presence of AAAA records
for \computer{\$DOMAIN}, \computer{www.\$DOMAIN}, \ldots
\item Whether the machines reply to HTTP or SMTP connections.
\end{itemize}

\end{frame}

\begin{frame}
\frametitle{IPv6, DNS only}

When testing just the DNS, the DNSwitness module runs for four hours
and gives:

51355 (4 \%) domains have at least one AAAA (Web, mail, DNS\ldots)

410 (0.03 \%) have an AAAA for all three of the above services.

Among the hosts, 435 different addresses. 24 are 6to4 and 8 are local
(a lot of \computer{::1}\ldots). % And 4 are IPv4-mapped.

\end{frame}
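
\begin{frame}[fragile]
\frametitle{Sketch: classifying IPv6 addresses}
A sketch of how such addresses can be classified with Python's standard
\computer{ipaddress} module. \computer{classify} is a hypothetical
helper and the sample addresses are illustrative, not survey data.
\begin{info}
import ipaddress

def classify(text):
    """Return a rough category for an IPv6 address."""
    addr = ipaddress.IPv6Address(text)
    if addr.is_loopback or addr.is_link_local:
        return "local"
    if addr.ipv4_mapped is not None:
        return "IPv4-mapped"
    if addr.sixtofour is not None:   # inside 2002::/16
        return "6to4"
    return "other"

for a in ["::1", "2002:c000:204::1", "::ffff:192.0.2.1",
          "2001:db8::1"]:
    print(a, classify(a))
\end{info}
\end{frame}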

\begin{frame}
\frametitle{IPv6, with HTTP and SMTP tests}

78630 IP addresses, 67687 (86 \%) being HTTP. (For different
addresses, HTTP and SMTP are 50/50.)

Among the 78630 addresses, 73122 (92 \%) work (HTTP reply, even 404
or 500).

Warning: spurious addresses like \computer{::1} are not yet excluded.

For the different addresses, only 292 (out of 431, 67 \%) work.
\end{frame}

\begin{frame}
\frametitle{Wildcards?}
227190 domains (18 \%) have wildcards for at least one record type.
\end{frame}
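
\begin{frame}[fragile]
\frametitle{Sketch: one way to detect a wildcard}
The slides do not say how wildcards are detected. One common approach,
shown here as an assumption, is to look up a random label that should
not exist. \computer{lookup} is a hypothetical callable returning the
records found for a name.
\begin{info}
import uuid

def has_wildcard(domain, lookup):
    """Guess whether `domain` has a wildcard record.

    `lookup` is a hypothetical callable: name -> list of records.
    """
    probe = "%s.%s" % (uuid.uuid4().hex, domain)  # unlikely to exist
    return len(lookup(probe)) > 0

# Fake resolver for the example: wild.example answers for any name
def fake_lookup(name):
    return ["192.0.2.1"] if name.endswith(".wild.example") else []

print(has_wildcard("wild.example", fake_lookup))
print(has_wildcard("plain.example", fake_lookup))
\end{info}
\end{frame}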

\begin{frame}
\frametitle{Distribution}
\url{http://www.dnswitness.net/}

Distributed under the free software licence GPL.

\end{frame}

\section{Future work}

\begin{frame}
\frametitle{Future work on DNSwitness}
\begin{itemize}
\item Querying the authoritative name servers directly, instead of going
through a resolver.
\item New modules, for instance to detect ``email-only'' or
``web-only'' domains. Or a module for Zonecheck ``patrols''.
\end{itemize}
\end{frame}

\begin{frame}
\frametitle{Future work on the rest of the project}
\begin{itemize}
\item<1->Gather more users. Yes, you :-)
\item<2->Come back in one year with trends.
\item<3-> Start developing the ``DNS passive monitor''. Thanks to
the authors of dnscap and similar programs.
\end{itemize}
\end{frame}

\end{document}

