***************************
*** DNS TUTORIAL ***
***************************




By Frank Dekervel (kervel@hotmail.com)
Belgium

CONTENTS

1 : about this document

2 : the domain name system

3 : DNS questions and answers
* DNS packets explained
* file format for DNS

ABOUT THIS DOCUMENT
-------------------
Because the DNS is quite difficult to understand, and there are no really
good tutorials available, I will try to write my own. I hope it will be easy
to understand. In later versions of this document, I'll include code for
Borland Delphi or C++.
This tutorial is *NOT* the only thing you need to learn about the DNS
protocol.

If you want to master the DNS, I suggest the following things:
* download rfc 1034 for background information and the model.
* download rfc 1035 (Domain names : implementation and specification)
* http://users.neca.com/vmis/
download all the tutorials about the tcp/ip suite, including winsock,
IP, DNS, PPP, ... and many more.
* read this tutorial

This document may be distributed, but I kindly ask you to report
distributions or changes to me.
Since I'm not a native speaker, and since I'm not an expert, there are errors
in this file. Please report them to me at kervel@hotmail.com

Oh, yeah, this document was written in windows notepad, and has a lousy
layout. Feel free to format this doc and send it back to me :)

THE DOMAIN NAME SYSTEM
----------------------
To continue with this tutorial you will need some background information.
A brief history of domain names can be found in rfc1034 or rfc1035; read
rfc1035 to learn what DNS, recursion, name servers and resolvers are.

DNS mainly uses UDP (port 53), but a DNS server should also accept TCP
(also port 53). The advantage of UDP is speed. TCP is needed when a message
does not fit in one UDP packet (rfc1035 limits DNS over UDP to 512 bytes)
and for zone transfers.


DNS : QUESTIONS AND ANSWERS
---------------------------
To demonstrate a DNS request, I'll start from this packet.
Below, you see a DNS request for space.vadium.sk, captured on an ethernet
device. Don't mind the ethernet, IP and UDP headers; the DNS request starts
at offset 0x2A.
The packet is not valid, because it is broadcast to 192.168.0.255


HEX-DUMP:


OFFSET 00 01 02 03 04 05 06 07-08 09 0A 0B 0C 0D 0E 0F  ASCII
------ -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --  ----------------
000    FF FF FF FF FF FF 00 00-E8 D8 35 F2 08 00 45 00  ..........5...E.
010    00 3D E1 01 00 00 20 11-37 5E C0 A8 00 01 C0 A8  .=.... .7^......
020    00 FF 04 63 00 35 00 29-99 91 51 29 01 00 00 01  ...c.5.)..Q)....
030    00 00 00 00 00 00 05 73-70 61 63 65 06 76 61 64  .......space.vad
040    69 75 6D 02 73 6B 00 00-01 00 01                 ium.sk.....

ANALYSIS:


ETHERNET PACKET ANALYSIS

Packet length                : 75 bytes (0x0000004B)
Source Ethernet address      : 00:00:E8:D8:35:F2
Destination Ethernet address : FF:FF:FF:FF:FF:FF
Packet type                  : 0x0800 (IP)

IP HEADER ANALYSIS

IP source address            : 192.168.0.1
IP dest address              : 192.168.0.255
Version + header length      : 0x45
TOS (Type of service)        : 0x00
IP packet length             : 61 bytes (0x003D)
IP fragmentation ID          : 57601 (0xE101)
IP flags byte                : 0x00
Time to live                 : 0x20
Protocol type                : 0x11 (UDP)
IP checksum                  : 0x375E

UDP HEADER ANALYSIS

UDP source port              : 1123 (0x0463)
UDP dest. port               : 53 (0x0035)
UDP length                   : 41 (0x0029)
UDP checksum                 : 0x9991
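
To check where the DNS request starts, the header lengths can be added up in
code. What follows is only a minimal sketch in Python: the bytes are typed
over from the hex dump above, and the function name dns_payload_offset is my
own, not something from an rfc or a library. It assumes a plain ethernet
frame carrying IPv4 and UDP.

```python
import struct

# The 75 bytes of the sample packet, copied from the hex dump.
PACKET = bytes.fromhex(
    "FFFFFFFFFFFF0000E8D835F208004500"
    "003DE10100002011375EC0A80001C0A8"
    "00FF0463003500299991512901000001"
    "00000000000005737061636506766164"
    "69756D02736B0000010001"
)

ETHERNET_HEADER = 14   # 6 bytes destination + 6 bytes source + 2 bytes type
UDP_HEADER = 8         # source port, dest port, length, checksum


def dns_payload_offset(frame: bytes) -> int:
    """Return the offset of the DNS data in an ethernet/IPv4/UDP frame."""
    (packet_type,) = struct.unpack_from("!H", frame, 12)
    if packet_type != 0x0800:
        raise ValueError("not an IP packet")
    # The low 4 bits of the first IP byte give the header length in
    # 32 bit words (0x45 means 5 words, so 20 bytes).
    ip_header = (frame[ETHERNET_HEADER] & 0x0F) * 4
    if frame[ETHERNET_HEADER + 9] != 0x11:
        raise ValueError("not a UDP packet")
    return ETHERNET_HEADER + ip_header + UDP_HEADER


offset = dns_payload_offset(PACKET)
src_port, dst_port, udp_len, udp_sum = struct.unpack_from(
    "!HHHH", PACKET, offset - UDP_HEADER
)
print(hex(offset), src_port, dst_port, udp_len)   # prints: 0x2a 1123 53 41
```

The UDP length of 41 includes the 8 byte UDP header, so the DNS request
itself is 33 bytes long.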

* a few words about the string format used in DNS

Look at offset 0x36; there you see a string: 05 space 06 vadium 02 sk 00
Like in C, strings are terminated with a 00, but each "." (which, in domain
names, separates domains from subdomains) is replaced by a byte holding the
length of the next piece. The first piece gets such a length byte too. So,
to build the right string for www.altavista.digital.com, you should do the
following:
- www : 3 letters
- altavista : 9 letters
- digital : 7 letters
- com : 3 letters
The string is 0x03 + "www" + 0x09 + "altavista" + 0x07 + "digital" + 0x03
+ "com" + 0x00
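
The steps above can be sketched in a few lines of Python. This is only an
illustration, not a complete implementation: the function name encode_name
is my own, and it assumes plain ASCII names. The limit of 63 bytes per piece
comes from rfc1035.

```python
def encode_name(name: str) -> bytes:
    """Turn 'www.example.com' into a length-prefixed DNS string."""
    if name in ("", "."):
        return b"\x00"                     # the root: only the terminator
    out = bytearray()
    for piece in name.rstrip(".").split("."):
        raw = piece.encode("ascii")
        if not 1 <= len(raw) <= 63:
            raise ValueError("each piece must be 1 to 63 bytes long")
        out.append(len(raw))               # the length byte replaces the "."
        out += raw
    out.append(0)                          # 0x00 terminates the string
    return bytes(out)


print(encode_name("www.altavista.digital.com"))
print(encode_name("space.vadium.sk").hex(" "))
```

The second line printed should match the bytes at offset 0x36 of the sample
packet.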

* DNS compression

To reduce the size of a DNS packet, a compression scheme exists. So, if we
need the string 'www.altavista.digital.com' two times in our packet, we only
write it once, and we change the second, third, ... occurrence into a
pointer. A (16 bit) pointer looks like this:
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| 1  1|                OFFSET                   |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+

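A length byte never has its two highest bits set (a piece of a name is at
most 63 bytes long), so if the first byte is 0xC0 or higher, it is a pointer.
The OFFSET is counted from the first byte of the DNS packet (the ID field).
If you make a resolver or a server, make sure your application recognizes
such pointers.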

You can represent a domain name in three ways:

* a DNS string, terminated with 0x00 (like above)
* a pointer
* a mixed form : first a DNS string, without 0x00 termination, but terminated
by a pointer.
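
The three forms can be read back with one loop. The Python below is a sketch
under assumptions: the name decode_name and the little test message are made
up for this example (they are not taken from a real capture), and the names
are assumed to be ASCII. The hop counter is only there so that a packet with
pointers going in a circle cannot hang the program.

```python
def decode_name(message: bytes, offset: int):
    """Return (name, offset of the first byte after the name)."""
    pieces = []
    end = None                 # where reading continues after the name
    hops = 0
    while True:
        length = message[offset]
        if (length & 0xC0) == 0xC0:
            # Pointer: the low 14 bits are an offset into the message.
            target = ((length & 0x3F) << 8) | message[offset + 1]
            if end is None:
                end = offset + 2
            hops += 1
            if hops > len(message):
                raise ValueError("pointer loop")
            offset = target
            continue
        if length & 0xC0:
            raise ValueError("unknown length byte")
        offset += 1
        if length == 0:        # 0x00 terminator
            break
        pieces.append(message[offset:offset + length].decode("ascii"))
        offset += length
    return ".".join(pieces), (offset if end is None else end)


# Made-up message: 12 header bytes, "vadium.sk" as a plain DNS string at
# offset 12, then "space" followed by a pointer to offset 12 (mixed form).
MESSAGE = bytes(12) + b"\x06vadium\x02sk\x00" + b"\x05space\xC0\x0C"

print(decode_name(MESSAGE, 12))   # prints: ('vadium.sk', 23)
print(decode_name(MESSAGE, 23))   # prints: ('space.vadium.sk', 31)
```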

* the DNS packet

A DNS packet consists of this : (the following things are taken from rfc1035)
+---------------------+
|        Header       |
+---------------------+
|       Question      | the question for the name server
+---------------------+
|        Answer       | RRs answering the question
+---------------------+
|      Authority      | RRs pointing toward an authority
+---------------------+
|      Additional     | RRs holding additional information
+---------------------+

* the DNS header


                                1  1  1  1  1  1
  0  1  2  3  4  5  6  7  8  9  0  1  2  3  4  5
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                      ID                       |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|QR|   Opcode  |AA|TC|RD|RA|   Z    |   RCODE   |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                    QDCOUNT                    |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                    ANCOUNT                    |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                    NSCOUNT                    |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                    ARCOUNT                    |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+

where:

ID : a number to identify the request. The answer will have the same ID.
This is useful for multiple simultaneous requests : you'll know which
answer belongs to which request.

The next 16 bits are the flags:

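FIELD   SIZE    SAMPLE  MEANING
QR      1 bit   0       0=request, 1=answer
OPCODE  4 bits  0       0=standard request, 1=inverse request, 2=status
AA      1 bit   0       1=the answer comes from an authoritative server
TC      1 bit   0       1=truncated
RD      1 bit   1       1=recursion desired
RA      1 bit   0       1=recursion available
Z       3 bits  0       always zero
RCODE   4 bits  0       0=no error, other values=an error occurred

The SAMPLE column shows the values in the request packet above (the flags
there are 0x0100). For the values of RCODE, see rfc1035.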

After that, we have the number of requests (QDCOUNT), the number of answers
(ANCOUNT), the number of authority records (NSCOUNT) and the number of
additional records (ARCOUNT).
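
As an illustration, the 12 header bytes of the sample request (offset 0x2A to
0x35 in the hex dump) can be taken apart like this. It is a sketch in Python;
the names Header and parse_header are my own choice.

```python
import struct
from typing import NamedTuple


class Header(NamedTuple):
    id: int
    qr: int
    opcode: int
    aa: int
    tc: int
    rd: int
    ra: int
    z: int
    rcode: int
    qdcount: int
    ancount: int
    nscount: int
    arcount: int


def parse_header(message: bytes) -> Header:
    """Split the first 12 bytes of a DNS message into its fields."""
    # Six 16 bit numbers in network byte order (big endian).
    ident, flags, qd, an, ns, ar = struct.unpack_from("!6H", message, 0)
    return Header(
        id=ident,
        qr=(flags >> 15) & 0x1,       # bit 0 in the diagram is the top bit
        opcode=(flags >> 11) & 0xF,
        aa=(flags >> 10) & 0x1,
        tc=(flags >> 9) & 0x1,
        rd=(flags >> 8) & 0x1,
        ra=(flags >> 7) & 0x1,
        z=(flags >> 4) & 0x7,
        rcode=flags & 0xF,
        qdcount=qd,
        ancount=an,
        nscount=ns,
        arcount=ar,
    )


# Header bytes of the sample request.
SAMPLE_HEADER = bytes.fromhex("512901000001000000000000")
header = parse_header(SAMPLE_HEADER)
print(hex(header.id), header.qr, header.rd, header.qdcount)
# prints: 0x5129 0 1 1
```

So the sample is a request (QR=0) with recursion desired (RD=1) and one
question.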

* the DNS request section (question)

                                1  1  1  1  1  1
  0  1  2  3  4  5  6  7  8  9  0  1  2  3  4  5
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                                               |
/                     QNAME                     /
/                                               /
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                     QTYPE                     |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                    QCLASS                     |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+

Qname is a DNS string (discussed above), and Qtype is the type of the
query. For the type codes, see the RR section below. Qclass is always
0x0001 for internet. Now you should be able to decode the request packet
at the beginning of this tutorial.
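
Going the other way, a request can also be built from these pieces. The
Python sketch below puts a header and one question together; the function
name build_query and its parameters are my own, and the ID 0x5129 is simply
the one from the sample packet (a real resolver would pick its own).

```python
import struct


def build_query(name: str, qtype: int = 1, qclass: int = 1,
                ident: int = 0x5129, recursion_desired: bool = True) -> bytes:
    """Build a DNS request with a single question."""
    flags = 0x0100 if recursion_desired else 0x0000   # only the RD bit
    # ID, flags, QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0
    header = struct.pack("!6H", ident, flags, 1, 0, 0, 0)
    qname = b""
    for piece in name.rstrip(".").split("."):
        raw = piece.encode("ascii")
        qname += bytes([len(raw)]) + raw              # length byte + piece
    qname += b"\x00"                                  # terminator
    return header + qname + struct.pack("!HH", qtype, qclass)


# The 33 DNS bytes of the sample packet (offset 0x2A to the end).
SAMPLE_REQUEST = bytes.fromhex(
    "512901000001000000000000"
    "0573706163650676616469756D02736B00"
    "00010001"
)

request = build_query("space.vadium.sk")
print(request == SAMPLE_REQUEST)   # prints: True
```

To really send such a request, you would hand these bytes to a UDP socket
connected to port 53 of a name server; that part is left out here.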

* RR's (Resource records)

Answer, authority and additional all use the RR format.
The RR format looks like this :
                                1  1  1  1  1  1
  0  1  2  3  4  5  6  7  8  9  0  1  2  3  4  5
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                                               |
/                                               /
/                      NAME                     /
|                                               |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                      TYPE                     |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                     CLASS                     |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                      TTL                      |
|                                               |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                   RDLENGTH                    |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
/                     RDATA                     /
/                                               /
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+

NAME is the domain name the RR belongs to (in an answer : the name you
specified in the request).
TYPE is a type code that specifies the type of the data in the RDATA section

the most important type codes are :

NAME   VALUE  MEANING

A      1      a host address
NS     2      an authoritative name server
MD     3      a mail destination (Obsolete - use MX)
MF     4      a mail forwarder (Obsolete - use MX)
CNAME  5      the canonical name for an alias
SOA    6      marks the start of a zone of authority

For a more detailed list of types and Qtypes, I refer to RFC 1035.

CLASS is always 1 if it's an internet packet (for all class values, see
rfc1035)

TTL is a 32 bit number : the time in seconds the RR may be kept in a cache

RDLENGTH is the length of the RDATA section in bytes

RDATA is the data section of the RR. There are many possible RDATA
formats, but in most cases (TYPE A), RDATA is the 4 octet IP address you
asked for. Another format you'll need for more complex operations (like mail
servers) is the WKS RDATA format. I refer to RFC1035 for that. The format of
RDATA depends on the value of TYPE
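
To show how the fixed part of an RR is read, here is a sketch in Python. The
answer packet in it is made up for the example (it is not a real capture) :
it repeats the sample question and adds one A record with the documentation
address 192.0.2.1 and a TTL of 3600. The names skip_name and parse_rr are my
own.

```python
import ipaddress
import struct


def skip_name(message: bytes, offset: int) -> int:
    """Return the offset of the first byte after a (compressed) name."""
    while True:
        length = message[offset]
        if (length & 0xC0) == 0xC0:      # a pointer ends the name
            return offset + 2
        offset += 1
        if length == 0:                  # 0x00 ends the name
            return offset
        offset += length


def parse_rr(message: bytes, offset: int):
    """Return (fields of one RR, offset of the next RR)."""
    offset = skip_name(message, offset)
    # TYPE (16 bit), CLASS (16 bit), TTL (32 bit), RDLENGTH (16 bit)
    rtype, rclass, ttl, rdlength = struct.unpack_from("!HHIH", message, offset)
    offset += 10
    rdata = message[offset:offset + rdlength]
    if rtype == 1 and rclass == 1 and rdlength == 4:
        rdata = str(ipaddress.IPv4Address(rdata))   # A record: IP address
    fields = {"type": rtype, "class": rclass, "ttl": ttl, "rdata": rdata}
    return fields, offset + rdlength


# Hypothetical answer: header (QR=1, RD=1, RA=1, ANCOUNT=1), the sample
# question, then one RR whose NAME is a pointer to offset 12 (the QNAME).
ANSWER = bytes.fromhex(
    "512981800001000100000000"
    "0573706163650676616469756D02736B00" "00010001"
    "C00C" "0001" "0001" "00000E10" "0004" "C0000201"
)

rr, next_offset = parse_rr(ANSWER, 33)
print(rr["type"], rr["ttl"], rr["rdata"])   # prints: 1 3600 192.0.2.1
```

For other values of TYPE the sketch just returns the raw RDATA bytes.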

* file formats.

If you make a DNS daemon, you could store your information in any file
format you like, such as DBF or other database formats, but rfc 1035 defines
a special text file format for domain info (the master file format).

A master file consists of lines, separated by a CR-LF sequence. Usually,
the file starts with this:

$ORIGIN kervel.dyn.ml.org ;this line sets the origin of the file : kervel....
$INCLUDE m2.txt mail ;this line says the file m2.txt has all the entries
;for mail.kervel.dyn.ml.org

So: each line can have a comment string (it begins with a ;); the comment
string ends at the end of the line.
$ORIGIN specifies the domain the following names are relative to.
$INCLUDE loads another file, optionally into a subdomain.
Parentheses are used to split a line into multiple lines: within parentheses
a CR-LF doesn't mean the end of the line, only the end of the comment string.
Then the hosts, mailboxes, ... are specified like this:
[<domain-name>] <rr>

domain-name : optional : if the line starts with a blank, the RR belongs to
the last domain name that was stated. If the name is set and ends with a ".",
it is used as it is; if it does not end with a ".", the origin of the master
file is appended to it.

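-> special characters inside a piece of a name, like a ".", are noted like
this: '\.', to avoid conflicts. So the mailbox Action.domains is written as
Action\.domains, while the dots between the pieces of altavista.com stay
normal dots.
(taken from rfc 1035)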

@       A free standing @ is used to denote the current origin.

\X      where X is any character other than a digit (0-9), is
        used to quote that character so that its special meaning
        does not apply. For example, "\." can be used to place
        a dot character in a label.

\DDD    where each D is a digit is the octet corresponding to
        the decimal number described by DDD. The resulting
        octet is assumed to be text and is not checked for
        special meaning.

( )     Parentheses are used to group data that crosses a line
        boundary. In effect, line terminations are not
        recognized within parentheses.

;       Semicolon is used to start a comment; the remainder of
        the line is ignored.

rr : [<ttl>] [<class>] <type> <rdata>

ttl and class are copied into the RRs the server sends. They are optional:
if omitted, the last stated values are used.
type and rdata are required: they are the TYPE and RDATA of the RR to send.
A sample master file: (taken from rfc1035)

@       IN      SOA     VENERA  Action\.domains (
                                20      ; SERIAL
                                7200    ; REFRESH
                                600     ; RETRY
                                3600000 ; EXPIRE
                                60)     ; MINIMUM
        NS      A.ISI.EDU.
        NS      VENERA
        NS      VAXA
        MX      10      VENERA
        MX      20      VAXA
A       A       26.3.0.103
VENERA  A       10.1.0.52
        A       128.9.0.32
VAXA    A       10.2.0.27
        A       128.9.0.33
$INCLUDE <SUBSYS>ISI-MAILBOXES.TXT

Where the file <SUBSYS>ISI-MAILBOXES.TXT is:
MOE     MB      A.ISI.EDU.
LARRY   MB      A.ISI.EDU.
CURLEY  MB      A.ISI.EDU.
STOOGES MG      MOE
        MG      LARRY
        MG      CURLEY
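
Reading such a file starts with three simple rules: drop the comments, glue
lines inside parentheses together, and remember whether a line started with a
blank (then it belongs to the last stated name). The Python below is only a
rough sketch of these three rules. The function name master_file_entries is
my own, and it assumes that ";" and parentheses never appear quoted; $ORIGIN,
$INCLUDE and the \X and \DDD notations are not handled.

```python
def master_file_entries(text: str):
    """Return a list of (starts_with_blank, tokens), one per entry."""
    entries = []
    tokens = []
    depth = 0                  # how many "(" are still open
    starts_with_blank = False
    for raw in text.splitlines():
        line = raw.split(";", 1)[0]            # drop the comment string
        if depth == 0:
            # Only the first line of an entry decides this.
            starts_with_blank = line[:1].isspace()
        depth += line.count("(") - line.count(")")
        tokens += line.replace("(", " ").replace(")", " ").split()
        if depth == 0 and tokens:              # the entry is complete
            entries.append((starts_with_blank, tokens))
            tokens = []
    return entries


# A small part of the rfc1035 sample, with its indentation.
SAMPLE = r"""
@       IN      SOA     VENERA  Action\.domains (
                                20      ; SERIAL
                                7200    ; REFRESH
                                600     ; RETRY
                                3600000 ; EXPIRE
                                60)     ; MINIMUM
        NS      A.ISI.EDU.
VENERA  A       10.1.0.52
        A       128.9.0.32
"""

for blank, entry in master_file_entries(SAMPLE):
    print(blank, entry)
```

The SOA entry comes out as one list of ten tokens, and the NS line and the
second A line are marked as starting with a blank.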

* to be continued *