<!ENTITY I-D.narten-iana-considerations-rfc2434bis SYSTEM "http://xml.resource.org/public/rfc/bibxml3/reference.I-D.narten-iana-considerations-rfc2434bis.xml">
<!-- <!ENTITY I-D.draft-bortzmeyer-dnsop-dns-privacy SYSTEM "http://xml.resource.org/public/rfc/bibxml3/reference.I-D.draft-bortzmeyer-dnsop-dns-privacy"> TO ENABLE when bibxml3 is back to normal -->
<t>This document describes a common output format for Passive DNS servers that clients can query. The format description also defines common semantics for each Passive DNS system. When multiple Passive DNS systems adhere to the same output format for queries, users of multiple Passive DNS servers can combine result sets easily.</t>
<t>Passive DNS is a technique described by Florian Weimer in 2005 in <xref target="WEINERPDNS">Passive DNS replication, F Weimer - 17th Annual FIRST Conference on Computer Security</xref>. Since then, multiple Passive DNS implementations have evolved. Users of these Passive DNS servers may query a server (often via <xref target="RFC3912">WHOIS</xref> or HTTP <xref target="REST">REST</xref>), parse the results, and process them in other applications.</t>
There are multiple implementations of Passive DNS software. Users of Passive DNS query each implementation and aggregate the results for their search. This document describes the output format of four Passive DNS systems (<xref target="DNSDB"/>, <xref target="PDNSCERTAT"/>, <xref target="PDNSCIRCL"/> and <xref target="PDNSCOF"/>) that are in use today and already share a nearly identical output format.
Because the format and the meaning of the output fields of each Passive DNS system need to be consistent, this document proposes a common name for each field along with its corresponding interpretation. The format follows a simple key-value structure in <xref target="RFC4627">JSON</xref> format.
The benefit of a consistent Passive DNS output format is that multiple client implementations can query different servers without needing a separate parser for each server.
This document does not describe the protocol (e.g. <xref target="RFC3912">WHOIS</xref>, HTTP <xref target="REST">REST</xref>) or the query format used to query the Passive DNS system. Nor does this document describe "pre-recursor" Passive DNS systems. Both are separate topics that deserve their own RFC documents.
<t>Because a Passive DNS system can include protection mechanisms for its operation, results might differ because of those protection measures. These mechanisms filter out DNS answers that fail some criteria. The <xref target="BAILIWICK">bailiwick algorithm</xref> protects the Passive DNS database from <xref target="CACHEPOISONING">cache poisoning attacks</xref>.
Another limitation that clients querying the database need to be aware of is that each query simply returns a snapshot answer as of the time of querying. Clients MUST NOT rely on consistent answers, nor must they assume that answers are identical across multiple Passive DNS servers.
<t>The formatting of the answer follows the <xref target="RFC4627">JSON</xref> format. The order of the fields is not significant for the same resource type. That is, the same name tuple plus timing information identifies a unique answer per server.</t>
<t>The intent of this output format is to be easily parsable by scripts. Each JSON object is expressed on a single line to be processed by the client line-by-line. Every implementation MUST support the JSON output format.</t><!-- note: it is "parsable" if you want to be really nit-picking. See: https://en.wiktionary.org/wiki/parsable -->
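<t>The line-by-line processing described above can be sketched as follows. This is a minimal illustration, not a normative parser: the function name parse_pdns_lines and the sample values are assumptions made for this example, and only field names defined in this document are used.</t>
<figure><artwork><![CDATA[
```python
import json


def parse_pdns_lines(text):
    """Parse Passive DNS output: one JSON object per non-empty line."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        records.append(json.loads(line))
    return records


# Hypothetical sample answer; names and addresses are illustrative only.
# The two objects deliberately list their fields in a different order,
# since field order is not significant.
SAMPLE = (
    '{"rrname": "www.example.com", "rrtype": "A", "rdata": "192.0.2.1"}\n'
    '{"rdata": "ns1.example.com", "rrtype": "NS", "rrname": "example.com"}\n'
)

records = parse_pdns_lines(SAMPLE)
for record in records:
    print(record["rrname"], record["rrtype"], record["rdata"])
```
]]></artwork></figure>
<t>Because each line is a complete JSON object, a client can process a large answer as a stream without loading the whole result set first.</t>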
<t>Note that "value" is defined in <xref target="RFC4627">JSON</xref> and has exactly the same specification here. The same applies to the definition of "string".</t>
<t>Implementations MUST support all mandatory fields.</t>
<t>Uniqueness property: the tuple (rrname, rrtype, rdata) will always be unique within one answer per server. While rrname and rrtype are always individual JSON primitive types (strings, numbers, booleans or null), rdata MAY be an array as defined in <xref target="RFC4627">JSON</xref>. Implementors of this draft MUST be able to deal with rdata being returned either as a JSON array or as a JSON string. <!-- NOTE: this is not good --></t>
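<t>One way for a client to cope with rdata arriving either as a string or as an array is to normalize it to a list before further processing. The sketch below assumes a record that has already been decoded from JSON into a Python dict; the helper name rdata_values is hypothetical.</t>
<figure><artwork><![CDATA[
```python
def rdata_values(record):
    """Return rdata as a list, whether it arrived as a string or an array."""
    rdata = record["rdata"]
    if isinstance(rdata, list):
        return list(rdata)  # copy so callers cannot mutate the record
    return [rdata]


# Hypothetical records; the addresses are illustrative only.
single = {"rrname": "www.example.com", "rrtype": "A", "rdata": "192.0.2.1"}
multi = {
    "rrname": "www.example.com",
    "rrtype": "A",
    "rdata": ["192.0.2.1", "192.0.2.2"],
}

print(rdata_values(single))
print(rdata_values(multi))
```
]]></artwork></figure>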
<t>This field returns the resource record type as seen by the Passive DNS system. The key is rrtype and the value is the interpreted (textual) record type. If the value cannot be interpreted, the
decimal value is returned, following the principle of transparency as described in <xref target="RFC3597">RFC 3597</xref>.
The resource record type can be any value described by IANA in the DNS parameters document in the section 'Resource Record (RR) TYPEs' (http://www.iana.org/assignments/dns-parameters).
A client MUST be able to understand these textual rrtype values. In addition, a client MUST be able to handle a decimal value (as mentioned above) as an answer.
<t>This field returns the data of the queried resource. In general, this is to be interpreted as a string. Depending on the rrtype, this can be an IPv4 or IPv6 address, a domain name (as in the case of CNAMEs), an SPF record, etc. A client MUST be able to interpret any value that is legal as the right-hand side in a DNS zone file (<xref target="RFC1035">RFC 1035</xref> and <xref target="RFC1034">RFC 1034</xref>). If the rdata came from an unknown DNS resource record type, the server must follow the transparency principle as described in <xref target="RFC3597">RFC 3597</xref>.</t>
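<t>Since an rrtype value may be either a textual record type or a decimal value, a client may want to map both forms to the numeric type code. The following sketch is illustrative only: the helper name rrtype_number is hypothetical, the table is a small subset of the IANA registry, and acceptance of the "TYPEnnn" spelling from RFC 3597 is an assumption that goes beyond what this document requires.</t>
<figure><artwork><![CDATA[
```python
# Small illustrative subset of IANA RR type numbers; not a complete registry.
KNOWN_TYPES = {
    "A": 1,
    "NS": 2,
    "CNAME": 5,
    "SOA": 6,
    "MX": 15,
    "TXT": 16,
    "AAAA": 28,
}


def rrtype_number(rrtype):
    """Map an rrtype value (mnemonic, decimal, or 'TYPEnnn') to its number."""
    text = str(rrtype).strip().upper()
    if text.isdigit():
        return int(text)  # decimal form, e.g. "65280" or the number 28
    if text.startswith("TYPE") and text[4:].isdigit():
        return int(text[4:])  # RFC 3597 generic form (assumed to be accepted)
    if text in KNOWN_TYPES:
        return KNOWN_TYPES[text]
    raise ValueError("unknown rrtype: %r" % (rrtype,))


print(rrtype_number("AAAA"))
print(rrtype_number("65280"))
```
]]></artwork></figure>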
<t>This field returns the first time that the unique tuple (rrname, rrtype, rdata) was seen by the Passive DNS system. The date is expressed in seconds (decimal ASCII) since 1 January 1970 (Unix timestamp). The time zone MUST be UTC.</t>
<t>This field returns the last time that the unique tuple (rrname, rrtype, rdata) was seen by the Passive DNS system. The date is expressed in seconds (decimal ASCII) since 1 January 1970 (Unix timestamp). The time zone MUST be UTC.</t>
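<t>The timestamp fields described above can be converted into timezone-aware UTC values as in the following sketch. The helper name to_utc and the sample values are assumptions for illustration; the helper accepts the timestamp either as a JSON number or as a decimal string.</t>
<figure><artwork><![CDATA[
```python
from datetime import datetime, timezone


def to_utc(timestamp):
    """Convert a Unix timestamp (number or decimal string) to a UTC datetime."""
    return datetime.fromtimestamp(int(timestamp), tz=timezone.utc)


# Hypothetical record; the timestamps are illustrative only.
record = {"time_first": 1362053200, "time_last": 1362053800}

first_seen = to_utc(record["time_first"])
last_seen = to_utc(record["time_last"])
observed_for = last_seen - first_seen  # timedelta between first and last sighting

print(first_seen.isoformat(), last_seen.isoformat(), observed_for)
```
]]></artwork></figure>
<t>Keeping the values timezone-aware avoids accidentally interpreting them in the local time zone of the client.</t>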
<t>Specifies how many authoritative DNS answers were received at the Passive DNS server's collectors with the same set of answers (i.e. the same data). The number of answers is expressed as a decimal value.</t>
<t>Specifies the number of times this particular event, as denoted by the other fields, has been seen in the given time interval (between time_first and time_last). The value is a decimal number.</t>
<t>The bailiwick is the best estimate of the apex of the zone where this data is authoritative. String.</t>
</section>
</section>
<section title="Additional Fields">
<t>Implementations MAY support the following fields:</t>
<section title="sensor_id">
<t>This field returns information about the sensor where the record was seen. The sensor_id is an opaque byte string as defined in Section 2.3 of <xref target="RFC5001">RFC 5001</xref>.</t>
<t>This field returns the first time that the unique tuple (rrname, rrtype, rdata) was seen via zone file import. The date is expressed in seconds (decimal ASCII) since 1 January 1970 (Unix timestamp). The time zone MUST be UTC.</t>
<t>This field returns the last time that the unique tuple (rrname, rrtype, rdata) was seen via zone file import. The date is expressed in seconds (decimal ASCII) since 1 January 1970 (Unix timestamp). The time zone MUST be UTC.</t>
<t>In accordance with <xref target="RFC6648"/>, designers of new Passive DNS applications that need additional fields can request and register a new field name at https://github.com/adulau/pdns-qof/wiki/Additional-Fields.</t>
<t>Passive DNS servers collect DNS answers from multiple collecting points ("sensors") that are located on the Internet-facing side of DNS recursors. In this process, they intentionally omit the source IP, source port, destination IP and destination port. Furthermore, since multiple sensors feed into a Passive DNS server, the resulting data gets mixed together, reducing the likelihood that Passive DNS servers can find out much about the actual person querying the DNS records or about who actually sent the query. In this sense, Passive DNS servers are similar to an archive of all previous phone books, if public DNS records can be compared to phone numbers, as they often are.
Nevertheless, the authors encourage Passive DNS implementors to pay special attention to privacy issues. [draft-bortzmeyer-dnsop-dns-privacy] is an excellent starting point for this.
Finally, the overall recommendations in <xref target="RFC6973">RFC 6973</xref> should be taken into consideration when designing any application that uses Passive DNS data.</t>
<t>In some cases, Passive DNS output might contain confidential information, and access to it might be restricted. When a user queries multiple Passive DNS servers and aggregates the data, the sensitivity of the data must be considered.</t>