Merge branch 'adulau:master' into patch-1

Joakim von Brandis 2021-05-06 23:15:45 +02:00
commit 99e0aee951
11 changed files with 1497 additions and 2 deletions

.gitignore

@ -0,0 +1,2 @@
*.swp


@ -1,5 +1,12 @@
# Version 0.8
## Content changes
* added time_first_ms, time_last_ms
* clarified that time_{first,last} OR zone_time_{first,last} can be specified.
* Added: the MIME type SHOULD be "application/x-ndjson", as discussed in #9.
## Other changes
* Added JSON schema
* Started tracking in a CHANGELOG file
* Change Aaron's Address
* Address and find a compromise for issue #17


@ -0,0 +1,11 @@
# Example python parser
This little package can parse the Passive DNS Common Output Format (COF) and validate it.
It is given as example code.
* cofparser.py ... a manually written parser for the COF
* cofparser_jsonschema.py ... a parser which uses the JSON schema to validate


@ -0,0 +1,66 @@
"""
Example parser for the passive DNS Common Output Format (COF).
It parses newline-delimited JSON input and validates each record.
Author: Aaron Kaplan <aaron@lo-res.org>
Copyright 2021, all rights reserved.
License: AGPL v3. See https://www.gnu.org/licenses/agpl-3.0.en.html
"""
import sys
import json  # maybe use an ndjson library...


def is_valid(d: dict) -> bool:
    """Check the MANDATORY fields of one COF record; report problems on stderr."""
    # Check MANDATORY fields according to COF
    if "rrname" not in d:
        print("Missing MANDATORY field 'rrname'", file=sys.stderr)
        return False
    if not isinstance(d['rrname'], str):
        print("Type error: 'rrname' is not a JSON string", file=sys.stderr)
        return False
    if "rrtype" not in d:
        print("Missing MANDATORY field 'rrtype'", file=sys.stderr)
        return False
    if not isinstance(d['rrtype'], str):
        print("Type error: 'rrtype' is not a JSON string", file=sys.stderr)
        return False
    if "rdata" not in d:
        print("Missing MANDATORY field 'rdata'", file=sys.stderr)
        return False
    if not isinstance(d['rdata'], str) and not isinstance(d['rdata'], list):
        print("'rdata' is not a list and not a string.", file=sys.stderr)
        return False
    # EITHER time_{first,last} OR zone_time_{first,last} must be present.
    has_time = "time_first" in d and "time_last" in d
    has_zone_time = "zone_time_first" in d and "zone_time_last" in d
    if not (has_time or has_zone_time):
        print("We are missing EITHER ('time_first' and 'time_last') OR "
              "('zone_time_first' and 'zone_time_last') fields", file=sys.stderr)
        return False
    # currently we don't check the OPTIONAL fields. Sorry... to be done later.
    return True


def parse_line(line: str) -> dict:
    """Parse one line of COF input. Returns None if the line is not valid JSON."""
    d = None
    try:
        d = json.loads(line)
        if not is_valid(d):
            print("Warning: line %s does not conform to the COF standard." % line, file=sys.stderr)
    except Exception as ex:
        print("error. Could not parse input '%s'. Reason: '%s'" % (line, str(ex)), file=sys.stderr)
    return d


def parse_lines(multilines: str):
    """Yield one parsed record per non-empty input line."""
    for line in multilines.split('\n'):
        if not line.strip():
            continue  # skip blank lines such as a trailing newline
        yield parse_line(line)


if __name__ == "__main__":
    mock_input = """{"count":1909,"rdata":["cpa.circl.lu"],"rrname":"www.circl.lu","rrtype":"CNAME","time_first":1315586409,"time_last":1449566799}
{"count":2560,"rdata":["cpab.circl.lu"],"rrname":"www.circl.lu","rrtype":"CNAME","time_first":1449584660,"time_last":1617676151}"""
    for result in parse_lines(mock_input):
        print("result: %r" % result)


@ -0,0 +1,37 @@
import sys
import json

import jsonschema
from jsonschema import validate


def get_schema(filename):
    """Load the JSON schema from the given file."""
    with open(filename, 'r') as file:
        schema = json.load(file)
    return schema


def validate_json(json_data, schema=None):
    """Validate json_data against schema. REF: https://json-schema.org/ """
    try:
        validate(instance=json_data, schema=schema)
    except jsonschema.exceptions.ValidationError as err:
        print(err)
        return False, "Given JSON data is invalid"
    return True, "Given JSON data is valid"


if __name__ == "__main__":
    schema = get_schema("schema/schema.json")
    # Read the NDJSON file line by line and convert each line to a Python object.
    with open(sys.argv[1], 'r') as ndjson_file:
        for line in ndjson_file:
            if not line.strip():
                continue  # skip blank lines
            json_data = json.loads(line)
            # validate it
            is_valid, msg = validate_json(json_data, schema=schema)
            print(msg)

example_code/testdata/data.json

@ -0,0 +1,2 @@
{"count":1909,"rdata":["cpa.circl.lu"],"rrname":"www.circl.lu","rrtype":"CNAME","time_first":1315586409,"time_last":1449566799}
{"count":2560,"rdata":["cpab.circl.lu"],"rrname":"www.circl.lu","rrtype":"CNAME","time_first":1449584660,"time_last":1617676151}

i-d/pdns-qof.txt

@ -0,0 +1,728 @@
Domain Name System Operations A. Dulaunoy
Internet-Draft CIRCL
Intended status: Informational A. Kaplan
Expires: December 3, 2020
P. Vixie
H. Stern
Farsight Security, Inc.
June 1, 2020
Passive DNS - Common Output Format
draft-dulaunoy-dnsop-passive-dns-cof-08
Abstract
This document describes a common output format of Passive DNS Servers
which clients can query. The output format description also includes
a common semantic for each Passive DNS system. By having
multiple Passive DNS Systems adhere to the same output format for
queries, users of multiple Passive DNS servers will be able to
combine result sets easily.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on December 3, 2020.
Copyright Notice
Copyright (c) 2020 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(https://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
Dulaunoy, et al. Expires December 3, 2020 [Page 1]
Internet-Draft Passive DNS - Common Output Format June 2020
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3
1.1. Requirements Language . . . . . . . . . . . . . . . . . . 3
2. Limitation . . . . . . . . . . . . . . . . . . . . . . . . . 3
3. Common Output Format . . . . . . . . . . . . . . . . . . . . 4
3.1. Overview . . . . . . . . . . . . . . . . . . . . . . . . 4
3.2. ABNF grammar . . . . . . . . . . . . . . . . . . . . . . 4
3.3. Mandatory Fields . . . . . . . . . . . . . . . . . . . . 5
3.3.1. rrname . . . . . . . . . . . . . . . . . . . . . . . 5
3.3.2. rrtype . . . . . . . . . . . . . . . . . . . . . . . 5
3.3.3. rdata . . . . . . . . . . . . . . . . . . . . . . . . 5
3.3.4. time_first . . . . . . . . . . . . . . . . . . . . . 6
3.3.5. time_last . . . . . . . . . . . . . . . . . . . . . . 6
3.4. Optional Fields . . . . . . . . . . . . . . . . . . . . . 6
3.4.1. count . . . . . . . . . . . . . . . . . . . . . . . . 6
3.4.2. bailiwick . . . . . . . . . . . . . . . . . . . . . . 6
3.5. Additional Fields . . . . . . . . . . . . . . . . . . . . 6
3.5.1. sensor_id . . . . . . . . . . . . . . . . . . . . . . 6
3.5.2. zone_time_first . . . . . . . . . . . . . . . . . . . 7
3.5.3. zone_time_last . . . . . . . . . . . . . . . . . . . 7
3.5.4. origin . . . . . . . . . . . . . . . . . . . . . . . 7
3.5.5. time_first_ms . . . . . . . . . . . . . . . . . . . . 7
3.5.6. time_last_ms . . . . . . . . . . . . . . . . . . . . 7
3.6. Additional Fields Registry . . . . . . . . . . . . . . . 7
3.7. Additional notes . . . . . . . . . . . . . . . . . . . . 8
3.8. Suggested MIME Types . . . . . . . . . . . . . . . . . . 8
4. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 8
5. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 8
6. Privacy Considerations . . . . . . . . . . . . . . . . . . . 8
7. Security Considerations . . . . . . . . . . . . . . . . . . . 9
8. References . . . . . . . . . . . . . . . . . . . . . . . . . 9
8.1. Normative References . . . . . . . . . . . . . . . . . . 9
8.2. References . . . . . . . . . . . . . . . . . . . . . . . 10
8.3. Informative References . . . . . . . . . . . . . . . . . 11
Appendix A. Examples . . . . . . . . . . . . . . . . . . . . . . 11
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 12
1. Introduction
Passive DNS is a technique described by Florian Weimer in 2005 in
Passive DNS replication, F Weimer - 17th Annual FIRST Conference on
Computer Security [WEIMERPDNS]. Since then multiple Passive DNS
implementations were created and evolved over time. Users of these
Passive DNS servers may query a server (often via WHOIS [RFC3912] or
HTTP REST [REST]), parse the results and process them in other
applications.
There are multiple implementations of Passive DNS software. Users of
passive DNS query each implementation and aggregate the results for
their search. This document describes the output format of four
Passive DNS Systems ([DNSDB], [DNSDBQ], [PDNSCERTAT], [PDNSCIRCL] and
[PDNSCOF]) which are in use today and which already share a nearly
identical output format. As the format and the meaning of output
fields from each Passive DNS need to be consistent, we propose in
this document a solution to commonly name each field along with their
corresponding interpretation. The format follows a simple key-value
structure in JSON [RFC4627] format. The benefit of having a
consistent Passive DNS output format is that multiple client
implementations can query different servers without having to have a
separate parser for each individual server. passivedns-client
[PDNSCLIENT] currently implements multiple parsers due to a lack of
standardization. The document does not describe the protocol (e.g.
WHOIS [RFC3912], HTTP REST [REST]) nor the query format used to query
the Passive DNS. Neither does this document describe "pre-recursor"
Passive DNS Systems. Both of these are separate topics and deserve
their own RFC document. The document describes the current best
practices implemented in various Passive DNS server implementations.
1.1. Requirements Language
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [RFC2119].
2. Limitation
As Passive DNS servers can include protection mechanisms for their
operation, results might be different due to those protection
measures. These mechanisms filter out DNS answers if they fail some
criteria. The bailiwick algorithm [BAILIWICK] protects the Passive
DNS Database from cache poisoning attacks [CACHEPOISONING]. Another
limitation that clients querying the database need to be aware of is
that each query simply gets a snapshot-answer of the time of
querying. Clients MUST NOT rely on consistent answers. Nor must
they assume that answers must be identical across multiple Passive
DNS Servers.
3. Common Output Format
3.1. Overview
The formatting of the answer follows the JSON [RFC4627] format. In
fact, it is a subset of the full JSON language. A notable difference
is the modified definition of whitespace ("ws"). The order of the
fields is not significant for the same resource type.
The intent of this output format is to be easily parsable by scripts.
Each JSON object is expressed on a single line to be processed by the
client line-by-line. Every implementation MUST support the JSON
output format.
Examples of JSON (Appendix A) output are in the appendix.
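The line-by-line processing described in this overview can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name iter_records is hypothetical, blank lines are skipped as a convenience, and CR, LF and CRLF are all accepted as line terminators even though the grammar names only CR.

```python
import json
import re
from typing import Iterator

# Split on CR, LF or CRLF only, so that other Unicode line separators
# appearing inside JSON strings do not break a record apart.
_LINE_BREAK = re.compile(r"\r\n|\r|\n")


def iter_records(text: str) -> Iterator[dict]:
    """Yield one dict per non-empty line of COF output (hypothetical helper)."""
    for line in _LINE_BREAK.split(text):
        if not line.strip():
            continue  # tolerate blank lines, e.g. after a trailing terminator
        record = json.loads(line)
        if not isinstance(record, dict):
            raise ValueError("each COF line must hold exactly one JSON object")
        yield record
```

Because the function is a generator, a client can process very large answers without holding all records in memory at once.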
3.2. ABNF grammar
Formal grammar as defined in ABNF [RFC2234]
answer = entries
entries = * ( entry CR)
entry = "{" keyvallist "}"
keyvallist = [ member *( value-separator member ) ]
member = qm field qm name-separator value
name-separator = ws %x3A ws ; a ":" colon
value = value ; as defined in the JSON RFC
value-separator = ws %x2C ws ; , comma. As defined in JSON
field = "rrname" | "rrtype" | "rdata" | "time_first" |
"time_last" | "count" | "bailiwick" | "sensor_id" |
"zone_time_first" | "zone_time_last" | "origin" |
futureField
futureField = string
CR = %x0D
qm = %x22 ; " a quotation mark
ws = *(
%x20 | ; Space
%x09 ; Horizontal tab
)
Note that value is defined in JSON [RFC4627] and has the exact same
specification as there. The same goes for the definition of string.
3.3. Mandatory Fields
Implementations MUST support all the mandatory fields.
Uniqueness property: the tuple (rrname,rrtype,rdata) will always be
unique within one answer per server. While rrname and rrtype are
always individual JSON primitive types (strings, numbers, booleans or
null), rdata MAY return multiple resource records or a single record.
When multiple resource records are returned, rdata MUST be a JSON
array. In the case where a single resource record is returned, rdata
MUST be a JSON string or a JSON array containing one JSON string.
Senders SHOULD send an array for rdata, but receivers MUST be able to
accept a single-string result for rdata.
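A receiver can satisfy the last requirement by normalising rdata to a list before further processing. The following Python sketch shows one way to do so; the helper name normalize_rdata is hypothetical and not part of this specification.

```python
from typing import List, Union


def normalize_rdata(rdata: Union[str, List[str]]) -> List[str]:
    """Return rdata as a list of strings, accepting both permitted forms."""
    if isinstance(rdata, str):
        return [rdata]  # single-string form
    if isinstance(rdata, list) and all(isinstance(item, str) for item in rdata):
        return list(rdata)  # array form; copy so the caller's list is untouched
    raise TypeError("rdata must be a JSON string or an array of JSON strings")
```

After this step the rest of a client only has to deal with the array form that senders are encouraged to use.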
3.3.1. rrname
This field returns the name of the queried resource. JSON [RFC4627]
string.
3.3.2. rrtype
This field returns the resource record type as seen by the passive
DNS. The key is rrtype and the value is in the interpreted record
type represented as a JSON [RFC4627] string. If the value cannot be
interpreted, the decimal value is returned following the principle of
transparency as described in RFC 3597 [RFC3597]. Then the decimal
value is represented as a JSON [RFC4627] number. The resource record
type can be any value as described by IANA in the DNS parameters
document in the section 'Resource Record (RR) TYPEs'
(http://www.iana.org/assignments/dns-parameters). Supported textual
descriptions of rrtypes include: A, AAAA, CNAME, etc. A client MUST
be able to understand these textual rrtype values represented as a
JSON [RFC4627] string. In addition, a client MUST be able to handle
a decimal value (as mentioned above) answer represented as a JSON
[RFC4627] number.
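A client that has to handle both representations might map them onto a single textual form. The Python sketch below is one possible approach, not a requirement of this document: the function name rrtype_to_text is hypothetical, and rendering a numeric type as "TYPE" followed by the decimal value borrows the textual convention of RFC 3597 as an assumption about what a client may find convenient.

```python
from typing import Union


def rrtype_to_text(rrtype: Union[str, int]) -> str:
    """Return a textual rrtype for either permitted JSON representation."""
    if isinstance(rrtype, bool):
        # bool is a subclass of int in Python, but it is not a valid rrtype.
        raise TypeError("rrtype must be a JSON string or a JSON number")
    if isinstance(rrtype, str):
        return rrtype  # already textual, e.g. "A" or "CNAME"
    if isinstance(rrtype, int):
        if not 0 <= rrtype <= 0xFFFF:
            raise ValueError("DNS RR type codes are 16-bit values")
        return "TYPE%d" % rrtype  # assumed RFC 3597-style rendering
    raise TypeError("rrtype must be a JSON string or a JSON number")
```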
3.3.3. rdata
This field returns the resource records of the queried resource.
When multiple resource records are returned, rdata MUST be a JSON
array containing JSON strings. In the case where a single resource
record is returned, rdata MUST be a JSON string or a JSON array
containing one JSON string. Each resource record is represented as a
JSON [RFC4627] string. Each resource record MUST be escaped as
defined in section 2.6 of RFC4627 [RFC4627]. Depending on the
rrtype, this can be an IPv4 or IPv6 address, a domain name (as in the
case of CNAMEs), an SPF record, etc. A client MUST be able to
interpret any value which is legal as the right hand side in a DNS
master file RFC 1035 [RFC1035] and RFC 1034 [RFC1034]. If the rdata
came from an unknown DNS resource record, the server must follow the
transparency principle as described in RFC 3597 [RFC3597].
3.3.4. time_first
This field returns the first time that the record / unique tuple
(rrname, rrtype, rdata) has been seen by the passive DNS. The date
is expressed in seconds (decimal) since 1st of January 1970 (Unix
timestamp). The time zone MUST be UTC. This field is represented as
a JSON [RFC4627] number.
3.3.5. time_last
This field returns the last time that the unique tuple (rrname,
rrtype, rdata) record has been seen by the passive DNS. The date is
expressed in seconds (decimal) since 1st of January 1970 (Unix
timestamp). The time zone MUST be UTC. This field is represented as
a JSON [RFC4627] number.
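As an illustration of how a client might interpret these two fields, the following Python sketch converts the Unix timestamps into timezone-aware UTC datetimes. The helper names to_utc_datetime and observed_span are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def to_utc_datetime(seconds: float) -> datetime:
    """Convert a COF timestamp (seconds since 1970-01-01 UTC) to a datetime."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


def observed_span(record: dict) -> timedelta:
    """Return the time between the first and the last sighting of a record."""
    first = to_utc_datetime(record["time_first"])
    last = to_utc_datetime(record["time_last"])
    return last - first
```

Passing tz=timezone.utc explicitly avoids the local-time interpretation that a naive conversion would apply.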
3.4. Optional Fields
Implementations SHOULD support one or more fields.
3.4.1. count
Specifies how many authoritative DNS answers were received at the
Passive DNS Server's collectors with exactly the given set of values
as answers (i.e. same data in the answer set - compare with the
uniqueness property in "Mandatory Fields"). The number of requests
is expressed as a decimal value. This field is represented as a JSON
[RFC4627] number.
3.4.2. bailiwick
The bailiwick is the best estimate of the apex of the zone where this
data is authoritative.
3.5. Additional Fields
Implementations MAY support the following fields:
3.5.1. sensor_id
This field returns the sensor information where the record was seen.
It is represented as a JSON [RFC4627] string.
If the data originate from sensors or probes which are part of a
publicly-known gathering or measurement system (e.g. RIPE Atlas), a
JSON [RFC4627] string SHOULD be prefixed.
3.5.2. zone_time_first
This field returns the first time that the unique tuple (rrname,
rrtype, rdata) record has been seen via master file import. The date
is expressed in seconds (decimal) since 1st of January 1970 (Unix
timestamp). The time zone MUST be UTC. This field is represented as
a JSON [RFC4627] number.
3.5.3. zone_time_last
This field returns the last time that the unique tuple (rrname,
rrtype, rdata) record has been seen via master file import. The date
is expressed in seconds (decimal) since 1st of January 1970 (Unix
timestamp). The time zone MUST be UTC. This field is represented as
a JSON [RFC4627] number.
3.5.4. origin
Specifies the resource origin of the Passive DNS response. This
field is represented as a Uniform Resource Identifier [RFC3986]
(URI).
3.5.5. time_first_ms
Same meaning as the field "time_first", with the only difference,
that the resolution is in milliseconds since 1st of January 1970
(UTC).
3.5.6. time_last_ms
Same meaning as the field "time_last", with the only difference, that
the resolution is in milliseconds since 1st of January 1970 (UTC).
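A client that wants the best available resolution could prefer the millisecond field and fall back to the seconds field. The sketch below assumes that preference order, which this document does not mandate; the function name first_seen is hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def first_seen(record: dict) -> Optional[datetime]:
    """Return the first-seen time, preferring time_first_ms over time_first."""
    if "time_first_ms" in record:
        # Integer milliseconds are added to the epoch to avoid float rounding.
        return _EPOCH + timedelta(milliseconds=record["time_first_ms"])
    if "time_first" in record:
        return _EPOCH + timedelta(seconds=record["time_first"])
    return None  # neither field is present
```

The same pattern applies to time_last_ms and time_last.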
3.6. Additional Fields Registry
In accordance with [RFC6648], designers of new passive DNS
applications that would need additional fields can request and
register new field names at https://github.com/adulau/pdns-qof/wiki/
Additional-Fields.
3.7. Additional notes
An implementer of a passive DNS Server MAY choose to either return
time_first and time_last OR return zone_time_first and
zone_time_last. In pseudocode: (time_first AND time_last) OR
(zone_time_first AND zone_time_last). In this case,
zone_time_{first,last} replace the time_{first,last} fields.
However, this is not encouraged since it might be confusing for
parsers who will expect the mandatory fields time_{first,last}. See:
[github_issue_17]
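The pseudocode above translates directly into a parser-side check. The following Python sketch is one way to write it; the function name has_required_times is hypothetical.

```python
def has_required_times(record: dict) -> bool:
    """True if (time_first AND time_last) OR (zone_time_first AND zone_time_last)."""
    has_time = "time_first" in record and "time_last" in record
    has_zone_time = "zone_time_first" in record and "zone_time_last" in record
    return has_time or has_zone_time
```

Note that a record carrying only one half of each pair, for example time_first together with zone_time_last, does not pass the check.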
3.8. Suggested MIME Types
An implementer of a passive DNS Server SHOULD serve a document in
this Common Output Format with a MIME header of "application/
x-ndjson".
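On the server side, producing such a document amounts to writing one compact JSON object per line. The Python sketch below illustrates this under two assumptions: the names COF_MIME_TYPE and to_ndjson are hypothetical, and lines are terminated with a line feed as is conventional for NDJSON, whereas the grammar in this document names CR.

```python
import json
from typing import Iterable

# Value to send in the Content-Type header, as suggested above.
COF_MIME_TYPE = "application/x-ndjson"


def to_ndjson(records: Iterable[dict]) -> str:
    """Serialise records as one compact JSON object per line."""
    # json.dumps escapes control characters, so no object can span two lines.
    return "".join(json.dumps(rec, separators=(",", ":")) + "\n" for rec in records)
```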
4. Acknowledgements
Thanks to the Passive DNS developers who contributed to the document.
5. IANA Considerations
This memo includes no request to IANA.
6. Privacy Considerations
Passive DNS Servers capture DNS answers from multiple collecting
points ("sensors") which are located on the Internet-facing side of
DNS recursors ("post-recursor passive DNS"). In this process, they
intentionally omit the source IP, source port, destination IP and
destination port from the captured packets. Since the data is
captured "post-recursor", the timing information (who queries what)
is lost, since the recursor will cache the results. Furthermore,
since multiple sensors feed into a passive DNS server, the resulting
data gets mixed together, reducing the likelihood that Passive DNS
Servers are able to find out much about the actual person querying
the DNS records or who actually sent the query. In this sense,
passive DNS Servers are similar to keeping an archive of all previous
phone books - if public DNS records can be compared to phone numbers
- as they often are. Nevertheless, the authors strongly encourage
Passive DNS implementors to take special care of privacy issues.
bortzmeyer-dnsop-dns-privacy is an excellent starting point for this.
Finally, the overall recommendations in RFC6973 [RFC6973] should be
taken into consideration when designing any application which uses
Passive DNS data.
In the scope of the General Data Protection Regulation (GDPR -
Regulation (EU) 2016/679), operators of Passive DNS Servers need to
ensure the legal ground and lawfulness of their operation.
7. Security Considerations
In some cases, Passive DNS output might contain confidential
information and its access might be restricted. When a user is
querying multiple Passive DNS servers and aggregating the data, the
sensitivity of the data must be considered.
8. References
8.1. Normative References
[RFC1034] Mockapetris, P., "Domain names - concepts and facilities",
STD 13, RFC 1034, DOI 10.17487/RFC1034, November 1987,
<https://www.rfc-editor.org/info/rfc1034>.
[RFC1035] Mockapetris, P., "Domain names - implementation and
specification", STD 13, RFC 1035, DOI 10.17487/RFC1035,
November 1987, <https://www.rfc-editor.org/info/rfc1035>.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119,
DOI 10.17487/RFC2119, March 1997,
<https://www.rfc-editor.org/info/rfc2119>.
[RFC2234] Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax
Specifications: ABNF", RFC 2234, DOI 10.17487/RFC2234,
November 1997, <https://www.rfc-editor.org/info/rfc2234>.
[RFC3597] Gustafsson, A., "Handling of Unknown DNS Resource Record
(RR) Types", RFC 3597, DOI 10.17487/RFC3597, September
2003, <https://www.rfc-editor.org/info/rfc3597>.
[RFC3912] Daigle, L., "WHOIS Protocol Specification", RFC 3912,
DOI 10.17487/RFC3912, September 2004,
<https://www.rfc-editor.org/info/rfc3912>.
[RFC3986] Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
Resource Identifier (URI): Generic Syntax", STD 66,
RFC 3986, DOI 10.17487/RFC3986, January 2005,
<https://www.rfc-editor.org/info/rfc3986>.
[RFC4627] Crockford, D., "The application/json Media Type for
JavaScript Object Notation (JSON)", RFC 4627,
DOI 10.17487/RFC4627, July 2006,
<https://www.rfc-editor.org/info/rfc4627>.
[RFC5001] Austein, R., "DNS Name Server Identifier (NSID) Option",
RFC 5001, DOI 10.17487/RFC5001, August 2007,
<https://www.rfc-editor.org/info/rfc5001>.
[RFC6648] Saint-Andre, P., Crocker, D., and M. Nottingham,
"Deprecating the "X-" Prefix and Similar Constructs in
Application Protocols", BCP 178, RFC 6648,
DOI 10.17487/RFC6648, June 2012,
<https://www.rfc-editor.org/info/rfc6648>.
[RFC6973] Cooper, A., Tschofenig, H., Aboba, B., Peterson, J.,
Morris, J., Hansen, M., and R. Smith, "Privacy
Considerations for Internet Protocols", RFC 6973,
DOI 10.17487/RFC6973, July 2013,
<https://www.rfc-editor.org/info/rfc6973>.
8.2. References
[BAILIWICK]
"Passive DNS Hardening", 2010,
<https://archive.farsightsecurity.com/Passive_DNS/
passive_dns_hardening_handout.pdf>.
[CACHEPOISONING]
"Black ops 2008: It's the end of the cache as we know
it.", 2008, <http://kurser.lobner.dk/dDist/DMK_BO2K8.pdf>.
[DNSDB] "DNSDB API", 2013, <https://api.dnsdb.info/>.
[DNSDBQ] "DNSDB API Client, C Version", 2018,
<https://github.com/dnsdb/dnsdbq>.
[github_issue_17]
"Discussion on the existing implementations of returning
either zone_time{first,last} OR time_{first,last}", 2020,
<https://github.com/adulau/pdns-qof/issues/17>.
[PDNSCERTAT]
"pDNS presentation at 4th Centr R&D workshop Frankfurt Jun
5th 2012", 2012,
<http://www.centr.org/system/files/agenda/attachment/
rd4-papst-passive_dns.pdf>.
[PDNSCIRCL]
"CIRCL Passive DNS", 2012,
<https://www.circl.lu/services/passive-dns/>.
[PDNSCLIENT]
"Queries 5 major Passive DNS databases: BFK, CERTEE,
DNSParse, ISC, and VirusTotal.", 2013,
<https://github.com/chrislee35/passivedns-client>.
[PDNSCOF] "Passive DNS server interface using the common output
format", 2013,
<https://github.com/D4-project/analyzer-d4-passivedns/>.
[REST] "Representational State Transfer (REST)", 2000,
<http://www.ics.uci.edu/~fielding/pubs/dissertation/
rest_arch_style.htm>.
[WEIMERPDNS]
"Passive DNS Replication", 2005,
<http://www.enyo.de/fw/software/dnslogger/
first2005-paper.pdf>.
8.3. Informative References
[I-D.narten-iana-considerations-rfc2434bis]
Narten, T. and H. Alvestrand, "Guidelines for Writing an
IANA Considerations Section in RFCs", draft-narten-iana-
considerations-rfc2434bis-09 (work in progress), March
2008.
[RFC3552] Rescorla, E. and B. Korver, "Guidelines for Writing RFC
Text on Security Considerations", BCP 72, RFC 3552,
DOI 10.17487/RFC3552, July 2003,
<https://www.rfc-editor.org/info/rfc3552>.
Appendix A. Examples
The JSON objects are represented on multiple lines for readability but
each JSON object should be on a single line.
If you query a passive DNS for the rrname www.ietf.org, the passive
dns common output format can be:
{"count": 102, "time_first": 1298412391, "rrtype": "AAAA",
"rrname": "www.ietf.org", "rdata": "2001:1890:1112:1::20",
"time_last": 1302506851}
{"count": 59, "time_first": 1384865833, "rrtype": "A",
"rrname": "www.ietf.org", "rdata": "4.31.198.44",
"time_last": 1389022219}
If you query a passive DNS for the rrname ietf.org, the passive dns
common output format can be:
{"count": 109877, "time_first": 1298398002, "rrtype": "NS",
"rrname": "ietf.org", "rdata": "ns1.yyz1.afilias-nst.info",
"time_last": 1389095375}
{"count": 4, "time_first": 1298495035, "rrtype": "A",
"rrname": "ietf.org", "rdata": "64.170.98.32",
"time_last": 1298495035}
{"count": 9, "time_first": 1317037550, "rrtype": "AAAA",
"rrname": "ietf.org", "rdata": "2001:1890:123a::1:1e",
"time_last": 1330209752}
Please note that the examples imply that a single query returns a
single set of JSON objects. For example, two queries were made; one
query returned a set of two JSON objects and the other query returned
a set of three JSON objects. This specification requires each JSON
object individually MUST conform to the common output format, but
this specification does not require that a query will return a set of
JSON objects.
Please note that in the examples above, any backslashes "\" can be
ignored and are an artifact of the tools which produced this
document.
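Since the purpose of a common format is to let users combine result sets from several servers, the following Python sketch shows one possible merge of records such as those above. Everything about the merge policy is an assumption made for illustration and is not defined by this specification: records are keyed on (rrname, rrtype, rdata), the earliest time_first and latest time_last are kept, counts are summed only when every merged record carries one, and all records are assumed to use time_first and time_last rather than the zone_time variants. The function name merge_records is hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

Key = Tuple[str, str, Tuple[str, ...]]


def merge_records(records: Iterable[dict]) -> List[dict]:
    """Merge COF records from several servers on (rrname, rrtype, rdata)."""
    merged: Dict[Key, dict] = {}
    for rec in records:
        rdata = rec["rdata"]
        # Accept both the single-string and the array form of rdata.
        rdata_list = [rdata] if isinstance(rdata, str) else list(rdata)
        key = (rec["rrname"], str(rec["rrtype"]), tuple(sorted(rdata_list)))
        if key not in merged:
            out = dict(rec)  # shallow copy; the input record is not modified
            out["rdata"] = sorted(rdata_list)
            merged[key] = out
            continue
        out = merged[key]
        out["time_first"] = min(out["time_first"], rec["time_first"])
        out["time_last"] = max(out["time_last"], rec["time_last"])
        if "count" in out and "count" in rec:
            out["count"] += rec["count"]
        else:
            out.pop("count", None)  # a partial sum would be misleading
    return list(merged.values())
```

Whether summing counts is meaningful depends on whether the servers share sensors, so a real client may prefer to keep per-server counts separately.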
Authors' Addresses
Alexandre Dulaunoy
CIRCL
16, bd d'Avranches
Luxembourg L-1160
Luxembourg
Phone: (+352) 247 88444
Email: alexandre.dulaunoy@circl.lu
URI: http://www.circl.lu/
L. Aaron Kaplan
Vienna A-1170
Austria
Email: aaron@lo-res.org
Paul Vixie
Farsight Security, Inc.
11400 La Honda Road
Woodside, California 94062
U.S.A.
Email: paul@redbarn.org
URI: https://www.farsightsecurity.com/
Henry Stern
Farsight Security, Inc.
11400 La Honda Road
Woodside, California 94062
U.S.A.
Phone: +1 650 542-7836
Email: henry@stern.ca
URI: https://www.farsightsecurity.com/


@ -263,6 +263,9 @@ ws = *(
</section>
<section title="Additional notes">
<t>An implementer of a passive DNS Server MAY choose to either return time_first and time_last OR return zone_time_first and zone_time_last. In pseudocode: (time_first AND time_last) OR (zone_time_first AND zone_time_last). In this case, zone_time_{first,last} replace the time_{first,last} fields. However, this is not encouraged since it might be confusing for parsers who will expect the mandatory fields time_{first,last}. See: <xref target="github_issue_17"/></t>
</section>
<section title="Suggested MIME Types">
<t>An implementer of a passive DNS Server SHOULD serve a document in this Common Output Format with a MIME header of "application/x-ndjson".</t>
</section>
</section>
@ -370,7 +373,7 @@ ws = *(
<date year="2013"/>
</front>
</reference>
<reference anchor="PDNSCERTAT" target="http://www.centr.org/system/files/agenda/attachment/rd4-papst-passive_dns.pdf">
<reference anchor="PDNSCERTAT" target="http://www.centr.org/system/files/agenda/attachment/d4-papst-passive_dns.pdf">
<front>
<title>pDNS presentation at 4th Centr R&amp;D workshop Frankfurt Jun 5th 2012</title>
<author fullname="CERT.at"/>
@ -388,7 +391,7 @@ ws = *(
<front>
<title>Passive DNS server interface using the common output format</title>
<author fullname="D4 Project, Alexandre Dulaunoy"/>
<date year="2013"/>
<date year="2019"/>
</front>
</reference>
<reference anchor="DNSDBQ" target="https://github.com/dnsdb/dnsdbq">

schema/schema.json

@ -0,0 +1,182 @@
{
"$id": "https://github.com/adulau/pdns-qof/blob/master/schema/schema.json",
"$schema": "http://json-schema.org/draft-07/schema",
"default": {},
"description": "This schema helps to validate JSON documents against the passive DNS Common Output Format. See https://www.first.org/global/sigs/passive-dns/",
"examples": [
{
"count": 962,
"time_first": 1522998917,
"time_last": 1619613241,
"rrname": "westernunion.com.ph.",
"rrtype": "A",
"bailiwick": "westernunion.com.ph.",
"rdata": [
"66.218.161.27",
"66.218.170.77"
]
}
],
"required": [
"time_first",
"time_last",
"rrname",
"rrtype",
"rdata"
],
"title": "The passive DNS Common Output Format (COF) schema",
"type": "object",
"properties": {
"count": {
"$id": "#/properties/count",
"default": 0,
"description": "Specifies how many authoritative DNS answers were received at the Passive DNS Server's collectors with exactly the given set of values as answers (i.e. same data in the answer set). The number of requests is expressed as a decimal value. This field is represented as a JSON [RFC4627] number.",
"examples": [
962
],
"title": "count of authoritative answers",
"type": "number"
},
"time_first": {
"$id": "#/properties/time_first",
"default": 0,
"description": "This field returns the first time that the record / unique tuple (rrname, rrtype, rdata) has been seen by the passive DNS. The date is expressed in seconds (decimal) since 1st of January 1970 (Unix timestamp). The time zone MUST be UTC. This field is represented as a JSON [RFC4627] number.",
"examples": [
1522998917
],
"title": "first seen",
"type": "number"
},
"time_last": {
"$id": "#/properties/time_last",
"default": 0,
"description": "This field returns the last time that the unique tuple (rrname, rrtype, rdata) record has been seen by the passive DNS. The date is expressed in seconds (decimal) since 1st of January 1970 (Unix timestamp). The time zone MUST be UTC. This field is represented as a JSON [RFC4627] number.",
"examples": [
1619613241
],
"title": "last seen",
"type": "number"
},
"rrname": {
"$id": "#/properties/rrname",
"default": "",
"description": "This field returns the name of the queried resource.",
"examples": [
"westernunion.com.ph."
],
"title": "rrname",
"type": "string"
},
"rrtype": {
"$id": "#/properties/rrtype",
"default": "",
"description": "This field returns the resource record type as seen by the passive DNS. The key is rrtype and the value is in the interpreted record type represented as a JSON [RFC4627] string. If the value cannot be interpreted the decimal value is returned following the principle of transparency as described in RFC 3597 [RFC3597]. Then the decimal value is represented as a JSON [RFC4627] number. Currently known and supported textual descriptions of rrtypes are: A, AAAA, CNAME, PTR, SOA, TXT, DNAME, NS, SRV, RP, NAPTR, HINFO, A6.",
"examples": [
"A"
],
"title": "rrtype",
"type": "string"
},
"bailiwick": {
"$id": "#/properties/bailiwick",
"default": "",
"description": "The bailiwick is the best estimate of the apex of the zone where this data is authoritative.",
"examples": [
"google.com."
],
"title": "bailiwick",
"type": "string"
},
"rdata": {
"$id": "#/properties/rdata",
"default": [],
"description": "This field returns the resource records of the queried resource. When multiple resource records are returned, rdata MUST be a JSON array containing JSON strings. In the case of a single resource record is returned, rdata MUST be a JSON string or a JSON array containing one JSON string. Each resource record is represented as a JSON string",
"examples": [
[
"8.8.8.8",
"9.9.9.9"
]
],
"title": "The rdata schema",
"type": "array",
"additionalItems": true,
"items": {
"$id": "#/properties/rdata/items",
"anyOf": [
{
"$id": "#/properties/rdata/items/anyOf/0",
"type": "string",
"title": "The first anyOf schema",
"description": "An explanation about the purpose of this instance.",
"default": "",
"examples": [
"66.218.161.27",
"66.218.170.77"
]
}
]
}
},
"sensor_id": {
"$id": "#/properties/sensor_id",
"default": 0,
"description": "This field returns the sensor information where the record was seen. It is represented as a JSON [RFC4627] string.",
"examples": [
"42"
],
"title": "sensor_id",
"type": "string"
},
"zone_time_first": {
"$id": "#/properties/zone_time_first",
"default": 0,
"description": "This field returns the first time that the unique tuple (rrname, rrtype, rdata) record has been seen via master file import. The date is expressed in seconds (decimal) since 1st of January 1970 (Unix timestamp). The time zone MUST be UTC. This field is represented as a JSON [RFC4627] number.",
"examples": [
1522998917
],
"title": "zone time first seen",
"type": "number"
},
"zone_time_last": {
"$id": "#/properties/zone_time_last",
"default": 0,
"description": "This field returns the last time that the unique tuple (rrname, rrtype, rdata) record has been seen via master file import. The date is expressed in seconds (decimal) since 1st of January 1970 (Unix timestamp). The time zone MUST be UTC. This field is represented as a JSON [RFC4627] number.",
"examples": [
1619613241
],
"title": "zone time last seen",
"type": "number"
},
"time_first_ms": {
"$id": "#/properties/time_first_ms",
"default": 0,
"description": "Same meaning as the field 'time_first', with the only difference, that the resolution is in milliseconds since 1st of January 1970 (UTC).",
"examples": [
1619690051001
],
"title": "time first seen (millisec)",
"type": "number"
},
"time_last_ms": {
"$id": "#/properties/time_last_ms",
"default": 0,
"description": "Same meaning as the field 'time_last', with the only difference, that the resolution is in milliseconds since 1st of January 1970 (UTC).",
"examples": [
1619690211002
],
"title": "time last seen (millisec)",
"type": "number"
},
"origin": {
"$id": "#/properties/origin",
"default": "",
"description": "Specifies the resource origin of the Passive DNS response. This field is represented as a Uniform Resource Identifier [RFC3986] (URI).",
"examples": [
"http://circl.lu"
],
"title": "sensor_id",
"type": "string"
}
},
"additionalProperties": true
}

View file

@ -0,0 +1,457 @@
<?xml version="1.0" encoding="US-ASCII"?>
<!-- This template is for creating an Internet Draft using xml2rfc,
which is available here: http://xml.resource.org. -->
<!DOCTYPE rfc SYSTEM "rfc2629.dtd" [
<!-- One method to get references from the online citation libraries.
There has to be one entity for each item to be referenced.
An alternate method (rfc include) is described in the references. -->
<!ENTITY RFC2119 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.2119.xml">
<!ENTITY RFC2629 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.2629.xml">
<!ENTITY RFC3552 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3552.xml">
<!ENTITY RFC1035 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.1035.xml">
<!ENTITY RFC1034 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.1034.xml">
<!ENTITY RFC4627 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.4627.xml">
<!ENTITY RFC5001 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.5001.xml">
<!ENTITY RFC3597 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3597.xml">
<!ENTITY RFC3912 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3912.xml">
<!ENTITY RFC6648 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.6648.xml">
<!ENTITY RFC2234 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.2234.xml">
<!ENTITY RFC6973 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.6973.xml">
<!ENTITY RFC3986 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3986.xml">
<!ENTITY I-D.narten-iana-considerations-rfc2434bis SYSTEM "http://xml.resource.org/public/rfc/bibxml3/reference.I-D.narten-iana-considerations-rfc2434bis.xml">
<!ENTITY I-D.draft-bortzmeyer-dnsop-dns-privacy SYSTEM "http://xml.resource.org/public/rfc/bibxml3/reference.I-D.draft-bortzmeyer-dnsop-dns-privacy">
]>
<?xml-stylesheet type='text/xsl' href='rfc2629.xslt' ?>
<!-- used by XSLT processors -->
<!-- For a complete list and description of processing instructions (PIs),
please see http://xml.resource.org/authoring/README.html. -->
<!-- Below are generally applicable Processing Instructions (PIs) that most I-Ds might want to use.
(Here they are set differently than their defaults in xml2rfc v1.32) -->
<?rfc strict="yes" ?>
<!-- give errors regarding ID-nits and DTD validation -->
<!-- control the table of contents (ToC) -->
<?rfc toc="yes"?>
<!-- generate a ToC -->
<?rfc tocdepth="4"?>
<!-- the number of levels of subsections in ToC. default: 3 -->
<!-- control references -->
<?rfc symrefs="yes"?>
<!-- use symbolic references tags, i.e, [RFC2119] instead of [1] -->
<?rfc sortrefs="yes" ?>
<!-- sort the reference entries alphabetically -->
<!-- control vertical white space
(using these PIs as follows is recommended by the RFC Editor) -->
<?rfc compact="yes" ?>
<!-- do not start each main section on a new page -->
<?rfc subcompact="no" ?>
<!-- keep one blank line between list items -->
<!-- end of list of popular I-D processing instructions -->
<rfc category="info" docName="draft-dulaunoy-dnsop-passive-dns-cof-08" ipr="trust200902">
<!-- category values: std, bcp, info, exp, and historic
ipr values: full3667, noModification3667, noDerivatives3667
you can add the attributes updates="NNNN" and obsoletes="NNNN"
they will automatically be output with "(if approved)" -->
<!-- ***** FRONT MATTER ***** -->
<front>
<title abbrev="Passive DNS - Common Output Format">Passive DNS - Common Output Format</title>
<author fullname="Alexandre Dulaunoy" initials="A."
surname="Dulaunoy">
<organization>CIRCL</organization>
<address>
<postal>
<street>16, bd d'Avranches</street>
<city>Luxembourg</city>
<region></region>
<code>L-1160</code>
<country>Luxembourg</country>
</postal>
<phone>(+352) 247 88444</phone>
<email>alexandre.dulaunoy@circl.lu</email>
<uri>http://www.circl.lu/</uri>
<!-- uri and facsimile elements may also be added -->
</address>
</author>
<author fullname="L. Aaron Kaplan" initials="A."
surname="Kaplan">
<organization></organization>
<address>
<postal>
<street>
</street>
<city>Vienna</city>
<region></region>
<code>A-1170</code>
<country>Austria</country>
</postal>
<phone></phone>
<email>aaron@lo-res.org</email>
<uri></uri>
</address>
</author>
<author fullname="Paul Vixie" initials="P."
surname="Vixie">
<organization>Farsight Security, Inc.</organization>
<address>
<postal>
<street>11400 La Honda Road</street>
<city>Woodside</city>
<region>California</region>
<code>94062</code>
<country>U.S.A.</country>
</postal>
<phone></phone>
<email>paul@redbarn.org</email>
<uri>https://www.farsightsecurity.com/</uri>
</address>
</author>
<author fullname="Henry Stern" initials="H." surname="Stern">
<organization>Farsight Security, Inc.</organization>
<address>
<postal>
<street>11400 La Honda Road</street>
<city>Woodside</city>
<region>California</region>
<code>94062</code>
<country>U.S.A.</country>
</postal>
<phone>+1 650 542-7836</phone>
<email>henry@stern.ca</email>
<uri>https://www.farsightsecurity.com/</uri>
</address>
</author>
<date month="May" year="2021" />
<area>General</area>
<workgroup>Domain Name System Operations</workgroup>
<keyword>dns</keyword>
<abstract>
<t>This document describes a common output format of Passive DNS Servers which clients can query. The output format description also includes common semantics for each Passive DNS system. By having multiple Passive DNS Systems adhere to the same output format for queries, users of multiple Passive DNS servers will be able to combine result sets easily.</t>
</abstract>
</front>
<middle>
<section title="Introduction">
<t>Passive DNS is a technique described by Florian Weimer in 2005 in <xref target="WEIMERPDNS">Passive DNS replication, F Weimer - 17th Annual FIRST Conference on Computer Security</xref>. Since then multiple Passive DNS implementations were created and evolved over time. Users of these Passive DNS servers may query a server (often via <xref target="RFC3912">WHOIS</xref> or HTTP <xref target="REST">REST</xref>), parse the results and process them in other applications.</t>
<t>
There are multiple implementations of Passive DNS software. Users of passive DNS query each implementation and aggregate the results for their search. This document describes the output format of four Passive DNS Systems (<xref target="DNSDB"/>, <xref target="DNSDBQ"/>, <xref target="PDNSCERTAT"/>, <xref target="PDNSCIRCL"/> and <xref target="PDNSCOF"/>) which are in use today and which already share a nearly identical output format.
As the format and the meaning of output fields from each Passive DNS need to be consistent, we propose in this document a solution to commonly name each field along with their corresponding interpretation. The format follows a simple key-value structure in <xref target="RFC4627">JSON</xref> format.
The benefit of having a consistent Passive DNS output format is that multiple client implementations can query different servers without having to have a separate parser for each
individual server. <xref target="PDNSCLIENT">passivedns-client</xref> currently implements multiple parsers due to a lack of standardization.
The document does not describe the protocol (e.g. <xref target="RFC3912">WHOIS</xref>, HTTP <xref target="REST">REST</xref>) nor the query format used to query the Passive DNS. Neither does this document describe "pre-recursor" Passive DNS Systems. Both of these are separate topics and deserve their own RFC document. The document describes the current best practices implemented in various Passive DNS server implementations.
</t>
<section title="Requirements Language">
<t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in <xref
target="RFC2119">RFC 2119</xref>.</t>
</section>
</section>
<section title="Limitation">
<t>As Passive DNS servers can include protection mechanisms for their operation, results might differ due to those protection measures. These mechanisms filter out DNS answers if they fail some criteria. The <xref target="BAILIWICK">bailiwick algorithm</xref> protects the Passive DNS Database from <xref target="CACHEPOISONING">cache poisoning attacks</xref>.
Another limitation that clients querying the database need to be aware of is that each query simply returns a snapshot of the data at the time of querying. Clients MUST NOT rely on consistent answers, nor may they assume that answers are identical across multiple Passive DNS Servers.
</t>
</section>
<section title="Common Output Format">
<section title="Overview">
<t>The formatting of the answer follows the <xref target="RFC4627">JSON</xref> format. In fact, it is a subset of the full JSON language. The notable difference is the modified definition of whitespace ("ws"). The order of the fields is not significant for the same resource type.</t>
<t>The intent of this output format is to be easily parsable by scripts. Each JSON object is expressed on a single line to be processed by the client line-by-line. Every implementation MUST support the JSON output format.</t> <!-- note: it is "parsable" if you want to be really nit-picking. See: https://en.wiktionary.org/wiki/parsable -->
<t><xref target="app-additional">Examples of JSON</xref> output are in the appendix.</t>
</section>
<section title="ABNF grammar">
<figure><preamble>Formal grammar as defined in <xref target="RFC2234">ABNF</xref></preamble><artwork><![CDATA[
answer = entries
entries = * ( entry CR)
entry = "{" keyvallist "}"
keyvallist = [ member *( value-separator member ) ]
member = qm field qm name-separator value
name-separator = ws %x3A ws ; a ":" colon
value = value ; as defined in the JSON RFC
value-separator = ws %x2C ws ; , comma. As defined in JSON
field = "rrname" | "rrtype" | "rdata" | "time_first" |
"time_last" | "count" | "bailiwick" | "sensor_id" |
"zone_time_first" | "zone_time_last" | "origin" |
futureField
futureField = string
CR = %x0D
qm = %x22 ; " a quotation mark
ws = *(
%x20 | ; Space
%x09 ; Horizontal tab
)
]]></artwork></figure>
<t>Note that value is defined in <xref target="RFC4627">JSON</xref> and has the exact same specification as there. The same goes for the definition of string.</t>
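<t>As a non-normative illustration of the grammar above, the following Python sketch splits an answer into entries and parses each entry as a JSON object. The function name "parse_answer" is hypothetical and not part of this specification. The sketch assumes that, besides the CR terminator given in the grammar, LF and CRLF line endings should also be tolerated, since they are common in newline-delimited JSON.</t>
<figure><artwork><![CDATA[
```python
import json
import re


def parse_answer(text):
    """Return the list of JSON objects contained in a COF answer."""
    entries = []
    # Split on CR (as in the grammar); LF and CRLF are tolerated too.
    for line in re.split(r"\r\n|\r|\n", text):
        line = line.strip(" \t")  # "ws" in the grammar: space and tab
        if not line:
            continue  # skip empty lines, e.g. after the last terminator
        entry = json.loads(line)
        if not isinstance(entry, dict):
            raise ValueError("each entry must be a JSON object")
        entries.append(entry)
    return entries
```
]]></artwork></figure>
<t>The sketch relies on a general-purpose JSON parser for "value" and "string" and therefore accepts more whitespace inside an entry than the "ws" rule allows.</t>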
</section>
<section title="Mandatory Fields">
<t>Implementations MUST support all the mandatory fields.</t>
<t>Uniqueness property: the tuple (rrname,rrtype,rdata) will always be unique within one answer per server. While rrname and rrtype are always individual JSON primitive types (strings, numbers, booleans or null), rdata MAY return multiple resource records or a single record. When multiple resource records are returned, rdata MUST be a JSON array. When a single resource record is returned, rdata MUST be a JSON string or a JSON array containing one JSON string. Senders SHOULD send an array for rdata, but receivers MUST be able to accept a single-string result for rdata.</t>
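<t>The uniqueness property gives clients a natural key for combining result sets from several servers. The following Python sketch is a non-normative illustration; the function name "merge_entries" is hypothetical. It assumes that the order of records inside rdata is not significant and that names are compared exactly as received (no case folding or trailing-dot handling). Only the mandatory fields are merged.</t>
<figure><artwork><![CDATA[
```python
def merge_entries(entries):
    """Merge COF entries that share the same (rrname, rrtype, rdata)."""
    merged = {}
    for entry in entries:
        rdata = entry["rdata"]
        if isinstance(rdata, str):
            rdata = [rdata]  # accept the single-string form
        # Assumption: the order of records inside rdata is not significant.
        key = (entry["rrname"], entry["rrtype"], tuple(sorted(rdata)))
        if key not in merged:
            merged[key] = {
                "rrname": entry["rrname"],
                "rrtype": entry["rrtype"],
                "rdata": sorted(rdata),
                "time_first": entry["time_first"],
                "time_last": entry["time_last"],
            }
        else:
            seen = merged[key]
            # Keep the earliest first-seen and the latest last-seen time.
            seen["time_first"] = min(seen["time_first"], entry["time_first"])
            seen["time_last"] = max(seen["time_last"], entry["time_last"])
    return list(merged.values())
```
]]></artwork></figure>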
<section title="rrname">
<t>This field returns the name of the queried resource. It is represented as a <xref target="RFC4627">JSON</xref> string.</t>
</section>
<section title="rrtype">
<t>This field returns the resource record type as seen by the passive DNS. The key is rrtype and the value is the interpreted record type represented as a <xref target="RFC4627">JSON</xref> string. If the value cannot be interpreted, the decimal value is returned following the principle of transparency as described in <xref target="RFC3597">RFC 3597</xref>. In that case, the decimal value is represented as a <xref target="RFC4627">JSON</xref> number.
The resource record type can be any value described by IANA in the DNS parameters document in the section 'Resource Record (RR) TYPEs' (http://www.iana.org/assignments/dns-parameters).
Supported textual descriptions of rrtypes include: A, AAAA, CNAME, etc.
A client MUST be able to understand these textual rrtype values represented as a <xref target="RFC4627">JSON</xref> string. In addition, a client MUST be able to handle a decimal value (as mentioned above) represented as a <xref target="RFC4627">JSON</xref> number.
</t>
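<t>A client can handle both representations with a small helper, sketched below in Python as a non-normative example. The function name "rrtype_to_text" is hypothetical. Rendering a numeric rrtype as "TYPE" followed by the decimal number follows the textual notation for unknown types in RFC 3597; using that notation here is a choice of this sketch, not a requirement of this document. Numbers are expected to arrive as integers.</t>
<figure><artwork><![CDATA[
```python
def rrtype_to_text(rrtype):
    """Return a textual form of an rrtype value taken from a COF entry."""
    if isinstance(rrtype, str):
        return rrtype  # already a mnemonic such as "A" or "AAAA"
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(rrtype, int) and not isinstance(rrtype, bool):
        if 0 <= rrtype <= 65535:  # an RR TYPE is a 16-bit value
            return "TYPE" + str(rrtype)  # RFC 3597 unknown-type notation
    raise ValueError("rrtype must be a JSON string or a 16-bit number")
```
]]></artwork></figure>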
</section>
<section title="rdata">
<t>This field returns the resource records of the queried resource. When multiple resource records are returned, rdata MUST be a JSON array containing JSON strings. When a single resource record is returned, rdata MUST be a JSON string or a JSON array containing one JSON string. Each resource record is represented as a <xref target="RFC4627">JSON</xref> string. Each resource record MUST be escaped as defined in section 2.6 of <xref target="RFC4627">RFC4627</xref>. Depending on the rrtype, this can be an IPv4 or IPv6 address, a domain name (as in the case of CNAMEs), an SPF record, etc. A client MUST be able to interpret any value which is legal as the right hand side in a DNS master file <xref target="RFC1035">RFC 1035</xref> and <xref target="RFC1034">RFC 1034</xref>. If the rdata came from an unknown DNS resource record type, the server must follow the transparency principle as described in <xref target="RFC3597">RFC 3597</xref>.</t>
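<t>Because rdata can arrive either as a string or as an array of strings, receivers typically normalize it first. The following Python sketch is a non-normative illustration; the function name "normalize_rdata" is hypothetical, and accepting an empty array is an assumption of the sketch rather than a statement of this document.</t>
<figure><artwork><![CDATA[
```python
def normalize_rdata(rdata):
    """Return rdata as a list of strings, whichever form was received."""
    if isinstance(rdata, str):
        return [rdata]  # single-string form
    if isinstance(rdata, list) and all(isinstance(r, str) for r in rdata):
        return list(rdata)  # array form; copy so the caller may modify it
    raise ValueError("rdata must be a JSON string or an array of strings")
```
]]></artwork></figure>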
</section>
<section title="time_first">
<t>This field returns the first time that the record / unique tuple (rrname, rrtype, rdata) has been seen by the passive DNS. The date is expressed in seconds (decimal) since 1st of January 1970 (Unix timestamp). The time zone MUST be UTC. This field is represented as a <xref target="RFC4627">JSON</xref> number.</t>
</section>
<section title="time_last">
<t>This field returns the last time that the unique tuple (rrname, rrtype, rdata) record has been seen by the passive DNS. The date is expressed in seconds (decimal) since 1st of January 1970 (Unix timestamp). The time zone MUST be UTC. This field is represented as a <xref target="RFC4627">JSON</xref> number.</t>
</section>
</section>
<section title="Optional Fields">
<t>Implementations SHOULD support one or more of the following fields.</t>
<section title="count">
<t>Specifies how many authoritative DNS answers were received at the Passive DNS Server's collectors with exactly the given set of values as answers (i.e. same data in the answer set - compare with the uniqueness property in "Mandatory Fields"). The number of requests is expressed as a decimal value. This field is represented as a <xref target="RFC4627">JSON</xref> number.</t>
</section>
<section title="bailiwick">
<t>The bailiwick is the best estimate of the apex of the zone where this data is authoritative.</t>
</section>
</section>
<section title="Additional Fields">
<t>Implementations MAY support the following fields:</t>
<section title="sensor_id">
<t>This field returns the sensor information where the record was seen. It is represented as a <xref target="RFC4627">JSON</xref> string.</t>
<t>If the data originate from sensors or probes which are part of a publicly-known gathering or measurement system (e.g. RIPE Atlas), a <xref target="RFC4627">JSON</xref> string SHOULD be prefixed.</t>
</section>
<section title="zone_time_first">
<t>This field returns the first time that the unique tuple (rrname, rrtype, rdata) record has been seen via master file import. The date is expressed in seconds (decimal) since 1st of January 1970 (Unix timestamp). The time zone MUST be UTC. This field is represented as a <xref target="RFC4627">JSON</xref> number.</t>
</section>
<section title="zone_time_last">
<t>This field returns the last time that the unique tuple (rrname, rrtype, rdata) record has been seen via master file import. The date is expressed in seconds (decimal) since 1st of January 1970 (Unix timestamp). The time zone MUST be UTC. This field is represented as a <xref target="RFC4627">JSON</xref> number.</t>
</section>
<section title="origin">
<t>Specifies the resource origin of the Passive DNS response. This field is represented as a <xref target="RFC3986">Uniform Resource Identifier</xref> (URI).
</t>
</section>
<section title="time_first_ms">
<t>Same meaning as the field "time_first", except that the resolution is in milliseconds since 1st of January 1970 (UTC).
</t>
</section>
<section title="time_last_ms">
<t>Same meaning as the field "time_last", except that the resolution is in milliseconds since 1st of January 1970 (UTC).
</t>
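<t>The relation between the second and millisecond fields can be illustrated with the following non-normative Python sketch, which converts either form to a time-zone-aware UTC timestamp. The names "EPOCH" and "seen_at" are hypothetical, and preferring the millisecond field when both are present is an assumption of the sketch.</t>
<figure><artwork><![CDATA[
```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def seen_at(entry, field):
    """Return entry[field] as an aware UTC datetime.

    field is "time_first" or "time_last"; the matching "_ms" field is
    preferred when the entry carries it (an assumption of this sketch).
    """
    ms_field = field + "_ms"
    if ms_field in entry:
        return EPOCH + timedelta(milliseconds=entry[ms_field])
    return EPOCH + timedelta(seconds=entry[field])
```
]]></artwork></figure>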
</section>
</section>
<section title="Additional Fields Registry">
<t>In accordance with <xref target="RFC6648"/>, designers of new passive DNS applications that would need additional fields can request and register new field names at https://github.com/adulau/pdns-qof/wiki/Additional-Fields.</t>
</section>
<section title="Additional notes">
<t>An implementer of a passive DNS Server MAY choose to either return time_first and time_last OR return zone_time_first and zone_time_last. In pseudocode: (time_first AND time_last) OR (zone_time_first AND zone_time_last). In this case, zone_time_{first,last} replace the time_{first,last} fields. However, this is not encouraged since it might be confusing for parsers that expect the mandatory fields time_{first,last}. See: <xref target="github_issue_17"/></t>
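<t>A parser can apply the pseudocode above directly. The following Python sketch is a non-normative illustration; the function name "has_required_times" is hypothetical, and the check only tests for the presence of the fields, not for their types.</t>
<figure><artwork><![CDATA[
```python
def has_required_times(entry):
    """Check (time_first AND time_last) OR
    (zone_time_first AND zone_time_last)."""
    observed = "time_first" in entry and "time_last" in entry
    zone = "zone_time_first" in entry and "zone_time_last" in entry
    return observed or zone
```
]]></artwork></figure>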
</section>
<section title="Suggested MIME Types">
<t>An implementer of a passive DNS Server SHOULD serve a document in this Common Output Format with a MIME header of "application/x-ndjson".</t>
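<t>On the server side, producing such a document amounts to writing one JSON object per line. The following Python sketch is a non-normative illustration; the names "COF_MIME_TYPE" and "to_ndjson" are hypothetical. It terminates each entry with LF, as is usual for newline-delimited JSON, which is an assumption of the sketch; the grammar in this document uses CR.</t>
<figure><artwork><![CDATA[
```python
import json

# Value a server would send in its Content-Type header.
COF_MIME_TYPE = "application/x-ndjson"


def to_ndjson(entries):
    """Serialize COF entries as one JSON object per line."""
    # json.dumps emits no line breaks unless "indent" is given, so every
    # object stays on a single line.
    return "".join(json.dumps(entry) + "\n" for entry in entries)
```
]]></artwork></figure>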
</section>
</section>
<!-- This PI places the pagebreak correctly (before the section title) in the text output. -->
<?rfc needLines="8" ?>
<section anchor="Acknowledgements" title="Acknowledgements">
<t>Thanks to the Passive DNS developers who contributed to the document.</t>
</section>
<!-- Possibly a 'Contributors' section ... -->
<section anchor="IANA" title="IANA Considerations">
<t>This memo includes no request to IANA.</t>
</section>
<section anchor="Privacy" title="Privacy Considerations">
<t>Passive DNS Servers capture DNS answers from multiple collecting points ("sensors") which are located on the Internet-facing side of DNS recursors ("post-recursor passive DNS"). In this process, they intentionally omit the source IP, source port, destination IP and destination port from the captured packets. Since the data is captured "post-recursor", the timing information (who queries what) is lost, since the recursor will cache the results. Furthermore, since multiple sensors feed into a passive DNS server, the resulting data gets mixed together, reducing the likelihood that Passive DNS Servers are able to find out much about the actual person querying the DNS records or about who actually sent the query. In this sense, passive DNS Servers are similar to keeping an archive of all previous phone books - if public DNS records can be compared to phone numbers - as they often are.
Nevertheless, the authors strongly encourage Passive DNS implementors to take special care of privacy issues. bortzmeyer-dnsop-dns-privacy is an excellent starting point for this.
Finally, the overall recommendations in <xref target="RFC6973">RFC6973</xref> should be taken into consideration when designing any application which uses Passive DNS data.</t>
<t>In the scope of the General Data Protection Regulation (GDPR - Regulation (EU) 2016/679), operators of Passive DNS Servers need to ensure the legal ground and lawfulness of their operation.</t>
</section>
<section anchor="Security" title="Security Considerations">
<t>In some cases, Passive DNS output might contain confidential information and access to it might be restricted. When a user queries multiple Passive DNS servers and aggregates the data, the sensitivity of the data must be considered.</t>
</section>
</middle>
<!-- *****BACK MATTER ***** -->
<back>
<!-- References split into informative and normative -->
<!-- There are 2 ways to insert reference entries from the citation libraries:
1. define an ENTITY at the top, and use "ampersand character"RFC2629; here (as shown)
2. simply use a PI "less than character"?rfc include="reference.RFC.2119.xml"?> here
(for I-Ds: include="reference.I-D.narten-iana-considerations-rfc2434bis.xml")
Both are cited textually in the same manner: by using xref elements.
If you use the PI option, xml2rfc will, by default, try to find included files in the same
directory as the including file. You can also define the XML_LIBRARY environment variable
with a value containing a set of directories to search. These can be either in the local
filing system or remote ones accessed by http (http://domain/dir/... ).-->
<references title="Normative References">
<!--?rfc include="http://xml.resource.org/public/rfc/bibxml/reference.RFC.2119.xml"?-->
&RFC2119;
&RFC1035;
&RFC1034;
&RFC3912;
&RFC4627;
&RFC5001;
&RFC3597;
&RFC6648;
&RFC2234;
&RFC6973;
&RFC3986;
</references>
<references>
<reference anchor="WEIMERPDNS" target="http://www.enyo.de/fw/software/dnslogger/first2005-paper.pdf">
<front>
<title>Passive DNS Replication</title>
<author fullname="Florian Weimer"/>
<date year="2005"/>
</front>
</reference>
<reference anchor="CACHEPOISONING" target="http://kurser.lobner.dk/dDist/DMK_BO2K8.pdf">
<front>
<title>Black ops 2008: It's the end of the cache as we know it.</title>
<author fullname="Dan Kaminsky"/>
<date year="2008"/>
</front>
</reference>
<reference anchor="BAILIWICK" target="https://archive.farsightsecurity.com/Passive_DNS/passive_dns_hardening_handout.pdf">
<front>
<title>Passive DNS Hardening</title>
<author fullname="Robert Edmonds"/>
<date year="2010"/>
</front>
</reference>
<reference anchor="PDNSCLIENT" target="https://github.com/chrislee35/passivedns-client">
<front>
<title>Queries 5 major Passive DNS databases: BFK, CERTEE, DNSParse, ISC, and VirusTotal.</title>
<author fullname="Chris Lee"/>
<date year="2013"/>
</front>
</reference>
<reference anchor="REST" target="http://www.ics.uci.edu/~fielding/pubs/dissertation/rest_arch_style.htm">
<front>
<title>Representational State Transfer (REST)</title>
<author fullname="Roy Thomas Fielding"/>
<date year="2000"/>
</front>
</reference>
<reference anchor="DNSDB" target="https://api.dnsdb.info/">
<front>
<title>DNSDB API</title>
<author fullname="Farsight Security"/>
<date year="2013"/>
</front>
</reference>
<reference anchor="PDNSCERTAT" target="http://www.centr.org/system/files/agenda/attachment/rd4-papst-passive_dns.pdf">
<front>
<title>pDNS presentation at 4th Centr R&amp;D workshop Frankfurt Jun 5th 2012</title>
<author fullname="CERT.at"/>
<date year="2012"/>
</front>
</reference>
<reference anchor="PDNSCIRCL" target="https://www.circl.lu/services/passive-dns/">
<front>
<title>CIRCL Passive DNS</title>
<author fullname="CIRCL - Computer Incident Response Center Luxembourg"/>
<date year="2012"/>
</front>
</reference>
<reference anchor="PDNSCOF" target="https://github.com/D4-project/analyzer-d4-passivedns/">
<front>
<title>Passive DNS server interface using the common output format</title>
<author fullname="D4 Project, Alexandre Dulaunoy"/>
<date year="2019"/>
</front>
</reference>
<reference anchor="DNSDBQ" target="https://github.com/dnsdb/dnsdbq">
<front>
<title>DNSDB API Client, C Version</title>
<author fullname="Paul Vixie"/>
<date year="2018"/>
</front>
</reference>
<reference anchor="github_issue_17" target="https://github.com/adulau/pdns-qof/issues/17">
<front>
<title>Discussion on the existing implementations of returning either zone_time_{first,last} OR time_{first,last}</title>
<author fullname="Paul Vixie, Weizman, April, Kaplan, et al."/>
<date year="2020"/>
</front>
</reference>
</references>
<references title="Informative References">
<!-- Here we use entities that we defined at the beginning. -->
&RFC3552;
&I-D.narten-iana-considerations-rfc2434bis;
<!-- &I-D.draft-bortzmeyer-dnsop-dns-privacy; -->
</references>
<section anchor="app-additional" title="Examples">
<t>The JSON output is shown on multiple lines for readability, but each JSON object should be on a single line.</t>
<t>If you query a passive DNS for the rrname www.ietf.org, the passive DNS common output format can be:</t>
<figure><artwork>
<![CDATA[
{"count": 102, "time_first": 1298412391, "rrtype": "AAAA",
"rrname": "www.ietf.org", "rdata": "2001:1890:1112:1::20",
"time_last": 1302506851}
{"count": 59, "time_first": 1384865833, "rrtype": "A",
"rrname": "www.ietf.org", "rdata": "4.31.198.44",
"time_last": 1389022219}
]]>
</artwork></figure>
<t>If you query a passive DNS for the rrname ietf.org, the passive DNS common output format can be:</t>
<figure><artwork>
<![CDATA[
{"count": 109877, "time_first": 1298398002, "rrtype": "NS",
"rrname": "ietf.org", "rdata": "ns1.yyz1.afilias-nst.info",
"time_last": 1389095375}
{"count": 4, "time_first": 1298495035, "rrtype": "A",
"rrname": "ietf.org", "rdata": "64.170.98.32",
"time_last": 1298495035}
{"count": 9, "time_first": 1317037550, "rrtype": "AAAA",
"rrname": "ietf.org", "rdata": "2001:1890:123a::1:1e",
"time_last": 1330209752}
]]>
</artwork></figure>
<t>Please note that the examples imply that a single query returns a single set of JSON objects. For example, two queries were made; one query returned a set of two JSON objects and the other query returned a set of three JSON objects. Each JSON object individually MUST conform to the common output format, but this specification does not require that a query return a set of JSON objects.</t>
<t>Please note that in the examples above, any backslashes "\" can be ignored and are an artifact of the tools which produced this document.</t>
</section>
</back>
</rfc>