KNO Protocol Specification, version 0

This document defines the network protocol for communicating with the KNO. This is version 0 ("v0") of the protocol.

In this specification, the term "number" refers to an ordered sequence of bytes (octets). When a number is transmitted over TCP, the bytes are sent in order, with no framing, and when a number is hashed, the bytes are fed to the hash function in order. Since the protocol is defined in terms of byte sequences, the concept of byte order does not apply at the protocol level (except for the timestamp); interpreting the sequences of bytes as encodings of integers or bit streams is a higher level concept discussed separately in the section titled Number Encoding.

Let "client" refer to the entity that wishes to determine whether a number is known or not, and let "U" refer to that number.

Let "server" refer to the Known Number Oracle (KNO).

Let the H(X) denote the SHA-256 hash of X.

Let the vertical bar "|" denote concatenation.

Let 0 denote the number consisting of 256/8 = 32 bytes of all zero bits.

To initiate the protocol, the client establishes a TCP connection to v0.kno.araneus.fi port 6349.

The client sends the message M0 = H(U).

The server receives M0 and generates a new 256-bit random number (nonce) T. If the server is able to identify a known number K such that H(K) = M0, it responds with the message M1 = H(T | K); this is the commitment. If it is unable to identify such a number, it responds with M1 = H(T | 0). Note that the latter M1 is somewhat arbitrary; its purpose is to serve as a placeholder that the client cannot distinguish from a commitment to a valid proof of knowledge of U until T is revealed, so any random-looking string of the appropriate length would work. The specified response could be interpreted as the commitment to a proof that the number 0 is known, which seems a reasonably truthful statement to make in the absence of more useful knowledge.

The client receives the server's response message M1.

The client sends the message M2 = U.

The server receives M2 and verifies that H(M2) is equal to the M0 received earlier. A mismatch is a protocol error.

The server sends M3 = T | TS; this is the reveal. TS is a timestamp indicating when the number U became known to the KNO. If the number was not previously known, this is the current time, and subsequent responses identifying the number as known should contain the same value. The timestamp is transmitted as an unsigned 64-bit integer count of seconds since the POSIX epoch, most significant byte first. The timestamp is informational only; there is no way for the client to verify that the KNO is providing truthful timestamps or that they have not been modified in transit.

The client receives M3, splits it into the nonce T and timestamp TS, and checks whether H(T | U) is equal to M1. If equal, the client determines that U must already have been known to the server at the time when it sent message M1, before U itself was transmitted in M3, and therefore U must be considered a "known number". If unequal, no such determination is made.

Multiple queries, each consisting of the four messages M0 through M3, may be conducted sequentially over a single TCP connection. There is no provision for multiple concurrent or interleaved queries over the same connection. Once the client has no further queries to perform, it should immediately close the TCP connection; connections should not be kept idle. To preserve resources, the server may close idle connections after a timeout of a few seconds.

In case of a protocol error, the TCP connection is immediately closed.

Number Encoding

If a number that you wish to submit to the KNO is not already a sequence of 16 bytes, you have to decide how to encode it. When the same number is submitted by two different users, the KNO will only be able to detect a match if they are encoded identically, so it is useful to agree on a common standard for encoding. This section defines such a set of encoding rules; this is version 1 ("v1") of the rules.

If the number is a bit stream, such as the output from a random bit generator, the first bit is sent as the most significant bit of the first byte, the next bit as the next-to-most significant bit of the first byte, etc.

If the number is an integer, it is first converted to binary form. The binary form is then converted to a 128-bit number by padding with zero bits at the most significant end (if the number is less than 128 bits) or by omitting the most significant bits (if the number is more than 128 bits). The number is then transmitted in big-endian order (the eight most significant bits in the first octet, etc).