The
Diffie–Hellman key exchange by itself does not provide protection against a man-in-the-middle attack. To ensure that the attacker is indeed not present in the first session (when no shared secrets exist), the
Short Authentication String (SAS) method is used: the communicating parties verbally cross-check a shared value displayed at both endpoints. If the values do not match, a man-in-the-middle attack is indicated. A specific attack theorized against the ZRTP protocol involves creating a synthetic voice of both parties to read a bogus SAS which is known as a "
Rich Little attack", but this class of attack is not believed to be a serious risk to the protocol's security. The SAS is used to authenticate the key exchange, which is essentially a
cryptographic hash of the two Diffie–Hellman values. The SAS value is rendered to both ZRTP endpoints. To carry out authentication, this SAS value is read aloud to the communication partner over the voice connection. If the values on both ends do not match, a man-in-middle attack is indicated; if they do match, a man-in-the-middle attack is highly unlikely. The use of hash commitment in the DH exchange constrains the attacker to only one guess to generate the correct SAS in the attack, which means the SAS may be quite short. A 16-bit SAS, for example, provides the attacker only one chance out of 65536 of not being detected.
Key continuity ZRTP provides a second layer of authentication against a MitM attack, based on a form of key continuity. It does this by caching some hashed key information for use in the next call, to be mixed in with the next call's DH shared secret, giving it key continuity properties analogous to
SSH. If the MitM is not present in the first call, he is locked out of subsequent calls. Thus, even if the SAS is never used, most MitM attacks are stopped because the MitM was not present in the first call. ==Operating environment==