According to an NSA slide presentation about XKeyscore from 2013, it is a "DNI Exploitation System/Analytic Framework". DNI stands for Digital Network Intelligence, which means intelligence derived from internet traffic. In an interview with the German broadcaster Norddeutscher Rundfunk, Edward Snowden said of XKeyscore: "It's a front end search engine".

XKeyscore is considered a "passive" program, in that it listens but does not transmit anything on the networks that it targets. However, it can trigger other systems that perform "active" attacks through Tailored Access Operations. Systems that are "tipped" in this way include, for example, the QUANTUM family of programs (including QUANTUMINSERT, QUANTUMHAND, QUANTUMTHEORY, QUANTUMBOT and QUANTUMCOPPER) and Turbulence. These run at so-called "defensive sites", including Ramstein Air Base in Germany, Yokota Air Base in Japan, and numerous military and non-military locations within the US. Trafficthief, a core program of Turbulence, can alert NSA analysts when their targets communicate and can trigger other software programs, so that select data is "promoted" from the local XKeyscore data store to the NSA's "corporate repositories" for long-term storage.

Among the facilities involved in the program are four bases in
Australia and one in
New Zealand.

• F6 (Special Collection Service): a joint operation of the CIA and NSA that carries out clandestine operations, including espionage on foreign diplomats and leaders
• FORNSAT: stands for "foreign satellite collection" and refers to intercepts from satellites
• SSO (Special Source Operations): a division of the NSA that cooperates with telecommunication providers

In a single, undated slide published by Swedish media in December 2013, the following additional data sources for XKeyscore are mentioned:

• Overhead: intelligence derived from American spy planes, drones and satellites
• Tailored Access Operations: a division of the NSA that deals with hacking and cyberwarfare
• FISA: all types of surveillance approved by the Foreign Intelligence Surveillance Court
• Third party: foreign partners of the NSA, such as the (signals) intelligence agencies of Belgium, Denmark, France, Germany, Italy, Japan, the Netherlands, Norway, Sweden, etc. However, the Netherlands does not take part in any cooperation concerning intelligence gathering and sharing for illegal spying.

From these sources, XKeyscore stores "full-take data", which is scanned by plug-ins that extract certain types of metadata (like phone numbers, e-mail addresses, log-ins, and user activity) and index them in metadata tables, which can be queried by analysts. XKeyscore has been integrated with MARINA, which is the NSA's database for internet metadata.

A detailed commentary on an NSA presentation, published in The Guardian in July 2013, cites a document published in 2008 declaring that "At some sites, the amount of data we receive per day (20+ terabytes) can only be stored for as little as 24 hours."
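The arrangement described above, in which plug-ins extract metadata from stored data and index it in tables that can be queried, follows a general text-indexing pattern. The Python sketch below is a toy illustration of that pattern only; it is not based on any XKeyscore code. The names `PLUGINS`, `build_metadata_table` and `query`, the regular expressions, and the sample records are all hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical plug-ins: each maps a metadata type to a pattern that pulls
# values of that type out of raw text. Real extraction is far more involved.
PLUGINS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "phone": re.compile(r"\+\d{7,15}"),
}

def build_metadata_table(records):
    """Index extracted metadata as (type, value) -> set of record ids."""
    table = defaultdict(set)
    for record_id, text in records.items():
        for kind, pattern in PLUGINS.items():
            for value in pattern.findall(text):
                table[(kind, value.lower())].add(record_id)
    return table

def query(table, kind, value):
    """Return the sorted ids of records that contain the given metadata value."""
    return sorted(table.get((kind, value.lower()), set()))

# Made-up sample records standing in for stored raw data.
records = {
    1: "Contact alice@example.org or call +4930123456.",
    2: "Forwarded by bob@example.com to alice@example.org.",
}
table = build_metadata_table(records)
print(query(table, "email", "alice@example.org"))  # [1, 2]
```

The point of the design is that a query touches only the small metadata table rather than rescanning the raw records, which is why metadata can remain searchable even where the underlying content is kept only briefly.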
== Types of XKeyscore ==

According to a document from an internal GCHQ website, which was disclosed by the German magazine Der Spiegel in June 2014, there are three different types of the XKeyscore system:

• Traditional: The initial version of XKeyscore is fed with data from low-rate data signals after they have been processed by the WEALTHYCLUSTER system. This traditional version is used not only by the NSA but also at many intercept sites of GCHQ.
• Stage 2: This version of XKeyscore is used for higher data rates. The data is first processed by the TURMOIL system, which sends 5% of the internet data packets to XKeyscore. GCHQ only uses this version for collection under the MUSCULAR program.
• Deep Dive: This latest version can process internet traffic at data rates of 10 gigabits per second. Data that could be useful for intelligence purposes is then selected and forwarded by using the "GENESIS selection language". GCHQ also operates a number of Deep Dive versions of XKeyscore at three locations under the codename TEMPORA.
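To put the figure of 10 gigabits per second in perspective, a sustained link rate can be converted into a daily data volume. The short Python sketch below performs that conversion. It assumes a fully utilized link and decimal units (1 terabyte = 10^12 bytes), neither of which is stated in the source documents, and the function name is only illustrative.

```python
def daily_volume_terabytes(gigabits_per_second):
    """Convert a sustained link rate into terabytes per day (decimal units)."""
    bytes_per_second = gigabits_per_second * 1e9 / 8   # 8 bits per byte
    seconds_per_day = 24 * 60 * 60                      # 86,400 seconds
    return bytes_per_second * seconds_per_day / 1e12    # 1 TB = 10**12 bytes

print(daily_volume_terabytes(10))  # 108.0
```

Under these assumptions, a single saturated 10 Gbit/s link would carry about 108 terabytes per day, roughly five times the lower bound of the "20+ terabytes" per day quoted above for some sites. The two figures come from different documents and years, so the comparison is only indicative.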
== Capabilities ==

For analysts, XKeyscore provides a "series of viewers for common data types", which allows them to query terabytes of raw data gathered at the aforementioned collection sites. This enables them to find targets that cannot be found by searching only the metadata, and to do so against data sets that would otherwise have been dropped by the front-end data processing systems. According to a slide from an XKeyscore presentation, NSA collection sites select and forward less than 5% of the internet traffic to the PINWALE database for internet content.

• Look for the usage of Google Maps and terms entered into a search engine by known targets looking for suspicious things or places.
• Look for "anomalies" without any specific person attached, such as detecting the nationality of foreigners by analyzing the language used within intercepted emails. An example would be a German speaker in Pakistan. The Brazilian paper O Globo claims that this has been applied to Latin America, specifically to Colombia, Ecuador, Mexico and Venezuela.
• Detect people who use encryption by doing searches like "all PGP usage in Iran". The caveat given is that very broad queries can result in too much data to transmit back to the analyst.
• Show the usage of virtual private networks (VPNs) and machines that can potentially be hacked via TAO.
• Track the source and authorship of a document that has passed through many hands.
• On July 3, 2014, ARD revealed that XKeyscore is used to closely monitor users of the Tor anonymity network.

The Guardian revealed in 2013 that most of these things cannot be detected by other NSA tools, because those tools operate with strong selectors (like e-mail and IP addresses and phone numbers) and the raw data volumes are too high to be forwarded to other NSA databases.

In 2008, the NSA planned to add a number of new capabilities, including access to VoIP and other, unspecified network protocols, and additional forms of metadata such as Exif tags, which often include geolocation (GPS) data.

== Contribution to U.S. security ==