Guardian publishes NSA training slides showing XKeyscore processes 1-2% of global internet traffic through approximately 700 servers at 150 sites worldwide

Glenn Greenwald — The Guardian, July 31, 2013

Edward Snowden states in Guardian interview that XKeyscore allows analysts to search 'nearly everything a typical user does on the internet' and calls it 'the most terrifying' NSA program

Ewen MacAskill and Glenn Greenwald — The Guardian, July 31, 2013

NSA training materials claim more than 300 terrorists captured using intelligence from XKeyscore database queries

Top Secret NSA Training Materials — Disclosed July 2013

NSA documents indicate XKeyscore retains full content data for 3-5 days and metadata for up to 45 days in rolling databases, with selected data moved to permanent storage

Glenn Greenwald and Spencer Ackerman — The Guardian, July 31, 2013

Der Spiegel reports German intelligence service BND received XKeyscore from NSA in 2007 and operated installations at Bad Aibling and headquarters in Pullach

Laura Poitras, Marcel Rosenbach, and Holger Stark — Der Spiegel, September 20, 2013

Digital Veil · Part 17 of 17 · Case #9917

XKeyscore Is the NSA's Tool for Searching Its Vast Collection of Internet Data — Emails, Chats, Browser Histories, and Social Media — In Near Real Time. Edward Snowden Called It 'the Most Terrifying' Program He Encountered.

XKeyscore is the NSA's front-end search tool, which allows analysts to query massive databases of intercepted internet communications without prior authorization. Disclosed by Edward Snowden in July 2013, the system processes roughly 1 to 2 percent of all global internet traffic and retains content for three to five days and metadata for up to 45 days. Training materials revealed that analysts could search by email address, IP address, phone number, or even the specific search terms a target had used. Unlike PRISM, which required court oversight for targeting U.S. persons, XKeyscore gave analysts direct access to raw intelligence data with minimal oversight.

700+: XKeyscore servers worldwide as of 2013

1-2%: Of global internet traffic processed

45 days: Maximum metadata retention period

150: Deployment sites across Five Eyes nations

The Architecture of Total Surveillance

On July 31, 2013, The Guardian published a detailed investigation of a National Security Agency system that Edward Snowden had described in stark terms: XKeyscore was "the most terrifying" surveillance program he had encountered during his tenure with NSA contractors. Unlike PRISM, which required Foreign Intelligence Surveillance Court approval for targeting U.S. persons, XKeyscore gave NSA analysts direct access to search vast databases of intercepted communications without prior judicial authorization. Training materials disclosed by Snowden revealed that the system processed approximately 1 to 2 percent of all global internet traffic—billions of individual records daily—through a network of roughly 700 servers deployed at 150 sites worldwide.

XKeyscore is not itself a collection program. It functions as the search interface, the front end that allows thousands of NSA analysts to query data gathered through other surveillance operations. The system integrates with multiple NSA databases including MARINA, which stores internet metadata; PINWALE, which stores email and instant message content; and data streams from upstream collection (direct tapping of fiber optic cables) and downstream programs like PRISM. What made XKeyscore particularly powerful—and particularly concerning to privacy advocates—was its scope and accessibility. An analyst with XKeyscore access could conduct sophisticated searches across multiple data sources simultaneously using selectors as specific as an email address or as broad as a country code, retrieving results within seconds.

700+

Servers worldwide. NSA training materials from 2008 indicated XKeyscore operated through approximately 700 servers deployed at roughly 150 sites globally, processing between 1 and 2 percent of all internet traffic passing through monitored communications infrastructure.

The July 2013 disclosure included dozens of pages of NSA training slides that detailed XKeyscore's capabilities with remarkable specificity. Analysts could search by email address, IP address, phone number, or even by specific search terms a target had previously used. The system included features for real-time monitoring, allowing an analyst to watch a target's internet activity as it occurred. One training slide stated plainly: "My target uses Google Maps to scope target locations—can I use this information to determine his email address? What about the web searches—do any stand out? …XKeyscore." Another slide showed how an analyst could identify targets based on "anomalous events" including the use of encryption or anonymization services.

Data Retention and Database Architecture

XKeyscore's power derived partly from its integration with NSA's massive data storage infrastructure. According to documents, the system retained full content data—the actual text of emails, chat messages, and browsing activity—for 3 to 5 days due to storage capacity limitations. Metadata, which includes information about communications without the content itself (sender, recipient, timestamp, IP addresses, file sizes), could be retained for up to 45 days. Communications meeting specific intelligence criteria could be moved from rolling databases to permanent storage, where they remained searchable indefinitely.

This rolling retention model meant that XKeyscore functioned as both a real-time surveillance tool and a retrospective investigation system. If an analyst identified a person of interest through current intelligence, XKeyscore allowed queries extending days or weeks into the past across both content and metadata databases. The MARINA database, specifically designed for metadata storage, extended this retrospective capability further with retention periods up to one year. A March 2013 NSA document indicated MARINA processed over 20 billion internet communications events daily, creating a comprehensive map of digital communications patterns.
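The rolling retention model described above can be illustrated with a small sketch. The Python below is not drawn from any disclosed document; it is a generic model of a store that ages records out by type, using the reported 3 to 5 day content window (taken here as 5 days) and the 45 day metadata window. The class name, the field names, and the `promoted` flag standing in for "moved to permanent storage" are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Retention windows as reported in the documents; the upper bound of
# the 3-5 day content window is an assumption made for this sketch.
RETENTION = {
    "content": timedelta(days=5),
    "metadata": timedelta(days=45),
}


@dataclass
class Record:
    kind: str                # "content" or "metadata"
    captured_at: datetime
    promoted: bool = False   # hypothetical flag: moved to permanent storage


def is_searchable(record: Record, now: datetime) -> bool:
    """True if the record is inside its rolling window or was promoted."""
    if record.promoted:
        return True
    return now - record.captured_at <= RETENTION[record.kind]


def purge(records: list[Record], now: datetime) -> list[Record]:
    """Drop every record that has aged out of its window."""
    return [r for r in records if is_searchable(r, now)]


now = datetime(2013, 7, 31)
records = [
    Record("content", now - timedelta(days=2)),
    Record("content", now - timedelta(days=10)),
    Record("metadata", now - timedelta(days=30)),
    Record("metadata", now - timedelta(days=60)),
    Record("content", now - timedelta(days=400), promoted=True),
]
kept = purge(records, now)
print(len(kept))  # 3
```

Under these assumptions, the ten-day-old content record and the sixty-day-old metadata record fall out of the rolling window, while the promoted record stays searchable regardless of age.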

"I, sitting at my desk, could wiretap anyone, from you or your accountant, to a federal judge or even the president, if I had a personal email."

Edward Snowden — The Guardian Interview, July 31, 2013

The database architecture supporting XKeyscore reflected NSA's collection priorities and technical constraints. PINWALE, which stored email and instant message content, operated with severe storage limitations given the volume of global communications. The 3 to 5 day retention window for most content represented a practical compromise between intelligence value and infrastructure costs. By contrast, MARINA's metadata could be stored efficiently in vast quantities, enabling long-term pattern analysis of communications networks without requiring the storage space that full content would demand.

The Plugin Ecosystem

XKeyscore operated through a modular plugin architecture that made the system highly adaptable. Documents published by The Guardian and Der Spiegel revealed hundreds of specialized plugins designed to process different types of intercepted data and target specific communications platforms. Plugins had names that directly indicated their function: FACEBOOKSEARCH for querying Facebook data, MAIL_HEADER for analyzing email metadata, DNI_CHAT for instant messages, and XKEYSCORE_PHONE_HOME for tracking when target devices connected to the internet.

This plugin system allowed NSA to expand XKeyscore's capabilities continuously without rebuilding core infrastructure. As new communications platforms emerged—new social media services, new chat applications, new encryption methods—NSA could develop new plugins to process and search that data. Some plugins focused on technical analysis, extracting metadata from network traffic. Others targeted content, parsing messages for keywords or analyzing social network connections. Still others addressed operational challenges like de-anonymization, attempting to identify users behind VPN services, Tor nodes, or proxy servers.
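A modular design of this kind is often built around a registry that maps a data type to a handler. The sketch below shows that general pattern in Python and nothing more: the registry, the decorator, the `email_headers` type name, and the parser are hypothetical, and none of it reflects how the disclosed system was actually implemented.

```python
from typing import Callable, Optional

# Hypothetical registry mapping a traffic type to a parser function.
# All names here are illustrative and not taken from any document.
PLUGINS: dict[str, Callable[[bytes], Optional[dict]]] = {}


def plugin(traffic_type: str):
    """Decorator that registers a parser for one traffic type."""
    def register(func):
        PLUGINS[traffic_type] = func
        return func
    return register


@plugin("email_headers")
def parse_email_headers(payload: bytes) -> Optional[dict]:
    """Extract the From and To fields from a raw header block."""
    fields = {}
    for line in payload.decode("utf-8", errors="replace").splitlines():
        name, sep, value = line.partition(":")
        if sep and name.lower() in ("from", "to"):
            fields[name.lower()] = value.strip()
    return fields or None


def dispatch(traffic_type: str, payload: bytes) -> Optional[dict]:
    """Route a payload to its plugin; unknown types are skipped."""
    handler = PLUGINS.get(traffic_type)
    return handler(payload) if handler else None


result = dispatch(
    "email_headers",
    b"From: a@example.org\nTo: b@example.org\nSubject: hi",
)
print(result)
```

The point of the pattern is that supporting a new data type means adding one more decorated function; `dispatch` and the rest of the core stay unchanged.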

45 days

Metadata retention. While content data was typically retained for 3-5 days due to storage constraints, XKeyscore systems retained internet metadata for up to 45 days in rolling databases, with selected data moved to permanent storage.

Training materials showed that analysts could combine multiple plugins in single queries. An example from disclosed documents demonstrated searching for all Facebook messages containing specific keywords sent from a particular country to recipients in another country within a defined time window. Another example showed tracking email addresses associated with a specific IP address, then identifying all other email accounts that communicated with those addresses, effectively mapping a target's communications network across multiple degrees of separation.
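Mapping a network "across multiple degrees of separation" is, in graph terms, a bounded breadth-first search. The following Python is a textbook illustration on a toy graph with made-up addresses; the function name and data layout are assumptions, and it is not a reconstruction of any disclosed query.

```python
from collections import deque


def contacts_within(graph, start, max_hops):
    """Breadth-first search returning {address: hop_count} for every
    address reachable from `start` in at most `max_hops` steps."""
    hops = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if hops[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in graph.get(node, ()):
            if neighbor not in hops:
                hops[neighbor] = hops[node] + 1
                queue.append(neighbor)
    return hops


# Toy undirected communication graph (hypothetical addresses).
graph = {
    "a@example.org": ["b@example.org", "c@example.org"],
    "b@example.org": ["a@example.org", "d@example.org"],
    "c@example.org": ["a@example.org"],
    "d@example.org": ["b@example.org", "e@example.org"],
    "e@example.org": ["d@example.org"],
}
reach = contacts_within(graph, "a@example.org", max_hops=2)
print(reach)
```

With a two-hop limit the search reaches `d@example.org` but not `e@example.org`, which sits three hops from the starting address. The example also shows why each additional hop can widen the set of people swept in considerably.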

Legal Authorities and Oversight Mechanisms

The legal framework governing XKeyscore usage was complex and contested. The system queried data collected under multiple authorities including Section 702 of the Foreign Intelligence Surveillance Act (FISA), Executive Order 12333, and third-party collection sharing agreements with foreign intelligence services. Each authority came with different legal requirements and oversight mechanisms, creating what critics described as a patchwork of rules that analysts could navigate to access desired information.

Section 702 collection, which included PRISM data from technology companies, required Foreign Intelligence Surveillance Court certification of annual targeting procedures and minimization procedures designed to protect U.S. persons' privacy. However, XKeyscore also queried data collected under Executive Order 12333—the authority governing intelligence collection outside the United States—which did not require FISA Court approval. Analysts were instructed through training that targeting U.S. persons required a warrant or FISA Court order, but the system itself did not prevent searches that might violate those requirements. Oversight relied substantially on analysts' adherence to training and internal compliance reviews conducted after the fact.

Data Type | Retention Period | Primary Database
Email/Chat Content | 3-5 days | PINWALE
Internet Metadata | 45 days | XKEYSCORE
Communications Metadata | Up to 1 year | MARINA
Selected Intelligence | Indefinite | Permanent storage

NSA Director Keith Alexander defended XKeyscore in congressional testimony on July 31, 2013, the same day The Guardian published detailed revelations about the system. Alexander stated that queries required "reasonable articulable suspicion" linking a selector to a foreign intelligence target, and that violations were rare and unintentional. However, documents suggested a more permissive operational reality. One training slide stated that analysts could begin surveillance of anyone by typing an email address or other selector into the system. Another indicated that while certain queries required approval from shift supervisors, many searches could be conducted based solely on the analyst's judgment that foreign intelligence justification existed.

International Deployment and Five Eyes Integration

XKeyscore's global deployment extended beyond NSA facilities. The system was integrated into the intelligence infrastructure of the Five Eyes alliance—the intelligence sharing partnership among the United States, United Kingdom, Canada, Australia, and New Zealand established during World War II and formalized in subsequent decades. Documents indicated that Government Communications Headquarters (GCHQ) in the UK operated XKeyscore installations that processed data from GCHQ's own collection operations, including Tempora, which tapped directly into fiber optic cables carrying internet traffic.

A September 2013 article in The Guardian revealed that GCHQ processed approximately 600 million telephone events daily through XKeyscore servers. The intelligence sharing arrangement meant that NSA analysts could potentially query data collected by GCHQ, and GCHQ analysts could query data collected by NSA, creating what privacy advocates described as a means of circumventing domestic legal restrictions. If U.S. law prevented NSA from collecting certain categories of Americans' communications, but GCHQ collected that data through its own operations, the data could still potentially be searchable by U.S. analysts through shared XKeyscore infrastructure.

150

Global deployment sites. NSA training materials indicated XKeyscore operated at approximately 150 sites worldwide, including NSA facilities, Five Eyes partner locations, and installations at cooperating intelligence services in allied nations.

Germany's Bundesnachrichtendienst (BND) represented another significant XKeyscore deployment. Documents published by Der Spiegel in 2013 revealed that NSA had provided BND with XKeyscore software in 2007 and offered system upgrades through 2013. German parliamentary inquiries following the Snowden disclosures examined BND's use of the system. Intelligence officials confirmed XKeyscore operations at facilities including BND headquarters in Pullach and Bad Aibling Station, a former NSA site transferred to German control. The revelations sparked intense controversy in Germany, particularly following disclosures that NSA had monitored Chancellor Angela Merkel's phone, raising questions about the terms of intelligence cooperation between allied nations.

Technical Capabilities and Operational Usage

Training materials provided granular detail about XKeyscore's search capabilities. Analysts could construct queries using Boolean operators (AND, OR, NOT) and wildcard characters, similar to commercial search engines but applied to vast databases of intercepted communications. The system supported searching by sender or recipient email addresses, IP addresses, phone numbers, language, geographic location codes, and keywords or phrases appearing in message content. One training slide showed how to identify all encrypted Microsoft Word documents sent to or from Pakistan, then extract metadata about senders and recipients for further investigation.
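Boolean operators and wildcards of the kind described here compose in the same way in any search system. The Python below is a minimal sketch of that composition over a handful of invented records; the field names, the country codes "XA" and "XB", and the helper functions are hypothetical and say nothing about the real query syntax.

```python
from fnmatch import fnmatchcase


def field(name, pattern):
    """Predicate: record[name] matches a shell-style wildcard pattern."""
    return lambda rec: fnmatchcase(str(rec.get(name, "")), pattern)


def AND(*preds):
    return lambda rec: all(p(rec) for p in preds)


def OR(*preds):
    return lambda rec: any(p(rec) for p in preds)


def NOT(pred):
    return lambda rec: not pred(rec)


# Invented records; "XA" and "XB" are placeholder country codes.
records = [
    {"sender": "alice@example.org", "country": "XA", "filetype": "doc"},
    {"sender": "bob@example.net", "country": "XA", "filetype": "pdf"},
    {"sender": "carol@example.org", "country": "XB", "filetype": "doc"},
]

query = AND(
    field("sender", "*@example.org"),
    OR(field("country", "XA"), field("filetype", "pdf")),
    NOT(field("country", "XB")),
)

matches = [r["sender"] for r in records if query(r)]
print(matches)  # ['alice@example.org']
```

Only the first record satisfies all three clauses: the second fails the sender wildcard, and the third fails both the OR clause and the NOT clause.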

XKeyscore included real-time monitoring capabilities allowing analysts to watch targets' internet activity as it occurred. A training slide titled "Show Me All the VPN Startups in Country X" demonstrated how analysts could identify users of virtual private networks—services often used to encrypt communications and mask IP addresses. The system could flag when targets began using encryption or anonymization tools, capabilities that raised concerns among privacy and security advocates who argued that privacy-protecting technologies were being treated as indicators of suspicious activity.

"The NSA's claim that it is protecting Americans' privacy is 'bullshit.' They're deliberately misleading us and lying about it."

William Binney, Former NSA Technical Director — Statement to The Guardian, 2013

Documents indicated XKeyscore was used not only for counterterrorism but for broader intelligence gathering. Training materials referenced searches related to diplomatic negotiations, economic intelligence, and monitoring of foreign government officials. A 2008 NSA document claimed that more than 300 terrorists had been captured using intelligence derived from XKeyscore queries, but provided no details about how intelligence from the system contributed to those captures or what proportion of XKeyscore usage focused on counterterrorism versus other intelligence objectives.

The Disclosure and Its Aftermath

Edward Snowden's decision to disclose XKeyscore stemmed from what he described as a fundamental disconnect between public understanding of NSA surveillance and operational reality. In his July 31, 2013 interview with The Guardian, Snowden explained that he could read anyone's email, watch their real-time internet activity, or see photos they uploaded simply by having their email address. The claim was contested by NSA officials who emphasized that such activities would violate policy and be detected through audits, but the technical capability Snowden described was consistent with the training materials he disclosed.

The Guardian's publication of XKeyscore documents followed earlier revelations about PRISM, NSA's bulk telephone metadata collection, and other surveillance programs. The XKeyscore disclosure was significant because it showed the practical mechanics of how NSA analysts accessed and searched intercepted communications. While earlier revelations had documented what NSA collected, XKeyscore showed how that collected data became operationally useful—how an analyst sitting at a desk could query billions of records and retrieve specific emails, chats, or browsing histories within seconds.

20+ billion

Daily metadata events. NSA's MARINA database, queryable through XKeyscore, processed over 20 billion internet communications metadata events daily as of March 2013, according to internal NSA documentation.

Congressional responses to the XKeyscore disclosures divided along predictable lines. Intelligence Committee chairs defended the program as legal, necessary, and subject to appropriate oversight. Critics, led by Senator Ron Wyden and Representative Justin Amash, argued that XKeyscore demonstrated that surveillance had extended far beyond what Congress intended to authorize and what the public had been told. Civil liberties organizations including the American Civil Liberties Union cited XKeyscore in lawsuits challenging NSA surveillance as unconstitutional under the Fourth Amendment's prohibition of unreasonable searches.

Competing Interpretations and Continuing Debate

The legal and constitutional questions raised by XKeyscore remain contested. Supporters argue that the system operates within legal authorities granted by Congress and approved by the FISA Court, that it focuses on legitimate foreign intelligence targets, and that oversight mechanisms prevent abuse. They note that Snowden's claim that he could unilaterally wiretap anyone, including the president, was true only in the narrow sense that the system's interface would accept such a query, not that the search would be legal or go undetected. Compliance systems, they argue, would flag the unauthorized query and trigger investigation and prosecution.

Critics counter that a system requiring post-hoc compliance review rather than prior authorization represents precisely the kind of general warrant that the Fourth Amendment was designed to prohibit. They argue that the technical capability to conduct warrantless searches of Americans' communications creates unacceptable risks regardless of oversight mechanisms, because oversight depends on the willingness of the executive branch to enforce restrictions on itself. Furthermore, critics note that even if analysts generally follow rules against targeting U.S. persons, XKeyscore's search of foreign targets inevitably captures Americans' communications with those targets, and the system's metadata queries can reveal sensitive information about Americans' associations, movements, and activities without ever accessing content.

A May 2015 ruling by the U.S. Court of Appeals for the Second Circuit in ACLU v. Clapper provided partial vindication for critics. While the case focused specifically on bulk telephone metadata collection under Section 215 of the PATRIOT Act rather than XKeyscore directly, the court ruled that the metadata program exceeded statutory authority. The court's opinion noted that the government's interpretation of its surveillance authorities would allow collection that Congress had not clearly authorized. Congress subsequently passed the USA FREEDOM Act, which prohibited bulk collection of telephone records under Section 215, though it left intact other authorities under which XKeyscore-accessible data is collected.

Technical Evolution and Current Status

Information about XKeyscore's current capabilities and deployment remains classified. The system disclosed in 2013 represented NSA infrastructure and policies as they existed before Snowden's departure in May 2013. NSA has presumably modified both its technical systems and its internal policies in response to the disclosures, but the extent of any changes is not public. Congressional oversight committees receive classified briefings about NSA surveillance systems, and the content of those briefings is likewise unavailable to the public.

What is clear is that the fundamental surveillance architecture disclosed in 2013 continues to operate. The legal authorities under which data flows into XKeyscore-queryable databases—Section 702 of FISA, Executive Order 12333, and intelligence sharing agreements—remain in effect. Congress reauthorized Section 702 in 2018 with modest modifications but without fundamental reforms that would prevent the kind of searching capabilities XKeyscore enables. The FISA Court continues to approve surveillance applications at rates exceeding 99 percent. And NSA's budget, while classified, remains substantial with congressional appropriations supporting continued signals intelligence operations.

300+

Terrorist captures claimed. A 2008 NSA document asserted that more than 300 terrorists had been captured using intelligence derived from XKeyscore database queries, though no details were provided about how the system contributed to those captures.

Technology companies have responded to the Snowden disclosures by implementing stronger encryption for data in transit and at rest, making communications more difficult to intercept and read even if captured. Apple, Google, Facebook, and Microsoft all expanded encryption of user data following 2013. However, these measures protect against some forms of surveillance more effectively than others. Encryption protects data as it travels across the internet but may not protect against collection at endpoints or against companies' own access to user data. And metadata—information about who communicates with whom, when, and from where—often remains accessible even when content is encrypted.

The Architecture of Accountability

The XKeyscore disclosure illuminated fundamental tensions in democratic governance of intelligence agencies. The system was developed and deployed under legal authorities approved by Congress and the FISA Court. It operated for years before public disclosure, during which time congressional oversight committees received classified briefings about NSA surveillance capabilities. Yet when details became public, many members of Congress expressed surprise at the scope and capabilities of surveillance systems their committees had been briefed on in classified sessions. This gap between classified briefings to oversight committees and public accountability raises questions about whether oversight mechanisms designed in the 1970s remain adequate for 21st-century surveillance technologies.

The FISA Court's role in this architecture remains particularly contested. Between 1979 and 2012, the court approved 33,942 surveillance warrants while rejecting only 11 outright—an approval rate exceeding 99.9 percent. Defenders note that this statistic reflects the fact that applications are carefully prepared by government attorneys to meet legal standards before submission, and that the court frequently requires modifications to applications before approval. Critics argue that an ex parte court where only the government presents arguments, with an approval rate approaching 100 percent, cannot provide meaningful oversight of executive branch surveillance activities.
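The approval rate quoted above follows directly from the two figures given. As a quick check in Python, treating 33,942 as approvals and 11 as outright rejections, and ignoring modified or withdrawn applications (a simplifying assumption):

```python
approved = 33_942
rejected = 11

# Share of decided applications that were approved.
rate = approved / (approved + rejected)
print(f"{rate:.4%}")  # 99.9676%
```

That works out to roughly one outright rejection for every 3,000 applications decided.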

The Snowden disclosures, including the details about XKeyscore, sparked reforms but left fundamental surveillance authorities intact. The USA FREEDOM Act of 2015 ended NSA's bulk collection of telephone metadata under Section 215, requiring the agency to obtain specific court orders for records held by telecommunications companies. But the Act did not address upstream internet collection, Section 702 surveillance, or Executive Order 12333 collection—the authorities under which most data accessible through XKeyscore is gathered. Congressional debates about reauthorizing Section 702 in 2017 and 2018 included proposals for warrant requirements before searching Section 702 databases for Americans' communications, but those proposals failed.

A Search Engine for Everything

XKeyscore represents a category of technology that raises questions extending beyond specific legal authorities or oversight mechanisms. The system demonstrates that given sufficient resources and legal authorities, an intelligence agency can create infrastructure allowing analysts to search vast swaths of human communication and internet activity nearly in real time. The technical capability exists. The legal authorities, whether adequate or inadequate, have been granted. The oversight mechanisms, whether effective or ineffective, are in place. The question is not whether such a system can be built—it has been built—but whether democratic societies should permit such systems to exist.

Edward Snowden's characterization of XKeyscore as "the most terrifying" NSA program reflected his assessment that the system's combination of scope, accessibility, and minimal prior authorization created risks fundamentally different from those of earlier surveillance technologies. A wiretap requires a court order identifying a specific target. XKeyscore allows analysts to search billions of records using criteria that might identify targets not previously known or suspected, then work backward to build cases for further investigation. This inversion—surveillance infrastructure first, individualized suspicion later—represents precisely what critics argue the Fourth Amendment was designed to prevent.

More than a decade after XKeyscore's disclosure, the fundamental issues remain unresolved. The system, in some form, continues to operate. The legal authorities supporting it remain in effect. The oversight mechanisms have been modified but not fundamentally reformed. The technical capabilities have almost certainly expanded as storage becomes cheaper and analysis tools more sophisticated. What changed in 2013 was public awareness. The architecture of mass surveillance, long suspected by privacy advocates and documented in fragments by earlier disclosures, became undeniable. Whether that awareness will lead to meaningful constraints on surveillance capabilities remains an open question.

Primary Sources

[1] Glenn Greenwald, XKeyscore: NSA tool collects 'nearly everything a user does on the internet', The Guardian, July 31, 2013

[2] Ewen MacAskill and Glenn Greenwald, NSA Prism program taps in to user data of Apple, Google and others, The Guardian, June 6, 2013

[3] Laura Poitras, Marcel Rosenbach, and Holger Stark, 'Partner and Target': NSA Snooped on European Union Offices, Der Spiegel, September 20, 2013

[4] James Ball, NSA's Prism surveillance program: how it works and what it can do, The Guardian, June 8, 2013

[5] Barton Gellman and Laura Poitras, U.S., British intelligence mining data from nine U.S. Internet companies in broad secret program, The Washington Post, June 7, 2013

[6] Spencer Ackerman, NSA collected US email records in bulk for more than two years under Obama, The Guardian, June 27, 2013

[7] Keith Alexander, Testimony before House Permanent Select Committee on Intelligence, C-SPAN, July 31, 2013

[8] ACLU v. Clapper, 785 F.3d 787 (2d Cir. 2015)

[9] Privacy and Civil Liberties Oversight Board, Report on the Surveillance Program Operated Pursuant to Section 702 of FISA, July 2, 2014

[10] Charlie Savage, Power Wars: The Relentless Rise of Presidential Authority and Secrecy, Little, Brown and Company, 2015

[11] Glenn Greenwald, No Place to Hide: Edward Snowden, the NSA, and the U.S. Surveillance State, Metropolitan Books, 2014

[12] Luke Harding, The Snowden Files: The Inside Story of the World's Most Wanted Man, Vintage Books, 2014

[13] Barton Gellman, Dark Mirror: Edward Snowden and the American Surveillance State, Penguin Press, 2020

[14] Julia Angwin, Dragnet Nation: A Quest for Privacy, Security, and Freedom in a World of Relentless Surveillance, Times Books, 2014

[15] Bruce Schneier, Data and Goliath: The Hidden Battles to Collect Your Data and Control Your World, W. W. Norton & Company, 2015
