
Hacks, Leaks, and Revelations

by Micah Lee
April 2023, 352 pp.

In the age of hacking and whistleblowing, the internet holds massive troves of leaked information rich with newsworthy revelations in the public interest, if you know how to unravel them. Whether you’re an investigative journalist or an amateur researcher, this book gives you the technical expertise to find and interrogate complex datasets, transforming unintelligible files into groundbreaking reports.

Through hands-on assignments and examples that highlight real-world cases, information security expert and well-known investigative journalist Micah Lee guides you through the process of analyzing leaked datasets from governments, companies, and political groups. You’ll dig into hacked files from the BlueLeaks dataset of law enforcement records, analyze social media traffic from those behind the 2021 insurrection at the US Capitol, hear the exclusive story of privately leaked data from the anti-vaccine group America’s Frontline Doctors, and much more.


You’ll also learn:

  • Technical skills and Python programming basics needed for data science investigations
  • Security concepts, like disk encryption
  • How to work with data in JSON, CSV, and SQL formats
  • Tricks for using the command-line interface to explore datasets packed with secrets

Author Bio

Micah Lee is the Director of Information Security at First Look Media, parent company of The Intercept, and is known for helping secure Edward Snowden's communications while Snowden leaked secret NSA documents. Micah previously worked for the Electronic Frontier Foundation and is currently an advisor to the transparency collective Distributed Denial of Secrets. He is also a co-founder of the Freedom of the Press Foundation and a Tor Project core contributor, and he develops open source security and privacy tools like OnionShare and Dangerzone. In his spare time, he likes competing in CTF contests and playing Dungeons & Dragons.

Table of Contents

Part 1: Sources and Datasets
Chapter 1: TBD
Chapter 2: Protecting Sources and Yourself
Chapter 3: Acquiring Datasets

Part 2: Tools of the Trade
Chapter 4: The Command Line Interface
Chapter 5: Explore Datasets in the Terminal
Chapter 6: Docker, Aleph, and Making Datasets Searchable
Chapter 7: Reading Other People's Email

Part 3: Programming and Structured Data
Chapter 8: A Brief Introduction to Python Programming
Chapter 9: The Many CSV Files of BlueLeaks
Chapter 10: Parler, the Insurrection of January 6, and the JSON File Format
Chapter 11: BlueLeaks Explorer
Chapter 12: SQL Database and Epik Fail

Part 4: Case Studies
Chapter 13: Pandemic Profiteers Making Millions From COVID-19 Disinformation
Chapter 14: Neo-Nazis and Their Chat Rooms

Appendix A: Using the Windows Subsystem for Linux
Appendix B: Texas GOP's Website Backup

The chapters in red are included in this Early Access PDF.