Data Engineering for Cybersecurity

Build Secure Data Pipelines with Free and Open-Source Tools
by James Bonifield
July 2025, 344 pp
ISBN-13: 9781718504028

Security teams rely on telemetry—the continuous stream of logs, events, metrics, and signals that reveal what’s happening across systems, endpoints, and cloud services. But that data doesn’t organize itself. It has to be collected, normalized, enriched, and secured before it becomes useful. That’s where data engineering comes in.

In this hands-on guide, cybersecurity engineer James Bonifield teaches you how to design and build scalable, secure data pipelines using free, open source tools such as Filebeat, Logstash, Redis, Kafka, and Elasticsearch. You’ll learn how to collect telemetry from Windows (including Sysmon and PowerShell events), from Linux files and syslog, and from the data streams of network and security appliances. You’ll then transform that telemetry into structured formats, secure it in transit, and automate your deployments using Ansible.

You’ll also learn how to:

  • Encrypt and secure data in transit using TLS and SSH
  • Centrally manage code and configuration files using Git
  • Transform messy logs into structured events
  • Enrich data with threat intelligence using Redis and Memcached
  • Stream and centralize data at scale with Kafka
  • Automate with Ansible for repeatable deployments

Whether you’re building a pipeline on a tight budget or deploying an enterprise-scale system, this book shows you how to centralize your security data, support real-time detection, and lay the groundwork for incident response and long-term forensics.

Author Bio 

James Bonifield has over a decade of experience analyzing malicious activity, implementing data pipelines, and training others in the security industry. He has built enterprise-scale log solutions, automated detection workflows, and led analyst teams investigating major cyber threat actors. Bonifield holds numerous certifications and enjoys spending time with his family, traveling, and tinkering with all things security and Python related.

Table of contents 

Acknowledgments
Introduction

Part I: Foundations of Secure Data Engineering
Chapter 1: Data Engineering Basics
Chapter 2: Network Encryption
Chapter 3: Source and Configuration Management

Part II: Log Extraction and Management
Chapter 4: Endpoint and Network Data
Chapter 5: Windows Logs
Chapter 6: Integrating and Storing Data
Chapter 7: Working with Syslog Data

Part III: Data Transformation and Standardization
Chapter 8: Data Manipulation Pipelines
Chapter 9: Transformation Filters

Part IV: Data Centralization, Automation, and Enrichment
Chapter 10: Centralizing Security Data
Chapter 11: Automating Tool Configurations
Chapter 12: Ansible Tasks and Playbooks
Chapter 13: Caching Threat Intelligence Data

Index