The Shape of Data

Network Science, Geometry-Based Machine Learning, and Topological Data Analysis in R
by Colleen M. Farrelly and Yaé Ulrich Gaba
July 2023, 272 pp.
Use coupon code PREORDER to get 25% off!

Download Chapter 4: BEYOND NETWORKS


The Shape of Data shows how to use geometry- and topology-based algorithms for machine learning. Focused on practical applications rather than dense mathematical concepts, the book progresses through coding examples using social network data, text data, medical data, and education data. Readers will come away with an entirely new toolkit to use in their own machine-learning work, as well as with a solid understanding of some of the most exciting algorithms being used in the field today.

Author Bio 

Colleen M. Farrelly is a senior data scientist whose academic and industry research has focused on topological data analysis, quantum machine learning, geometry-based machine learning, network science, hierarchical modeling, and natural language processing. Since graduating from the University of Miami with an MS in Biostatistics, Colleen has worked as a data scientist in a variety of industries, including health care, consumer packaged goods, biotech, nuclear engineering, marketing, and education. Colleen often speaks at tech conferences, including PyData, SAS Global, WiDS, Data Science Africa, and DataScience SALON. When not working, Colleen can be found writing haibun/haiga or doing any sort of water sport.

Yaé Ulrich Gaba completed his doctoral studies at the University of Cape Town (UCT, South Africa) with a specialization in topology and is presently a research associate at Quantum Leap Africa (QLA, Rwanda). His research interests are computational geometry, applied algebraic topology (topological data analysis), and geometric machine learning (graph and point-cloud representation learning). His current focus lies in geometric methods in data analysis, and his work seeks to develop effective and theoretically justified algorithms for data/shape analysis using geometric and topological ideas and methods.

Table of contents 

Chapter 1. Why Geometry?
Chapter 2. Introduction to Network Data
Chapter 3. Network Analysis
Chapter 4. Beyond Networks
Chapter 5. Geometry in Data Science
Chapter 6. Other Applications of Geometry in Machine Learning
Chapter 7. Topological Data Analysis
Chapter 8. Algorithms Related to Homotopy
Chapter 9. Working with Language
Chapter 10. Computational Solutions for TDA Algorithms

The chapters in red are included in this Early Access PDF.


"The title says it all. Data is bound by many complex relationships not easily shown in our two-dimensional, spreadsheet filled world. The Shape of Data walks you through this richer view and illustrates how to put it into practice."
—Stephanie Thompson, Data Scientist and Speaker

"The Shape of Data is a novel perspective and phenomenal achievement in the application of geometry to the field of machine learning. It is expansive in scope and contains loads of concrete examples and coding tips for practical implementations, as well as extremely lucid, concise writing to unpack the concepts. Even as a more veteran data scientist who has been in the industry for years now, having read this book I've come away with a deeper connection to and new understanding of my field."
—Kurt Schuepfer, Ph.D., McDonald's Corporation

“A great source for the application of topology and geometry in data science. Topology and geometry advance the field of machine learning on unstructured data, and The Shape of Data does a great job introducing new readers to the subject.”
—Uchenna “Ike” Chukwu, Senior Quantum Developer

Extra Stuff 

Click here to download the Python code files.