This sixth book in the MIT Lincoln Laboratory Series describes how mathematics can simplify data to help analysts solve problems that involve large, diverse datasets.
Mathematics of Big Data: Spreadsheets, Databases, Matrices, and Graphs

The rapid explosion of digitally available data in banking, commerce, science and engineering, transportation, healthcare, and the social sciences has amplified the problems of storing and analyzing data. To organize huge datasets into constructs that they can manipulate with some ease, people have developed spreadsheets, databases, matrices, and graphs. Yet the datasets are often still unmanageable, and analyzing them consumes considerable time and computation. The new book "Mathematics of Big Data: Spreadsheets, Databases, Matrices, and Graphs" presents a mathematical approach to taming big data: using associative arrays to streamline data processing.

The book, the latest in the MIT Lincoln Laboratory Series published by MIT Press, is written by Jeremy Kepner, a Lincoln Laboratory Fellow and the founder of the Lincoln Laboratory Supercomputing Center, and Hayden Jananthan, a mathematics educator and a researcher at Lincoln Laboratory.

"Data provides insight into solutions for practical problems. The goal of this book is to provide readers with the concepts and techniques that will allow them to adapt to increasing data volume, velocity, and variety," Kepner said. "Nothing handles big like mathematics."

The book is divided into three sections. The first section, Applications and Practice, introduces the concept of the associative array and applies associative arrays to graph analysis and machine learning systems. The second section, Mathematical Foundations, provides a mathematically rigorous definition of associative arrays. The third section, Linear Systems, extends the mathematical concepts of linearity to associative arrays. Each chapter ends with a series of exercises that allow readers to test their understanding of the material.

In the foreword to the book, Charles Leiserson, professor of computer science and engineering at MIT and head of the Supertech Research Group in MIT's Computer Science and Artificial Intelligence Lab, said, "Jeremy Kepner and Hayden Jananthan will lead you on a voyage to relearn everything you know about matrices, graphs, databases, and spreadsheets, viewing their tight interrelationships through the lens of associative arrays. Or if you're new to these topics, they will lead you on a journey of discovery, teaching them to you without the arbitrary constraints of traditional educational narratives."

"Mathematics of Big Data" is available for purchase in print or e-book at the MIT Press website.
