Benchmarking the Graphulo processing framework
September 13, 2016
Graph algorithms are applicable to a wide variety of domains and are often run on massive datasets. Recent standardization efforts such as the GraphBLAS specify a set of key computational kernels that hardware and software developers can adhere to. Graphulo is a processing framework that enables GraphBLAS kernels in the Apache Accumulo database. In our previous work, we demonstrated TableMult, a core Graphulo operation that performs large-scale multiplication of database tables. In this article, we present results on scaling the Graphulo engine to larger problems and on its scalability when using a greater number of resources. Specifically, we present the results of two experiments that demonstrate that Graphulo's performance scales linearly with the number of available resources. The first experiment measures cluster processing rates through Graphulo's TableMult operator on two large graphs, scaled between 2^17 and 2^19 vertices. The second experiment uses TableMult to extract a random set of rows from a large graph (2^19 vertices) to simulate a cued graph analytic. These benchmarking results are relevant to Graphulo users who wish to apply Graphulo to their own graph problems.
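
To make the second experiment concrete, the following is a minimal sketch of how row extraction can be expressed as a table multiplication. It is an illustration in plain Python, not Graphulo's API: the names table_mult and row_selector, the dict-of-dicts table layout, and the four-vertex toy graph are all assumptions introduced here. The idea is that multiplying on the left by a selector table with a single 1 for each chosen row returns exactly those rows of the graph's adjacency table.

```python
import random


def table_mult(A, B):
    """Sparse product C = A * B over tables stored as {row: {col: value}}.

    Hypothetical helper for illustration; it uses ordinary + and *,
    whereas GraphBLAS kernels allow other semirings.
    """
    C = {}
    for i, row in A.items():
        for k, a_ik in row.items():
            # Only rows of B that match a column of A contribute.
            for j, b_kj in B.get(k, {}).items():
                out_row = C.setdefault(i, {})
                out_row[j] = out_row.get(j, 0) + a_ik * b_kj
    return C


def row_selector(rows):
    """Selector table with a single 1 at (r, r) for each chosen row r."""
    return {r: {r: 1} for r in rows}


# Toy adjacency table (assumed data): graph[u][v] = 1 means an edge u -> v.
graph = {
    "v1": {"v2": 1, "v3": 1},
    "v2": {"v3": 1},
    "v3": {"v1": 1},
    "v4": {"v1": 1, "v2": 1},
}

# Pick a random set of rows, as a stand-in for the "cue" of the analytic.
rng = random.Random(0)
cued = rng.sample(sorted(graph), 2)

# Multiplying by the selector keeps only the cued rows of the graph.
subgraph = table_mult(row_selector(cued), graph)
```

In this sketch the work done by the multiplication is proportional to the number of entries in the selected rows rather than to the size of the whole graph, which is the property that makes the approach attractive for cued analytics on large tables.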