# Approximate triangle counting algorithms on Multi-cores

## Abstract:

Counting triangles in a large network is an important research task because of its usages in analyzing large networks. However, this task becomes expensive when runs on large networks with millions of nodes and millions of edges. For efficient triangle counting on such networks, researchers in recent years have adopted approximate counting or have proposed parallel or distributed solutions. In this work, we propose an approximate triangle counting algorithm, that runs on multi-core computers through a multi-threaded implementation. We show that for a given speedup factor, our method has a better approximation accuracy; further, the multi-threaded implementation that we propose is much superior to the Hadoop-based distributed methods that earlier algorithms propose.

## Challenge and Contribution:

In terms of time complexity, the fastest triangle counting methods are based on fast matrix multiplication. The best-known algorithm has complexity O(m^{1.41}). However, the methods that are based on matrix multiplication require large amount of memory, and hence they are not suitable for counting triangles in very large graphs. For today’s large network with millions of vertices and edges, all the exact methods for triangle counting can be deemed as expensive; so, the majority of the recent efforts of triangle counting either adopt a method for approximate counting or design a parallel or distributed framework for solving the counting task effectively. For approximate counting, DOULION uses a probabilistic sparsification technique to obtain a sparser graph; then, it computes the exact triangle count on the sparse graph, from which it extrapolates an approximate triangle count of the original graph.

we propose a variant of EDGEITERATOR method for approximate triangle counting. Our method has a surprisingly high accuracy, with a generous speedup. On large real-life graphs with millions of nodes and edges, the single processor version of our algorithm consistently achieves a 30-fold speedup (compared to the best exact method) with an accuracy that is around 99%. The most attractive feature of our method is that both the speedup, and the accuracy of our method improve as the input graph becomes larger, so it is particularly suitable for very large graphs.

The simplicity of our algorithm also allows a simple multi-threaded implementation of our method for executing on today’s multi-core architecture—this improves the speedup even further without harming the counting accuracy. We indeed found that for all the real-life graphs that we encountered, the multi-core version that we propose is a better choice by a wide margin than all the Hadoop-based solutions. For a specific example, on a Wikipedia graph with 1.63 million vertices, and 18.5 millions of edges; using 32- threads out method obtains a whopping 837-fold speedup with an accuracy of 98.2%. None of the Hadoop-based solution reports a speedup that is as high as this work.

## Method:

### Exact triangle count by EXACTEI:

An edge iterator algorithm iterates over each edge *e(v _{i} ,* v

*for counting the total number of triangles for which*

_{j}) ∈ E*e*is a participating edge. Let’s call the number of triangles incident to the edge

*e*, the partial triangle count with respect to

*e*, and represent it by

*count*. Since a triangle is composed of 3 edges, a triangle will appear in exactly 3 of these partial counts. EXACTEI can obtain the total count of triangles in G by simply adding the partial counts of all the edges followed by a division by 3. The triple counting of a triangle in the above method can be avoided by imposing a restriction that the third vertex of the triangle (say,

_{e}*v*) has an id which is larger than the id’s of both the vertices

_{k}*v*and

_{i}*v*of the edge

_{j}*e*; this yields a more efficient version of EXACTEI; a pseudo-code for which is shown in Algorithm 1.

### Approximate triangle count by APPROXEI:

We design an approximate triangle counting algorithm, called APPROXEI, by computing the partial count, *count _{e}*, only for a fraction of edges. For this, APPROXEI takes an additional parameter, a sample factor

*p ∈ [0, 1]*, which defines the fraction of edges in

*E*for which we compute the partial count. A pseudo-code is shown in Algorithm 2.

### Parallel algorithm, PARAPPROXEI:

Since, the APPROXEI algorithm only performs read operations on the graph data structures, multiple threads can access these data structures without requiring any exclusive access. So, PARAPPROXEI splits the *p ∗ |E|* edges, and assign each part to several threads so that each thread can compute the partial count of the edges in its part independently. Each thread *th* maintains it’s own counter (say *count _{th}*) which contains the sum of all the

*count*processed by it. After every thread has done its share of computation, PARAPPROXEI sums the partial triangle counts from each thread (

_{e}*count*) to get the

_{th}*count*. Then, it divides the

_{partial}*count*by appropriate normalization factor to get the approximate triangle count. The process is illustrated in Figure 1.

_{partial}

## Dataset:

We perform several experiments to observe the performance of approximate triangle counting. For this, we use a collection of real-life graphs available from http://www. cise.ufl.edu/research/sparse/matrices/. The statistics of the graphs are shown in Table II.

## Result:

### APPROXEI vs DOULION:

We compare the performance of APPROXEI with that of DOULION. For that, we repeat the APPROXEI with p = 0.01 and DOULION with p = 0.1 for all the graphs from Table II. The speed up is with respect to EXACTEI. The result is shown in Table VI. Clearly, APPROXEI (single-threaded) is better than DOULION both in terms of speedup and accuracy for most of the graphs.

### PARAPPROXEI vs *GraphPartition*:

We compare the performance of PARAPPROXEI with that of *GraphPartition*(GP). Since, GP is an exact counting method, We conduct the exact triangle counting using PARAPPROXEI with p = 1 (100% sampling), and *tc* = 32, and compare the execution time. The results are shown in Table VII. In this table, times under GP column are taken from the corresponding paper, which is the time of running GP on a 1636-node Hadoop cluster. In all three cases for which we were able to collect graph from SNAP (http://snap.stanford.edu/), our method significantly wins over GP using only 32 threads!

## Code and Data:

https://github.com/DMGroup-IUPUI/TriangleCountingMultiCore-Source

## Contact:

1. Mohammad Al Hasan (alhasan@cs.iupui.edu)

2. Mahmudur Rahman (prime059@gmail.com)

## Grant: NSF CAREER Award (IIS-1149851)

## Paper:

- Mahmudur Rahman, Mohammad Al Hasan. Approximate triangle counting algorithms on Multi-cores. IEEE Big Data ’13. Workshop on Scalable Machine Learning.
- http://ieeexplore.ieee.org/document/6691744/

## Citation:

@INPROCEEDINGS{RahmanHasan2013, author={M. Rahman and M. Al. Hasan}, booktitle={Big Data, 2013 IEEE International Conference on}, title={Approximate triangle counting algorithms on multi-cores}, year={2013}, pages={127-133}, doi={10.1109/BigData.2013.6691744}, month={Oct},}