Research Funding

  • NSF CAREER Award (Award # IIS-1149851)

Our Projects

1. Interactive Pattern Mining on Hidden Data

Mining frequent patterns from a hidden dataset is an important task with various real-life applications. In this research, we propose a solution to this problem that is based on Markov Chain Monte Carlo (MCMC) sampling of frequent patterns.

2. A Generic Framework for Interactive Personalized Interesting Pattern Discovery

In this work, we propose an interactive pattern discovery framework named PRIIME, which identifies a set of interesting patterns for a specific user without requiring any prior input from the user on the interestingness measure of patterns. The framework is generic: it supports the discovery of interesting set, sequence, and graph patterns.

3. Smart Home Exploration Through Interactive Pattern Discovery

In this paper, we introduce a new home discovery tool called RAVEN. It uses interactive feedback over a collection of home feature-sets to learn a buyer's interestingness profile. It then recommends a small list of homes that match the buyer's interests.

4. An Iterative MapReduce based Frequent Subgraph Mining Algorithm

In this work, we propose a frequent subgraph mining algorithm called FSM-H which uses an iterative MapReduce-based framework. FSM-H is complete as it returns all the frequent subgraphs for a given user-defined support, and it is efficient as it applies all the optimizations that the latest FSM algorithms adopt.

5. Representing Graphs as Bag of Vertices and Partitions for Graph Classification

In this work, we propose a novel approach to graph classification that uses two alternative graph representations: the bag of vertices and the bag of partitions. For the first representation, we use deep-learning-based node features; for the second, we use traditional metric-based features.

6. Waiting to be Sold: Prediction of Time-Dependent House Selling Probability

In this work, we propose a supervised regression (Cox regression) model inspired by survival analysis to predict the sale probability of a house given historical home sale information within an observation time window.
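For readers unfamiliar with survival analysis, the standard Cox proportional hazards formulation is sketched below. The symbols are assumptions introduced here for illustration, not notation from the project: x is a house's feature vector, β the learned coefficients, h_0 the baseline hazard, and t the time on the market. The project's exact model may differ.

```latex
\begin{align}
h(t \mid x) &= h_0(t)\, \exp(\beta^{\top} x), \\
S(t \mid x) &= \exp\!\bigl(-H_0(t)\, \exp(\beta^{\top} x)\bigr),
  \qquad H_0(t) = \int_0^t h_0(s)\, ds, \\
P(\text{sold by } t \mid x) &= 1 - S(t \mid x).
\end{align}
```

Under this reading, the time-dependent sale probability of a house is one minus its survival function, that is, one minus the probability that the house is still unsold at time t.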

7. DyLink2Vec: Effective Feature Representation for Link Prediction in Dynamic Networks 

A novel method for metric embedding of node-pair instances in a dynamic network. DyLink2Vec models the metric embedding task as an optimal coding problem whose objective is to minimize the reconstruction error, and it solves this optimization task using a gradient descent method.

8. GraTFEL: Link Prediction in Dynamic Networks using Graphlet

A novel method for graphlet-transition-based feature representation of node-pair instances. GraTFEL applies unsupervised feature learning to graphlet-transition-based features to produce a low-dimensional feature representation of the node-pair instances.

9. GRAFT: an Approximate Graphlet Counting Algorithm for Large Graph Analysis

A simple yet powerful algorithm that obtains the approximate frequency of all graphlets that have up to 5 vertices.

10. GUISE: A Uniform Sampler for Constructing Frequency Histogram of Graphlets 

GUISE uses Markov Chain Monte Carlo (MCMC) sampling to construct the approximate graphlet frequency distribution (GFD) of a large network.

11. Approximate triangle counting algorithms on Multi-cores

An approximate triangle counting algorithm that runs on multi-core computers through a multi-threaded implementation.
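The description above does not name the approximation technique, so the following is only a minimal sketch of one common approach, edge sparsification, and not necessarily the project's algorithm. Each worker keeps every edge independently with probability keep_prob, counts triangles exactly in the sparsified graph, and rescales by 1 / keep_prob^3; the estimates from all workers are then averaged. All function and parameter names are hypothetical. The sketch assumes the edge list contains each undirected edge once, with no self-loops. In CPython the global interpreter lock limits true parallelism for this kind of work, so the thread pool only illustrates the structure.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def count_triangles(edges):
    """Exact triangle count; assumes unique undirected edges, no self-loops."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    # Common neighbors of an edge's endpoints = triangles containing that edge.
    total = sum(len(adj[u] & adj[v]) for u, v in edges)
    return total // 3  # every triangle is seen once per each of its 3 edges


def sparsified_estimate(edges, keep_prob, seed):
    """One estimate: sample edges, count exactly, rescale by 1 / keep_prob^3."""
    rng = random.Random(seed)
    kept = [e for e in edges if rng.random() < keep_prob]
    return count_triangles(kept) / keep_prob ** 3


def approx_triangles(edges, keep_prob=0.5, workers=4, seed=0):
    """Average independent sparsified estimates, one per worker thread."""
    seeds = range(seed, seed + workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        estimates = list(
            pool.map(lambda s: sparsified_estimate(edges, keep_prob, s), seeds)
        )
    return sum(estimates) / len(estimates)


if __name__ == "__main__":
    # Complete graph on 5 vertices, used as a small illustrative input.
    k5 = [(i, j) for i in range(5) for j in range(i + 1, 5)]
    print(count_triangles(k5), approx_triangles(k5, keep_prob=0.7))
```

A triangle survives sparsification only if all three of its edges are kept, which happens with probability keep_prob^3; this is why the rescaling gives an unbiased estimate. Lower values of keep_prob trade accuracy for speed.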

12. Bayesian Non-Exhaustive Classification A Case Study: Online Name Disambiguation using Temporal Record Streams

A Bayesian non-exhaustive classification framework for solving the online name disambiguation task in the digital library domain.

13. Sampling triples from restricted networks using MCMC strategy

Two indirect triple sampling methods based on a Markov Chain Monte Carlo (MCMC) sampling strategy. Triple-MCMC samples a triple by performing an MCMC walk on an imaginary triple sample space. Vertex-MCMC performs an MCMC walk on the original network to sample a node, then samples a triple centered at the selected node.
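The Vertex-MCMC idea can be illustrated with a small Metropolis-Hastings sketch. Several details here are assumptions for illustration rather than the project's actual design: a triple is taken to be a center node plus two of its distinct neighbors, the walk targets a node distribution proportional to d(d-1)/2 (the number of triples centered at a node of degree d) so that the sampled triples are approximately uniform, and the graph is an undirected adjacency dictionary whose nodes of degree two or more form a connected subgraph. The function name and parameters are hypothetical. Only the neighbor lists of the current and candidate nodes are read, in keeping with the restricted-access setting.

```python
import random


def vertex_mcmc_triples(graph, start, num_samples, burn_in=500, seed=0):
    """Sample (a, center, b) triples with a Metropolis-Hastings node walk.

    graph: dict mapping each node to a list of its neighbors (undirected).
    Assumed target: pi(v) proportional to d(v) * (d(v) - 1) / 2.
    """
    if len(graph[start]) < 2:
        raise ValueError("start node must have at least two neighbors")
    rng = random.Random(seed)
    current = start
    samples = []
    for step in range(burn_in + num_samples):
        # Propose a uniformly random neighbor of the current node.
        candidate = rng.choice(graph[current])
        # With proposal 1/d(current), the acceptance ratio for the assumed
        # target simplifies to (d(candidate) - 1) / (d(current) - 1).
        ratio = (len(graph[candidate]) - 1) / (len(graph[current]) - 1)
        if rng.random() < min(1.0, ratio):
            current = candidate
        if step >= burn_in:
            # Pick two distinct neighbors of the sampled center node.
            a, b = rng.sample(graph[current], 2)
            samples.append((a, current, b))
    return samples


if __name__ == "__main__":
    # Small illustrative graph; node 4 has degree 1 and is never a center.
    toy = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1, 3], 3: [0, 2, 4], 4: [3]}
    for triple in vertex_mcmc_triples(toy, start=0, num_samples=5):
        print(triple)
```

Because a degree-1 candidate has an acceptance ratio of zero, the walk never settles on a node that cannot be the center of a triple. Consecutive samples are correlated, so in practice one would typically thin the chain or run several independent walks.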