Past Projects


Low Activity

Energy Efficient Datacenters
Scalable EDA-GP






BeatDB is a scalable machine learning framework that lets researchers build predictive models by mining and analyzing physiological waveforms. One of its strengths is its flexibility: users define the features to track, the conditions to detect, and the filters to apply with short scripts. BeatDB is integrated with Amazon Web Services (AWS), so computations can run in parallel in the cloud. Together, these features let researchers and scientists cut the time needed for prediction studies and data processing without sacrificing the parameterization and data-specific tailoring possible with custom (and often single-use) scripts.

"Patients Like Me" for Precision Medicine (PM2)

In data-driven precision medicine, fast yet accurate prediction of acute and critical events from sensor time series is crucial, especially in intensive care units. In such settings promptness is essential, so if a task can be completed dramatically faster, a slight decrease in accuracy is often acceptable. To address these challenges, we are developing techniques for scalable patient record retrieval and event prediction based on locality-sensitive hashing (LSH). LSH offers significantly faster querying than linear, exhaustive k-nearest neighbor search while keeping accuracy in a competitive range. Prediction based on LSH is essentially a two-step process: first, quickly retrieve "patients like me", the approximate nearest neighbors of the query of interest, using LSH; second, extrapolate from the information of those nearest neighbors to make a prediction.
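The two-step process described above can be sketched in a few lines of Python. This is a minimal illustration, not the project's implementation: it assumes random-hyperplane LSH (which approximates cosine similarity), a single hash table, and a majority vote over the retrieved neighbors. The function names, the toy records, and the labels are all hypothetical.

```python
import random
from collections import Counter, defaultdict


def make_hyperplanes(n_bits, dim, seed=0):
    """Draw n_bits random hyperplanes (through the origin) in dim dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def signature(vec, planes):
    """Hash a vector to a bit tuple: one bit per hyperplane, set by the side it falls on."""
    return tuple(
        1 if sum(p * x for p, x in zip(plane, vec)) >= 0 else 0
        for plane in planes
    )


def build_index(records, planes):
    """Group record ids into buckets keyed by their LSH signature."""
    index = defaultdict(list)
    for rec_id, (vec, _label) in records.items():
        index[signature(vec, planes)].append(rec_id)
    return index


def predict(query, records, index, planes, k=3):
    """Step 1: fetch the query's bucket. Step 2: vote among its k closest members."""
    candidates = index.get(signature(query, planes), [])
    if not candidates:
        return None  # no approximate neighbors found in this bucket

    def sq_dist(rec_id):
        vec = records[rec_id][0]
        return sum((a - b) ** 2 for a, b in zip(vec, query))

    nearest = sorted(candidates, key=sq_dist)[:k]
    votes = Counter(records[rec_id][1] for rec_id in nearest)
    return votes.most_common(1)[0][0]


# Toy records: id -> (feature vector, outcome label). All values are made up.
records = {
    "p1": ([1.0, 2.0], "event"),
    "p2": ([2.0, 4.0], "event"),
    "p3": ([-1.0, -2.0], "no_event"),
}
planes = make_hyperplanes(n_bits=4, dim=2)
index = build_index(records, planes)
prediction = predict([4.0, 8.0], records, index, planes)
```

Only the records in the query's bucket are compared exactly, which is where the speedup over exhaustive search comes from. A practical system would typically use several hash tables to reduce the chance of missing true neighbors.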


OSaaS is an integrative machine learning and econometric framework that lets researchers build prediction and causation models from observational data. OSaaS seeks to aid decision making in contexts where the ground truth is not well known and the effectiveness of one's interventions is uncertain. By leveraging big data in different domains (health, transportation, commercial), OSaaS seeks to make explicit and transparent the modeling choices made by researchers that ultimately inform key decisions (which patient receives which treatment and when, transportation infrastructure and policy decisions, insurance pricing).


The Human Data Interaction Project

The goal of the Human Data Interaction Project is to develop methods at the intersection of data science, machine learning, and large-scale interactive systems to answer a rather simple question: why does it take so long to process, analyze, and derive insights from data? The answer lies in developing technology and a new cadre of methodologies grounded in a detailed understanding of how humans (scientists, researchers, analysts, sales staff, marketing staff, and every one of us) interact with data to analyze it, interpret it, and derive insights from it. Having ventured into multiple domains and applications, we observe that the process of generating insights from data can be organized into the following steps: organize, preprocess, understand, learn models, generate insights, and disseminate.


Wind Energy

We are developing a variety of machine learning approaches to provide powerful, scalable statistical tools to the wind energy community. These include modeling methods such as Bayesian networks, copula-based dependence modeling, and Gaussian processes, as well as optimization approaches based on sampling and generative methods. We collaborate closely with AWS Truepower. We have identified three areas where machine learning and information technology can help improve wind system performance. The first supports building a wind farm through resource assessment. The second optimizes the layout of a farm given turbine models, farm constraints, and the wind resource. We also developed an optimal power routing algorithm for large farms that improves their efficiency (the current implementation of the algorithm is used commercially in OpenWind). Finally, we are interested in developing techniques that improve forecasting accuracy, helping the seamless integration of wind into our energy portfolio.



Systems and machine learning projects: Autotuning in PetaBricks, Zetabricks, D-TEC, Data-Mining Virtual Machines for Resource Optimization, Resource Allocation in Virtual Machines for Energy Efficient Data Centers and Clouds, Application Counter Intelligence, Meta-Optimization: Improving Compilation with Genetic Programming.


Hierarchical Genetic Algorithms for Parallelization of Sparse Matrix Algebra

High-performance computing on a multicore processor demands efficient parallelization. While dense matrices can be efficiently distributed among cores without concern for inter-chip transport costs, sparse matrix algebra requires consideration of data distribution and transport costs. In collaboration with Lincoln Labs, we have teamed a hierarchical GA with a fine-grained computation model. The GAs (inner and outer) adaptively determine an efficient processor mapping for sparse matrix multiplication with respect to data processing and transport costs.



We are developing scalable algorithms for a variety of NP-hard problems in networks. These problems emerge in ad hoc wireless networks and sensor networks. We have designed distributed algorithms for network coding.


Meta-Optimization: Improving Compilation with Genetic Programming

We used genetic programming to automatically generate application-specific and general compiler priority functions. These functions are known as the "Achilles heel" of compilers because designers typically develop them by hand and test them on problem instances that rapidly drift out of date. Our priority functions worked in the context of hyperblock scheduling and register allocation. A PowerPoint from a PLDI presentation is available as a PDF.


Support Vector Machines: Performance Analysis

Support Vector Machines are an example of a recently developed machine learning algorithm that has rapidly been adopted by a wide range of application programmers as a means of performing classification and regression.


Multi-Objective Optimization Algorithm Design

We are investigating how design knowledge can easily be elicited from an expert designer and exploited by an algorithm that returns to the designer a suite of Pareto-optimal (i.e. non-dominated) designs. These designs present different tradeoffs with respect to multiple objectives and allow the designer or a control algorithm to choose between them. The choice can be updated according to the current critical performance specifications. The technical challenge is to efficiently explore the space of possible solutions with scalable techniques that accommodate high dimensionality and multiple objectives.
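To make "non-dominated" concrete, here is a small sketch that filters a set of candidate designs down to its Pareto front. It assumes every objective is to be minimized and uses a simple quadratic-time comparison; the objective pairs (power, delay) and their values are made up for illustration and are not taken from the project.

```python
def dominates(a, b):
    """True if design a is no worse than b on every objective and strictly better on one.

    Assumes all objectives are minimized.
    """
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(designs):
    """Keep only the designs that no other design dominates."""
    return [d for d in designs if not any(dominates(other, d) for other in designs)]


# Hypothetical designs scored as (power, delay); lower is better for both.
designs = [(1.0, 9.0), (2.0, 5.0), (3.0, 6.0), (4.0, 2.0), (5.0, 3.0)]
front = pareto_front(designs)
```

Here (3.0, 6.0) is dropped because (2.0, 5.0) is better on both objectives, and (5.0, 3.0) is dropped because of (4.0, 2.0). The three remaining designs each represent a different tradeoff the designer can choose among.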


Hybrid Machine-Learning and Optimization

Convex optimization techniques such as geometric programming and semi-definite programming are powerful tools for design and optimization. However, they require the design problem to be modeled with a specific formulation such as a posynomial/monomial objective, constraint or sum-of-squares objective. This is often not straightforward to accomplish accurately.
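For reference, the restriction mentioned above can be written out. The textbook standard form of a geometric program, stated here as general background rather than as the project's own formulation, is:

```latex
\begin{aligned}
\text{minimize}\quad   & f_0(x) \\
\text{subject to}\quad & f_i(x) \le 1, \quad i = 1, \dots, m, \\
                       & g_j(x) = 1,  \quad j = 1, \dots, p,
\end{aligned}
\qquad
g(x) = c \, x_1^{a_1} x_2^{a_2} \cdots x_n^{a_n}, \quad c > 0,\ a_k \in \mathbb{R}
```

Each $g_j$ is a monomial of the form shown on the right, each $f_i$ is a posynomial (a sum of monomials), and the variables $x_1, \dots, x_n$ are positive. A performance model that contains, for example, a negative coefficient or a difference of terms does not fit this form directly, which is one reason an accurate formulation can be hard to obtain.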


Analog Reconfigurable Systems

Model-free methods such as evolutionary algorithms allow reconfigurable systems to adapt or self-tune based solely on performance feedback. Analog reconfigurable systems have potential payoffs in two areas.


Adaptive Resource Allocation

Computer architecture and application complexity are rapidly increasing. With the adoption of multi-core processors for desktop computing, workloads are less predictable because applications are more complex in terms of thread parallelism and have diverse computational demands. Decentralized adaptive strategies within the operating system or runtime system are a potentially scalable solution for handling this complexity. We investigate computational economic mechanisms that allow individual software components to introspect on their performance and adapt their run-time resource requests as they would in a marketplace of sellers and consumers.
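As a rough illustration of what such a mechanism could look like, the sketch below uses a proportional-share market: each component submits a bid, the resource is divided in proportion to the bids, and each component adjusts its next bid based on whether it met its own performance target. This particular mechanism, the function names, and the numbers are assumptions chosen for illustration; the text above does not specify which economic mechanism the project uses.

```python
def allocate(bids, capacity):
    """Split capacity among components in proportion to their bids."""
    total = sum(bids.values())
    if total <= 0:
        return {name: 0.0 for name in bids}
    return {name: capacity * bid / total for name, bid in bids.items()}


def adapt_bid(bid, achieved, target, step=0.1, min_bid=0.01):
    """Raise the bid when performance falls short of the target, lower it otherwise."""
    if achieved < target:
        return bid * (1.0 + step)
    return max(min_bid, bid * (1.0 - step))


# Hypothetical components bidding for 8 units of a shared resource (e.g. cores).
bids = {"renderer": 1.0, "indexer": 3.0}
shares = allocate(bids, capacity=8.0)
```

Each component only needs its own performance measurement to update its bid, so no central scheduler has to understand every application's behavior. That locality is what makes a decentralized scheme of this kind attractive as the number of components grows.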


Evolvable Hardware

We have developed an evolvable hardware testbench named GRACE. GRACE's software component includes an evolutionary algorithm that generates sized analog circuit topologies. Evolved circuit designs are tested directly in silicon: each is dynamically configured on a Field Programmable Analog Array and then exercised with an input signal while its output behaviour is captured and evaluated. GRACE is extensible. We plan to use a highly complex reconfigurable circuit environment to evolve complex circuits such as an ADC.