Joining ALFA

Contact alfa-apply (@) csail.mit.edu to apply
Include:  project of interest by name, relevant courses and grades, relevant experience, expected year of graduation, and CV.

 

 

Contents

MIT Students

Join from outside MIT

 

 


MIT STUDENTS


 

Summer 2019 Research Opportunity

We have openings for a summer research project in one of our ongoing projects. This opportunity is best suited to students who are in their senior year or graduating and are considering an M.Eng.

 

ML for Program Analysis

There's an unprecedented amount of software being written. This also means there's an unprecedented amount of buggy code out there. We want to introduce a fresh perspective on analyzing bugs and security flaws in programs. We think we can model code statistically (like text) and predict where issues will occur in it. In this project, we want to collect large code corpora and design novel neural architectures to perform analysis and reasoning tasks on them. This project will introduce you to hands-on program analysis, designing and implementing novel deep learning models, and rigorous data analysis.
Preferred pre-reqs: Any/all of 6.033, 6.035, 6.036, 6.864, 6.867
 
To get in touch, write to alfa-apply (@) csail.mit.edu with the subject line "Summer research - ML for program analysis".
Include: project of interest by name, relevant courses and grades, relevant experience, expected year of graduation, and CV.
 
 

SuperUROP & UROP Projects

 

Contact alfa-apply (@) csail.mit.edu to apply
Include:  project of interest by name, relevant courses and grades, relevant experience, expected year of graduation, and CV.
 
 

SuperUROP Fall 2019

Cloud-based semi-supervised GAN training

Training machine learning models with only a small amount of labeled data may produce poor results. However, semi-supervised learning with GANs, which trains on a small set of labeled data together with unlabeled data, can considerably improve learning accuracy. Cloud-based distributed deep learning systems that apply spatial coevolution improve the scalability of GAN training, yielding more reliable methods. We want to improve semi-supervised learning with GANs by applying these distributed training methods, and we believe this approach will significantly improve semi-supervised learning accuracy. This project will introduce you to hands-on deep learning through designing and implementing new learning models, distributed systems design, and related topics.

 

How do K-12 students learn programming?

How do K-12 students learn how to program? Knowing how to read and write code is becoming more and more important. However, a book about C is not every K-12 student's cup of tea. Instead, K-12 students use MIT AppInventor online, which is designed for learning programming. So what does learning to program with AppInventor look like? For example, can we compare it with how students learn with Python?

 

AI and Regulatory Technology

Public and private organisations, e.g. the IRS and the Federal Reserve, often require rules and regulations to govern their subjects. A lack of clear regulation can lead to chaos and other unintended consequences. We are interested in how AI and software engineering methodologies can aid the creation of regulatory technology that helps improve the crafting and enforcing of regulations, e.g. by detecting tax regulation loopholes, enforcing regulatory compliance, and guiding the crafting of better regulations.

 

Statistical Models of Computer Programs

There's an unprecedented amount of software being written. This also means there's an unprecedented amount of buggy code out there. We want to introduce a fresh perspective on analyzing bugs and security flaws in programs. We think we can model code statistically (like text) and predict where issues will occur in it. In this project, we want to collect large code corpora and design novel neural architectures to perform analysis and reasoning tasks on them. This project will introduce you to hands-on program analysis, designing and implementing novel deep learning models, and rigorous data analysis.
When you apply, please share your CV, your grades in the coursework listed below or in related coursework, and a link to your GitHub account (if you have one).
Preferred pre-reqs: Any/all of 6.033, 6.035, 6.036

 

UROP Spring 2019

Data Science for Online Education

More and more material is available for learning online, e.g. on Massive Open Online Course platforms. While online students learn, their activities can be tracked and later used for research, or even fed through machine learning models to help guide instruction and improve learning. ALFA applies data science and machine learning to such activity data. Join us in developing new technology to help learners learn better and instructors teach better. We are working on deep learning modeling, analyzing student program attempts, predictive modeling, and transfer learning. We are examining edX's MITx courses named "6.00.1x Introduction to Programming and Python" and "6.00.2x Introduction to Computational Thinking and Data Science".

 

Analysis of bugs and flaws in large computer program corpora

There's an unprecedented amount of software being written. This also means there's an unprecedented amount of buggy code out there. We want to introduce a fresh perspective on analyzing bugs and security flaws in programs. But in order to do so, we first want to understand in depth the specific, existing tools and techniques used to analyze programs. In this project, we want to collect and clean large corpora of computer programs and run them through various industrial-strength program analysis tools. This project will introduce UROPs to systems engineering, learning and using open-source tools, and rigorous data analysis.
Preferred pre-reqs: Any/all of 6.033, 6.035, 6.036

 

Statistical models of computer programs

There's an unprecedented amount of software being written. This also means there's an unprecedented amount of buggy code out there. We want to introduce a fresh perspective on analyzing bugs and security flaws in programs. We think we can model code statistically (like text) and predict where issues will occur in it. In this project, we want to collect and clean large code corpora, build various statistical models, and study how well they predict bugs. This project will introduce UROPs to systems engineering, learning and using open-source tools, and rigorous data analysis.
Preferred pre-reqs: Any/all of 6.033, 6.035, 6.036

 

Data-pipeline and APIs for crunching large data

There's an unprecedented amount of data being generated today. In order to analyze and make sense of GB- to TB-sized datasets, we generally need the support of strong systems infrastructure. In this project, we want to explore where the limits of pandas and NumPy in Python lie, and understand how we can smartly work around these limitations while remaining frugal with additional computational resources. This project will introduce UROPs to systems engineering, learning and using open-source tools, and rigorous data analysis.
Preferred pre-reqs: Any/all of 6.033, 6.035, 6.036

 

Analyzing how humans commit programming errors across languages

Understanding how we write computer programs, a complex cognitive task, remains an open question. One way to gain insight into this process is to understand how we make mistakes: how we start with an idea and oftentimes fail to appreciate and follow the semantics demanded by a programming language, inadvertently introducing bugs. In this work, we want to explore mistakes of a similar nature made across different programming languages. For instance, we want to understand how programmers inadvertently introduce race conditions in different programming languages like Python or Java. We want to perform a systematic study of what some of these errors are and how they manifest themselves in public code repositories. Such a study will help us build interpretable statistical models to recreate and predict these errors, and possibly to prevent programmers from committing them, irrespective of the programming language they write in. Such interpretable models will eventually help us understand the gaps in programmers' understanding of concepts that lead to such errors. This work will introduce UROPs to doing exploratory science, engineering systems, learning and using open-source tools, and rigorously analyzing data.
Pre-requisites: Any/all of 6.033, 6.035, 6.036

 

MENG/UAP Projects
Fall 2018

 

Contact alfa-apply (@) csail.mit.edu to apply
Include:  project of interest by name, relevant courses and grades, relevant experience, expected year of graduation, and CV.
 

 

Machine Learning and Cyber Security

Denial-of-service (DoS) cyber attacks continue to increase and cause numerous disruptions in both industry and politics. With more and more critical information moving through networks, it is important to keep these networks available. The project will involve applying machine learning to investigate how to secure networks against autonomous and adaptive adversaries. It is ideal for students who have taken or plan to take Machine Learning and who have taken a networks course.

 

Obfuscation Detection for Evasive Malware

Malware is constantly adapting in order to avoid detection. For example, content obfuscation presents a challenge to static signature-based malware detectors. Such modest changes to detectable malware, which allow it to evade detection, can be described as adversarial perturbations. The goal of this project is to build detectors that are robust against obfuscation via adversarial learning. Join us if you'd like to work on a real-world dataset of nearly half a million source files and contribute to our battle against malware authors. By the end of this project, you will have gained experience in scripting, data wrangling, and adversarial learning. Experience with deep learning libraries (e.g., PyTorch) is preferred, along with a basic understanding of machine learning concepts and data science.

 

Coevolutionary Dynamics for Generative Adversarial Networks

Generative Adversarial Networks (GANs) have become one of the dominant methods for deep generative modeling. Despite their demonstrated success on multiple vision tasks, GANs are difficult to train, and much research has been dedicated to understanding and improving their gradient-based learning dynamics. Most of the pathological behaviors encountered in gradient-based GAN training (e.g., focusing, relativism, and loss of gradients) were identified and studied by the evolutionary computing community decades ago. In this project, we aim to investigate extensively the coevolutionary dynamics of GANs on problems ranging from toy examples to a dataset distributed over a cluster. Join us if you'd like to contribute to understanding GAN dynamics and stabilizing their training. Prerequisite: experience with deep learning libraries (e.g., PyTorch), evolutionary computing, and cloud computing.

 

Graph-based neural networks to model computer programs

There's an unprecedented amount of software being written. This also means there's an unprecedented amount of buggy code out there. Can statistical models of code (like text) capture the aspects that conventional control and data dependency graphs of programs convey? Can such models then be used to predict pertinent properties of programs, like whether they're buggy or have security gaps in them? These are some of the questions we want to explore. Specifically, we want to explore how well graph-based neural networks can represent programs and help discover security issues in them. This project will introduce candidates to rigorous data analysis, understanding literature, building robust ML models, and working with messy, real-world data.
Preferred pre-reqs: 6.033, 6.035, 6.036, 6.864, 6.820, 6.858

 


Opportunities to Join from Outside MIT


 

Long Term

 

PhD Student Research Assistantship

  • Funded PhD student research assistantship
  • In machine learning for cybersecurity
  • Location: Massachusetts Institute of Technology
  • Deadline for PhD program application: Mid December
  • Start Date: Fall

See: Grad admissions website for more specific information

We have an opening for a top-class PhD student interested in understanding the adversarial behavior that drives the security arms race between cyber attacks and defenses. This research is focused on Artificial Intelligence (including Machine/Deep Learning) and Mod/Sim (modeling and simulation) techniques. One of our projects, named RIVALS, is focused on extreme DDoS attacks and DDoS-resilient peer-to-peer networks. Others consider the effectiveness of network enclaves, deception, and malware detection. We are supported by the US Defense Advanced Research Projects Agency and the MIT CSAIL Cybersecurity Initiative.
 
Applicants must apply through MIT’s graduate program admissions process for the Department of Electrical Engineering and Computer Science. The admission application deadline is in mid-December, for study starting in September. Mention your interest in these topics and in studying under Dr. O'Reilly in your application (e.g. in your statement of purpose). See https://www.eecs.mit.edu/academics-admissions/graduate-program/admissions

For informal inquiries, email alfa-apply (@) csail.mit.edu

 

Post-doctoral Researcher

Potential post-docs should contact Una-May O'Reilly through alfa-apply (@) csail.mit.edu. When you contact us, it is very important to clearly and concisely identify areas of mutual interest and provide information about your background. Having a face-to-face relationship with a member of ALFA is very helpful.

The group has rotating funding for a modest number of post-doctoral associate positions. Others are welcome if they can earn funding from other sources, such as research foundations in their own country. Examples are NSERC in Canada and the NSF in the USA. In these situations, a letter of support may be required during your application process. That can be provided should the project match well with ALFA. In yet other situations, post-docs apply after they have a scholarship. Be aware that post-doc fellowships funded outside MIT are usually (but not always) subject to extra CSAIL-specific fees to cover visa processing and/or resource usage, which means that either Dr. O'Reilly or your scholarship must cover these costs. As well, having an award in hand does not guarantee an invitation to join. We have limited resources and require mutual interests.

 

Short Term Visits

 

Students

On very rare occasions, outside thesis work can be conducted over a short visit. Note that for these visits, the visitor must cover costs including MIT visiting student fees, CSAIL-specific fees to cover visa processing and resource usage, travel to MIT, and local accommodations and travel. Most students are funded by their home institution or a scholarship for the visit. The cost of a visit depends on the length of stay and time of year, and does not include accommodations and local travel. International students must meet the financial requirements for a visa.

Generally, you must be willing to work on a project of mutual interest, with our software libraries and infrastructure.

Contact alfa-apply (@) csail.mit.edu to apply and for more information

 

Faculty Visitors

We welcome visits from our colleagues in academia. We are most interested in visitors whose research agenda closely aligns with our own or whom we have personally met through conferences, workshops, and related travel. Note that visitor fees are not usually imposed on short-term faculty visits.

On rare, selective occasions, outside thesis work can be conducted over a short visit. Note that for these visits, the visitor must cover costs including CSAIL-specific fees to cover visa processing and resource usage, travel to MIT, and local accommodations and travel. Generally, you must be willing to work on a project of mutual interest, with our software libraries and infrastructure.

Contact alfa-apply (@) csail.mit.edu to apply and for more information

 

 

Unfortunately, we don't always have the time to reply to every unsolicited request.
Please don't be offended if you do not hear back from us.


ALFA Group welcomes graduate and undergraduate students, visitors, and guests regardless of
race, religion, gender identification, sexual orientation, age, nationality, or disability status