Joining ALFA

 

 

Diversity Drives Innovation and Learning

At the Anyscale Learning For All (ALFA) Group, we mean Anyscale Learning for ALL. ALFA is dedicated to cultivating an inclusive culture that supports, promotes, and empowers diverse voices in Computer Science & AI. Our focus is to improve and build machine learning, AI, and data analytics technology that works for everyone.

Representation matters.

We want our research to be representative of everyone who benefits and learns from it. We value people with different experiences, perspectives, and backgrounds - it’s the cornerstone of our approach to learning and research. We celebrate diversity along many axes: race, religion, ethnicity, age, sex, national origin, sexual orientation, gender identity, gender expression, genetic disposition, neurodiversity, disability, veteran status and any other aspect which makes you unique.

 

Contents

MIT Students

Join from outside MIT

 

 


MIT STUDENTS


 
 

SuperUROPs & UROPs

Contact alfa-apply (@) csail.mit.edu to apply
Include:  project of interest by name, relevant courses and grades, relevant experience, expected year of graduation, and CV.
 

 

 

SuperUROPs Fall 2020

CHASE Cyber-hunting with public knowledge and AI

Do you wait for your adversary, or do you go and find them? Today, cyber defenders need to be active when protecting their cyber assets. Actively hunting for adversaries is one way of preventing attacks. But how do you find the adversary? Public resources currently describe cyber tactics, techniques, and procedures. By combining this public threat data with Artificial Intelligence, you will research how to improve cyber defenses.

 

IDAS Synthesizing programs with evolution and formal methods

How can better software be automatically engineered? Is it possible to tell the computer what task to implement instead of how to implement it? To proceed in this direction, we will combine an AI-based approach based on stochastic search heuristics with formal program synthesis methods. Our goal is to generate correct software based on a "what is needed" specification.

Requires a software engineering course.

 

MOOCs Reflecting in online learning

Does learning online provide you with the same opportunity for reflection as a normal classroom? Online learning has many benefits, but there is still room for improvement. We research the role of active reflection in online learning and how it impacts learning outcomes. This requires data science and natural language processing to handle the large volume of free-text answers that students provide.

Desired: an NLP course.

 

COVID Competing (mis)information narratives and human behavior in pandemics

We propose using agent-based modeling to determine the effects of COVID-related health information on the public. The current global health recommendation for mitigating the spread of COVID is social distancing. The impact of this measure depends on how citizens interpret, act on, and comply with health information. There is a disconnect in both the messaging and the obedience to public directives, which can have massive implications. We investigate a data-driven, agent-based modeling method based on human moral cognition for effective public health communication. We will construct models from regions that are further along in the COVID spread to inform realistic decisions for other regions. This can cover the different stages of a viral pandemic, from the outbreak, to the apex, to the return of society to pre-pandemic levels, without exhausting existing resources.

 

Statistical Models of Computer Programs

We aim to understand and analyze programs from a data-driven perspective. 
Imagine solving hard program analysis tasks like finding bugs and vulnerabilities in programs. We view this as a problem of training machine learning models to capture structured information (loops, data and control dependencies, parse trees of programs) in order to predict properties like bugs and vulnerabilities. 
This is a rich and nascent problem space, and we are actively working on several aspects, including representation learning, evaluating different architecture designs, and even synthesizing programs from generative models.
 
Prerequisites: A mix of systems and ML inclination. Preferably, the applicant has taken one or more of these:
Systems - 6.009, 6.031, 6.033, 6.035, 6.820, 6.858
ML/NLP - 6.008, 6.034, 6.036, 6.806/6.864, 6.867
 

 

 

UROPs Summer 2020

Machine Learning to Understand What Programs Mean (Filled)

The ALFA group is developing a number of machine learning codebases aimed at understanding program semantics. We then use the semantic models to explore classifying software and synthesizing programs. We do this to automate bug detection and malware detection, and to understand programs and programming languages as human-invented artifacts and tools. New students on this project would gain experience in one or more of the codebases and applications. They would be expected to evaluate the ML methods on different data sets and to learn, on their own, how to extend the methods.

Prerequisites (would be nice): 6.036 Introduction to Machine Learning

 

Automated Vulnerability and Malware Detection in Source Code (Filled)

This is part of the ongoing MIT-IBM collaborative project on ‘Instruction, Command Line, or Script Malware Detection using Machine Learning’. This project investigates combining machine learning models with traditional program analysis to detect and classify malicious code. The goal of this summer UROP is to establish competitive baselines to test the representations and techniques introduced in this project.

Prerequisites: 6.036 Introduction to Machine Learning
Would be nice: 6.862 Applied Machine Learning

 

Data Science and AI while Left of Boom: Sniffing Out Cyber Attacks (Filled)

In the increasingly adversarial setting of cyberspace, cyber defenders need to be active when protecting their cyber assets. Actively hunting for adversaries is one way of preventing attacks. Our question is whether it is possible to figure out what attacks an adversary may be planning and what kill-chain events may already have occurred. Our approach will draw upon AI and machine learning. There are currently public resources describing cyber tactics, techniques, and procedures. We offer the opportunity to work with a team that is using data science techniques and machine learning to exploit this public threat data to anticipate and hunt for ongoing cyber attacks.

Prerequisites: Rising Junior or Senior

 

Synthesizing Software with AI and Formal Program Synthesis Methods (Filled)

As software proliferates and the need for it explodes, what methods contribute to improving our ability to write and repurpose code? To what extent can software be automatically engineered? Is it possible to describe a task to a software development tool instead of describing how to implement it? What if an existing codebase were the starting point? To what extent is it possible to repurpose it? What tools, formalisms, training data, method hybridizations, and inventions are necessary or helpful? You will learn how we are combining an AI-based approach based on stochastic search heuristics with formal program synthesis methods in the context of these questions. You will contribute to a project whose goal is to generate correct software based on a "what is needed" specification.

Prerequisites: 6.009 Fundamentals of Programming, 6.031 Elements of Software Construction, 6.033 Computer Systems Engineering

 

Reflection and its Impact on Student Online Learning (Filled)

Our immediate experience of online classes has driven home the obvious -- we can teach online better than we do now. But what makes the difference? There are many factors, and in this project we consider that learning online may not provide the same opportunity for reflection as a normal classroom does. Without reflection, knowledge acquisition may lag. You will join us in evaluating the role of active reflection in the quality of learning by online learners. We have set up a Massive Online Learning Course on programming with opportunities for reflection. You will gain experience with data science techniques that aim to connect the impact of reflection with learners’ academic outcomes. You will be mentored in the application of natural language processing algorithms that handle a large volume of free-text student responses.

Recommended Preparatory Courses: 6.806 Natural Language Processing; one or both of 6.031 Elements of Software Construction and 6.033 Computer Systems Engineering

 

 

 

M.Eng Projects/UAPs

 

Contact alfa-apply (@) csail.mit.edu to apply
Include:  project of interest by name, relevant courses and grades, relevant experience, expected year of graduation, and CV.
 

 

New projects for 2020 COMING SOON

 

Fall 2019 Projects

Machine Learning and Cyber Security

Denial of Service (DoS) cyber attacks continue to increase and cause numerous disruptions in both industry and politics. With more and more critical information moving through networks, it is important to keep these networks available. The project will involve applying machine learning to investigate how to secure networks against autonomous and adaptive adversaries. It is ideal for students who have taken, or plan to take, Machine Learning and who have taken a networks course.

 

Obfuscation Detection for Evasive Malware

Malware is constantly adapting in order to avoid detection. For example, content obfuscation presents a challenge to static signature-based malware detectors. These modest changes to detectable malware, which allow it to evade detection, can be described as adversarial perturbations. The goal of this project is to build detectors that are robust against obfuscation via adversarial learning. Join us if you'd like to work on a real-world dataset of nearly half a million source files and contribute to our battle against malware authors. By the end of this project, you will have gained experience in scripting, data wrangling, and adversarial learning. Experience with deep learning libraries (e.g., PyTorch) is preferred, along with a basic understanding of machine learning concepts and data science.

 

Coevolutionary Dynamics for Generative Adversarial Networks

Generative Adversarial Networks (GANs) have become one of the dominant methods for deep generative modeling. Despite their demonstrated success on multiple vision tasks, GANs are difficult to train, and much research has been dedicated to understanding and improving their gradient-based learning dynamics. Most of the pathological behaviors encountered with gradient-based GAN training (e.g., focusing, relativism, and loss of gradients) were identified and studied by the evolutionary computing community decades ago. In this project, we aim to investigate extensively the coevolutionary dynamics of GANs on problems ranging from toy examples to a dataset distributed over a cluster. Join us if you'd like to contribute to understanding GAN dynamics and stabilizing their training. Prerequisite: experience with deep learning libraries (e.g., PyTorch), evolutionary computing, and cloud computing.

 

Graph-based neural networks to model computer programs

There's an unprecedented amount of software being written. This also means there's an unprecedented number of buggy pieces of code out there. Can statistical models of code (treated like text) capture aspects that conventional control and data dependency graphs of programs convey? Can such models then be used to predict pertinent properties of programs, like whether they're buggy or have security gaps in them? These are some of the questions we want to explore. Specifically, we want to explore how well graph-based neural networks can represent programs and help discover security issues in them. The project will introduce candidates to rigorous data analysis, understanding literature, building robust ML models, and working with messy, real-world data.
Preferred pre-reqs: 6.033, 6.035, 6.036, 6.864, 6.820, 6.858

 


Opportunities to Join from Outside MIT


 

Long Term

 

PhD Student Research Assistantship

  • Funded PhD student research assistantship
  • In machine learning for cybersecurity
  • Location: Massachusetts Institute of Technology
  • Deadline for PhD program application: Mid December
  • Start Date: Fall

See: Grad admissions website for more specific information

We have an opening for a top-class PhD student interested in understanding the adversarial behavior that drives the security arms race between cyber attacks and defenses. This research is focused on Artificial Intelligence (including Machine/Deep Learning) and Mod/Sim (modeling and simulation) techniques. One of our projects, named RIVALS, is focused on extreme DDoS attacks and DDoS-resilient peer-to-peer networks. Others consider the effectiveness of network enclaves, deception, and malware detection. We are supported by the U.S. Defense Advanced Research Projects Agency and the MIT CSAIL Cybersecurity Initiative.
 
Applicants must apply through MIT’s graduate program admissions process for the Department of Electrical Engineering and Computer Science. The admission application deadline is in mid-December. Mention your interest in these topics and in studying under Dr. O'Reilly in your application (e.g., in your statement of purpose). See https://www.eecs.mit.edu/academics-admissions/graduate-program/admissions for study starting in September.

For informal inquiries, email alfa-apply (@) csail.mit.edu

 

Postdoctoral Researchers

Potential postdocs should contact Una-May O'Reilly through alfa-apply (@) csail.mit.edu. When you contact us, it is very important to clearly and concisely identify areas of mutual interest and provide information about your background. Having a face-to-face relationship with a member of ALFA is very helpful.

The group has rotating funding for a modest number of postdoctoral associate positions. Others are welcome if they can earn funding from other sources, such as research foundations in their own country. Examples are NSERC in Canada and NSF in the USA. In these situations, a letter of support may be required during your application process. That can be provided if the project matches well with ALFA. In other situations, postdocs apply after they have a scholarship. Be aware that postdoc fellowships funded outside MIT are usually (but not always) subject to extra CSAIL-specific fees to cover visa processing and/or resource usage, which means that either Dr. O'Reilly or your scholarship must cover these costs. Having an award in hand does not guarantee an invitation to join. We have limited resources and require mutual interests.

 

Short Term Visits

Please note, due to the ongoing situation with COVID-19, MIT has suspended all short term visit appointments until further notice.
This includes remote research.

 

Students

On very rare occasions, outside thesis work can be conducted over a short visit. Note that for these visits, the visitor must cover costs including MIT visiting student fees, CSAIL specific fees to cover visa processing and resources usage, travel to MIT, local accommodations and travel. Most students are funded by their home institution or scholarship for the visit. The cost of a visit depends on the length of stay and time of year, and does not include accommodations and local travel. International students must meet the financial requirements for a visa.

Generally, you must be willing to work on a project of mutual interest, with our software libraries and infrastructure.

Contact alfa-apply (@) csail.mit.edu to apply and for more information

 

Faculty Visitors

We welcome visits from our colleagues in academia. We are most interested in visitors whose research agenda closely aligns with our own or whom we have personally met through conferences, workshops, and related travel. Note that visitor fees are not usually imposed upon short-term faculty visits.

On very selective occasions, outside thesis work can be conducted over a short visit. Note that for these visits, the visitor must cover costs including CSAIL specific fees to cover visa processing and resources usage, travel to MIT, local accommodations and travel. Generally you must be willing to work on a project of mutual interest, with our software libraries and infrastructure.

Contact alfa-apply (@) csail.mit.edu to apply and for more information

 

 

Unfortunately, we don't always have the time to reply to every unsolicited request.
Please don't be offended.
