Hi! I’m Alana Marzoev.
I’m a computer science PhD student in the Language and Intelligence group at MIT. My research focuses on using AI to accelerate scientific discovery across the natural and social sciences. I’m advised by Jacob Andreas and Mike Cafarella.
In 2020, I went on leave from the PhD program at MIT to found Readyset, a data infrastructure startup that is helping developers build performant data-driven applications effortlessly. I served as the CEO of Readyset from 2020 to 2024, during which time we raised $30M from top investors like Index Ventures, Amplify Partners, and Sequoia Capital. In 2024, I transitioned out of my role as CEO to finish my PhD.
I actively angel invest in early-stage AI and infra startups. Some of my past investments include Hydra, Tinfoil, and Snow Leopard AI.
Outside of my research, my interests range from the physical (running, yoga, Pilates) to the academic (philosophy, economics, mathematics) to the artistic (fashion, design).
I split my time between NYC and Cambridge. If you’d like to get in touch, don’t hesitate to drop me a line at [lastname]@mit.edu.
Background
Before starting Readyset, I was a computer science PhD student at MIT, where I was supported by a Jacobs Presidential Fellowship. During this time, I thought about the future of data systems, including novel architectures and ways to democratize data access through machine learning.
Before MIT, I explored the implications of resource disaggregation in the cloud, investigated the effects of low-precision arithmetic and variance reduction on the performance and hardware efficiency of stochastic gradient descent (SGD), and led a team of 50+ engineers in designing and building a fully functional prototype of a Hyperloop pod.
My other past experiences include working on:
Ray, a distributed execution engine for machine learning at UC Berkeley’s RISELab.
Project Sirius and Project Silica (Microsoft Research Cloud Infrastructure group).
Improving data-sketch-based containment estimates for data catalogs (Microsoft Research Data Systems group).
Publications
Some of my publications prior to starting Readyset include:
Unnatural Language Processing: Bridging the Gap Between Synthetic and Natural Language Data
Alana Marzoev, Samuel Madden, Frans Kaashoek, Michael Cafarella, Jacob Andreas
Preprint
Towards Multiverse Databases
Alana Marzoev, Lara Timbó Araújo, Malte Schwarzkopf, Samyukta Yagati, Eddie Kohler, Robert Morris, M. Frans Kaashoek, Sam Madden
HotOS 2019
High Accuracy SGD Using Low-Precision Arithmetic and Variance Reduction (for Linear Models)
Alana Marzoev, Christopher De Sa
SysML 2018