
URL: http://eugenewu.net/

EUGENE WU



BIO

Eugene Wu is an associate professor of computer science at Columbia University.
He is broadly interested in technologies that help users play with their data.
His goal is for users at all technical levels to effectively and quickly make
sense of their information. He is interested in solutions that ultimately
improve the interface between users and data, and uses techniques borrowed from
fields such as data management, systems, crowdsourcing, visualization, and HCI.
Eugene Wu received his Ph.D. from MIT, B.S. from Cal, and was a postdoc in the
AMPLab. A profile, an obit.

Eugene Wu has received the VLDB 2018 10-year test of time award,
best-of-conference citations at ICDE and VLDB, the SIGMOD 2016 best demo award,
the NSF CAREER, and the Google, Adobe, and Amazon faculty awards.

The WuLab Blog

Research Overview (circa 2022)

Joining The Lab
I'm looking for a student to work at the intersection of LLM agents and data
systems, or databases and HCI. Applicants should have taken courses and/or have
experience in both areas.
PhDs: read the lab's work and provide evidence you can conduct research in the
lab. Include "bananas" in the subject line. See the YouTube videos on how to
approach the application.
Postdocs: share thoughts on how you can best make use of my expertise. Include
"satsuma" in the subject line.
Interns + UGrad + Masters: read the list of projects looking for help.

INFO

ewu@cs.columbia.edu
421 Mudd, 500 W 120th St
Twitter: @sirrice
Github: sirrice, cudbg
OH: Thurs 12-1PM EST 421 Mudd
CV

Co-Chair: Data, Media & Society
Advisor: Journalism + CS Dual Degree
Member: Columbia DB, Columbia CS, DSI


Support: NSF 1527765, 1564049, 1845638 (CAREER), 1740305, 2008295, 2106197,
2103794, 2312991, Amazon, Google, Adobe, CAIT, Columbia SIRS




RECENT NEWS

 * Nov-2023: Excited to present our vision of Task-based Dataset Search at CIDR
   2024 in January and our vision about using documentation to tackle data
   ambiguity at the Table Representation Learning Workshop at NeurIPS 2023.
 * Aug-2023: We will be presenting 4 papers at VLDB! Saibot brings differential
   privacy to task-based dataset search platforms, JoinBoost trains decision
   trees, random forests, and gradient boosted models within DBMSes faster than
   ML libraries like LightGBM and XGBoost, Pollock introduces a benchmark and
   grammar-based formalization for studying CSV loaders, and ConnectorX is the
   fastest way for DataFrame systems to load data from DBMSes. Excited to see
   folks in Vancouver!
 * Jul-2023: Delighted to receive a grant from Columbia’s Center of AI
   Technology to study differentially private dataset search with Rachel
   Cummings.
 * May-2023: SIGMOD 2023 is next month! We have exciting work on a Data
   Interface Grammar, how to detect and avoid logical errors in semantic layers,
   CPU-GPU acceleration for In-DBMS machine learning, a suite of XTutor
   visualization libraries for data education, and the OM3 progressively encoded
   data structure for fast and massive-scale time series visualizations!
 * Dec-2022: Zachary received the Columbia Data Science Institute’s Avanessian
   PhD fellowship!
 * Sep-2022: IEEE VIS 2022 is next month! Excited to present our work on
   generating fully interactive interfaces from natural language questions at the
   nlviz workshop, our recommendations for how to write captions for data
   visualizations at the viscomm workshop, and our new formalism and interaction
   design for ad hoc comparisons at the main conference!
 * Aug-2022: A nice Columbia Engineering article about our research to help
   insure African farmers against droughts and floods.

ALL PUBLICATIONS

 1.   Suna: Scalable Causal Confounder Discovery over Relational Data
      Jiaxiang Liu, Eugene Wu
      In Review 2025
 2.   Jade: Design Independence Via Physical Visualization Design
      Yiru Chen, Xuyeng Li, Jeff Tao, Lana Ramjit, Ravi Netravali, Subrata
      Mitra, Aditya Parameswaran, Javad Ghaderi, Dan Rubenstein, Eugene Wu
      In Review 2025
 3.   DocETL: Agentic Query Rewriting and Evaluation for Complex Document
      Processing
      Shreya Shankar, Aditya Parameswaran, Eugene Wu
      arXiv 2024
 4.   Where Does Database Research Go From Here?
      Eugene Wu
      SIGMOD Blog 2024
 5.   Kitana: A Data-as-a-Service Platform
      Zachary Huang, Pranav Subramaniam, Raul Fernandez, Eugene Wu
      In Review 2023
 6.   Database Theory + X: Database Visualization
      Eugene Wu
      In Review 2025
 7.   Design-Specific Transformations in Visualization
      Eugene Wu, Remco Chang
      arXiv 2024
 8.   Data-Centric Text-to-SQL with Large Language Models
      Zezhou Huang, Shuo Zhang, Kechen Liu, Eugene Wu
      Table Representation Learning at NeurIPS 2024
 9.   DynoClass: A Dynamic Table-Class Detection System Without the Need for
      Predefined Ontologies
      Haonan Wang, Kechen Liu, Jiaxiang Liu, Eugene Wu
      Table Representation Learning at NeurIPS 2024
 10.  Transform Table to Database Using Large Language Models
      Zezhou Huang, Jia Guo, Eugene Wu
      Tabular Data Analysis (TaDA) Workshop at VLDB 2024
 11.  spade: Synthesizing Data Quality Assertions for Large Language Model
      Pipelines
      Shreya Shankar, Haotian Li, Parth Asawa, Madelon Hulsebos, Yiming Lin,
      J.D. Zamfirescu-Pereira, Harrison Chase, Will Fu-Hinthorn, Aditya G.
      Parameswaran, Eugene Wu
      VLDB 2025
 12.  Cocoon: Semantic Table Profiling Using Large Language Models
      Zachary Huang, Eugene Wu
      HILDA Workshop at SIGMOD 2024
 13.  SET: Searching Effective Supervised Learning Augmentations in Large
      Tabular Data Repositories
      Jerry Liu, Zachary Huang, Eugene Wu
      GUIDEAI Workshop at SIGMOD 2024
 14.  Lightweight Materialization for Fast Dashboards Over Joins
      Zachary Huang, Eugene Wu
      SIGMOD 2024
 15.  Data Ambiguity Strikes Back: How Documentation Improves GPT's Text-to-SQL
      Zezhou Huang, Pavan Kalyan, Eugene Wu
      Table Representation Learning Workshop at NeurIPS 2023
 16.  The Fast and the Private: Task-based Dataset Search
      Zezhou Huang, Jiaxiang Liu, Haonan Wang, Eugene Wu
      CIDR 2024 Slides
 17.  JoinBoost: Grow Trees Over Normalized Data Using Only SQL
      Zezhou Huang, Rathijit Sen, Jiaxiang Liu, Eugene Wu
      VLDB 2023
 18.  Saibot: A Differentially Private Data Search Platform
      Zezhou Huang, Jiaxiang Liu, Daniel Alabi, Raul Castro Fernandez, Eugene Wu
      VLDB 2023
 19.  Pollock: A Data Loading Benchmark
      Gerardo Vitagliano, Mazhar Hameed, Lan Jiang, Lucas Reisener, Eugene Wu,
      Felix Naumann
      VLDB 2023
 20.  ConnectorX: Accelerating Data Loading From Databases to Dataframes
      Xiaoying Wang, Weiyuan Wu, Jinze Wu, Yizhou Chen, Nick Zrymiak, Changbo
      Qu, Lampros Flokas, George Chow, Jiannan Wang, Tianzheng Wang, Eugene Wu,
      Qingqing Zhou
      VLDB 2023
 21.  SmokedDuck Demonstration: SQLStepper
      Haneen Mohammed, Charlie Summers, Sughosh Kaushik, Eugene Wu
      SIGMOD Demo 2023
 22.  Teaching Data Science by Visualizing Data Table Transformations: Pandas
      Tutor for Python, Tidy Data Tutor for R, and SQL Tutor
      Sam Lau, Sean Kross, Eugene Wu, Philip Guo
      DataEd at SIGMOD 2023
 23.  Analysis Errors Over Semantic Layers and How To Avoid Them
      Zezhou Huang, Pavan Kalyan, Eugene Wu
      HILDA at SIGMOD 2023
 24.  DIG: The Data Interface Grammar
      Yiru Chen, Jeffrey Tao, Eugene Wu
      HILDA at SIGMOD 2023
 25.  Random Forests over Normalized Data in CPU-GPU DBMSes
      Zezhou Huang, Pavan Kalyan, Rathijit Sen, Eugene Wu
      DaMoN at SIGMOD 2023
 26.  OM3: An Ordered Multi-level Min-Max Representation for Interactive
      Progressive Visualization of Time Series
      Yunhai Wang, Yuchun Wang, Xin Chen, Yue Zhao, Fan Zhang, Eugene Wu,
      Chi-Wing Fu, Xiaohui Yu
      SIGMOD 2023
 27.  NL2INTERFACE: Interactive Visualization Interface Generation from Natural
      Language Queries
      Yiru Chen, Ryan Li, Austin Mac, Tianbao Xie, Tao Yu, Eugene Wu
      VIS nlviz workshop 2022
 28.  How Do Captions Affect Visualization Reading?
      Shelly Cheng, Hazel Zhu, Eugene Wu
      VIS Viscomm 2022
 29.  Extending the View Composition Algebra to Hierarchical Data
      Eugene Wu
      arXiv 2022
 30.  A Grammar for Hypothesis-Driven Visual Analysis
      Ashley Suh, Yilan Jiang, Ab Mosca, Eugene Wu, Remco Chang
      arXiv 2022
 31.  A Sensorless Drone-based System for Mapping Indoor 3D Airflow Gradients
      Yanchen Liu, Minghui Zhao, Stephen Xia, Eugene Wu, Xiaofan Jiang
      MobiSys 2022 Demo
 32.  How I Stopped Worrying About Training Data Bugs and Started Complaining
      Lampros Flokas, Weiyuan Wu, Jiannan Wang, Nakul Verma, Eugene Wu
      DEEM Workshop 2022
 33.  Interactive Interface Generation in Notebooks
      Jeffrey Tao, Yiru Chen, Eugene Wu
      SIGMOD 2022 demo
 34.  PI2: Generating Visual Analysis Interfaces From Queries
      Yiru Chen, Eugene Wu
      SIGMOD 2022
 35.  View Composition Algebra for Ad Hoc Comparisons
      Eugene Wu
      TVCG 2022
 36.  Reptile: Aggregation-level Explanations for Hierarchical Data
      Zachary Huang, Eugene Wu
      SIGMOD 2022
 37.  A Neural Network Solves and Generates Mathematics Problems by Program
      Synthesis: Calculus, Differential Equations, Linear Algebra, and More
      Iddo Drori, Sunny Tran, Roman Wang, Newman Cheng, Kevin Liu, Leonard Tang,
      Elizabeth Ke, Nikhil Singh, Taylor L. Patti, Jayson Lynch, Avi Shporer,
      Nakul Verma, Eugene Wu, Gilbert Strang
      PNAS 2022
 38.  Complaint-Driven Training Data Debugging at Interactive Speeds
      Lampros Flokas, Young Wu, Jiannan Wang, Nakul Verma, Eugene Wu
      SIGMOD 2022
 39.  Dynamic Breakpoints for Y-axis Scales
      Jacob Fisher, Remco Chang, Eugene Wu
      InfoVIS 2021 (short paper)
 40.  Enabling SQL-based training data debugging for federated learning
      Young Wu, Yejia Liu, Lampros Flokas, Jiannan Wang, Eugene Wu
      VLDB 2022
 41.  Explaining SQL-ML Queries with Bayesian Optimization
      Brandon Lockhart, Jiannan Wang, Eugene Wu
      VLDB 2021
 42.  DIEL: Interactive Visualization Beyond the Here and Now
      Yifan Wu, Remco Chang, Joseph Hellerstein, Arvind Satyanarayan, Eugene Wu
      VIS 2021
 43.  PopFactor: Live-Streamer Behavior and Popularity
      Robert Netzorg, Lauren Arnett, Augustin Chaintreau, Eugene Wu
      ICWSM 2021
 44.  Impact of Cognitive Biases on Progressive Visualization
      Marianne Procopio, Ab Mosca, Carlos Scheidegger, Eugene Wu, Remco Chang
      TVCG 2021
 45.  From Cleaning Before ML to Cleaning For ML
      Felix Neutatz, Binger Chen, Ziawasch Abedjan, Eugene Wu
      Invited, IEEE Data Engineering Bulletin 2021
 46.  Facilitating Exploration with Interaction Snapshots under High Latency
      Yifan Wu, Remco Chang, Joe Hellerstein, Eugene Wu
      InfoVIS (short paper) 2020
 47.  ActiveDeeper: A Model-based Active Data Enrichment system
      Liang Zhao, Qingcan Li, Pei Wang, Jiannan Wang, Eugene Wu
      VLDB 2020 demo
 48.  Continuous Prefetch for Interactive Data Applications
      Haneen Mohammed, Ziyun Wei, Ravi Netravali, Eugene Wu
      VLDB 2020 Talk Video Blogpost
 49.  Complaint-driven Training Data Debugging for Query 2.0
      Young Wu, Lampros Flokas, Jiannan Wang, Eugene Wu
      SIGMOD 2020 Talk Video Blogpost
 50.  Physical Visualization Design
      Lana Ramjit, Zhaoning Kong, Ravi Netravali, Eugene Wu
      SIGMOD (demo) 2020
 51.  Towards Complaint-driven ML Workflow Debugging
      Lampros Flokas, Young Wu, Jiannan Wang, Eugene Wu
      MLOps 2020
 52.  Monte Carlo Tree Search for Generating Interactive Data Analysis
      Interfaces
      Yiru Chen, Eugene Wu
      Intelligent Process Automation (IPA) 2020
 53.  Acorn: Aggressive Result Caching in Spark SQL
      Lana Ramjit, Matteo Interlandi, Eugene Wu, Ravi Netravali
      SOCC 2019
 54.  AlphaClean: Automatic Generation of Data Cleaning Pipelines
      Sanjay Krishnan, Eugene Wu
      arXiv 2019
 55.  Towards Democratizing Relational Data Visualization
      Nan Tang, Eugene Wu, Guoliang Li
      SIGMOD 2019 Tutorial
 56.  Precision Interfaces
      Qianrui Zhang, Haoci Zhang, Viraj Rai, Thibault Sellam, Eugene Wu
      SIGMOD 2019
 57.  Progressive Deep Web Crawling Through Keyword Queries For Data Enrichment
      Pei Wang, Jiannan Wang, Ryan Shea, Eugene Wu
      SIGMOD 2019
 58.  Cross-platform Interactions and Popularity in the Live-streaming Community
      Lauren Arnett, Robert Netzorg, Augustin Chaintreau, Eugene Wu
      CHI Latebreaking 2019
 59.  DeepBase: Deep Inspection of Neural Networks
      Thibault Sellam, Kevin Lin, Ian Yiran Huang, Michelle Yang, Carl Vondrick,
      Eugene Wu
      SIGMOD 2019
 60.  Deep Neural Inspection Using DeepBase
      Yiru Chen, Yiliang Shi, Boyuan Chen, Thibault Sellam, Carl Vondrick,
      Eugene Wu
      LearnSys 2018 Workshop at NIPS
 61.  CIDR2: Crazier Innovations in Databases JOIN Reinforcement-learning
      Research
      Eugene Wu
      CIDR 2019 Abstract
 62.  Ten Years of Web Tables
      Michael Cafarella, Alon Halevy, Daisy Zhe Wang, Hongrae Lee, Jayant
      Madhavan, Cong Yu, Eugene Wu
      PVLDB 2018 Invited Paper
 63.  At a Glance: Approximate Entropy as a Measure of Line Chart Visualization
      Complexity
      Gabriel Ryan, Abigail Mosca, Remco Chang, Eugene Wu
      InfoVIS 2018 Code
 64.  Provenance in Interactive Visualizations
      Fotis Psallidas, Eugene Wu
      HILDA 2018
 65.  Leveraging Quality Prediction Models for Automatic Writing Feedback
      Hamed Nilforoshan, Eugene Wu
      ICWSM 2018
 66.  Precision Interfaces for Different Modalities
      Haoci Zhang, Viraj Rai, Thibault Sellam, Eugene Wu
      SIGMOD (demo) 2018
 67.  Demonstration of Smoke: A Deep Breath of Data-Intensive Lineage
      Applications
      Fotis Psallidas, Eugene Wu
      SIGMOD (demo) 2018
 68.  Deeper: A Data Enrichment System Powered by Deep Web.
      Pei Wang, Yongjun He, Ryan Shea, Jiannan Wang, Eugene Wu
      SIGMOD (demo) 2018
 69.  "I Like the Way You Think!" Inspecting the Internal Logic of Recurrent
      Neural Networks
      Thibault Sellam, Kevin Lin, Ian Yiran Huang, Carl Vondrick, Eugene Wu
      SysML 2018
 70.  A "Probabilistic" Model of Research
      Eugene Wu
      Blog Post 2018
 71.  Smoke: Fine-grained Lineage at Interactive Speeds
      Fotis Psallidas, Eugene Wu
      VLDB 2018
 72.  Mining Precision Interfaces From Query Logs
      Haoci Zhang, Thibault Sellam, Eugene Wu
      Tech Report 2017
 73.  BoostClean: Automated Error Detection and Repair for Machine Learning
      Sanjay Krishnan, Michael J. Franklin, Ken Goldberg, Eugene Wu
      Tech Report 2017
 74.  Load-n-Go: Fast Approximate Join Visualizations That Improve Over Time
      Marianne Procopio, Carlos Scheidegger, Eugene Wu, Remco Chang
      DSIA 2017
 75.  Approximate Entropy as a Measure of Line Chart Complexity
      Gabriel Ryan, Abigail Mosca, Eugene Wu, Remco Chang
      InfoVIS Poster 2017
 76.  Towards a Bayesian Model of Data Visualization Cognition
      Yifan Wu, Larry Xu, Remco Chang, Eugene Wu
      DECISIVE 2017
 77.  PreCog: Improving Crowdsourced Data Quality Before Acquisition
      Hamed Nilforoshan, Jiannan Wang, Eugene Wu
      arXiv 2017
 78.  Precision Interfaces
      Haoci Zhang, Thibault Sellam, Eugene Wu
      HILDA 2017
 79.  PALM: Machine Learning Explanations For Iterative Debugging
      Sanjay Krishnan, Eugene Wu
      HILDA 2017
 80.  Segment-Predict-Explain for Automatic Writing Feedback
      Hamed Nilforoshan, James Sands, Kevin Lin, Rahul Khanna, Eugene Wu
      Collective Intelligence 2017
 81.  Dialectic: Enhancing Text Input Fields with Automatic Feedback to Improve
      Social Content Writing Quality
      Hamed Nilforoshan, James Sands, Kevin Lin, Rahul Khanna, Eugene Wu
      arXiv 2017
 82.  Skipping-oriented Partitioning for Columnar Layouts
      Liwen Sun, Michael J. Franklin, Jiannan Wang, Eugene Wu
      VLDB 2017
 83.  Combining Design and Performance in a Data Visualization Management System
      Eugene Wu, Fotis Psallidas, Zhengjie Miao, Haoci Zhang, Laura Rettig,
      Yifan Wu, Thibault Sellam
      CIDR 2017
 84.  CIDR: Chat-oriented Innovations in Database Research
      Eugene Wu
      CIDR 2017 Abstract
 85.  QFix: Diagnosing errors through query histories
      Xiaolan Wang, Alexandra Meliou, Eugene Wu
      SIGMOD 2017
 86.  A DeVIL-ish Approach to Inconsistency in Interactive Visualizations
      Yifan Wu, Joe Hellerstein, Eugene Wu
      HILDA 2016
 87.  PFunk-H: Approximate Query Processing using Perceptual Models
      Daniel Alabi, Eugene Wu
      HILDA 2016
 88.  Towards Reliable Interactive Data Cleaning: A User Survey and
      Recommendations
      Sanjay Krishnan, Daniel Haas, Michael J. Franklin, Eugene Wu
      HILDA 2016
 89.  TrendQuery: A System for Interactive Exploration of Trends
      Niranjan Kamat, Eugene Wu, Arnab Nandi
      HILDA 2016
 90.  ActiveClean: An Interactive Data Cleaning Framework For Modern Machine
      Learning
      Sanjay Krishnan, Michael Franklin, Ken Goldberg, Jiannan Wang, Eugene Wu
      SIGMOD 2016 Demo (Demo Award Winner!)
 91.  Graphical Perception in Animated Bar Charts
      Eugene Wu, Lilong Jiang, Larry Xu, Arnab Nandi
      arXiv 2016
 92.  QFix: Demonstrating error diagnosis in query histories
      Xiaolan Wang, Alexandra Meliou, Eugene Wu
      SIGMOD 2016 Demo
 93.  QFix: Diagnosing errors through query histories
      Xiaolan Wang, Alexandra Meliou, Eugene Wu
      arXiv 2016
 94.  ActiveClean: Interactive Data Cleaning While Learning Convex Loss Models
      Sanjay Krishnan, Jiannan Wang, Eugene Wu, Michael J. Franklin, Ken
      Goldberg
      arXiv 2016
 95.  Towards Perception-aware Interactive Data Visualization Systems
      Eugene Wu, Arnab Nandi
      DSIA 2015 Slides
 96.  SampleClean: Fast and Reliable Analytics on Dirty Data (overview paper)
      Sanjay Krishnan, Jiannan Wang, Michael J Franklin, Ken Goldberg, Tim
      Kraska, Tova Milo, Eugene Wu
      IEEE Data Eng. Bulletin 2015
 97.  CLAMShell: Speeding up Crowds for Low-latency Data Labeling
      Daniel Haas, Jiannan Wang, Eugene Wu, Michael J. Franklin
      VLDB 2016
 98.  Automated Metadata Construction to Support Portable Building Applications
      Arka A. Bhattacharya, Dezhi Hong, David Culler, Jorge Ortiz, Kamin
      Whitehouse, Eugene Wu
      BuildSys 2015
 99.  Wisteria: Nurturing Scalable Data Cleaning Infrastructure
      Daniel Haas, Sanjay Krishnan, Jiannan Wang, Michael J. Franklin, Eugene Wu
      VLDB 2015 demo
 100. Collaborative Data Analytics with Datahub
      Anant Bhardwaj, Amol Deshpande, Aaron Elmore, David Karger, Sam Madden,
      Aditya Parameswaran, Harihar Subramanyam, Eugene Wu, Rebecca Zhang
      VLDB 2015 demo
 101. Indexing Cost Sensitive Prediction
      Leilani Battle, Edward Benson, Aditya Parameswaran, Eugene Wu
      Technical Report 2016
 102. Explaining Data in Visual Analytic Systems
      Eugene Wu
      Doctoral Thesis 2015
 103. The Case for Data Visualization Management Systems
      Eugene Wu, Leilani Battle, Samuel Madden
      VLDB 2014
 104. Vertexica: Your Relational Friend for Graph Analytics!
      Alekh Jindal, Praynaa Rawlani, Eugene Wu, Samuel Madden, Amol Deshpande,
      Mike Stonebraker
      SIGMOD 2014 demo
 105. Data In Context: Aiding News Consumers while Taming Dataspaces
      Eugene Wu, Adam Marcus, Sam Madden
      DBCrowd 2013
 106. Mobile applications need Targeted Micro-updates
      Alvin Cheung, Lenin Ravindranath, Eugene Wu, Samuel Madden, Hari
      Balakrishnan
      APSYS 2013
 107. Scorpion: Explaining Away Outliers in Aggregate Queries
      Eugene Wu, Samuel Madden
      VLDB 2013 (Best-of) Slides
 108. SubZero: a Fine-Grained Lineage System for Scientific Databases
      Eugene Wu, Samuel Madden, Michael Stonebraker
      ICDE 2013 (Best-of)
 109. A Demonstration of DBWipes: Clean as You Query
      Eugene Wu, Samuel Madden, Michael Stonebraker
      VLDB 2012
 110. Human-powered Sorts and Joins
      Adam Marcus, Eugene Wu, David Karger, Samuel Madden, Robert Miller
      VLDB 2012
 111. Partitioning Techniques for Fine-Grained Indexing
      Eugene Wu, Sam Madden
      ICDE 2011
 112. Demonstration of Qurk: A Query Processor for Human Operators
      Adam Marcus, Eugene Wu, David Karger, Samuel Madden, Robert Miller
      SIGMOD 2011
 113. No Bits Left Behind
      Eugene Wu, Carlo Curino, Sam Madden
      CIDR 2011
 114. Crowdsourced Databases: Query Processing with People
      Adam Marcus, Eugene Wu, Sam Madden, Robert Miller
      CIDR 2011
 115. Relational Cloud: A Database-as-a-Service for the Cloud
      Carlo Curino, Evan Jones, Raluca Popa, Nirmesh Malviya, Eugene Wu, Sam
      Madden, Hari Balakrishnan, Nickolai Zeldovich
      CIDR 2011
 116. Relational Cloud: The Case for a Database Service
      Carlo Curino, Evan Jones, Yang Zhang, Eugene Wu, Sam Madden
      MIT Tech Report 2010
 117. TrajStore: An Adaptive Storage System for Very Large Trajectory Data Sets
      Philippe Cudre-Mauroux, Eugene Wu, Sam Madden
      ICDE 2010
 118. Webtables, exploring the power of tables on the web
      Michael Cafarella, Alon Halevy, Daisy Wang, Eugene Wu, Yang Zhang
      VLDB 2008
 119. SASE: Complex Event Processing over Streams (Demo)
      Daniel Gyllstrom, Eugene Wu, Hee-Jin Chae, Yanlei Diao, Patrick Stahlberg,
      Gordon Anderson
      CIDR 2007
 120. High-performance complex event processing over streams
      Eugene Wu, Yanlei Diao, Shariq Rizvi
      SIGMOD 2006
 121. SASE: Complex Event Processing over Streams
      Daniel Gyllstrom, Eugene Wu, Hee-Jin Chae, Yanlei Diao, Patrick Stahlberg,
      Gordon Anderson
      CoRR 2006
 122. Probabilistic Data Management for Pervasive Computing: The Data Furnace
      Project
      Minos N. Garofalakis, Kurt P. Brown, Michael J. Franklin, Joseph M.
      Hellerstein, Daisy Zhe Wang, Eirinaios Michelakis, Liviu Tancau, Eugene
      Wu, Shawn R. Jeffery, Ryan Aipperspach
      IEEE Data Eng. Bulletin 2006
 123. Design Considerations for High Fan-In Systems: The HiFi Approach
      Michael J. Franklin, Shawn R. Jeffery, Sailesh Krishnamurthy, Frederick
      Reiss, Shariq Rizvi, Eugene Wu, Owen Cooper, Anil Edakkunni, Wei Hong
      CIDR 2005
 124. HiFi: A Unified Architecture for High Fan-in Systems
      Owen Cooper, Anil Edakkunni, Michael J. Franklin, Wei Hong, Shawn R.
      Jeffery, Sailesh Krishnamurthy, Frederick Reiss, Shariq Rizvi, Eugene Wu
      VLDB 2004 Demo

PHDS AND POSTDOCS

Haneen Mohammed
Yiru Chen
Zach Huang
Charlie Summers
Jiaxiang “Jerry” Liu
Haonan “Peter” Wang

Lampros Flokas (PhD, Meta)
Fotis Psallidas (PhD, Microsoft)
Thibault Sellam (postdoc, Google)

MASTERS QUESTIONS

Current Student? See if Professor Verma’s FAQ page answers your questions.

Questions about the CS Journalism dual degree? See the FAQ

TEACHING

 * Intro to DB S19, F18, F16, F15
 * Systems for Human Data Interaction F21, S20, S17
 * Database Research Topics F20, S19
 * Big Data Systems S18, S17, S16
 * Database Topics in Research & Practice S18
 * From Ascii to Answers@MIT
 * Data IAP@MIT

SERVICE

Organizer/Co-Chair: ICDE Workshops (2023), SIGMOD Student Research Competition
(2019, 2020), NYDBDay (2018), HILDA (2018), SIGMOD NRS (2017), SIGMOD Travel
Award (2015, 2016), North East DB Day (2016)

Area Chair: ICDE (2017), VLDB (2021, 2022), SIGMOD (2025)

Program Committee: WWW (2017), SIGMOD (2017, 2019, 2023), VLDB (2017, 2020), VLDB
PhD Workshop (2020), SOCC (2021, 2022), HILDA (2016, 2017, 2020, 2022), SSDBM
(2017), HCOMP (2017), CLOUDDM (2016), DATA4U (2014)

Reviewer: SIGMOD, VLDB, CIDR, SOCC, WWW, ICDE, VIS, UIST, CHI, TKDE

2014 JOB MATERIALS

Old CV, Research, Teaching, Diversity

SIDE PROJECTS

Latex Snapshots goes through your LaTeX repo’s history and renders thumbnails of
your PDF.

VLDB conference trends renders a history of databases through keyword trends in
VLDB publication titles.

pygg is a Python wrapper around ggplot2 that provides nearly the same syntax,
but in Python.

bibcleaner is a web UI to normalize and clean up BibTeX entries, and help
shorten the references section.

dbtruck imports your data into Postgres as painlessly as possible.

researchsetup is a set of recommendations to bootstrap your research.

I sometimes doodle

Mortal Kombat Papers

SOME INTERESTING LINKS

 * I made tea
 * layer ping pong
 * pornp
 * Best shows ever?
 * artist heroes