Michael R. Lucas
I am a Ph.D. candidate in the Northwestern University
EECS Department, where I have worked with Professor
Doug Downey since 2011 as a member of the
WebSAIL research lab.
Previously, I researched Computational Complexity and Mechanism Design
(algorithm-focused auction design) as a member of the
Economics and
Theory Group from 2009 to 2010 with Professor
Lance Fortnow.
My research focuses on designing and scaling
Machine Learning methods for use on enormous
datasets, specifically in the fields of Semi-Supervised
Learning, Natural Language Processing, and Bayesian Models.
Background
I completed my undergraduate degree in Computer Science at
The University of Dayton in 2008. In 2007, I interned at GE Aviation
in Evendale, Ohio, where I automated systems for the detection
of production delays in the Supply Chain Management department.
Non-graduate research experience includes a summer research internship at
UIC working in Distributed Machine Learning with
Dr. Robert Grossman and
Dr. Yunhong Gu.
There, I designed and implemented large-scale clustering
algorithms to run on top of their
UDT Protocol.
Earliest of all, I was a research assistant for Dr. Kathryn Fischbach at
UTHSCSA's Alzheimer's research lab.
Publications
[PDF]
Michael R. Lucas and Doug Downey.
"Semi-supervised Naive Bayes with Feature Marginals,"
The 51st Annual Meeting
of the Association for Computational Linguistics,
Sofia, Bulgaria, August 4-9, 2013.
Summary:
When labeled data is scarce or expensive to collect,
semi-supervised learning (SSL) methods that utilize
unlabeled datasets can outperform standard machine
learning algorithms. In this paper, we propose
a scalable SSL improvement to the classic Naive Bayes
Classifier.
Modern SSL techniques typically require multiple
passes over the unlabeled data, which is often
infeasible on the web-scale corpora being produced
today. We show that using unlabeled data to improve
baseline estimates of word frequencies
can improve Naive Bayes Classifiers while
scaling to modern massive data sets.
In experiments with text topic classification and
sentiment analysis, we show that our method is both
more scalable and more accurate than SSL techniques
from previous work.
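To make the general idea in the summary concrete, here is a minimal sketch. It is not the algorithm from the paper: it is a simplified stand-in in which word frequencies estimated in a single pass over unlabeled text shape the smoothing of a multinomial Naive Bayes classifier. The function names (`word_marginals`, `train`, `classify`), the `strength` parameter, the 50/50 mix with a uniform distribution, and the toy data are all assumptions made for illustration.

```python
import math
from collections import Counter


def word_marginals(unlabeled_docs):
    """Estimate P(w) in a single pass over the unlabeled documents."""
    counts = Counter()
    for doc in unlabeled_docs:
        counts.update(doc.lower().split())
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def train(labeled_docs, marginals, strength=5.0):
    """Fit multinomial Naive Bayes with marginal-shaped smoothing.

    ASSUMPTION: `strength` pseudo-counts are spread over the vocabulary in
    proportion to a 50/50 mix of the unlabeled marginal and a uniform
    distribution. This is an illustrative stand-in, not the paper's method.
    """
    doc_counts = Counter()
    word_counts = {}
    for text, label in labeled_docs:
        doc_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.lower().split())

    # Vocabulary covers every word seen in labeled or unlabeled text.
    vocab = set(marginals)
    for counts in word_counts.values():
        vocab.update(counts)

    # Smoothing distribution over the vocabulary; it sums to 1.
    prior = {w: 0.5 * marginals.get(w, 0.0) + 0.5 / len(vocab) for w in vocab}

    n_docs = sum(doc_counts.values())
    model = {}
    for label, counts in word_counts.items():
        total = sum(counts.values())
        model[label] = {
            "log_prior": math.log(doc_counts[label] / n_docs),
            "log_prob": {
                w: math.log((counts[w] + strength * prior[w]) / (total + strength))
                for w in vocab
            },
        }
    return model


def classify(model, text):
    """Return the label with the highest log score; unknown words are skipped."""
    scores = {}
    for label, params in model.items():
        score = params["log_prior"]
        for w in text.lower().split():
            if w in params["log_prob"]:
                score += params["log_prob"][w]
        scores[label] = score
    return max(scores, key=scores.get)


# Toy data, invented for this example.
unlabeled = ["great film", "dull film", "fun plot"]
labeled = [("great fun film", "pos"), ("dull boring film", "neg")]
model = train(labeled, word_marginals(unlabeled))
print(classify(model, "great film"))
```

The unlabeled data is read exactly once, so the cost of using it grows linearly with corpus size, which is the scalability property the summary emphasizes.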
Experience
Conference Experience
- Program Committee: IJCAI 2016.
- Program Committee: ACL 2014.
- Program Committee: NAACL-HLT 2013.
- Program Committee: EMNLP-CoNLL 2012.
Teaching Assistantships
- EECS 101 - An Intro to Computer Science for Everyone. Fall 2015.
- EECS 348 - Artificial Intelligence. Spring 2014.
- EECS 310 - Data Structures. Fall 2010.
Contact
Northwestern University
Evanston, IL 60208, USA
mlucas@u.northwestern.edu