Zheng Yuan

Ph.D. Candidate in Computer Science

Resume

Advisor: Doug Downey
EECS Department
Northwestern University
Office: 3.324 Ford, 2133 Sheridan Rd, Evanston, IL 60208
Email: zys133@eecs.northwestern.edu


I am a Ph.D. candidate in computer science at Northwestern University. Welcome to my personal website.

Interests

Named entity typing, deep learning, entity relation extraction, natural language processing, hyper-parameter optimization, parallel computing, operating systems

Recent Projects

Named entity typing using deep learning

Classical named entity typing (NET) targeted a few coarse-grained types, but in recent years the task has expanded to sets of hundreds of types. However, existing work in NET assumes that the target types are specified in advance and that hand-labeled examples of each type are available. What if the set of target types is not known in advance? I am currently working on a neural network architecture for open named entity typing (ONET) that tags entities with types not seen in training.

Hyper-parameter optimization for machine learning algorithms

Most machine learning algorithms have hyper-parameters, which greatly impact their performance. Unfortunately, there is usually no closed-form expression that describes the relationship between the hyper-parameters and the performance of a machine learning algorithm. People spend a lot of time tuning hyper-parameters, especially for neural networks. Finding good hyper-parameters, especially for large datasets, is therefore a challenging and worthwhile problem. Can we predict good hyper-parameters for a large dataset from the hyper-parameters that work well on a small one? I tried this; however, it did not work well.

Parallel data compression

Spatial-temporal data record values at the same spatial locations across different time stamps. Such datasets are common today, especially in scientific simulations. By clustering the similarities between time stamps, spatial-temporal data can be compressed substantially. I designed and implemented parallel-NUMARCK, a parallel data compression algorithm for spatial-temporal data. Using 1600 cores, parallel-NUMARCK compressed 472 GB of turbulence stirring data in 2.7 seconds.

Publications

More about me

I am a Christian.
I spend most of my spare time playing musical instruments. Instruments I have played: violin (my main instrument these days), flute, piano, ocarina, xiao, erhu, cucurbit flute, bawu, pianica.
Snowboarding is my favorite winter activity.
Here is my bungee video in Beijing. Remember to wear snow goggles so that you can open your eyes after you jump.

Last Updated: Nov 8, 2017