Zheng Yuan

Ph.D. Candidate in Computer Science

Resume

Advisor: Doug Downey
EECS Department
Northwestern University
Office: 3.324 Ford, 2133 Sheridan Rd, Evanston, IL 60208
Email: zys133@eecs.northwestern.edu


I am a computer science Ph.D. candidate at Northwestern University. Welcome to my personal website.

Interests

Named entity typing, deep learning, entity relation extraction, natural language processing, hyper-parameter optimization, parallel computing, operating systems

Selected Projects

Open named entity typing (ONET) using deep learning

Classical named entity typing (NET) targeted a few coarse-grained types, but the task has expanded to sets of hundreds of types in recent years. However, existing work in NET assumes that the target types are specified in advance and that hand-labeled examples of each type are available. What if the set of target types is not known in advance? I am currently working on a neural network architecture for open NET (ONET) that tags entities with types not seen in training.

Hyper-parameter optimization for machine learning algorithms

Almost every machine learning algorithm has hyper-parameters, which greatly impact its performance. Unfortunately, there is usually no closed-form expression that describes the relationship between the hyper-parameters and the performance of a machine learning algorithm. People spend a lot of time tuning hyper-parameters nowadays, especially for neural networks. Thus, how to find good hyper-parameters, especially for large datasets, is a challenging and beneficial topic. Can we actually predict good hyper-parameters for a large dataset using the good hyper-parameters found on a small dataset? I tried this. However, it doesn't work well...

Parallel data compression

Spatial-temporal data records information at the same spatial locations but at different time stamps. Spatial-temporal datasets are commonly used nowadays, especially in scientific simulations. By clustering the similarities between different time stamps, we can compress spatial-temporal data substantially. I designed and implemented parallel-NUMARCK, a parallel data compression algorithm for spatial-temporal data. Using 1600 cores, parallel-NUMARCK compressed 472 GB of turbulence stirring data within 2.7 seconds.

Publications


For God so loved the world that he gave his one and only Son, that whoever believes in him shall not perish but have eternal life. (John 3:16)
Last Updated: Feb 2, 2018