About Me
Hi! I’m Guohao, a Ph.D. candidate studying Machine Learning at the School of Information, Rochester Institute of Technology (RIT), advised by Zhiqiang Tao. My research focuses on data-centric approaches to vision–language modeling, with an emphasis on perception, reasoning, and reliable generalization in multimodal systems. I am particularly interested in improving vision–language models through preference optimization, latent reasoning, and hallucination mitigation, grounded in probabilistic and information-theoretic principles.
My research
I am drawn to simplicity, value principled understanding, and enjoy building practical systems. My current research focuses on multimodal foundation models such as GPT-4o and CLIP, which can be adapted to a wide range of downstream tasks. I aim to study how these models reason and fail, and to enhance their robustness and reliability through pretraining and fine-tuning techniques, including self-questioning, structured reasoning, variational attention, and information bottleneck–based modeling.
My background and history
I received my B.S. from the College of Engineering at Michigan State University in 2018 and my M.S. from the Department of Computer Science and Engineering at Santa Clara University in 2021. Between my undergraduate and graduate studies, I was an SDE at Robotrak, focusing on the design and implementation of traditional and machine learning algorithms.
