IF-Prune: Information-Flow Guided Token Pruning for Efficient Vision-Language Models
Published in CVPR, 2026
This work introduces a statistical approach to constructing an importance map for token pruning in VLMs.
Recommended citation:
Download Paper
Published in NeurIPS, 2025
This work introduces a new deep reinforcement learning method that enhances the reasoning ability of VLMs.
Recommended citation: Sun, G., Hua, H., Wang, J., Luo, J., Dianat, S., Rabbani, M., & Tao, Z. (2025). Latent chain-of-thought for visual reasoning. NeurIPS
Download Paper
Published in Empirical Methods in Natural Language Processing (EMNLP), 2024
This work introduces a novel self-training approach that improves the data efficiency of training LVLMs for medical tasks.
Recommended citation: Sun, G., Qin, C., Fu, H., Wang, L., & Tao, Z. (2024). STLLaVA-Med: Self-Training Large Language and Vision Assistant for Medical. EMNLP
Download Paper
Published in Proceedings of the 18th European Conference on Computer Vision (ECCV), 2024
This work introduces a new training method that enhances general-purpose vision-language understanding and image-oriented question answering through visual self-questioning.
Recommended citation: Sun, G., Qin, C., Wang, J., Chen, Z., Xu, R., & Tao, Z. (2024). SQ-LLaVA: Self-Questioning for Large Vision-Language Assistant. ECCV
Download Paper