IF-Prune: Information-Flow Guided Token Pruning for Efficient Vision-Language Models
Published in CVPR, 2026
This work introduces a statistical approach to constructing an importance map for token pruning in VLMs.
Recommended citation:
Download Paper
Published in NeurIPS, 2025
This work introduces a new deep reinforcement learning method that enhances the reasoning ability of VLMs.
Recommended citation: Sun, G., Hua, H., Wang, J., Luo, J., Dianat, S., Rabbani, M., & Tao, Z. (2025). Latent chain-of-thought for visual reasoning. NeurIPS
Download Paper
Published in Empirical Methods in Natural Language Processing (EMNLP), 2024
This work introduces a novel self-training approach that improves the data efficiency of training LVLMs for medical tasks.
Recommended citation: Sun, G., Qin, C., Fu, H., Wang, L., & Tao, Z. (2024). STLLaVA-Med: Self-Training Large Language and Vision Assistant for Medical. EMNLP
Download Paper
Published in Proceedings of the 18th European Conference on Computer Vision (ECCV), 2024
This work introduces a new training method that enhances general-purpose vision-language understanding and image-oriented question answering through visual self-questioning.
Recommended citation: Sun, G., Qin, C., Wang, J., Chen, Z., Xu, R., & Tao, Z. (2024). SQ-LLaVA: Self-Questioning for Large Vision-Language Assistant. ECCV
Download Paper