About Me [CV]
I am a second-year Ph.D. student in the College of Engineering at Northeastern University, advised by Prof. Yun Raymond Fu in the SMILE Lab.
I received my B.S. and M.S. degrees from Xidian University, advised by Prof. Xuefeng Liang. During my master's studies, I visited Kyoto University, working with Prof. Takatsune Kumada.
Research interests: Multimodal LLMs | Efficiency | Reliability | Hallucination Detection & Mitigation | Video Understanding | Layout Understanding
Actively seeking internship opportunities.
News
TOP Paper list on Video LLM hallucination. Stars and contributions are welcome!
Jan. 2026 One paper SHIELD accepted by ICLR 2026
Dec. 2025 Passed the Ph.D. Qualifying Exam, thanks to my advisor and committee members.
Aug. 2025 One paper D-CoDe accepted by EMNLP 2025
May 2025 Started Research Internship at Adobe Research.
Sep. 2024 Started my journey at Northeastern University.
Experience
SMILE Lab, Northeastern University, Boston
Ph.D. Student, Sep. 2024 – Present
Supervisor: Prof. Yun Raymond Fu

Adobe Research, San Jose
Research Intern, May 2025 – Nov. 2025

Xidian University, Xi'an
Master's Student, Sep. 2021 – Jun. 2024
Undergraduate Student, Sep. 2017 – Jun. 2021
Supervisor: Prof. Xuefeng Liang

Publications [Google Scholar]
Submitted to ARR
Submitted to CVPR
MASON: Compositional Design Layout Understanding in VLMs through Multimodal Alignment and Structural Perception
TL;DR: Diagnosed failure modes in layered designs (semantic drift, structural ambiguity) and built MASON, a plug-and-play framework with metadata-aware alignment and structural cue injection.
Submitted to CVPR
Rethinking Fine-Tuning: Unlocking Hidden Capabilities in Vision-Language Models
TL;DR: Applied MFT to VLMs: learnable gating reorganizes subnetworks without weight updates; outperforms LoRA and full fine-tuning.
ICLR 2026
EMNLP 2025
ICASSP 2025
LipReading for Low-resource Languages by Language Dynamic LoRA
TL;DR: Developed dynamic LoRA for meta lip shapes and multilingual instruction tuning to improve cross-lingual lipreading in low-resource settings.
ACMMM 2021
CALLip: Lipreading using Contrastive and Attribute Learning
TL;DR: Proposed CALLip, leveraging attribute learning to normalize cross-speaker variation and audio-visual contrastive learning to mitigate viseme confusion.
Academic Service
Conference Reviewer: FG, ARR
Journal Reviewer: ACM TKDD
Honors & Awards
2022 Outstanding Student, Xidian University
2021 National Scholarship, China
2021 Undergraduate Computer Design Competition (1st Prize), China
2019 RoboMaster National Robotics Competition (2nd Prize), China
2019 ICRA AI Challenge (3rd Prize)
Teaching Experience
Fall 2025 TA — DS 5110 Essentials of Data Science
Spr. 2026 TA — DS 5020 Fundamentals of Linear Algebra and Probability
Contact
WeChat: hukcc369
