About Me [CV]

I am a Ph.D. candidate in the SMILE Lab of the Department of ECE, Northeastern University, under the supervision of Prof. Yun Raymond Fu (Member of the Academy of Europe, European Academy of Sciences and Arts; Fellow of ACM, IEEE, AAAI, AAAS, US National Academy of Inventors). I received my B.S. and M.S. degrees from Xidian University, advised by Prof. Xuefeng Liang. I have interned at Adobe Research and visited Kyoto University.

Actively seeking internship opportunities for Fall 2026 and Spring/Summer 2027. Feel free to reach out for research collaborations or any inquiries!

News

TOP Paper list on Video LLM hallucination. Stars and contributions are welcome!

Apr. 2026 One Video-LLM Hallucination survey paper accepted by ACL 2026

Apr. 2026 One co-authored paper IDEA accepted by IEEE FG 2026

Jan. 2026 One paper SHIELD accepted by ICLR 2026

Dec. 2025 Passed the Ph.D. Qualifying Exam, thanks to my advisor and committee members.

Aug. 2025 One paper D-CoDe accepted by EMNLP 2025

May 2025 Started Research Internship at Adobe Research.

Sep. 2024 Started my journey at Northeastern University.

Experience

SMILE Lab, Northeastern University, Boston

Ph.D. Student, Sep. 2024 – Present

Supervisor: Prof. Yun Raymond Fu

Adobe Research, San Jose

Research Intern, May 2025 – Nov. 2025

Mentors: Zhaowen Wang, Simon Jenni, Jing Shi

Kyoto University, Kyoto

Research Student, Sep. 2023 – Mar. 2024

Supervisor: Prof. Takatsune Kumada

Xidian University, Xi'an

Master's Student, Sep. 2021 – Jun. 2024

Undergraduate Student, Sep. 2017 – Jun. 2021

Supervisor: Prof. Xuefeng Liang

Publications [Google Scholar]

Submitted to ECCV

MASON: Compositional Design Layout Understanding in VLMs through Multimodal Alignment and Structural Perception

Yiyang Huang, Zhaowen Wang, Simon Jenni, Jing Shi, Yun Fu

TL;DR: Diagnosed failure modes in layered designs (semantic drift, structural ambiguity) and built MASON, a plug-and-play framework with metadata-aware alignment and structural cue injection.

Submitted to ECCV

Rethinking Fine-Tuning: Unlocking Hidden Capabilities in Vision-Language Models

Mingyuan Zhang, Yue Bai, Yifan Wang, Yiyang Huang, Yun Fu

TL;DR: Applied MFT to VLMs: learnable gating reorganizes subnetworks without weight updates; outperforms LoRA and full fine-tuning.

ACL 2026

Distorted or Fabricated? A Survey on Hallucination in Video LLMs

Yiyang Huang, Yitian Zhang, Yizhou Wang, Mingyuan Zhang, Liang Shi, Huimin Zeng, Yun Fu

TL;DR: Authored a survey on Video-LLM hallucinations (taxonomy, benchmarks, mitigations) and maintain a curated repo.

IEEE FG 2026

IDEA: Capturing Individual Differences of Facial Expression for Authentic Expression Generation

Liang Shi, Yiyang Huang, Yun Fu

TL;DR: Trained a CLIP-style contrastive model with identity and expression modalities to guide text-to-image generation toward more authentic, individual-specific expressions.

ICLR 2026

SHIELD: Suppressing Hallucinations In LVLM Encoders via Bias and Vulnerability Defense

Yiyang Huang, Liang Shi, Yitian Zhang, Yi Xu, Yun Fu

TL;DR: Identified encoder-side causes of hallucinations and developed SHIELD, a training-free token-editing module (re-weighting + adversarial decoding) for captioning and VQA.

EMNLP 2025

D-CoDe: Scaling Image-Pretrained VLMs to Video via Dynamic Compression and Question Decomposition

Yiyang Huang, Yizhou Wang, Yun Fu

TL;DR: Developed D-CoDe, a plug-and-play pipeline with dynamic compression and question decomposition for long-video QA under a tight context budget.

ICASSP 2025

LipReading for Low-resource Languages by Language Dynamic LoRA

Shuai Zou, Xuefeng Liang, Yiyang Huang

TL;DR: Developed dynamic LoRA for meta lip shapes and multilingual instruction tuning to improve cross-lingual lipreading in low-resource settings.

ACMMM 2021

CALLip: Lipreading using Contrastive and Attribute Learning

Yiyang Huang, Xuefeng Liang, Chaowei Fang

TL;DR: Proposed CALLip, leveraging attribute learning to normalize cross-speaker variation and audio-visual contrastive learning to mitigate viseme confusion.

Academic Service

Conference Reviewer FG, ARR

Journal Reviewer ACM TKDD

Honors & Awards

2022 Outstanding Student, Xidian University

2021 National Scholarship, China

2021 Undergraduate Computer Design Competition (1st Prize), China

2019 RoboMaster National Robotics Competition (2nd Prize), China

2019 ICRA AI Challenge (3rd Prize)

Teaching Experience

Fall 2025 TA — DS 5110 Essentials of Data Science

Spring 2026 TA — DS 5020 Fundamentals of Linear Algebra and Probability

Contact

Email huang.yiyan@northeastern.edu / yiyang.huang.hukcc@gmail.com

WeChat hukcc369

Yiyang Huang 黄奕洋