Research

My research advances artificial intelligence and machine learning, with an emphasis on multimodal understanding, multimodal generation, and robustness. As a PhD student in Electrical and Computer Engineering (ECE) at Duke University, I work in the following research directions:
📚 Research Areas
🔍 Multimodal Understanding
Long-term Memory
Hippocampal-inspired Memory for Video Understanding [HippoMM]
Efficient Processing
Keyframe-oriented Token Pruning for Vision [KVTP]
Context-aware Pruning for Speech [SpeechPrune]
Signal Processing
Speech Envelope Decoding from EEG Signals [EEG-Decoding]
🎨 Multimodal Generation
Voice Synthesis
Bilingual Singing Voice Synthesis [BiSinger]
Singing Voice Data Scaling-up [ACE-Opencpop]
Cross-modal Generation
Stable Diffusion-Enhanced Voice Generation [Face2Voice, SDEVG]
Speech Enhancement
Character-Based TV & Movie Speech Dataset [TMCSPEECH]
Zero-Shot Dysarthric Speech Reconstruction [TSVC]
🛡️ Robustness in AI Systems
Adversarial Examples
Natural Adversarial Examples with Stable Diffusion [SD-NAE]
Out-of-Distribution Detection
Enhanced Benchmark for OOD Detection [OpenOOD]
Vision-Language Robustness
Generalized OOD Detection Survey [OOD-Survey]
🔧 Tools & Applications
Research Visualization
Research Trend Visualization Toolkit [RTVis]
AI Frameworks
Singing Voice Synthesis Toolkit [Muskits]
Materials Science
Tracking Nanoparticle Diffusion in Polymers [NanoTrack]

Recognition & Achievements

  • Oral presentation at Interspeech 2024
  • Best Paper Award at AAAI Spring Symposium 2025
  • Honorable Mention Demo Award at ACM Multimedia 2024

Industry Experience

Adobe Research

Research Intern · San Jose, CA · Summer 2025

Last edited on May 30, 2025