PhD in Information Engineering
The Chinese University of Hong Kong, 2022
AnimateDiff: Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning
Conference paper
Cinematic Behavior Transfer via NeRF-based Differential Filming
Conference paper
CrossGET: Cross-Guided Ensemble of Tokens for Accelerating Vision-Language Transformers
Conference paper
ScriptViz: A Visualization Tool to Aid Scriptwriting based on a Large Movie Database
Conference paper
SparseCtrl: Adding Sparse Controls to Text-to-Video Diffusion Models
Conference paper
A Coarse-to-Fine Framework for Automatic Video Unscreen
Article
Adding Conditional Control to Text-to-Image Diffusion Models
Conference paper
Automatic Conversion of Music Videos into Lyrics Videos
Conference paper
Dynamic Storyboard Generation in an Engine-based Virtual Environment for Video Production
Conference paper
HireVAE: An Online and Adaptive Factor Model Based on Hierarchical and Regime-Switch VAE
Conference paper
Self-Supervised Action Representation Learning from Partial Spatio-Temporal Skeleton Sequences
Conference paper
Zero-shot Skeleton-based Action Recognition via Mutual Information Estimation and Maximization
Conference paper
Jointly Learning the Attributes and Composition of Shots for Boundary Detection in Videos
Article
AutoGPart: Intermediate Supervision Search for Generalizable 3D Part Segmentation
Conference paper
BungeeNeRF: Progressive Neural Radiance Field for Extreme Multi-scale Scene Rendering
Conference paper
Shoot360: Normal View Video Creation from City Panorama Footage
Conference paper
Temporal and Contextual Transformer for Multi-Camera Editing of TV Shows
Conference paper
BlockPlanner: City Block Generation with Vectorized Graph Representation
Conference paper
A Local-to-Global Approach to Multi-Modal Movie Scene Segmentation
Conference paper
A Unified Framework for Shot Type Classification Based on Subject Centric Lens
Conference paper
MovieNet: A Holistic Dataset for Movie Understanding
Conference paper
Online Multi-modal Person Search in Videos
Conference paper
HotFlip: White-Box Adversarial Examples for NLP
Conference paper
EMIA6500H | AI for Visual Content Creation