Publications

Ego3DT: Tracking Every 3D Object in Ego-centric Videos
Shengyu Hao, Wenhao Chai, Zhonghan Zhao, Meiqi Sun, Wendi Hu, Jieyang Zhou, Yixian Zhao, Qi Li, Yizhou Wang, Xi Li, Gaoang Wang
ACM Multimedia (MM), 2024
[Paper] [Video]
We propose a novel zero-shot approach for 3D reconstruction and tracking of all objects in ego-centric videos.

See and Think: Embodied Agent in Virtual Environment
Zhonghan Zhao, Wenhao Chai, Xuan Wang, Boyi Li, Shengyu Hao, Shidong Cao, Tian Ye, Gaoang Wang
European Conference on Computer Vision (ECCV), 2024
[Paper]
This paper proposes STEVE, a comprehensive and visionary embodied agent in the Minecraft virtual environment.

DIVOTrack: A Novel Dataset and Baseline Method for Cross-View Multi-Object Tracking in DIVerse Open Scenes
Shengyu Hao, Peiyuan Liu, Yibing Zhan, Kaixun Jin, Zuozhu Liu, Mingli Song, Jenq-Neng Hwang, Gaoang Wang
International Journal of Computer Vision (IJCV), 2024
[Paper] [Dataset] [Code]
A new cross-view multi-object tracking dataset for DIVerse Open scenes with densely tracked pedestrians.

DiffFashion: Reference-based Fashion Design with Structure-aware Transfer by Diffusion Models
Shidong Cao, Wenhao Chai, Shengyu Hao, Yanting Zhang, Hangyue Chen, Gaoang Wang
IEEE Transactions on Multimedia (TMM), 2023
[Paper] [Code]
We transfer a reference appearance image onto a clothing image while preserving the structure of the clothing image.

Weakly supervised instance segmentation using multi-prior fusion
Shengyu Hao, Gaoang Wang, Renshu Gu
Computer Vision and Image Understanding (CVIU), 2021
[Paper]
We propose a novel method that combines multiple priors for mask generation.