About Me

Sheng Jin is currently a PhD candidate (2020-present) at the University of Hong Kong (HKU), advised by Prof. Ping Luo and co-supervised by Prof. Wenping Wang and Prof. Xiaoou Tang.

In 2020, he received his master's degree from the Department of Automation at Tsinghua University, advised by Prof. Changshui Zhang. In 2017, he received his B.Eng. degree with highest honors (Outstanding Graduate Scholarship) from Tsinghua University.

His research focuses on teaching machines and robots to see and understand human behaviors, such as human body poses, actions, and human-machine interactions.

News


  • [2024-07] Four papers accepted to ECCV'2024.

  • [2024-01] Two papers accepted to ICLR'2024 (1 Spotlight and 1 Poster).

  • [2023-12] One paper accepted to AAAI'2024.

  • [2023-03] One paper accepted to CVPR'2023.

  • [2022-08] One paper accepted to TPAMI'2022.

  • [2022-07] Three papers accepted to ECCV'2022 (1 Oral and 2 Posters).

  • [2022-04] One paper accepted to CVPR'2022 (Oral).

  • [2022-01] One paper accepted to ICLR'2022.

Education

The University of Hong Kong

PhD in Computer Science (HKPFS awardee), 2020~now
Tsinghua University

MS in Control Science and Engineering, 2017~2020
Tsinghua University

BSc in Automation (ranking 1/145), 2013~2017


Honors and Awards

  • YS and Christabel Lung Postgraduate Scholarship, 2020-2021.

  • HKU Presidential PhD Scholarship (HKU-PS), 2020.

  • Hong Kong PhD Fellowships (HKPF), 2020.

  • Outstanding Graduate Scholarship, Tsinghua University (top 1% in Tsinghua), 2017.

  • The Baosteel Excellent Student Scholarship, 2016.

  • Zheng Weimin Scholarship (2nd class) for Comprehensive Excellence, 2016.

  • Tsinghua-JJWorld (Beijing) Network Technology Fellowships, Tsinghua University, 2015.

  • Tsinghua-Evergrande Fellowships for Academic Excellence, Tsinghua University, 2014.


Selected Publications

* denotes equal contribution.
UniFS: Universal Few-shot Instance Perception with Point Representations

Sheng Jin*, Ruijie Yao*, Lumin Xu, Wentao Liu, Chen Qian, Ji Wu, Ping Luo

European Conference on Computer Vision (ECCV), 2024.

[Paper]  [Code & Data]
You Only Learn One Query: Learning Unified Human Query for Single-Stage Multi-Person Multi-Task Human-Centric Perception

Sheng Jin*, Shuhuai Li*, Tong Li, Wentao Liu, Chen Qian, Ping Luo

European Conference on Computer Vision (ECCV), 2024.

[Paper]  [Code & Data]
When Pedestrian Detection Meets Multi-Modal Learning: Generalist Model and Benchmark Dataset

Yi Zhang, Wang Zeng, Sheng Jin, Chen Qian, Ping Luo, Wentao Liu

European Conference on Computer Vision (ECCV), 2024.

[Paper]  [Code & Data]
GKGNet: Group K-Nearest Neighbor based Graph Convolutional Network for Multi-Label Image Recognition

Ruijie Yao, Sheng Jin, Lumin Xu, Wang Zeng, Wentao Liu, Chen Qian, Ping Luo, Ji Wu

European Conference on Computer Vision (ECCV), 2024.

[Paper]  [Code]
TCFormer: Visual Recognition via Token Clustering Transformer

Wang Zeng, Sheng Jin, Lumin Xu, Wentao Liu, Chen Qian, Wanli Ouyang, Ping Luo, Xiaogang Wang

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2024.

[Paper]  [Code]
Pose for Everything: Towards Category-Agnostic Pose Estimation

Lumin Xu*, Sheng Jin*, Wentao Liu, Chen Qian, Ping Luo, Wanli Ouyang, Xiaogang Wang

European Conference on Computer Vision (ECCV), 2022, Oral.

[Paper]  [Code & Data]  [Blog(商汤学术)]   [Talk(OpenMMLab社区)]
Whole-Body Human Pose Estimation in the Wild

Sheng Jin, Lumin Xu, Jin Xu, Can Wang, Wentao Liu, Chen Qian, Wanli Ouyang, and Ping Luo

European Conference on Computer Vision (ECCV), 2020.

[Paper] [Dataset]
Differentiable Hierarchical Graph Grouping for Multi-Person Pose Estimation

Sheng Jin, Wentao Liu, Enze Xie, Wenhai Wang, Chen Qian, Wanli Ouyang, and Ping Luo

European Conference on Computer Vision (ECCV), 2020.

[Paper] [Blog(知乎)]
Multi-person Articulated Tracking with Spatial and Temporal Embeddings

Sheng Jin, Wentao Liu, Wanli Ouyang, Chen Qian

Conference on Computer Vision and Pattern Recognition (CVPR), 2019.

[Paper] [Demo]


Other papers

F-LMM: Grounding Frozen Large Multimodal Models

Size Wu, Sheng Jin, Wenwei Zhang, Lumin Xu, Wentao Liu, Wei Li, Chen Change Loy

arXiv, 2024.

[Paper]
AutoMMLab: Automatically Generating Deployable Models from Language Instructions for Computer Vision Tasks

Zekang Yang, Wang Zeng, Sheng Jin, Chen Qian, Ping Luo, Wentao Liu

arXiv, 2024.

[Paper]
CLIPSelf: Vision Transformer Distills Itself for Open-Vocabulary Dense Prediction

Size Wu, Wenwei Zhang, Lumin Xu, Sheng Jin, Xiangtai Li, Wentao Liu, Chen Change Loy

International Conference on Learning Representations (ICLR), 2024, Spotlight.

[Paper]  [Code]  [Blog(商汤学术)]
PROGRAM: PROtotype GRAph Model based Pseudo-Label Learning for Test-Time Adaptation

Haopeng Sun, Lumin Xu, Sheng Jin, Ping Luo, Chen Qian, Wentao Liu

International Conference on Learning Representations (ICLR), 2024.

CLIM: Contrastive Language-Image Mosaic for Region Representation

Size Wu, Wenwei Zhang, Lumin Xu, Sheng Jin, Wentao Liu, Chen Change Loy

AAAI Conference on Artificial Intelligence (AAAI), 2024.

[Paper]  [Code]  [Blog(商汤学术)]
Aligning Bag of Regions for Open-Vocabulary Object Detection

Size Wu, Wenwei Zhang, Sheng Jin, Wentao Liu, Chen Change Loy

Conference on Computer Vision and Pattern Recognition (CVPR), 2023.

[Paper]  [Code]   [Project]  [Blog(商汤学术)]
ZoomNAS: Searching for Whole-body Human Pose Estimation in the Wild

Lumin Xu, Sheng Jin, Wentao Liu, Chen Qian, Wanli Ouyang, Ping Luo, Xiaogang Wang

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022.

[Paper]  [Data]   [Talk(OpenMMLab社区)]
PoseTrans: A Simple Yet Effective Pose Transformation Augmentation for Human Pose Estimation

Wentao Jiang, Sheng Jin, Wentao Liu, Chen Qian, Ping Luo, Si Liu

European Conference on Computer Vision (ECCV), 2022.

[Paper]  [Code]  [Blog(商汤学术)]   [Talk(OpenMMLab社区)]
3D Interacting Hand Pose Estimation by Hand De-occlusion and Removal

Hao Meng*, Sheng Jin*, Wentao Liu, Chen Qian, Mengxiang Lin, Wanli Ouyang, Ping Luo

European Conference on Computer Vision (ECCV), 2022.

[Paper]  [Code & Data]   [Project]  [Blog(商汤学术)]   [Talk(OpenMMLab社区)]
Not All Tokens Are Equal: Human-centric Visual Analysis via Token Clustering Transformer

Wang Zeng, Sheng Jin, Wentao Liu, Chen Qian, Ping Luo, Wanli Ouyang, Xiaogang Wang

Conference on Computer Vision and Pattern Recognition (CVPR), 2022, Oral.

[Paper]  [Code]  [Blog(商汤学术)]   [News(机器之心)]   [Talk(OpenMMLab社区)]
Pseudo-Labeled Auto-Curriculum Learning for Semi-Supervised Keypoint Localization

Can Wang, Sheng Jin, Yingda Guan, Wentao Liu, Chen Qian, Ping Luo, Wanli Ouyang

International Conference on Learning Representations (ICLR), 2022.

[Paper]
Graph-Based 3D Multi-Person Pose Estimation Using Multi-View Images

Size Wu, Sheng Jin, Wentao Liu, Lei Bai, Chen Qian, Dong Liu, Wanli Ouyang

IEEE International Conference on Computer Vision (ICCV), 2021.

[Paper]  [Code]  [Blog(商汤学术)]  [Demo]
ViPNAS: Efficient Video Pose Estimation via Neural Architecture Search

Lumin Xu, Yingda Guan, Sheng Jin, Wentao Liu, Chen Qian, Ping Luo, Wanli Ouyang, Xiaogang Wang

Conference on Computer Vision and Pattern Recognition (CVPR), 2021.

[Paper]  [Code]   [Talk(OpenMMLab社区)]
When Human Pose Estimation Meets Robustness: Adversarial Algorithms and Benchmarks

Jiahang Wang, Sheng Jin, Wentao Liu, Weizhong Liu, Chen Qian, Ping Luo

Conference on Computer Vision and Pattern Recognition (CVPR), 2021.

[Paper]  [Code]
When Counterpoint Meets Chinese Folk Melodies

Nan Jiang, Sheng Jin, Zhiyao Duan, Changshui Zhang

Conference on Neural Information Processing Systems (NeurIPS), 2020.

[Paper] [Supplementary] [Poster] [Code] [Project Page]
TRB: A Novel Triplet Representation for Understanding 2D Human Body

Haodong Duan, Kwan-Yee Lin, Sheng Jin, Wentao Liu, Chen Qian, Wanli Ouyang

IEEE International Conference on Computer Vision (ICCV), 2019, Oral.

[Paper] [Dataset]
Robust Few-Shot Learning for User-Provided Data

Jiang Lu, Sheng Jin, Jian Liang, and Changshui Zhang

IEEE Transactions on Neural Networks and Learning Systems (TNNLS).

[Paper]
RL-Duet: Online Music Accompaniment Generation Using Deep Reinforcement Learning

Nan Jiang, Sheng Jin, Zhiyao Duan, Changshui Zhang

AAAI Conference on Artificial Intelligence (AAAI), 2020, Oral.

[Paper] [Demo]
Hierarchical Automatic Curriculum Learning: Converting a Sparse Reward Navigation Task into Dense Reward

Nan Jiang, Sheng Jin, Changshui Zhang

Neurocomputing, 2019.

[Paper]
Connectionist Temporal Classification with Maximum Entropy Regularization

Hu Liu, Sheng Jin, Changshui Zhang

Conference on Neural Information Processing Systems (NeurIPS), 2018, Spotlight.

[Paper] [Poster] [Code]
Towards Multi-Person Pose Tracking: Bottom-up and Top-down Methods

Sheng Jin, Xujie Ma, Zhipeng Han, Yue Wu, Wei Yang, Wentao Liu, Chen Qian, Wanli Ouyang

International Conference on Computer Vision (ICCV) PoseTrack Workshop, 2017.

[Paper] [Leaderboard] (BUTDS and BUTD2) [Demo]


Projects

MMPose Toolbox

MMPose is an open-source toolbox for pose estimation based on PyTorch, which is a part of the OpenMMLab project.

[Project]
ACM MM'2020 HiEve Challenge

Our team (SimpleTrack) won 3rd place in Track-3 "Crowd Pose Tracking in Complex Events" of the ACM MM'2020 HiEve Challenge.

[Leaderboard] [Technical Report]
CVPR'2018 Look Into Person (LIP) Challenge

Our team (MJDG) won 2nd place in Track-4 "Multi-Human Pose Estimation Challenge" of the CVPR'2018 LIP Challenge.

[Leaderboard] [Oral Presentation]  
ICCV'2017 PoseTrack Challenge

Our team (BUTDS | BUTD2) won 2nd place in both Track-1 "Single-Frame Person Pose Estimation" and Track-3 "Multi-Person Pose Tracking" of the ICCV'2017 PoseTrack Challenge.

[Leaderboard] [Technical Report]  [Oral Presentation]  [Demo]


Patents

Key point detection method, device, electronic equipment and storage medium

Sheng Jin, Wentao Liu, Chen Qian

Chinese Invention Patent.

Publication Number: CN111898642A. Publication Date: 2020-11-06.
Key point detection method, device, electronic equipment and storage medium

Sheng Jin, Wentao Liu, Chen Qian

Chinese Invention Patent.

Publication Number: CN111783882A. Publication Date: 2020-10-16.
Image processing method and device, detection device and storage medium

Tong Li, Sheng Jin, Wentao Liu, Chen Qian

Chinese Invention Patent.

Publication Number: CN111539992A. Publication Date: 2020-08-14.
Key point detection method, device, electronic equipment and storage medium

Sheng Jin, Wentao Liu, Chen Qian

Chinese Invention Patent.

Publication Number: CN111444928A. Publication Date: 2020-07-24.
Image processing method and device, detection device and storage medium

Sheng Jin, Wentao Liu, Chen Qian

Chinese Invention Patent.

Publication Number: CN109948526A. Publication Date: 2019-06-28.
Image processing method and device, detection device and storage medium

Sheng Jin, Wentao Liu, Chen Qian

Chinese Invention Patent.

Publication Number: CN109934183A. Publication Date: 2019-06-25.
Deep learning model training method and device, training equipment and storage medium

Sheng Jin, Wentao Liu, Chen Qian

Chinese Invention Patent.

Publication Number: CN109919245A. Publication Date: 2019-06-21.


Teaching


  • TA, Deep Learning (COMP7606), HKU, [autumn, 2021]
  • TA, From Human Vision to Machine Vision (CCST9049), HKU, [spring, 2020]
  • TA, Introduction to Artificial Intelligence (40250182-0), THU, [spring, 2019]


Activities


  • Conference Reviewer/PC Member
    • NeurIPS'19-24, AAAI'19-24, ICML'20-24, CVPR'20-24, ICCV'21-23, ECCV'22-24, ICLR'21-24, WACV'21-24
  • Journal Reviewer
    • Transactions on Pattern Analysis and Machine Intelligence (TPAMI), IEEE Transactions on Artificial Intelligence (TAI), Transactions on Image Processing (TIP), International Journal of Computer Vision (IJCV), IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), IEEE Transactions on Visualization and Computer Graphics (TVCG)
  • Website Chairs


Contacts

    js20 [at] connect.hku.hk | jinsheng13 [at] foxmail.com