Yang Sui | 隋阳
Howdy! I'm a Postdoctoral Research Associate in the Department of Computer Science at Rice University, working with Prof. Xia "Ben" Hu and Prof. Hanjie Chen on Efficient and Trustworthy AI, especially for LLMs, Diffusion Models, and Multimodal Generative Models.
Prior to joining Rice, I completed my Ph.D. in the Department of Electrical and Computer Engineering at Rutgers University, advised by Prof. Bo Yuan.
In Spring 2024, I was a Research Intern on the Creative Vision team at Snap Research, where I proposed BitsFusion, a 1.99-bit quantization of a text-to-image generative model. In 2022, I was a Research Intern at the Media Lab, Tencent America, exploring the efficiency and robustness of Learned Image Compression and Transformer models. In 2019, I was a full-time Algorithm Engineer at JD, working on face verification and recognition. In 2018, I spent a wonderful time as a Research and Development Intern and a member of the PaddlePaddle team (20.4k stars now) at Baidu, where I helped initiate the deep learning inference framework Paddle-Lite (6.6k stars now), featured at NeurIPS Expo, Baidu Create, and Wave Summit+.
In addition to my academic work, I am passionate about basketball, DOTA/DOTA2, and World of Warcraft. I love Tracy McGrady, Stephen Curry, Lionel Messi, and PIS (YaphetS).
I'm looking forward to collaborations on efficient and trustworthy AI, especially for LLMs and Diffusion Models. If you're interested in working with me, please don't hesitate to contact me.
Email / Google Scholar / Github
Research Interests
My research primarily focuses on Efficient AI and Trustworthy AI. In Efficient AI, I develop techniques that make deep learning models resource-efficient without compromising their accuracy or performance. I design compression methods, such as pruning, quantization, and low-rank approximation, that reduce the size and complexity of deep learning models, facilitating their deployment on resource-constrained devices such as mobile phones and embedded systems.
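To make one of these compression ideas concrete, here is a minimal sketch of symmetric uniform weight quantization in plain Python; the function names and the toy weight list are my own illustration, not code from any of the papers below:

```python
def quantize(weights, bits):
    # Symmetric uniform quantization: map each float weight onto the
    # integer grid [-(2**(bits-1) - 1), 2**(bits-1) - 1] with one shared scale.
    qmax = 2 ** (bits - 1) - 1
    scale = max(abs(w) for w in weights) / qmax
    return [round(w / scale) for w in weights], scale

def dequantize(codes, scale):
    # Recover approximate float weights from the integer codes.
    return [c * scale for c in codes]

weights = [0.8, -0.5, 0.1, -1.0]
codes, scale = quantize(weights, bits=4)   # 4-bit integer codes: [6, -4, 1, -7]
approx = dequantize(codes, scale)          # close to the original weights
```

Storing integer codes plus a single scale is what shrinks the model; practical schemes add per-channel scales, outlier handling, and quantization-aware training on top of this skeleton.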
In Trustworthy AI, I investigate model vulnerability and robustness through adversarial attacks and backdoor attacks. I am dedicated to understanding how AI models fail in the face of these malicious threats and to developing defense mechanisms that enhance the robustness of AI models against such attacks.
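As a concrete example of the attack side, the following is a minimal sketch of the classic Fast Gradient Sign Method (FGSM) against a toy logistic-regression model, in plain Python; the weight vector and input values are made up for illustration and do not come from my papers:

```python
import math

def fgsm(x, y, w, eps):
    # One-step FGSM: x_adv = x + eps * sign(grad_x loss), with the logistic
    # loss log(1 + exp(-y * <w, x>)) for a label y in {-1, +1}.
    score = sum(wi * xi for wi, xi in zip(w, x))
    sig = 1.0 / (1.0 + math.exp(y * score))      # sigmoid(-y * score)
    grad = [-y * wi * sig for wi in w]           # d(loss) / d(x_i)
    sign = lambda g: (g > 0) - (g < 0)
    return [xi + eps * sign(gi) for xi, gi in zip(x, grad)]

w = [1.0, -2.0]                  # toy model weights
x, y = [0.5, 0.2], 1             # clean input and its true label
x_adv = fgsm(x, y, w, eps=0.1)   # roughly [0.4, 0.3]
# The model's score drops from 0.1 to -0.2, so the prediction flips for y = +1.
```

Defenses such as adversarial training fold exactly these perturbed examples back into the training loop.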
Specifically, my research areas include:
Technologies:
Efficient AI: Model Compression & Data Compression.
Trustworthy AI: Model Vulnerability & Robustness.
Algorithm-hardware Co-design for AI Model Acceleration.
Tasks:
Generative AI: Text-to-Image Diffusion Models, Large Language Models, Multimodal Models.
Fundamental Deep Neural Networks: CNN, Transformer.
Image Processing: Learned Image Compression.
Digital Signal Processing: Error Correction Coding, Radio Frequency Neural Network.
Previous "Efficient Deep Learning Reading Group" Sessions:
News
- 10/2024: Invited to deliver a guest lecture, "Model Compression: Pruning, Quantization, and Recent Advances," at Texas A&M University, CSCE 689 Special Topics: Generative AI.
- 10/2024: I’m glad to join the Department of Computer Science at Rice University as a Postdoctoral Associate.
- 09/2024: One paper is accepted by The Thirty-eighth Annual Conference on Neural Information Processing Systems (NeurIPS 2024).
- 09/2024: One paper is accepted as Findings by The 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP 2024).
- 09/2024: One paper is accepted by the 30th Asia and South Pacific Design Automation Conference (ASP-DAC 2025).
- 07/2024: One paper is accepted by The 35th British Machine Vision Conference (BMVC 2024).
- 07/2024: One paper is accepted by European Conference on Computer Vision (ECCV 2024).
- 05/2024: One paper is accepted by IEEE Transactions on Neural Networks and Learning Systems (TNNLS).
- 04/2024: I’m glad to receive the Paul Panayotatos Scholarship at Rutgers University.
- 03/2024: One paper is accepted by IEEE/ACM Design Automation Conference (DAC 2024).
- 02/2024: I’m glad to join the Creative Vision team at Snap Research as a Research Intern. I love the beautiful beaches in Santa Monica and the vibrant life in Los Angeles.
- 12/2023: Two papers are accepted as posters by the Data Compression Conference (DCC 2024).
- 10/2023: One paper is accepted by The International Symposium on High-Performance Computer Architecture (HPCA 2024).
- 09/2023: One paper is accepted by IEEE/ACM International Conference on Computer-Aided Design (ICCAD 2023).
- 07/2023: Invited to deliver a talk, "Efficient Diffusion Models and Large Language Models: Quantization, Pruning, and LoRA." (Video)
- 07/2023: One paper is accepted with Spotlight presentation at ICML'23 NCW Workshop.
- 06/2023: One paper is accepted by the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2023).
- 03/2023: One paper is accepted by The 50th International Symposium on Computer Architecture (ISCA 2023).
- 02/2023: One paper is accepted by IEEE/ACM Design Automation Conference (DAC 2023).
- 02/2023: One paper receives the Best Paper Runner-Up Award with Oral presentation at AAAI’23 DCAA Workshop.
- 11/2022: Two papers are accepted with Oral presentations by the AAAI Conference on Artificial Intelligence (AAAI 2023).
- 10/2022: Presented a poster at the IBM IEEE CAS/EDS – 5th AI Compute Symposium at the IBM Thomas J. Watson Research Center, Yorktown Heights, NY.
- 05/2022: I’m glad to join the Media Lab at Tencent America as a Research Intern, working remotely.
- 03/2022: One paper is accepted by IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2022).
- 09/2021: One paper is accepted by the Conference on Neural Information Processing Systems (NeurIPS 2021).
- 09/2021: One paper is accepted by IEEE/ACM International Conference on Computer-Aided Design (ICCAD 2021).
- 03/2021: One paper is accepted by the ACM/IEEE International Symposium on Computer Architecture (ISCA 2021).
- 02/2021: One paper is accepted by the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2021).
BitsFusion: 1.99 bits Weight Quantization of Diffusion Model
Yang Sui, Yanyu Li, Anil Kag, Yerlan Idelbayev, Junli Cao, Ju Hu, Dhritiman Sagar, Bo Yuan, Sergey Tulyakov, Jian Ren
[NeurIPS 2024] The Thirty-eighth Annual Conference on Neural Information Processing Systems
Daily Paper in Hugging Face
arXiv
Project Page
Video
MoE-I²: Compressing Mixture of Experts Models through Inter-Expert Pruning and Intra-Expert Low-Rank Decomposition
Cheng Yang*, Yang Sui*, Jinqi Xiao, Lingyi Huang, Yu Gong, Yuanlin Duan, Wenqi Jia, Miao Yin, Yu Cheng, Bo Yuan
[EMNLP 2024 Findings] The 2024 Conference on Empirical Methods in Natural Language Processing
Transferable Learned Image Compression-Resistant Adversarial Perturbations
Yang Sui, Zhuohang Li, Ding Ding, Xiang Pan, Xiaozhong Xu, Shan Liu, Zhenzhong Chen
[BMVC 2024] The 35th British Machine Vision Conference, 2024
arXiv
Clean & Compact: Efficient Data-Free Backdoor Defense with Model Compactness
Huy Phan, Jinqi Xiao, Yang Sui, Tianfang Zhang, Zijie Tang, Cong Shi, Yan Wang, Yingying Chen, Bo Yuan
[ECCV 2024] The 18th European Conference on Computer Vision, 2024
PDF
Co-Exploring Sparsification and Low-Rank Decomposition for Compact DNNs
Yang Sui, Miao Yin, Yu Gong, Bo Yuan
[TNNLS] IEEE Transactions on Neural Networks and Learning Systems
PDF
DisDet: Exploring Detectability of Backdoor Attack on Diffusion Models
Yang Sui, Huy Phan, Jinqi Xiao, Tianfang Zhang, Zijie Tang, Cong Shi, Yan Wang, Yingying Chen, Bo Yuan
[arXiv] In submission
arXiv
MOPED: Efficient Motion Planning Engine with Flexible Dimension Support
Lingyi Huang, Yu Gong, Yang Sui, Xiao Zang, Bo Yuan
[HPCA 2024] In Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2024
PDF
In-Sensor Radio Frequency Computing for Energy-Efficient Intelligent Radar
Yang Sui, Minning Zhu, Lingyi Huang, Chung-Tse Michael Wu, Bo Yuan
[ICCAD 2023] In Proceedings of the IEEE/ACM International Conference on Computer-Aided Design, 2023
PDF
Corner-to-Center Long-range Context Model for Efficient Learned Image Compression
Yang Sui, Ding Ding, Xiang Pan, Xiaozhong Xu, Shan Liu, Bo Yuan, Zhenzhong Chen
[JVCI] Journal of Visual Communication and Image Representation
PDF
Reconstruction Distortion of Learned Image Compression with Imperceptible Perturbations
Yang Sui, Zhuohang Li, Ding Ding, Xiang Pan, Xiaozhong Xu, Shan Liu, Zhenzhong Chen
[DCC 2024] In Proceedings of the Data Compression Conference, 2024
[NCW@ICML 2023] Neural Compression: From Information Theory to Applications
Spotlight presentation
PDF
Website
DynGMP: Graph Neural Network-based Motion Planning in Unpredictable Dynamic Environments
Wenjin Zhang, Xiao Zang, Lingyi Huang, Yang Sui, Jingjin Yu, Yingying Chen, Bo Yuan
[IROS 2023] In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2023
PDF
ETTE: Efficient Tensor-Train-based Computing Engine for Deep Neural Networks
Yu Gong, Miao Yin, Lingyi Huang, Jinqi Xiao, Yang Sui, Chunhua Deng, Bo Yuan
[ISCA 2023] In Proceedings of the 50th International Symposium on Computer Architecture, 2023
PDF
DSPIMM: Digital Sparse In-Memory Matrix Vector Multiplier for Communication Applications
Amitesh Sridharan, Fan Zhang, Yang Sui, Bo Yuan, Deliang Fan
[DAC 2023] In Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023
PDF
Towards Sparse and Low-rank Neural Networks with Hybrid Compression
Yang Sui, Wanzhao Yang, Miao Yin, Yu Gong, Bo Yuan
[DCAA@AAAI 2023] The First Workshop on DL-Hardware Co-Design for AI Acceleration (DCAA)
Website
Award
Best Paper Runner-Up Award
CSTAR: Towards Compact and STructured Deep Neural Networks with Adversarial Robustness
Huy Phan, Miao Yin, Yang Sui, Bo Yuan, Saman Zonouz
[AAAI 2023] In Proceedings of the AAAI Conference on Artificial Intelligence, 2023
PDF
Oral
HALOC: Hardware-Aware Automatic Low-Rank Compression for Compact Neural Networks
Jinqi Xiao, Chengming Zhang, Yu Gong, Miao Yin, Yang Sui, Lizhi Xiang, Dingwen Tao, Bo Yuan
[AAAI 2023] In Proceedings of the AAAI Conference on Artificial Intelligence, 2023
PDF
Oral
CEPD: Co-Exploring Pruning and Decomposition for Compact DNN Models
Yang Sui, Wanzhao Yang, Miao Yin, Yu Gong, Bo Yuan
[Preprint]
PDF
ELRT: Efficient Low-Rank Training for Compact Convolutional Neural Networks
Yang Sui, Miao Yin, Yu Gong, Jinqi Xiao, Huy Phan, Bo Yuan
[arXiv]
PDF
Algorithm and Hardware Co-Design of Energy-Efficient LSTM Networks for Video Recognition with Hierarchical Tucker Tensor Decomposition
Yu Gong, Miao Yin, Lingyi Huang, Chunhua Deng, Yang Sui, Bo Yuan
[arXiv]
PDF
HODEC: Towards Efficient High-Order DEcomposed Convolutional Neural Networks
Miao Yin, Yang Sui, Wanzhao Yang, Xiao Zang, Yu Gong, Bo Yuan
[CVPR 2022] In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022
PDF
CHIP: CHannel Independence-based Pruning for Compact Neural Networks
Yang Sui, Miao Yin, Yi Xie, Huy Phan, Saman Zonouz, Bo Yuan
[NeurIPS 2021] Advances in Neural Information Processing Systems 34, 2021
PDF
Algorithm and Hardware Co-design for Deep Learning-powered Channel Decoder: A Case Study
Boyang Zhang*, Yang Sui*, Lingyi Huang, Siyu Liao, Chunhua Deng, Bo Yuan
[ICCAD 2021] In Proceedings of the IEEE/ACM International Conference on Computer-Aided Design, 2021
PDF
GoSPA: An Energy-efficient High-performance Globally Optimized SParse Convolutional Neural Network Accelerator
Chunhua Deng, Yang Sui, Siyu Liao, Xuehai Qian, Bo Yuan
[ISCA 2021] In Proceedings of the ACM/IEEE 48th Annual International Symposium on Computer Architecture, 2021
PDF
Towards Efficient Tensor Decomposition-Based DNN Model Compression With Optimization Framework
Miao Yin, Yang Sui, Siyu Liao, Bo Yuan
[CVPR 2021] In Proceedings of The IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021
PDF
Paddle-Lite (Paddle-Mobile)
Initial contributors: Yang Sui, Ruilong Liu, Jiaying Zhao, Wang Liu, Yonghui Li.
[Baidu] The authors contributed almost equally to this work.
Paddle-Lite Github (6.4k stars)
Paddle-Lite is an updated version of Paddle-Mobile, an open-source deep learning framework designed to make it easy to perform inference on mobile, embedded, and IoT devices. It is compatible with PaddlePaddle and with pre-trained models from other sources, and has been featured at NeurIPS Expo, Baidu Create, and Wave Summit+.
Professional Services
- Program Committee Member and Reviewer:
  - NeurIPS'22, 23, 24
  - ICLR'24
  - ICML'22, 23, 24
  - CVPR'22, 23, 24
  - ICCV'23
  - ECCV'22, 24
  - KDD'23
  - AAAI'22, 23, 24, 25
  - IROS'23
  - TNNLS
Teaching Experiences
- Teaching Assistant at Rutgers University
  - 14:332:351 - Programming Methodology II, Fall 2020 (Instructor: Prof. Saman Zonouz)
  - 14:332:351 - Programming Methodology II, Fall 2023 (Instructor: Prof. Yao Liu)
Supervised Students
Wenjin Zhang, Ph.D. student at Rutgers University
Topic: Quantization in Model Compression.
Justin Ding, Master's student at Rutgers University
Topic: Pruning in Model Compression ("Graduate Special Problem").
Linqi Xiao, Master's student at Rutgers University
Topic: Error Correction Coding ("Graduate Special Problem").
Srinihar Bondalapati, Master's student at Rutgers University
Topic: Quantization in Model Compression.
Yue Wang, Master's student at Rutgers University
Topic: Dataset Distillation and Pruning in Model Compression ("Graduate Special Problem").
Veena Vrushi, Undergraduate student at Rutgers University
Topic: Deep Learning ("Project SUPER Research Program").
Vijay Maddila, Graduate student at Rutgers University
Topic: Large Language Models.
Ayan Patel, High school student at High Technology High School
Topic: Deep Learning.
Zhiyu Chen, Master's student at Rutgers University
Topic: Large Language Models ("Graduate Special Problem").
Talks
Efficient Diffusion Models and Large Language Models: Quantization, Pruning, and LoRA. (Video)
July, 2023.
Model Compression: Pruning, Quantization, and Recent Advances.
Texas A&M University, CSCE 689 Special Topics: Generative AI, October, 2024.
Honors & Awards
- Doctor of Philosophy (Ph.D.)
- Master of Science (M.E.)
- Graduate Student Academic Scholarship, 2017, 2018
- Bachelor of Engineering (B.E.)
- Postgraduate Recommendation (top 10% in EE at Jilin University), 2016
- Outstanding Graduates, 2016
- Second Prize Scholarship, 2014, 2015, 2016
Hobbies & Interests
In addition to my academic work, I am a fan of basketball, soccer, Formula 1, and snooker. I love Tracy McGrady, Stephen Curry, Lionel Messi, and PIS (YaphetS).
I'm an experienced player of DOTA/DOTA2, World of Warcraft, and Warcraft III, playing Druid (Balance Druid) and DH (Havoc Demon Hunter) in WoW, and NE (Night Elf) in Warcraft III. I would like to record some memorable moments:
- World of Warcraft
  - World top-10 DPS as Balance Druid on H4 (Flamebender Ka'graz) in Blackrock Foundry, reported on WCL in 2015.
  - Raid Leader for defeating the Heroic Highmaul raid in Warlords of Draenor in 2014. I gathered fourteen friends, and together we conquered it; an incredibly impressive experience that we will never forget.
- DOTA
  - Member of the DOTA school team (1/5) at Jilin City No.1 High School, 2011.
- Hearthstone
  - Legend, Ladder Rank 147, 2016.
- Diablo III
  - Ladder Rank 698, Witch Doctor, Season 9, 2017.
I take great pleasure in listening to music with a wonderful rhythm, especially R&B, and classical music by Chopin, Bach, and Paganini.
Kim Tae-yeon was my idol during high school, providing me with a strong example and encouragement during my most difficult and depressing times.
Xiaolan (her name is inspired by "Detective Conan"), my adorable grey-and-white cat with sparkling eyes, who appears in my NeurIPS'21 paper "CHIP", is playful, affectionate, and loves to cuddle. Say hi to Xiaolan!