Zeming Wei

Cited by

	All	Since 2019
Citations	305	305
h-index	9	9
i10-index	9	9

Citations per year: 2023: 47, 2024: 255

Public access

4 articles available, 0 articles not available (based on funding mandates)

Co-authors

Yifei Wang - Postdoc, MIT CSAIL - Verified email at mit.edu
Yihao Zhang - Peking University - Verified email at stu.pku.edu.cn
Meng Sun - Professor, School of Mathematical Science, Peking University - Verified email at math.pku.edu.cn
Yichuan Mo - Ph.D. Candidate, Peking University - Verified email at stu.pku.edu.cn
Xiyue Zhang - University of Bristol - Verified email at bristol.ac.uk
Jingyu Zhu - Verified email at stu.pku.edu.cn
Chawin Sitawarin - Postdoctoral Researcher @ Meta - Verified email at meta.com
David Wagner - Professor of Computer Science, UC Berkeley - Verified email at cs.berkeley.edu
Julien Piet - UC Berkeley - Verified email at berkeley.edu
Sizhe Chen - UC Berkeley, FAIR at Meta - Verified email at berkeley.edu
Huanran Chen - Undergraduate, Beijing Institute of Technology - Verified email at bit.edu.cn
Hangzhou He - Peking University - Verified email at stu.pku.edu.cn
Stefanie Jegelka - TUM and MIT - Verified email at mit.edu
Sun Jun - Professor of SCIS, SMU - Verified email at smu.edu.sg
Yinpeng Dong - Tsinghua University - Verified email at tsinghua.edu.cn

Zeming Wei

Undergraduate, Peking University

Verified email at stu.pku.edu.cn

Trustworthy AI, Adversarial Robustness, Explainability


Title	Authors	Venue	Cited by	Year
Jailbreak and Guard Aligned Language Models with Only Few In-Context Demonstrations	Z Wei, Y Wang, A Li, Y Mo, Y Wang	arXiv preprint arXiv:2310.06387	134	2023
CFA: Class-wise Calibrated Fair Adversarial Training	Z Wei, Y Wang, Y Guo, Y Wang	CVPR 2023	48	2023
Jatmo: Prompt injection defense by task-specific finetuning	J Piet, M Alrashed, C Sitawarin, S Chen, Z Wei, E Sun, B Alomair, ...	ESORICS 2024	32	2024
Sharpness-Aware Minimization Alone can Improve Adversarial Robustness	Z Wei✉️, J Zhu, Y Zhang	ICML 2023 Workshop on New Frontiers in Adversarial Machine Learning	17*	2023
Fight back against jailbreaking via prompt adversarial tuning	Y Mo, Y Wang, Z Wei, Y Wang	NeurIPS 2024	11*	2024
Boosting Jailbreak Attack with Momentum	Y Zhang, Z Wei✉️	ICLR 2024 Workshop on Reliable and Responsible Foundation Models	10	2024
Architecture Matters: Uncovering Implicit Mechanisms in Graph Contrastive Learning	X Guo, Y Wang, Z Wei, Y Wang	NeurIPS 2023	10	2023
Weighted Automata Extraction and Explanation of Recurrent Neural Networks for Natural Language Tasks	Z Wei, X Zhang, Y Zhang, M Sun	Journal of Logical and Algebraic Methods in Programming 136, 100907	10	2023
Extracting Weighted Finite Automata from Recurrent Neural Networks for Natural Languages	Z Wei, X Zhang, M Sun	ICFEM 2022	10	2022
Using Z3 for Formal Modeling and Verification of FNN Global Robustness	Y Zhang, Z Wei, X Zhang, M Sun	arXiv preprint arXiv:2304.10558	7	2023
On the Duality Between Sharpness-Aware Minimization and Adversarial Training	Y Zhang, H He, J Zhu, H Chen, Y Wang, Z Wei✉️	ICML 2024	6	2024
Exploring the Robustness of In-Context Learning with Noisy Labels	C Cheng, X Yu, H Wen, J Sun, G Yue, Y Zhang, Z Wei✉️	ICLR 2024 Workshop on Reliable and Responsible Foundation Models	5	2024
A Theoretical Understanding of Self-Correction through In-context Alignment	Y Wang, Y Wu, Z Wei, S Jegelka, Y Wang	NeurIPS 2024	2	2024
Automata Extraction from Transformers	Y Zhang, Z Wei, M Sun	arXiv preprint arXiv:2406.05564	1	2024
Towards General Conceptual Model Editing via Adversarial Representation Engineering	Y Zhang, Z Wei, J Sun, M Sun	NeurIPS 2024	1	2024
Characterizing Robust Overfitting in Adversarial Training via Cross-Class Features	Z Wei, Y Guo, Y Wang	OpenReview preprint	1	2023
MILE: A Mutation Testing Framework of In-Context Learning Systems	Z Wei, Y Zhang, M Sun	SETTA 2024		2024
DiffTextPure: Defending Large Language Models with Diffusion Purifiers	H Chen, Z Wang, Y Yang, S Zhang, Z Wei, F Jin, Y Dong	NeurIPS 2024 Workshop		2024
