ConfGuard: A Simple and Effective Backdoor Detection for Large Language Models
Published in arXiv preprint arXiv:2508.01365, 2025
Recommended citation: Zihan Wang, Rui Zhang, Hongwei Li, Wenshu Fan, Wenbo Jiang, Qingchuan Zhao, Guowen Xu, "ConfGuard: A Simple and Effective Backdoor Detection for Large Language Models." arXiv preprint arXiv:2508.01365, 2025.
