
thu-coai


102 repositories

SocialEval
Public
[ACL'25] SocialEval: Evaluating Social Intelligence of Large Language Models
Python • MIT License • 0 • 5 • 0 • 0 • Updated Oct 13, 2025
CogFlow
Public
Think Socially via Cognitive Reasoning
Python • 0 • 2 • 0 • 0 • Updated Oct 2, 2025
CharacterGLM-6B
Public
[EMNLP'24] CharacterGLM: Customizing Chinese Conversational AI Characters with Large Language Models
Python • Apache License 2.0 • 36 • 481 • 4 • 0 • Updated Oct 2, 2025
Crisp
Public
[EMNLP'25] Crisp: Cognitive Restructuring of Negative Thoughts through Multi-turn Supportive Dialogues
Python • 0 • 9 • 0 • 0 • Updated Sep 2, 2025
AISafetyLab
Public
AISafetyLab: A comprehensive framework covering safety attack, defense, evaluation and paper list.
Python • MIT License • 13 • 203 • 0 • 0 • Updated Aug 29, 2025
JPS
Public
[MM'25] JPS: Jailbreak Multimodal Large Language Models with Collaborative Visual Perturbation and Textual Steering
Python • 1 • 6 • 2 • 0 • Updated Aug 25, 2025
LRM-Safety-Study
Public
Python • MIT License • 0 • 4 • 0 • 0 • Updated Aug 15, 2025
Agent-SafetyBench
Public
Python • MIT License • 2 • 64 • 1 • 0 • Updated Aug 11, 2025
CharacterBench
Public
[AAAI'25] CharacterBench: Benchmarking Character Customization of Large Language Models
Python • 0 • 16 • 0 • 0 • Updated Aug 1, 2025
ShieldVLM
Public
Python • 0 • 6 • 0 • 0 • Updated Jul 31, 2025
SafetyBench
Public
Official GitHub repo for SafetyBench, a comprehensive benchmark to evaluate LLMs' safety. [ACL 2024]
Python • MIT License • 12 • 253 • 0 • 0 • Updated Jul 28, 2025
VPO
Public
Python • Apache License 2.0 • 1 • 19 • 1 • 0 • Updated Jul 20, 2025
LongSafety
Public
[ACL 2025] LongSafety: Evaluating Long-Context Safety of Large Language Models
Python • MIT License • 0 • 15 • 0 • 0 • Updated Jun 18, 2025
SPaR
Public
Python • Apache License 2.0 • 3 • 46 • 0 • 0 • Updated Jun 11, 2025
TransferAttack
Public
[ACL 2025] Guiding not Forcing: Enhancing the Transferability of Jailbreaking Attacks on LLMs via Removing Superfluous Constraints
Python • 1 • 11 • 0 • 0 • Updated May 23, 2025
HPSS
Public
HPSS: Heuristic Prompting Strategy Search for LLM Evaluators (ACL 2025 Findings)
Python • 0 • 3 • 0 • 0 • Updated May 23, 2025
Backdoor-Data-Extraction
Public
Python • MIT License • 6 • 29 • 1 • 0 • Updated May 22, 2025
BARREL
Public
Python • MIT License • 1 • 16 • 0 • 0 • Updated May 21, 2025
MAPS
Public
Official Implementation of ICLR25 paper "MAPS: Advancing Multi-modal Reasoning in Expert-level Physical Science"
Python • 1 • 5 • 0 • 0 • Updated Mar 12, 2025
ComplexBench
Public
Benchmarking Complex Instruction-Following with Multiple Constraints Composition (NeurIPS 2024 Datasets and Benchmarks Track)
Python • MIT License • 11 • 95 • 5 • 0 • Updated Feb 20, 2025
MiniPLM
Public
[ICLR 2025] MiniPLM: Knowledge Distillation for Pre-Training Language Models
Python • MIT License • 9 • 60 • 4 • 0 • Updated Nov 23, 2024
MoralStory
Public
Python • 0 • 17 • 1 • 0 • Updated Nov 7, 2024
OpenMEVA
Public
Benchmark for evaluating open-ended generation
benchmark evaluation-metrics language-generation
Python • 7 • 50 • 3 • 1 • Updated Nov 6, 2024
CodePlan
Public
2 • 16 • 1 • 0 • Updated Oct 16, 2024
ShieldLM
Public
ShieldLM: Empowering LLMs as Aligned, Customizable and Explainable Safety Detectors [EMNLP 2024 Findings]
Python • MIT License • 9 • 212 • 1 • 0 • Updated Sep 29, 2024
PICL
Public
Code for ACL2023 paper: Pre-Training to Learn in Context
Python • MIT License • 4 • 107 • 1 • 1 • Updated Jul 26, 2024
PsyQA
Public
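A Chinese mental health support question-answering dataset with rich annotations of support strategies. It can be used to generate long counseling texts that make full use of those strategies.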
17 • 228 • 0 • 0 • Updated Jul 21, 2024
JailbreakDefense_GoalPriority
Public
[ACL 2024] Defending Large Language Models Against Jailbreaking Attacks Through Goal Prioritization
Python • 1 • 29 • 0 • 0 • Updated Jul 9, 2024
SafeUnlearning
Public
Safe Unlearning: A Surprisingly Effective and Generalizable Solution to Defend Against Jailbreak Attacks
Python • 1 • 31 • 3 • 0 • Updated Jul 9, 2024
CritiqueLLM
Public
Python • 3 • 147 • 6 • 0 • Updated Jul 1, 2024
