EvasionBench: Detecting Evasive Answers in Financial Q&A via Multi-Model Consensus and LLM-as-Judge Paper • 2601.09142 • Published 4 days ago • 8 • 3
DramaBench: A Six-Dimensional Evaluation Framework for Drama Script Continuation Paper • 2512.19012 • Published 27 days ago • 16 • 4