Submitted by Runpeng Dai
StatEval: A Comprehensive Benchmark for Large Language Models in Statistics
Shanghai University of Finance and Economics