Benchmark: DeepSeek V3 vs GPT-4o vs Claude for coding tasks
#117 · by xujfcn
I ran a comparison of DeepSeek V3 against GPT-4o and Claude Sonnet on 50 coding tasks (LeetCode medium/hard). Here are my findings:
| Model | Pass Rate | Avg Time | Cost/1K requests |
|---|---|---|---|
| DeepSeek V3 | 82% | 3.2s | $0.21 |
| GPT-4o | 85% | 2.8s | $6.25 |
| Claude Sonnet | 87% | 3.5s | $9.00 |
DeepSeek V3 is remarkably competitive: it trails the other two by only 3-5 points on pass rate while costing roughly 1/30th as much as GPT-4o and about 1/40th as much as Claude Sonnet per request.
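To make the cost gap concrete, you can normalize cost by pass rate to get cost per 1K *passing* solutions. A quick calculation from the table above:

```python
# Cost per 1K passing solutions = (cost per 1K requests) / pass rate,
# using the numbers from the benchmark table above.
results = {
    "DeepSeek V3": {"pass_rate": 0.82, "cost_per_1k": 0.21},
    "GPT-4o": {"pass_rate": 0.85, "cost_per_1k": 6.25},
    "Claude Sonnet": {"pass_rate": 0.87, "cost_per_1k": 9.00},
}

for model, r in results.items():
    cost_per_1k_passes = r["cost_per_1k"] / r["pass_rate"]
    print(f"{model}: ${cost_per_1k_passes:.2f} per 1K passing solutions")
```

Even after adjusting for its lower pass rate, DeepSeek V3 comes out around $0.26 per 1K passing solutions versus roughly $7.35 for GPT-4o and $10.34 for Claude Sonnet.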
Test setup: I used Crazyrouter to run all three models through the same OpenAI-compatible API.
```python
from openai import OpenAI

client = OpenAI(base_url="https://crazyrouter.com/v1", api_key="your-key")
messages = [{"role": "user", "content": "Solve this LeetCode problem: ..."}]

for model in ["deepseek-chat", "gpt-4o", "claude-sonnet-4-20250514"]:
    # Same request shape for all three models; only the model name changes.
    response = client.chat.completions.create(model=model, messages=messages)
    print(model, response.choices[0].message.content)
```
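For scoring, each generated solution has to be run against the task's test cases. The post doesn't show the grading harness, but a minimal sketch of the idea might look like this (the `score_solution` helper and the example task are hypothetical, not the author's actual setup):

```python
# Hypothetical sketch: execute a model-generated solution and check it
# against known (input, expected-output) pairs. Not the author's harness.
def score_solution(code: str, tests: list, func_name: str) -> bool:
    ns = {}
    exec(code, ns)  # run the generated code in its own namespace
    fn = ns[func_name]
    # Solution passes only if every test case produces the expected output.
    return all(fn(*args) == expected for args, expected in tests)

# Example: a generated two_sum solution checked against one test case.
solution = (
    "def two_sum(nums, target):\n"
    "    seen = {}\n"
    "    for i, n in enumerate(nums):\n"
    "        if target - n in seen:\n"
    "            return [seen[target - n], i]\n"
    "        seen[n] = i\n"
)
print(score_solution(solution, [(([2, 7, 11, 15], 9), [0, 1])], "two_sum"))
```

Pass rate is then just the fraction of the 50 tasks for which this check returns `True`.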
Full comparison: Model Comparison Guide