DeepSeek Coder V2, an open-source mixture-of-experts (MoE) code language model, has been shown to outperform state-of-the-art closed-source models, including GPT-4 Turbo, Claude 3 Opus, and Gemini 1.5 Pro. It supports 338 programming languages and achieves top-ranking performance while activating only a small fraction of its parameters per token.
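To make "activated parameters" concrete, here is a minimal top-k MoE routing sketch in PyTorch. It is illustrative only, not DeepSeek's actual routing (DeepSeek's architecture uses a more elaborate scheme, including shared experts), and the layer sizes are arbitrary:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    """Toy top-k mixture-of-experts layer. Per token, only k of the
    n_experts feed-forward blocks run, so the number of *activated*
    parameters is a small fraction of the total parameter count."""

    def __init__(self, d_model: int, n_experts: int = 8, k: int = 2):
        super().__init__()
        self.k = k
        self.gate = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model),
                          nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, d_model). The router scores every expert per token,
        # then only the top-k experts are evaluated for each token.
        weights, idx = self.gate(x).topk(self.k, dim=-1)
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e in range(len(self.experts)):
                mask = idx[:, slot] == e  # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * self.experts[e](x[mask])
        return out

layer = TopKMoE(d_model=64)
print(layer(torch.randn(10, 64)).shape)  # torch.Size([10, 64])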
When tested on benchmarks such as MBPP+, HumanEval, and Aider, which are designed to evaluate the code generation, editing, and problem-solving capabilities of Large Language Models (LLMs), DeepSeek Coder V2 scored 76.2, 90.2, and 73.7, respectively. This places it ahead of most closed- and open-source models, including GPT-4 Turbo, Claude 3 Opus, Gemini 1.5 Pro, Codestral, and Llama-3 70B.
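For reference, HumanEval and MBPP+ results are conventionally reported as pass@1, computed with the unbiased estimator from the Codex paper (Chen et al., 2021). The sample counts in the example below are hypothetical and only illustrate the arithmetic:

```python
import numpy as np

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: the probability that at least one of
    k samples, drawn from n generations of which c pass the unit tests,
    is correct. Computed as 1 - C(n-c, k) / C(n, k)."""
    if n - c < k:
        return 1.0  # fewer than k incorrect samples: some draw must pass
    return 1.0 - np.prod(1.0 - k / np.arange(n - c + 1, n + 1))

# Hypothetical counts: 200 samples per problem, 181 correct.
print(pass_at_k(200, 181, 1))  # 0.905, i.e. pass@1 of ~90.5
```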
Additionally, DeepSeek Coder V2 has shown strong performance in general reasoning and language understanding tasks. On the MMLU benchmark, which evaluates language understanding across multiple tasks, it scored 79.2, beating other code-specific models and roughly matching Llama-3 70B.
Overall, DeepSeek Coder V2 has demonstrated exceptional proficiency in code generation and understanding, as well as in mathematical tasks, significantly narrowing the performance gap between open-source models and leading AI models like GPT-4 Turbo, Claude 3 Opus, and Gemini 1.5 Pro.
The DeepSeek Coder V2 model introduces several key features that differentiate it from its predecessor:
With the release of DeepSeek Coder V2, DeepSeek has significantly expanded the model's programming language support and context window. The original DeepSeek Coder supported 86 programming languages and had a 16K-token context window; V2 raises language support to 338 programming languages and extends the context window to 128K tokens. These enhancements enable the model to handle more complex and extensive coding tasks.
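As a rough sketch of trying the model locally, the following assumes the checkpoints DeepSeek published on Hugging Face; the Lite instruct variant is shown because the full 236B-parameter model requires multi-GPU hardware, and loading options may need adjusting for your setup:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed checkpoint name as published on Hugging Face; the full model
# is deepseek-ai/DeepSeek-Coder-V2-Instruct.
model_id = "deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,  # custom model code ships with the checkpoint
    torch_dtype="auto",
    device_map="auto",
)

messages = [{"role": "user", "content": "Write a Haskell function that reverses a list."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```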