Add LongBench v2 support for DeepSeek by yiliu30 · Pull Request #2410 · intel/neural-compressor

yiliu30 · 2026-02-14T00:48:09Z

This PR adds LongBench v2 support for DeepSeek, similar to the Qwen implementation in PR #2406.

Changes

DeepSeek LongBench v2 support

Type of Change

Feature enhancement - adds long-context evaluation support for DeepSeek models

Description

Adapts the LongBench v2 evaluation framework for DeepSeek models:

- Dynamic configuration: Automatically adjusts max_length to 40960 for longbench tasks (vs 8192 for standard tasks)
- Server-based evaluation: Uses vLLM server with API-based evaluation for better stability with long contexts
- Modular functions: Refactored code into reusable functions for server management and evaluation
- Parallel execution: Supports 512 threads for efficient longbench evaluation
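
As a rough illustration of the dynamic configuration, the selection rule can be sketched in Python. The real logic lives in the shell script run_evaluation.sh; the function and constant names here are hypothetical, and the case-sensitive substring match is an assumption. Only the values 40960 and 8192 and the 'longbench' check come from this description.

```python
# Hypothetical names; only the two limits and the 'longbench' substring
# check are taken from the PR description.
LONGBENCH_MAX_LENGTH = 40960
STANDARD_MAX_LENGTH = 8192


def is_longbench_task(task_name: str) -> bool:
    """Return True when the task name contains 'longbench' (case-sensitive, an assumption)."""
    return "longbench" in task_name


def select_max_length(task_name: str) -> int:
    """Pick the larger context limit for longbench tasks, the standard one otherwise."""
    if is_longbench_task(task_name):
        return LONGBENCH_MAX_LENGTH
    return STANDARD_MAX_LENGTH
```

With this rule, a task name such as `longbench_v2` (an illustrative name) selects 40960, while any name without the substring selects 8192.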

Implementation Details

- Added long-bench-eval dependency to requirements.txt
- Refactored run_evaluation.sh with:
  - Task-based routing (detects 'longbench' in task name)
  - vLLM server lifecycle management (start, health check, cleanup)
  - Proper error handling and logging
- Maintains backward compatibility with standard lm-eval tasks
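
The health-check part of the server lifecycle could look roughly like the polling loop below. This is a minimal Python sketch, not the code in run_evaluation.sh: the function names, the default timeout and interval, and the idea of probing an HTTP URL (for instance a health route on the local server) are all assumptions made for illustration.

```python
import time
import urllib.request
from typing import Callable


def http_probe(url: str, timeout: float = 2.0) -> bool:
    """Return True if a GET request to `url` answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        return False


def wait_for_server(
    url: str,
    max_wait_s: float = 600.0,   # assumed default, not from the PR
    interval_s: float = 5.0,     # assumed default, not from the PR
    probe: Callable[[str], bool] = http_probe,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `probe(url)` until it succeeds or `max_wait_s` has been waited."""
    waited = 0.0
    while waited < max_wait_s:
        if probe(url):
            return True
        sleep(interval_s)
        waited += interval_s
    return False
```

The `probe` and `sleep` parameters are injectable so the loop can be exercised without a running server; a caller would treat a `False` result as a startup failure and proceed to cleanup.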

Expected Behavior

When the task name contains 'longbench', the script will:

1. Start a vLLM server with 40K context length support
2. Wait for the server to be ready (with health checks)
3. Run LongBench evaluation via API
4. Clean up the server on completion or error

For standard tasks, the script uses the original direct lm-eval execution.
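
The control flow described above can be sketched as follows. This is a Python illustration under stated assumptions rather than the shell implementation: `run_task` and all of its callable parameters are hypothetical stand-ins for the script's helper functions, and the key point is only that cleanup runs on both success and failure.

```python
from typing import Callable


def run_task(
    task_name: str,
    start_server: Callable[[], object],
    wait_until_ready: Callable[[], bool],
    run_longbench: Callable[[], None],
    stop_server: Callable[[object], None],
    run_standard: Callable[[], None],
) -> str:
    """Route a task to the server-based path or the direct evaluation path."""
    if "longbench" not in task_name:
        # Standard tasks keep the original direct execution.
        run_standard()
        return "standard"

    server = start_server()
    try:
        if not wait_until_ready():
            raise RuntimeError("server did not become ready in time")
        run_longbench()
        return "longbench"
    finally:
        # Executed on completion and on error, mirroring the cleanup step.
        stop_server(server)
```

In a shell script, the same guarantee is commonly obtained with an exit trap that calls the cleanup function; whether run_evaluation.sh does exactly that is not stated here.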

How has this PR been tested?

Mirrors the Qwen implementation from PR #2406

Dependency Change

Added: long-bench-eval @ git+https://github.com/yiliu30/long-bench-eval

- Add long-bench-eval dependency to requirements.txt
- Refactor run_evaluation.sh to support both standard and LongBench v2 tasks
- Add dynamic max_length configuration (40960 for longbench tasks)
- Implement vLLM server-based evaluation for LongBench
- Add helper functions for server lifecycle management
- Support up to 40K context length evaluation with 512 threads

Signed-off-by: yiliu30 <yi4.liu@intel.com>

yiliu30 added 2 commits February 14, 2026 00:46

update max len

f2c5553

Signed-off-by: yiliu30 <yi4.liu@intel.com>

yiliu30 merged commit cb1b4a3 into longbench Feb 14, 2026
7 checks passed

yiliu30 deleted the longbench-deepseek branch February 14, 2026 01:14

yiliu30 merged 2 commits into longbench from longbench-deepseek
