Model architectures ['DeepseekOCR2ForCausalLM'] are not supported with the latest vLLM

#3
by avishekjana - opened

Getting the following error:

validation error for ModelConfig
vllm-server-1 | (APIServer pid=1) Value error, Model architectures ['DeepseekOCR2ForCausalLM'] are not supported.

GPU: 2x 16GB NVIDIA RTX 5070Ti

My compose.yml:

services:
  vllm-server:
    image: vllm/vllm-openai:latest
    command: >
      --model deepseek-ai/DeepSeek-OCR-2
      --trust-remote-code
      --enable-prefix-caching
      --gpu-memory-utilization 0.9
      --port 8005
      --max-num-seqs 2
      --tensor-parallel-size 2
    environment:
      - VLLM_ATTENTION_BACKEND=FLASH_ATTN
    volumes:
      - ~/.cache:/root/.cache

    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
    network_mode: host
    ipc: host

I get the same error when deploying the vLLM server on Kubernetes. Maybe vLLM just doesn't support this model yet?
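One way to confirm is to list the architectures the installed vLLM build actually registers. A minimal check, assuming the vllm Python package is importable inside the container:

  # Quick check inside the vllm/vllm-openai container: list every
  # architecture name the installed vLLM build has registered.
  from vllm import ModelRegistry

  archs = ModelRegistry.get_supported_archs()
  print("DeepseekOCR2ForCausalLM" in archs)  # False means this build cannot load the model

If the architecture is missing from that list, the validation error above is expected regardless of the compose settings.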

The current vLLM release doesn't support the DeepSeek-OCR-2 architecture, which means you need to register the new model architecture under that exact name (or just clone the original GitHub repo and install the vLLM wheel it ships).
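A minimal sketch of what that registration looks like, using vLLM's out-of-tree model mechanism. The module path deepseek_ocr2.modeling is hypothetical, so point it at wherever the actual model class from DeepSeek's repo lives:

  # Sketch: register an out-of-tree architecture with vLLM's ModelRegistry
  # before constructing the engine. The lazy "module:ClassName" string form
  # defers the import until vLLM actually loads the model.
  from vllm import LLM, ModelRegistry

  ModelRegistry.register_model(
      "DeepseekOCR2ForCausalLM",  # must match the name in the model's config.json
      "deepseek_ocr2.modeling:DeepseekOCR2ForCausalLM",  # hypothetical module path
  )

  llm = LLM(model="deepseek-ai/DeepSeek-OCR-2", trust_remote_code=True)

For the Docker setup above, the registration has to run inside the server process, which in practice means packaging it as a vLLM plugin or switching to an image/wheel that already includes the architecture.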
