Model architectures ['DeepseekOCR2ForCausalLM'] are not supported with the latest vLLM

#3
by avishekjana - opened

Getting the following error:

validation error for ModelConfig
vllm-server-1 | (APIServer pid=1) Value error, Model architectures ['DeepseekOCR2ForCausalLM'] are not supported.

GPU: 2x 16GB NVIDIA RTX 5070Ti

My compose.yml:

services:
  vllm-server:
    image: vllm/vllm-openai:latest
    command: >
      --model deepseek-ai/DeepSeek-OCR-2
      --trust-remote-code
      --enable-prefix-caching
      --gpu-memory-utilization 0.9
      --port 8005
      --max-num-seqs 2
      --tensor-parallel-size 2
    environment:
      - VLLM_ATTENTION_BACKEND=FLASH_ATTN
    volumes:
      - ~/.cache:/root/.cache

    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
    network_mode: host
    ipc: host

I get the same error when deploying the vLLM server on Kubernetes. Maybe vLLM just doesn't support this model yet?
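One way to confirm is to list the architectures the installed vLLM build actually registers. A minimal check, assuming the vllm Python package is importable inside the container:

  # Quick check inside the vllm/vllm-openai container: list every
  # architecture name the installed vLLM build has registered.
  from vllm import ModelRegistry

  archs = ModelRegistry.get_supported_archs()
  print("DeepseekOCR2ForCausalLM" in archs)  # False means this build cannot load the model

If the architecture is missing from that list, the validation error above is expected regardless of the compose settings.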

The current vLLM release doesn't support the DeepSeek-OCR-2 architecture, which means you need to register the new model architecture under that exact name (or just clone the original GitHub repo and install the vLLM wheel it ships).
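A minimal sketch of what that registration looks like, using vLLM's out-of-tree model mechanism. The module path deepseek_ocr2.modeling is hypothetical, so point it at wherever the actual model class from DeepSeek's repo lives:

  # Sketch: register an out-of-tree architecture with vLLM's ModelRegistry
  # before constructing the engine. The lazy "module:ClassName" string form
  # defers the import until vLLM actually loads the model.
  from vllm import LLM, ModelRegistry

  ModelRegistry.register_model(
      "DeepseekOCR2ForCausalLM",  # must match the name in the model's config.json
      "deepseek_ocr2.modeling:DeepseekOCR2ForCausalLM",  # hypothetical module path
  )

  llm = LLM(model="deepseek-ai/DeepSeek-OCR-2", trust_remote_code=True)

For the Docker setup above, the registration has to run inside the server process, which in practice means packaging it as a vLLM plugin or switching to an image/wheel that already includes the architecture.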
