vLLM load error
TypeError: Invalid type of HuggingFace processor. Expected type: <class 'transformers.processing_utils.ProcessorMixin'>, but found type: <class 'transformers.tokenization_utils_fast.PreTrainedTokenizerFast'>
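For anyone debugging this: the error means vLLM asked transformers for a processor and got the tokenizer back instead. A quick way to see what AutoProcessor resolves in your environment (a diagnostic sketch; the model id is taken from the serve command further down, and on a transformers build without GLM-4.6V support this will typically print the tokenizer class or error out):

python -c "from transformers import AutoProcessor; print(type(AutoProcessor.from_pretrained('zai-org/GLM-4.6V-FP8')))"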
Same here, can't get this to load on either vLLM or SGLang. The latest SGLang gives a different error:

return super().__getattribute__(key)
AttributeError: 'Glm4vMoeConfig' object has no attribute 'rope_scaling'
You probably need to update your transformers library -- I had the same error, and after updating to 5.0.0rc0 I was able to run it.
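A minimal way to do that in an existing venv (an exact pre-release pin installs without pip's --pre flag):

uv pip install transformers==5.0.0rc0
# or, with plain pip:
pip install transformers==5.0.0rc0

For reference, my environment: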
(glm46v) mbelleau@aibeast:/mnt/vault/llm/glm46v$ uv pip freeze | grep -e torch -e vllm -e transformers
torch==2.9.1
torchaudio==2.9.0+cu130
torchvision==0.24.1
transformers==5.0.0rc0
vllm==0.12.0
Thanks, that was indeed the issue. Even though I had it in my requirements, it didn't install for some reason. After installing transformers v5 I was able to run the model, but I ran into a lot of memory issues.
On my 8x L4 (24 GB each), this worked:
vllm serve zai-org/GLM-4.6V-FP8 \
  --served-model-name glm-4.6v \
  --host 0.0.0.0 --port 1234 \
  --max-model-len 32000 \
  --tensor-parallel-size 8 \
  --distributed-executor-backend mp \
  --max-num-seqs 4 \
  --enable-expert-parallel \
  --tool-call-parser glm45 \
  --reasoning-parser glm45 \
  --enable-auto-tool-choice \
  --enforce-eager \
  --mm-encoder-tp-mode data \
  --mm-processor-cache-type shm \
  --mm-processor-cache-gb 1 \
  --gpu-memory-utilization 0.8 \
  --kv-cache-dtype fp8_e4m3 \
  --limit-mm-per-prompt '{"image":2, "video":0}'
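Once it's up, a quick text-only smoke test against vLLM's OpenAI-compatible endpoint (model name and port as configured above):

curl http://localhost:1234/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{"model": "glm-4.6v", "messages": [{"role": "user", "content": "Hello"}]}'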
Because vllm==0.12.0 depends on transformers>=4.56.0,<5 and your project depends on transformers==5.0.0rc0, we can conclude that vllm==0.12.0 and your project are incompatible.
And because your project depends on vllm==0.12.0, we can conclude that your project's requirements are unsatisfiable.
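That conflict is expected: vllm 0.12.0 pins transformers below 5, so a strict project-level resolve can't take the release candidate. What produced the working environment in the freeze above is forcing the pre-release in at the venv level after vllm (uv pip install operates on the venv and doesn't re-resolve the whole project), roughly:

uv pip install vllm==0.12.0
uv pip install transformers==5.0.0rc0   # overrides vllm's <5 pin in the venv

If you want it declared in the project instead, uv's override-dependencies setting under [tool.uv] should let the lock ignore vllm's cap, though I haven't verified that with this exact combination.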