vLLM cannot run ModelOpt-quantized weights. After following the FP8 quantization examples in examples/llm_ptq, generating the FP8 weights succeeded, but when I ...
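For reference, a minimal sketch of the FP8 PTQ flow that examples/llm_ptq wraps, using ModelOpt's documented `mtq.quantize` API; the model name, calibration prompts, and export directory below are placeholders, not details from the original report, and `export_hf_checkpoint` is assumed from ModelOpt's export module:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
import modelopt.torch.quantization as mtq
from modelopt.torch.export import export_hf_checkpoint

model_name = "meta-llama/Llama-2-7b-hf"  # placeholder; the report doesn't name a model
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.float16, device_map="cuda"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

calib_prompts = ["Hello, world.", "FP8 calibration sample."]  # toy calibration data

def forward_loop(model):
    # ModelOpt runs this loop during calibration to collect FP8 scaling factors.
    for prompt in calib_prompts:
        inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
        with torch.no_grad():
            model(**inputs)

# Quantize in place with ModelOpt's default FP8 recipe.
model = mtq.quantize(model, mtq.FP8_DEFAULT_CFG, forward_loop)

# Export a Hugging Face-style checkpoint for downstream serving.
export_hf_checkpoint(model, export_dir="llama2-7b-fp8")
```

On the serving side, recent vLLM versions expose a `modelopt` quantization method, so the load attempt would look roughly like `LLM(model="llama2-7b-fp8", quantization="modelopt")`, presumably the step that fails here.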