vllm/model_executor at 74d5543ec589daaa4ac042d65d52dccf26ee3f2c - vllm - 丝路新云-代码仓

xinyun/vllm

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-06-10 12:49:08 +08:00

Peter Salas 74d5543ec5

[VLM][Core] Fix exceptions on ragged NestedTensors (#7974)

2024-08-29 03:24:31 +00:00

guided_decoding

[misc][core] lazy import outlines (#7831)

2024-08-24 00:51:38 -07:00

[Kernel/Model] Migrate mamba_ssm and causal_conv1d kernels to vLLM (#7651)

2024-08-28 15:06:52 -07:00

[Kernel] Expand MoE weight loading + Add Fused Marlin MoE Kernel (#7766)

2024-08-27 15:07:09 -07:00

[VLM][Core] Fix exceptions on ragged NestedTensors (#7974)

2024-08-29 03:24:31 +00:00

__init__.py

[Performance] Optimize e2e overheads: Reduce python allocations (#7162)

2024-08-08 21:34:28 -07:00

custom_op.py

[XPU] fallback to native implementation for xpu custom op (#7670)

2024-08-20 00:26:09 -07:00

parameter.py

[Misc] update fp8 to use vLLMParameter (#7437)

2024-08-22 08:36:18 -04:00

pooling_metadata.py

[Model][Misc] Add e5-mistral-7b-instruct and Embedding API (#3734)

2024-05-11 11:30:37 -07:00

sampling_metadata.py

[Core] Optimize SPMD architecture with delta + serialization optimization (#7109)

2024-08-18 17:57:20 -07:00

utils.py

[Hardware][Neuron] Refactor neuron support (#3471)

2024-03-22 01:22:17 +00:00