vllm/layers at e9899fb7a4d9e032198d26ef84f1dd2cfd9621aa - vllm - 丝路新云 code repository

xinyun/vllm

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-04-24 13:57:03 +08:00

Cody Yu e9899fb7a4

[Model] Enable FP8 QKV in MoE and refine kernel tuning script (#5039)

2024-05-31 14:29:19 -07:00

[Model] Enable FP8 QKV in MoE and refine kernel tuning script (#5039)

2024-05-31 14:29:19 -07:00

[Mypy] Part 3 fix typing for nested directories for most of directory (#4161)

2024-04-22 21:32:44 -07:00

[Bugfix] Avoid Warnings in SparseML Activation Quantization (#5120)

2024-05-30 17:04:37 -07:00

__init__.py

Change the name to vLLM (#150)

2023-06-17 03:07:40 -07:00

activation.py

[Misc]Add customized information for models (#4132)

2024-04-30 21:18:14 -07:00

layernorm.py

[Misc]Add customized information for models (#4132)

2024-04-30 21:18:14 -07:00

linear.py

[Kernel] Initial Activation Quantization Support (#4525)

2024-05-23 21:29:18 +00:00

logits_processor.py

[Core]: Option To Use Prompt Token Ids Inside Logits Processor (#4985)

2024-05-23 22:04:24 +00:00

pooler.py

[Model][Misc] Add e5-mistral-7b-instruct and Embedding API (#3734)

2024-05-11 11:30:37 -07:00

rejection_sampler.py

[Core][2/N] Model runner refactoring part 2. Combine prepare prefill / decode to a single API (#4681)

2024-05-15 14:00:10 +09:00

rotary_embedding.py

[Lora] Support long context lora (#4787)

2024-05-18 16:05:23 +09:00

sampler.py

[CORE] Improvement in ranks code (#4718)

2024-05-12 17:47:47 -07:00

vocab_parallel_embedding.py

[Misc]Add customized information for models (#4132)

2024-04-30 21:18:14 -07:00