Janson Lau | b2253d1807 | Update model.py | 2025-07-27 23:42:47 +08:00 |
Janson Lau | c21638c56c | Update model.py | 2025-07-27 23:36:35 +08:00 |
Janson Lau | 292b8a34d8 | Create model.py | 2025-07-27 23:34:37 +08:00 |
Janson Lau | b265f3795c | Delete inference/model.py | 2025-07-27 21:54:05 +08:00 |
Janson Lau | 9fabdf8ae6 | Create model.py @greptile | 2025-07-27 21:32:16 +08:00 |
Janson Lau | e1daf07be1 | Delete inference/model.py | 2025-07-27 21:31:51 +08:00 |
Janson Lau | 55f36bafc7 | Create model.py | 2025-07-27 21:30:22 +08:00 |
Janson Lau | e5f8de034b | Delete inference/model.py | 2025-07-27 21:30:04 +08:00 |
huxuedan | d29a967601 | modify the explanation of MLA | 2025-02-26 17:07:39 +08:00 |
Xingkai Yu | 1398800ebf | fix scores mask | 2025-02-14 20:26:45 +08:00 |
Xingkai Yu | 5ee97a83f0 | fix comment | 2025-02-07 16:42:55 +08:00 |
Xingkai Yu | 87a01053e4 | Merge pull request #556 from XxAlonexX/main: Fix Linear Layer Bias Initialization | 2025-02-05 16:23:02 +08:00 |
XxAlonexX | 6a30b43249 | Fix Linear Layer Bias Initialization | 2025-02-04 10:38:45 +05:30 |
Roman Fitzjalen | 2756e130c2 | clarify assertion error | 2025-01-28 13:16:54 +01:00 |
enoch kan | bc77f22afc | Updated model.py docstrings | 2025-01-05 18:24:31 +00:00 |
GeeeekExplorer | fd011c11aa | torch rmsnorm | 2025-01-05 14:33:48 +08:00 |
stack-heap-overflow | 4c2fdb8f55 | Release DeepSeek-V3 | 2024-12-26 19:01:57 +08:00 |