- Fix expand_kv in components/attention.rs to use repeat()
- Fix forward_prefill to use k_t/v_t for expand_kv
- Fix forward_decode with proper transposes and contiguous()
- Add GQA shape tests
- Add tokenizer loading from model directory in server
WIP: Still has reshape errors in decode phase
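
For context on the expand_kv fix: in grouped-query attention (GQA), several query heads share one KV head, so the KV tensor is expanded until its head count matches the query head count. The sketch below is not the code from components/attention.rs. It is a minimal illustration on a flat row-major slice, assuming a [batch, kv_heads, seq_len, head_dim] layout and a hypothetical n_rep parameter (query heads divided by KV heads). It copies each KV head n_rep times in a row, so query head q lines up with KV head q / n_rep. Whether a tensor library's repeat() produces this interleaved order or a tiled one depends on the library and on which dimension is repeated, which is the kind of thing shape tests alone may not catch.

```rust
/// Expand grouped KV heads so every query head has a matching KV head.
/// Assumed input layout:  [batch, kv_heads, seq_len, head_dim], row-major.
/// Output layout:         [batch, kv_heads * n_rep, seq_len, head_dim].
/// All names and the flat-slice representation are illustrative assumptions.
fn expand_kv(
    x: &[f32],
    batch: usize,
    kv_heads: usize,
    seq_len: usize,
    head_dim: usize,
    n_rep: usize,
) -> Vec<f32> {
    assert_eq!(x.len(), batch * kv_heads * seq_len * head_dim);
    let head_size = seq_len * head_dim;
    let mut out = Vec::with_capacity(x.len() * n_rep);
    for b in 0..batch {
        for h in 0..kv_heads {
            let start = (b * kv_heads + h) * head_size;
            let head = &x[start..start + head_size];
            // Emit the same KV head n_rep times consecutively, so that
            // query head q reads KV head q / n_rep.
            for _ in 0..n_rep {
                out.extend_from_slice(head);
            }
        }
    }
    out
}
```

With 2 KV heads, 4 query heads (n_rep = 2), one position, and head_dim 2, the input heads [1, 2] and [3, 4] expand to [1, 2], [1, 2], [3, 4], [3, 4]. A tiled expansion would instead give [1, 2], [3, 4], [1, 2], [3, 4], which has the same shape but pairs query heads with the wrong KV heads.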