Accelerating a Triton Fused Kernel for W4A16 Quantized Inference with SplitK work decomposition

Papers citing "Accelerating a Triton Fused Kernel for W4A16 Quantized Inference with SplitK work decomposition"