The paper “8-bit Transformer Inference and Fine-tuning for Edge Accelerators,” authored by Jeffrey Yu, Kartik Prabhu, Yonatan Urman, Robert M. Radway, Eric Han, and Priyanka Raina, will be presented at ASPLOS 2024. The authors have fully open-sourced the code, and the paper received all three ASPLOS artifact evaluation badges: “Artifacts Available,” “Artifacts Evaluated,” and “Results Reproduced.”
Research from Priyanka Raina’s group on quantized transformer inference and training accepted at ASPLOS 2024
