AQUA: Attention via QUery mAgnitudes for Memory and Compute Efficient Inference in LLMs
