Towards Automated Kernel Generation in the Era of LLMs
Yang Yu, Peiyu Zang, Chi Hsu Tsai, and 11 others
The performance of modern AI systems is fundamentally constrained by the quality of their underlying kernels, which translate high-level algorithmic semantics into low-level hardware operations. Achieving near-optimal kernels requires expert-level understanding of hardware architectures and programm...