I am broadly interested in understanding machine learning systems through mathematics. In particular, I think a lot about approximation theory, statistical learning theory, geometry, and dynamical systems. I am also tangentially interested in interactions between ML and more abstract fields (category theory, homotopy theory, algebraic topology).
Preprints
- Understanding In-Context Learning on Structured Manifolds: Bridging Attention to Kernel Methods. Zhaiming Shen, Alex Hsu, Rongjie Lai, Wenjing Liao. Under review. (2025) [arXiv]
We explore a connection between attention and kernel methods, extending it to derive generalization error bounds for in-context kernel regression on manifolds. Along the way, we construct a transformer that realizes kernel regression in the in-context learning (ICL) setting. (A toy sketch of the attention-kernel connection appears after this list.)
- The layer number of grids. Gergely Ambrus, Alexander Hsu, Bo Peng, Shiyu Yan. (2020) [arXiv]
We study the number of convex layers of integer grids in higher dimensions. Undergrad summer research (unfortunately over Zoom). Never published; at the time there were sharper bounds via different techniques, and eventually our results were subsumed by newer work.
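
As a rough illustration of the attention-kernel connection mentioned in the first preprint: this is a minimal toy sketch of the well-known correspondence between softmax attention and Nadaraya-Watson kernel regression, not the construction or the bounds from the paper. The setup and names (`X`, `y`, `x_query`, `tau`) are my own illustrative choices: queries and keys are set to the context inputs, values to the context labels, and the softmax temperature plays the role of a kernel bandwidth.

```python
import numpy as np

# Toy sketch (not the paper's construction): single-head softmax attention over
# (x_i, y_i) context pairs, with queries/keys equal to the inputs and values
# equal to the labels, computes a Nadaraya-Watson kernel regression estimate
# with an exponential kernel K(x, x_i) = exp(<x, x_i> / tau).

rng = np.random.default_rng(0)
d, n = 3, 50                        # input dimension, number of context points
X = rng.normal(size=(n, d))         # context inputs x_1, ..., x_n
y = np.sin(X.sum(axis=1))           # context labels y_i = f(x_i)
x_query = rng.normal(size=(1, d))   # query point

tau = 1.0                           # softmax temperature / kernel bandwidth

# Attention view: scores = <query, key> / tau, output = softmax(scores) @ values.
scores = (x_query @ X.T) / tau
weights = np.exp(scores - scores.max())   # max-subtraction for numerical stability
weights /= weights.sum()
attn_output = weights @ y

# Kernel view: Nadaraya-Watson estimate with the exponential kernel.
K = np.exp((x_query @ X.T) / tau)
nw_estimate = (K @ y) / K.sum()

print(attn_output, nw_estimate)     # identical up to floating-point error
```

The two outputs agree up to floating-point error; the point of the sketch is only the basic identity, whereas the paper's construction and generalization analysis on manifolds go well beyond it.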
Publications
...or rather, the lack thereof.