Email: ys646 [at] stanford.edu
My research focuses on an algorithmic framework called test-time training. Its core idea is that each test instance defines its own learning problem, with its own target of generalization. In practice, this usually means training a different model on the fly for each test instance using self-supervision.
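Below is a minimal PyTorch sketch of this recipe, using rotation prediction as the self-supervised task (as in the ICML 2020 paper listed below). The model split and all names here are illustrative, not taken from the released code.

```python
# Test-time training sketch: for each test instance, adapt a fresh copy of
# the shared feature extractor and self-supervised head on that instance,
# then predict with the frozen main-task head.
import copy
import torch
import torch.nn as nn

extractor = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 64), nn.ReLU())
main_head = nn.Linear(64, 10)  # main task: 10-way classification
ssl_head = nn.Linear(64, 4)    # self-supervised task: 4-way rotation prediction

def rotate(x, k):
    # Rotate an image batch (N, C, H, W) by k * 90 degrees.
    return torch.rot90(x, k, dims=(2, 3))

def predict_with_ttt(x, steps=10, lr=1e-3):
    """Adapt on a single test instance x of shape (1, 3, 32, 32), then classify it."""
    ext, ssl = copy.deepcopy(extractor), copy.deepcopy(ssl_head)
    opt = torch.optim.SGD(list(ext.parameters()) + list(ssl.parameters()), lr=lr)
    for _ in range(steps):
        # The test instance defines its own learning problem:
        # predict which of four rotations was applied to it.
        views = torch.cat([rotate(x, k) for k in range(4)])
        labels = torch.arange(4)
        loss = nn.functional.cross_entropy(ssl(ext(views)), labels)
        opt.zero_grad()
        loss.backward()
        opt.step()
    with torch.no_grad():
        return main_head(ext(x)).argmax(dim=1)  # main head stays frozen

x_test = torch.randn(1, 3, 32, 32)
print(predict_with_ttt(x_test))
```

Each test instance gets its own adapted model, discarded after prediction; only the self-supervised objective drives the update, since no label is available at test time.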
I am a postdoc at Stanford University, hosted by Carlos Guestrin, Tatsu Hashimoto, and Sanmi Koyejo. I completed my PhD in EECS at UC Berkeley, advised by Alyosha Efros and Moritz Hardt. During my undergrad at Cornell University, I worked with Kilian Weinberger.
For a complete list of papers, please see my Google Scholar.
* indicates equal contribution. † indicates equal advising.
Learning to (Learn at Test Time): RNNs with Expressive Hidden States
Yu Sun*, Xinhao Li*, Karan Dalal*, Jiarui Xu, Arjun Vikram, Genghan Zhang, Yann Dubois,
Xinlei Chen†, Xiaolong Wang†, Sanmi Koyejo†, Tatsunori Hashimoto†, Carlos Guestrin†
[paper]
[JAX code]
[PyTorch code]
Learning to (Learn at Test Time)
Yu Sun*, Xinhao Li*, Karan Dalal, Chloe Hsu, Sanmi Koyejo, Carlos Guestrin,
Xiaolong Wang, Tatsunori Hashimoto†, Xinlei Chen†
[paper]
[code]
Test-Time Training on Nearest Neighbors for Large Language Models
Moritz Hardt, Yu Sun
ICLR 2024
[paper]
[code]
Test-Time Training on Video Streams
Renhao Wang*, Yu Sun*, Yossi Gandelsman, Xinlei Chen, Alexei A. Efros, Xiaolong Wang
JMLR
[paper]
[website]
Test-Time Training with Masked Autoencoders
Yossi Gandelsman*, Yu Sun*, Xinlei Chen, Alexei A. Efros
NeurIPS 2022
[paper]
[website]
Test-Time Training with Self-Supervision for Generalization under Distribution Shifts
Yu Sun, Xiaolong Wang, Zhuang Liu, John Miller, Alexei A. Efros, Moritz Hardt
ICML 2020
[paper]
[website]
[talk]
* indicates alphabetical order.
On Calibration of Modern Neural Networks
Chuan Guo*, Geoff Pleiss*, Yu Sun*, Kilian Q. Weinberger
ICML 2017
[paper]
[code]
Deep Networks with Stochastic Depth
Gao Huang*, Yu Sun*, Zhuang Liu, Daniel Sedra, Kilian Q. Weinberger
ECCV 2016
[paper]
[code]
[talk]