
Yu Sun

Email: ys646 [at] stanford.edu

My research focuses on an algorithmic framework called test-time training. Its core idea is that each test instance defines its own learning problem. This is usually realized by training a different model on the fly for each test instance using self-supervision.
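As a rough illustration of that idea, here is a minimal sketch, not the method from any of the papers below. It assumes a toy linear model, a masked-reconstruction objective as the self-supervised task, and plain gradient descent. The function names, the mask, the learning rate, and the step count are all hypothetical choices made for this example. Each test instance gets its own copy of the initial weights, which is then trained only on that instance.

```python
import numpy as np

def self_supervised_loss(W, x, mask):
    """Reconstruction loss: predict the full x from its masked copy."""
    residual = W @ (x * mask) - x
    return 0.5 * float(residual @ residual)

def adapt_to_instance(W0, x, mask, lr=0.01, steps=10):
    """Return a copy of W0 trained on the single test instance x."""
    W = W0.copy()  # every instance starts from the same initial weights
    x_masked = x * mask
    for _ in range(steps):
        residual = W @ x_masked - x
        W -= lr * np.outer(residual, x_masked)  # gradient of the loss w.r.t. W
    return W

rng = np.random.default_rng(0)
dim = 8
W0 = np.eye(dim)  # assumed initial weights (illustrative only)
mask = (np.arange(dim) % 2 == 0).astype(float)  # keep even coordinates only
test_instances = rng.normal(size=(3, dim))

losses_before, losses_after = [], []
for x in test_instances:
    W_x = adapt_to_instance(W0, x, mask)  # a different model per instance
    losses_before.append(self_supervised_loss(W0, x, mask))
    losses_after.append(self_supervised_loss(W_x, x, mask))
```

In this sketch the adapted weights `W_x` are discarded after each instance, so `W0` is never modified. A practical system would also use the adapted model to make a prediction on a main task, which is omitted here for brevity.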

I'm a postdoc at Stanford University, hosted by Carlos Guestrin, Tatsu Hashimoto, and Sanmi Koyejo. I'm also a researcher at NVIDIA. I completed my PhD in 2023 at UC Berkeley, advised by Alyosha Efros and Moritz Hardt. My PhD thesis is titled Test-Time Training. During my undergrad at Cornell University, I worked with Kilian Weinberger.

For a complete list of papers, please see my Google Scholar.

Selected Papers

* indicates equal contribution.

Learning to (Learn at Test Time): RNNs with Expressive Hidden States
Yu Sun*, Xinhao Li*, Karan Dalal*, Jiarui Xu, Arjun Vikram, Genghan Zhang, Yann Dubois, Xinlei Chen†, Xiaolong Wang†, Sanmi Koyejo†, Tatsunori Hashimoto†, Carlos Guestrin†
[paper] [JAX code] [PyTorch code]

Test-Time Training on Nearest Neighbors for Large Language Models
Moritz Hardt, Yu Sun
ICLR 2024
[paper] [code]

Test-Time Training on Video Streams
Renhao Wang*, Yu Sun*, Arnuv Tandon, Yossi Gandelsman, Xinlei Chen, Alexei A. Efros, Xiaolong Wang
JMLR
[paper] [website]

Test-Time Training with Masked Autoencoders
Yossi Gandelsman*, Yu Sun*, Xinlei Chen, Alexei A. Efros
NeurIPS 2022
[paper] [website]

Test-Time Training with Self-Supervision for Generalization under Distribution Shifts
Yu Sun, Xiaolong Wang, Zhuang Liu, John Miller, Alexei A. Efros, Moritz Hardt
ICML 2020
[paper] [website] [talk]

Older Papers

* indicates alphabetical order.

On Calibration of Modern Neural Networks
Chuan Guo*, Geoff Pleiss*, Yu Sun*, Kilian Q. Weinberger
ICML 2017
[paper] [code]

Deep Networks with Stochastic Depth
Gao Huang*, Yu Sun*, Zhuang Liu, Daniel Sedra, Kilian Q. Weinberger
ECCV 2016
[paper] [code] [talk]

From Word Embeddings To Document Distances
Matt Kusner, Yu Sun, Nicholas Kolkin, Kilian Q. Weinberger
ICML 2015
[paper]