Abstract:
Many recent theoretical works on meta-learning aim to achieve guarantees in leveraging similar representational structures from related tasks towards simplifying a target task. The main aim of theoretical guarantees on the subject is to establish the extent to which convergence rates---in learning a common representation---may scale with the number N of tasks (as well as the number of samples per task). First steps in this setting demonstrate this property when both the shared representation amongst tasks, and task-specific regression functions, are linear. This linear setting readily reveals the benefits of aggregating tasks, e.g., via averaging arguments. In practice, however, the representation is often highly nonlinear, introducing nontrivial biases in each task that cannot easily be averaged out as in the linear case.
In the present work, we derive theoretical guarantees for meta-learning with nonlinear representations. In particular, assuming the shared nonlinearity maps to an infinite-dimensional reproducing kernel Hilbert space, we show that additional biases can be mitigated with careful regularization that leverages the smoothness of task-specific regression functions, yielding improved rates that scale with the number of tasks as desired.
https://arxiv.org/abs/2307.10870
Bio:
Dr Zhu Li is currently a Senior Research Fellow at the Gatsby Computational Neuroscience Unit, University College London. He previously obtained his PhD in statistical machine learning at the Department of Statistics, University of Oxford. He will be joining the Department of Mathematics at Imperial College later this year. His research focuses on establishing rigorous theoretical foundations for multi-stage learning, including meta-learning, fine-tuning and causal inference with kernel methods. His work has been awarded an ICML Best Paper Honourable Mention.