Tree Learning: Optimal Algorithms and Sample Complexity

Published March 3, 2023

About this talk

A Google TechTalk, presented by Dmitrii Avdyukhin, 2023-02-21.

ABSTRACT: We study the problem of learning a hierarchical tree representation of data from labeled samples drawn from an arbitrary (and possibly adversarial) distribution. Consider a collection of data tuples labeled according to their hierarchical structure. The smallest number of such tuples needed to accurately label subsequent tuples is of interest for data collection in machine learning. We present optimal sample complexity bounds for this problem in several learning settings, including (agnostic) PAC learning and online learning. Our results are based on tight bounds on the Natarajan and Littlestone dimensions of the associated problem. The corresponding tree classifiers can be constructed efficiently in near-linear time.

Bio: Dmitrii is a final-year PhD student at Indiana University, advised by Prof. Grigory Yaroslavtsev. His main research area is continuous optimization, in particular in Federated Learning settings. He is also broadly interested in the theoretical foundations of machine learning and approximation algorithms. Previously, he interned at Meta, working on a gradient descent-based algorithm for balanced graph partitioning, and at Amazon, working on graph convolutional networks, federated learning, and few-shot learning.

A Google Research Algorithms Seminar.