Recent Publications


When forecasting time series with a hierarchical structure, the existing state of the art is to forecast each time series …
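
As a concrete illustration of that baseline (not this paper's method), here is a minimal Python sketch: each leaf series is forecast independently with a placeholder last-value model, and a bottom-up pass aggregates the leaf forecasts so the hierarchy stays coherent. The two-level hierarchy, the function names, and the naive forecaster are all illustrative assumptions.

```python
import numpy as np

def forecast_one(series, horizon):
    # Placeholder univariate model (naive last-value forecast);
    # any per-series forecaster could be dropped in here.
    return np.full(horizon, series[-1])

def bottom_up(leaves, horizon):
    """Forecast each leaf series independently, then aggregate so the
    hierarchy stays coherent: the total forecast equals the sum of the
    leaf forecasts by construction."""
    leaf_fc = {name: forecast_one(s, horizon) for name, s in leaves.items()}
    total_fc = np.sum(list(leaf_fc.values()), axis=0)
    return total_fc, leaf_fc

# Two leaf series whose parent is their sum.
leaves = {"a": np.array([1.0, 2.0, 3.0]), "b": np.array([4.0, 4.0, 5.0])}
total, per_leaf = bottom_up(leaves, horizon=2)
```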

MISO, also known as Finito, was one of the first stochastic variance-reduced methods discovered, yet it remains relatively little used. Its …

In this paper, we consider distributed algorithms for solving the empirical risk minimization problem under the master/worker …

We consider the problem of minimizing the sum of three convex functions: i) a smooth function $f$ in the form of an expectation or a …

We consider a new extension of the extragradient method that is motivated by approximating implicit updates. Since in a recent work …
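
For context, a minimal sketch of the baseline being extended (not the paper's extension): the classical extragradient method takes a look-ahead step, then updates from the operator evaluated at the look-ahead point. The step size and the bilinear test problem below are illustrative assumptions.

```python
import numpy as np

def extragradient(F, x0, step=0.1, iters=1000):
    """Plain extragradient for a (monotone) operator F,
    e.g. the gradient field of a convex-concave saddle problem."""
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        x_half = x - step * F(x)       # extrapolation (look-ahead) step
        x = x - step * F(x_half)       # update with the look-ahead operator value
    return x

# Bilinear saddle problem min_u max_v u*v, i.e. F(u, v) = (v, -u);
# plain gradient descent-ascent spirals outward here, while
# extragradient converges to the solution (0, 0).
sol = extragradient(lambda z: np.array([z[1], -z[0]]), x0=[1.0, 1.0])
```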

We consider distributed optimization where the objective function is spread among different devices, each sending incremental model …

Many popular distributed optimization methods for training machine learning models fit the following template: a local gradient …

Training very large machine learning models requires a distributed computing approach, with communication of the model updates often …

The last decade witnessed a rise in the importance of supervised learning applications involving big data and big models. Big data …

We propose a randomized first-order optimization method, SEGA (SkEtched GrAdient method), which progressively throughout its …
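
A rough sketch of the coordinate-sketch special case of such a method, under my reading (the step size and the quadratic test problem are illustrative assumptions, not the paper's setup): a running gradient estimate is corrected one coordinate at a time from a partial-derivative oracle, and the iterate moves along a debiased, unbiased gradient estimate.

```python
import numpy as np

def sega_coordinate(grad_i, x0, step=0.05, iters=5000, seed=0):
    """Coordinate-sketch variant: each step observes one partial
    derivative grad_i(x, i). h tracks the full gradient via a
    sketch-and-project update; g debiases it so E[g] = grad f(x)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    n = x.size
    h = np.zeros(n)
    for _ in range(iters):
        i = rng.integers(n)
        r = grad_i(x, i) - h[i]   # residual at the sampled coordinate
        g = h.copy()
        g[i] += n * r             # unbiased estimate of the full gradient
        h[i] += r                 # update the running gradient estimate
        x = x - step * g
    return x

# Illustrative quadratic f(x) = 0.5 * ||x||^2, so grad_i(x, i) = x[i];
# the iterates converge to the minimizer 0.
x_final = sega_coordinate(lambda x, i: x[i], x0=np.ones(5))
```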

Recent Posts


From 15 to 18 July, I'm attending the Frontiers of Deep Learning workshop at the Simons Institute.

From 17 to 28 June, I visited Matthias Ehrhardt.

Our work on time series was accepted as a poster at the Time Series Workshop at ICML, and I presented it together with Federico Vaggi.

After a successful round of reviews for ICML, I was invited to serve on the program committees of two more major ML conferences.

I will be at EPFL, visiting the Machine Learning and Optimization Laboratory led by Martin Jaggi.

Contact