| field | value |
|---|---|
| title | Stochastic Approximation with Delayed Updates: Finite-Time Rates under Markovian Sampling |
| abstract | Motivated by applications in large-scale and multi-agent reinforcement learning, we study the non-asymptotic performance of stochastic approximation (SA) schemes with delayed updates under Markovian sampling. While the effect of delays has been extensively studied for optimization, the manner in which they interact with the underlying Markov process to shape the finite-time performance of SA remains poorly understood. In this context, our first main contribution is to show that under time-varying bounded delays, the delayed SA update rule guarantees exponentially fast convergence of the \emph{last iterate} to a ball around the SA operator's fixed point. Notably, our bound is \emph{tight} in its dependence on both the maximum delay… |
| layout | inproceedings |
| series | Proceedings of Machine Learning Research |
| publisher | PMLR |
| issn | 2640-3498 |
| id | adibi24a |
| month | 0 |
| tex_title | Stochastic Approximation with Delayed Updates: Finite-Time Rates under {M}arkovian Sampling |
| firstpage | 2746 |
| lastpage | 2754 |
| page | 2746-2754 |
| order | 2746 |
| cycles | false |
| bibtex_author | Adibi, Arman and {D}al Fabbro, Nicol\`{o} and Schenato, Luca and Kulkarni, Sanjeev and Poor, H. Vincent and Pappas, George J. and Hassani, Hamed and Mitra, Aritra |
| author | |
| date | 2024-04-18 |
| address | |
| container-title | Proceedings of The 27th International Conference on Artificial Intelligence and Statistics |
| volume | 238 |
| genre | inproceedings |
| issued | |
| extras | |