Instance dependent lower bounds
In the last post we showed that under mild assumptions ($n = \Omega(K)$ and Gaussian noise), the regret in the worst case is at least $\Omega(\sqrt{Kn})$. More precisely, we showed that for every policy $\pi$ and $n\ge K-1$ there exists Continue Reading