Comments for Bandit Algorithms
http://banditalgs.com
Mon, 21 Aug 2017 22:52:42 +0000

Comment on Adversarial bandits by Csaba Szepesvari
http://banditalgs.com/2016/10/01/adversarial-bandits/#comment-112
Mon, 21 Aug 2017 22:52:42 +0000
Thanks, good catch. We have corrected the page.

Comment on The Upper Confidence Bound Algorithm by Csaba Szepesvari
http://banditalgs.com/2016/09/18/the-upper-confidence-bound-algorithm/#comment-111
Mon, 21 Aug 2017 22:50:26 +0000
Where is this? The trick we use is that we bound the *expected* number of pulls of the suboptimal arms. Hence, each suboptimal arm is compared to the optimal arm, one by one, separately, avoiding the need to argue about multiple suboptimal arms at the same time. I hope this clarifies things.
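The decomposition behind this argument can be sketched as follows (notation assumed from the relevant chapter: $\mu^*$ is the optimal mean, $\Delta_i$ the gap of arm $i$, and $T_i(n)$ its pull count up to time $n$):

```latex
% Regret decomposition (a sketch; symbols as assumed above):
% bounding E[T_i(n)] for each suboptimal arm i separately suffices.
R_n \;=\; n\mu^* - \mathbb{E}\Big[\sum_{t=1}^{n} X_t\Big]
    \;=\; \sum_{i:\,\Delta_i>0} \Delta_i\,\mathbb{E}\big[T_i(n)\big]
```

So it suffices to bound $\mathbb{E}[T_i(n)]$ for each suboptimal arm $i$ on its own, with no simultaneous argument over several suboptimal arms.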

Comment on The Upper Confidence Bound Algorithm by Csaba Szepesvari
http://banditalgs.com/2016/09/18/the-upper-confidence-bound-algorithm/#comment-110
Mon, 21 Aug 2017 22:48:25 +0000
Hi Xiang! Sorry for the slow response. Where is the bug? The universal constant just relies on bounding a constant plus $\log \log n$ by $C \log n$, it seems to me.
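One way this bound can go through, assuming $c \ge 0$ and $n \ge e$ (so that $\log n \ge 1$, and $\log \log n \le \log n$ since $\log x \le x$):

```latex
% c <= c log n because log n >= 1; log log n <= log n because log x <= x.
c + \log\log n \;\le\; c\,\log n + \log n \;=\; (c+1)\,\log n
```

so taking $C = c + 1$ suffices.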

Comment on Ellipsoidal Confidence Sets for Least-Squares Estimators by Csaba Szepesvari
http://banditalgs.com/2016/11/13/ellipsoidal-confidence-bounds-for-least-squares-estimators/#comment-109
Mon, 21 Aug 2017 22:43:52 +0000
Hi!
Thanks for the comment. You are right: $n$ should have been $t$ here. Oh, and I edited the page to reflect this.
– Csaba

Comment on Ellipsoidal Confidence Sets for Least-Squares Estimators by Hairi
http://banditalgs.com/2016/11/13/ellipsoidal-confidence-bounds-for-least-squares-estimators/#comment-108
Mon, 21 Aug 2017 19:46:35 +0000
Thank you, Dr. Lattimore. I have a question about the derivation of the inequality after (6), where, in the bracket, you first let $s=n$, then $s=n-1$, etc. But what is $n$? Should it be $s=t$ instead?

Comment on The Upper Confidence Bound Algorithm by Hairi
http://banditalgs.com/2016/09/18/the-upper-confidence-bound-algorithm/#comment-102
Mon, 12 Jun 2017 05:09:59 +0000
Hi Professor, when we reason that arm 1 is optimal rather than arm $j$ ($j \neq 1$), we say that we have a $1-\delta$ level of confidence. But shouldn't we also say that arm 1 is optimal compared to all the other $K-1$ arms, so that the confidence level would be $(1-\delta)^{K-1}$?

Comment on More information theory and minimax lower bounds by Csaba Szepesvari
http://banditalgs.com/2016/09/28/more-information-theory-and-minimax-lower-bounds/#comment-101
Sat, 10 Jun 2017 17:29:42 +0000
If you take some $\alpha$, you can solve for the optimizing $\Delta$ in the lower bound proof. Since $\Delta<1$ is expected, $\Delta^\alpha$ decreases as $\alpha$ increases. This means you have less information when $\alpha$ is bigger (the KL divergence is smaller). Hence, the adversary can afford to choose a larger gap, leading to a larger lower bound for larger $\alpha$. In simple terms, $1/\Delta^\alpha$ samples are necessary to detect a gap of size $\Delta$. While these samples are being used, the regret incurred is $\Delta/\Delta^\alpha = \Delta^{1-\alpha}$, showing the same as above.
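Balancing the two effects makes the resulting rate explicit; a back-of-the-envelope sketch over a horizon of $n$ rounds (ignoring the dependence on the number of arms), assuming as above that about $1/\Delta^\alpha$ samples are needed to detect a gap of size $\Delta$:

```latex
% The largest gap the learner cannot detect in n rounds satisfies
% n = 1/\Delta^\alpha, and this gap is the adversary's best choice.
\Delta \;=\; n^{-1/\alpha}
\quad\Longrightarrow\quad
n\,\Delta \;=\; n^{1-1/\alpha}
```

For $\alpha = 2$, the Gaussian case, this recovers the familiar $\sqrt{n}$ minimax rate.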

Comment on More information theory and minimax lower bounds by Jie Zhong
http://banditalgs.com/2016/09/28/more-information-theory-and-minimax-lower-bounds/#comment-100
Fri, 09 Jun 2017 18:15:06 +0000
Interesting. Though I don't have any examples where the divergence is of order $\Delta^\alpha$, I am very curious how the order $\alpha$ affects the bound. The smaller, the better, or the other way around?

Comment on More information theory and minimax lower bounds by Csaba Szepesvari
http://banditalgs.com/2016/09/28/more-information-theory-and-minimax-lower-bounds/#comment-99
Thu, 08 Jun 2017 06:27:12 +0000
Sure, it would. Do you have some interesting example in mind?

Comment on More information theory and minimax lower bounds by Jie Zhong
http://banditalgs.com/2016/09/28/more-information-theory-and-minimax-lower-bounds/#comment-98
Thu, 08 Jun 2017 04:01:00 +0000
Hi Csaba,
In Note 1, if the KL divergence for some distribution is of order $\Delta$ instead of $\Delta^2$ as in the Gaussian case, or, more generally, of order $\Delta^\alpha$, will the lower bound change?