Should this say “$\mathcal{G}/2^{[N]}$-measurable map”?

]]>But the leading constant in c*(ν) is 2, while the leading constant of the lower bound in (10) is 1/4, so they differ by a factor of about 8! How can we say that UCB is almost optimal? Is a factor of 8 negligible? ]]>

Typo: c*(n) should be c*(ν) at the end of the following display:

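Since UCB is consistent for $\mathcal{E}'_K \supset \mathcal{E}'_K(\mathcal{N})$,

$$c^*(\nu) = \inf_{\pi \in \Pi_{\mathrm{cons}}(\mathcal{E}'_K(\mathcal{N}))} \liminf_{n\to\infty} \frac{R_n(\pi,\nu)}{\log(n)} \le \limsup_{n\to\infty} \frac{R_n(\mathrm{UCB},\nu)}{\log(n)} \le c^*(n)$$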

(By the way, I have been wondering how I can write equations like you do.)

In addition, you said p ≥ 1, but you wrote “Since p>0 was arbitrary, it follows that this also holds for p=0.” Why p=0?

]]>However, I have one question.

You wrote this:

“in fact, it follows from our previous analysis that for sufficiently large n, UCB will achieve a regret of $\sum_{i:\Delta_i(\nu)>0} \frac{C\log(n)}{\Delta_i(\nu)} \approx CK\log(n)$, where $\Delta_i(\nu)$ is the suboptimality gap of action i in ν”

Before saying this, you wrote that on ν the immediate regret (Δ) is close to 1. But in the statement above, it looks like you took Δ ≈ 1/K. So I wonder whether Δ is really close to 1. What do you say?
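
To make the arithmetic behind this question concrete, here is a minimal sketch that evaluates the quoted sum for the two cases being compared. The function name `gap_sum_bound`, the choice C = 1, and the values of K and n are assumptions made only for illustration; none of them come from the post.

```python
import math

def gap_sum_bound(gaps, n, C=1.0):
    """Evaluate the sum over arms with positive gap of C * log(n) / gap."""
    return sum(C * math.log(n) / g for g in gaps if g > 0)

# Hypothetical example values: K arms, horizon n.
K, n = 10, 10_000

# One optimal arm (gap 0) and K - 1 suboptimal arms.
gaps_near_one = [0.0] + [1.0] * (K - 1)        # every gap equal to 1
gaps_one_over_k = [0.0] + [1.0 / K] * (K - 1)  # every gap equal to 1/K

bound_near_one = gap_sum_bound(gaps_near_one, n)      # C * (K - 1) * log(n)
bound_one_over_k = gap_sum_bound(gaps_one_over_k, n)  # C * K * (K - 1) * log(n)

# Print both bounds in units of log(n).
print(bound_near_one / math.log(n), bound_one_over_k / math.log(n))
```

With gaps close to 1 the sum is of order K log(n), which matches the quoted ≈ CK log(n). With gaps of order 1/K the same sum would instead be of order K² log(n).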

]]>