KataGo does use 50% gating. That choice comes from the heuristic demo showing that AGZ's 55% gating is too conservative under an ideal model, as long as the distribution of new net strengths is not too bad... and from AZ coming along after that and showing that the distribution of new net strengths in Go is in fact so not-bad that no gating at all is actually okay! @gjm11's point 3 is definitely on my mind as one of the possible factors here. However, I have NOT ever done a controlled test between the two, so I have no actual evidence.
Currently my GPUs are entirely consumed by re-running ablation runs for a new paper version, but I'd be up for doing a controlled test in a month or two within KataGo.
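To make the gating trade-off concrete, here is a toy Python sketch. It is not the heuristic demo mentioned above and not KataGo's training code. It assumes, purely for illustration, that the latest net's Elo follows a random walk with positive drift, that match results follow the standard logistic Elo model, and that a candidate replaces the self-play net when its match win rate reaches the threshold. The function names and the `drift`, `noise`, `steps`, and `games` values are all made up.

```python
import random


def elo_to_winrate(elo_diff):
    """Expected score under the standard logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


def simulate_gating(threshold, steps=500, games=400,
                    drift=10.0, noise=20.0, seed=0):
    """Toy model of gated promotion (all parameters are hypothetical).

    Each step the latest net changes by gauss(drift, noise) Elo, then plays
    a `games`-game match against the current self-play net and is promoted
    if its win fraction is at least `threshold`.
    """
    rng = random.Random(seed)
    latest = 0.0      # Elo of the most recently trained net
    best = 0.0        # Elo of the net currently generating self-play
    promotions = 0
    total_lag = 0.0   # accumulated (latest - best) after each gating decision
    for _ in range(steps):
        latest += rng.gauss(drift, noise)
        p_win = elo_to_winrate(latest - best)
        wins = sum(1 for _ in range(games) if rng.random() < p_win)
        if wins >= threshold * games:
            best = latest
            promotions += 1
        total_lag += latest - best
    return {
        "promotions": promotions,
        "mean_lag_elo": total_lag / steps,
        "final_best_elo": best,
    }


if __name__ == "__main__":
    # threshold 0.0 means no gating: every new net is promoted.
    for threshold in (0.0, 0.50, 0.55):
        print(threshold, simulate_gating(threshold))
```

Varying `drift` relative to `noise` is a rough stand-in for how "bad" the distribution of new net strengths is. The model ignores any feedback from self-play data quality into training, so it cannot substitute for a controlled test.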
Regarding strength, KataGo has just now finished a new run. The new run surpassed the peak strength of the old 1-week-long run in only 3.5 days with (20-28)xV100. After 18 to 19 days now, the final 20-block, 256-channel network has finished at around LZ-ELFv2 strength. Given that ELF used more than 50 times as much compute as this (according to Facebook's paper), I'm happy with this.
Tests also indicate roughly similar strength to LZ190 or LZ195 at equal visits, consistent with being around ELFv2. At equal time instead of equal visits, it may be stronger than LZ190, since LZ's network is 40 blocks. But I also think LZ's GPU implementation might be more efficient than mine for playing single games rather than hundreds in parallel, and if that's true it may offset the advantage of the smaller network. I've never said anything about LZ229 or compared against it, though.