Title: KataGo's author says he trained weights about as strong as ELFv2 in far less time
Author: lu01 Time: 2019-6-12 04:38
KataGo does use 50% gating. One reason is the heuristic demo showing that, under an ideal model, AGZ's 55% gating is too conservative if the distribution of new net strengths is not too bad. The other is that AZ came along after that and showed that the distribution of new net strengths in Go is in fact so not-bad that 0 gating is actually okay! @gjm11's point 3 is definitely on my mind as one of the possible factors here. However, I have NOT ever done a controlled test between the two, so I have no actual evidence.
Currently my GPUs are entirely consumed by re-running ablations for a new version of the paper, but I'd be up for doing a controlled test within KataGo in a month or two.
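For readers unfamiliar with gating, here is a minimal sketch of the idea in Python: a candidate net replaces the current best only if its win rate in evaluation games reaches a threshold. This is not KataGo's actual code. The function names, the 400-game match length, and the 52% true win probability are all hypothetical values chosen for illustration; only the 55%, 50%, and 0 thresholds come from the post.

```python
import random


def passes_gate(wins: int, games: int, threshold: float) -> bool:
    """Return True if the candidate's observed win rate meets the gating threshold."""
    if games <= 0:
        raise ValueError("games must be positive")
    return wins / games >= threshold


def acceptance_rate(true_win_prob: float, games: int, threshold: float,
                    trials: int = 2000, seed: int = 0) -> float:
    """Estimate how often a candidate with a given true win probability passes the gate.

    Each trial simulates one evaluation match of `games` independent games.
    """
    rng = random.Random(seed)
    accepted = 0
    for _ in range(trials):
        wins = sum(rng.random() < true_win_prob for _ in range(games))
        if passes_gate(wins, games, threshold):
            accepted += 1
    return accepted / trials


if __name__ == "__main__":
    # Hypothetical candidate that is slightly stronger than the current best.
    for threshold in (0.55, 0.50, 0.0):
        rate = acceptance_rate(true_win_prob=0.52, games=400, threshold=threshold)
        print(f"threshold {threshold:.2f}: accepted in {rate:.1%} of simulated matches")
```

Under these assumed numbers, a stricter threshold rejects a genuinely (slightly) stronger candidate more often, which is the sense in which 55% gating can be "too conservative". A threshold of 0 accepts every candidate, which corresponds to no gating.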
Regarding strength, KataGo has just now finished a new run. The new run surpassed the peak strength of the old one-week run in only 3.5 days on 20 to 28 V100s. After 18 to 19 days, the final 20-block, 256-channel network has finished at around LZ-ELFv2 strength. Given that ELF used more than 50 times more compute than this (according to Facebook's paper), I'm happy with this.
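To put those figures on one scale, here is a rough back-of-the-envelope calculation in Python. It assumes compute can be approximated as days multiplied by GPU count, and that the "more than 50 times" ratio applies directly to GPU-days. Both are simplifications added for illustration, not something the post states, and the helper name is hypothetical.

```python
def gpu_day_range(days_min: float, days_max: float,
                  gpus_min: int, gpus_max: int) -> tuple:
    """Return (lowest, highest) GPU-days for a run with uncertain length and GPU count."""
    return (days_min * gpus_min, days_max * gpus_max)


# Figures quoted in the post: 20 to 28 V100s throughout.
full_run = gpu_day_range(18, 19, 20, 28)       # whole run, 18 to 19 days
catch_up = gpu_day_range(3.5, 3.5, 20, 28)     # time to surpass the old run

# Assumption: treat "more than 50 times more compute" as a factor of at least 50.
ELF_FACTOR = 50
elf_lower_bound = (ELF_FACTOR * full_run[0], ELF_FACTOR * full_run[1])

if __name__ == "__main__":
    print(f"Full run: {full_run[0]} to {full_run[1]} V100-days")
    print(f"Surpassing the old run: {catch_up[0]} to {catch_up[1]} V100-days")
    print(f"ELF, at least: {elf_lower_bound[0]} to {elf_lower_bound[1]} V100-day equivalents")
```

So the whole run comes to roughly 360 to 532 V100-days under these assumptions, and the implied figure for ELF is at least 18,000 V100-day equivalents.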
Tests also indicate roughly similar strength to LZ190 or LZ195 at equal visits, consistent with being around ELFv2. At equal time rather than equal visits, it may be stronger than LZ190, since LZ's network is 40 blocks. However, I think LZ's GPU implementation might be more efficient than mine for playing single games rather than hundreds in parallel, and if that's true it may offset that advantage. I have never compared against LZ229 or said anything about it, though.