update history
2022-04-18 v32 update required. First play urgency. dfpn tsume-solver for all nodes. w3954, 54540k games.
2022-04-07 v30 update required. Root policy temperature is changed from 1.0 to 1.8. +50 ELO stronger. Randomness up to 30 moves was reverted to the original AlphaZero format. The win rate was not well adjusted to stay within a certain range, so all moves in the initial phase had the same win rate. kldgain=0.000006. 789 playouts per move on average. w3933, 53930k games.
2022-02-25 v28 update required. Network structure is changed. NN input now includes the piece movable area. Policy output is changed from 11259 (139x9x9) to 2187 (27x9x9). Swish activation. Tsume solver up to 3 plies. Randomness up to 30 plies is based on the one-ply searched value. w3881, 52390k games.
2022-01-10 v18 update required. From fixed 800 playouts/move to variable 100 to 3200 playouts/move (average 777 playouts). Generated games are +76 ELO stronger. kldgain=0.0000013. w3806, 50130k games.
2021-12-24 Value loss target is changed from 'game result' to the average of 'searched winrate' and 'game result'. Changing the temperature from 1.0 to 1.3 gives +33 ELO. w3770, 49050k games.
2021-09-30 v2.0 update required. Temperature up to 30 moves is changed from 1.0 to 1.3. Now we use 20 block. 40 block is up to no47075521; from no47075522 it is 20 block. Training does not use any 40 block games. w3703 is the same as w3459 (last 20 block).
2021-09-30 40 block is finished. Thank you! Server has stopped. Experimental 20 block server will start in a few days.
2021-09-05 Learning rate was changed from 0.000001 to 0.0000005 on July 30. 44460k games, w3616.
2021-08-31 Another distributed Deep reinforcement learning for Shogi handicap games, AobaKomaochi is running.
2021-04-27 AobaZero has finished, and moved to 40 block. 39825k games, w3459 is last 20 block, w3460 is 40 block. Thank you!
2021-03-11 weight_decay(L2 regularization) is changed from 0.00004 to 0.0002. Back to original value. 34987k games, w3299.
2021-01-31 Drop the learning rate to 0.0000002. (from 30474k games, w3148).
2021-01-11 Replay buffer is changed from the past 500000 games to the past 1000000 games. 28352k games, w3077.
2020-12-28 Weight update interval is changed from every 10000 games to every 34285 games. 26700k games, w3022.
2020-12-11 v1.9 Fixed a bug where a one-ply mate by Sente returned loss(-1) instead of win(+1).
2020-12-06 weight_decay is changed from 0.0002 to 0.00004. 23982k games, w2750. We reached 24000k games, but we'll continue a little more with some other hyperparameters.
2020-12-01 v1.8 Add resign threshold, like %TORYO,'autousi,resign-th=0.229996,v=0.230,...
2020-11-23 v1.7 Update required.
2020-11-21 Training games are generated by the develop branch only. Release will be updated soon. Resign winrate is auto adjustable. Server has been updated. *.csa format has been changed. Winrate is added like +6978KI,'v=0.545,800,6978KI, ... (from 22123k games, w2564).
2020-10-25 v1.6 Update required. Resign winrate is 10% for generating games.
2020-10-24 resign is available for generating games(develop branch). 18956k games, w2250.
2020-10-22 v1.5 OpenCL is 5 times faster.
2020-09-28 Drop the learning rate to 0.000002. (from 16948k games, w2047).
2020-07-08 Drop the learning rate to 0.00002. (from 10980k games, w1450).
2020-06-28 Colab generates games 13 times faster(Tesla T4) by using develop branch.
2019-11-08 Drop the learning rate to 0.0002. (from 4340k games, w787).
2019-10-29 Drop the learning rate to 0.02. (from 4220k games, w775).
2019-07-09 v1.4 Update required. Random seed for visit count sampling was constant.
2019-07-08 v1.3 Tree reuse. PV and score is available on Shogidokoro.
2019-05-29 v1.2 Update required. MCTS initial value is not draw(0), but loss(-1).
2019-05-01 v1.0 Release.
2020-09-04 Server will stop from 18:00, 4th September(JST) for 27 hours due to a power outage.
2019-11-30 Server will stop from 20:00, 30th November(JST) for 24 hours due to a power outage.
2019-09-07 Server will stop from 08:00 AM, 7th September(JST) for 24 hours due to a power outage.
2019-07-06 Server will stop from 14:00, 6th July(JST) for 24 hours. This is for backup, OS update, and v1.3 release.
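
A few of the entries above describe small calculations that are easy to misread in one-line form. The sketch below is only an illustration and is not taken from the AobaZero source: the function names are hypothetical, the value scale (-1 for loss to +1 for win) is an assumption, and the visit-count formula is the usual AlphaZero-style sampling rule, which the entries do not spell out. It covers the averaged value target (2021-12-24), the policy output sizes (2022-02-25), and the effect of raising the temperature from 1.0 to 1.3 (2021-09-30).

```python
def value_target(game_result: float, searched_winrate: float) -> float:
    """Average of the game result and the searched winrate.

    Assumption: both inputs are on the same scale, e.g. -1 (loss) to +1 (win).
    """
    return 0.5 * (game_result + searched_winrate)


def policy_size(planes: int, board: int = 9) -> int:
    """Number of policy outputs for `planes` move planes on a board x board grid."""
    return planes * board * board


def visit_probs(visits: list[int], temperature: float) -> list[float]:
    """Move probabilities proportional to visits ** (1 / temperature).

    Assumption: this is the standard AlphaZero-style rule; a higher
    temperature flattens the distribution and adds randomness.
    """
    weights = [n ** (1.0 / temperature) for n in visits]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    print(value_target(1.0, 0.2))               # target for a won game searched at 0.2
    print(policy_size(139), policy_size(27))    # old and new policy output sizes
    print(visit_probs([600, 150, 50], 1.0))
    print(visit_probs([600, 150, 50], 1.3))     # flatter than at temperature 1.0
```

With temperature 1.3 the most-visited move receives a smaller share than at 1.0, which is the extra opening randomness the temperature entries refer to.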