Training Started: 20251211_121404
Number of Episodes: 2000
Print Frequency: 20
Target Gap Height: 16.491741 mm
Network: 256 hidden units with LayerNorm
Policy LR: 5e-4, Value LR: 1e-3, Entropy: 0.02
======================================================================
Ep 20 | R: -16131.1 | Len: 500 | R/s: -32.26 (-4032.8%) | Gap: 4.41mm (min: 4.26) | Best: 4.26mm
Ep 40 | R: -16140.1 | Len: 500 | R/s: -32.28 (-4035.0%) | Gap: 4.36mm (min: 4.25) | Best: 4.25mm
Ep 60 | R: -16140.8 | Len: 500 | R/s: -32.28 (-4035.2%) | Gap: 4.35mm (min: 4.22) | Best: 4.22mm
Ep 80 | R: -16143.7 | Len: 500 | R/s: -32.29 (-4035.9%) | Gap: 4.33mm (min: 4.22) | Best: 4.22mm
Ep 100 | R: -16142.5 | Len: 500 | R/s: -32.29 (-4035.6%) | Gap: 4.35mm (min: 4.26) | Best: 4.22mm
Ep 120 | R: -16142.2 | Len: 500 | R/s: -32.28 (-4035.5%) | Gap: 4.35mm (min: 4.26) | Best: 4.22mm
Ep 140 | R: -16142.9 | Len: 500 | R/s: -32.29 (-4035.7%) | Gap: 4.33mm (min: 4.23) | Best: 4.22mm
Ep 160 | R: -16144.1 | Len: 500 | R/s: -32.29 (-4036.0%) | Gap: 4.32mm (min: 4.21) | Best: 4.21mm
Ep 180 | R: -16141.0 | Len: 500 | R/s: -32.28 (-4035.3%) | Gap: 4.36mm (min: 4.22) | Best: 4.21mm
Ep 200 | R: -16143.8 | Len: 500 | R/s: -32.29 (-4035.9%) | Gap: 4.33mm (min: 4.21) | Best: 4.21mm
Ep 220 | R: -16144.3 | Len: 500 | R/s: -32.29 (-4036.1%) | Gap: 4.33mm (min: 4.23) | Best: 4.21mm
Ep 240 | R: -16145.9 | Len: 500 | R/s: -32.29 (-4036.5%) | Gap: 4.32mm (min: 4.21) | Best: 4.21mm
Ep 260 | R: -16142.5 | Len: 500 | R/s: -32.28 (-4035.6%) | Gap: 4.34mm (min: 4.24) | Best: 4.21mm
Ep 280 | R: -16146.9 | Len: 500 | R/s: -32.29 (-4036.7%) | Gap: 4.32mm (min: 4.21) | Best: 4.21mm
Ep 300 | R: -16145.6 | Len: 500 | R/s: -32.29 (-4036.4%) | Gap: 4.34mm (min: 4.21) | Best: 4.21mm