I'm not sure whether this is expected behavior, but below are results from running the example with and without OptOnFly. With OptOnFly enabled, over many iterations we tend to move away from a good solution rather than towards it.

Training on the fly

Not training on the fly