When I run python main.py --train_pg, the reward or Avg.reward is negative. What's wrong with it?
When I run
python main.py --train_pg
the reward or Avg.reward is negative. What's wrong with it?