We exposed the class member td_error in the TDControlAgent in an attempt to log algorithm progress, but the values seem noisy and do not converge even when the agent finds an excellent representation. Is td_error hidden for a reason; are we missing something by trying to track it?