I'm attempting to show the superiority of a collaborative agent over
the other agents in a learning system. In such a system there is
exactly one agent that can learn both from interaction with the
environment and from the other learning agents in the system. The
other agents learn only from their own interaction with the
environment, independently, and are not aware of the collaborative
agent or of any other agent.
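To make the setup concrete, here is a minimal Python sketch, assuming
every agent is a tabular Q-learner on a toy chain environment. The
mixing rule in `collaborative_mix` (moving the collaborative agent's
table toward the mean of the other agents' tables with weight `beta`)
is only a hypothetical stand-in for "learning from the other agents";
it is not the rule used in my derivation, and all function and
parameter names are illustrative.

```python
import random

def q_update(Q, s, a, r, s_next, done, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step on the table Q (a list of rows)."""
    target = r if done else r + gamma * max(Q[s_next])
    Q[s][a] += alpha * (target - Q[s][a])

def collaborative_mix(Q_collab, Q_others, beta=0.1):
    """Hypothetical rule: pull each entry toward the other agents' mean."""
    for s, row in enumerate(Q_collab):
        for a in range(len(row)):
            mean = sum(Q[s][a] for Q in Q_others) / len(Q_others)
            row[a] += beta * (mean - row[a])

def step(s, a, n_states):
    """Toy chain (assumed): action 1 moves right, action 0 moves left;
    reaching the last state gives reward 1 and ends the episode."""
    s_next = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
    done = s_next == n_states - 1
    return s_next, (1.0 if done else 0.0), done

def run_episode(Q, n_states, rng, eps=0.2, max_steps=100):
    """One epsilon-greedy episode; learns from the environment only."""
    s = 0
    for _ in range(max_steps):
        if rng.random() < eps:
            a = rng.randrange(2)
        else:
            a = max(range(2), key=lambda x: Q[s][x])
        s_next, r, done = step(s, a, n_states)
        q_update(Q, s, a, r, s_next, done)
        if done:
            break
        s = s_next

def run(n_agents=3, n_states=5, episodes=200, seed=0):
    """Agent 0 is the collaborative one; the rest are independent."""
    rng = random.Random(seed)
    tables = [[[0.0, 0.0] for _ in range(n_states)] for _ in range(n_agents)]
    collab, others = tables[0], tables[1:]
    for _ in range(episodes):
        for Q in tables:          # every agent interacts with the environment
            run_episode(Q, n_states, rng)
        collaborative_mix(collab, others)  # only agent 0 sees the others
    return collab, others
```

The independent agents never read any table other than their own,
which matches the assumption that they are unaware of the other
agents.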
I don't know whether superiority can be proved without attaching
conditions to the proof, so I added a few conditions. Here is my
progress (it's short):
If I can show that Equation 7 holds, then the superiority of the
collaborative agent will be demonstrated.
However, to show that Equation 7 holds, I first have to show that the
residuals on both sides of the equation are positive. I don't know how
to show that (maybe by adding another assumption?). If I manage it, the
next stage will be to show that Equation 7 itself holds, which I
currently don't know how to do either.
Here is a convergence proof for a single Q-learner, as suggested by
[Jaakkola et al., 1994]:
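For reference, the standard tabular Q-learning update and the
step-size conditions that this kind of convergence result relies on
can be written as follows. The notation (Q_t, \alpha_t, \gamma, r_t)
is the common textbook one and is an assumption here, not necessarily
the notation of the proof itself:

```latex
\begin{align}
Q_{t+1}(s_t, a_t) &= \bigl(1 - \alpha_t(s_t, a_t)\bigr)\, Q_t(s_t, a_t)
  + \alpha_t(s_t, a_t) \Bigl[ r_t + \gamma \max_{a'} Q_t(s_{t+1}, a') \Bigr], \\
\sum_{t} \alpha_t(s, a) &= \infty, \qquad
\sum_{t} \alpha_t^2(s, a) < \infty \quad \text{for all } (s, a).
\end{align}
```

Under these step-size conditions, with bounded reward variance, every
state-action pair updated infinitely often, and \gamma < 1, Q_t
converges to the optimal Q^* with probability 1.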
I hope someone can help, give an opinion, or share some thoughts on
this issue.
Thanks and happy new year!