Hope this is in the right section. I'm having trouble ironing out an apparent inconsistency between two matrix trace derivative rules.
Two particular rules for matrix trace derivatives are
[math]\frac{\partial}{\partial\mathbf{X}} Tr(\mathbf{X}^2\mathbf{A})=(\mathbf{X}\mathbf{A}+\mathbf{A}\mathbf{X})^T[/math]
and
[math] \frac{\partial}{\partial\mathbf{X}} Tr(\mathbf{X}\mathbf{A}\mathbf{X}^T)=\mathbf{X}\mathbf{A}^T+\mathbf{X}\mathbf{A}[/math]
Now assume that [math]\mathbf{A}[/math] is diagonal and [math] \mathbf{X}[/math] is anti-symmetric, so that [math]\mathbf{X}^T=-\mathbf{X}[/math]. Then, by the cyclic property of the trace, [math]Tr(\mathbf{X}\mathbf{A}\mathbf{X}^T)=-Tr(\mathbf{X}\mathbf{A}\mathbf{X})=-Tr(\mathbf{X}^2\mathbf{A})[/math]. So the two derivatives should be equal up to a minus sign, no?
However, using [math]\mathbf{A}^T=\mathbf{A}[/math] and [math]\mathbf{X}^T=-\mathbf{X}[/math], the first rule returns the derivative
[math]- (\mathbf{X}\mathbf{A}+\mathbf{A}\mathbf{X})[/math]
and the second returns
[math] 2\mathbf{X}\mathbf{A}[/math].
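Since [math]\mathbf{X}[/math] and [math]\mathbf{A}[/math] do not commute in general, [math]\mathbf{X}\mathbf{A}+\mathbf{A}\mathbf{X}\neq 2\mathbf{X}\mathbf{A}[/math], so the two results differ by more than a sign. Am I missing something?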
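For what it's worth, here is a rough numerical sanity check of the statements above. It is only a sketch: the 3x3 matrices `A` and `X` are arbitrary values picked for illustration, the helper names (`matmul`, `numeric_grad`, and so on) are my own, and the finite-difference gradient perturbs every entry of [math]\mathbf{X}[/math] independently, which I am assuming is what the two rules mean by [math]\partial/\partial\mathbf{X}[/math]. It uses plain Python so it runs without any libraries.

```python
# Finite-difference check of the two trace-derivative rules (no dependencies).
# A and X are arbitrary illustrative choices: A diagonal, X anti-symmetric.

def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def transpose(P):
    return [list(col) for col in zip(*P)]

def trace(P):
    return sum(P[i][i] for i in range(len(P)))

def add(P, Q):
    return [[p + q for p, q in zip(rp, rq)] for rp, rq in zip(P, Q)]

def max_abs_diff(P, Q):
    return max(abs(p - q) for rp, rq in zip(P, Q) for p, q in zip(rp, rq))

def f1(X, A):
    """Tr(X^2 A)"""
    return trace(matmul(matmul(X, X), A))

def f2(X, A):
    """Tr(X A X^T)"""
    return trace(matmul(matmul(X, A), transpose(X)))

def numeric_grad(f, X, A, h=1e-6):
    """Central differences, treating every entry X[i][j] as independent."""
    n = len(X)
    G = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            Xp = [row[:] for row in X]
            Xm = [row[:] for row in X]
            Xp[i][j] += h
            Xm[i][j] -= h
            G[i][j] = (f(Xp, A) - f(Xm, A)) / (2 * h)
    return G

A = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 3.0]]
X = [[0.0, 1.0, -2.0], [-1.0, 0.0, 3.0], [2.0, -3.0, 0.0]]

XA = matmul(X, A)
AX = matmul(A, X)
rule1 = transpose(add(XA, AX))            # (XA + AX)^T
rule2 = add(matmul(X, transpose(A)), XA)  # X A^T + X A

# The trace identity -Tr(X^2 A) = Tr(X A X^T) at this anti-symmetric X.
identity_gap = abs(f1(X, A) + f2(X, A))

# Each rule compared with its entrywise finite-difference gradient.
rule1_err = max_abs_diff(numeric_grad(f1, X, A), rule1)
rule2_err = max_abs_diff(numeric_grad(f2, X, A), rule2)

# Here rule1 = -(XA + AX), so the derivative of -Tr(X^2 A) is XA + AX.
# Compare that with rule2 = 2 XA.
mismatch = max_abs_diff(add(XA, AX), rule2)

print("identity gap:", identity_gap)
print("rule 1 vs finite differences:", rule1_err)
print("rule 2 vs finite differences:", rule2_err)
print("max |(XA + AX) - 2XA|:", mismatch)
```

If I run this, the trace identity holds at the chosen [math]\mathbf{X}[/math] and each rule matches its own finite-difference gradient, yet the last number is nonzero, which is exactly the mismatch described above.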