Hi everybody, I have a question about machine learning. I'm not sure how joint entropy and mutual information work.
Given:
https://imgur.com/mZU383m
For the first equation, we have:
https://imgur.com/C1zIHFT
H(x,y) seems to be 'everything that is not in common between x and y'. But for the second equation:
https://imgur.com/a/3iWIyps
In this case H(x,y) cannot be 'everything that is not in common between x and y'; otherwise the result would not be the mutual information I(x,y). So how should I read H(x,y)?
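
To make the question concrete, here is a minimal Python sketch that computes the quantities numerically. It assumes the second equation is the standard identity I(x,y) = H(x) + H(y) - H(x,y); the linked images are not reproduced here, so that is an assumption. The joint distribution `joint` and the helper `entropy` are hypothetical, made up only for illustration.

```python
import math

def entropy(probs):
    """Shannon entropy in bits of an iterable of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Hypothetical joint distribution p(x, y) over two binary variables.
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

# Marginals p(x) and p(y), obtained by summing the joint over the other variable.
px, py = {}, {}
for (x, y), p in joint.items():
    px[x] = px.get(x, 0.0) + p
    py[y] = py.get(y, 0.0) + p

h_x = entropy(px.values())      # H(x)
h_y = entropy(py.values())      # H(y)
h_xy = entropy(joint.values())  # H(x,y), entropy of the joint distribution

# Mutual information via the assumed identity I = H(x) + H(y) - H(x,y).
mi_identity = h_x + h_y - h_xy

# Mutual information directly from its definition, for comparison.
mi_direct = sum(
    p * math.log2(p / (px[x] * py[y]))
    for (x, y), p in joint.items()
    if p > 0
)

print(f"H(x)={h_x:.4f}  H(y)={h_y:.4f}  H(x,y)={h_xy:.4f}")
print(f"I via identity={mi_identity:.4f}  I via definition={mi_direct:.4f}")
```

In this example H(x,y) comes out at least as large as either H(x) or H(y) and no larger than their sum, which may help when deciding which region of the diagram it corresponds to.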