But why should the energy stored in a dielectric differ from that of the same charge distribution in a vacuum?
For a concrete example, consider again a parallel-plate capacitor with charge \(Q\), plate area \(A\), and separation \(d\), ignoring edge effects, and fill it with a dielectric of constant \(k\). The dielectric then carries a bound surface charge density of \(-\frac{(k-1)Q}{kA}\) on the face next to the positively charged plate, and the opposite density on the face next to the negatively charged plate. This reduces the electric field by a factor of \(k\). Writing \(E = \frac{Q}{\varepsilon_0 A}\) for the field the plates would produce with no dielectric, this full charge arrangement (free plus bound charge), considered in a vacuum, has energy \(\frac{AdE^2 \varepsilon_0}{2k^2}\). But with a dielectric, apparently you must replace \( \varepsilon_0 \) with \( \varepsilon = k\varepsilon_0 \) in the energy density, and the energy is now suddenly \( \frac{AdE^2 \varepsilon_0}{2k} \) for the same charge distribution!
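
For reference, here is a sketch of how the two energies quoted above can be reproduced. It assumes that \(E\) in those expressions denotes the field without the dielectric, \(E = \frac{Q}{\varepsilon_0 A}\), since that is the reading under which both formulas are consistent. The symbols \(E_{\text{net}}\), \(U_{\text{vac}}\), and \(U_{\text{diel}}\) are introduced here only for illustration.

\[
\begin{aligned}
\sigma_{\text{net}} &= \frac{Q}{A} - \frac{(k-1)Q}{kA} = \frac{Q}{kA},
\qquad
E_{\text{net}} = \frac{\sigma_{\text{net}}}{\varepsilon_0} = \frac{E}{k}, \\
U_{\text{vac}} &= \frac{1}{2}\,\varepsilon_0 E_{\text{net}}^2\,(Ad)
 = \frac{AdE^2\varepsilon_0}{2k^2}
 = \frac{Q^2 d}{2k^2\varepsilon_0 A}, \\
U_{\text{diel}} &= \frac{1}{2}\,\varepsilon E_{\text{net}}^2\,(Ad)
 = \frac{1}{2}\,k\varepsilon_0\,\frac{E^2}{k^2}\,(Ad)
 = \frac{AdE^2\varepsilon_0}{2k}
 = \frac{Q^2 d}{2k\varepsilon_0 A}.
\end{aligned}
\]

Under these assumptions the two results differ by exactly a factor of \(k\), \(U_{\text{diel}} = k\,U_{\text{vac}}\), even though the field \(E_{\text{net}}\) is the same in both cases.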