From a mathematical standpoint, the whole thing comes from the calculus of variations: basically, you try to minimize or maximize an integral in much the same way that you'd set the first derivative to zero to find the critical points of a function. It sounds useless and somewhat whimsical, but it lets you work out things like the shape of a hanging rope or the shortest route from A to B over rough terrain.
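To make the derivative analogy a bit more concrete, here is a rough sketch of the condition that plays the role of "set the derivative to zero", followed by the simplest worked case I can think of. The symbols (J, F, y, a, b) are just my own notation, and I'm assuming a flat plane, a smooth curve y(x), and fixed endpoints.

```latex
% Quantity to be made stationary: an integral that depends on a whole curve y(x)
J[y] = \int_a^b F\bigl(x,\, y(x),\, y'(x)\bigr)\, dx

% Analogue of f'(x) = 0: the curve y must satisfy
\frac{\partial F}{\partial y} - \frac{d}{dx}\!\left(\frac{\partial F}{\partial y'}\right) = 0

% Worked case: shortest path between two points in a flat plane.
% Arc length gives F = \sqrt{1 + y'^2}, which has no explicit y dependence, so
\frac{\partial F}{\partial y} = 0
\quad\Longrightarrow\quad
\frac{d}{dx}\!\left(\frac{y'}{\sqrt{1 + y'^2}}\right) = 0
\quad\Longrightarrow\quad
y' = \text{const}
\quad\Longrightarrow\quad
y = mx + c
```

So the shortest path comes out as a straight line, as you'd hope. Over rough terrain the integrand F picks up extra dependence on position and the resulting equation is harder to solve, but the recipe is the same.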
The general method was developed by Euler and Lagrange in the 1750s (hence they're called the Euler-Lagrange equations). I'm not totally boned up on who did what when, but the dates rule out any collaboration with Hamilton: he was born in 1805, after Euler had died (1783), and was still a child when Lagrange died (1813). As far as I know, Lagrange wrote to Euler with his method in 1755 and Euler took it up from there, while Hamilton's Principle, the formulation in terms of the action, came much later, in the 1830s, building on their work.