I agree, Acme. It's essentially an A/B taste test of two RISC compilers. (I'd go with IBM or SPARC.) I'd directly measure the length of the generated machine code, watch memory usage (probably not a problem, but remember the step where you store the intermediate value), and run a stress test of factorials overnight. Finally, I'd verify the output values against known-correct results rather than just accepting whatever comes out.
I'd compare the results against a standard RISC compiler's handling of the same factorial tests, and also against a CISC compiler, which both the standard and your modified RISC compilers should beat.
I suspect you'll find, Asterisk, that if you try CISC machines and CISC compilers, your algorithm will perform worse. But that's because they lack the optimized primitives, so that's OK: your goal isn't a better CISC algorithm, but a better RISC algorithm. If your algorithm really works right, it will work well only on RISCs.