Hmm, you're leaving out integer parsing and writing (implying (un)archival, moving around pieces of text...), yet I see that with TIGCC 0.95 and an empty _main(), the 89z takes 2745 bytes...
That's not what I would call an optimization.

And you see the evils of static linking: would you have suspected that this small piece of code could generate nearly 3 KB of code?