"esp", not "eps" but that's the right idea. Yes, popping a register will do the same thing. Often, "ecx" is used as a "scratch register" for this purpose (Dr. Paul Carter does it this way in his tutorial). "eax" is used for the return value from a function. In the case of _printf, this isn't very interesting. In other cases it is, so you might not want to "pop eax" to clean up the stack, as a general rule.
You don't really need to clean up the stack after each function call. It can be "deferred" and several calls cleaned up at once. It may not be obvious at first, but "call" and "ret" use the stack to store the return address. "call" pushes the return address - the address of the next instruction after the call - and jumps to the function. When the function gets to "ret", it essentially pops that address off the stack and jumps back to it. At this point, esp had "better be" pointing at a valid return address, or we're going to crash! (oh yeah, we did!)
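A minimal sketch of "deferred" cleanup, again assuming 32-bit cdecl code linked against a C library and called as "_main" by the C startup code. The labels "fmt", "msg1" and "msg2" are hypothetical.

```nasm
; Sketch only: 32-bit NASM, cdecl. fmt, msg1, msg2 and _main are hypothetical names.
extern _printf
global _main

section .data
fmt     db "%s", 10, 0
msg1    db "first", 0
msg2    db "second", 0

section .text
_main:                  ; on entry, [esp] holds the caller's return address
    push msg1
    push fmt
    call _printf        ; "call" pushes a return address, _printf's "ret" pops it

    push msg2
    push fmt
    call _printf        ; no cleanup yet, the first two arguments are still there

    add esp, 16         ; 4 pushes * 4 bytes, all cleaned up in one go

    xor eax, eax        ; return 0
    ret                 ; esp points at the caller's return address again
```

If the "add esp, 16" is left out (or the count is wrong), "ret" pops one of our old arguments and jumps to it as if it were an address, which is the crash.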
"xor" is a good way to zero a register - "mov eax, 0" won't kill ya. "pop eax" would overwrite all of eax, so there's no point in zeroing it first. You might want to zero it before the "ret". Returning zero traditionally indicates "no error". We really don't care, but it's "the right thing to do" to return something meaningful. (and to check return values for error before proceding, when appropriate)
It isn't entirely clear, in your last example, whether "_start:" is called. In Linux, the "_start:" label is not called - the very first thing on the stack is "argc", not a return address - so "ret" isn't going to work. I think the entry point is always called in Windows. It might be "safer" to exit with _exit (or ExitProcess). In either case "push 0" first (or some other meaningful value) - you don't have to clean up after this one, since it never returns.
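Something like this is what I mean by exiting with _exit instead of "ret". It's only a sketch, assuming 32-bit code linked against a C library that provides _exit, with a hypothetical "_main" as the entry point.

```nasm
; Sketch only: 32-bit NASM, cdecl, C library provides _exit.
; _main is a hypothetical entry point name.
extern _exit
global _main

section .text
_main:
    push 0              ; exit code: 0 = "no error"
    call _exit          ; never returns, so no stack cleanup afterwards

; On 32-bit Linux without a C library, the equivalent at "_start" is the
; sys_exit system call:
;     mov eax, 1        ; sys_exit
;     xor ebx, ebx      ; exit code 0
;     int 0x80
```

This works no matter what is on the stack at entry, since nothing ever gets popped as a return address.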
Best,
Frank