One thing people didn't mention is addressing. In 32-bit protected mode the segment registers are meaningful: SS, DS and CS can each have a different base address. In 64-bit mode that can't happen: the bases of CS, DS, ES and SS are treated as zero, and the only segment registers that can still have a base (but no limit) are FS and GS. This means that in 32-bit mode ds:[ebx] and cs:[ebx] can refer to different memory, which allows some nastiness. But OSes usually don't do this and use a flat layout with all bases at zero.
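To make the segment point concrete, here is a rough Python model of how the linear address is formed. It is only a sketch: the function names and the example base values are made up for illustration, and limit and permission checks are left out.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF

def linear_address_32(segment_base, offset):
    """32-bit protected mode: linear address = segment base + offset,
    truncated to 32 bits. Limit checks are not modelled."""
    return (segment_base + offset) & MASK32

def linear_address_64(segment, bases, offset):
    """64-bit mode: only FS and GS contribute a base; the bases of
    CS, DS, ES and SS are treated as zero."""
    base = bases.get(segment, 0) if segment in ("fs", "gs") else 0
    return (base + offset) & MASK64

# Hypothetical base values, chosen only for the example.
bases = {"cs": 0x10000, "ds": 0x20000, "fs": 0x7000}
ebx = 0x1234

print(hex(linear_address_32(bases["cs"], ebx)))   # cs:[ebx] in 32-bit mode
print(hex(linear_address_32(bases["ds"], ebx)))   # ds:[ebx] in 32-bit mode
print(hex(linear_address_64("ds", bases, ebx)))   # ds base ignored in 64-bit mode
print(hex(linear_address_64("fs", bases, ebx)))   # fs base still applies
```

With non-zero bases the two 32-bit lookups land on different addresses, which is the "nastiness" mentioned above.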
Another thing people didn't mention here is that writing to a 32-bit register in 64-bit mode clears the upper half of the full 64-bit register, but only for 32-bit writes. E.g. mov eax,0 leaves rax equal to 0, whereas mov ax,0 (or mov al,0) doesn't touch the upper bits. So it's a bit tricky when reading assembly.
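If it helps, the partial-register rule can be modelled in a few lines of Python. This is just an illustration of the behaviour described above; the helper names are invented for the example.

```python
MASK64 = (1 << 64) - 1

def write_eax(rax, value):
    """32-bit write: the upper 32 bits of rax are cleared."""
    return value & 0xFFFFFFFF

def write_ax(rax, value):
    """16-bit write: bits 16..63 of rax are preserved."""
    return (rax & ~0xFFFF & MASK64) | (value & 0xFFFF)

def write_al(rax, value):
    """8-bit write: bits 8..63 of rax are preserved."""
    return (rax & ~0xFF & MASK64) | (value & 0xFF)

rax = 0x1122334455667788
print(hex(write_eax(rax, 0)))  # like mov eax,0
print(hex(write_ax(rax, 0)))   # like mov ax,0
print(hex(write_al(rax, 0)))   # like mov al,0
```

Only the first call wipes the whole register; the other two leave the old upper bits in place.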
As for the stack, it is more a question of the OS than the CPU. The Windows ABI for x64 is different from the one used by everyone else (Linux, Mac ...). You probably need to look more deeply at "calling conventions" and ABIs (application binary interfaces). However, on x64 both ABIs require RSP to be 16-byte aligned at the point of a call (so at function entry, right after the return address has been pushed, it is off by 8), which is why you'll often see dummy rsp decrements. This is to make sure 16-byte values on the stack are always aligned. But at the CPU level it's all the same: RSP decrements, and push is still "sp-=word_size ; ram[sp]=value". Oh, and in 64-bit mode the stack has no limit, whereas in 32-bit mode you can tell the CPU (through the stack segment's limit) that the stack pointer can't go below a certain address, so stack accesses to lower addresses will cause a fault.
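Here is a small Python sketch of the push rule and of why those dummy decrements show up. It assumes an 8-byte word and a 16-byte alignment requirement; the starting rsp value, the fake return address and the helper names are all made up for the example.

```python
WORD = 8  # word size in 64-bit mode

def push(rsp, memory, value):
    """push: sp -= word_size ; ram[sp] = value"""
    rsp -= WORD
    memory[rsp] = value
    return rsp

def aligned_allocation(rsp, wanted, alignment=16):
    """Smallest n >= wanted such that (rsp - n) is a multiple of alignment."""
    return wanted + ((rsp - wanted) % alignment)

memory = {}
rsp = 0x7FFFFFFFE000                 # aligned just before the call

rsp = push(rsp, memory, 0x401000)    # call pushes the return address
print(rsp % 16)                      # off by 8 at function entry

rsp = push(rsp, memory, 0)           # callee saves a register, aligned again
print(rsp % 16)

# The callee wants 24 bytes of locals but must stay aligned for its own calls.
print(aligned_allocation(rsp, 24))
```

The extra bytes beyond the 24 requested are the padding you see as a seemingly pointless rsp adjustment in disassembly.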
I'm not sure what you're asking exactly. Maybe a more specific question would permit a more specific answer.