English 中文(简体)
Dummy operations handling of Intel processor
原标题:

Admittedly, I have a bit silly question. Basically, I am wondering if there are some special mechanisms provided by Intel processors to efficiently execute a series of dummy, i.e., NOP instructions? For instance,I could imagine there could be some kind of pre-fetch mechanism that identifies NOPS, discards them and tries to fetch some useful instructions instead. Or are these NOPS dispatched to the execution unit as normal instructions, meaning that i can roughly process 5 nops each cycle (under the assumption that there are 5 execution units)

Thanks, Reinhard

问题回答

Discarding them would be pretty bad idea: they are often used for busy-waiting. If you discard NOPs, you make your wait-loop much tighter than it should be and potentially introduce considerable communications overhead.

If you feel that NOPs are inefficient, you could try HLT which saves some energy. Or you could even send the CPU into a sleep state. However, these only make sense if you want to "do nothing" for a considerable amount of time and they usually require suvervisor privileges.

No. They are decoded and executed as normal instructions; there is hardware support to remove the false dependency that would otherwise be introduced on the EAX register for the single byte NOP, 0x90 (which is really xchg eax, eax), but that s all.

Reference: Intel(R) 64 and IA-32 Architectures Optimization Reference Manual - section 3.5.1.8, "Using NOPs".

There s very little need for optimizing sequences of no-ops on the x86 architecture because it has no-op encodings of varying lengths. Instead of many one-byte no-ops, one can just use a single multi-byte no-op. Somewhat more work for the decoder, but the actual execution units only see a single instruction to execute.





相关问题
How can I do a CPU cache flush in x86 Windows?

I am interested in forcing a CPU cache flush in Windows (for benchmarking reasons, I want to emulate starting with no data in CPU cache), preferably a basic C implementation or Win32 call. Is there a ...

What are the names of the new X86_64 processors registers?

Where can I find the names of the new registers for assembly on this architecture? I am referring to registers in X86 like EAX, ESP, EBX, etc. But I d like them in 64bit. I don t think they are the ...

How do I disassemble raw 16-bit x86 machine code?

I d like to disassemble the MBR (first 512 bytes) of a bootable x86 disk that I have. I have copied the MBR to a file using dd if=/dev/my-device of=mbr bs=512 count=1 Any suggestions for a Linux ...

32bit application access to 64bit registry

I have an OS Shell written in 32bit that is replacing the Explorer.exe of a Vista machine. I run a utility which is also written in 32bit, which allows to switch between the Explorer shell and My ...

Left Shift Overflow on 68k/x86?

I heard that the Motorola 68000 and Intel x86 architectures handle overflow from left shifting differently. Specifically the 68k LSL vs. the Intel SAL/SHL assembly instructions. Does anyone know the ...

Translate a FOR to assembler

I need to translate what is commented within the method, to assembler. I have a roughly idea, but can t. Anyone can help me please? Is for an Intel x32 architecture: int secuencia ( int n, ...

热门标签