What kind of address instruction does the x86 cpu have?
I learned about one-address, two-address, and three-address instructions, but now I'd like to know: what kind of address instructions does x86 use?
x86 cpu computer-science cpu-architecture instruction-set
By "address", do you mean "operand"?
– Sneftel
Nov 15 '18 at 17:56
@Sneftel: yes, in abstract ISA-classification terminology, it means operand, in the same way that the 5-bit register fields in a MIPS instruction word are "addresses". (I don't know if geeksforgeeks.org/… is any good, but that's the terminology they use.)
– Peter Cordes
Nov 15 '18 at 17:58
edited Nov 16 '18 at 2:41 by Peter Cordes
asked Nov 15 '18 at 17:51 by Jolt151
1 Answer
x86 is a register machine where at most one operand of any instruction can be an explicit memory address instead of a register, using an addressing mode like [rdi + rax*4]. (There are instructions that can have two memory operands, with one or both being implicit, though: see What x86 instructions take two (or more) memory operands?)
Typical x86 integer instructions have 2 operands, both explicit, like add eax, edx, which does eax += edx.
Legacy x87 FP code uses 1-operand instructions with the x87 register stack, like faddp st1, where the top of the x87 stack (st0) is an implicit operand. SSE2 is baseline for x86-64, so x87 is no longer widely used.
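The stack-based, one-explicit-operand style can be modelled in a few lines. This is only a rough sketch: the list-based stack and the function name faddp are illustrative assumptions, and it ignores x87 details such as 80-bit precision, tag bits, and stack overflow/underflow exceptions.

```python
def faddp(stack, i=1):
    """Toy model of `faddp st(i)`: st(i) += st(0), then pop st(0).

    `stack` is a plain list whose LAST element plays the role of st0
    (an assumption of this sketch, not how the hardware stores it).
    """
    if not 1 <= i < len(stack):
        raise ValueError("st(i) must name an occupied slot other than st0")
    st0 = stack[-1]          # implicit operand: top of stack
    stack[-1 - i] += st0     # explicit operand: st(i) is source and destination
    stack.pop()              # the 'p' suffix pops the stack
    return stack


regs = [1.5, 2.0, 0.25]      # st2 = 1.5, st1 = 2.0, st0 = 0.25
faddp(regs)                  # only st1 is named; st0 is implied
print(regs)                  # [1.5, 2.25]
```

After the pop, the sum sits in the new st0, which is why chains of x87 instructions rarely need to name more than one register.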
Modern FP code uses 2-operand SSE/SSE2 instructions like addsd xmm0, xmm1, or 3-operand AVX encodings like vaddsd xmm2, xmm0, xmm1.
There are x86 instructions with 0, 1, 2, 3, and even 4 explicit operands.
There are multiple instruction formats, but explicit reg/memory operands are normally encoded in a ModR/M byte that follows the opcode byte(s). It has 3 fields:
- 2-bit Mod (mode) field for the r/m operand: register direct (reg), register indirect ([reg]), [reg+disp8], or [reg+disp32]. The modes with a displacement signal that those displacement bytes follow the ModR/M byte.
- 3-bit r/m field: the register to use for that operand or, for memory addressing modes, an escape code that means there's a Scale/Index/Base (SIB) byte after ModR/M, which can encode scaled-index addressing modes for the r/m operand. See rbp not allowed as SIB base? for the details of the special cases / escape codes.
- 3-bit reg field, always a register.
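To make the three fields concrete, here is a minimal sketch that splits a ModR/M byte. The bit positions (Mod in bits 7..6, reg in bits 5..3, r/m in bits 2..0) are the standard layout rather than something stated above, and the names ModRM, split_modrm, MOD_NAMES and GPR32 are made up for this example. It deliberately ignores the escape codes (SIB, disp32-only) and REX extensions.

```python
from typing import NamedTuple


class ModRM(NamedTuple):
    mod: int  # bits 7..6: addressing mode of the r/m operand
    reg: int  # bits 5..3: register operand (or extra opcode bits)
    rm: int   # bits 2..0: register or base register of the r/m operand


def split_modrm(byte: int) -> ModRM:
    """Split one ModR/M byte into its three fields."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("ModR/M is a single byte")
    return ModRM(mod=byte >> 6, reg=(byte >> 3) & 7, rm=byte & 7)


# General shape of each mode, ignoring the special-case escape codes.
MOD_NAMES = {0: "[reg]", 1: "[reg+disp8]", 2: "[reg+disp32]", 3: "reg"}
GPR32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

m = split_modrm(0xD0)        # ModR/M byte of `01 D0` = add eax, edx
print(m)                     # ModRM(mod=3, reg=2, rm=0)
print(MOD_NAMES[m.mod], GPR32[m.reg], GPR32[m.rm])   # reg edx eax
```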
Most instructions are available in at least 2 encodings: reg/memory destination or reg/memory source. If the operands you want are both registers, you can use either opcode: add r/m32, r32 or add r32, r/m32.
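As an illustration of the two equivalent encodings, the sketch below builds both machine-code forms of a register-to-register add. It assumes the default 32-bit operand size with no prefixes, uses the ADD opcodes 01 /r (add r/m32, r32) and 03 /r (add r32, r/m32), and the helper names are hypothetical.

```python
GPR32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def modrm_reg_direct(reg: int, rm: int) -> int:
    """Build a ModR/M byte with Mod = 11 (both operands are registers)."""
    return 0b11_000_000 | (reg << 3) | rm


def encode_add(dst: str, src: str):
    """Return both encodings of `add dst, src` for 32-bit registers."""
    d, s = GPR32.index(dst), GPR32.index(src)
    # 01 /r: add r/m32, r32 -> destination goes in r/m, source in reg
    rm_dest_form = bytes([0x01, modrm_reg_direct(reg=s, rm=d)])
    # 03 /r: add r32, r/m32 -> destination goes in reg, source in r/m
    reg_dest_form = bytes([0x03, modrm_reg_direct(reg=d, rm=s)])
    return rm_dest_form, reg_dest_form


a, b = encode_add("eax", "edx")
print(a.hex(), b.hex())      # 01d0 03c2
```

Both byte sequences do the same thing; an assembler simply has to pick one.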
Common instructions also have other opcodes for immediate-source forms, but typically these use the reg field in ModR/M as extra opcode bits, so you still only get 2 operands, like add eax, 123. An exception is the immediate form of imul added with 186, e.g. imul eax, [rdi + rbx*4], 12345. Instead of sharing coding space with other immediate instructions, it has a register destination and an r/m source in ModR/M, plus the immediate operand implied by the opcode.
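The "reg field as extra opcode bits" idea can be sketched with opcode 0x83, which is shared by eight ALU instructions taking a sign-extended 8-bit immediate. The table order below is the standard /0../7 assignment for that opcode group; the function name decode_83 is an assumption for this example, and only the register-direct form (Mod = 11) is handled.

```python
# /digit in the reg field selects the operation for opcode 0x83.
GROUP1 = ["add", "or", "adc", "sbb", "and", "sub", "xor", "cmp"]
GPR32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def decode_83(code: bytes) -> str:
    """Decode `83 /digit ib` with a register-direct r/m operand."""
    if len(code) != 3:
        raise ValueError("expected opcode, ModR/M, imm8")
    opcode, modrm, imm8 = code
    if opcode != 0x83 or modrm >> 6 != 0b11:
        raise ValueError("only the register-direct 0x83 form is handled here")
    mnemonic = GROUP1[(modrm >> 3) & 7]       # reg field = extra opcode bits
    dst = GPR32[modrm & 7]                    # r/m field = the one register
    imm = imm8 - 256 if imm8 >= 128 else imm8 # imm8 is sign-extended
    return f"{mnemonic} {dst}, {imm}"


print(decode_83(bytes([0x83, 0xC0, 0x7B])))   # add eax, 123
print(decode_83(bytes([0x83, 0xE8, 0x7B])))   # sub eax, 123
```

Because the reg field is spent on choosing the operation, only the r/m field is left to name an operand, which is why these forms stay at 2 operands.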
Some one-operand instructions use the same trick of using the reg field as extra opcode bits, but without an immediate, e.g. neg r/m32, not r/m32, inc r/m32, or the shl/shr/rotate encodings that shift by an implicit 1 (not by cl or an immediate). With the reg field taken, there is no room for a separate destination register, so unfortunately you can't copy-and-shift in one instruction (until BMI2).
There are some special-case encodings to improve code density, like single-byte encodings for push rax / push rdx that pack the reg field into the low 3 bits of the opcode byte. In 16/32-bit mode there are also one-byte encodings for inc/dec of any register, but in 64-bit mode those 0x4? codes are used as REX prefixes, which extend the reg and r/m fields to provide 16 architectural registers.
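Here is a small sketch of how the short push encoding and a REX prefix combine in 64-bit mode. It assumes the standard push r64 opcodes 0x50..0x57 and that the low bit of the REX prefix (REX.B) supplies the fourth register-number bit for this form; decode_push and GPR64 are names invented for the example.

```python
GPR64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi",
         "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15"]


def decode_push(code: bytes) -> str:
    """Decode the one-byte `push r64` form, with an optional REX prefix."""
    rex_b = 0
    if code and 0x40 <= code[0] <= 0x4F:   # 0x4? is a REX prefix in 64-bit mode
        rex_b = code[0] & 1                # REX.B extends the register number
        code = code[1:]
    if len(code) != 1 or not 0x50 <= code[0] <= 0x57:
        raise ValueError("not a short-form push")
    reg = (rex_b << 3) | (code[0] & 7)     # low 3 opcode bits hold the register
    return "push " + GPR64[reg]


print(decode_push(b"\x50"))        # push rax
print(decode_push(b"\x52"))        # push rdx
print(decode_push(b"\x41\x50"))    # push r8
```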
There are also instructions with some or all operands implicit, like movsb, which copies a byte from [rsi] to [rdi] and can be used with a rep prefix to repeat that rcx times.
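The implicit operands of rep movsb can be made visible by modelling them as ordinary function arguments. This is a rough sketch that assumes the direction flag is clear (so both pointers increment) and treats memory as a flat bytearray; it ignores segments, faults, and the fast-string microcode real CPUs use.

```python
def rep_movsb(mem: bytearray, rsi: int, rdi: int, rcx: int):
    """Toy model of `rep movsb` with DF = 0.

    Returns the final (rsi, rdi, rcx), since all three registers are
    implicitly read and written by the instruction.
    """
    while rcx != 0:
        mem[rdi] = mem[rsi]   # movsb: copy one byte from [rsi] to [rdi]
        rsi += 1              # both pointers advance by 1 (DF = 0)
        rdi += 1
        rcx -= 1              # rep: count down rcx until it reaches 0
    return rsi, rdi, rcx


mem = bytearray(b"hello.....")
print(rep_movsb(mem, rsi=0, rdi=5, rcx=5))   # (5, 10, 0)
print(mem)                                   # bytearray(b'hellohello')
```

None of rsi, rdi, or rcx appears in the instruction's encoding; the opcode alone implies them.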
Or mul ecx, which does edx:eax = eax * ecx: one explicit source operand, one implicit source, and 2 implicit destination registers. div/idiv are similar.
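The widening multiply is easy to check with a short model. This sketch covers only the unsigned 32-bit mul and leaves out the flag results; the function name mul32 is made up for the example.

```python
MASK32 = 0xFFFFFFFF


def mul32(eax: int, src: int):
    """Toy model of `mul r/m32`: edx:eax = eax * src (unsigned).

    `src` is the one explicit operand; eax is the implicit source, and
    the 64-bit product is split across the implicit destinations.
    """
    product = (eax & MASK32) * (src & MASK32)
    edx = product >> 32        # high half
    new_eax = product & MASK32 # low half
    return edx, new_eax


edx, eax = mul32(eax=0xFFFFFFFF, src=2)   # like `mul ecx` with ecx = 2
print(hex(edx), hex(eax))                 # 0x1 0xfffffffe
```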
Instructions with at least 1 explicit reg/mem operand use a ModR/M encoding for it, but instructions with zero explicit operands (like movsb or cdq) have no ModR/M byte; they just have the opcode. Some instructions have no operands at all, not even implicit ones, like mfence.
Immediate operands can't be signalled through ModR/M, only by the opcode itself, so push imm32 and push imm8 have their own opcodes. The destinations are implicit: memory at [rsp], and RSP itself, which is updated with rsp -= 8.
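A minimal sketch of those two implicit destinations in 64-bit mode follows. It assumes the immediate is sign-extended to 64 bits before being stored, models the stack as a little-endian bytearray, and uses the invented name push_imm.

```python
import struct


def push_imm(mem: bytearray, rsp: int, imm: int) -> int:
    """Toy model of `push imm8` / `push imm32` in 64-bit mode.

    Writes both implicit destinations: RSP (decremented by 8) and the
    8 bytes of memory at the new [rsp]. Returns the new RSP.
    """
    if not -2**31 <= imm < 2**31:
        raise ValueError("immediate must fit in a signed 32-bit value")
    rsp -= 8                                    # implicit destination 1: RSP
    mem[rsp:rsp + 8] = struct.pack("<q", imm)   # implicit destination 2: [rsp]
    return rsp


stack = bytearray(32)
rsp = push_imm(stack, rsp=32, imm=-2)
print(rsp)                      # 24
print(stack[rsp:rsp + 8].hex()) # feffffffffffffff
```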
LEA is a workaround that gives x86 3-operand shift-and-add, like lea eax, [rdi + rdi*2 + 123] to do eax = rdi*3 + 123 in one instruction. See Using LEA on values that aren't addresses / pointers? The destination register is encoded in ModR/M's reg field, and the two source registers are encoded in the addressing mode. (This involves a SIB byte, whose presence is signalled by the ModR/M byte using the encoding that would otherwise mean base = RSP.)
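The arithmetic LEA performs is just the address calculation with no memory access, which a sketch can show directly. The function name lea32 is hypothetical, and the result is truncated to 32 bits because the destination in the example is eax.

```python
def lea32(base: int, index: int, scale: int, disp: int) -> int:
    """Toy model of `lea r32, [base + index*scale + disp]`.

    Computes the effective address and truncates it to the 32-bit
    destination; nothing is loaded from memory.
    """
    if scale not in (1, 2, 4, 8):
        raise ValueError("the SIB scale factor must be 1, 2, 4, or 8")
    return (base + index * scale + disp) & 0xFFFFFFFF


rdi = 10
eax = lea32(base=rdi, index=rdi, scale=2, disp=123)  # lea eax, [rdi + rdi*2 + 123]
print(eax)                                           # 153
```

Using the same register as both base and index is what turns a scale of 2 into a multiply by 3.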
VEX prefixes (introduced with AVX) provide 3-operand instructions like bzhi eax, [rsi], edx or vaddps ymm0, ymm1, [rsi]. (For many instructions, the 2nd source is the one that's optionally memory, but for some it's the first source.) The 3rd operand is encoded in the 2- or 3-byte VEX prefix.
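As a sketch of where that extra operand lives, the code below unpacks the payload byte of a 2-byte VEX prefix (the byte after 0xC5). The field layout used here (inverted R bit, inverted 4-bit vvvv register number, L, pp) is the standard one rather than something spelled out above, and decode_vex2 is an invented name. The 3-byte form is not handled.

```python
def decode_vex2(payload: int) -> dict:
    """Unpack the second byte of a 2-byte VEX prefix (after 0xC5)."""
    if not 0 <= payload <= 0xFF:
        raise ValueError("payload is a single byte")
    return {
        "R": ((payload >> 7) & 1) ^ 1,      # stored inverted; extends ModR/M.reg
        "vvvv": ~(payload >> 3) & 0xF,      # stored inverted; the extra register
        "L": (payload >> 2) & 1,            # 0 = 128-bit, 1 = 256-bit vectors
        "pp": payload & 3,                  # implied legacy prefix
    }


# vaddps ymm0, ymm1, [rsi] assembles to C5 F4 58 06
fields = decode_vex2(0xF4)
print(fields)   # {'R': 0, 'vvvv': 1, 'L': 1, 'pp': 0}
```

Here vvvv = 1 names ymm1, the operand that does not fit in the ModR/M byte, and L = 1 selects the 256-bit ymm width.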
There are a few 3-operand non-VEX instructions, such as SSE4.1 variable blends like pblendvb xmm1, xmm2/m128, <XMM0>, where XMM0 is an implicit operand.
The AVX version makes it non-destructive (with a separate destination encoded in the VEX prefix) and makes the blend-control operand explicit (encoded in the high 4 bits of a 1-byte immediate). This gives us an instruction with 4 explicit operands: VPBLENDVB xmm1, xmm2, xmm3/m128, xmm4.
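Packing a register number into the high 4 bits of an immediate byte is simple to sketch. The helper names below are made up, and the low 4 bits are simply left as zero in this model.

```python
def encode_reg_in_imm8(xmm_index: int) -> int:
    """Place a 4-bit register number in the high nibble of an imm8."""
    if not 0 <= xmm_index <= 15:
        raise ValueError("register number must fit in 4 bits")
    return xmm_index << 4


def decode_reg_from_imm8(imm8: int) -> int:
    """Recover the register number from the high nibble of an imm8."""
    if not 0 <= imm8 <= 0xFF:
        raise ValueError("imm8 is a single byte")
    return imm8 >> 4


imm8 = encode_reg_in_imm8(4)                 # 4th operand xmm4
print(hex(imm8), decode_reg_from_imm8(imm8)) # 0x40 4
```

This is how a fourth register fits alongside the ModR/M reg field, the r/m field, and the VEX prefix field, each of which already holds one operand.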
x86 is pretty wild and has been extended many times, but typical integer code uses mostly 2-operand instructions, with a good amount of LEA thrown in to save instructions.
answered Nov 15 '18 at 18:40 by Peter Cordes