What kind of address instruction does the x86 cpu have?












2















I learned about one address, two address, and three address instruction, but now I'd like to know, what kind of address instruction does x86 use?










share|improve this question

























  • By "address", do you mean "operand"?

    – Sneftel
    Nov 15 '18 at 17:56











  • @Sneftel: yes, in abstract ISA-classification terminology, it means operand. like the 5-bit register fields in a MIPS instruction word are "addresses". (I don't know if geeksforgeeks.org/… is any good, but that's the terminology they use)

    – Peter Cordes
    Nov 15 '18 at 17:58
















2















I learned about one address, two address, and three address instruction, but now I'd like to know, what kind of address instruction does x86 use?










share|improve this question

























  • By "address", do you mean "operand"?

    – Sneftel
    Nov 15 '18 at 17:56











  • @Sneftel: yes, in abstract ISA-classification terminology, it means operand. like the 5-bit register fields in a MIPS instruction word are "addresses". (I don't know if geeksforgeeks.org/… is any good, but that's the terminology they use)

    – Peter Cordes
    Nov 15 '18 at 17:58














2












2








2


1






I learned about one address, two address, and three address instruction, but now I'd like to know, what kind of address instruction does x86 use?










share|improve this question
















I learned about one address, two address, and three address instruction, but now I'd like to know, what kind of address instruction does x86 use?







x86 cpu computer-science cpu-architecture instruction-set






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited Nov 16 '18 at 2:41









Peter Cordes

133k18202339




133k18202339










asked Nov 15 '18 at 17:51









Jolt151Jolt151

155




155













  • By "address", do you mean "operand"?

    – Sneftel
    Nov 15 '18 at 17:56











  • @Sneftel: yes, in abstract ISA-classification terminology, it means operand. like the 5-bit register fields in a MIPS instruction word are "addresses". (I don't know if geeksforgeeks.org/… is any good, but that's the terminology they use)

    – Peter Cordes
    Nov 15 '18 at 17:58



















  • By "address", do you mean "operand"?

    – Sneftel
    Nov 15 '18 at 17:56











  • @Sneftel: yes, in abstract ISA-classification terminology, it means operand. like the 5-bit register fields in a MIPS instruction word are "addresses". (I don't know if geeksforgeeks.org/… is any good, but that's the terminology they use)

    – Peter Cordes
    Nov 15 '18 at 17:58

















By "address", do you mean "operand"?

– Sneftel
Nov 15 '18 at 17:56





By "address", do you mean "operand"?

– Sneftel
Nov 15 '18 at 17:56













@Sneftel: yes, in abstract ISA-classification terminology, it means operand. like the 5-bit register fields in a MIPS instruction word are "addresses". (I don't know if geeksforgeeks.org/… is any good, but that's the terminology they use)

– Peter Cordes
Nov 15 '18 at 17:58





@Sneftel: yes, in abstract ISA-classification terminology, it means operand. like the 5-bit register fields in a MIPS instruction word are "addresses". (I don't know if geeksforgeeks.org/… is any good, but that's the terminology they use)

– Peter Cordes
Nov 15 '18 at 17:58












1 Answer
1






active

oldest

votes


















5














x86 is a register machine, where at most 1 operand for any instruction can be an explicit memory address instead of a register, using an addressing mode like [rdi + rax*4]. (There are instruction which can have 2 memory operands with one or both being implicit, though: What x86 instructions take two (or more) memory operands?)



Typical x86 integer instructions have 2 operands, both explicit, like add eax, edx which does eax+=edx.



Legacy x87 FP code uses 1-operand instructions, with the x87 stack, like faddp st1 where the top of the x87 stack (st0) is an implicit operand. SSE2 is baseline for x86-64, so it's no longer widely used.



Modern FP code uses SSE/SSE2 2-operand instructions like addsd xmm0,xmm1 or 3-operand AVX encodings like vaddsd xmm2, xmm0, xmm1



There are x86 instructions with 0, 1, 2, 3, and even 4 explicit operands.



There are multiple instruction formats, but explicit reg/memory operands are normally encoded in a ModR/M byte that follows the opcode byte(s). It has 3 fields:




  • 2-bit Mode for the r/m operand (register direct reg, register indirect [reg], [reg+disp8], [reg+disp32]). The modes with displacement bits signal that those bytes follow the ModR/M byte.

  • 3-bit r/m field (the register to use for that operand, or for memory addressing modes, an escape code that means there's a Scale/Index/Base byte after ModRM which can encode scaled-index addressing modes for the r/m operand). See rbp not allowed as SIB base? for the details of the special cases / escape codes.

  • 3-bit reg field, always a register.


Most instructions are available in at least 2 encodings, reg/memory destination or reg/memory source. If the operands you want are both registers, you can use either opcode, either the add r/m32, r32 or add r32, r/m32.



Common instructions also have other opcodes for immediate source forms, but typically they use the reg field in ModR/M as extra opcode bits, so you still only get 2 operands like add eax, 123. An exception to this is the immediate form of imul added with 286, e.g. imul eax, [rdi + rbx*4], 12345. Instead of sharing coding space with other immediate instructions, it has a register dst and a r/m source in ModR/M plus the immediate operand implied by the opcode.



Some one-operand instructions use the same trick of using the reg field as extra opcode bits, but without an immediate. e.g. neg r/m32, not r/m32, inc r/m32, or the shl/shr/rotate encodings that shift by an implicit 1 (not by cl or an immediate). So unfortunately you can't copy-and-shift (until BMI2).



There are some special-case encodings to improve code density, like single-byte encodings for push rax/push rdx that pack the reg field into the low 3 bits of the opcode byte. And in 16/32-bit mode, one-byte encodings for inc/dec any register. But in 64-bit mode those 0x4? codes are used as REX prefixes to extend the reg and r/m fields to provide 16 architectural registers.





There are also instructions with some or all implicit operands, like movsb which copies a byte from [rsi] to [rdi], and can be used with a rep prefix to repeat that rcx times.



Or mul ecx does edx:eax = eax * ecx. One explicit source operand, one implicit source, and 2 implicit destination registers. div/idiv are similar.



Instructions with at least 1 explicit reg/mem operand use a ModR/M encoding for it, but instructions with zero explicit operands (like movsb or cdq) have no ModR/M byte. They just have the opcode. Some instructions have no operands at all, not even implicit, like mfence.



Immediate operands can't be signalled through ModR/M, only by the opcode itself, so push imm32 or push imm8 have their own opcodes. The implicit destinations (memory at [rsp], and RSP itself being updated to rsp-=8).





LEA is a workaround that gives x86 3-operand shift-and-add, like lea eax, [rdi + rdi*2 + 123] to do eax = rdi*3 + 123 in one instruction. See Using LEA on values that aren't addresses / pointers? The destination register is encoded in ModR/M's reg field, and the two source registers are encoded in the addressing mode. (Involving a SIB byte, the presence of which is signalled by the ModR/M byte using the encoding that would otherwise mean base = RSP).





VEX prefixes (introduced with AVX) provide 3-operand instructions like bzhi eax, [rsi], edx or vaddps ymm0, ymm1, [rsi]. (For many instructions, the 2nd source is the one that's optionally memory, but for some it's the first source.)



The 3rd operand is encoded in the 2 or 3-byte VEX prefix.





There are a few 3-operand non-VEX instructions, such as SSE4.1 variable blends like vpblendvb xmm1, xmm2/m128, <XMM0> where XMM0 is an implicit operand using that register.



The AVX version makes it non-destructive (with a separate destination encoded in the VEX prefix), and makes the blend-control operand explicit (encoded in the high 4 bits of a 1-byte immediate). This gives us an instruction with 4 explicit operands, VPBLENDVB xmm1, xmm2, xmm3/m128, xmm4.





x86 is pretty wild and has been extended many times, but typical integer code uses mostly 2-operand instructions, with a good amount of LEA thrown in to save instructions.






share|improve this answer























    Your Answer






    StackExchange.ifUsing("editor", function () {
    StackExchange.using("externalEditor", function () {
    StackExchange.using("snippets", function () {
    StackExchange.snippets.init();
    });
    });
    }, "code-snippets");

    StackExchange.ready(function() {
    var channelOptions = {
    tags: "".split(" "),
    id: "1"
    };
    initTagRenderer("".split(" "), "".split(" "), channelOptions);

    StackExchange.using("externalEditor", function() {
    // Have to fire editor after snippets, if snippets enabled
    if (StackExchange.settings.snippets.snippetsEnabled) {
    StackExchange.using("snippets", function() {
    createEditor();
    });
    }
    else {
    createEditor();
    }
    });

    function createEditor() {
    StackExchange.prepareEditor({
    heartbeatType: 'answer',
    autoActivateHeartbeat: false,
    convertImagesToLinks: true,
    noModals: true,
    showLowRepImageUploadWarning: true,
    reputationToPostImages: 10,
    bindNavPrevention: true,
    postfix: "",
    imageUploader: {
    brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
    contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
    allowUrls: true
    },
    onDemand: true,
    discardSelector: ".discard-answer"
    ,immediatelyShowMarkdownHelp:true
    });


    }
    });














    draft saved

    draft discarded


















    StackExchange.ready(
    function () {
    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53325275%2fwhat-kind-of-address-instruction-does-the-x86-cpu-have%23new-answer', 'question_page');
    }
    );

    Post as a guest















    Required, but never shown

























    1 Answer
    1






    active

    oldest

    votes








    1 Answer
    1






    active

    oldest

    votes









    active

    oldest

    votes






    active

    oldest

    votes









    5














    x86 is a register machine, where at most 1 operand for any instruction can be an explicit memory address instead of a register, using an addressing mode like [rdi + rax*4]. (There are instruction which can have 2 memory operands with one or both being implicit, though: What x86 instructions take two (or more) memory operands?)



    Typical x86 integer instructions have 2 operands, both explicit, like add eax, edx which does eax+=edx.



    Legacy x87 FP code uses 1-operand instructions, with the x87 stack, like faddp st1 where the top of the x87 stack (st0) is an implicit operand. SSE2 is baseline for x86-64, so it's no longer widely used.



    Modern FP code uses SSE/SSE2 2-operand instructions like addsd xmm0,xmm1 or 3-operand AVX encodings like vaddsd xmm2, xmm0, xmm1



    There are x86 instructions with 0, 1, 2, 3, and even 4 explicit operands.



    There are multiple instruction formats, but explicit reg/memory operands are normally encoded in a ModR/M byte that follows the opcode byte(s). It has 3 fields:




    • 2-bit Mode for the r/m operand (register direct reg, register indirect [reg], [reg+disp8], [reg+disp32]). The modes with displacement bits signal that those bytes follow the ModR/M byte.

    • 3-bit r/m field (the register to use for that operand, or for memory addressing modes, an escape code that means there's a Scale/Index/Base byte after ModRM which can encode scaled-index addressing modes for the r/m operand). See rbp not allowed as SIB base? for the details of the special cases / escape codes.

    • 3-bit reg field, always a register.


    Most instructions are available in at least 2 encodings, reg/memory destination or reg/memory source. If the operands you want are both registers, you can use either opcode, either the add r/m32, r32 or add r32, r/m32.



    Common instructions also have other opcodes for immediate source forms, but typically they use the reg field in ModR/M as extra opcode bits, so you still only get 2 operands like add eax, 123. An exception to this is the immediate form of imul added with 286, e.g. imul eax, [rdi + rbx*4], 12345. Instead of sharing coding space with other immediate instructions, it has a register dst and a r/m source in ModR/M plus the immediate operand implied by the opcode.



    Some one-operand instructions use the same trick of using the reg field as extra opcode bits, but without an immediate. e.g. neg r/m32, not r/m32, inc r/m32, or the shl/shr/rotate encodings that shift by an implicit 1 (not by cl or an immediate). So unfortunately you can't copy-and-shift (until BMI2).



    There are some special-case encodings to improve code density, like single-byte encodings for push rax/push rdx that pack the reg field into the low 3 bits of the opcode byte. And in 16/32-bit mode, one-byte encodings for inc/dec any register. But in 64-bit mode those 0x4? codes are used as REX prefixes to extend the reg and r/m fields to provide 16 architectural registers.





    There are also instructions with some or all implicit operands, like movsb which copies a byte from [rsi] to [rdi], and can be used with a rep prefix to repeat that rcx times.



    Or mul ecx does edx:eax = eax * ecx. One explicit source operand, one implicit source, and 2 implicit destination registers. div/idiv are similar.



    Instructions with at least 1 explicit reg/mem operand use a ModR/M encoding for it, but instructions with zero explicit operands (like movsb or cdq) have no ModR/M byte. They just have the opcode. Some instructions have no operands at all, not even implicit, like mfence.



    Immediate operands can't be signalled through ModR/M, only by the opcode itself, so push imm32 or push imm8 have their own opcodes. The implicit destinations (memory at [rsp], and RSP itself being updated to rsp-=8).





    LEA is a workaround that gives x86 3-operand shift-and-add, like lea eax, [rdi + rdi*2 + 123] to do eax = rdi*3 + 123 in one instruction. See Using LEA on values that aren't addresses / pointers? The destination register is encoded in ModR/M's reg field, and the two source registers are encoded in the addressing mode. (Involving a SIB byte, the presence of which is signalled by the ModR/M byte using the encoding that would otherwise mean base = RSP).





    VEX prefixes (introduced with AVX) provide 3-operand instructions like bzhi eax, [rsi], edx or vaddps ymm0, ymm1, [rsi]. (For many instructions, the 2nd source is the one that's optionally memory, but for some it's the first source.)



    The 3rd operand is encoded in the 2 or 3-byte VEX prefix.





    There are a few 3-operand non-VEX instructions, such as SSE4.1 variable blends like vpblendvb xmm1, xmm2/m128, <XMM0> where XMM0 is an implicit operand using that register.



    The AVX version makes it non-destructive (with a separate destination encoded in the VEX prefix), and makes the blend-control operand explicit (encoded in the high 4 bits of a 1-byte immediate). This gives us an instruction with 4 explicit operands, VPBLENDVB xmm1, xmm2, xmm3/m128, xmm4.





    x86 is pretty wild and has been extended many times, but typical integer code uses mostly 2-operand instructions, with a good amount of LEA thrown in to save instructions.






    share|improve this answer




























      5














      x86 is a register machine, where at most 1 operand for any instruction can be an explicit memory address instead of a register, using an addressing mode like [rdi + rax*4]. (There are instruction which can have 2 memory operands with one or both being implicit, though: What x86 instructions take two (or more) memory operands?)



      Typical x86 integer instructions have 2 operands, both explicit, like add eax, edx which does eax+=edx.



      Legacy x87 FP code uses 1-operand instructions, with the x87 stack, like faddp st1 where the top of the x87 stack (st0) is an implicit operand. SSE2 is baseline for x86-64, so it's no longer widely used.



      Modern FP code uses SSE/SSE2 2-operand instructions like addsd xmm0,xmm1 or 3-operand AVX encodings like vaddsd xmm2, xmm0, xmm1



      There are x86 instructions with 0, 1, 2, 3, and even 4 explicit operands.



      There are multiple instruction formats, but explicit reg/memory operands are normally encoded in a ModR/M byte that follows the opcode byte(s). It has 3 fields:




      • 2-bit Mode for the r/m operand (register direct reg, register indirect [reg], [reg+disp8], [reg+disp32]). The modes with displacement bits signal that those bytes follow the ModR/M byte.

      • 3-bit r/m field (the register to use for that operand, or for memory addressing modes, an escape code that means there's a Scale/Index/Base byte after ModRM which can encode scaled-index addressing modes for the r/m operand). See rbp not allowed as SIB base? for the details of the special cases / escape codes.

      • 3-bit reg field, always a register.


      Most instructions are available in at least 2 encodings, reg/memory destination or reg/memory source. If the operands you want are both registers, you can use either opcode, either the add r/m32, r32 or add r32, r/m32.



      Common instructions also have other opcodes for immediate source forms, but typically they use the reg field in ModR/M as extra opcode bits, so you still only get 2 operands like add eax, 123. An exception to this is the immediate form of imul added with 286, e.g. imul eax, [rdi + rbx*4], 12345. Instead of sharing coding space with other immediate instructions, it has a register dst and a r/m source in ModR/M plus the immediate operand implied by the opcode.



      Some one-operand instructions use the same trick of using the reg field as extra opcode bits, but without an immediate. e.g. neg r/m32, not r/m32, inc r/m32, or the shl/shr/rotate encodings that shift by an implicit 1 (not by cl or an immediate). So unfortunately you can't copy-and-shift (until BMI2).



      There are some special-case encodings to improve code density, like single-byte encodings for push rax/push rdx that pack the reg field into the low 3 bits of the opcode byte. And in 16/32-bit mode, one-byte encodings for inc/dec any register. But in 64-bit mode those 0x4? codes are used as REX prefixes to extend the reg and r/m fields to provide 16 architectural registers.





      There are also instructions with some or all implicit operands, like movsb which copies a byte from [rsi] to [rdi], and can be used with a rep prefix to repeat that rcx times.



      Or mul ecx does edx:eax = eax * ecx. One explicit source operand, one implicit source, and 2 implicit destination registers. div/idiv are similar.



      Instructions with at least 1 explicit reg/mem operand use a ModR/M encoding for it, but instructions with zero explicit operands (like movsb or cdq) have no ModR/M byte. They just have the opcode. Some instructions have no operands at all, not even implicit, like mfence.



      Immediate operands can't be signalled through ModR/M, only by the opcode itself, so push imm32 or push imm8 have their own opcodes. The implicit destinations (memory at [rsp], and RSP itself being updated to rsp-=8).





      LEA is a workaround that gives x86 3-operand shift-and-add, like lea eax, [rdi + rdi*2 + 123] to do eax = rdi*3 + 123 in one instruction. See Using LEA on values that aren't addresses / pointers? The destination register is encoded in ModR/M's reg field, and the two source registers are encoded in the addressing mode. (Involving a SIB byte, the presence of which is signalled by the ModR/M byte using the encoding that would otherwise mean base = RSP).





      VEX prefixes (introduced with AVX) provide 3-operand instructions like bzhi eax, [rsi], edx or vaddps ymm0, ymm1, [rsi]. (For many instructions, the 2nd source is the one that's optionally memory, but for some it's the first source.)



      The 3rd operand is encoded in the 2 or 3-byte VEX prefix.





      There are a few 3-operand non-VEX instructions, such as SSE4.1 variable blends like vpblendvb xmm1, xmm2/m128, <XMM0> where XMM0 is an implicit operand using that register.



      The AVX version makes it non-destructive (with a separate destination encoded in the VEX prefix), and makes the blend-control operand explicit (encoded in the high 4 bits of a 1-byte immediate). This gives us an instruction with 4 explicit operands, VPBLENDVB xmm1, xmm2, xmm3/m128, xmm4.





      x86 is pretty wild and has been extended many times, but typical integer code uses mostly 2-operand instructions, with a good amount of LEA thrown in to save instructions.






      share|improve this answer


























        5












        5








        5







        x86 is a register machine, where at most 1 operand for any instruction can be an explicit memory address instead of a register, using an addressing mode like [rdi + rax*4]. (There are instruction which can have 2 memory operands with one or both being implicit, though: What x86 instructions take two (or more) memory operands?)



        Typical x86 integer instructions have 2 operands, both explicit, like add eax, edx which does eax+=edx.



        Legacy x87 FP code uses 1-operand instructions, with the x87 stack, like faddp st1 where the top of the x87 stack (st0) is an implicit operand. SSE2 is baseline for x86-64, so it's no longer widely used.



        Modern FP code uses SSE/SSE2 2-operand instructions like addsd xmm0,xmm1 or 3-operand AVX encodings like vaddsd xmm2, xmm0, xmm1



        There are x86 instructions with 0, 1, 2, 3, and even 4 explicit operands.



        There are multiple instruction formats, but explicit reg/memory operands are normally encoded in a ModR/M byte that follows the opcode byte(s). It has 3 fields:




        • 2-bit Mode for the r/m operand (register direct reg, register indirect [reg], [reg+disp8], [reg+disp32]). The modes with displacement bits signal that those bytes follow the ModR/M byte.

        • 3-bit r/m field (the register to use for that operand, or for memory addressing modes, an escape code that means there's a Scale/Index/Base byte after ModRM which can encode scaled-index addressing modes for the r/m operand). See rbp not allowed as SIB base? for the details of the special cases / escape codes.

        • 3-bit reg field, always a register.


        Most instructions are available in at least 2 encodings, reg/memory destination or reg/memory source. If the operands you want are both registers, you can use either opcode, either the add r/m32, r32 or add r32, r/m32.



        Common instructions also have other opcodes for immediate source forms, but typically they use the reg field in ModR/M as extra opcode bits, so you still only get 2 operands like add eax, 123. An exception to this is the immediate form of imul added with 286, e.g. imul eax, [rdi + rbx*4], 12345. Instead of sharing coding space with other immediate instructions, it has a register dst and a r/m source in ModR/M plus the immediate operand implied by the opcode.



        Some one-operand instructions use the same trick of using the reg field as extra opcode bits, but without an immediate. e.g. neg r/m32, not r/m32, inc r/m32, or the shl/shr/rotate encodings that shift by an implicit 1 (not by cl or an immediate). So unfortunately you can't copy-and-shift (until BMI2).



        There are some special-case encodings to improve code density, like single-byte encodings for push rax/push rdx that pack the reg field into the low 3 bits of the opcode byte. And in 16/32-bit mode, one-byte encodings for inc/dec any register. But in 64-bit mode those 0x4? codes are used as REX prefixes to extend the reg and r/m fields to provide 16 architectural registers.





        There are also instructions with some or all implicit operands, like movsb which copies a byte from [rsi] to [rdi], and can be used with a rep prefix to repeat that rcx times.



        Or mul ecx does edx:eax = eax * ecx. One explicit source operand, one implicit source, and 2 implicit destination registers. div/idiv are similar.



        Instructions with at least 1 explicit reg/mem operand use a ModR/M encoding for it, but instructions with zero explicit operands (like movsb or cdq) have no ModR/M byte. They just have the opcode. Some instructions have no operands at all, not even implicit, like mfence.



        Immediate operands can't be signalled through ModR/M, only by the opcode itself, so push imm32 or push imm8 have their own opcodes. The implicit destinations (memory at [rsp], and RSP itself being updated to rsp-=8).





        LEA is a workaround that gives x86 3-operand shift-and-add, like lea eax, [rdi + rdi*2 + 123] to do eax = rdi*3 + 123 in one instruction. See Using LEA on values that aren't addresses / pointers? The destination register is encoded in ModR/M's reg field, and the two source registers are encoded in the addressing mode. (Involving a SIB byte, the presence of which is signalled by the ModR/M byte using the encoding that would otherwise mean base = RSP).





        VEX prefixes (introduced with AVX) provide 3-operand instructions like bzhi eax, [rsi], edx or vaddps ymm0, ymm1, [rsi]. (For many instructions, the 2nd source is the one that's optionally memory, but for some it's the first source.)



        The 3rd operand is encoded in the 2 or 3-byte VEX prefix.





        There are a few 3-operand non-VEX instructions, such as SSE4.1 variable blends like vpblendvb xmm1, xmm2/m128, <XMM0> where XMM0 is an implicit operand using that register.



        The AVX version makes it non-destructive (with a separate destination encoded in the VEX prefix), and makes the blend-control operand explicit (encoded in the high 4 bits of a 1-byte immediate). This gives us an instruction with 4 explicit operands, VPBLENDVB xmm1, xmm2, xmm3/m128, xmm4.





        x86 is pretty wild and has been extended many times, but typical integer code uses mostly 2-operand instructions, with a good amount of LEA thrown in to save instructions.






        share|improve this answer













        x86 is a register machine, where at most 1 operand for any instruction can be an explicit memory address instead of a register, using an addressing mode like [rdi + rax*4]. (There are instruction which can have 2 memory operands with one or both being implicit, though: What x86 instructions take two (or more) memory operands?)



        Typical x86 integer instructions have 2 operands, both explicit, like add eax, edx which does eax+=edx.



        Legacy x87 FP code uses 1-operand instructions, with the x87 stack, like faddp st1 where the top of the x87 stack (st0) is an implicit operand. SSE2 is baseline for x86-64, so it's no longer widely used.



        Modern FP code uses SSE/SSE2 2-operand instructions like addsd xmm0,xmm1 or 3-operand AVX encodings like vaddsd xmm2, xmm0, xmm1



        There are x86 instructions with 0, 1, 2, 3, and even 4 explicit operands.



        There are multiple instruction formats, but explicit reg/memory operands are normally encoded in a ModR/M byte that follows the opcode byte(s). It has 3 fields:




        • 2-bit Mode for the r/m operand (register direct reg, register indirect [reg], [reg+disp8], [reg+disp32]). The modes with displacement bits signal that those bytes follow the ModR/M byte.

        • 3-bit r/m field (the register to use for that operand, or for memory addressing modes, an escape code that means there's a Scale/Index/Base byte after ModRM which can encode scaled-index addressing modes for the r/m operand). See rbp not allowed as SIB base? for the details of the special cases / escape codes.

        • 3-bit reg field, always a register.


        Most instructions are available in at least 2 encodings, reg/memory destination or reg/memory source. If the operands you want are both registers, you can use either opcode, either the add r/m32, r32 or add r32, r/m32.



        Common instructions also have other opcodes for immediate source forms, but typically they use the reg field in ModR/M as extra opcode bits, so you still only get 2 operands like add eax, 123. An exception to this is the immediate form of imul added with 286, e.g. imul eax, [rdi + rbx*4], 12345. Instead of sharing coding space with other immediate instructions, it has a register dst and a r/m source in ModR/M plus the immediate operand implied by the opcode.



        Some one-operand instructions use the same trick of using the reg field as extra opcode bits, but without an immediate. e.g. neg r/m32, not r/m32, inc r/m32, or the shl/shr/rotate encodings that shift by an implicit 1 (not by cl or an immediate). So unfortunately you can't copy-and-shift (until BMI2).



        There are some special-case encodings to improve code density, like single-byte encodings for push rax/push rdx that pack the reg field into the low 3 bits of the opcode byte. And in 16/32-bit mode, one-byte encodings for inc/dec any register. But in 64-bit mode those 0x4? codes are used as REX prefixes to extend the reg and r/m fields to provide 16 architectural registers.





        There are also instructions with some or all implicit operands, like movsb which copies a byte from [rsi] to [rdi], and can be used with a rep prefix to repeat that rcx times.



        Or mul ecx does edx:eax = eax * ecx. One explicit source operand, one implicit source, and 2 implicit destination registers. div/idiv are similar.



        Instructions with at least 1 explicit reg/mem operand use a ModR/M encoding for it, but instructions with zero explicit operands (like movsb or cdq) have no ModR/M byte. They just have the opcode. Some instructions have no operands at all, not even implicit, like mfence.



        Immediate operands can't be signalled through ModR/M, only by the opcode itself, so push imm32 or push imm8 have their own opcodes. The implicit destinations (memory at [rsp], and RSP itself being updated to rsp-=8).





        LEA is a workaround that gives x86 3-operand shift-and-add, like lea eax, [rdi + rdi*2 + 123] to do eax = rdi*3 + 123 in one instruction. See Using LEA on values that aren't addresses / pointers? The destination register is encoded in ModR/M's reg field, and the two source registers are encoded in the addressing mode. (Involving a SIB byte, the presence of which is signalled by the ModR/M byte using the encoding that would otherwise mean base = RSP).





        VEX prefixes (introduced with AVX) provide 3-operand instructions like bzhi eax, [rsi], edx or vaddps ymm0, ymm1, [rsi]. (For many instructions, the 2nd source is the one that's optionally memory, but for some it's the first source.)



        The 3rd operand is encoded in the 2 or 3-byte VEX prefix.





        There are a few 3-operand non-VEX instructions, such as SSE4.1 variable blends like vpblendvb xmm1, xmm2/m128, <XMM0> where XMM0 is an implicit operand using that register.



        The AVX version makes it non-destructive (with a separate destination encoded in the VEX prefix), and makes the blend-control operand explicit (encoded in the high 4 bits of a 1-byte immediate). This gives us an instruction with 4 explicit operands, VPBLENDVB xmm1, xmm2, xmm3/m128, xmm4.





        x86 is pretty wild and has been extended many times, but typical integer code uses mostly 2-operand instructions, with a good amount of LEA thrown in to save instructions.







        share|improve this answer












        share|improve this answer



        share|improve this answer










        answered Nov 15 '18 at 18:40









        Peter CordesPeter Cordes

        133k18202339




        133k18202339
































            draft saved

            draft discarded




















































            Thanks for contributing an answer to Stack Overflow!


            • Please be sure to answer the question. Provide details and share your research!

            But avoid



            • Asking for help, clarification, or responding to other answers.

            • Making statements based on opinion; back them up with references or personal experience.


            To learn more, see our tips on writing great answers.




            draft saved


            draft discarded














            StackExchange.ready(
            function () {
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53325275%2fwhat-kind-of-address-instruction-does-the-x86-cpu-have%23new-answer', 'question_page');
            }
            );

            Post as a guest















            Required, but never shown





















































            Required, but never shown














            Required, but never shown












            Required, but never shown







            Required, but never shown

































            Required, but never shown














            Required, but never shown












            Required, but never shown







            Required, but never shown







            Popular posts from this blog

            Florida Star v. B. J. F.

            Error while running script in elastic search , gateway timeout

            Adding quotations to stringified JSON object values