Updated x86 docenizer to read adjacent tables (#3541)gh-2441

* Updated x86 docenizer to read adjacent tables * Added regression test
author: Jeremy Rifkin <51220084+jeremy-rifkin@users.noreply.github.com> 2022-04-23 16:04:09 -0400
committer: GitHub <noreply@github.com> 2022-04-23 22:04:09 +0200
commit: 16b071b9f0c97faa82149e983dd708f0ecfcb8ef (patch)
tree: 71b336739d85d25b8d7720e919af668d11b4e084 /lib/asm-docs/generated
parent: ac8a45ce8a4fe786483e4971fe05b94169f875cd (diff)
download: compiler-explorer-16b071b9f0c97faa82149e983dd708f0ecfcb8ef.tar.gz
compiler-explorer-16b071b9f0c97faa82149e983dd708f0ecfcb8ef.zip
1 files changed, 385 insertions, 382 deletions
diff --git a/lib/asm-docs/generated/asm-docs-amd64.js b/lib/asm-docs/generated/asm-docs-amd64.js
index 07ab8eef8..2ca3545f3 100644
--- a/lib/asm-docs/generated/asm-docs-amd64.js
+++ b/lib/asm-docs/generated/asm-docs-amd64.js
@@ -59,8 +59,8 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/ADDPD.html"
             };
 
-        case "VADDPS":
         case "ADDPS":
+        case "VADDPS":
             return {
                 "html": "<p>Add four, eight or sixteen packed single-precision floating-point values from the first source operand with the second source operand, and stores the packed single-precision floating-point results in the destination operand.</p><p>EVEX encoded versions: The first source operand is a ZMM/YMM/XMM register. The second source operand can be a ZMM/YMM/XMM register, a 512/256/128-bit memory location or a 512/256/128-bit vector broadcasted from a 32-bit memory location. The destination operand is a ZMM/YMM/XMM register conditionally updated with writemask k1.</p><p>VEX.256 encoded version: The first source operand is a YMM register. The second source operand can be a YMM register or a 256-bit memory location. The destination operand is a YMM register. The upper bits (MAXVL-1:256) of the corresponding ZMM register destination are zeroed.</p><p>VEX.128 encoded version: the first source operand is a XMM register. The second source operand is an XMM register or 128-bit memory location. The destination operand is an XMM register. The upper bits (MAXVL-1:128) of the corresponding ZMM register destination are zeroed.</p><p>128-bit Legacy SSE version: The second source can be an XMM register or an 128-bit memory location. The destination is not distinct from the first source XMM register and the upper Bits (MAXVL-1:128) of the corresponding ZMM register destination are unmodified.</p>",
                 "tooltip": "Add four, eight or sixteen packed single-precision floating-point values from the first source operand with the second source operand, and stores the packed single-precision floating-point results in the destination operand.",
@@ -83,8 +83,8 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/ADDSS.html"
             };
 
-        case "VADDSUBPD":
         case "ADDSUBPD":
+        case "VADDSUBPD":
             return {
                 "html": "<p>Adds odd-numbered double-precision floating-point values of the first source operand (second operand) with the corresponding double-precision floating-point values from the second source operand (third operand); stores the result in the odd-numbered values of the destination operand (first operand). Subtracts the even-numbered double-precision floating-point values from the second source operand from the corresponding double-precision floating values in the first source operand; stores the result into the even-numbered values of the destination operand.</p><p>In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p><p>128-bit Legacy SSE version: The second source can be an XMM register or an 128-bit memory location. The destination is not distinct from the first source XMM register and the upper bits (MAXVL-1:128) of the corresponding YMM register destination are unmodified. See <a href=\"http://www.felixcloutier.com/x86/ADDSUBPD.html#fig-3-3\" rel=\"noreferrer noopener\" target=\"_blank\">Figure 3-3</a>.</p><p>VEX.128 encoded version: the first source operand is an XMM register or 128-bit memory location. The destination operand is an XMM register. The upper bits (MAXVL-1:128) of the corresponding YMM register destination are zeroed.</p><p>VEX.256 encoded version: The first source operand is a YMM register. The second source operand can be a YMM register or a 256-bit memory location. The destination operand is a YMM register.</p>",
                 "tooltip": "Adds odd-numbered double-precision floating-point values of the first source operand (second operand) with the corresponding double-precision floating-point values from the second source operand (third operand); stores the result in the odd-numbered values of the destination operand (first operand). Subtracts the even-numbered double-precision floating-point values from the second source operand from the corresponding double-precision floating values in the first source operand; stores the result into the even-numbered values of the destination operand.",
@@ -106,16 +106,16 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/ADOX.html"
             };
 
-        case "VAESDEC":
         case "AESDEC":
+        case "VAESDEC":
             return {
                 "html": "<p>This instruction performs a single round of the AES decryption flow using the Equivalent Inverse Cipher, with the round key from the second source operand, operating on a 128-bit data (state) from the first source operand, and store the result in the destination operand.</p><p>Use the AESDEC instruction for all but the last decryption round. For the last decryption round, use the AESDECLAST instruction.</p><p>128-bit Legacy SSE version: The first source operand and the destination operand are the same and must be an XMM register. The second source operand can be an XMM register or a 128-bit memory location. Bits (MAXVL-1:128) of the corresponding YMM destination register remain unchanged.</p><p>VEX.128 encoded version: The first source operand and the destination operand are XMM registers. The second source operand can be an XMM register or a 128-bit memory location. Bits (MAXVL-1:128) of the destination YMM register are zeroed.</p>",
                 "tooltip": "This instruction performs a single round of the AES decryption flow using the Equivalent Inverse Cipher, with the round key from the second source operand, operating on a 128-bit data (state) from the first source operand, and store the result in the destination operand.",
                 "url": "http://www.felixcloutier.com/x86/AESDEC.html"
             };
 
-        case "VAESDECLAST":
         case "AESDECLAST":
+        case "VAESDECLAST":
             return {
                 "html": "<p>This instruction performs the last round of the AES decryption flow using the Equivalent Inverse Cipher, with the round key from the second source operand, operating on a 128-bit data (state) from the first source operand, and store the result in the destination operand.</p><p>128-bit Legacy SSE version: The first source operand and the destination operand are the same and must be an XMM register. The second source operand can be an XMM register or a 128-bit memory location. Bits (MAXVL-1:128) of the corresponding YMM destination register remain unchanged.</p><p>VEX.128 encoded version: The first source operand and the destination operand are XMM registers. The second source operand can be an XMM register or a 128-bit memory location. Bits (MAXVL-1:128) of the destination YMM register are zeroed.</p>",
                 "tooltip": "This instruction performs the last round of the AES decryption flow using the Equivalent Inverse Cipher, with the round key from the second source operand, operating on a 128-bit data (state) from the first source operand, and store the result in the destination operand.",
@@ -130,16 +130,16 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/AESENC.html"
             };
 
-        case "VAESENCLAST":
         case "AESENCLAST":
+        case "VAESENCLAST":
             return {
                 "html": "<p>This instruction performs the last round of an AES encryption flow using a round key from the second source operand, operating on 128-bit data (state) from the first source operand, and store the result in the destination operand.</p><p>128-bit Legacy SSE version: The first source operand and the destination operand are the same and must be an XMM register. The second source operand can be an XMM register or a 128-bit memory location. Bits (MAXVL-1:128) of the corresponding YMM destination register remain unchanged.</p><p>VEX.128 encoded version: The first source operand and the destination operand are XMM registers. The second source operand can be an XMM register or a 128-bit memory location. Bits (MAXVL-1:128) of the destination YMM register are zeroed.</p>",
                 "tooltip": "This instruction performs the last round of an AES encryption flow using a round key from the second source operand, operating on 128-bit data (state) from the first source operand, and store the result in the destination operand.",
                 "url": "http://www.felixcloutier.com/x86/AESENCLAST.html"
             };
 
-        case "VAESIMC":
         case "AESIMC":
+        case "VAESIMC":
             return {
                 "html": "<p>Perform the InvMixColumns transformation on the source operand and store the result in the destination operand. The destination operand is an XMM register. The source operand can be an XMM register or a 128-bit memory location.</p><p>Note: the AESIMC instruction should be applied to the expanded AES round keys (except for the first and last round key) in order to prepare them for decryption using the \u201cEquivalent Inverse Cipher\u201d (defined in FIPS 197).</p><p>128-bit Legacy SSE version: Bits (MAXVL-1:128) of the corresponding YMM destination register remain unchanged.</p><p>VEX.128 encoded version: Bits (MAXVL-1:128) of the destination YMM register are zeroed.</p><p>Note: In VEX-encoded versions, VEX.vvvv is reserved and must be 1111b, otherwise instructions will #UD.</p>",
                 "tooltip": "Perform the InvMixColumns transformation on the source operand and store the result in the destination operand. The destination operand is an XMM register. The source operand can be an XMM register or a 128-bit memory location.",
@@ -214,24 +214,24 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/BEXTR.html"
             };
 
-        case "VBLENDPD":
         case "BLENDPD":
+        case "VBLENDPD":
             return {
                 "html": "<p>Double-precision floating-point values from the second source operand (third operand) are conditionally merged with values from the first source operand (second operand) and written to the destination operand (first operand). The immediate bits [3:0] determine whether the corresponding double-precision floating-point value in the destination is copied from the second source or first source. If a bit in the mask, corresponding to a word, is \u201d1\u201d, then the double-precision floating-point value in the second source operand is copied, else the value in the first source operand is copied.</p><p>128-bit Legacy SSE version: The second source can be an XMM register or an 128-bit memory location. The destination is not distinct from the first source XMM register and the upper bits (MAXVL-1:128) of the corresponding YMM register destination are unmodified.</p><p>VEX.128 encoded version: the first source operand is an XMM register. The second source operand is an XMM register or 128-bit memory location. The destination operand is an XMM register. The upper bits (MAXVL-1:128) of the corresponding YMM register destination are zeroed.</p><p>VEX.256 encoded version: The first source operand is a YMM register. The second source operand can be a YMM register or a 256-bit memory location. The destination operand is a YMM register.</p>",
                 "tooltip": "Double-precision floating-point values from the second source operand (third operand) are conditionally merged with values from the first source operand (second operand) and written to the destination operand (first operand). The immediate bits [3:0] determine whether the corresponding double-precision floating-point value in the destination is copied from the second source or first source. If a bit in the mask, corresponding to a word, is \u201d1\u201d, then the double-precision floating-point value in the second source operand is copied, else the value in the first source operand is copied.",
                 "url": "http://www.felixcloutier.com/x86/BLENDPD.html"
             };
 
-        case "VBLENDPS":
         case "BLENDPS":
+        case "VBLENDPS":
             return {
                 "html": "<p>Packed single-precision floating-point values from the second source operand (third operand) are conditionally merged with values from the first source operand (second operand) and written to the destination operand (first operand). The immediate bits [7:0] determine whether the corresponding single precision floating-point value in the destination is copied from the second source or first source. If a bit in the mask, corresponding to a word, is \u201c1\u201d, then the single-precision floating-point value in the second source operand is copied, else the value in the first source operand is copied.</p><p>128-bit Legacy SSE version: The second source can be an XMM register or an 128-bit memory location. The destination is not distinct from the first source XMM register and the upper bits (MAXVL-1:128) of the corresponding YMM register destination are unmodified.</p><p>VEX.128 encoded version: The first source operand an XMM register. The second source operand is an XMM register or 128-bit memory location. The destination operand is an XMM register. The upper bits (MAXVL-1:128) of the corresponding YMM register destination are zeroed.</p><p>VEX.256 encoded version: The first source operand is a YMM register. The second source operand can be a YMM register or a 256-bit memory location. The destination operand is a YMM register.</p>",
                 "tooltip": "Packed single-precision floating-point values from the second source operand (third operand) are conditionally merged with values from the first source operand (second operand) and written to the destination operand (first operand). The immediate bits [7:0] determine whether the corresponding single precision floating-point value in the destination is copied from the second source or first source. If a bit in the mask, corresponding to a word, is \u201c1\u201d, then the single-precision floating-point value in the second source operand is copied, else the value in the first source operand is copied.",
                 "url": "http://www.felixcloutier.com/x86/BLENDPS.html"
             };
 
-        case "VBLENDVPD":
         case "BLENDVPD":
+        case "VBLENDVPD":
             return {
                 "html": "<p>Conditionally copy each quadword data element of double-precision floating-point value from the second source operand and the first source operand depending on mask bits defined in the mask register operand. The mask bits are the most significant bit in each quadword element of the mask register.</p><p>Each quadword element of the destination operand is copied from:</p><p>The register assignment of the implicit mask operand for BLENDVPD is defined to be the architectural register XMM0.</p><p>128-bit Legacy SSE version: The first source operand and the destination operand is the same. Bits (MAXVL-1:128) of the corresponding YMM destination register remain unchanged. The mask register operand is implicitly defined to be the architectural register XMM0. An attempt to execute BLENDVPD with a VEX prefix will cause #UD.</p><p>VEX.128 encoded version: The first source operand and the destination operand are XMM registers. The second source operand is an XMM register or 128-bit memory location. The mask operand is the third source register, and encoded in bits[7:4] of the immediate byte(imm8). The bits[3:0] of imm8 are ignored. In 32-bit mode, imm8[7] is ignored. The upper bits (MAXVL-1:128) of the corresponding YMM register (destination register) are zeroed. VEX.W must be 0, otherwise, the instruction will #UD.</p>",
                 "tooltip": "Conditionally copy each quadword data element of double-precision floating-point value from the second source operand and the first source operand depending on mask bits defined in the mask register operand. The mask bits are the most significant bit in each quadword element of the mask register.",
@@ -380,9 +380,9 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/CALL.html"
             };
 
-        case "CWDE":
-        case "CDQE":
         case "CBW":
+        case "CDQE":
+        case "CWDE":
             return {
                 "html": "<p>Double the size of the source operand by means of sign extension. The CBW (convert byte to word) instruction copies the sign (bit 7) in the source operand into every bit in the AH register. The CWDE (convert word to double-word) instruction copies the sign (bit 15) of the word in the AX register into the high 16 bits of the EAX register.</p><p>CBW and CWDE reference the same opcode. The CBW instruction is intended for use when the operand-size attribute is 16; CWDE is intended for use when the operand-size attribute is 32. Some assemblers may force the operand size. Others may treat these two mnemonics as synonyms (CBW/CWDE) and use the setting of the operand-size attribute to determine the size of values to be converted.</p><p>In 64-bit mode, the default operation size is the size of the destination register. Use of the REX.W prefix promotes this instruction (CDQE when promoted) to operate on 64-bit operands. In which case, CDQE copies the sign (bit 31) of the doubleword in the EAX register into the high 32 bits of RAX.</p>",
                 "tooltip": "Double the size of the source operand by means of sign extension. The CBW (convert byte to word) instruction copies the sign (bit 7) in the source operand into every bit in the AH register. The CWDE (convert word to double-word) instruction copies the sign (bit 15) of the word in the AX register into the high 16 bits of the EAX register.",
@@ -452,36 +452,36 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/CMC.html"
             };
 
-        case "CMOVS":
-        case "CMOVO":
-        case "CMOVNC":
-        case "CMOVNS":
-        case "CMOVB":
-        case "CMOVLE":
         case "CMOVA":
-        case "CMOVGE":
         case "CMOVAE":
-        case "CMOVNL":
+        case "CMOVB":
+        case "CMOVBE":
         case "CMOVC":
-        case "CMOVNBE":
-        case "CMOVNGE":
         case "CMOVE":
-        case "CMOVNAE":
         case "CMOVG":
-        case "CMOVNLE":
-        case "CMOVNP":
+        case "CMOVGE":
         case "CMOVL":
-        case "CMOVBE":
-        case "CMOVZ":
+        case "CMOVLE":
+        case "CMOVNA":
+        case "CMOVNAE":
+        case "CMOVNB":
+        case "CMOVNBE":
+        case "CMOVNC":
         case "CMOVNE":
-        case "CMOVP":
         case "CMOVNG":
+        case "CMOVNGE":
+        case "CMOVNL":
+        case "CMOVNLE":
+        case "CMOVNO":
+        case "CMOVNP":
+        case "CMOVNS":
         case "CMOVNZ":
-        case "CMOVPO":
+        case "CMOVO":
+        case "CMOVP":
         case "CMOVPE":
-        case "CMOVNA":
-        case "CMOVNB":
-        case "CMOVNO":
+        case "CMOVPO":
+        case "CMOVS":
+        case "CMOVZ":
             return {
                 "html": "<p>The CMOV<em>cc</em> instructions check the state of one or more of the status flags in the EFLAGS register (CF, OF, PF, SF, and ZF) and perform a move operation if the flags are in a specified state (or condition). A condition code (<em>cc</em>) is associated with each instruction to indicate the condition being tested for. If the condition is not satisfied, a move is not performed and execution continues with the instruction following the CMOV<em>cc</em> instruction.</p><p>These instructions can move 16-bit, 32-bit or 64-bit values from memory to a general-purpose register or from one general-purpose register to another. Conditional moves of 8-bit register operands are not supported.</p><p>The condition for each CMOV<em>cc</em> mnemonic is given in the description column of the above table. The terms \u201cless\u201d and \u201cgreater\u201d are used for comparisons of signed integers and the terms \u201cabove\u201d and \u201cbelow\u201d are used for unsigned integers.</p><p>Because a particular state of the status flags can sometimes be interpreted in two ways, two mnemonics are defined for some opcodes. For example, the CMOVA (conditional move if above) instruction and the CMOVNBE (conditional move if not below or equal) instruction are alternate mnemonics for the opcode 0F 47H.</p><p>The CMOV<em>cc</em> instructions were introduced in P6 family processors; however, these instructions may not be supported by all IA-32 processors. Software can determine if the CMOV<em>cc</em> instructions are supported by checking the processor\u2019s feature information with the CPUID instruction (see \u201cCPUID\u2014CPU Identification\u201d in this chapter).</p>",
                 "tooltip": "The CMOVcc instructions check the state of one or more of the status flags in the EFLAGS register (CF, OF, PF, SF, and ZF) and perform a move operation if the flags are in a specified state (or condition). A condition code (cc) is associated with each instruction to indicate the condition being tested for. If the condition is not satisfied, a move is not performed and execution continues with the instruction following the CMOVcc instruction.",
@@ -495,8 +495,8 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/CMP.html"
             };
 
-        case "VCMPPD":
         case "CMPPD":
+        case "VCMPPD":
             return {
                 "html": "<p>Performs a SIMD compare of the packed double-precision floating-point values in the second source operand and the first source operand and returns the results of the comparison to the destination operand. The comparison predicate operand (immediate byte) specifies the type of comparison performed on each pair of packed values in the two source operands.</p><p>EVEX encoded versions: The first source operand (second operand) is a ZMM/YMM/XMM register. The second source operand can be a ZMM/YMM/XMM register, a 512/256/128-bit memory location or a 512/256/128-bit vector broadcasted from a 64-bit memory location. The destination operand (first operand) is an opmask register. Comparison results are written to the destination operand under the writemask k2. Each comparison result is a single mask bit of 1 (comparison true) or 0 (comparison false).</p><p>VEX.256 encoded version: The first source operand (second operand) is a YMM register. The second source operand (third operand) can be a YMM register or a 256-bit memory location. The destination operand (first operand) is a YMM register. Four comparisons are performed with results written to the destination operand. The result of each comparison is a quadword mask of all 1s (comparison true) or all 0s (comparison false).</p><p>128-bit Legacy SSE version: The first source and destination operand (first operand) is an XMM register. The second source operand (second operand) can be an XMM register or 128-bit memory location. Bits (MAXVL-1:128) of the corresponding ZMM destination register remain unchanged. Two comparisons are performed with results written to bits 127:0 of the destination operand. The result of each comparison is a quadword mask of all 1s (comparison true) or all 0s (comparison false).</p><p>VEX.128 encoded version: The first source operand (second operand) is an XMM register. The second source operand (third operand) can be an XMM register or a 128-bit memory location. Bits (MAXVL-1:128) of the destination ZMM register are zeroed. Two comparisons are performed with results written to bits 127:0 of the destination operand.</p>",
                 "tooltip": "Performs a SIMD compare of the packed double-precision floating-point values in the second source operand and the first source operand and returns the results of the comparison to the destination operand. The comparison predicate operand (immediate byte) specifies the type of comparison performed on each pair of packed values in the two source operands.",
@@ -544,8 +544,8 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/CMPXCHG8B%3ACMPXCHG16B.html"
             };
 
-        case "VCOMISD":
         case "COMISD":
+        case "VCOMISD":
             return {
                 "html": "<p>Compares the double-precision floating-point values in the low quadwords of operand 1 (first operand) and operand 2 (second operand), and sets the ZF, PF, and CF flags in the EFLAGS register according to the result (unordered, greater than, less than, or equal). The OF, SF and AF flags in the EFLAGS register are set to 0. The unordered result is returned if either source operand is a NaN (QNaN or SNaN).</p><p>Operand 1 is an XMM register; operand 2 can be an XMM register or a 64 bit memory</p><p>location. The COMISD instruction differs from the UCOMISD instruction in that it signals a SIMD floating-point invalid operation exception (#I) when a source operand is either a QNaN or SNaN. The UCOMISD instruction signals an invalid numeric exception only if a source operand is an SNaN.</p><p>The EFLAGS register is not updated if an unmasked SIMD floating-point exception is generated.</p><p>VEX.vvvv and EVEX.vvvv are reserved and must be 1111b, otherwise instructions will #UD.</p>",
                 "tooltip": "Compares the double-precision floating-point values in the low quadwords of operand 1 (first operand) and operand 2 (second operand), and sets the ZF, PF, and CF flags in the EFLAGS register according to the result (unordered, greater than, less than, or equal). The OF, SF and AF flags in the EFLAGS register are set to 0. The unordered result is returned if either source operand is a NaN (QNaN or SNaN).",
@@ -582,16 +582,16 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/CVTDQ2PD.html"
             };
 
-        case "VCVTDQ2PS":
         case "CVTDQ2PS":
+        case "VCVTDQ2PS":
             return {
                 "html": "<p>Converts four, eight or sixteen packed signed doubleword integers in the source operand to four, eight or sixteen packed single-precision floating-point values in the destination operand.</p><p>EVEX encoded versions: The source operand can be a ZMM/YMM/XMM register, a 512/256/128-bit memory location or a 512/256/128-bit vector broadcasted from a 32-bit memory location. The destination operand is a ZMM/YMM/XMM register conditionally updated with writemask k1.</p><p>VEX.256 encoded version: The source operand is a YMM register or 256- bit memory location. The destination operand is a YMM register. Bits (MAXVL-1:256) of the corresponding register destination are zeroed.</p><p>VEX.128 encoded version: The source operand is an XMM register or 128- bit memory location. The destination operand is a XMM register. The upper bits (MAXVL-1:128) of the corresponding register destination are zeroed.</p><p>128-bit Legacy SSE version: The source operand is an XMM register or 128- bit memory location. The destination operand is an XMM register. The upper Bits (MAXVL-1:128) of the corresponding register destination are unmodified.</p>",
                 "tooltip": "Converts four, eight or sixteen packed signed doubleword integers in the source operand to four, eight or sixteen packed single-precision floating-point values in the destination operand.",
                 "url": "http://www.felixcloutier.com/x86/CVTDQ2PS.html"
             };
 
-        case "VCVTPD2DQ":
         case "CVTPD2DQ":
+        case "VCVTPD2DQ":
             return {
                 "html": "<p>Converts packed double-precision floating-point values in the source operand (second operand) to packed signed doubleword integers in the destination operand (first operand).</p><p>When a conversion is inexact, the value returned is rounded according to the rounding control bits in the MXCSR register or the embedded rounding control bits. If a converted result cannot be represented in the destination format, the floating-point invalid exception is raised, and if this exception is masked, the indefinite integer value (2<sup>w-1</sup>, where w represents the number of bits in the destination format) is returned.</p><p>EVEX encoded versions: The source operand is a ZMM/YMM/XMM register, a 512-bit memory location, or a 512-bit vector broadcasted from a 64-bit memory location. The destination operand is a ZMM/YMM/XMM register conditionally updated with writemask k1. The upper bits (MAXVL-1:256/128/64) of the corresponding destination are zeroed.</p><p>VEX.256 encoded version: The source operand is a YMM register or 256- bit memory location. The destination operand is an XMM register. The upper bits (MAXVL-1:128) of the corresponding ZMM register destination are zeroed.</p><p>VEX.128 encoded version: The source operand is an XMM register or 128- bit memory location. The destination operand is a XMM register. The upper bits (MAXVL-1:64) of the corresponding ZMM register destination are zeroed.</p>",
                 "tooltip": "Converts packed double-precision floating-point values in the source operand (second operand) to packed signed doubleword integers in the destination operand (first operand).",
@@ -605,8 +605,8 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/CVTPD2PI.html"
             };
 
-        case "VCVTPD2PS":
         case "CVTPD2PS":
+        case "VCVTPD2PS":
             return {
                 "html": "<p>Converts two, four or eight packed double-precision floating-point values in the source operand (second operand) to two, four or eight packed single-precision floating-point values in the destination operand (first operand).</p><p>When a conversion is inexact, the value returned is rounded according to the rounding control bits in the MXCSR register or the embedded rounding control bits.</p><p>EVEX encoded versions: The source operand is a ZMM/YMM/XMM register, a 512/256/128-bit memory location, or a 512/256/128-bit vector broadcasted from a 64-bit memory location. The destination operand is a YMM/XMM/XMM (low 64-bits) register conditionally updated with writemask k1. The upper bits (MAXVL-1:256/128/64) of the corresponding destination are zeroed.</p><p>VEX.256 encoded version: The source operand is a YMM register or 256- bit memory location. The destination operand is an XMM register. The upper bits (MAXVL-1:128) of the corresponding ZMM register destination are zeroed.</p><p>VEX.128 encoded version: The source operand is an XMM register or 128- bit memory location. The destination operand is a XMM register. The upper bits (MAXVL-1:64) of the corresponding ZMM register destination are zeroed.</p>",
                 "tooltip": "Converts two, four or eight packed double-precision floating-point values in the source operand (second operand) to two, four or eight packed single-precision floating-point values in the destination operand (first operand).",
@@ -635,8 +635,8 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/CVTPS2DQ.html"
             };
 
-        case "VCVTPS2PD":
         case "CVTPS2PD":
+        case "VCVTPS2PD":
             return {
                 "html": "<p>Converts two, four or eight packed single-precision floating-point values in the source operand (second operand) to two, four or eight packed double-precision floating-point values in the destination operand (first operand).</p><p>EVEX encoded versions: The source operand is a YMM/XMM/XMM (low 64-bits) register, a 256/128/64-bit memory location or a 256/128/64-bit vector broadcasted from a 32-bit memory location. The destination operand is a ZMM/YMM/XMM register conditionally updated with writemask k1.</p><p>VEX.256 encoded version: The source operand is an XMM register or 128- bit memory location. The destination operand is a YMM register. Bits (MAXVL-1:256) of the corresponding destination ZMM register are zeroed.</p><p>VEX.128 encoded version: The source operand is an XMM register or 64- bit memory location. The destination operand is a XMM register. The upper Bits (MAXVL-1:128) of the corresponding ZMM register destination are zeroed.</p><p>128-bit Legacy SSE version: The source operand is an XMM register or 64- bit memory location. The destination operand is an XMM register. The upper Bits (MAXVL-1:128) of the corresponding ZMM register destination are unmodified.</p>",
                 "tooltip": "Converts two, four or eight packed single-precision floating-point values in the source operand (second operand) to two, four or eight packed double-precision floating-point values in the destination operand (first operand).",
@@ -658,8 +658,8 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/CVTSD2SI.html"
             };
 
-        case "VCVTSD2SS":
         case "CVTSD2SS":
+        case "VCVTSD2SS":
             return {
                 "html": "<p>Converts a double-precision floating-point value in the \u201cconvert-from\u201d source operand (the second operand in SSE2 version, otherwise the third operand) to a single-precision floating-point value in the destination operand.</p><p>When the \u201cconvert-from\u201d operand is an XMM register, the double-precision floating-point value is contained in the low quadword of the register. The result is stored in the low doubleword of the destination operand. When the conversion is inexact, the value returned is rounded according to the rounding control bits in the MXCSR register.</p><p>128-bit Legacy SSE version: The \u201cconvert-from\u201d source operand (the second operand) is an XMM register or memory location. Bits (MAXVL-1:32) of the corresponding destination register remain unchanged. The destination operand is an XMM register.</p><p>VEX.128 and EVEX encoded versions: The \u201cconvert-from\u201d source operand (the third operand) can be an XMM register or a 64-bit memory location. The first source and destination operands are XMM registers. Bits (127:32) of the XMM register destination are copied from the corresponding bits in the first source operand. Bits (MAXVL-1:128) of the destination register are zeroed.</p><p>EVEX encoded version: the converted result in written to the low doubleword element of the destination under the writemask.</p>",
                 "tooltip": "Converts a double-precision floating-point value in the \u201cconvert-from\u201d source operand (the second operand in SSE2 version, otherwise the third operand) to a single-precision floating-point value in the destination operand.",
@@ -674,8 +674,8 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/CVTSI2SD.html"
             };
 
-        case "VCVTSI2SS":
         case "CVTSI2SS":
+        case "VCVTSI2SS":
             return {
                 "html": "<p>Converts a signed doubleword integer (or signed quadword integer if operand size is 64 bits) in the \u201cconvert-from\u201d source operand to a single-precision floating-point value in the destination operand (first operand). The \u201cconvert-from\u201d source operand can be a general-purpose register or a memory location. The destination operand is an XMM register. The result is stored in the low doubleword of the destination operand, and the upper three doublewords are left unchanged. When a conversion is inexact, the value returned is rounded according to the rounding control bits in the MXCSR register or the embedded rounding control bits.</p><p>128-bit Legacy SSE version: In 64-bit mode, Use of the REX.W prefix promotes the instruction to use 64-bit input value. The \u201cconvert-from\u201d source operand (the second operand) is a general-purpose register or memory location. Bits (MAXVL-1:32) of the corresponding destination register remain unchanged.</p><p>VEX.128 and EVEX encoded versions: The \u201cconvert-from\u201d source operand (the third operand) can be a general-purpose register or a memory location. The first source and destination operands are XMM registers. Bits (127:32) of the XMM register destination are copied from corresponding bits in the first source operand. Bits (MAXVL-1:128) of the destination register are zeroed.</p><p>EVEX encoded version: the converted result in written to the low doubleword element of the destination under the writemask.</p><p>Software should ensure VCVTSI2SS is encoded with VEX.L=0. Encoding VCVTSI2SS with VEX.L=1 may encounter unpredictable behavior across different processor generations.</p>",
                 "tooltip": "Converts a signed doubleword integer (or signed quadword integer if operand size is 64 bits) in the \u201cconvert-from\u201d source operand to a single-precision floating-point value in the destination operand (first operand). The \u201cconvert-from\u201d source operand can be a general-purpose register or a memory location. The destination operand is an XMM register. The result is stored in the low doubleword of the destination operand, and the upper three doublewords are left unchanged. When a conversion is inexact, the value returned is rounded according to the rounding control bits in the MXCSR register or the embedded rounding control bits.",
@@ -690,16 +690,16 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/CVTSS2SD.html"
             };
 
-        case "VCVTSS2SI":
         case "CVTSS2SI":
+        case "VCVTSS2SI":
             return {
                 "html": "<p>Converts a single-precision floating-point value in the source operand (the second operand) to a signed doubleword integer (or signed quadword integer if operand size is 64 bits) in the destination operand (the first operand). The source operand can be an XMM register or a memory location. The destination operand is a general-purpose register. When the source operand is an XMM register, the single-precision floating-point value is contained in the low doubleword of the register.</p><p>When a conversion is inexact, the value returned is rounded according to the rounding control bits in the MXCSR register or the embedded rounding control bits. If a converted result cannot be represented in the destination format, the floating-point invalid exception is raised, and if this exception is masked, the indefinite integer value (2<sup>w-1</sup>, where w represents the number of bits in the destination format) is returned.</p><p>Legacy SSE instructions: In 64-bit mode, Use of the REX.W prefix promotes the instruction to produce 64-bit data. See the summary chart at the beginning of this section for encoding data and limits.</p><p>VEX.W1 and EVEX.W1 versions: promotes the instruction to produce 64-bit data in 64-bit mode.</p><p>Note: VEX.vvvv and EVEX.vvvv are reserved and must be 1111b, otherwise instructions will #UD.</p>",
                 "tooltip": "Converts a single-precision floating-point value in the source operand (the second operand) to a signed doubleword integer (or signed quadword integer if operand size is 64 bits) in the destination operand (the first operand). The source operand can be an XMM register or a memory location. The destination operand is a general-purpose register. When the source operand is an XMM register, the single-precision floating-point value is contained in the low doubleword of the register.",
                 "url": "http://www.felixcloutier.com/x86/CVTSS2SI.html"
             };
 
-        case "VCVTTPD2DQ":
         case "CVTTPD2DQ":
+        case "VCVTTPD2DQ":
             return {
                 "html": "<p>Converts two, four or eight packed double-precision floating-point values in the source operand (second operand) to two, four or eight packed signed doubleword integers in the destination operand (first operand).</p><p>When a conversion is inexact, a truncated (round toward zero) value is returned. If a converted result is larger than the maximum signed doubleword integer, the floating-point invalid exception is raised, and if this exception is masked, the indefinite integer value (80000000H) is returned.</p><p>EVEX encoded versions: The source operand is a ZMM/YMM/XMM register, a 512/256/128-bit memory location, or a 512/256/128-bit vector broadcasted from a 64-bit memory location. The destination operand is a YMM/XMM/XMM (low 64 bits) register conditionally updated with writemask k1. The upper bits (MAXVL-1:256) of the corresponding destination are zeroed.</p><p>VEX.256 encoded version: The source operand is a YMM register or 256- bit memory location. The destination operand is an XMM register. The upper bits (MAXVL-1:128) of the corresponding ZMM register destination are zeroed.</p><p>VEX.128 encoded version: The source operand is an XMM register or 128- bit memory location. The destination operand is a XMM register. The upper bits (MAXVL-1:64) of the corresponding ZMM register destination are zeroed.</p>",
                 "tooltip": "Converts two, four or eight packed double-precision floating-point values in the source operand (second operand) to two, four or eight packed signed doubleword integers in the destination operand (first operand).",
@@ -728,8 +728,8 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/CVTTPS2PI.html"
             };
 
-        case "VCVTTSD2SI":
         case "CVTTSD2SI":
+        case "VCVTTSD2SI":
             return {
                 "html": "<p>Converts a double-precision floating-point value in the source operand (the second operand) to a signed double-word integer (or signed quadword integer if operand size is 64 bits) in the destination operand (the first operand). The source operand can be an XMM register or a 64-bit memory location. The destination operand is a general purpose register. When the source operand is an XMM register, the double-precision floating-point value is contained in the low quadword of the register.</p><p>When a conversion is inexact, the value returned is rounded according to the rounding control bits in the MXCSR register.</p><p>If a converted result exceeds the range limits of signed doubleword integer (in non-64-bit modes or 64-bit mode with REX.W/VEX.W/EVEX.W=0), the floating-point invalid exception is raised, and if this exception is masked, the indefinite integer value (80000000H) is returned.</p><p>If a converted result exceeds the range limits of signed quadword integer (in 64-bit mode and REX.W/VEX.W/EVEX.W = 1), the floating-point invalid exception is raised, and if this exception is masked, the indefinite integer value (80000000_00000000H) is returned.</p><p>Legacy SSE instructions: In 64-bit mode, Use of the REX.W prefix promotes the instruction to 64-bit operation. See the summary chart at the beginning of this section for encoding data and limits.</p>",
                 "tooltip": "Converts a double-precision floating-point value in the source operand (the second operand) to a signed double-word integer (or signed quadword integer if operand size is 64 bits) in the destination operand (the first operand). The source operand can be an XMM register or a 64-bit memory location. The destination operand is a general purpose register. When the source operand is an XMM register, the double-precision floating-point value is contained in the low quadword of the register.",
@@ -745,8 +745,8 @@ export function getAsmOpcode(opcode) {
             };
 
         case "CDQ":
-        case "CWD":
         case "CQO":
+        case "CWD":
             return {
                 "html": "<p>Doubles the size of the operand in register AX, EAX, or RAX (depending on the operand size) by means of sign extension and stores the result in registers DX:AX, EDX:EAX, or RDX:RAX, respectively. The CWD instruction copies the sign (bit 15) of the value in the AX register into every bit position in the DX register. The CDQ instruction copies the sign (bit 31) of the value in the EAX register into every bit position in the EDX register. The CQO instruction (available in 64-bit mode only) copies the sign (bit 63) of the value in the RAX register into every bit position in the RDX register.</p><p>The CWD instruction can be used to produce a doubleword dividend from a word before word division. The CDQ instruction can be used to produce a quadword dividend from a doubleword before doubleword division. The CQO instruction can be used to produce a double quadword dividend from a quadword before a quadword division.</p><p>The CWD and CDQ mnemonics reference the same opcode. The CWD instruction is intended for use when the operand-size attribute is 16 and the CDQ instruction for when the operand-size attribute is 32. Some assemblers may force the operand size to 16 when CWD is used and to 32 when CDQ is used. Others may treat these mnemonics as synonyms (CWD/CDQ) and use the current setting of the operand-size attribute to determine the size of values to be converted, regardless of the mnemonic used.</p><p>In 64-bit mode, use of the REX.W prefix promotes operation to 64 bits. The CQO mnemonics reference the same opcode as CWD/CDQ. See the summary chart at the beginning of this section for encoding data and limits.</p>",
                 "tooltip": "Doubles the size of the operand in register AX, EAX, or RAX (depending on the operand size) by means of sign extension and stores the result in registers DX:AX, EDX:EAX, or RDX:RAX, respectively. The CWD instruction copies the sign (bit 15) of the value in the AX register into every bit position in the DX register. The CDQ instruction copies the sign (bit 31) of the value in the EAX register into every bit position in the EDX register. The CQO instruction (available in 64-bit mode only) copies the sign (bit 63) of the value in the RAX register into every bit position in the RDX register.",
@@ -789,8 +789,8 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/DIVPD.html"
             };
 
-        case "VDIVPS":
         case "DIVPS":
+        case "VDIVPS":
             return {
                 "html": "<p>Performs a SIMD divide of the four, eight or sixteen packed single-precision floating-point values in the first source operand (the second operand) by the four, eight or sixteen packed single-precision floating-point values in the second source operand (the third operand). Results are written to the destination operand (the first operand).</p><p>EVEX encoded versions: The first source operand (the second operand) is a ZMM/YMM/XMM register. The second source operand can be a ZMM/YMM/XMM register, a 512/256/128-bit memory location or a 512/256/128-bit vector broadcasted from a 32-bit memory location. The destination operand is a ZMM/YMM/XMM register conditionally updated with writemask k1.</p><p>VEX.256 encoded version: The first source operand is a YMM register. The second source operand can be a YMM register or a 256-bit memory location. The destination operand is a YMM register.</p><p>VEX.128 encoded version: The first source operand is a XMM register. The second source operand can be a XMM register or a 128-bit memory location. The destination operand is a XMM register. The upper bits (MAXVL-1:128) of the corresponding ZMM register destination are zeroed.</p><p>128-bit Legacy SSE version: The second source can be an XMM register or an 128-bit memory location. The destination is not distinct from the first source XMM register and the upper bits (MAXVL-1:128) of the corresponding ZMM register destination are unmodified.</p>",
                 "tooltip": "Performs a SIMD divide of the four, eight or sixteen packed single-precision floating-point values in the first source operand (the second operand) by the four, eight or sixteen packed single-precision floating-point values in the second source operand (the third operand). Results are written to the destination operand (the first operand).",
@@ -805,8 +805,8 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/DIVSD.html"
             };
 
-        case "VDIVSS":
         case "DIVSS":
+        case "VDIVSS":
             return {
                 "html": "<p>Divides the low single-precision floating-point value in the first source operand by the low single-precision floating-point value in the second source operand, and stores the single-precision floating-point result in the destination operand. The second source operand can be an XMM register or a 32-bit memory location.</p><p>128-bit Legacy SSE version: The first source operand and the destination operand are the same. Bits (MAXVL-1:32) of the corresponding YMM destination register remain unchanged.</p><p>VEX.128 encoded version: The first source operand is an xmm register encoded by VEX.vvvv. The three high-order doublewords of the destination operand are copied from the first source operand. Bits (MAXVL-1:128) of the destination register are zeroed.</p><p>EVEX.128 encoded version: The first source operand is an xmm register encoded by EVEX.vvvv. The doubleword elements of the destination operand at bits 127:32 are copied from the first source operand. Bits (MAXVL-1:128) of the destination register are zeroed.</p><p>EVEX version: The low doubleword element of the destination is updated according to the writemask.</p>",
                 "tooltip": "Divides the low single-precision floating-point value in the first source operand by the low single-precision floating-point value in the second source operand, and stores the single-precision floating-point result in the destination operand. The second source operand can be an XMM register or a 32-bit memory location.",
@@ -821,8 +821,8 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/DPPD.html"
             };
 
-        case "VDPPS":
         case "DPPS":
+        case "VDPPS":
             return {
                 "html": "<p>Conditionally multiplies the packed single precision floating-point values in the destination operand (first operand) with the packed single-precision floats in the source (second operand) depending on a mask extracted from the high 4 bits of the immediate byte (third operand). If a condition mask bit in Imm8[7:4] is zero, the corresponding multiplication is replaced by a value of 0.0 in the manner described by Section 12.8.4 of <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>.</p><p>The four resulting single-precision values are summed into an intermediate result. The intermediate result is conditionally broadcasted to the destination using a broadcast mask specified by bits [3:0] of the immediate byte.</p><p>If a broadcast mask bit is \u201c1\u201d, the intermediate result is copied to the corresponding dword element in the destination operand. If a broadcast mask bit is zero, the corresponding element in the destination is set to zero.</p><p>DPPS follows the NaN forwarding rules stated in the Software Developer\u2019s Manual, vol. 1, <span class=\"not-imported\">table 4-7</span>. These rules do not cover horizontal prioritization of NaNs. Horizontal propagation of NaNs to the destination and the positioning of those NaNs in the destination is implementation dependent. NaNs on the input sources or computationally generated NaNs will have at least one NaN propagated to the destination.</p><p>128-bit Legacy SSE version: The second source can be an XMM register or an 128-bit memory location. The destination is not distinct from the first source XMM register and the upper bits (MAXVL-1:128) of the corresponding YMM register destination are unmodified.</p>",
                 "tooltip": "Conditionally multiplies the packed single precision floating-point values in the destination operand (first operand) with the packed single-precision floats in the source (second operand) depending on a mask extracted from the high 4 bits of the immediate byte (third operand). If a condition mask bit in Imm8[7:4] is zero, the corresponding multiplication is replaced by a value of 0.0 in the manner described by Section 12.8.4 of Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1.",
@@ -850,8 +850,8 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/ENTER.html"
             };
 
-        case "VEXTRACTPS":
         case "EXTRACTPS":
+        case "VEXTRACTPS":
             return {
                 "html": "<p>Extracts a single-precision floating-point value from the source operand (second operand) at the 32-bit offset specified from imm8. Immediate bits higher than the most significant offset for the vector length are ignored.</p><p>The extracted single-precision floating-point value is stored in the low 32-bits of the destination operand</p><p>In 64-bit mode, destination register operand has default operand size of 64 bits. The upper 32-bits of the register are filled with zero. REX.W is ignored.</p><p>VEX.128 and EVEX encoded version: When VEX.W1 or EVEX.W1 form is used in 64-bit mode with a general purpose register (GPR) as a destination operand, the packed single quantity is zero extended to 64 bits.</p><p>VEX.vvvv/EVEX.vvvv is reserved and must be 1111b otherwise instructions will #UD.</p>",
                 "tooltip": "Extracts a single-precision floating-point value from the source operand (second operand) at the 32-bit offset specified from imm8. Immediate bits higher than the most significant offset for the vector length are ignored.",
@@ -872,9 +872,9 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/FABS.html"
             };
 
+        case "FADD":
         case "FADDP":
         case "FIADD":
-        case "FADD":
             return {
                 "html": "<p>Adds the destination and source operands and stores the sum in the destination location. The destination operand is always an FPU register; the source operand can be a register or a memory location. Source operands in memory can be in single-precision or double-precision floating-point format or in word or doubleword integer format.</p><p>The no-operand version of the instruction adds the contents of the ST(0) register to the ST(1) register. The one-operand version adds the contents of a memory location (either a floating-point or an integer value) to the contents of the ST(0) register. The two-operand version, adds the contents of the ST(0) register to the ST(i) register or vice versa. The value in ST(0) can be doubled by coding:</p>",
                 "tooltip": "Adds the destination and source operands and stores the sum in the destination location. The destination operand is always an FPU register; the source operand can be a register or a memory location. Source operands in memory can be in single-precision or double-precision floating-point format or in word or doubleword integer format.",
@@ -902,22 +902,22 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/FCHS.html"
             };
 
-        case "FNCLEX":
         case "FCLEX":
+        case "FNCLEX":
             return {
                 "html": "<p>Clears the floating-point exception flags (PE, UE, OE, ZE, DE, and IE), the exception summary status flag (ES), the stack fault flag (SF), and the busy flag (B) in the FPU status word. The FCLEX instruction checks for and handles any pending unmasked floating-point exceptions before clearing the exception flags; the FNCLEX instruction does not.</p><p>The assembler issues two instructions for the FCLEX instruction (an FWAIT instruction followed by an FNCLEX instruction), and the processor executes each of these instructions separately. If an exception is generated for either of these instructions, the save EIP points to the instruction that caused the exception.</p><p>When operating a Pentium or Intel486 processor in MS-DOS* compatibility mode, it is possible (under unusual circumstances) for an FNCLEX instruction to be interrupted prior to being executed to handle a pending FPU exception. See the section titled \u201cNo-Wait FPU Instructions Can Get FPU Interrupt in Window\u201d in Appendix D of the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>, for a description of these circumstances. An FNCLEX instruction cannot be interrupted in this way on later Intel processors, except for the Intel Quark<sup>TM</sup> X1000 processor.</p><p>This instruction affects only the x87 FPU floating-point exception flags. It does not affect the SIMD floating-point exception flags in the MXCRS register.</p><p>This instruction\u2019s operation is the same in non-64-bit modes and 64-bit mode.</p>",
                 "tooltip": "Clears the floating-point exception flags (PE, UE, OE, ZE, DE, and IE), the exception summary status flag (ES), the stack fault flag (SF), and the busy flag (B) in the FPU status word. The FCLEX instruction checks for and handles any pending unmasked floating-point exceptions before clearing the exception flags; the FNCLEX instruction does not.",
                 "url": "http://www.felixcloutier.com/x86/FCLEX%3AFNCLEX.html"
             };
 
+        case "FCMOVB":
+        case "FCMOVBE":
         case "FCMOVE":
         case "FCMOVNB":
-        case "FCMOVU":
         case "FCMOVNBE":
-        case "FCMOVNU":
-        case "FCMOVBE":
-        case "FCMOVB":
         case "FCMOVNE":
+        case "FCMOVNU":
+        case "FCMOVU":
             return {
                 "html": "<p>Tests the status flags in the EFLAGS register and moves the source operand (second operand) to the destination operand (first operand) if the given test condition is true. The condition for each mnemonic os given in the Description column above and in Chapter 8 in the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>. The source operand is always in the ST(i) register and the destination operand is always ST(0).</p><p>The FCMOV<em>cc</em> instructions are useful for optimizing small IF constructions. They also help eliminate branching overhead for IF operations and the possibility of branch mispredictions by the processor.</p><p>A processor may not support the FCMOV<em>cc</em> instructions. Software can check if the FCMOV<em>cc</em> instructions are supported by checking the processor\u2019s feature information with the CPUID instruction (see \u201cCOMISS\u2014Compare Scalar Ordered Single-Precision Floating-Point Values and Set EFLAGS\u201d in this chapter). If both the CMOV and FPU feature bits are set, the FCMOV<em>cc</em> instructions are supported.</p><p>This instruction\u2019s operation is the same in non-64-bit modes and 64-bit mode.</p><p>The FCMOVcc instructions were introduced to the IA-32 Architecture in the P6 family processors and are not available in earlier IA-32 processors.</p>",
                 "tooltip": "Tests the status flags in the EFLAGS register and moves the source operand (second operand) to the destination operand (first operand) if the given test condition is true. The condition for each mnemonic os given in the Description column above and in Chapter 8 in the Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1. The source operand is always in the ST(i) register and the destination operand is always ST(0).",
@@ -933,10 +933,10 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/FCOM%3AFCOMP%3AFCOMPP.html"
             };
 
-        case "FUCOMIP":
-        case "FUCOMI":
-        case "FCOMIP":
         case "FCOMI":
+        case "FCOMIP":
+        case "FUCOMI":
+        case "FUCOMIP":
             return {
                 "html": "<p>Performs an unordered comparison of the contents of registers ST(0) and ST(i) and sets the status flags ZF, PF, and CF in the EFLAGS register according to the results (see the table below). The sign of zero is ignored for comparisons, so that \u20130.0 is equal to +0.0.</p><p>An unordered comparison checks the class of the numbers being compared (see \u201cFXAM\u2014Examine Floating-Point\u201d in this chapter). The FUCOMI/FUCOMIP instructions perform the same operations as the FCOMI/FCOMIP instructions. The only difference is that the FUCOMI/FUCOMIP instructions raise the invalid-arithmetic-operand exception (#IA) only when either or both operands are an SNaN or are in an unsupported format; QNaNs cause the condition code flags to be set to unordered, but do not cause an exception to be generated. The FCOMI/FCOMIP instructions raise an invalid-operation exception when either or both of the operands are a NaN value of any kind or are in an unsupported format.</p><p>If the operation results in an invalid-arithmetic-operand exception being raised, the status flags in the EFLAGS register are set only if the exception is masked.</p><p>The FCOMI/FCOMIP and FUCOMI/FUCOMIP instructions set the OF, SF and AF flags to zero in the EFLAGS register (regardless of whether an invalid-operation exception is detected).</p><p>The FCOMIP and FUCOMIP instructions also pop the register stack following the comparison operation. To pop the register stack, the processor marks the ST(0) register as empty and increments the stack pointer (TOP) by 1.</p>",
                 "tooltip": "Performs an unordered comparison of the contents of registers ST(0) and ST(i) and sets the status flags ZF, PF, and CF in the EFLAGS register according to the results (see the table below). The sign of zero is ignored for comparisons, so that \u20130.0 is equal to +0.0.",
@@ -966,8 +966,8 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/FDIV%3AFDIVP%3AFIDIV.html"
             };
 
-        case "FDIVRP":
         case "FDIVR":
+        case "FDIVRP":
         case "FIDIVR":
             return {
                 "html": "<p>Divides the source operand by the destination operand and stores the result in the destination location. The destination operand (divisor) is always in an FPU register; the source operand (dividend) can be a register or a memory location. Source operands in memory can be in single-precision or double-precision floating-point format, word or doubleword integer format.</p><p>These instructions perform the reverse operations of the FDIV, FDIVP, and FIDIV instructions. They are provided to support more efficient coding.</p><p>The no-operand version of the instruction divides the contents of the ST(0) register by the contents of the ST(1) register. The one-operand version divides the contents of a memory location (either a floating-point or an integer value) by the contents of the ST(0) register. The two-operand version, divides the contents of the ST(i) register by the contents of the ST(0) register or vice versa.</p><p>The FDIVRP instructions perform the additional operation of popping the FPU register stack after storing the result. To pop the register stack, the processor marks the ST(0) register as empty and increments the stack pointer (TOP) by 1. The no-operand version of the floating-point divide instructions always results in the register stack being popped. In some assemblers, the mnemonic for this instruction is FDIVR rather than FDIVRP.</p><p>The FIDIVR instructions convert an integer source operand to double extended-precision floating-point format before performing the division.</p>",
@@ -1004,8 +1004,8 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/FINCSTP.html"
             };
 
-        case "FNINIT":
         case "FINIT":
+        case "FNINIT":
             return {
                 "html": "<p>Sets the FPU control, status, tag, instruction pointer, and data pointer registers to their default states. The FPU control word is set to 037FH (round to nearest, all exceptions masked, 64-bit precision). The status word is cleared (no exception flags set, TOP is set to 0). The data registers in the register stack are left unchanged, but they are all tagged as empty (11B). Both the instruction and data pointers are cleared.</p><p>The FINIT instruction checks for and handles any pending unmasked floating-point exceptions before performing the initialization; the FNINIT instruction does not.</p><p>The assembler issues two instructions for the FINIT instruction (an FWAIT instruction followed by an FNINIT instruction), and the processor executes each of these instructions in separately. If an exception is generated for either of these instructions, the save EIP points to the instruction that caused the exception.</p><p>This instruction\u2019s operation is the same in non-64-bit modes and 64-bit mode.</p><p>When operating a Pentium or Intel486 processor in MS-DOS compatibility mode, it is possible (under unusual circumstances) for an FNINIT instruction to be interrupted prior to being executed to handle a pending FPU exception. See the section titled \u201cNo-Wait FPU Instructions Can Get FPU Interrupt in Window\u201d in Appendix D of the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>, for a description of these circumstances. An FNINIT instruction cannot be interrupted in this way on later Intel processors, except for the Intel Quark<sup>TM</sup> X1000 processor.</p>",
                 "tooltip": "Sets the FPU control, status, tag, instruction pointer, and data pointer registers to their default states. The FPU control word is set to 037FH (round to nearest, all exceptions masked, 64-bit precision). The status word is cleared (no exception flags set, TOP is set to 0). The data registers in the register stack are left unchanged, but they are all tagged as empty (11B). Both the instruction and data pointers are cleared.",
@@ -1034,13 +1034,13 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/FLD.html"
             };
 
-        case "FLDPI":
-        case "FLDLN2":
         case "FLD1":
         case "FLDL2E":
         case "FLDL2T":
-        case "FLDZ":
         case "FLDLG2":
+        case "FLDLN2":
+        case "FLDPI":
+        case "FLDZ":
             return {
                 "html": "<p>Push one of seven commonly used constants (in double extended-precision floating-point format) onto the FPU register stack. The constants that can be loaded with these instructions include +1.0, +0.0, log<sub>2</sub>10, log<sub>2</sub>e, \u03c0, log<sub>10</sub>2, and log<sub>e</sub>2. For each constant, an internal 66-bit constant is rounded (as specified by the RC field in the FPU control word) to double extended-precision floating-point format. The inexact-result exception (#P) is not generated as a result of the rounding, nor is the C1 flag set in the x87 FPU status word if the value is rounded up.</p><p>See the section titled \u201cApproximation of Pi\u201d in Chapter 8 of the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>, for a description of the \u03c0 constant.</p><p>This instruction\u2019s operation is the same in non-64-bit modes and 64-bit mode.</p><p>When the RC field is set to round-to-nearest, the FPU produces the same constants that is produced by the Intel 8087 and Intel 287 math coprocessors.</p>",
                 "tooltip": "Push one of seven commonly used constants (in double extended-precision floating-point format) onto the FPU register stack. The constants that can be loaded with these instructions include +1.0, +0.0, log210, log2e, \u03c0, log102, and loge2. For each constant, an internal 66-bit constant is rounded (as specified by the RC field in the FPU control word) to double extended-precision floating-point format. The inexact-result exception (#P) is not generated as a result of the rounding, nor is the C1 flag set in the x87 FPU status word if the value is rounded up.",
@@ -1061,8 +1061,8 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/FLDENV.html"
             };
 
-        case "FMUL":
         case "FIMUL":
+        case "FMUL":
         case "FMULP":
             return {
                 "html": "<p>Multiplies the destination and source operands and stores the product in the destination location. The destination operand is always an FPU data register; the source operand can be an FPU data register or a memory location. Source operands in memory can be in single-precision or double-precision floating-point format or in word or doubleword integer format.</p><p>The no-operand version of the instruction multiplies the contents of the ST(1) register by the contents of the ST(0) register and stores the product in the ST(1) register. The one-operand version multiplies the contents of the ST(0) register by the contents of a memory location (either a floating point or an integer value) and stores the product in the ST(0) register. The two-operand version, multiplies the contents of the ST(0) register by the contents of the ST(i) register, or vice versa, with the result being stored in the register specified with the first operand (the destination operand).</p><p>The FMULP instructions perform the additional operation of popping the FPU register stack after storing the product. To pop the register stack, the processor marks the ST(0) register as empty and increments the stack pointer (TOP) by 1. The no-operand version of the floating-point multiply instructions always results in the register stack being popped. In some assemblers, the mnemonic for this instruction is FMUL rather than FMULP.</p><p>The FIMUL instructions convert an integer source operand to double extended-precision floating-point format before performing the multiplication.</p><p>The sign of the result is always the exclusive-OR of the source signs, even if one or more of the values being multiplied is 0 or \u221e. When the source operand is an integer 0, it is treated as a +0.</p>",
@@ -1155,24 +1155,24 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/FSQRT.html"
             };
 
-        case "FSTP":
         case "FST":
+        case "FSTP":
             return {
                 "html": "<p>The FST instruction copies the value in the ST(0) register to the destination operand, which can be a memory location or another register in the FPU register stack. When storing the value in memory, the value is converted to single-precision or double-precision floating-point format.</p><p>The FSTP instruction performs the same operation as the FST instruction and then pops the register stack. To pop the register stack, the processor marks the ST(0) register as empty and increments the stack pointer (TOP) by 1. The FSTP instruction can also store values in memory in double extended-precision floating-point format.</p><p>If the destination operand is a memory location, the operand specifies the address where the first byte of the destination value is to be stored. If the destination operand is a register, the operand specifies a register in the register stack relative to the top of the stack.</p><p>If the destination size is single-precision or double-precision, the significand of the value being stored is rounded to the width of the destination (according to the rounding mode specified by the RC field of the FPU control word), and the exponent is converted to the width and bias of the destination format. If the value being stored is too large for the destination format, a numeric overflow exception (#O) is generated and, if the exception is unmasked, no value is stored in the destination operand. If the value being stored is a denormal value, the denormal exception (#D) is not generated. This condition is simply signaled as a numeric underflow exception (#U) condition.</p><p>If the value being stored is \u00b10, \u00b1\u221e, or a NaN, the least-significant bits of the significand and the exponent are truncated to fit the destination format. This operation preserves the value\u2019s identity as a 0, \u221e, or NaN.</p>",
                 "tooltip": "The FST instruction copies the value in the ST(0) register to the destination operand, which can be a memory location or another register in the FPU register stack. When storing the value in memory, the value is converted to single-precision or double-precision floating-point format.",
                 "url": "http://www.felixcloutier.com/x86/FST%3AFSTP.html"
             };
 
-        case "FSTCW":
         case "FNSTCW":
+        case "FSTCW":
             return {
                 "html": "<p>Stores the current value of the FPU control word at the specified destination in memory. The FSTCW instruction checks for and handles pending unmasked floating-point exceptions before storing the control word; the FNSTCW instruction does not.</p><p>The assembler issues two instructions for the FSTCW instruction (an FWAIT instruction followed by an FNSTCW instruction), and the processor executes each of these instructions in separately. If an exception is generated for either of these instructions, the save EIP points to the instruction that caused the exception.</p><p>This instruction\u2019s operation is the same in non-64-bit modes and 64-bit mode.</p><p>When operating a Pentium or Intel486 processor in MS-DOS compatibility mode, it is possible (under unusual circumstances) for an FNSTCW instruction to be interrupted prior to being executed to handle a pending FPU exception. See the section titled \u201cNo-Wait FPU Instructions Can Get FPU Interrupt in Window\u201d in Appendix D of the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>, for a description of these circumstances. An FNSTCW instruction cannot be interrupted in this way on later Intel processors, except for the Intel Quark<sup>TM</sup> X1000 processor.</p>",
                 "tooltip": "Stores the current value of the FPU control word at the specified destination in memory. The FSTCW instruction checks for and handles pending unmasked floating-point exceptions before storing the control word; the FNSTCW instruction does not.",
                 "url": "http://www.felixcloutier.com/x86/FSTCW%3AFNSTCW.html"
             };
 
-        case "FSTENV":
         case "FNSTENV":
+        case "FSTENV":
             return {
                 "html": "<p>Saves the current FPU operating environment at the memory location specified with the destination operand, and then masks all floating-point exceptions. The FPU operating environment consists of the FPU control word, status word, tag word, instruction pointer, data pointer, and last opcode. Figures 8-9 through 8-12 in the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>, show the layout in memory of the stored environment, depending on the operating mode of the processor (protected or real) and the current operand-size attribute (16-bit or 32-bit). In virtual-8086 mode, the real mode layouts are used.</p><p>The FSTENV instruction checks for and handles any pending unmasked floating-point exceptions before storing the FPU environment; the FNSTENV instruction does not. The saved image reflects the state of the FPU after all floating-point instructions preceding the FSTENV/FNSTENV instruction in the instruction stream have been executed.</p><p>These instructions are often used by exception handlers because they provide access to the FPU instruction and data pointers. The environment is typically saved in the stack. Masking all exceptions after saving the environment prevents floating-point exceptions from interrupting the exception handler.</p><p>The assembler issues two instructions for the FSTENV instruction (an FWAIT instruction followed by an FNSTENV instruction), and the processor executes each of these instructions separately. If an exception is generated for either of these instructions, the save EIP points to the instruction that caused the exception.</p><p>This instruction\u2019s operation is the same in non-64-bit modes and 64-bit mode.</p>",
                 "tooltip": "Saves the current FPU operating environment at the memory location specified with the destination operand, and then masks all floating-point exceptions. The FPU operating environment consists of the FPU control word, status word, tag word, instruction pointer, data pointer, and last opcode. Figures 8-9 through 8-12 in the Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1, show the layout in memory of the stored environment, depending on the operating mode of the processor (protected or real) and the current operand-size attribute (16-bit or 32-bit). In virtual-8086 mode, the real mode layouts are used.",
@@ -1188,17 +1188,17 @@ export function getAsmOpcode(opcode) {
             };
 
         case "FISUB":
-        case "FSUBP":
         case "FSUB":
+        case "FSUBP":
             return {
                 "html": "<p>Subtracts the source operand from the destination operand and stores the difference in the destination location. The destination operand is always an FPU data register; the source operand can be a register or a memory location. Source operands in memory can be in single-precision or double-precision floating-point format or in word or doubleword integer format.</p><p>The no-operand version of the instruction subtracts the contents of the ST(0) register from the ST(1) register and stores the result in ST(1). The one-operand version subtracts the contents of a memory location (either a floating-point or an integer value) from the contents of the ST(0) register and stores the result in ST(0). The two-operand version, subtracts the contents of the ST(0) register from the ST(i) register or vice versa.</p><p>The FSUBP instructions perform the additional operation of popping the FPU register stack following the subtraction. To pop the register stack, the processor marks the ST(0) register as empty and increments the stack pointer (TOP) by 1. The no-operand version of the floating-point subtract instructions always results in the register stack being popped. In some assemblers, the mnemonic for this instruction is FSUB rather than FSUBP.</p><p>The FISUB instructions convert an integer source operand to double extended-precision floating-point format before performing the subtraction.</p><p><a href=\"http://www.felixcloutier.com/x86/FSUB:FSUBP:FISUB.html#tbl-3-38\" rel=\"noreferrer noopener\" target=\"_blank\">Table 3-38</a> shows the results obtained when subtracting various classes of numbers from one another, assuming that neither overflow nor underflow occurs. Here, the SRC value is subtracted from the DEST value (DEST \u2212 SRC = result).</p>",
                 "tooltip": "Subtracts the source operand from the destination operand and stores the difference in the destination location. The destination operand is always an FPU data register; the source operand can be a register or a memory location. Source operands in memory can be in single-precision or double-precision floating-point format or in word or doubleword integer format.",
                 "url": "http://www.felixcloutier.com/x86/FSUB%3AFSUBP%3AFISUB.html"
             };
 
+        case "FISUBR":
         case "FSUBR":
         case "FSUBRP":
-        case "FISUBR":
             return {
                 "html": "<p>Subtracts the destination operand from the source operand and stores the difference in the destination location. The destination operand is always an FPU register; the source operand can be a register or a memory location. Source operands in memory can be in single-precision or double-precision floating-point format or in word or doubleword integer format.</p><p>These instructions perform the reverse operations of the FSUB, FSUBP, and FISUB instructions. They are provided to support more efficient coding.</p><p>The no-operand version of the instruction subtracts the contents of the ST(1) register from the ST(0) register and stores the result in ST(1). The one-operand version subtracts the contents of the ST(0) register from the contents of a memory location (either a floating-point or an integer value) and stores the result in ST(0). The two-operand version, subtracts the contents of the ST(i) register from the ST(0) register or vice versa.</p><p>The FSUBRP instructions perform the additional operation of popping the FPU register stack following the subtraction. To pop the register stack, the processor marks the ST(0) register as empty and increments the stack pointer (TOP) by 1. The no-operand version of the floating-point reverse subtract instructions always results in the register stack being popped. In some assemblers, the mnemonic for this instruction is FSUBR rather than FSUBRP.</p><p>The FISUBR instructions convert an integer source operand to double extended-precision floating-point format before performing the subtraction.</p>",
                 "tooltip": "Subtracts the destination operand from the source operand and stores the difference in the destination location. The destination operand is always an FPU register; the source operand can be a register or a memory location. Source operands in memory can be in single-precision or double-precision floating-point format or in word or doubleword integer format.",
@@ -1278,8 +1278,8 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/HADDPD.html"
             };
 
-        case "VHADDPS":
         case "HADDPS":
+        case "VHADDPS":
             return {
                 "html": "<p>Adds the single-precision floating-point values in the first and second dwords of the destination operand and stores the result in the first dword of the destination operand.</p><p>Adds single-precision floating-point values in the third and fourth dword of the destination operand and stores the result in the second dword of the destination operand.</p><p>Adds single-precision floating-point values in the first and second dword of the source operand and stores the result in the third dword of the destination operand.</p><p>Adds single-precision floating-point values in the third and fourth dword of the source operand and stores the result in the fourth dword of the destination operand.</p><p>In 64-bit mode, use of the REX.R prefix permits this instruction to access additional registers (XMM8-XMM15).</p>",
                 "tooltip": "Adds the single-precision floating-point values in the first and second dwords of the destination operand and stores the result in the first dword of the destination operand.",
@@ -1293,8 +1293,8 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/HLT.html"
             };
 
-        case "VHSUBPD":
         case "HSUBPD":
+        case "VHSUBPD":
             return {
                 "html": "<p>The HSUBPD instruction subtracts horizontally the packed DP FP numbers of both operands.</p><p>Subtracts the double-precision floating-point value in the high quadword of the destination operand from the low quadword of the destination operand and stores the result in the low quadword of the destination operand.</p><p>Subtracts the double-precision floating-point value in the high quadword of the source operand from the low quadword of the source operand and stores the result in the high quadword of the destination operand.</p><p>In 64-bit mode, use of the REX.R prefix permits this instruction to access additional registers (XMM8-XMM15).</p><p>See <a href=\"http://www.felixcloutier.com/x86/HSUBPD.html#fig-3-20\" rel=\"noreferrer noopener\" target=\"_blank\">Figure 3-20</a> for HSUBPD; see <a href=\"http://www.felixcloutier.com/x86/HSUBPD.html#fig-3-21\" rel=\"noreferrer noopener\" target=\"_blank\">Figure 3-21</a> for VHSUBPD.</p>",
                 "tooltip": "The HSUBPD instruction subtracts horizontally the packed DP FP numbers of both operands.",
@@ -1337,10 +1337,10 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/INC.html"
             };
 
-        case "INSW":
         case "INS":
         case "INSB":
         case "INSD":
+        case "INSW":
             return {
                 "html": "<p>Copies the data from the I/O port specified with the source operand (second operand) to the destination operand (first operand). The source operand is an I/O port address (from 0 to 65,535) that is read from the DX register. The destination operand is a memory location, the address of which is read from either the ES:DI, ES:EDI or the RDI registers (depending on the address-size attribute of the instruction, 16, 32 or 64, respectively). (The ES segment cannot be overridden with a segment override prefix.) The size of the I/O port being accessed (that is, the size of the source and destination operands) is determined by the opcode for an 8-bit I/O port or by the operand-size attribute of the instruction for a 16- or 32-bit I/O port.</p><p>At the assembly-code level, two forms of this instruction are allowed: the \u201cexplicit-operands\u201d form and the \u201cno-operands\u201d form. The explicit-operands form (specified with the INS mnemonic) allows the source and destination operands to be specified explicitly. Here, the source operand must be \u201cDX,\u201d and the destination operand should be a symbol that indicates the size of the I/O port and the destination address. This explicit-operands form is provided to allow documentation; however, note that the documentation provided by this form can be misleading. That is, the destination operand symbol must specify the correct <strong>type</strong> (size) of the operand (byte, word, or doubleword), but it does not have to specify the correct <strong>location</strong>. The location is always specified by the ES:(E)DI registers, which must be loaded correctly before the INS instruction is executed.</p><p>The no-operands form provides \u201cshort forms\u201d of the byte, word, and doubleword versions of the INS instructions. Here also DX is assumed by the processor to be the source operand and ES:(E)DI is assumed to be the destination operand. The size of the I/O port is specified with the choice of mnemonic: INSB (byte), INSW (word), or INSD (doubleword).</p><p>After the byte, word, or doubleword is transfer from the I/O port to the memory location, the DI/EDI/RDI register is incremented or decremented automatically according to the setting of the DF flag in the EFLAGS register. (If the DF flag is 0, the (E)DI register is incremented; if the DF flag is 1, the (E)DI register is decremented.) The (E)DI register is incremented or decremented by 1 for byte operations, by 2 for word operations, or by 4 for doubleword operations.</p><p>The INS, INSB, INSW, and INSD instructions can be preceded by the REP prefix for block input of ECX bytes, words, or doublewords. See \u201cREP/REPE/REPZ /REPNE/REPNZ\u2014Repeat String Operation Prefix\u201d in Chapter 4 of the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 2B</em>, for a description of the REP prefix.</p>",
                 "tooltip": "Copies the data from the I/O port specified with the source operand (second operand) to the destination operand (first operand). The source operand is an I/O port address (from 0 to 65,535) that is read from the DX register. The destination operand is a memory location, the address of which is read from either the ES:DI, ES:EDI or the RDI registers (depending on the address-size attribute of the instruction, 16, 32 or 64, respectively). (The ES segment cannot be overridden with a segment override prefix.) The size of the I/O port being accessed (that is, the size of the source and destination operands) is determined by the opcode for an 8-bit I/O port or by the operand-size attribute of the instruction for a 16- or 32-bit I/O port.",
@@ -1355,10 +1355,10 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/INSERTPS.html"
             };
 
-        case "INTO":
         case "INT":
         case "INT1":
         case "INT3":
+        case "INTO":
             return {
                 "html": "<p>The INT <em>n</em> instruction generates a call to the interrupt or exception handler specified with the destination operand (see the section titled \u201cInterrupts and Exceptions\u201d in Chapter 6 of the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>). The destination operand specifies a vector from 0 to 255, encoded as an 8-bit unsigned intermediate value. Each vector provides an index to a gate descriptor in the IDT. The first 32 vectors are reserved by Intel for system use. Some of these vectors are used for internally generated exceptions.</p><p>The INT <em>n</em> instruction is the general mnemonic for executing a software-generated call to an interrupt handler. The INTO instruction is a special mnemonic for calling overflow exception (#OF), exception 4. The overflow interrupt checks the OF flag in the EFLAGS register and calls the overflow interrupt handler if the OF flag is set to 1. (The INTO instruction cannot be used in 64-bit mode.)</p><p>The INT3 instruction uses a one-byte opcode (CC) and is intended for calling the debug exception handler with a breakpoint exception (#BP). (This one-byte form is useful because it can replace the first byte of any instruction at which a breakpoint is desired, including other one-byte instructions, without overwriting other instructions.)</p><p>The INT1 instruction also uses a one-byte opcode (F1) and generates a debug exception (#DB) without setting any bits in DR6.<sup>1</sup> Hardware vendors may use the INT1 instruction for hardware debug. For that reason, Intel recommends software vendors instead use the INT3 instruction for software breakpoints.</p><p>An interrupt generated by the INTO, INT3, or INT1 instruction differs from one generated by INT <em>n</em> in the following ways:</p>",
                 "tooltip": "The INT n instruction generates a call to the interrupt or exception handler specified with the destination operand (see the section titled \u201cInterrupts and Exceptions\u201d in Chapter 6 of the Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1). The destination operand specifies a vector from 0 to 255, encoded as an 8-bit unsigned intermediate value. Each vector provides an index to a gate descriptor in the IDT. The first 32 vectors are reserved by Intel for system use. Some of these vectors are used for internally generated exceptions.",
@@ -1379,9 +1379,9 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/INVPCID.html"
             };
 
-        case "IRETQ":
-        case "IRETD":
         case "IRET":
+        case "IRETD":
+        case "IRETQ":
             return {
                 "html": "<p>Returns program control from an exception or interrupt handler to a program or procedure that was interrupted by an exception, an external interrupt, or a software-generated interrupt. These instructions are also used to perform a return from a nested task. (A nested task is created when a CALL instruction is used to initiate a task switch or when an interrupt or exception causes a task switch to an interrupt or exception handler.) See the section titled \u201cTask Linking\u201d in Chapter 7 of the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 3A</em>.</p><p>IRET and IRETD are mnemonics for the same opcode. The IRETD mnemonic (interrupt return double) is intended for use when returning from an interrupt when using the 32-bit operand size; however, most assemblers use the IRET mnemonic interchangeably for both operand sizes.</p><p>In Real-Address Mode, the IRET instruction preforms a far return to the interrupted program or procedure. During this operation, the processor pops the return instruction pointer, return code segment selector, and EFLAGS image from the stack to the EIP, CS, and EFLAGS registers, respectively, and then resumes execution of the interrupted program or procedure.</p><p>In Protected Mode, the action of the IRET instruction depends on the settings of the NT (nested task) and VM flags in the EFLAGS register and the VM flag in the EFLAGS image stored on the current stack. Depending on the setting of these flags, the processor performs the following types of interrupt returns:</p><p>If the NT flag (EFLAGS register) is cleared, the IRET instruction performs a far return from the interrupt procedure, without a task switch. The code segment being returned to must be equally or less privileged than the interrupt handler routine (as indicated by the RPL field of the code segment selector popped from the stack).</p>",
                 "tooltip": "Returns program control from an exception or interrupt handler to a program or procedure that was interrupted by an exception, an external interrupt, or a software-generated interrupt. These instructions are also used to perform a return from a nested task. (A nested task is created when a CALL instruction is used to initiate a task switch or when an interrupt or exception causes a task switch to an interrupt or exception handler.) See the section titled \u201cTask Linking\u201d in Chapter 7 of the Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 3A.",
@@ -1395,88 +1395,88 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/JMP.html"
             };
 
+        case "JA":
+        case "JAE":
+        case "JB":
+        case "JBE":
+        case "JC":
         case "JCXZ":
-        case "JPE":
-        case "JNZ":
+        case "JE":
+        case "JECXZ":
         case "JG":
+        case "JGE":
         case "JL":
-        case "JAE":
-        case "JNGE":
-        case "JNO":
-        case "JNS":
-        case "JA":
-        case "JE":
+        case "JLE":
         case "JNA":
-        case "JNG":
-        case "JZ":
+        case "JNAE":
+        case "JNB":
         case "JNBE":
+        case "JNC":
+        case "JNE":
+        case "JNG":
+        case "JNGE":
+        case "JNL":
         case "JNLE":
-        case "JECXZ":
-        case "JGE":
+        case "JNO":
         case "JNP":
+        case "JNS":
+        case "JNZ":
+        case "JO":
+        case "JP":
+        case "JPE":
+        case "JPO":
         case "JRCXZ":
-        case "JNAE":
-        case "JC":
         case "JS":
-        case "JNB":
-        case "JNL":
-        case "JB":
-        case "JPO":
-        case "JP":
-        case "JNC":
-        case "JO":
-        case "JNE":
-        case "JLE":
-        case "JBE":
+        case "JZ":
             return {
                 "html": "<p>Checks the state of one or more of the status flags in the EFLAGS register (CF, OF, PF, SF, and ZF) and, if the flags are in the specified state (condition), performs a jump to the target instruction specified by the destination operand. A condition code (<em>cc</em>) is associated with each instruction to indicate the condition being tested for. If the condition is not satisfied, the jump is not performed and execution continues with the instruction following the J<em>cc</em> instruction.</p><p>The target instruction is specified with a relative offset (a signed offset relative to the current value of the instruction pointer in the EIP register). A relative offset (<em>rel8</em>, <em>rel16,</em> or <em>rel32</em>) is generally specified as a label in assembly code, but at the machine code level, it is encoded as a signed, 8-bit or 32-bit immediate value, which is added to the instruction pointer. Instruction coding is most efficient for offsets of \u2013128 to +127. If the operand-size attribute is 16, the upper two bytes of the EIP register are cleared, resulting in a maximum instruction pointer size of 16 bits.</p><p>The conditions for each J<em>cc</em> mnemonic are given in the \u201cDescription\u201d column of the table on the preceding page. The terms \u201cless\u201d and \u201cgreater\u201d are used for comparisons of signed integers and the terms \u201cabove\u201d and \u201cbelow\u201d are used for unsigned integers.</p><p>Because a particular state of the status flags can sometimes be interpreted in two ways, two mnemonics are defined for some opcodes. For example, the JA (jump if above) instruction and the JNBE (jump if not below or equal) instruction are alternate mnemonics for the opcode 77H.</p><p>The J<em>cc</em> instruction does not support far jumps (jumps to other code segments). When the target for the conditional jump is in a different segment, use the opposite condition from the condition being tested for the J<em>cc</em> instruction, and then access the target with an unconditional far jump (JMP instruction) to the other segment. For example, the following conditional far jump is illegal:</p>",
                 "tooltip": "Checks the state of one or more of the status flags in the EFLAGS register (CF, OF, PF, SF, and ZF) and, if the flags are in the specified state (condition), performs a jump to the target instruction specified by the destination operand. A condition code (cc) is associated with each instruction to indicate the condition being tested for. If the condition is not satisfied, the jump is not performed and execution continues with the instruction following the Jcc instruction.",
                 "url": "http://www.felixcloutier.com/x86/Jcc.html"
             };
 
-        case "KADDW":
+        case "KADDB":
         case "KADDD":
         case "KADDQ":
-        case "KADDB":
+        case "KADDW":
             return {
                 "html": "<p>Adds the vector mask k2 and the vector mask k3, and writes the result into vector mask k1.</p>",
                 "tooltip": "Adds the vector mask k2 and the vector mask k3, and writes the result into vector mask k1.",
                 "url": "http://www.felixcloutier.com/x86/KADDW%3AKADDB%3AKADDQ%3AKADDD.html"
             };
 
+        case "KANDNB":
+        case "KANDND":
         case "KANDNQ":
         case "KANDNW":
-        case "KANDND":
-        case "KANDNB":
             return {
                 "html": "<p>Performs a bitwise AND NOT between the vector mask k2 and the vector mask k3, and writes the result into vector mask k1.</p>",
                 "tooltip": "Performs a bitwise AND NOT between the vector mask k2 and the vector mask k3, and writes the result into vector mask k1.",
                 "url": "http://www.felixcloutier.com/x86/KANDNW%3AKANDNB%3AKANDNQ%3AKANDND.html"
             };
 
-        case "KANDW":
+        case "KANDB":
         case "KANDD":
         case "KANDQ":
-        case "KANDB":
+        case "KANDW":
             return {
                 "html": "<p>Performs a bitwise AND between the vector mask k2 and the vector mask k3, and writes the result into vector mask k1.</p>",
                 "tooltip": "Performs a bitwise AND between the vector mask k2 and the vector mask k3, and writes the result into vector mask k1.",
                 "url": "http://www.felixcloutier.com/x86/KANDW%3AKANDB%3AKANDQ%3AKANDD.html"
             };
 
+        case "KMOVB":
         case "KMOVD":
         case "KMOVQ":
         case "KMOVW":
-        case "KMOVB":
             return {
                 "html": "<p>Copies values from the source operand (second operand) to the destination operand (first operand). The source and destination operands can be mask registers, memory location or general purpose. The instruction cannot be used to transfer data between general purpose registers and or memory locations.</p><p>When moving to a mask register, the result is zero extended to MAX_KL size (i.e., 64 bits currently). When moving to a general-purpose register (GPR), the result is zero-extended to the size of the destination. In 32-bit mode, the default GPR destination\u2019s size is 32 bits. In 64-bit mode, the default GPR destination\u2019s size is 64 bits. Note that VEX.W can only be used to modify the size of the GPR operand in 64b mode.</p>",
                 "tooltip": "Copies values from the source operand (second operand) to the destination operand (first operand). The source and destination operands can be mask registers, memory location or general purpose. The instruction cannot be used to transfer data between general purpose registers and or memory locations.",
                 "url": "http://www.felixcloutier.com/x86/KMOVW%3AKMOVB%3AKMOVQ%3AKMOVD.html"
             };
 
-        case "KNOTQ":
         case "KNOTB":
         case "KNOTD":
+        case "KNOTQ":
         case "KNOTW":
             return {
                 "html": "<p>Performs a bitwise NOT of vector mask k2 and writes the result into vector mask k1.</p>",
@@ -1484,18 +1484,18 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/KNOTW%3AKNOTB%3AKNOTQ%3AKNOTD.html"
             };
 
-        case "KORTESTQ":
+        case "KORTESTB":
         case "KORTESTD":
+        case "KORTESTQ":
         case "KORTESTW":
-        case "KORTESTB":
             return {
                 "html": "<p>Performs a bitwise OR between the vector mask register k2, and the vector mask register k1, and sets CF and ZF based on the operation result.</p><p>ZF flag is set if both sources are 0x0. CF is set if, after the OR operation is done, the operation result is all 1\u2019s.</p>",
                 "tooltip": "Performs a bitwise OR between the vector mask register k2, and the vector mask register k1, and sets CF and ZF based on the operation result.",
                 "url": "http://www.felixcloutier.com/x86/KORTESTW%3AKORTESTB%3AKORTESTQ%3AKORTESTD.html"
             };
 
-        case "KORD":
         case "KORB":
+        case "KORD":
         case "KORQ":
         case "KORW":
             return {
@@ -1514,20 +1514,20 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/KSHIFTLW%3AKSHIFTLB%3AKSHIFTLQ%3AKSHIFTLD.html"
             };
 
+        case "KSHIFTRB":
         case "KSHIFTRD":
-        case "KSHIFTRW":
         case "KSHIFTRQ":
-        case "KSHIFTRB":
+        case "KSHIFTRW":
             return {
                 "html": "<p>Shifts 8/16/32/64 bits in the second operand (source operand) right by the count specified in immediate and place the least significant 8/16/32/64 bits of the result in the destination operand. The higher bits of the destination are zero-extended. The destination is set to zero if the count value is greater than 7 (for byte shift), 15 (for word shift), 31 (for doubleword shift) or 63 (for quadword shift).</p>",
                 "tooltip": "Shifts 8/16/32/64 bits in the second operand (source operand) right by the count specified in immediate and place the least significant 8/16/32/64 bits of the result in the destination operand. The higher bits of the destination are zero-extended. The destination is set to zero if the count value is greater than 7 (for byte shift), 15 (for word shift), 31 (for doubleword shift) or 63 (for quadword shift).",
                 "url": "http://www.felixcloutier.com/x86/KSHIFTRW%3AKSHIFTRB%3AKSHIFTRQ%3AKSHIFTRD.html"
             };
 
-        case "KTESTW":
+        case "KTESTB":
         case "KTESTD":
         case "KTESTQ":
-        case "KTESTB":
+        case "KTESTW":
             return {
                 "html": "<p>Performs a bitwise comparison of the bits of the first source operand and corresponding bits in the second source operand. If the AND operation produces all zeros, the ZF is set else the ZF is clear. If the bitwise AND operation of the inverted first source operand with the second source operand produces all zeros the CF is set else the CF is clear. Only the EFLAGS register is updated.</p><p>Note: In VEX-encoded versions, VEX.vvvv is reserved and must be 1111b, otherwise instructions will #UD.</p>",
                 "tooltip": "Performs a bitwise comparison of the bits of the first source operand and corresponding bits in the second source operand. If the AND operation produces all zeros, the ZF is set else the ZF is clear. If the bitwise AND operation of the inverted first source operand with the second source operand produces all zeros the CF is set else the CF is clear. Only the EFLAGS register is updated.",
@@ -1535,8 +1535,8 @@ export function getAsmOpcode(opcode) {
             };
 
         case "KUNPCKBW":
-        case "KUNPCKWD":
         case "KUNPCKDQ":
+        case "KUNPCKWD":
             return {
                 "html": "<p>Unpacks the lower 8/16/32 bits of the second and third operands (source operands) into the low part of the first operand (destination operand), starting from the low bytes. The result is zero-extended in the destination.</p>",
                 "tooltip": "Unpacks the lower 8/16/32 bits of the second and third operands (source operands) into the low part of the first operand (destination operand), starting from the low bytes. The result is zero-extended in the destination.",
@@ -1553,10 +1553,10 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/KXNORW%3AKXNORB%3AKXNORQ%3AKXNORD.html"
             };
 
-        case "KXORW":
         case "KXORB":
-        case "KXORQ":
         case "KXORD":
+        case "KXORQ":
+        case "KXORW":
             return {
                 "html": "<p>Performs a bitwise XOR between the vector mask k2 and the vector mask k3, and writes the result into vector mask k1 (three-operand form).</p>",
                 "tooltip": "Performs a bitwise XOR between the vector mask k2 and the vector mask k3, and writes the result into vector mask k1 (three-operand form).",
@@ -1586,8 +1586,8 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/LDMXCSR.html"
             };
 
-        case "LES":
         case "LDS":
+        case "LES":
         case "LFS":
         case "LGS":
         case "LSS":
@@ -1647,20 +1647,20 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/LOCK.html"
             };
 
-        case "LODSW":
-        case "LODSQ":
-        case "LODSD":
         case "LODS":
         case "LODSB":
+        case "LODSD":
+        case "LODSQ":
+        case "LODSW":
             return {
                 "html": "<p>Loads a byte, word, or doubleword from the source operand into the AL, AX, or EAX register, respectively. The source operand is a memory location, the address of which is read from the DS:ESI or the DS:SI registers (depending on the address-size attribute of the instruction, 32 or 16, respectively). The DS segment may be overridden with a segment override prefix.</p><p>At the assembly-code level, two forms of this instruction are allowed: the \u201cexplicit-operands\u201d form and the \u201cno-operands\u201d form. The explicit-operands form (specified with the LODS mnemonic) allows the source operand to be specified explicitly. Here, the source operand should be a symbol that indicates the size and location of the source value. The destination operand is then automatically selected to match the size of the source operand (the AL register for byte operands, AX for word operands, and EAX for doubleword operands). This explicit-operands form is provided to allow documentation; however, note that the documentation provided by this form can be misleading. That is, the source operand symbol must specify the correct <strong>type</strong> (size) of the operand (byte, word, or doubleword), but it does not have to specify the correct <strong>location</strong>. The location is always specified by the DS:(E)SI registers, which must be loaded correctly before the load string instruction is executed.</p><p>The no-operands form provides \u201cshort forms\u201d of the byte, word, and doubleword versions of the LODS instructions. Here also DS:(E)SI is assumed to be the source operand and the AL, AX, or EAX register is assumed to be the destination operand. The size of the source and destination operands is selected with the mnemonic: LODSB (byte loaded into register AL), LODSW (word loaded into AX), or LODSD (doubleword loaded into EAX).</p><p>After the byte, word, or doubleword is transferred from the memory location into the AL, AX, or EAX register, the (E)SI register is incremented or decremented automatically according to the setting of the DF flag in the EFLAGS register. (If the DF flag is 0, the (E)SI register is incremented; if the DF flag is 1, the ESI register is decremented.) The (E)SI register is incremented or decremented by 1 for byte operations, by 2 for word operations, or by 4 for doubleword operations.</p><p>In 64-bit mode, use of the REX.W prefix promotes operation to 64 bits. LODS/LODSQ load the quadword at address (R)SI into RAX. The (R)SI register is then incremented or decremented automatically according to the setting of the DF flag in the EFLAGS register.</p>",
                 "tooltip": "Loads a byte, word, or doubleword from the source operand into the AL, AX, or EAX register, respectively. The source operand is a memory location, the address of which is read from the DS:ESI or the DS:SI registers (depending on the address-size attribute of the instruction, 32 or 16, respectively). The DS segment may be overridden with a segment override prefix.",
                 "url": "http://www.felixcloutier.com/x86/LODS%3ALODSB%3ALODSW%3ALODSD%3ALODSQ.html"
             };
 
-        case "LOOPNE":
         case "LOOP":
         case "LOOPE":
+        case "LOOPNE":
             return {
                 "html": "<p>Performs a loop operation using the RCX, ECX or CX register as a counter (depending on whether address size is 64 bits, 32 bits, or 16 bits). Note that the LOOP instruction ignores REX.W; but 64-bit address size can be over-ridden using a 67H prefix.</p><p>Each time the LOOP instruction is executed, the count register is decremented, then checked for 0. If the count is 0, the loop is terminated and program execution continues with the instruction following the LOOP instruction. If the count is not zero, a near jump is performed to the destination (target) operand, which is presumably the instruction at the beginning of the loop.</p><p>The target instruction is specified with a relative offset (a signed offset relative to the current value of the instruction pointer in the IP/EIP/RIP register). This offset is generally specified as a label in assembly code, but at the machine code level, it is encoded as a signed, 8-bit immediate value, which is added to the instruction pointer. Offsets of \u2013128 to +127 are allowed with this instruction.</p><p>Some forms of the loop instruction (LOOP<em>cc</em>) also accept the ZF flag as a condition for terminating the loop before the count reaches zero. With these forms of the instruction, a condition code (<em>cc</em>) is associated with each instruction to indicate the condition being tested for. Here, the LOOP<em>cc</em> instruction itself does not affect the state of the ZF flag; the ZF flag is changed by other instructions in the loop.</p>",
                 "tooltip": "Performs a loop operation using the RCX, ECX or CX register as a counter (depending on whether address size is 64 bits, 32 bits, or 16 bits). Note that the LOOP instruction ignores REX.W; but 64-bit address size can be over-ridden using a 67H prefix.",
@@ -1688,8 +1688,8 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/LZCNT.html"
             };
 
-        case "VMASKMOVDQU":
         case "MASKMOVDQU":
+        case "VMASKMOVDQU":
             return {
                 "html": "<p>Stores selected bytes from the source operand (first operand) into an 128-bit memory location. The mask operand (second operand) selects which bytes from the source operand are written to memory. The source and mask operands are XMM registers. The memory location specified by the effective address in the DI/EDI/RDI register (the default segment register is DS, but this may be overridden with a segment-override prefix). The memory location does not need to be aligned on a natural boundary. (The size of the store address depends on the address-size attribute.)</p><p>The most significant bit in each byte of the mask operand determines whether the corresponding byte in the source operand is written to the corresponding byte location in memory: 0 indicates no write and 1 indicates write.</p><p>The MASKMOVDQU instruction generates a non-temporal hint to the processor to minimize cache pollution. The non-temporal hint is implemented by using a write combining (WC) memory type protocol (see \u201cCaching of Temporal vs. Non-Temporal Data\u201d in Chapter 10, of the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>). Because the WC protocol uses a weakly-ordered memory consistency model, a fencing operation implemented with the SFENCE or MFENCE instruction should be used in conjunction with MASKMOVDQU instructions if multiple processors might use different memory types to read/write the destination memory locations.</p><p>Behavior with a mask of all 0s is as follows:</p><p>The MASKMOVDQU instruction can be used to improve performance of algorithms that need to merge data on a byte-by-byte basis. MASKMOVDQU should not cause a read for ownership; doing so generates unnecessary bandwidth since data is to be written directly using the byte-mask without allocating old data prior to the store.</p>",
                 "tooltip": "Stores selected bytes from the source operand (first operand) into an 128-bit memory location. The mask operand (second operand) selects which bytes from the source operand are written to memory. The source and mask operands are XMM registers. The memory location specified by the effective address in the DI/EDI/RDI register (the default segment register is DS, but this may be overridden with a segment-override prefix). The memory location does not need to be aligned on a natural boundary. (The size of the store address depends on the address-size attribute.)",
@@ -1711,16 +1711,16 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/MAXPD.html"
             };
 
-        case "VMAXPS":
         case "MAXPS":
+        case "VMAXPS":
             return {
                 "html": "<p>Performs a SIMD compare of the packed single-precision floating-point values in the first source operand and the second source operand and returns the maximum value for each pair of values to the destination operand.</p><p>If the values being compared are both 0.0s (of either sign), the value in the second operand (source operand) is returned. If a value in the second operand is an SNaN, then SNaN is forwarded unchanged to the destination (that is, a QNaN version of the SNaN is not returned).</p><p>If only one value is a NaN (SNaN or QNaN) for this instruction, the second operand (source operand), either a NaN or a valid floating-point value, is written to the result. If instead of this behavior, it is required that the NaN source operand (from either the first or second operand) be returned, the action of MAXPS can be emulated using a sequence of instructions, such as, a comparison followed by AND, ANDN and OR.</p><p>EVEX encoded versions: The first source operand (the second operand) is a ZMM/YMM/XMM register. The second source operand can be a ZMM/YMM/XMM register, a 512/256/128-bit memory location or a 512/256/128-bit vector broadcasted from a 32-bit memory location. The destination operand is a ZMM/YMM/XMM register conditionally updated with writemask k1.</p><p>VEX.256 encoded version: The first source operand is a YMM register. The second source operand can be a YMM register or a 256-bit memory location. The destination operand is a YMM register. The upper bits (MAXVL-1:256) of the corresponding ZMM register destination are zeroed.</p>",
                 "tooltip": "Performs a SIMD compare of the packed single-precision floating-point values in the first source operand and the second source operand and returns the maximum value for each pair of values to the destination operand.",
                 "url": "http://www.felixcloutier.com/x86/MAXPS.html"
             };
 
-        case "VMAXSD":
         case "MAXSD":
+        case "VMAXSD":
             return {
                 "html": "<p>Compares the low double-precision floating-point values in the first source operand and the second source operand, and returns the maximum value to the low quadword of the destination operand. The second source operand can be an XMM register or a 64-bit memory location. The first source and destination operands are XMM registers. When the second source operand is a memory operand, only 64 bits are accessed.</p><p>If the values being compared are both 0.0s (of either sign), the value in the second source operand is returned. If a value in the second source operand is an SNaN, that SNaN is returned unchanged to the destination (that is, a QNaN version of the SNaN is not returned).</p><p>If only one value is a NaN (SNaN or QNaN) for this instruction, the second source operand, either a NaN or a valid floating-point value, is written to the result. If instead of this behavior, it is required that the NaN of either source operand be returned, the action of MAXSD can be emulated using a sequence of instructions, such as, a comparison followed by AND, ANDN and OR.</p><p>128-bit Legacy SSE version: The destination and first source operand are the same. Bits (MAXVL-1:64) of the corresponding destination register remain unchanged.</p><p>VEX.128 and EVEX encoded version: Bits (127:64) of the XMM register destination are copied from corresponding bits in the first source operand. Bits (MAXVL-1:128) of the destination register are zeroed.</p>",
                 "tooltip": "Compares the low double-precision floating-point values in the first source operand and the second source operand, and returns the maximum value to the low quadword of the destination operand. The second source operand can be an XMM register or a 64-bit memory location. The first source and destination operands are XMM registers. When the second source operand is a memory operand, only 64 bits are accessed.",
@@ -1759,8 +1759,8 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/MINSD.html"
             };
 
-        case "VMINSS":
         case "MINSS":
+        case "VMINSS":
             return {
                 "html": "<p>Compares the low single-precision floating-point values in the first source operand and the second source operand and returns the minimum value to the low doubleword of the destination operand.</p><p>If the values being compared are both 0.0s (of either sign), the value in the second source operand is returned. If a value in the second operand is an SNaN, that SNaN is returned unchanged to the destination (that is, a QNaN version of the SNaN is not returned).</p><p>If only one value is a NaN (SNaN or QNaN) for this instruction, the second source operand, either a NaN or a valid floating-point value, is written to the result. If instead of this behavior, it is required that the NaN in either source operand be returned, the action of MINSD can be emulated using a sequence of instructions, such as, a comparison followed by AND, ANDN and OR.</p><p>The second source operand can be an XMM register or a 32-bit memory location. The first source and destination operands are XMM registers.</p><p>128-bit Legacy SSE version: The destination and first source operand are the same. Bits (MAXVL:32) of the corresponding destination register remain unchanged.</p>",
                 "tooltip": "Compares the low single-precision floating-point values in the first source operand and the second source operand and returns the minimum value to the low doubleword of the destination operand.",
@@ -1774,16 +1774,16 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/MOV.html"
             };
 
-        case "VMOVAPD":
         case "MOVAPD":
+        case "VMOVAPD":
             return {
                 "html": "<p>Moves 2, 4 or 8 double-precision floating-point values from the source operand (second operand) to the destination operand (first operand). This instruction can be used to load an XMM, YMM or ZMM register from an 128-bit, 256-bit or 512-bit memory location, to store the contents of an XMM, YMM or ZMM register into a 128-bit, 256-bit or 512-bit memory location, or to move data between two XMM, two YMM or two ZMM registers.</p><p>When the source or destination operand is a memory operand, the operand must be aligned on a 16-byte (128-bit versions), 32-byte (256-bit version) or 64-byte (EVEX.512 encoded version) boundary or a general-protection exception (#GP) will be generated. For EVEX encoded versions, the operand must be aligned to the size of the memory operand. To move double-precision floating-point values to and from unaligned memory locations, use the VMOVUPD instruction.</p><p>Note: VEX.vvvv and EVEX.vvvv are reserved and must be 1111b otherwise instructions will #UD.</p><p>EVEX.512 encoded version:</p><p>Moves 512 bits of packed double-precision floating-point values from the source operand (second operand) to the destination operand (first operand). This instruction can be used to load a ZMM register from a 512-bit float64 memory location, to store the contents of a ZMM register into a 512-bit float64 memory location, or to move data between two ZMM registers. When the source or destination operand is a memory operand, the operand must be aligned on a 64-byte boundary or a general-protection exception (#GP) will be generated. To move single-precision floating-point values to and from unaligned memory locations, use the VMOVUPD instruction.</p>",
                 "tooltip": "Moves 2, 4 or 8 double-precision floating-point values from the source operand (second operand) to the destination operand (first operand). This instruction can be used to load an XMM, YMM or ZMM register from an 128-bit, 256-bit or 512-bit memory location, to store the contents of an XMM, YMM or ZMM register into a 128-bit, 256-bit or 512-bit memory location, or to move data between two XMM, two YMM or two ZMM registers.",
                 "url": "http://www.felixcloutier.com/x86/MOVAPD.html"
             };
 
-        case "VMOVAPS":
         case "MOVAPS":
+        case "VMOVAPS":
             return {
                 "html": "<p>Moves 4, 8 or 16 single-precision floating-point values from the source operand (second operand) to the destination operand (first operand). This instruction can be used to load an XMM, YMM or ZMM register from an 128-bit, 256-bit or 512-bit memory location, to store the contents of an XMM, YMM or ZMM register into a 128-bit, 256-bit or 512-bit memory location, or to move data between two XMM, two YMM or two ZMM registers.</p><p>When the source or destination operand is a memory operand, the operand must be aligned on a 16-byte (128-bit version), 32-byte (VEX.256 encoded version) or 64-byte (EVEX.512 encoded version) boundary or a general-protection exception (#GP) will be generated. For EVEX.512 encoded versions, the operand must be aligned to the size of the memory operand. To move single-precision floating-point values to and from unaligned memory locations, use the VMOVUPS instruction.</p><p>Note: VEX.vvvv and EVEX.vvvv are reserved and must be 1111b otherwise instructions will #UD.</p><p>EVEX.512 encoded version:</p><p>Moves 512 bits of packed single-precision floating-point values from the source operand (second operand) to the destination operand (first operand). This instruction can be used to load a ZMM register from a 512-bit float32 memory location, to store the contents of a ZMM register into a float32 memory location, or to move data between two ZMM registers. When the source or destination operand is a memory operand, the operand must be aligned on a 64-byte boundary or a general-protection exception (#GP) will be generated. To move single-precision floating-point values to and from unaligned memory locations, use the VMOVUPS instruction.</p>",
                 "tooltip": "Moves 4, 8 or 16 single-precision floating-point values from the source operand (second operand) to the destination operand (first operand). This instruction can be used to load an XMM, YMM or ZMM register from an 128-bit, 256-bit or 512-bit memory location, to store the contents of an XMM, YMM or ZMM register into a 128-bit, 256-bit or 512-bit memory location, or to move data between two XMM, two YMM or two ZMM registers.",
@@ -1797,18 +1797,18 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/MOVBE.html"
             };
 
-        case "VMOVQ":
-        case "VMOVD":
-        case "MOVQ":
         case "MOVD":
+        case "MOVQ":
+        case "VMOVD":
+        case "VMOVQ":
             return {
                 "html": "<p>Copies a doubleword from the source operand (second operand) to the destination operand (first operand). The source and destination operands can be general-purpose registers, MMX technology registers, XMM registers, or 32-bit memory locations. This instruction can be used to move a doubleword to and from the low doubleword of an MMX technology register and a general-purpose register or a 32-bit memory location, or to and from the low doubleword of an XMM register and a general-purpose register or a 32-bit memory location. The instruction cannot be used to transfer data between MMX technology registers, between XMM registers, between general-purpose registers, or between memory locations.</p><p>When the destination operand is an MMX technology register, the source operand is written to the low doubleword of the register, and the register is zero-extended to 64 bits. When the destination operand is an XMM register, the source operand is written to the low doubleword of the register, and the register is zero-extended to 128 bits.</p><p>In 64-bit mode, the instruction\u2019s default operation size is 32 bits. Use of the REX.R prefix permits access to additional registers (R8-R15). Use of the REX.W prefix promotes operation to 64 bits. See the summary chart at the beginning of this section for encoding data and limits.</p><p>Moves a dword/qword integer from the source operand and stores it in the low 32/64-bits of the destination XMM register. The upper bits of the destination are zeroed. The source operand can be a 32/64-bit register or 32/64-bit memory location.</p><p>128-bit Legacy SSE version: Bits (MAXVL-1:128) of the corresponding YMM destination register remain unchanged. Qword operation requires the use of REX.W=1.</p>",
                 "tooltip": "Copies a doubleword from the source operand (second operand) to the destination operand (first operand). The source and destination operands can be general-purpose registers, MMX technology registers, XMM registers, or 32-bit memory locations. This instruction can be used to move a doubleword to and from the low doubleword of an MMX technology register and a general-purpose register or a 32-bit memory location, or to and from the low doubleword of an XMM register and a general-purpose register or a 32-bit memory location. The instruction cannot be used to transfer data between MMX technology registers, between XMM registers, between general-purpose registers, or between memory locations.",
                 "url": "http://www.felixcloutier.com/x86/MOVD%3AMOVQ.html"
             };
 
-        case "VMOVDDUP":
         case "MOVDDUP":
+        case "VMOVDDUP":
             return {
                 "html": "<p>For 256-bit or higher versions: Duplicates even-indexed double-precision floating-point values from the source operand (the second operand) and into adjacent pair and store to the destination operand (the first operand).</p><p>For 128-bit versions: Duplicates the low double-precision floating-point value from the source operand (the second operand) and store to the destination operand (the first operand).</p><p>128-bit Legacy SSE version: Bits (MAXVL-1:128) of the corresponding destination register are unchanged. The source operand is XMM register or a 64-bit memory location.</p><p>VEX.128 and EVEX.128 encoded version: Bits (MAXVL-1:128) of the destination register are zeroed. The source operand is XMM register or a 64-bit memory location. The destination is updated conditionally under the writemask for EVEX version.</p><p>VEX.256 and EVEX.256 encoded version: Bits (MAXVL-1:256) of the destination register are zeroed. The source operand is YMM register or a 256-bit memory location. The destination is updated conditionally under the writemask for EVEX version.</p>",
                 "tooltip": "For 256-bit or higher versions: Duplicates even-indexed double-precision floating-point values from the source operand (the second operand) and into adjacent pair and store to the destination operand (the first operand).",
@@ -1829,22 +1829,22 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/MOVDIRI.html"
             };
 
-        case "VMOVDQA32":
         case "MOVDQA":
-        case "VMOVDQA64":
         case "VMOVDQA":
+        case "VMOVDQA32":
+        case "VMOVDQA64":
             return {
                 "html": "<p>Note: VEX.vvvv and EVEX.vvvv are reserved and must be 1111b otherwise instructions will #UD.</p><p>EVEX encoded versions:</p><p>Moves 128, 256 or 512 bits of packed doubleword/quadword integer values from the source operand (the second operand) to the destination operand (the first operand). This instruction can be used to load a vector register from an int32/int64 memory location, to store the contents of a vector register into an int32/int64 memory location, or to move data between two ZMM registers. When the source or destination operand is a memory operand, the operand must be aligned on a 16 (EVEX.128)/32(EVEX.256)/64(EVEX.512)-byte boundary or a general-protection exception (#GP) will be generated. To move integer data to and from unaligned memory locations, use the VMOVDQU instruction.</p><p>The destination operand is updated at 32-bit (VMOVDQA32) or 64-bit (VMOVDQA64) granularity according to the writemask.</p><p>VEX.256 encoded version:</p>",
                 "tooltip": "Note: VEX.vvvv and EVEX.vvvv are reserved and must be 1111b otherwise instructions will #UD.",
                 "url": "http://www.felixcloutier.com/x86/MOVDQA%3AVMOVDQA32%3AVMOVDQA64.html"
             };
 
+        case "MOVDQU":
         case "VMOVDQU":
+        case "VMOVDQU16":
         case "VMOVDQU32":
         case "VMOVDQU64":
-        case "MOVDQU":
         case "VMOVDQU8":
-        case "VMOVDQU16":
             return {
                 "html": "<p>Note: VEX.vvvv and EVEX.vvvv are reserved and must be 1111b otherwise instructions will #UD.</p><p><strong>EVEX encoded versions:</strong></p><p>Moves 128, 256 or 512 bits of packed byte/word/doubleword/quadword integer values from the source operand (the second operand) to the destination operand (first operand). This instruction can be used to load a vector register from a memory location, to store the contents of a vector register into a memory location, or to move data between two vector registers.</p><p>The destination operand is updated at 8-bit (VMOVDQU8), 16-bit (VMOVDQU16), 32-bit (VMOVDQU32), or 64-bit (VMOVDQU64) granularity according to the writemask.</p><p><strong>VEX.256 encoded version:</strong></p>",
                 "tooltip": "Note: VEX.vvvv and EVEX.vvvv are reserved and must be 1111b otherwise instructions will #UD.",
@@ -1915,8 +1915,8 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/MOVMSKPS.html"
             };
 
-        case "VMOVNTDQ":
         case "MOVNTDQ":
+        case "VMOVNTDQ":
             return {
                 "html": "<p>Moves the packed integers in the source operand (second operand) to the destination operand (first operand) using a non-temporal hint to prevent caching of the data during the write to memory. The source operand is an XMM register, YMM register or ZMM register, which is assumed to contain integer data (packed bytes, words, double-words, or quadwords). The destination operand is a 128-bit, 256-bit or 512-bit memory location. The memory operand must be aligned on a 16-byte (128-bit version), 32-byte (VEX.256 encoded version) or 64-byte (512-bit version) boundary otherwise a general-protection exception (#GP) will be generated.</p><p>The non-temporal hint is implemented by using a write combining (WC) memory type protocol when writing the data to memory. Using this protocol, the processor does not write the data into the cache hierarchy, nor does it fetch the corresponding cache line from memory into the cache hierarchy. The memory type of the region being written to can override the non-temporal hint, if the memory address specified for the non-temporal store is in an uncacheable (UC) or write protected (WP) memory region. For more information on non-temporal stores, see \u201cCaching of Temporal vs. Non-Temporal Data\u201d in Chapter 10 in the IA-32 Intel Architecture Software Developer\u2019s Manual, Volume 1.</p><p>Because the WC protocol uses a weakly-ordered memory consistency model, a fencing operation implemented with the SFENCE or MFENCE instruction should be used in conjunction with VMOVNTDQ instructions if multiple processors might use different memory types to read/write the destination memory locations.</p><p>Note: VEX.vvvv and EVEX.vvvv are reserved and must be 1111b, VEX.L must be 0; otherwise instructions will #UD.</p>",
                 "tooltip": "Moves the packed integers in the source operand (second operand) to the destination operand (first operand) using a non-temporal hint to prevent caching of the data during the write to memory. The source operand is an XMM register, YMM register or ZMM register, which is assumed to contain integer data (packed bytes, words, double-words, or quadwords). The destination operand is a 128-bit, 256-bit or 512-bit memory location. The memory operand must be aligned on a 16-byte (128-bit version), 32-byte (VEX.256 encoded version) or 64-byte (512-bit version) boundary otherwise a general-protection exception (#GP) will be generated.",
@@ -1938,8 +1938,8 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/MOVNTI.html"
             };
 
-        case "VMOVNTPD":
         case "MOVNTPD":
+        case "VMOVNTPD":
             return {
                 "html": "<p>Moves the packed double-precision floating-point values in the source operand (second operand) to the destination operand (first operand) using a non-temporal hint to prevent caching of the data during the write to memory. The source operand is an XMM register, YMM register or ZMM register, which is assumed to contain packed double-precision, floating-pointing data. The destination operand is a 128-bit, 256-bit or 512-bit memory location. The memory operand must be aligned on a 16-byte (128-bit version), 32-byte (VEX.256 encoded version) or 64-byte (EVEX.512 encoded version) boundary otherwise a general-protection exception (#GP) will be generated.</p><p>The non-temporal hint is implemented by using a write combining (WC) memory type protocol when writing the data to memory. Using this protocol, the processor does not write the data into the cache hierarchy, nor does it fetch the corresponding cache line from memory into the cache hierarchy. The memory type of the region being written to can override the non-temporal hint, if the memory address specified for the non-temporal store is in an uncacheable (UC) or write protected (WP) memory region. For more information on non-temporal stores, see \u201cCaching of Temporal vs. Non-Temporal Data\u201d in Chapter 10 in the IA-32 Intel Architecture Software Developer\u2019s Manual, Volume 1.</p><p>Because the WC protocol uses a weakly-ordered memory consistency model, a fencing operation implemented with the SFENCE or MFENCE instruction should be used in conjunction with MOVNTPD instructions if multiple processors might use different memory types to read/write the destination memory locations.</p><p>Note: VEX.vvvv and EVEX.vvvv are reserved and must be 1111b, VEX.L must be 0; otherwise instructions will #UD.</p>",
                 "tooltip": "Moves the packed double-precision floating-point values in the source operand (second operand) to the destination operand (first operand) using a non-temporal hint to prevent caching of the data during the write to memory. The source operand is an XMM register, YMM register or ZMM register, which is assumed to contain packed double-precision, floating-pointing data. The destination operand is a 128-bit, 256-bit or 512-bit memory location. The memory operand must be aligned on a 16-byte (128-bit version), 32-byte (VEX.256 encoded version) or 64-byte (EVEX.512 encoded version) boundary otherwise a general-protection exception (#GP) will be generated.",
@@ -1968,9 +1968,9 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/MOVQ2DQ.html"
             };
 
-        case "MOVSD":
-        case "MOVSB":
         case "MOVS":
+        case "MOVSB":
+        case "MOVSD":
         case "MOVSQ":
         case "MOVSW":
             return {
@@ -1979,24 +1979,24 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/MOVS%3AMOVSB%3AMOVSW%3AMOVSD%3AMOVSQ.html"
             };
 
-        case "VMOVSHDUP":
         case "MOVSHDUP":
+        case "VMOVSHDUP":
             return {
                 "html": "<p>Duplicates odd-indexed single-precision floating-point values from the source operand (the second operand) to adjacent element pair in the destination operand (the first operand). See <a href=\"http://www.felixcloutier.com/x86/MOVSHDUP.html#fig-4-3\" rel=\"noreferrer noopener\" target=\"_blank\">Figure 4-3</a>. The source operand is an XMM, YMM or ZMM register or 128, 256 or 512-bit memory location and the destination operand is an XMM, YMM or ZMM register.</p><p>128-bit Legacy SSE version: Bits (MAXVL-1:128) of the corresponding destination register remain unchanged.</p><p>VEX.128 encoded version: Bits (MAXVL-1:128) of the destination register are zeroed.</p><p>VEX.256 encoded version: Bits (MAXVL-1:256) of the destination register are zeroed.</p><p>EVEX encoded version: The destination operand is updated at 32-bit granularity according to the writemask.</p>",
                 "tooltip": "Duplicates odd-indexed single-precision floating-point values from the source operand (the second operand) to adjacent element pair in the destination operand (the first operand). See Figure 4-3. The source operand is an XMM, YMM or ZMM register or 128, 256 or 512-bit memory location and the destination operand is an XMM, YMM or ZMM register.",
                 "url": "http://www.felixcloutier.com/x86/MOVSHDUP.html"
             };
 
-        case "VMOVSLDUP":
         case "MOVSLDUP":
+        case "VMOVSLDUP":
             return {
                 "html": "<p>Duplicates even-indexed single-precision floating-point values from the source operand (the second operand). See <a href=\"http://www.felixcloutier.com/x86/MOVSLDUP.html#fig-4-4\" rel=\"noreferrer noopener\" target=\"_blank\">Figure 4-4</a>. The source operand is an XMM, YMM or ZMM register or 128, 256 or 512-bit memory location and the destination operand is an XMM, YMM or ZMM register.</p><p>128-bit Legacy SSE version: Bits (MAXVL-1:128) of the corresponding destination register remain unchanged.</p><p>VEX.128 encoded version: Bits (MAXVL-1:128) of the destination register are zeroed.</p><p>VEX.256 encoded version: Bits (MAXVL-1:256) of the destination register are zeroed.</p><p>EVEX encoded version: The destination operand is updated at 32-bit granularity according to the writemask.</p>",
                 "tooltip": "Duplicates even-indexed single-precision floating-point values from the source operand (the second operand). See Figure 4-4. The source operand is an XMM, YMM or ZMM register or 128, 256 or 512-bit memory location and the destination operand is an XMM, YMM or ZMM register.",
                 "url": "http://www.felixcloutier.com/x86/MOVSLDUP.html"
             };
 
-        case "VMOVSS":
         case "MOVSS":
+        case "VMOVSS":
             return {
                 "html": "<p>Moves a scalar single-precision floating-point value from the source operand (second operand) to the destination operand (first operand). The source and destination operands can be XMM registers or 32-bit memory locations. This instruction can be used to move a single-precision floating-point value to and from the low doubleword of an XMM register and a 32-bit memory location, or to move a single-precision floating-point value between the low doublewords of two XMM registers. The instruction cannot be used to transfer data between memory locations.</p><p>Legacy version: When the source and destination operands are XMM registers, bits (MAXVL-1:32) of the corresponding destination register are unmodified. When the source operand is a memory location and destination operand is an XMM registers, Bits (127:32) of the destination operand is cleared to all 0s, bits MAXVL:128 of the destination operand remains unchanged.</p><p>VEX and EVEX encoded register-register syntax: Moves a scalar single-precision floating-point value from the second source operand (the third operand) to the low doubleword element of the destination operand (the first operand). Bits 127:32 of the destination operand are copied from the first source operand (the second operand). Bits (MAXVL-1:128) of the corresponding destination register are zeroed.</p><p>VEX and EVEX encoded memory load syntax: When the source operand is a memory location and destination operand is an XMM registers, bits MAXVL:32 of the destination operand is cleared to all 0s.</p><p>EVEX encoded versions: The low doubleword of the destination is updated according to the writemask.</p>",
                 "tooltip": "Moves a scalar single-precision floating-point value from the source operand (second operand) to the destination operand (first operand). The source and destination operands can be XMM registers or 32-bit memory locations. This instruction can be used to move a single-precision floating-point value to and from the low doubleword of an XMM register and a 32-bit memory location, or to move a single-precision floating-point value between the low doublewords of two XMM registers. The instruction cannot be used to transfer data between memory locations.",
@@ -2011,16 +2011,16 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/MOVSX%3AMOVSXD.html"
             };
 
-        case "VMOVUPD":
         case "MOVUPD":
+        case "VMOVUPD":
             return {
                 "html": "<p>Note: VEX.vvvv and EVEX.vvvv is reserved and must be 1111b otherwise instructions will #UD.</p><p><strong>EVEX.512 encoded version:</strong></p><p>Moves 512 bits of packed double-precision floating-point values from the source operand (second operand) to the destination operand (first operand). This instruction can be used to load a ZMM register from a float64 memory location, to store the contents of a ZMM register into a memory. The destination operand is updated according to the writemask.</p><p><strong>VEX.256 encoded version:</strong></p><p>Moves 256 bits of packed double-precision floating-point values from the source operand (second operand) to the destination operand (first operand). This instruction can be used to load a YMM register from a 256-bit memory location, to store the contents of a YMM register into a 256-bit memory location, or to move data between two YMM registers. Bits (MAXVL-1:256) of the destination register are zeroed.</p>",
                 "tooltip": "Note: VEX.vvvv and EVEX.vvvv is reserved and must be 1111b otherwise instructions will #UD.",
                 "url": "http://www.felixcloutier.com/x86/MOVUPD.html"
             };
 
-        case "VMOVUPS":
         case "MOVUPS":
+        case "VMOVUPS":
             return {
                 "html": "<p>Note: VEX.vvvv and EVEX.vvvv is reserved and must be 1111b otherwise instructions will #UD.</p><p><strong>EVEX.512 encoded version:</strong></p><p>Moves 512 bits of packed single-precision floating-point values from the source operand (second operand) to the destination operand (first operand). This instruction can be used to load a ZMM register from a 512-bit float32 memory location, to store the contents of a ZMM register into memory. The destination operand is updated according to the writemask.</p><p><strong>VEX.256 and EVEX.256 encoded versions:</strong></p><p>Moves 256 bits of packed single-precision floating-point values from the source operand (second operand) to the destination operand (first operand). This instruction can be used to load a YMM register from a 256-bit memory location, to store the contents of a YMM register into a 256-bit memory location, or to move data between two YMM registers. Bits (MAXVL-1:256) of the destination register are zeroed.</p>",
                 "tooltip": "Note: VEX.vvvv and EVEX.vvvv is reserved and must be 1111b otherwise instructions will #UD.",
@@ -2049,16 +2049,16 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/MUL.html"
             };
 
-        case "VMULPD":
         case "MULPD":
+        case "VMULPD":
             return {
                 "html": "<p>Multiply packed double-precision floating-point values from the first source operand with corresponding values in the second source operand, and stores the packed double-precision floating-point results in the destination operand.</p><p>EVEX encoded versions: The first source operand (the second operand) is a ZMM/YMM/XMM register. The second source operand can be a ZMM/YMM/XMM register, a 512/256/128-bit memory location or a 512/256/128-bit vector broadcasted from a 64-bit memory location. The destination operand is a ZMM/YMM/XMM register conditionally updated with writemask k1.</p><p>VEX.256 encoded version: The first source operand is a YMM register. The second source operand can be a YMM register or a 256-bit memory location. The destination operand is a YMM register. Bits (MAXVL-1:256) of the corresponding destination ZMM register are zeroed.</p><p>VEX.128 encoded version: The first source operand is a XMM register. The second source operand can be a XMM register or a 128-bit memory location. The destination operand is a XMM register. The upper bits (MAXVL-1:128) of the destination YMM register destination are zeroed.</p><p>128-bit Legacy SSE version: The second source can be an XMM register or an 128-bit memory location. The destination is not distinct from the first source XMM register and the upper bits (MAXVL-1:128) of the corresponding ZMM register destination are unmodified.</p>",
                 "tooltip": "Multiply packed double-precision floating-point values from the first source operand with corresponding values in the second source operand, and stores the packed double-precision floating-point results in the destination operand.",
                 "url": "http://www.felixcloutier.com/x86/MULPD.html"
             };
 
-        case "VMULPS":
         case "MULPS":
+        case "VMULPS":
             return {
                 "html": "<p>Multiply the packed single-precision floating-point values from the first source operand with the corresponding values in the second source operand, and stores the packed double-precision floating-point results in the destination operand.</p><p>EVEX encoded versions: The first source operand (the second operand) is a ZMM/YMM/XMM register. The second source operand can be a ZMM/YMM/XMM register, a 512/256/128-bit memory location or a 512/256/128-bit vector broadcasted from a 32-bit memory location. The destination operand is a ZMM/YMM/XMM register conditionally updated with writemask k1.</p><p>VEX.256 encoded version: The first source operand is a YMM register. The second source operand can be a YMM register or a 256-bit memory location. The destination operand is a YMM register. Bits (MAXVL-1:256) of the corresponding destination ZMM register are zeroed.</p><p>VEX.128 encoded version: The first source operand is a XMM register. The second source operand can be a XMM register or a 128-bit memory location. The destination operand is a XMM register. The upper bits (MAXVL-1:128) of the destination YMM register destination are zeroed.</p><p>128-bit Legacy SSE version: The second source can be an XMM register or an 128-bit memory location. The destination is not distinct from the first source XMM register and the upper bits (MAXVL-1:128) of the corresponding ZMM register destination are unmodified.</p>",
                 "tooltip": "Multiply the packed single-precision floating-point values from the first source operand with the corresponding values in the second source operand, and stores the packed double-precision floating-point results in the destination operand.",
@@ -2073,8 +2073,8 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/MULSD.html"
             };
 
-        case "VMULSS":
         case "MULSS":
+        case "VMULSS":
             return {
                 "html": "<p>Multiplies the low single-precision floating-point value from the second source operand by the low single-precision floating-point value in the first source operand, and stores the single-precision floating-point result in the destination operand. The second source operand can be an XMM register or a 32-bit memory location. The first source operand and the destination operands are XMM registers.</p><p>128-bit Legacy SSE version: The first source operand and the destination operand are the same. Bits (MAXVL-1:32) of the corresponding YMM destination register remain unchanged.</p><p>VEX.128 and EVEX encoded version: The first source operand is an xmm register encoded by VEX.vvvv. The three high-order doublewords of the destination operand are copied from the first source operand. Bits (MAXVL-1:128) of the destination register are zeroed.</p><p>EVEX encoded version: The low doubleword element of the destination operand is updated according to the writemask.</p><p>Software should ensure VMULSS is encoded with VEX.L=0. Encoding VMULSS with VEX.L=1 may encounter unpredictable behavior across different processor generations.</p>",
                 "tooltip": "Multiplies the low single-precision floating-point value from the second source operand by the low single-precision floating-point value in the first source operand, and stores the single-precision floating-point result in the destination operand. The second source operand can be an XMM register or a 32-bit memory location. The first source operand and the destination operands are XMM registers.",
@@ -2123,16 +2123,16 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/OR.html"
             };
 
-        case "VORPD":
         case "ORPD":
+        case "VORPD":
             return {
                 "html": "<p>Performs a bitwise logical OR of the two, four or eight packed double-precision floating-point values from the first source operand and the second source operand, and stores the result in the destination operand.</p><p>EVEX encoded versions: The first source operand is a ZMM/YMM/XMM register. The second source operand can be a ZMM/YMM/XMM register, a 512/256/128-bit memory location, or a 512/256/128-bit vector broadcasted from a 32-bit memory location. The destination operand is a ZMM/YMM/XMM register conditionally updated with writemask k1.</p><p>VEX.256 encoded version: The first source operand is a YMM register. The second source operand is a YMM register or a 256-bit memory location. The destination operand is a YMM register. The upper bits (MAXVL-1:256) of the corresponding ZMM register destination are zeroed.</p><p>VEX.128 encoded version: The first source operand is an XMM register. The second source operand is an XMM register or 128-bit memory location. The destination operand is an XMM register. The upper bits (MAXVL-1:128) of the corresponding ZMM register destination are zeroed.</p><p>128-bit Legacy SSE version: The second source can be an XMM register or an 128-bit memory location. The destination is not distinct from the first source XMM register and the upper bits (MAXVL-1:128) of the corresponding register destination are unmodified.</p>",
                 "tooltip": "Performs a bitwise logical OR of the two, four or eight packed double-precision floating-point values from the first source operand and the second source operand, and stores the result in the destination operand.",
                 "url": "http://www.felixcloutier.com/x86/ORPD.html"
             };
 
-        case "VORPS":
         case "ORPS":
+        case "VORPS":
             return {
                 "html": "<p>Performs a bitwise logical OR of the four, eight or sixteen packed single-precision floating-point values from the first source operand and the second source operand, and stores the result in the destination operand</p><p>EVEX encoded versions: The first source operand is a ZMM/YMM/XMM register. The second source operand can be a ZMM/YMM/XMM register, a 512/256/128-bit memory location, or a 512/256/128-bit vector broadcasted from a 32-bit memory location. The destination operand is a ZMM/YMM/XMM register conditionally updated with writemask k1.</p><p>VEX.256 encoded version: The first source operand is a YMM register. The second source operand is a YMM register or a 256-bit memory location. The destination operand is a YMM register. The upper bits (MAXVL-1:256) of the corresponding ZMM register destination are zeroed.</p><p>VEX.128 encoded version: The first source operand is an XMM register. The second source operand is an XMM register or 128-bit memory location. The destination operand is an XMM register. The upper bits (MAXVL-1:128) of the corresponding ZMM register destination are zeroed.</p><p>128-bit Legacy SSE version: The second source can be an XMM register or an 128-bit memory location. The destination is not distinct from the first source XMM register and the upper bits (MAXVL-1:128) of the corresponding register destination are unmodified.</p>",
                 "tooltip": "Performs a bitwise logical OR of the four, eight or sixteen packed single-precision floating-point values from the first source operand and the second source operand, and stores the result in the destination operand",
@@ -2146,22 +2146,23 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/OUT.html"
             };
 
-        case "OUTSB":
-        case "OUTSW":
         case "OUTS":
+        case "OUTSB":
         case "OUTSD":
+        case "OUTSW":
             return {
                 "html": "<p>Copies data from the source operand (second operand) to the I/O port specified with the destination operand (first operand). The source operand is a memory location, the address of which is read from either the DS:SI, DS:ESI or the RSI registers (depending on the address-size attribute of the instruction, 16, 32 or 64, respectively). (The DS segment may be overridden with a segment override prefix.) The destination operand is an I/O port address (from 0 to 65,535) that is read from the DX register. The size of the I/O port being accessed (that is, the size of the source and destination operands) is determined by the opcode for an 8-bit I/O port or by the operand-size attribute of the instruction for a 16- or 32-bit I/O port.</p><p>At the assembly-code level, two forms of this instruction are allowed: the \u201cexplicit-operands\u201d form and the \u201cno-operands\u201d form. The explicit-operands form (specified with the OUTS mnemonic) allows the source and destination operands to be specified explicitly. Here, the source operand should be a symbol that indicates the size of the I/O port and the source address, and the destination operand must be DX. This explicit-operands form is provided to allow documentation; however, note that the documentation provided by this form can be misleading. That is, the source operand symbol must specify the correct <strong>type</strong> (size) of the operand (byte, word, or doubleword), but it does not have to specify the correct <strong>location</strong>. The location is always specified by the DS:(E)SI or RSI registers, which must be loaded correctly before the OUTS instruction is executed.</p><p>The no-operands form provides \u201cshort forms\u201d of the byte, word, and doubleword versions of the OUTS instructions. Here also DS:(E)SI is assumed to be the source operand and DX is assumed to be the destination operand. The size of the I/O port is specified with the choice of mnemonic: OUTSB (byte), OUTSW (word), or OUTSD (doubleword).</p><p>After the byte, word, or doubleword is transferred from the memory location to the I/O port, the SI/ESI/RSI register is incremented or decremented automatically according to the setting of the DF flag in the EFLAGS register. (If the DF flag is 0, the (E)SI register is incremented; if the DF flag is 1, the SI/ESI/RSI register is decremented.) The SI/ESI/RSI register is incremented or decremented by 1 for byte operations, by 2 for word operations, and by 4 for doubleword operations.</p><p>The OUTS, OUTSB, OUTSW, and OUTSD instructions can be preceded by the REP prefix for block input of ECX bytes, words, or doublewords. See \u201cREP/REPE/REPZ /REPNE/REPNZ\u2014Repeat String Operation Prefix\u201d in this chapter for a description of the REP prefix. This instruction is only useful for accessing I/O ports located in the processor\u2019s I/O address space. See Chapter 18, \u201cInput/Output,\u201d in the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>, for more information on accessing I/O ports in the I/O address space.</p>",
                 "tooltip": "Copies data from the source operand (second operand) to the I/O port specified with the destination operand (first operand). The source operand is a memory location, the address of which is read from either the DS:SI, DS:ESI or the RSI registers (depending on the address-size attribute of the instruction, 16, 32 or 64, respectively). (The DS segment may be overridden with a segment override prefix.) The destination operand is an I/O port address (from 0 to 65,535) that is read from the DX register. The size of the I/O port being accessed (that is, the size of the source and destination operands) is determined by the opcode for an 8-bit I/O port or by the operand-size attribute of the instruction for a 16- or 32-bit I/O port.",
                 "url": "http://www.felixcloutier.com/x86/OUTS%3AOUTSB%3AOUTSW%3AOUTSD.html"
             };
 
-        case "VPABSD":
-        case "VPABSB":
-        case "VPABSW":
-        case "PABSW":
         case "PABSB":
         case "PABSD":
+        case "PABSW":
+        case "VPABSB":
+        case "VPABSD":
+        case "VPABSQ":
+        case "VPABSW":
             return {
                 "html": "<p>PABSB/W/D computes the absolute value of each data element of the source operand (the second operand) and stores the UNSIGNED results in the destination operand (the first operand). PABSB operates on signed bytes, PABSW operates on signed 16-bit words, and PABSD operates on signed 32-bit integers.</p><p>EVEX encoded VPABSD/Q: The source operand is a ZMM/YMM/XMM register, a 512/256/128-bit memory location, or a 512/256/128-bit vector broadcasted from a 32/64-bit memory location. The destination operand is a ZMM/YMM/XMM register updated according to the writemask.</p><p>EVEX encoded VPABSB/W: The source operand is a ZMM/YMM/XMM register, or a 512/256/128-bit memory location. The destination operand is a ZMM/YMM/XMM register updated according to the writemask.</p><p>VEX.256 encoded versions: The source operand is a YMM register or a 256-bit memory location. The destination operand is a YMM register. The upper bits (MAXVL-1:256) of the corresponding register destination are zeroed.</p><p>VEX.128 encoded versions: The source operand is an XMM register or 128-bit memory location. The destination operand is an XMM register. The upper bits (MAXVL-1:128) of the corresponding register destination are zeroed.</p>",
                 "tooltip": "PABSB/W/D computes the absolute value of each data element of the source operand (the second operand) and stores the UNSIGNED results in the destination operand (the first operand). PABSB operates on signed bytes, PABSW operates on signed 16-bit words, and PABSD operates on signed 32-bit integers.",
@@ -2169,8 +2170,8 @@ export function getAsmOpcode(opcode) {
             };
 
         case "PACKSSDW":
-        case "VPACKSSDW":
         case "PACKSSWB":
+        case "VPACKSSDW":
         case "VPACKSSWB":
             return {
                 "html": "<p>Converts packed signed word integers into packed signed byte integers (PACKSSWB) or converts packed signed doubleword integers into packed signed word integers (PACKSSDW), using saturation to handle overflow conditions. See <a href=\"http://www.felixcloutier.com/x86/PACKSSWB:PACKSSDW.html#fig-4-6\" rel=\"noreferrer noopener\" target=\"_blank\">Figure 4-6</a> for an example of the packing operation.</p><p>PACKSSWB converts packed signed word integers in the first and second source operands into packed signed byte integers using signed saturation to handle overflow conditions beyond the range of signed byte integers. If the signed word value is beyond the range of a signed byte value (i.e., greater than 7FH or less than 80H), the saturated signed byte integer value of 7FH or 80H, respectively, is stored in the destination. PACKSSDW converts packed signed doubleword integers in the first and second source operands into packed signed word integers using signed saturation to handle overflow conditions beyond 7FFFH and 8000H.</p><p>EVEX encoded PACKSSWB: The first source operand is a ZMM/YMM/XMM register. The second source operand is a ZMM/YMM/XMM register or a 512/256/128-bit memory location. The destination operand is a ZMM/YMM/XMM register, updated conditional under the writemask k1.</p><p>EVEX encoded PACKSSDW: The first source operand is a ZMM/YMM/XMM register. The second source operand is a ZMM/YMM/XMM register, a 512/256/128-bit memory location, or a 512/256/128-bit vector broadcasted from a 32-bit memory location. The destination operand is a ZMM/YMM/XMM register, updated conditional under the writemask k1.</p><p>VEX.256 encoded version: The first source operand is a YMM register. The second source operand is a YMM register or a 256-bit memory location. The destination operand is a YMM register. The upper bits (MAXVL-1:256) of the corresponding ZMM register destination are zeroed.</p>",
@@ -2195,41 +2196,41 @@ export function getAsmOpcode(opcode) {
             };
 
         case "PADDB":
-        case "VPADDD":
+        case "PADDD":
         case "PADDQ":
+        case "PADDW":
         case "VPADDB":
-        case "VPADDW":
+        case "VPADDD":
         case "VPADDQ":
-        case "PADDW":
-        case "PADDD":
+        case "VPADDW":
             return {
                 "html": "<p>Performs a SIMD add of the packed integers from the source operand (second operand) and the destination operand (first operand), and stores the packed integer results in the destination operand. See <span class=\"not-imported\">Figure 9-4</span> in the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>, for an illustration of a SIMD operation. Overflow is handled with wraparound, as described in the following paragraphs.</p><p>The PADDB and VPADDB instructions add packed byte integers from the first source operand and second source operand and store the packed integer results in the destination operand. When an individual result is too large to be represented in 8 bits (overflow), the result is wrapped around and the low 8 bits are written to the destination operand (that is, the carry is ignored).</p><p>The PADDW and VPADDW instructions add packed word integers from the first source operand and second source operand and store the packed integer results in the destination operand. When an individual result is too large to be represented in 16 bits (overflow), the result is wrapped around and the low 16 bits are written to the destination operand (that is, the carry is ignored).</p><p>The PADDD and VPADDD instructions add packed doubleword integers from the first source operand and second source operand and store the packed integer results in the destination operand. When an individual result is too large to be represented in 32 bits (overflow), the result is wrapped around and the low 32 bits are written to the destination operand (that is, the carry is ignored).</p><p>The PADDQ and VPADDQ instructions add packed quadword integers from the first source operand and second source operand and store the packed integer results in the destination operand. When a quadword result is too</p>",
                 "tooltip": "Performs a SIMD add of the packed integers from the source operand (second operand) and the destination operand (first operand), and stores the packed integer results in the destination operand. See Figure 9-4 in the Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1, for an illustration of a SIMD operation. Overflow is handled with wraparound, as described in the following paragraphs.",
                 "url": "http://www.felixcloutier.com/x86/PADDB%3APADDW%3APADDD%3APADDQ.html"
             };
 
-        case "VPADDSW":
-        case "PADDSW":
         case "PADDSB":
+        case "PADDSW":
         case "VPADDSB":
+        case "VPADDSW":
             return {
                 "html": "<p>Performs a SIMD add of the packed signed integers from the source operand (second operand) and the destination operand (first operand), and stores the packed integer results in the destination operand. See <span class=\"not-imported\">Figure 9-4</span> in the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>, for an illustration of a SIMD operation. Overflow is handled with signed saturation, as described in the following paragraphs.</p><p>(V)PADDSB performs a SIMD add of the packed signed integers with saturation from the first source operand and second source operand and stores the packed integer results in the destination operand. When an individual byte result is beyond the range of a signed byte integer (that is, greater than 7FH or less than 80H), the saturated value of 7FH or 80H, respectively, is written to the destination operand.</p><p>(V)PADDSW performs a SIMD add of the packed signed word integers with saturation from the first source operand and second source operand and stores the packed integer results in the destination operand. When an individual word result is beyond the range of a signed word integer (that is, greater than 7FFFH or less than 8000H), the saturated value of 7FFFH or 8000H, respectively, is written to the destination operand.</p><p>EVEX encoded versions: The first source operand is an ZMM/YMM/XMM register. The second source operand is an ZMM/YMM/XMM register or a memory location. The destination operand is an ZMM/YMM/XMM register.</p><p>VEX.256 encoded version: The first source operand is a YMM register. The second source operand is a YMM register or a 256-bit memory location. The destination operand is a YMM register.</p>",
                 "tooltip": "Performs a SIMD add of the packed signed integers from the source operand (second operand) and the destination operand (first operand), and stores the packed integer results in the destination operand. See Figure 9-4 in the Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1, for an illustration of a SIMD operation. Overflow is handled with signed saturation, as described in the following paragraphs.",
                 "url": "http://www.felixcloutier.com/x86/PADDSB%3APADDSW.html"
             };
 
-        case "PADDUSW":
         case "PADDUSB":
-        case "VPADDUSW":
+        case "PADDUSW":
         case "VPADDUSB":
+        case "VPADDUSW":
             return {
                 "html": "<p>Performs a SIMD add of the packed unsigned integers from the source operand (second operand) and the destination operand (first operand), and stores the packed integer results in the destination operand. See <span class=\"not-imported\">Figure 9-4</span> in the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>, for an illustration of a SIMD operation. Overflow is handled with unsigned saturation, as described in the following paragraphs.</p><p>(V)PADDUSB performs a SIMD add of the packed unsigned integers with saturation from the first source operand and second source operand and stores the packed integer results in the destination operand. When an individual byte result is beyond the range of an unsigned byte integer (that is, greater than FFH), the saturated value of FFH is written to the destination operand.</p><p>(V)PADDUSW performs a SIMD add of the packed unsigned word integers with saturation from the first source operand and second source operand and stores the packed integer results in the destination operand. When an individual word result is beyond the range of an unsigned word integer (that is, greater than FFFFH), the saturated value of FFFFH is written to the destination operand.</p><p>EVEX encoded versions: The first source operand is an ZMM/YMM/XMM register. The second source operand is an ZMM/YMM/XMM register or a 512/256/128-bit memory location. The destination is an ZMM/YMM/XMM register.</p><p>VEX.256 encoded version: The first source operand is a YMM register. The second source operand is a YMM register or a 256-bit memory location. The destination operand is a YMM register.</p>",
                 "tooltip": "Performs a SIMD add of the packed unsigned integers from the source operand (second operand) and the destination operand (first operand), and stores the packed integer results in the destination operand. See Figure 9-4 in the Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1, for an illustration of a SIMD operation. Overflow is handled with unsigned saturation, as described in the following paragraphs.",
                 "url": "http://www.felixcloutier.com/x86/PADDUSB%3APADDUSW.html"
             };
 
-        case "VPALIGNR":
         case "PALIGNR":
+        case "VPALIGNR":
             return {
                 "html": "<p>(V)PALIGNR concatenates the destination operand (the first operand) and the source operand (the second operand) into an intermediate composite, shifts the composite at byte granularity to the right by a constant immediate, and extracts the right-aligned result into the destination. The first and the second operands can be an MMX,</p><p>XMM or a YMM register. The immediate value is considered unsigned. Immediate shift counts larger than the 2L (i.e. 32 for 128-bit operands, or 16 for 64-bit operands) produce a zero result. Both operands can be MMX registers, XMM registers or YMM registers. When the source operand is a 128-bit memory operand, the operand must be aligned on a 16-byte boundary or a general-protection exception (#GP) will be generated.</p><p>In 64-bit mode and not encoded by VEX/EVEX prefix, use the REX prefix to access additional registers.</p><p>128-bit Legacy SSE version: Bits (MAXVL-1:128) of the corresponding YMM destination register remain unchanged.</p><p>EVEX.512 encoded version: The first source operand is a ZMM register and contains four 16-byte blocks. The second source operand is a ZMM register or a 512-bit memory location containing four 16-byte block. The destination operand is a ZMM register and contain four 16-byte results. The imm8[7:0] is the common shift count</p>",
                 "tooltip": "(V)PALIGNR concatenates the destination operand (the first operand) and the source operand (the second operand) into an intermediate composite, shifts the composite at byte granularity to the right by a constant immediate, and extracts the right-aligned result into the destination. The first and the second operands can be an MMX",
@@ -2237,9 +2238,9 @@ export function getAsmOpcode(opcode) {
             };
 
         case "PAND":
+        case "VPAND":
         case "VPANDD":
         case "VPANDQ":
-        case "VPAND":
             return {
                 "html": "<p>Performs a bitwise logical AND operation on the first source operand and second source operand and stores the result in the destination operand. Each bit of the result is set to 1 if the corresponding bits of the first and second operands are 1, otherwise it is set to 0.</p><p>In 64-bit mode and not encoded with VEX/EVEX, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p><p>Legacy SSE instructions: The source operand can be an MMX technology register or a 64-bit memory location. The destination operand can be an MMX technology register.</p><p>128-bit Legacy SSE version: The first source operand is an XMM register. The second operand can be an XMM register or an 128-bit memory location. The destination is not distinct from the first source XMM register and the upper bits (MAXVL-1:128) of the corresponding ZMM register destination are unmodified.</p><p>EVEX encoded versions: The first source operand is a ZMM/YMM/XMM register. The second source operand can be a ZMM/YMM/XMM register, a 512/256/128-bit memory location or a 512/256/128-bit vector broadcasted from a 32/64-bit memory location. The destination operand is a ZMM/YMM/XMM register conditionally updated with writemask k1 at 32/64-bit granularity.</p>",
                 "tooltip": "Performs a bitwise logical AND operation on the first source operand and second source operand and stores the result in the destination operand. Each bit of the result is set to 1 if the corresponding bits of the first and second operands are 1, otherwise it is set to 0.",
@@ -2247,9 +2248,9 @@ export function getAsmOpcode(opcode) {
             };
 
         case "PANDN":
-        case "VPANDNQ":
         case "VPANDN":
         case "VPANDND":
+        case "VPANDNQ":
             return {
                 "html": "<p>Performs a bitwise logical NOT operation on the first source operand, then performs bitwise AND with second source operand and stores the result in the destination operand. Each bit of the result is set to 1 if the corresponding bit in the first operand is 0 and the corresponding bit in the second operand is 1, otherwise it is set to 0.</p><p>In 64-bit mode and not encoded with VEX/EVEX, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p><p>Legacy SSE instructions: The source operand can be an MMX technology register or a 64-bit memory location. The destination operand can be an MMX technology register.</p><p>128-bit Legacy SSE version: The first source operand is an XMM register. The second operand can be an XMM register or an 128-bit memory location. The destination is not distinct from the first source XMM register and the upper bits (MAXVL-1:128) of the corresponding ZMM register destination are unmodified.</p><p>EVEX encoded versions: The first source operand is a ZMM/YMM/XMM register. The second source operand can be a ZMM/YMM/XMM register, a 512/256/128-bit memory location or a 512/256/128-bit vector broadcasted from a 32/64-bit memory location. The destination operand is a ZMM/YMM/XMM register conditionally updated with writemask k1 at 32/64-bit granularity.</p>",
                 "tooltip": "Performs a bitwise logical NOT operation on the first source operand, then performs bitwise AND with second source operand and stores the result in the destination operand. Each bit of the result is set to 1 if the corresponding bit in the first operand is 0 and the corresponding bit in the second operand is 1, otherwise it is set to 0.",
@@ -2263,10 +2264,10 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/PAUSE.html"
             };
 
-        case "VPAVGW":
-        case "PAVGW":
         case "PAVGB":
+        case "PAVGW":
         case "VPAVGB":
+        case "VPAVGW":
             return {
                 "html": "<p>Performs a SIMD average of the packed unsigned integers from the source operand (second operand) and the destination operand (first operand), and stores the results in the destination operand. For each corresponding pair of data elements in the first and second operands, the elements are added together, a 1 is added to the temporary sum, and that result is shifted right one bit position.</p><p>The (V)PAVGB instruction operates on packed unsigned bytes and the (V)PAVGW instruction operates on packed unsigned words.</p><p>In 64-bit mode and not encoded with VEX/EVEX, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p><p>Legacy SSE instructions: The source operand can be an MMX technology register or a 64-bit memory location. The destination operand can be an MMX technology register.</p><p>128-bit Legacy SSE version: The first source operand is an XMM register. The second operand can be an XMM register or an 128-bit memory location. The destination is not distinct from the first source XMM register and the upper bits (MAXVL-1:128) of the corresponding register destination are unmodified.</p>",
                 "tooltip": "Performs a SIMD average of the packed unsigned integers from the source operand (second operand) and the destination operand (first operand), and stores the results in the destination operand. For each corresponding pair of data elements in the first and second operands, the elements are added together, a 1 is added to the temporary sum, and that result is shifted right one bit position.",
@@ -2281,8 +2282,8 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/PBLENDVB.html"
             };
 
-        case "VPBLENDW":
         case "PBLENDW":
+        case "VPBLENDW":
             return {
                 "html": "<p>Words from the source operand (second operand) are conditionally written to the destination operand (first operand) depending on bits in the immediate operand (third operand). The immediate bits (bits 7:0) form a mask that determines whether the corresponding word in the destination is copied from the source. If a bit in the mask, corresponding to a word, is \u201c1\", then the word is copied, else the word element in the destination operand is unchanged.</p><p>128-bit Legacy SSE version: The second source operand can be an XMM register or a 128-bit memory location. The first source and destination operands are XMM registers. Bits (MAXVL-1:128) of the corresponding YMM destination register remain unchanged.</p><p>VEX.128 encoded version: The second source operand can be an XMM register or a 128-bit memory location. The first source and destination operands are XMM registers. Bits (MAXVL-1:128) of the corresponding YMM register are zeroed.</p><p>VEX.256 encoded version: The first source operand is a YMM register. The second source operand is a YMM register or a 256-bit memory location. The destination operand is a YMM register.</p>",
                 "tooltip": "Words from the source operand (second operand) are conditionally written to the destination operand (first operand) depending on bits in the immediate operand (third operand). The immediate bits (bits 7:0) form a mask that determines whether the corresponding word in the destination is copied from the source. If a bit in the mask, corresponding to a word, is \u201c1\", then the word is copied, else the word element in the destination operand is unchanged.",
@@ -2297,20 +2298,20 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/PCLMULQDQ.html"
             };
 
+        case "PCMPEQB":
         case "PCMPEQD":
-        case "VPCMPEQD":
-        case "VPCMPEQW":
         case "PCMPEQW":
-        case "PCMPEQB":
         case "VPCMPEQB":
+        case "VPCMPEQD":
+        case "VPCMPEQW":
             return {
                 "html": "<p>Performs a SIMD compare for equality of the packed bytes, words, or doublewords in the destination operand (first operand) and the source operand (second operand). If a pair of data elements is equal, the corresponding data element in the destination operand is set to all 1s; otherwise, it is set to all 0s.</p><p>The (V)PCMPEQB instruction compares the corresponding bytes in the destination and source operands; the (V)PCMPEQW instruction compares the corresponding words in the destination and source operands; and the (V)PCMPEQD instruction compares the corresponding doublewords in the destination and source operands.</p><p>In 64-bit mode and not encoded with VEX/EVEX, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p><p>Legacy SSE instructions: The source operand can be an MMX technology register or a 64-bit memory location. The destination operand can be an MMX technology register.</p><p>128-bit Legacy SSE version: The second source operand can be an XMM register or a 128-bit memory location. The first source and destination operands are XMM registers. Bits (MAXVL-1:128) of the corresponding YMM destination register remain unchanged.</p>",
                 "tooltip": "Performs a SIMD compare for equality of the packed bytes, words, or doublewords in the destination operand (first operand) and the source operand (second operand). If a pair of data elements is equal, the corresponding data element in the destination operand is set to all 1s; otherwise, it is set to all 0s.",
                 "url": "http://www.felixcloutier.com/x86/PCMPEQB%3APCMPEQW%3APCMPEQD.html"
             };
 
-        case "VPCMPEQQ":
         case "PCMPEQQ":
+        case "VPCMPEQQ":
             return {
                 "html": "<p>Performs an SIMD compare for equality of the packed quadwords in the destination operand (first operand) and the source operand (second operand). If a pair of data elements is equal, the corresponding data element in the destination is set to all 1s; otherwise, it is set to 0s.</p><p>128-bit Legacy SSE version: The second source operand can be an XMM register or a 128-bit memory location. The first source and destination operands are XMM registers. Bits (MAXVL-1:128) of the corresponding YMM destination register remain unchanged.</p><p>VEX.128 encoded version: The second source operand can be an XMM register or a 128-bit memory location. The first source and destination operands are XMM registers. Bits (MAXVL-1:128) of the corresponding YMM register are zeroed.</p><p>VEX.256 encoded version: The first source operand is a YMM register. The second source operand is a YMM register or a 256-bit memory location. The destination operand is a YMM register.</p><p>EVEX encoded VPCMPEQQ: The first source operand (second operand) is a ZMM/YMM/XMM register. The second source operand can be a ZMM/YMM/XMM register, a 512/256/128-bit memory location or a 512/256/128-bit vector broadcasted from a 64-bit memory location. The destination operand (first operand) is a mask register updated according to the writemask k2.</p>",
                 "tooltip": "Performs an SIMD compare for equality of the packed quadwords in the destination operand (first operand) and the source operand (second operand). If a pair of data elements is equal, the corresponding data element in the destination is set to all 1s; otherwise, it is set to 0s.",
@@ -2331,20 +2332,20 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/PCMPESTRM.html"
             };
 
-        case "VPCMPGTD":
-        case "PCMPGTD":
-        case "VPCMPGTW":
         case "PCMPGTB":
-        case "VPCMPGTB":
+        case "PCMPGTD":
         case "PCMPGTW":
+        case "VPCMPGTB":
+        case "VPCMPGTD":
+        case "VPCMPGTW":
             return {
                 "html": "<p>Performs an SIMD signed compare for the greater value of the packed byte, word, or doubleword integers in the destination operand (first operand) and the source operand (second operand). If a data element in the destination operand is greater than the corresponding date element in the source operand, the corresponding data element in the destination operand is set to all 1s; otherwise, it is set to all 0s.</p><p>The PCMPGTB instruction compares the corresponding signed byte integers in the destination and source operands; the PCMPGTW instruction compares the corresponding signed word integers in the destination and source operands; and the PCMPGTD instruction compares the corresponding signed doubleword integers in the destination and source operands.</p><p>In 64-bit mode and not encoded with VEX/EVEX, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p><p>Legacy SSE instructions: The source operand can be an MMX technology register or a 64-bit memory location. The destination operand can be an MMX technology register.</p><p>128-bit Legacy SSE version: The second source operand can be an XMM register or a 128-bit memory location. The first source operand and destination operand are XMM registers. Bits (MAXVL-1:128) of the corresponding YMM destination register remain unchanged.</p>",
                 "tooltip": "Performs an SIMD signed compare for the greater value of the packed byte, word, or doubleword integers in the destination operand (first operand) and the source operand (second operand). If a data element in the destination operand is greater than the corresponding date element in the source operand, the corresponding data element in the destination operand is set to all 1s; otherwise, it is set to all 0s.",
                 "url": "http://www.felixcloutier.com/x86/PCMPGTB%3APCMPGTW%3APCMPGTD.html"
             };
 
-        case "VPCMPGTQ":
         case "PCMPGTQ":
+        case "VPCMPGTQ":
             return {
                 "html": "<p>Performs an SIMD signed compare for the packed quadwords in the destination operand (first operand) and the source operand (second operand). If the data element in the first (destination) operand is greater than the corresponding element in the second (source) operand, the corresponding data element in the destination is set to all 1s; otherwise, it is set to 0s.</p><p>128-bit Legacy SSE version: The second source operand can be an XMM register or a 128-bit memory location. The first source operand and destination operand are XMM registers. Bits (MAXVL-1:128) of the corresponding YMM destination register remain unchanged.</p><p>VEX.128 encoded version: The second source operand can be an XMM register or a 128-bit memory location. The first source operand and destination operand are XMM registers. Bits (MAXVL-1:128) of the corresponding YMM register are zeroed.</p><p>VEX.256 encoded version: The first source operand is a YMM register. The second source operand is a YMM register or a 256-bit memory location. The destination operand is a YMM register.</p><p>EVEX encoded VPCMPGTD/Q: The first source operand (second operand) is a ZMM/YMM/XMM register. The second source operand can be a ZMM/YMM/XMM register, a 512/256/128-bit memory location or a 512/256/128-bit vector broadcasted from a 64-bit memory location. The destination operand (first operand) is a mask register updated according to the writemask k2.</p>",
                 "tooltip": "Performs an SIMD signed compare for the packed quadwords in the destination operand (first operand) and the source operand (second operand). If the data element in the first (destination) operand is greater than the corresponding element in the second (source) operand, the corresponding data element in the destination is set to all 1s; otherwise, it is set to 0s.",
@@ -2379,12 +2380,12 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/PEXT.html"
             };
 
-        case "VPEXTRB":
-        case "VPEXTRQ":
         case "PEXTRB":
         case "PEXTRD":
         case "PEXTRQ":
+        case "VPEXTRB":
         case "VPEXTRD":
+        case "VPEXTRQ":
             return {
                 "html": "<p>Extract a byte/dword/qword integer value from the source XMM register at a byte/dword/qword offset determined from imm8[3:0]. The destination can be a register or byte/dword/qword memory location. If the destination is a register, the upper bits of the register are zero extended.</p><p>In legacy non-VEX encoded version and if the destination operand is a register, the default operand size in 64-bit mode for PEXTRB/PEXTRD is 64 bits, the bits above the least significant byte/dword data are filled with zeros. PEXTRQ is not encodable in non-64-bit modes and requires REX.W in 64-bit mode.</p><p>Note: In VEX.128 encoded versions, VEX.vvvv is reserved and must be 1111b, VEX.L must be 0, otherwise the instruction will #UD. In EVEX.128 encoded versions, EVEX.vvvv is reserved and must be 1111b, EVEX.L\u201dL must be</p><p>0, otherwise the instruction will #UD. If the destination operand is a register, the default operand size in 64-bit mode for VPEXTRB/VPEXTRD is 64 bits, the bits above the least significant byte/word/dword data are filled with zeros.</p>",
                 "tooltip": "Extract a byte/dword/qword integer value from the source XMM register at a byte/dword/qword offset determined from imm8[3:0]. The destination can be a register or byte/dword/qword memory location. If the destination is a register, the upper bits of the register are zero extended.",
@@ -2399,17 +2400,17 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/PEXTRW.html"
             };
 
-        case "VPHADDSW":
         case "PHADDSW":
+        case "VPHADDSW":
             return {
                 "html": "<p>(V)PHADDSW adds two adjacent signed 16-bit integers horizontally from the source and destination operands and saturates the signed results; packs the signed, saturated 16-bit results to the destination operand (first operand) When the source operand is a 128-bit memory operand, the operand must be aligned on a 16-byte boundary or a general-protection exception (#GP) will be generated.</p><p>Legacy SSE version: Both operands can be MMX registers. The second source operand can be an MMX register or a 64-bit memory location.</p><p>128-bit Legacy SSE version: The first source and destination operands are XMM registers. The second source operand is an XMM register or a 128-bit memory location. Bits (MAXVL-1:128) of the corresponding YMM destination register remain unchanged.</p><p>In 64-bit mode, use the REX prefix to access additional registers.</p><p>VEX.128 encoded version: The first source and destination operands are XMM registers. The second source operand is an XMM register or a 128-bit memory location. Bits (MAXVL-1:128) of the destination YMM register are zeroed.</p>",
                 "tooltip": "(V)PHADDSW adds two adjacent signed 16-bit integers horizontally from the source and destination operands and saturates the signed results; packs the signed, saturated 16-bit results to the destination operand (first operand) When the source operand is a 128-bit memory operand, the operand must be aligned on a 16-byte boundary or a general-protection exception (#GP) will be generated.",
                 "url": "http://www.felixcloutier.com/x86/PHADDSW.html"
             };
 
-        case "VPHADDD":
-        case "PHADDW":
         case "PHADDD":
+        case "PHADDW":
+        case "VPHADDD":
         case "VPHADDW":
             return {
                 "html": "<p>(V)PHADDW adds two adjacent 16-bit signed integers horizontally from the source and destination operands and packs the 16-bit signed results to the destination operand (first operand). (V)PHADDD adds two adjacent 32-bit signed integers horizontally from the source and destination operands and packs the 32-bit signed results to the destination operand (first operand). When the source operand is a 128-bit memory operand, the operand must be aligned on a 16-byte boundary or a general-protection exception (#GP) will be generated.</p><p>Note that these instructions can operate on either unsigned or signed (two\u2019s complement notation) integers; however, it does not set bits in the EFLAGS register to indicate overflow and/or a carry. To prevent undetected overflow conditions, software must control the ranges of the values operated on.</p><p>Legacy SSE instructions: Both operands can be MMX registers. The second source operand can be an MMX register or a 64-bit memory location.</p><p>128-bit Legacy SSE version: The first source and destination operands are XMM registers. The second source operand can be an XMM register or a 128-bit memory location. Bits (MAXVL-1:128) of the corresponding YMM destination register remain unchanged.</p><p>In 64-bit mode, use the REX prefix to access additional registers.</p>",
@@ -2433,9 +2434,9 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/PHSUBSW.html"
             };
 
+        case "PHSUBD":
         case "PHSUBW":
         case "VPHSUBD":
-        case "PHSUBD":
         case "VPHSUBW":
             return {
                 "html": "<p>(V)PHSUBW performs horizontal subtraction on each adjacent pair of 16-bit signed integers by subtracting the most significant word from the least significant word of each pair in the source and destination operands, and packs the signed 16-bit results to the destination operand (first operand). (V)PHSUBD performs horizontal subtraction on each adjacent pair of 32-bit signed integers by subtracting the most significant doubleword from the least significant doubleword of each pair, and packs the signed 32-bit result to the destination operand. When the source operand is a 128-bit memory operand, the operand must be aligned on a 16-byte boundary or a general-protection exception (#GP) will be generated.</p><p>Legacy SSE version: Both operands can be MMX registers. The second source operand can be an MMX register or a 64-bit memory location.</p><p>128-bit Legacy SSE version: The first source and destination operands are XMM registers. The second source operand is an XMM register or a 128-bit memory location. Bits (MAXVL-1:128) of the corresponding YMM destination register remain unchanged.</p><p>In 64-bit mode, use the REX prefix to access additional registers.</p><p>VEX.128 encoded version: The first source and destination operands are XMM registers. The second source operand is an XMM register or a 128-bit memory location. Bits (MAXVL-1:128) of the destination YMM register are zeroed.</p>",
@@ -2444,19 +2445,19 @@ export function getAsmOpcode(opcode) {
             };
 
         case "PINSRB":
+        case "PINSRD":
+        case "PINSRQ":
+        case "VPINSRB":
         case "VPINSRD":
         case "VPINSRQ":
-        case "VPINSRB":
-        case "PINSRQ":
-        case "PINSRD":
             return {
                 "html": "<p>Copies a byte/dword/qword from the source operand (second operand) and inserts it in the destination operand (first operand) at the location specified with the count operand (third operand). (The other elements in the destination register are left untouched.) The source operand can be a general-purpose register or a memory location. (When the source operand is a general-purpose register, PINSRB copies the low byte of the register.) The destination operand is an XMM register. The count operand is an 8-bit immediate. When specifying a qword[dword, byte] location in an XMM register, the [2, 4] least-significant bit(s) of the count operand specify the location.</p><p>In 64-bit mode and not encoded with VEX/EVEX, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15, R8-15). Use of REX.W permits the use of 64 bit general purpose registers.</p><p>128-bit Legacy SSE version: Bits (MAXVL-1:128) of the corresponding YMM destination register remain unchanged.</p><p>VEX.128 encoded version: Bits (MAXVL-1:128) of the destination register are zeroed. VEX.L must be 0, otherwise the instruction will #UD. Attempt to execute VPINSRQ in non-64-bit mode will cause #UD.</p><p>EVEX.128 encoded version: Bits (MAXVL-1:128) of the destination register are zeroed. EVEX.L\u2019L must be 0, otherwise the instruction will #UD.</p>",
                 "tooltip": "Copies a byte/dword/qword from the source operand (second operand) and inserts it in the destination operand (first operand) at the location specified with the count operand (third operand). (The other elements in the destination register are left untouched.) The source operand can be a general-purpose register or a memory location. (When the source operand is a general-purpose register, PINSRB copies the low byte of the register.) The destination operand is an XMM register. The count operand is an 8-bit immediate. When specifying a qword[dword, byte] location in an XMM register, the [2, 4] least-significant bit(s) of the count operand specify the location.",
                 "url": "http://www.felixcloutier.com/x86/PINSRB%3APINSRD%3APINSRQ.html"
             };
 
-        case "VPINSRW":
         case "PINSRW":
+        case "VPINSRW":
             return {
                 "html": "<p>Copies a word from the source operand (second operand) and inserts it in the destination operand (first operand) at the location specified with the count operand (third operand). (The other words in the destination register are left untouched.) The source operand can be a general-purpose register or a 16-bit memory location. (When the source operand is a general-purpose register, the low word of the register is copied.) The destination operand can be an MMX technology register or an XMM register. The count operand is an 8-bit immediate. When specifying a word location in an MMX technology register, the 2 least-significant bits of the count operand specify the location; for an XMM register, the 3 least-significant bits specify the location.</p><p>In 64-bit mode and not encoded with VEX/EVEX, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15, R8-15).</p><p>128-bit Legacy SSE version: Bits (MAXVL-1:128) of the corresponding YMM destination register remain unchanged.</p><p>VEX.128 encoded version: Bits (MAXVL-1:128) of the destination YMM register are zeroed. VEX.L must be 0, otherwise the instruction will #UD.</p><p>EVEX.128 encoded version: Bits (MAXVL-1:128) of the destination register are zeroed. EVEX.L\u2019L must be 0, otherwise the instruction will #UD.</p>",
                 "tooltip": "Copies a word from the source operand (second operand) and inserts it in the destination operand (first operand) at the location specified with the count operand (third operand). (The other words in the destination register are left untouched.) The source operand can be a general-purpose register or a 16-bit memory location. (When the source operand is a general-purpose register, the low word of the register is copied.) The destination operand can be an MMX technology register or an XMM register. The count operand is an 8-bit immediate. When specifying a word location in an MMX technology register, the 2 least-significant bits of the count operand specify the location; for an XMM register, the 3 least-significant bits specify the location.",
@@ -2479,42 +2480,42 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/PMADDWD.html"
             };
 
-        case "PMAXSD":
-        case "VPMAXSW":
         case "PMAXSB":
-        case "VPMAXSD":
+        case "PMAXSD":
         case "PMAXSW":
         case "VPMAXSB":
+        case "VPMAXSD":
         case "VPMAXSQ":
+        case "VPMAXSW":
             return {
                 "html": "<p>Performs a SIMD compare of the packed signed byte, word, dword or qword integers in the second source operand and the first source operand and returns the maximum value for each pair of integers to the destination operand.</p><p>Legacy SSE version PMAXSW: The source operand can be an MMX technology register or a 64-bit memory location. The destination operand can be an MMX technology register.</p><p>128-bit Legacy SSE version: The first source and destination operands are XMM registers. The second source operand is an XMM register or a 128-bit memory location. Bits (MAXVL-1:128) of the corresponding YMM destination register remain unchanged.</p><p>VEX.128 encoded version: The first source and destination operands are XMM registers. The second source operand is an XMM register or a 128-bit memory location. Bits (MAXVL-1:128) of the corresponding destination register are zeroed.</p><p>VEX.256 encoded version: The second source operand can be an YMM register or a 256-bit memory location. The first source and destination operands are YMM registers. Bits (MAXVL-1:256) of the corresponding destination register are zeroed.</p>",
                 "tooltip": "Performs a SIMD compare of the packed signed byte, word, dword or qword integers in the second source operand and the first source operand and returns the maximum value for each pair of integers to the destination operand.",
                 "url": "http://www.felixcloutier.com/x86/PMAXSB%3APMAXSW%3APMAXSD%3APMAXSQ.html"
             };
 
+        case "PMAXUB":
         case "PMAXUW":
-        case "VPMAXUW":
         case "VPMAXUB":
-        case "PMAXUB":
+        case "VPMAXUW":
             return {
                 "html": "<p>Performs a SIMD compare of the packed unsigned byte, word integers in the second source operand and the first source operand and returns the maximum value for each pair of integers to the destination operand.</p><p>Legacy SSE version PMAXUB: The source operand can be an MMX technology register or a 64-bit memory location. The destination operand can be an MMX technology register.</p><p>128-bit Legacy SSE version: The first source and destination operands are XMM registers. The second source operand is an XMM register or a 128-bit memory location. Bits (MAXVL-1:128) of the corresponding destination register remain unchanged.</p><p>VEX.128 encoded version: The first source and destination operands are XMM registers. The second source operand is an XMM register or a 128-bit memory location. Bits (MAXVL-1:128) of the corresponding destination register are zeroed.</p><p>VEX.256 encoded version: The second source operand can be an YMM register or a 256-bit memory location. The first source and destination operands are YMM registers.</p>",
                 "tooltip": "Performs a SIMD compare of the packed unsigned byte, word integers in the second source operand and the first source operand and returns the maximum value for each pair of integers to the destination operand.",
                 "url": "http://www.felixcloutier.com/x86/PMAXUB%3APMAXUW.html"
             };
 
-        case "VPMAXUQ":
         case "PMAXUD":
         case "VPMAXUD":
+        case "VPMAXUQ":
             return {
                 "html": "<p>Performs a SIMD compare of the packed unsigned dword or qword integers in the second source operand and the first source operand and returns the maximum value for each pair of integers to the destination operand.</p><p>128-bit Legacy SSE version: The first source and destination operands are XMM registers. The second source operand is an XMM register or a 128-bit memory location. Bits (MAXVL-1:128) of the corresponding destination register remain unchanged.</p><p>VEX.128 encoded version: The first source and destination operands are XMM registers. The second source operand is an XMM register or a 128-bit memory location. Bits (MAXVL-1:128) of the corresponding destination register are zeroed.</p><p>VEX.256 encoded version: The first source operand is a YMM register; The second source operand is a YMM register or 256-bit memory location. Bits (MAXVL-1:256) of the corresponding destination register are zeroed.</p><p>EVEX encoded versions: The first source operand is a ZMM/YMM/XMM register; The second source operand is a ZMM/YMM/XMM register, a 512/256/128-bit memory location or a 512/256/128-bit vector broadcasted from a 32/64-bit memory location. The destination operand is conditionally updated based on writemask k1.</p>",
                 "tooltip": "Performs a SIMD compare of the packed unsigned dword or qword integers in the second source operand and the first source operand and returns the maximum value for each pair of integers to the destination operand.",
                 "url": "http://www.felixcloutier.com/x86/PMAXUD%3APMAXUQ.html"
             };
 
+        case "PMINSB":
         case "PMINSW":
         case "VPMINSB":
         case "VPMINSW":
-        case "PMINSB":
             return {
                 "html": "<p>Performs a SIMD compare of the packed signed byte, word, or dword integers in the second source operand and the first source operand and returns the minimum value for each pair of integers to the destination operand.</p><p>Legacy SSE version PMINSW: The source operand can be an MMX technology register or a 64-bit memory location. The destination operand can be an MMX technology register.</p><p>128-bit Legacy SSE version: The first source and destination operands are XMM registers. The second source operand is an XMM register or a 128-bit memory location. Bits (MAXVL-1:128) of the corresponding destination register remain unchanged.</p><p>VEX.128 encoded version: The first source and destination operands are XMM registers. The second source operand is an XMM register or a 128-bit memory location. Bits (MAXVL-1:128) of the corresponding destination register are zeroed.</p><p>VEX.256 encoded version: The second source operand can be an YMM register or a 256-bit memory location. The first source and destination operands are YMM registers.</p>",
                 "tooltip": "Performs a SIMD compare of the packed signed byte, word, or dword integers in the second source operand and the first source operand and returns the minimum value for each pair of integers to the destination operand.",
@@ -2530,95 +2531,95 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/PMINSD%3APMINSQ.html"
             };
 
-        case "PMINUW":
-        case "VPMINUW":
         case "PMINUB":
+        case "PMINUW":
         case "VPMINUB":
+        case "VPMINUW":
             return {
                 "html": "<p>Performs a SIMD compare of the packed unsigned byte or word integers in the second source operand and the first source operand and returns the minimum value for each pair of integers to the destination operand.</p><p>Legacy SSE version PMINUB: The source operand can be an MMX technology register or a 64-bit memory location. The destination operand can be an MMX technology register.</p><p>128-bit Legacy SSE version: The first source and destination operands are XMM registers. The second source operand is an XMM register or a 128-bit memory location. Bits (MAXVL-1:128) of the corresponding destination register remain unchanged.</p><p>VEX.128 encoded version: The first source and destination operands are XMM registers. The second source operand is an XMM register or a 128-bit memory location. Bits (MAXVL-1:128) of the corresponding destination register are zeroed.</p><p>VEX.256 encoded version: The second source operand can be an YMM register or a 256-bit memory location. The first source and destination operands are YMM registers.</p>",
                 "tooltip": "Performs a SIMD compare of the packed unsigned byte or word integers in the second source operand and the first source operand and returns the minimum value for each pair of integers to the destination operand.",
                 "url": "http://www.felixcloutier.com/x86/PMINUB%3APMINUW.html"
             };
 
+        case "PMINUD":
         case "VPMINUD":
         case "VPMINUQ":
-        case "PMINUD":
             return {
                 "html": "<p>Performs a SIMD compare of the packed unsigned dword/qword integers in the second source operand and the first source operand and returns the minimum value for each pair of integers to the destination operand.</p><p>128-bit Legacy SSE version: The first source and destination operands are XMM registers. The second source operand is an XMM register or a 128-bit memory location. Bits (MAXVL-1:128) of the corresponding destination register remain unchanged.</p><p>VEX.128 encoded version: The first source and destination operands are XMM registers. The second source operand is an XMM register or a 128-bit memory location. Bits (MAXVL-1:128) of the corresponding destination register are zeroed.</p><p>VEX.256 encoded version: The second source operand can be an YMM register or a 256-bit memory location. The first source and destination operands are YMM registers. Bits (MAXVL-1:256) of the corresponding destination register are zeroed.</p><p>EVEX encoded versions: The first source operand is a ZMM/YMM/XMM register; The second source operand is a ZMM/YMM/XMM register, a 512/256/128-bit memory location or a 512/256/128-bit vector broadcasted from a 32/64-bit memory location. The destination operand is conditionally updated based on writemask k1.</p>",
                 "tooltip": "Performs a SIMD compare of the packed unsigned dword/qword integers in the second source operand and the first source operand and returns the minimum value for each pair of integers to the destination operand.",
                 "url": "http://www.felixcloutier.com/x86/PMINUD%3APMINUQ.html"
             };
 
-        case "VPMOVMSKB":
         case "PMOVMSKB":
+        case "VPMOVMSKB":
             return {
                 "html": "<p>Creates a mask made up of the most significant bit of each byte of the source operand (second operand) and stores the result in the low byte or word of the destination operand (first operand).</p><p>The byte mask is 8 bits for 64-bit source operand, 16 bits for 128-bit source operand and 32 bits for 256-bit source operand. The destination operand is a general-purpose register.</p><p>In 64-bit mode, the instruction can access additional registers (XMM8-XMM15, R8-R15) when used with a REX.R prefix. The default operand size is 64-bit in 64-bit mode.</p><p>Legacy SSE version: The source operand is an MMX technology register.</p><p>128-bit Legacy SSE version: The source operand is an XMM register.</p>",
                 "tooltip": "Creates a mask made up of the most significant bit of each byte of the source operand (second operand) and stores the result in the low byte or word of the destination operand (first operand).",
                 "url": "http://www.felixcloutier.com/x86/PMOVMSKB.html"
             };
 
-        case "VPMOVSXBQ":
-        case "VPMOVSXWD":
-        case "PMOVSXWD":
+        case "PMOVSXBD":
         case "PMOVSXBQ":
-        case "VPMOVSXBW":
-        case "VPMOVSXWQ":
-        case "PMOVSXDQ":
         case "PMOVSXBW":
-        case "PMOVSXBD":
+        case "PMOVSXDQ":
+        case "PMOVSXWD":
+        case "PMOVSXWQ":
         case "VPMOVSXBD":
+        case "VPMOVSXBQ":
+        case "VPMOVSXBW":
         case "VPMOVSXDQ":
-        case "PMOVSXWQ":
+        case "VPMOVSXWD":
+        case "VPMOVSXWQ":
             return {
                 "html": "<p>Legacy and VEX encoded versions: Packed byte, word, or dword integers in the low bytes of the source operand (second operand) are sign extended to word, dword, or quadword integers and stored in packed signed bytes the destination operand.</p><p>128-bit Legacy SSE version: Bits (MAXVL-1:128) of the corresponding destination register remain unchanged.</p><p>VEX.128 and EVEX.128 encoded versions: Bits (MAXVL-1:128) of the corresponding destination register are zeroed.</p><p>VEX.256 and EVEX.256 encoded versions: Bits (MAXVL-1:256) of the corresponding destination register are zeroed.</p><p>EVEX encoded versions: Packed byte, word or dword integers starting from the low bytes of the source operand (second operand) are sign extended to word, dword or quadword integers and stored to the destination operand under the writemask. The destination register is XMM, YMM or ZMM Register.</p>",
                 "tooltip": "Legacy and VEX encoded versions: Packed byte, word, or dword integers in the low bytes of the source operand (second operand) are sign extended to word, dword, or quadword integers and stored in packed signed bytes the destination operand.",
                 "url": "http://www.felixcloutier.com/x86/PMOVSX.html"
             };
 
+        case "PMOVZXBD":
+        case "PMOVZXBQ":
+        case "PMOVZXBW":
+        case "PMOVZXDQ":
         case "PMOVZXWD":
-        case "VPMOVZXBW":
         case "PMOVZXWQ":
-        case "VPMOVZXDQ":
-        case "PMOVZXBW":
-        case "VPMOVZXWQ":
         case "VPMOVZXBD":
         case "VPMOVZXBQ":
+        case "VPMOVZXBW":
+        case "VPMOVZXDQ":
         case "VPMOVZXWD":
-        case "PMOVZXBD":
-        case "PMOVZXBQ":
-        case "PMOVZXDQ":
+        case "VPMOVZXWQ":
             return {
                 "html": "<p>Legacy, VEX and EVEX encoded versions: Packed byte, word, or dword integers starting from the low bytes of the source operand (second operand) are zero extended to word, dword, or quadword integers and stored in packed signed bytes the destination operand.</p><p>128-bit Legacy SSE version: Bits (MAXVL-1:128) of the corresponding destination register remain unchanged.</p><p>VEX.128 encoded version: Bits (MAXVL-1:128) of the corresponding destination register are zeroed.</p><p>VEX.256 encoded version: Bits (MAXVL-1:256) of the corresponding destination register are zeroed.</p><p>EVEX encoded versions: Packed dword integers starting from the low bytes of the source operand (second operand) are zero extended to quadword integers and stored to the destination operand under the writemask.The destination register is XMM, YMM or ZMM Register.</p>",
                 "tooltip": "Legacy, VEX and EVEX encoded versions: Packed byte, word, or dword integers starting from the low bytes of the source operand (second operand) are zero extended to word, dword, or quadword integers and stored in packed signed bytes the destination operand.",
                 "url": "http://www.felixcloutier.com/x86/PMOVZX.html"
             };
 
-        case "VPMULDQ":
         case "PMULDQ":
+        case "VPMULDQ":
             return {
                 "html": "<p>Multiplies packed signed doubleword integers in the even-numbered (zero-based reference) elements of the first source operand with the packed signed doubleword integers in the corresponding elements of the second source operand and stores packed signed quadword results in the destination operand.</p><p>128-bit Legacy SSE version: The input signed doubleword integers are taken from the even-numbered elements of the source operands, i.e. the first (low) and third doubleword element. For 128-bit memory operands, 128 bits are fetched from memory, but only the first and third doublewords are used in the computation. The first source operand and the destination XMM operand is the same. The second source operand can be an XMM register or 128-bit memory location. Bits (MAXVL-1:128) of the corresponding destination register remain unchanged.</p><p>VEX.128 encoded version: The input signed doubleword integers are taken from the even-numbered elements of the source operands, i.e., the first (low) and third doubleword element. For 128-bit memory operands, 128 bits are fetched from memory, but only the first and third doublewords are used in the computation.The first source operand and the destination operand are XMM registers. The second source operand can be an XMM register or 128-bit memory location. Bits (MAXVL-1:128) of the corresponding destination register are zeroed.</p><p>VEX.256 encoded version: The input signed doubleword integers are taken from the even-numbered elements of the source operands, i.e. the first, 3rd, 5th, 7th doubleword element. For 256-bit memory operands, 256 bits are fetched from memory, but only the four even-numbered doublewords are used in the computation. The first source operand and the destination operand are YMM registers. The second source operand can be a YMM register or 256-bit memory location. Bits (MAXVL-1:256) of the corresponding destination ZMM register are zeroed.</p><p>EVEX encoded version: The input signed doubleword integers are taken from the even-numbered elements of the source operands. The first source operand is a ZMM/YMM/XMM registers. The second source operand can be an ZMM/YMM/XMM register, a 512/256/128-bit memory location or a 512/256/128-bit vector broadcasted from a 64-bit memory location. The destination is a ZMM/YMM/XMM register, and updated according to the writemask at 64-bit granularity.</p>",
                 "tooltip": "Multiplies packed signed doubleword integers in the even-numbered (zero-based reference) elements of the first source operand with the packed signed doubleword integers in the corresponding elements of the second source operand and stores packed signed quadword results in the destination operand.",
                 "url": "http://www.felixcloutier.com/x86/PMULDQ.html"
             };
 
-        case "VPMULHRSW":
         case "PMULHRSW":
+        case "VPMULHRSW":
             return {
                 "html": "<p>PMULHRSW multiplies vertically each signed 16-bit integer from the destination operand (first operand) with the corresponding signed 16-bit integer of the source operand (second operand), producing intermediate, signed 32-bit integers. Each intermediate 32-bit integer is truncated to the 18 most significant bits. Rounding is always performed by adding 1 to the least significant bit of the 18-bit intermediate result. The final result is obtained by selecting the 16 bits immediately to the right of the most significant bit of each 18-bit intermediate result and packed to the destination operand.</p><p>When the source operand is a 128-bit memory operand, the operand must be aligned on a 16-byte boundary or a general-protection exception (#GP) will be generated.</p><p>In 64-bit mode and not encoded with VEX/EVEX, use the REX prefix to access XMM8-XMM15 registers.</p><p>Legacy SSE version 64-bit operand: Both operands can be MMX registers. The second source operand is an MMX register or a 64-bit memory location.</p><p>128-bit Legacy SSE version: The first source and destination operands are XMM registers. The second source operand is an XMM register or a 128-bit memory location. Bits (MAXVL-1:128) of the corresponding YMM destination register remain unchanged.</p>",
                 "tooltip": "PMULHRSW multiplies vertically each signed 16-bit integer from the destination operand (first operand) with the corresponding signed 16-bit integer of the source operand (second operand), producing intermediate, signed 32-bit integers. Each intermediate 32-bit integer is truncated to the 18 most significant bits. Rounding is always performed by adding 1 to the least significant bit of the 18-bit intermediate result. The final result is obtained by selecting the 16 bits immediately to the right of the most significant bit of each 18-bit intermediate result and packed to the destination operand.",
                 "url": "http://www.felixcloutier.com/x86/PMULHRSW.html"
             };
 
-        case "VPMULHUW":
         case "PMULHUW":
+        case "VPMULHUW":
             return {
                 "html": "<p>Performs a SIMD unsigned multiply of the packed unsigned word integers in the destination operand (first operand) and the source operand (second operand), and stores the high 16 bits of each 32-bit intermediate results in the destination operand. (<a href=\"http://www.felixcloutier.com/x86/PMULHUW.html#fig-4-12\" rel=\"noreferrer noopener\" target=\"_blank\">Figure 4-12</a> shows this operation when using 64-bit operands.)</p><p>In 64-bit mode and not encoded with VEX/EVEX, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p><p>Legacy SSE version 64-bit operand: The source operand can be an MMX technology register or a 64-bit memory location. The destination operand is an MMX technology register.</p><p>128-bit Legacy SSE version: The first source and destination operands are XMM registers. The second source operand is an XMM register or a 128-bit memory location. Bits (MAXVL-1:128) of the corresponding YMM destination register remain unchanged.</p><p>VEX.128 encoded version: The first source and destination operands are XMM registers. The second source operand is an XMM register or a 128-bit memory location. Bits (MAXVL-1:128) of the destination YMM register are zeroed. VEX.L must be 0, otherwise the instruction will #UD.</p>",
                 "tooltip": "Performs a SIMD unsigned multiply of the packed unsigned word integers in the destination operand (first operand) and the source operand (second operand), and stores the high 16 bits of each 32-bit intermediate results in the destination operand. (Figure 4-12 shows this operation when using 64-bit operands.)",
                 "url": "http://www.felixcloutier.com/x86/PMULHUW.html"
             };
 
-        case "VPMULHW":
         case "PMULHW":
+        case "VPMULHW":
             return {
                 "html": "<p>Performs a SIMD signed multiply of the packed signed word integers in the destination operand (first operand) and the source operand (second operand), and stores the high 16 bits of each intermediate 32-bit result in the destination operand. (<a href=\"http://www.felixcloutier.com/x86/PMULHUW.html#fig-4-12\" rel=\"noreferrer noopener\" target=\"_blank\">Figure 4-12</a> shows this operation when using 64-bit operands.)</p><p>n 64-bit mode and not encoded with VEX/EVEX, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p><p>Legacy SSE version 64-bit operand: The source operand can be an MMX technology register or a 64-bit memory location. The destination operand is an MMX technology register.</p><p>128-bit Legacy SSE version: The first source and destination operands are XMM registers. The second source operand is an XMM register or a 128-bit memory location. Bits (MAXVL-1:128) of the corresponding YMM destination register remain unchanged.</p><p>VEX.128 encoded version: The first source and destination operands are XMM registers. The second source operand is an XMM register or a 128-bit memory location. Bits (MAXVL-1:128) of the destination YMM register are zeroed. VEX.L must be 0, otherwise the instruction will #UD.</p>",
                 "tooltip": "Performs a SIMD signed multiply of the packed signed word integers in the destination operand (first operand) and the source operand (second operand), and stores the high 16 bits of each intermediate 32-bit result in the destination operand. (Figure 4-12 shows this operation when using 64-bit operands.)",
@@ -2626,24 +2627,24 @@ export function getAsmOpcode(opcode) {
             };
 
         case "PMULLD":
-        case "VPMULLQ":
         case "VPMULLD":
+        case "VPMULLQ":
             return {
                 "html": "<p>Performs a SIMD signed multiply of the packed signed dword/qword integers from each element of the first source operand with the corresponding element in the second source operand. The low 32/64 bits of each 64/128-bit intermediate results are stored to the destination operand.</p><p>128-bit Legacy SSE version: The first source and destination operands are XMM registers. The second source operand is an XMM register or a 128-bit memory location. Bits (MAXVL-1:128) of the corresponding ZMM destination register remain unchanged.</p><p>VEX.128 encoded version: The first source and destination operands are XMM registers. The second source operand is an XMM register or a 128-bit memory location. Bits (MAXVL-1:128) of the corresponding ZMM register are zeroed.</p><p>VEX.256 encoded version: The first source operand is a YMM register; The second source operand is a YMM register or 256-bit memory location. Bits (MAXVL-1:256) of the corresponding destination ZMM register are zeroed.</p><p>EVEX encoded versions: The first source operand is a ZMM/YMM/XMM register. The second source operand is a ZMM/YMM/XMM register, a 512/256/128-bit memory location or a 512/256/128-bit vector broadcasted from a 32/64-bit memory location. The destination operand is conditionally updated based on writemask k1.</p>",
                 "tooltip": "Performs a SIMD signed multiply of the packed signed dword/qword integers from each element of the first source operand with the corresponding element in the second source operand. The low 32/64 bits of each 64/128-bit intermediate results are stored to the destination operand.",
                 "url": "http://www.felixcloutier.com/x86/PMULLD%3APMULLQ.html"
             };
 
-        case "VPMULLW":
         case "PMULLW":
+        case "VPMULLW":
             return {
                 "html": "<p>Performs a SIMD signed multiply of the packed signed word integers in the destination operand (first operand) and the source operand (second operand), and stores the low 16 bits of each intermediate 32-bit result in the destination operand. (<a href=\"http://www.felixcloutier.com/x86/PMULHUW.html#fig-4-12\" rel=\"noreferrer noopener\" target=\"_blank\">Figure 4-12</a> shows this operation when using 64-bit operands.)</p><p>In 64-bit mode and not encoded with VEX/EVEX, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p><p>Legacy SSE version 64-bit operand: The source operand can be an MMX technology register or a 64-bit memory location. The destination operand is an MMX technology register.</p><p>128-bit Legacy SSE version: The first source and destination operands are XMM registers. The second source operand is an XMM register or a 128-bit memory location. Bits (MAXVL-1:128) of the corresponding YMM destination register remain unchanged.</p><p>VEX.128 encoded version: The first source and destination operands are XMM registers. The second source operand is an XMM register or a 128-bit memory location. Bits (MAXVL-1:128) of the destination YMM register are zeroed. VEX.L must be 0, otherwise the instruction will #UD.</p>",
                 "tooltip": "Performs a SIMD signed multiply of the packed signed word integers in the destination operand (first operand) and the source operand (second operand), and stores the low 16 bits of each intermediate 32-bit result in the destination operand. (Figure 4-12 shows this operation when using 64-bit operands.)",
                 "url": "http://www.felixcloutier.com/x86/PMULLW.html"
             };
 
-        case "VPMULUDQ":
         case "PMULUDQ":
+        case "VPMULUDQ":
             return {
                 "html": "<p>Multiplies the first operand (destination operand) by the second operand (source operand) and stores the result in the destination operand.</p><p>In 64-bit mode and not encoded with VEX/EVEX, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p><p>Legacy SSE version 64-bit operand: The source operand can be an unsigned doubleword integer stored in the low doubleword of an MMX technology register or a 64-bit memory location. The destination operand can be an unsigned doubleword integer stored in the low doubleword an MMX technology register. The result is an unsigned</p><p>quadword integer stored in the destination an MMX technology register. When a quadword result is too large to be represented in 64 bits (overflow), the result is wrapped around and the low 64 bits are written to the destination element (that is, the carry is ignored).</p><p>For 64-bit memory operands, 64 bits are fetched from memory, but only the low doubleword is used in the computation.</p>",
                 "tooltip": "Multiplies the first operand (destination operand) by the second operand (source operand) and stores the result in the destination operand.",
@@ -2672,19 +2673,19 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/POPCNT.html"
             };
 
-        case "POPFQ":
-        case "POPFD":
         case "POPF":
+        case "POPFD":
+        case "POPFQ":
             return {
                 "html": "<p>Pops a doubleword (POPFD) from the top of the stack (if the current operand-size attribute is 32) and stores the value in the EFLAGS register, or pops a word from the top of the stack (if the operand-size attribute is 16) and stores it in the lower 16 bits of the EFLAGS register (that is, the FLAGS register). These instructions reverse the operation of the PUSHF/PUSHFD/PUSHFQ instructions.</p><p>The POPF (pop flags) and POPFD (pop flags double) mnemonics reference the same opcode. The POPF instruction is intended for use when the operand-size attribute is 16; the POPFD instruction is intended for use when the operand-size attribute is 32. Some assemblers may force the operand size to 16 for POPF and to 32 for POPFD. Others may treat the mnemonics as synonyms (POPF/POPFD) and use the setting of the operand-size attribute to determine the size of values to pop from the stack.</p><p>The effect of POPF/POPFD on the EFLAGS register changes, depending on the mode of operation. See <a href=\"http://www.felixcloutier.com/x86/POPF:POPFD:POPFQ.html#tbl-4-15\" rel=\"noreferrer noopener\" target=\"_blank\">Table 4-15</a> and the key below for details.</p><p>When operating in protected, compatibility, or 64-bit mode at privilege level 0 (or in real-address mode, the equivalent to privilege level 0), all non-reserved flags in the EFLAGS register except RF<sup>1</sup>, VIP, VIF, and VM may be modified. VIP, VIF and VM remain unaffected.</p><p>When operating in protected, compatibility, or 64-bit mode with a privilege level greater than 0, but less than or equal to IOPL, all flags can be modified except the IOPL field and RF, IF, VIP, VIF, and VM; these remain unaffected. The AC and ID flags can only be modified if the operand-size attribute is 32. The interrupt flag (IF) is altered only when executing at a level at least as privileged as the IOPL. If a POPF/POPFD instruction is executed with insufficient privilege, an exception does not occur but privileged bits do not change.</p>",
                 "tooltip": "Pops a doubleword (POPFD) from the top of the stack (if the current operand-size attribute is 32) and stores the value in the EFLAGS register, or pops a word from the top of the stack (if the operand-size attribute is 16) and stores it in the lower 16 bits of the EFLAGS register (that is, the FLAGS register). These instructions reverse the operation of the PUSHF/PUSHFD/PUSHFQ instructions.",
                 "url": "http://www.felixcloutier.com/x86/POPF%3APOPFD%3APOPFQ.html"
             };
 
-        case "VPORD":
+        case "POR":
         case "VPOR":
+        case "VPORD":
         case "VPORQ":
-        case "POR":
             return {
                 "html": "<p>Performs a bitwise logical OR operation on the source operand (second operand) and the destination operand (first operand) and stores the result in the destination operand. Each bit of the result is set to 1 if either or both of the corresponding bits of the first and second operands are 1; otherwise, it is set to 0.</p><p>In 64-bit mode and not encoded with VEX/EVEX, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p><p>Legacy SSE version: The source operand can be an MMX technology register or a 64-bit memory location. The destination operand is an MMX technology register.</p><p>128-bit Legacy SSE version: The second source operand is an XMM register or a 128-bit memory location. The first source and destination operands can be XMM registers. Bits (MAXVL-1:128) of the corresponding YMM destination register remain unchanged.</p><p>VEX.128 encoded version: The second source operand is an XMM register or a 128-bit memory location. The first source and destination operands can be XMM registers. Bits (MAXVL-1:128) of the destination YMM register are zeroed.</p>",
                 "tooltip": "Performs a bitwise logical OR operation on the source operand (second operand) and the destination operand (first operand) and stores the result in the destination operand. Each bit of the result is set to 1 if either or both of the corresponding bits of the first and second operands are 1; otherwise, it is set to 0.",
@@ -2706,8 +2707,8 @@ export function getAsmOpcode(opcode) {
             };
 
         case "PREFETCHNTA":
-        case "PREFETCHT1":
         case "PREFETCHT0":
+        case "PREFETCHT1":
         case "PREFETCHT2":
             return {
                 "html": "<p>Fetches the line of data from memory that contains the byte specified with the source operand to a location in the cache hierarchy specified by a locality hint:</p><p>The source operand is a byte memory location. (The locality hints are encoded into the machine level instruction using bits 3 through 5 of the ModR/M byte.)</p><p>If the line selected is already present in the cache hierarchy at a level closer to the processor, no data movement occurs. Prefetches from uncacheable or WC memory are ignored.</p><p>The PREFETCH<em>h</em> instruction is merely a hint and does not affect program behavior. If executed, this instruction moves data closer to the processor in anticipation of future use.</p><p>The implementation of prefetch locality hints is implementation-dependent, and can be overloaded or ignored by a processor implementation. The amount of data prefetched is also processor implementation-dependent. It will, however, be a minimum of 32 bytes. Additional details of the implementation-dependent locality hints are described in Section 7.4 of <em>Intel\u00ae 64 and IA-32 Architectures Optimization Reference Manual</em>.</p>",
@@ -2731,16 +2732,16 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/PSHUFB.html"
             };
 
-        case "VPSHUFD":
         case "PSHUFD":
+        case "VPSHUFD":
             return {
                 "html": "<p>Copies doublewords from source operand (second operand) and inserts them in the destination operand (first operand) at the locations selected with the order operand (third operand). <span class=\"not-imported\">Figure 4-16</span> shows the operation of the 256-bit VPSHUFD instruction and the encoding of the order operand. Each 2-bit field in the order operand selects the contents of one doubleword location within a 128-bit lane and copy to the target element in the destination operand. For example, bits 0 and 1 of the order operand targets the first doubleword element in the low and high 128-bit lane of the destination operand for 256-bit VPSHUFD. The encoded value of bits 1:0 of the order operand (see the field encoding in <span class=\"not-imported\">Figure 4-16</span>) determines which doubleword element (from the respective 128-bit lane) of the source operand will be copied to doubleword 0 of the destination operand.</p><p>For 128-bit operation, only the low 128-bit lane are operative. The source operand can be an XMM register or a 128-bit memory location. The destination operand is an XMM register. The order operand is an 8-bit immediate. Note that this instruction permits a doubleword in the source operand to be copied to more than one doubleword location in the destination operand.</p><p>10B - X2 ORDER Operand 11B-X7 7 6 5 4 3 2 1 0 Operand 11B-X3</p><p>The source operand can be an XMM register or a 128-bit memory location. The destination operand is an XMM register. The order operand is an 8-bit immediate. Note that this instruction permits a doubleword in the source operand to be copied to more than one doubleword location in the destination operand.</p><p>In 64-bit mode and not encoded in VEX/EVEX, using REX.R permits this instruction to access XMM8-XMM15.</p>",
                 "tooltip": "Copies doublewords from source operand (second operand) and inserts them in the destination operand (first operand) at the locations selected with the order operand (third operand). Figure 4-16 shows the operation of the 256-bit VPSHUFD instruction and the encoding of the order operand. Each 2-bit field in the order operand selects the contents of one doubleword location within a 128-bit lane and copy to the target element in the destination operand. For example, bits 0 and 1 of the order operand targets the first doubleword element in the low and high 128-bit lane of the destination operand for 256-bit VPSHUFD. The encoded value of bits 1:0 of the order operand (see the field encoding in Figure 4-16) determines which doubleword element (from the respective 128-bit lane) of the source operand will be copied to doubleword 0 of the destination operand.",
                 "url": "http://www.felixcloutier.com/x86/PSHUFD.html"
             };
 
-        case "VPSHUFHW":
         case "PSHUFHW":
+        case "VPSHUFHW":
             return {
                 "html": "<p>Copies words from the high quadword of a 128-bit lane of the source operand and inserts them in the high quadword of the destination operand at word locations (of the respective lane) selected with the immediate operand. This 256-bit operation is similar to the in-lane operation used by the 256-bit VPSHUFD instruction, which is illustrated in <span class=\"not-imported\">Figure 4-16</span>. For 128-bit operation, only the low 128-bit lane is operative. Each 2-bit field in the immediate operand selects the contents of one word location in the high quadword of the destination operand. The binary encodings of the immediate operand fields select words (0, 1, 2 or 3, 4) from the high quadword of the source operand to be copied to the destination operand. The low quadword of the source operand is copied to the low quadword of the destination operand, for each 128-bit lane.</p><p>Note that this instruction permits a word in the high quadword of the source operand to be copied to more than one word location in the high quadword of the destination operand.</p><p>In 64-bit mode and not encoded with VEX/EVEX, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p><p>128-bit Legacy SSE version: The destination operand is an XMM register. The source operand can be an XMM register or a 128-bit memory location. Bits (MAXVL-1:128) of the corresponding YMM destination register remain unchanged.</p><p>VEX.128 encoded version: The destination operand is an XMM register. The source operand can be an XMM register or a 128-bit memory location. Bits (MAXVL-1:128) of the destination YMM register are zeroed. VEX.vvvv is reserved and must be 1111b, VEX.L must be 0, otherwise the instruction will #UD.</p>",
                 "tooltip": "Copies words from the high quadword of a 128-bit lane of the source operand and inserts them in the high quadword of the destination operand at word locations (of the respective lane) selected with the immediate operand. This 256-bit operation is similar to the in-lane operation used by the 256-bit VPSHUFD instruction, which is illustrated in Figure 4-16. For 128-bit operation, only the low 128-bit lane is operative. Each 2-bit field in the immediate operand selects the contents of one word location in the high quadword of the destination operand. The binary encodings of the immediate operand fields select words (0, 1, 2 or 3, 4) from the high quadword of the source operand to be copied to the destination operand. The low quadword of the source operand is copied to the low quadword of the destination operand, for each 128-bit lane.",
@@ -2762,32 +2763,32 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/PSHUFW.html"
             };
 
+        case "PSIGNB":
         case "PSIGND":
         case "PSIGNW":
         case "VPSIGNB":
-        case "VPSIGNW":
-        case "PSIGNB":
         case "VPSIGND":
+        case "VPSIGNW":
             return {
                 "html": "<p>(V)PSIGNB/(V)PSIGNW/(V)PSIGND negates each data element of the destination operand (the first operand) if the signed integer value of the corresponding data element in the source operand (the second operand) is less than zero. If the signed integer value of a data element in the source operand is positive, the corresponding data element in the destination operand is unchanged. If a data element in the source operand is zero, the corresponding data element in the destination operand is set to zero.</p><p>(V)PSIGNB operates on signed bytes. (V)PSIGNW operates on 16-bit signed words. (V)PSIGND operates on signed 32-bit integers. When the source operand is a 128bit memory operand, the operand must be aligned on a 16-byte boundary or a general-protection exception (#GP) will be generated.</p><p>Legacy SSE instructions: Both operands can be MMX registers. In 64-bit mode, use the REX prefix to access additional registers.</p><p>128-bit Legacy SSE version: The first source and destination operands are XMM registers. The second source operand is an XMM register or a 128-bit memory location. Bits (MAXVL-1:128) of the corresponding YMM destination register remain unchanged.</p><p>VEX.128 encoded version: The first source and destination operands are XMM registers. The second source operand is an XMM register or a 128-bit memory location. Bits (MAXVL-1:128) of the destination YMM register are zeroed. VEX.L must be 0, otherwise instructions will #UD.</p>",
                 "tooltip": "(V)PSIGNB/(V)PSIGNW/(V)PSIGND negates each data element of the destination operand (the first operand) if the signed integer value of the corresponding data element in the source operand (the second operand) is less than zero. If the signed integer value of a data element in the source operand is positive, the corresponding data element in the destination operand is unchanged. If a data element in the source operand is zero, the corresponding data element in the destination operand is set to zero.",
                 "url": "http://www.felixcloutier.com/x86/PSIGNB%3APSIGNW%3APSIGND.html"
             };
 
-        case "VPSLLDQ":
         case "PSLLDQ":
+        case "VPSLLDQ":
             return {
                 "html": "<p>Shifts the destination operand (first operand) to the left by the number of bytes specified in the count operand (second operand). The empty low-order bytes are cleared (set to all 0s). If the value specified by the count operand is greater than 15, the destination operand is set to all 0s. The count operand is an 8-bit immediate.</p><p>128-bit Legacy SSE version: The source and destination operands are the same. Bits (MAXVL-1:128) of the corresponding YMM destination register remain unchanged.</p><p>VEX.128 encoded version: The source and destination operands are XMM registers. Bits (MAXVL-1:128) of the destination YMM register are zeroed.</p><p>VEX.256 encoded version: The source operand is YMM register. The destination operand is an YMM register. Bits (MAXVL-1:256) of the corresponding ZMM register are zeroed. The count operand applies to both the low and high 128-bit lanes.</p><p>EVEX encoded versions: The source operand is a ZMM/YMM/XMM register or a 512/256/128-bit memory location. The destination operand is a ZMM/YMM/XMM register. The count operand applies to each 128-bit lanes.</p>",
                 "tooltip": "Shifts the destination operand (first operand) to the left by the number of bytes specified in the count operand (second operand). The empty low-order bytes are cleared (set to all 0s). If the value specified by the count operand is greater than 15, the destination operand is set to all 0s. The count operand is an 8-bit immediate.",
                 "url": "http://www.felixcloutier.com/x86/PSLLDQ.html"
             };
 
-        case "VPSLLW":
+        case "PSLLD":
         case "PSLLQ":
         case "PSLLW":
         case "VPSLLD":
         case "VPSLLQ":
-        case "PSLLD":
+        case "VPSLLW":
             return {
                 "html": "<p>Shifts the bits in the individual data elements (words, doublewords, or quadword) in the destination operand (first operand) to the left by the number of bits specified in the count operand (second operand). As the bits in the data elements are shifted left, the empty low-order bits are cleared (set to 0). If the value specified by the count operand is greater than 15 (for words), 31 (for doublewords), or 63 (for a quadword), then the destination operand is set to all 0s. <a href=\"http://www.felixcloutier.com/x86/PSLLW:PSLLD:PSLLQ.html#fig-4-17\" rel=\"noreferrer noopener\" target=\"_blank\">Figure 4-17</a> gives an example of shifting words in a 64-bit operand.</p><p>The (V)PSLLW instruction shifts each of the words in the destination operand to the left by the number of bits specified in the count operand; the (V)PSLLD instruction shifts each of the doublewords in the destination operand; and the (V)PSLLQ instruction shifts the quadword (or quadwords) in the destination operand.</p><p>In 64-bit mode and not encoded with VEX/EVEX, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p><p>Legacy SSE instructions 64-bit operand: The destination operand is an MMX technology register; the count operand can be either an MMX technology register or an 64-bit memory location.</p><p>128-bit Legacy SSE version: The destination and first source operands are XMM registers. Bits (MAXVL-1:128) of the corresponding YMM destination register remain unchanged. The count operand can be either an XMM register or a 128-bit memory location or an 8-bit immediate. If the count operand is a memory address, 128 bits are loaded but the upper 64 bits are ignored.</p>",
                 "tooltip": "Shifts the bits in the individual data elements (words, doublewords, or quadword) in the destination operand (first operand) to the left by the number of bits specified in the count operand (second operand). As the bits in the data elements are shifted left, the empty low-order bits are cleared (set to 0). If the value specified by the count operand is greater than 15 (for words), 31 (for doublewords), or 63 (for a quadword), then the destination operand is set to all 0s. Figure 4-17 gives an example of shifting words in a 64-bit operand.",
@@ -2795,29 +2796,30 @@ export function getAsmOpcode(opcode) {
             };
 
         case "PSRAD":
-        case "VPSRAW":
-        case "VPSRAD":
         case "PSRAW":
+        case "VPSRAD":
+        case "VPSRAQ":
+        case "VPSRAW":
             return {
                 "html": "<p>Shifts the bits in the individual data elements (words, doublewords or quadwords) in the destination operand (first operand) to the right by the number of bits specified in the count operand (second operand). As the bits in the data elements are shifted right, the empty high-order bits are filled with the initial value of the sign bit of the data element. If the value specified by the count operand is greater than 15 (for words), 31 (for doublewords), or 63 (for quadwords), each destination data element is filled with the initial value of the sign bit of the element. (<a href=\"http://www.felixcloutier.com/x86/PSRAW:PSRAD:PSRAQ.html#fig-4-18\" rel=\"noreferrer noopener\" target=\"_blank\">Figure 4-18</a> gives an example of shifting words in a 64-bit operand.)</p><p>Note that only the first 64-bits of a 128-bit count operand are checked to compute the count. If the second source operand is a memory address, 128 bits are loaded.</p><p>The (V)PSRAW instruction shifts each of the words in the destination operand to the right by the number of bits specified in the count operand, and the (V)PSRAD instruction shifts each of the doublewords in the destination operand.</p><p>In 64-bit mode and not encoded with VEX/EVEX, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p><p>Legacy SSE instructions 64-bit operand: The destination operand is an MMX technology register; the count operand can be either an MMX technology register or an 64-bit memory location.</p>",
                 "tooltip": "Shifts the bits in the individual data elements (words, doublewords or quadwords) in the destination operand (first operand) to the right by the number of bits specified in the count operand (second operand). As the bits in the data elements are shifted right, the empty high-order bits are filled with the initial value of the sign bit of the data element. If the value specified by the count operand is greater than 15 (for words), 31 (for doublewords), or 63 (for quadwords), each destination data element is filled with the initial value of the sign bit of the element. (Figure 4-18 gives an example of shifting words in a 64-bit operand.)",
                 "url": "http://www.felixcloutier.com/x86/PSRAW%3APSRAD%3APSRAQ.html"
             };
 
-        case "VPSRLDQ":
         case "PSRLDQ":
+        case "VPSRLDQ":
             return {
                 "html": "<p>Shifts the destination operand (first operand) to the right by the number of bytes specified in the count operand (second operand). The empty high-order bytes are cleared (set to all 0s). If the value specified by the count operand is greater than 15, the destination operand is set to all 0s. The count operand is an 8-bit immediate.</p><p>In 64-bit mode and not encoded with VEX/EVEX, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p><p>128-bit Legacy SSE version: The source and destination operands are the same. Bits (MAXVL-1:128) of the corresponding YMM destination register remain unchanged.</p><p>VEX.128 encoded version: The source and destination operands are XMM registers. Bits (MAXVL-1:128) of the destination YMM register are zeroed.</p><p>VEX.256 encoded version: The source operand is a YMM register. The destination operand is a YMM register. The count operand applies to both the low and high 128-bit lanes.</p>",
                 "tooltip": "Shifts the destination operand (first operand) to the right by the number of bytes specified in the count operand (second operand). The empty high-order bytes are cleared (set to all 0s). If the value specified by the count operand is greater than 15, the destination operand is set to all 0s. The count operand is an 8-bit immediate.",
                 "url": "http://www.felixcloutier.com/x86/PSRLDQ.html"
             };
 
-        case "VPSRLW":
         case "PSRLD":
+        case "PSRLQ":
         case "PSRLW":
-        case "VPSRLQ":
         case "VPSRLD":
-        case "PSRLQ":
+        case "VPSRLQ":
+        case "VPSRLW":
             return {
                 "html": "<p>Shifts the bits in the individual data elements (words, doublewords, or quadword) in the destination operand (first operand) to the right by the number of bits specified in the count operand (second operand). As the bits in the data elements are shifted right, the empty high-order bits are cleared (set to 0). If the value specified by the count operand is greater than 15 (for words), 31 (for doublewords), or 63 (for a quadword), then the destination operand is set to all 0s. <a href=\"http://www.felixcloutier.com/x86/PSRLW:PSRLD:PSRLQ.html#fig-4-19\" rel=\"noreferrer noopener\" target=\"_blank\">Figure 4-19</a> gives an example of shifting words in a 64-bit operand.</p><p>Note that only the low 64-bits of a 128-bit count operand are checked to compute the count.</p><p>The (V)PSRLW instruction shifts each of the words in the destination operand to the right by the number of bits specified in the count operand; the (V)PSRLD instruction shifts each of the doublewords in the destination operand; and the PSRLQ instruction shifts the quadword (or quadwords) in the destination operand.</p><p>In 64-bit mode and not encoded with VEX/EVEX, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p><p>Legacy SSE instruction 64-bit operand: The destination operand is an MMX technology register; the count operand can be either an MMX technology register or an 64-bit memory location.</p>",
                 "tooltip": "Shifts the bits in the individual data elements (words, doublewords, or quadword) in the destination operand (first operand) to the right by the number of bits specified in the count operand (second operand). As the bits in the data elements are shifted right, the empty high-order bits are cleared (set to 0). If the value specified by the count operand is greater than 15 (for words), 31 (for doublewords), or 63 (for a quadword), then the destination operand is set to all 0s. Figure 4-19 gives an example of shifting words in a 64-bit operand.",
@@ -2825,11 +2827,11 @@ export function getAsmOpcode(opcode) {
             };
 
         case "PSUBB":
-        case "VPSUBB":
-        case "VPSUBW":
-        case "VPSUBD":
         case "PSUBD":
         case "PSUBW":
+        case "VPSUBB":
+        case "VPSUBD":
+        case "VPSUBW":
             return {
                 "html": "<p>Performs a SIMD subtract of the packed integers of the source operand (second operand) from the packed integers of the destination operand (first operand), and stores the packed integer results in the destination operand. See <span class=\"not-imported\">Figure 9-4</span> in the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>, for an illustration of a SIMD operation. Overflow is handled with wraparound, as described in the following paragraphs.</p><p>The (V)PSUBB instruction subtracts packed byte integers. When an individual result is too large or too small to be represented in a byte, the result is wrapped around and the low 8 bits are written to the destination element.</p><p>The (V)PSUBW instruction subtracts packed word integers. When an individual result is too large or too small to be represented in a word, the result is wrapped around and the low 16 bits are written to the destination element.</p><p>The (V)PSUBD instruction subtracts packed doubleword integers. When an individual result is too large or too small to be represented in a doubleword, the result is wrapped around and the low 32 bits are written to the destination element.</p><p>Note that the (V)PSUBB, (V)PSUBW, and (V)PSUBD instructions can operate on either unsigned or signed (two's complement notation) packed integers; however, it does not set bits in the EFLAGS register to indicate overflow and/or a carry. To prevent undetected overflow conditions, software must control the ranges of values upon which it operates.</p>",
                 "tooltip": "Performs a SIMD subtract of the packed integers of the source operand (second operand) from the packed integers of the destination operand (first operand), and stores the packed integer results in the destination operand. See Figure 9-4 in the Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1, for an illustration of a SIMD operation. Overflow is handled with wraparound, as described in the following paragraphs.",
@@ -2845,8 +2847,8 @@ export function getAsmOpcode(opcode) {
             };
 
         case "PSUBSB":
-        case "VPSUBSB":
         case "PSUBSW":
+        case "VPSUBSB":
         case "VPSUBSW":
             return {
                 "html": "<p>Performs a SIMD subtract of the packed signed integers of the source operand (second operand) from the packed signed integers of the destination operand (first operand), and stores the packed integer results in the destination operand. See <span class=\"not-imported\">Figure 9-4</span> in the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>, for an illustration of a SIMD operation. Overflow is handled with signed saturation, as described in the following paragraphs.</p><p>The (V)PSUBSB instruction subtracts packed signed byte integers. When an individual byte result is beyond the range of a signed byte integer (that is, greater than 7FH or less than 80H), the saturated value of 7FH or 80H, respectively, is written to the destination operand.</p><p>The (V)PSUBSW instruction subtracts packed signed word integers. When an individual word result is beyond the range of a signed word integer (that is, greater than 7FFFH or less than 8000H), the saturated value of 7FFFH or 8000H, respectively, is written to the destination operand.</p><p>In 64-bit mode and not encoded with VEX/EVEX, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p><p>Legacy SSE version 64-bit operand: The destination operand must be an MMX technology register and the source operand can be either an MMX technology register or a 64-bit memory location.</p>",
@@ -2855,17 +2857,17 @@ export function getAsmOpcode(opcode) {
             };
 
         case "PSUBUSB":
-        case "VPSUBUSW":
         case "PSUBUSW":
         case "VPSUBUSB":
+        case "VPSUBUSW":
             return {
                 "html": "<p>Performs a SIMD subtract of the packed unsigned integers of the source operand (second operand) from the packed unsigned integers of the destination operand (first operand), and stores the packed unsigned integer results in the destination operand. See <span class=\"not-imported\">Figure 9-4</span> in the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>, for an illustration of a SIMD operation. Overflow is handled with unsigned saturation, as described in the following paragraphs.</p><p>These instructions can operate on either 64-bit or 128-bit operands.</p><p>The (V)PSUBUSB instruction subtracts packed unsigned byte integers. When an individual byte result is less than zero, the saturated value of 00H is written to the destination operand.</p><p>The (V)PSUBUSW instruction subtracts packed unsigned word integers. When an individual word result is less than zero, the saturated value of 0000H is written to the destination operand.</p><p>In 64-bit mode and not encoded with VEX/EVEX, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p>",
                 "tooltip": "Performs a SIMD subtract of the packed unsigned integers of the source operand (second operand) from the packed unsigned integers of the destination operand (first operand), and stores the packed unsigned integer results in the destination operand. See Figure 9-4 in the Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1, for an illustration of a SIMD operation. Overflow is handled with unsigned saturation, as described in the following paragraphs.",
                 "url": "http://www.felixcloutier.com/x86/PSUBUSB%3APSUBUSW.html"
             };
 
-        case "VPTEST":
         case "PTEST":
+        case "VPTEST":
             return {
                 "html": "<p>PTEST and VPTEST set the ZF flag if all bits in the result are 0 of the bitwise AND of the first source operand (first operand) and the second source operand (second operand). VPTEST sets the CF flag if all bits in the result are 0 of the bitwise AND of the second source operand (second operand) and the logical NOT of the destination operand.</p><p>The first source register is specified by the ModR/M <em>reg</em> field.</p><p>128-bit versions: The first source register is an XMM register. The second source register can be an XMM register or a 128-bit memory location. The destination register is not modified.</p><p>VEX.256 encoded version: The first source register is a YMM register. The second source register can be a YMM register or a 256-bit memory location. The destination register is not modified.</p><p>Note: In VEX-encoded versions, VEX.vvvv is reserved and must be 1111b, otherwise instructions will #UD.</p>",
                 "tooltip": "PTEST and VPTEST set the ZF flag if all bits in the result are 0 of the bitwise AND of the first source operand (first operand) and the second source operand (second operand). VPTEST sets the CF flag if all bits in the result are 0 of the bitwise AND of the second source operand (second operand) and the logical NOT of the destination operand.",
@@ -2879,28 +2881,28 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/PTWRITE.html"
             };
 
-        case "VPUNPCKHWD":
-        case "VPUNPCKHQDQ":
-        case "PUNPCKHWD":
+        case "PUNPCKHBW":
         case "PUNPCKHDQ":
-        case "VPUNPCKHBW":
         case "PUNPCKHQDQ":
-        case "PUNPCKHBW":
+        case "PUNPCKHWD":
+        case "VPUNPCKHBW":
         case "VPUNPCKHDQ":
+        case "VPUNPCKHQDQ":
+        case "VPUNPCKHWD":
             return {
                 "html": "<p>Unpacks and interleaves the high-order data elements (bytes, words, doublewords, or quadwords) of the destination operand (first operand) and source operand (second operand) into the destination operand. <a href=\"http://www.felixcloutier.com/x86/PUNPCKHBW:PUNPCKHWD:PUNPCKHDQ:PUNPCKHQDQ.html#fig-4-20\" rel=\"noreferrer noopener\" target=\"_blank\">Figure 4-20</a> shows the unpack operation for bytes in 64-bit operands. The low-order data elements are ignored.</p>",
                 "tooltip": "Unpacks and interleaves the high-order data elements (bytes, words, doublewords, or quadwords) of the destination operand (first operand) and source operand (second operand) into the destination operand. Figure 4-20 shows the unpack operation for bytes in 64-bit operands. The low-order data elements are ignored.",
                 "url": "http://www.felixcloutier.com/x86/PUNPCKHBW%3APUNPCKHWD%3APUNPCKHDQ%3APUNPCKHQDQ.html"
             };
 
-        case "VPUNPCKLBW":
-        case "VPUNPCKLWD":
-        case "PUNPCKLWD":
-        case "VPUNPCKLQDQ":
+        case "PUNPCKLBW":
+        case "PUNPCKLDQ":
         case "PUNPCKLQDQ":
+        case "PUNPCKLWD":
+        case "VPUNPCKLBW":
         case "VPUNPCKLDQ":
-        case "PUNPCKLDQ":
-        case "PUNPCKLBW":
+        case "VPUNPCKLQDQ":
+        case "VPUNPCKLWD":
             return {
                 "html": "<p>Unpacks and interleaves the low-order data elements (bytes, words, doublewords, and quadwords) of the destination operand (first operand) and source operand (second operand) into the destination operand. (<a href=\"http://www.felixcloutier.com/x86/PUNPCKLBW:PUNPCKLWD:PUNPCKLDQ:PUNPCKLQDQ.html#fig-4-22\" rel=\"noreferrer noopener\" target=\"_blank\">Figure 4-22</a> shows the unpack operation for bytes in 64-bit operands.). The high-order data elements are ignored.</p>",
                 "tooltip": "Unpacks and interleaves the low-order data elements (bytes, words, doublewords, and quadwords) of the destination operand (first operand) and source operand (second operand) into the destination operand. (Figure 4-22 shows the unpack operation for bytes in 64-bit operands.). The high-order data elements are ignored.",
@@ -2922,28 +2924,28 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/PUSHA%3APUSHAD.html"
             };
 
-        case "PUSHFQ":
-        case "PUSHFD":
         case "PUSHF":
+        case "PUSHFD":
+        case "PUSHFQ":
             return {
                 "html": "<p>Decrements the stack pointer by 4 (if the current operand-size attribute is 32) and pushes the entire contents of the EFLAGS register onto the stack, or decrements the stack pointer by 2 (if the operand-size attribute is 16) and pushes the lower 16 bits of the EFLAGS register (that is, the FLAGS register) onto the stack. These instructions reverse the operation of the POPF/POPFD instructions.</p><p>When copying the entire EFLAGS register to the stack, the VM and RF flags (bits 16 and 17) are not copied; instead, the values for these flags are cleared in the EFLAGS image stored on the stack. See Chapter 3 of the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>, for more information about the EFLAGS register.</p><p>The PUSHF (push flags) and PUSHFD (push flags double) mnemonics reference the same opcode. The PUSHF instruction is intended for use when the operand-size attribute is 16 and the PUSHFD instruction for when the operand-size attribute is 32. Some assemblers may force the operand size to 16 when PUSHF is used and to 32 when PUSHFD is used. Others may treat these mnemonics as synonyms (PUSHF/PUSHFD) and use the current setting of the operand-size attribute to determine the size of values to be pushed from the stack, regardless of the mnemonic used.</p><p>In 64-bit mode, the instruction\u2019s default operation is to decrement the stack pointer (RSP) by 8 and pushes RFLAGS on the stack. 16-bit operation is supported using the operand size override prefix 66H. 32-bit operand size cannot be encoded in this mode. When copying RFLAGS to the stack, the VM and RF flags (bits 16 and 17) are not copied; instead, values for these flags are cleared in the RFLAGS image stored on the stack.</p><p>When operating in virtual-8086 mode (EFLAGS.VM = 1) without the virtual-8086 mode extensions (CR4.VME = 0), the PUSHF/PUSHFD instructions can be used only if IOPL = 3; otherwise, a general-protection exception (#GP) occurs. If the virtual-8086 mode extensions are enabled (CR4.VME = 1), PUSHF (but not PUSHFD) can be executed in virtual-8086 mode with IOPL &lt; 3.</p>",
                 "tooltip": "Decrements the stack pointer by 4 (if the current operand-size attribute is 32) and pushes the entire contents of the EFLAGS register onto the stack, or decrements the stack pointer by 2 (if the operand-size attribute is 16) and pushes the lower 16 bits of the EFLAGS register (that is, the FLAGS register) onto the stack. These instructions reverse the operation of the POPF/POPFD instructions.",
                 "url": "http://www.felixcloutier.com/x86/PUSHF%3APUSHFD%3APUSHFQ.html"
             };
 
-        case "VPXORQ":
+        case "PXOR":
         case "VPXOR":
         case "VPXORD":
-        case "PXOR":
+        case "VPXORQ":
             return {
                 "html": "<p>Performs a bitwise logical exclusive-OR (XOR) operation on the source operand (second operand) and the destination operand (first operand) and stores the result in the destination operand. Each bit of the result is 1 if the corresponding bits of the two operands are different; each bit is 0 if the corresponding bits of the operands are the same.</p><p>In 64-bit mode and not encoded with VEX/EVEX, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p><p>Legacy SSE instructions 64-bit operand: The source operand can be an MMX technology register or a 64-bit memory location. The destination operand is an MMX technology register.</p><p>128-bit Legacy SSE version: The second source operand is an XMM register or a 128-bit memory location. The first source operand and destination operands are XMM registers. Bits (MAXVL-1:128) of the corresponding YMM destination register remain unchanged.</p><p>VEX.128 encoded version: The second source operand is an XMM register or a 128-bit memory location. The first source operand and destination operands are XMM registers. Bits (MAXVL-1:128) of the destination YMM register are zeroed.</p>",
                 "tooltip": "Performs a bitwise logical exclusive-OR (XOR) operation on the source operand (second operand) and the destination operand (first operand) and stores the result in the destination operand. Each bit of the result is 1 if the corresponding bits of the two operands are different; each bit is 0 if the corresponding bits of the operands are the same.",
                 "url": "http://www.felixcloutier.com/x86/PXOR.html"
             };
 
-        case "ROL":
         case "RCL":
         case "RCR":
+        case "ROL":
         case "ROR":
             return {
                 "html": "<p>Shifts (rotates) the bits of the first operand (destination operand) the number of bit positions specified in the second operand (count operand) and stores the result in the destination operand. The destination operand can be a register or a memory location; the count operand is an unsigned integer that can be an immediate or a value in the CL register. The count is masked to 5 bits (or 6 bits if in 64-bit mode and REX.W = 1).</p><p>The rotate left (ROL) and rotate through carry left (RCL) instructions shift all the bits toward more-significant bit positions, except for the most-significant bit, which is rotated to the least-significant bit location. The rotate right (ROR) and rotate through carry right (RCR) instructions shift all the bits toward less significant bit positions, except for the least-significant bit, which is rotated to the most-significant bit location.</p><p>The RCL and RCR instructions include the CF flag in the rotation. The RCL instruction shifts the CF flag into the least-significant bit and shifts the most-significant bit into the CF flag. The RCR instruction shifts the CF flag into the most-significant bit and shifts the least-significant bit into the CF flag. For the ROL and ROR instructions, the original value of the CF flag is not a part of the result, but the CF flag receives a copy of the bit that was shifted from one end to the other.</p><p>The OF flag is defined only for the 1-bit rotates; it is undefined in all other cases (except RCL and RCR instructions only: a zero-bit rotate does nothing, that is affects no flags). For left rotates, the OF flag is set to the exclusive OR of the CF bit (after the rotate) and the most-significant bit of the result. For right rotates, the OF flag is set to the exclusive OR of the two most-significant bits of the result.</p><p>In 64-bit mode, using a REX prefix in the form of REX.R permits access to additional registers (R8-R15). Use of REX.W promotes the first operand to 64 bits and causes the count operand to become a 6-bit counter.</p>",
@@ -3010,9 +3012,9 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/RDTSCP.html"
             };
 
-        case "REPNE":
-        case "REPE":
         case "REP":
+        case "REPE":
+        case "REPNE":
             return {
                 "html": "<p>Repeats a string instruction the number of times specified in the count register or until the indicated condition of the ZF flag is no longer met. The REP (repeat), REPE (repeat while equal), REPNE (repeat while not equal), REPZ (repeat while zero), and REPNZ (repeat while not zero) mnemonics are prefixes that can be added to one of the string instructions. The REP prefix can be added to the INS, OUTS, MOVS, LODS, and STOS instructions, and the REPE, REPNE, REPZ, and REPNZ prefixes can be added to the CMPS and SCAS instructions. (The REPZ and REPNZ prefixes are synonymous forms of the REPE and REPNE prefixes, respectively.) The F3H prefix is defined for the following instructions and undefined for the rest:</p><p>The REP prefixes apply only to one string instruction at a time. To repeat a block of instructions, use the LOOP instruction or another looping construct. All of these repeat prefixes cause the associated instruction to be repeated until the count in register is decremented to 0. See <a href=\"http://www.felixcloutier.com/x86/REP:REPE:REPZ:REPNE:REPNZ.html#tbl-4-16\" rel=\"noreferrer noopener\" target=\"_blank\">Table 4-16</a>.</p><p>The REPE, REPNE, REPZ, and REPNZ prefixes also check the state of the ZF flag after each iteration and terminate the repeat loop if the ZF flag is not in the specified state. When both termination conditions are tested, the cause of a repeat termination can be determined either by testing the count register with a JECXZ instruction or by testing the ZF flag (with a JZ, JNZ, or JNE instruction).</p><p>When the REPE/REPZ and REPNE/REPNZ prefixes are used, the ZF flag does not require initialization because both the CMPS and SCAS instructions affect the ZF flag according to the results of the comparisons they make.</p><p>A repeating string operation can be suspended by an exception or interrupt. When this happens, the state of the registers is preserved to allow the string operation to be resumed upon a return from the exception or interrupt handler. The source and destination registers point to the next string elements to be operated on, the EIP register points to the string instruction, and the ECX register has the value it held following the last successful iteration of the instruction. This mechanism allows long string operations to proceed without affecting the interrupt response time of the system.</p>",
                 "tooltip": "Repeats a string instruction the number of times specified in the count register or until the indicated condition of the ZF flag is no longer met. The REP (repeat), REPE (repeat while equal), REPNE (repeat while not equal), REPZ (repeat while zero), and REPNZ (repeat while not zero) mnemonics are prefixes that can be added to one of the string instructions. The REP prefix can be added to the INS, OUTS, MOVS, LODS, and STOS instructions, and the REPE, REPNE, REPZ, and REPNZ prefixes can be added to the CMPS and SCAS instructions. (The REPZ and REPNZ prefixes are synonymous forms of the REPE and REPNE prefixes, respectively.) The F3H prefix is defined for the following instructions and undefined for the rest",
@@ -3033,8 +3035,8 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/RORX.html"
             };
 
-        case "VROUNDPD":
         case "ROUNDPD":
+        case "VROUNDPD":
             return {
                 "html": "<p>Round the 2 double-precision floating-point values in the source operand (second operand) using the rounding mode specified in the immediate operand (third operand) and place the results in the destination operand (first operand). The rounding process rounds each input floating-point value to an integer value and returns the integer result as a double-precision floating-point value.</p><p>The immediate operand specifies control fields for the rounding operation, three bit fields are defined and shown in <a href=\"http://www.felixcloutier.com/x86/ROUNDPD.html#fig-4-24\" rel=\"noreferrer noopener\" target=\"_blank\">Figure 4-24</a>. Bit 3 of the immediate byte controls processor behavior for a precision exception, bit 2 selects the source of rounding mode control. Bits 1:0 specify a non-sticky rounding-mode value (<span class=\"not-imported\">Table 4-17</span> lists the encoded values for rounding-mode field).</p><p>The Precision Floating-Point Exception is signaled according to the immediate operand. If any source operand is an SNaN then it will be converted to a QNaN. If DAZ is set to \u20181 then denormals will be converted to zero before rounding.</p><p>128-bit Legacy SSE version: The second source can be an XMM register or 128-bit memory location. The destination is not distinct from the first source XMM register and the upper bits (MAXVL-1:128) of the corresponding YMM register destination are unmodified.</p><p>VEX.128 encoded version: the source operand second source operand or a 128-bit memory location. The destination operand is an XMM register. The upper bits (MAXVL-1:128) of the corresponding YMM register destination are zeroed.</p>",
                 "tooltip": "Round the 2 double-precision floating-point values in the source operand (second operand) using the rounding mode specified in the immediate operand (third operand) and place the results in the destination operand (first operand). The rounding process rounds each input floating-point value to an integer value and returns the integer result as a double-precision floating-point value.",
@@ -3098,6 +3100,7 @@ export function getAsmOpcode(opcode) {
         case "SAL":
         case "SAR":
         case "SHL":
+        case "SHR":
             return {
                 "html": "<p>Shifts the bits in the first operand (destination operand) to the left or right by the number of bits specified in the second operand (count operand). Bits shifted beyond the destination operand boundary are first shifted into the CF flag, then discarded. At the end of the shift operation, the CF flag contains the last bit shifted out of the destination operand.</p><p>The destination operand can be a register or a memory location. The count operand can be an immediate value or the CL register. The count is masked to 5 bits (or 6 bits if in 64-bit mode and REX.W is used). The count range is limited to 0 to 31 (or 63 if 64-bit mode and REX.W is used). A special opcode encoding is provided for a count of 1.</p><p>The shift arithmetic left (SAL) and shift logical left (SHL) instructions perform the same operation; they shift the bits in the destination operand to the left (toward more significant bit locations). For each shift count, the most significant bit of the destination operand is shifted into the CF flag, and the least significant bit is cleared (see <span class=\"not-imported\">Figure 7-7</span> in the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>).</p><p>The shift arithmetic right (SAR) and shift logical right (SHR) instructions shift the bits of the destination operand to the right (toward less significant bit locations). For each shift count, the least significant bit of the destination operand is shifted into the CF flag, and the most significant bit is either set or cleared depending on the instruction type. The SHR instruction clears the most significant bit (see <span class=\"not-imported\">Figure 7-8</span> in the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>); the SAR instruction sets or clears the most significant bit to correspond to the sign (most significant bit) of the original value in the destination operand. In effect, the SAR instruction fills the empty bit position\u2019s shifted value with the sign of the unshifted value (see <span class=\"not-imported\">Figure 7-9</span> in the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>).</p><p>The SAR and SHR instructions can be used to perform signed or unsigned division, respectively, of the destination operand by powers of 2. For example, using the SAR instruction to shift a signed integer 1 bit to the right divides the value by 2.</p>",
                 "tooltip": "Shifts the bits in the first operand (destination operand) to the left or right by the number of bits specified in the second operand (count operand). Bits shifted beyond the destination operand boundary are first shifted into the CF flag, then discarded. At the end of the shift operation, the CF flag contains the last bit shifted out of the destination operand.",
@@ -3105,8 +3108,8 @@ export function getAsmOpcode(opcode) {
             };
 
         case "SARX":
-        case "SHRX":
         case "SHLX":
+        case "SHRX":
             return {
                 "html": "<p>Shifts the bits of the first source operand (the second operand) to the left or right by a COUNT value specified in the second source operand (the third operand). The result is written to the destination operand (the first operand).</p><p>The shift arithmetic right (SARX) and shift logical right (SHRX) instructions shift the bits of the destination operand to the right (toward less significant bit locations), SARX keeps and propagates the most significant bit (sign bit) while shifting.</p><p>The logical shift left (SHLX) shifts the bits of the destination operand to the left (toward more significant bit locations).</p><p>This instruction is not supported in real mode and virtual-8086 mode. The operand size is always 32 bits if not in 64-bit mode. In 64-bit mode operand size 64 requires VEX.W1. VEX.W1 is ignored in non-64-bit modes. An attempt to execute this instruction with VEX.L not equal to 0 will cause #UD.</p><p>If the value specified in the first source operand exceeds OperandSize -1, the COUNT value is masked.</p>",
                 "tooltip": "Shifts the bits of the first source operand (the second operand) to the left or right by a COUNT value specified in the second source operand (the third operand). The result is written to the destination operand (the first operand).",
@@ -3120,46 +3123,46 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/SBB.html"
             };
 
-        case "SCASB":
         case "SCAS":
+        case "SCASB":
+        case "SCASD":
         case "SCASQ":
         case "SCASW":
-        case "SCASD":
             return {
                 "html": "<p>In non-64-bit modes and in default 64-bit mode: this instruction compares a byte, word, doubleword or quadword specified using a memory operand with the value in AL, AX, or EAX. It then sets status flags in EFLAGS recording the results. The memory operand address is read from ES:(E)DI register (depending on the address-size attribute of the instruction and the current operational mode). Note that ES cannot be overridden with a segment override prefix.</p><p>At the assembly-code level, two forms of this instruction are allowed. The explicit-operand form and the no-operands form. The explicit-operand form (specified using the SCAS mnemonic) allows a memory operand to be specified explicitly. The memory operand must be a symbol that indicates the size and location of the operand value. The register operand is then automatically selected to match the size of the memory operand (AL register for byte comparisons, AX for word comparisons, EAX for doubleword comparisons). The explicit-operand form is provided to allow documentation. Note that the documentation provided by this form can be misleading. That is, the memory operand symbol must specify the correct type (size) of the operand (byte, word, or doubleword) but it does not have to specify the correct location. The location is always specified by ES:(E)DI.</p><p>The no-operands form of the instruction uses a short form of SCAS. Again, ES:(E)DI is assumed to be the memory operand and AL, AX, or EAX is assumed to be the register operand. The size of operands is selected by the mnemonic: SCASB (byte comparison), SCASW (word comparison), or SCASD (doubleword comparison).</p><p>After the comparison, the (E)DI register is incremented or decremented automatically according to the setting of the DF flag in the EFLAGS register. If the DF flag is 0, the (E)DI register is incremented; if the DF flag is 1, the (E)DI register is decremented. The register is incremented or decremented by 1 for byte operations, by 2 for word operations, and by 4 for doubleword operations.</p><p>SCAS, SCASB, SCASW, SCASD, and SCASQ can be preceded by the REP prefix for block comparisons of ECX bytes, words, doublewords, or quadwords. Often, however, these instructions will be used in a LOOP construct that takes</p>",
                 "tooltip": "In non-64-bit modes and in default 64-bit mode: this instruction compares a byte, word, doubleword or quadword specified using a memory operand with the value in AL, AX, or EAX. It then sets status flags in EFLAGS recording the results. The memory operand address is read from ES:(E)DI register (depending on the address-size attribute of the instruction and the current operational mode). Note that ES cannot be overridden with a segment override prefix.",
                 "url": "http://www.felixcloutier.com/x86/SCAS%3ASCASB%3ASCASW%3ASCASD.html"
             };
 
-        case "SETO":
+        case "SETA":
+        case "SETAE":
+        case "SETB":
+        case "SETBE":
         case "SETC":
+        case "SETE":
         case "SETG":
-        case "SETNG":
-        case "SETNC":
         case "SETGE":
-        case "SETNB":
-        case "SETNA":
-        case "SETNBE":
-        case "SETNL":
-        case "SETNP":
-        case "SETP":
         case "SETL":
-        case "SETNO":
-        case "SETNS":
-        case "SETB":
-        case "SETS":
         case "SETLE":
+        case "SETNA":
         case "SETNAE":
-        case "SETNGE":
+        case "SETNB":
+        case "SETNBE":
+        case "SETNC":
         case "SETNE":
+        case "SETNG":
+        case "SETNGE":
+        case "SETNL":
         case "SETNLE":
-        case "SETPO":
-        case "SETPE":
-        case "SETBE":
-        case "SETAE":
-        case "SETA":
+        case "SETNO":
+        case "SETNP":
+        case "SETNS":
         case "SETNZ":
-        case "SETE":
+        case "SETO":
+        case "SETP":
+        case "SETPE":
+        case "SETPO":
+        case "SETS":
         case "SETZ":
             return {
                 "html": "<p>Sets the destination operand to 0 or 1 depending on the settings of the status flags (CF, SF, OF, ZF, and PF) in the EFLAGS register. The destination operand points to a byte register or a byte in memory. The condition code suffix (<em>cc</em>) indicates the condition being tested for.</p><p>The terms \u201cabove\u201d and \u201cbelow\u201d are associated with the CF flag and refer to the relationship between two unsigned integer values. The terms \u201cgreater\u201d and \u201cless\u201d are associated with the SF and OF flags and refer to the relationship between two signed integer values.</p><p>Many of the SET<em>cc</em> instruction opcodes have alternate mnemonics. For example, SETG (set byte if greater) and SETNLE (set if not less or equal) have the same opcode and test for the same condition: ZF equals 0 and SF equals OF. These alternate mnemonics are provided to make code more intelligible. Appendix B, \u201cEFLAGS Condition Codes,\u201d in the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>, shows the alternate mnemonics for various test conditions.</p><p>Some languages represent a logical one as an integer with all bits set. This representation can be obtained by choosing the logically opposite condition for the SET<em>cc</em> instruction, then decrementing the result. For example, to test for overflow, use the SETNO instruction, then decrement the result.</p><p>The reg field of the ModR/M byte is not used for the SETCC instruction and those opcode bits are ignored by the processor.</p>",
@@ -3237,8 +3240,8 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/SHRD.html"
             };
 
-        case "VSHUFPD":
         case "SHUFPD":
+        case "VSHUFPD":
             return {
                 "html": "<p>Selects a double-precision floating-point value of an input pair using a bit control and move to a designated element of the destination operand. The low-to-high order of double-precision element of the destination operand is interleaved between the first source operand and the second source operand at the granularity of input pair of 128 bits. Each bit in the imm8 byte, starting from bit 0, is the select control of the corresponding element of the destination to received the shuffled result of an input pair.</p><p>EVEX encoded versions: The first source operand is a ZMM/YMM/XMM register. The second source operand can be a ZMM/YMM/XMM register, a 512/256/128-bit memory location or a 512/256/128-bit vector broadcasted from a 64-bit memory location The destination operand is a ZMM/YMM/XMM register updated according to the writemask. The select controls are the lower 8/4/2 bits of the imm8 byte.</p><p>VEX.256 encoded version: The first source operand is a YMM register. The second source operand can be a YMM register or a 256-bit memory location. The destination operand is a YMM register. The select controls are the bit 3:0 of the imm8 byte, imm8[7:4) are ignored.</p><p>VEX.128 encoded version: The first source operand is a XMM register. The second source operand can be a XMM register or a 128-bit memory location. The destination operand is a XMM register. The upper bits (MAXVL-1:128) of the corresponding ZMM register destination are zeroed. The select controls are the bit 1:0 of the imm8 byte, imm8[7:2) are ignored.</p><p>128-bit Legacy SSE version: The second source can be an XMM register or an 128-bit memory location. The destination operand and the first source operand is the same and is an XMM register. The upper bits (MAXVL-1:128) of the corresponding ZMM register destination are unmodified. The select controls are the bit 1:0 of the imm8 byte, imm8[7:2) are ignored.</p>",
                 "tooltip": "Selects a double-precision floating-point value of an input pair using a bit control and move to a designated element of the destination operand. The low-to-high order of double-precision element of the destination operand is interleaved between the first source operand and the second source operand at the granularity of input pair of 128 bits. Each bit in the imm8 byte, starting from bit 0, is the select control of the corresponding element of the destination to received the shuffled result of an input pair.",
@@ -3274,8 +3277,8 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/SMSW.html"
             };
 
-        case "VSQRTPD":
         case "SQRTPD":
+        case "VSQRTPD":
             return {
                 "html": "<p>Performs a SIMD computation of the square roots of the two, four or eight packed double-precision floating-point values in the source operand (the second operand) stores the packed double-precision floating-point results in the destination operand (the first operand).</p><p>EVEX encoded versions: The source operand is a ZMM/YMM/XMM register, a 512/256/128-bit memory location, or a 512/256/128-bit vector broadcasted from a 64-bit memory location. The destination operand is a ZMM/YMM/XMM register updated according to the writemask.</p><p>VEX.256 encoded version: The source operand is a YMM register or a 256-bit memory location. The destination operand is a YMM register. The upper bits (MAXVL-1:256) of the corresponding ZMM register destination are zeroed.</p><p>VEX.128 encoded version: the source operand second source operand or a 128-bit memory location. The destination operand is an XMM register. The upper bits (MAXVL-1:128) of the corresponding ZMM register destination are zeroed.</p><p>128-bit Legacy SSE version: The second source can be an XMM register or 128-bit memory location. The destination is not distinct from the first source XMM register and the upper bits (MAXVL-1:128) of the corresponding ZMM register destination are unmodified.</p>",
                 "tooltip": "Performs a SIMD computation of the square roots of the two, four or eight packed double-precision floating-point values in the source operand (the second operand) stores the packed double-precision floating-point results in the destination operand (the first operand).",
@@ -3298,8 +3301,8 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/SQRTSD.html"
             };
 
-        case "VSQRTSS":
         case "SQRTSS":
+        case "VSQRTSS":
             return {
                 "html": "<p>Computes the square root of the low single-precision floating-point value in the second source operand and stores the single-precision floating-point result in the destination operand. The second source operand can be an XMM register or a 32-bit memory location. The first source and destination operands is an XMM register.</p><p>128-bit Legacy SSE version: The first source operand and the destination operand are the same. Bits (MAXVL-1:32) of the corresponding YMM destination register remain unchanged.</p><p>VEX.128 and EVEX encoded versions: Bits 127:32 of the destination operand are copied from the corresponding bits of the first source operand. Bits (MAXVL-1:128) of the destination ZMM register are zeroed.</p><p>EVEX encoded version: The low doubleword element of the destination operand is updated according to the writemask.</p><p>Software should ensure VSQRTSS is encoded with VEX.L=0. Encoding VSQRTSS with VEX.L=1 may encounter unpredictable behavior across different processor generations.</p>",
                 "tooltip": "Computes the square root of the low single-precision floating-point value in the second source operand and stores the single-precision floating-point result in the destination operand. The second source operand can be an XMM register or a 32-bit memory location. The first source and destination operands is an XMM register.",
@@ -3342,11 +3345,11 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/STMXCSR.html"
             };
 
+        case "STOS":
         case "STOSB":
+        case "STOSD":
         case "STOSQ":
         case "STOSW":
-        case "STOSD":
-        case "STOS":
             return {
                 "html": "<p>In non-64-bit and default 64-bit mode; stores a byte, word, or doubleword from the AL, AX, or EAX register (respectively) into the destination operand. The destination operand is a memory location, the address of which is read from either the ES:EDI or ES:DI register (depending on the address-size attribute of the instruction and the mode of operation). The ES segment cannot be overridden with a segment override prefix.</p><p>At the assembly-code level, two forms of the instruction are allowed: the \u201cexplicit-operands\u201d form and the \u201cno-operands\u201d form. The explicit-operands form (specified with the STOS mnemonic) allows the destination operand to be specified explicitly. Here, the destination operand should be a symbol that indicates the size and location of the destination value. The source operand is then automatically selected to match the size of the destination operand (the AL register for byte operands, AX for word operands, EAX for doubleword operands). The explicit-operands form is provided to allow documentation; however, note that the documentation provided by this form can be misleading. That is, the destination operand symbol must specify the correct <strong>type</strong> (size) of the operand (byte, word, or doubleword), but it does not have to specify the correct <strong>location</strong>. The location is always specified by the ES:(E)DI register. These must be loaded correctly before the store string instruction is executed.</p><p>The no-operands form provides \u201cshort forms\u201d of the byte, word, doubleword, and quadword versions of the STOS instructions. Here also ES:(E)DI is assumed to be the destination operand and AL, AX, or EAX is assumed to be the source operand. The size of the destination and source operands is selected by the mnemonic: STOSB (byte read from register AL), STOSW (word from AX), STOSD (doubleword from EAX).</p><p>After the byte, word, or doubleword is transferred from the register to the memory location, the (E)DI register is incremented or decremented according to the setting of the DF flag in the EFLAGS register. If the DF flag is 0, the register is incremented; if the DF flag is 1, the register is decremented (the register is incremented or decremented by 1 for byte operations, by 2 for word operations, by 4 for doubleword operations).</p><p>NOTE: To improve performance, more recent processors support modifications to the processor\u2019s operation during the string store operations initiated with STOS and STOSB. See Section 7.3.9.3 in the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em> for additional information on fast-string operation.</p>",
                 "tooltip": "In non-64-bit and default 64-bit mode; stores a byte, word, or doubleword from the AL, AX, or EAX register (respectively) into the destination operand. The destination operand is a memory location, the address of which is read from either the ES:EDI or ES:DI register (depending on the address-size attribute of the instruction and the mode of operation). The ES segment cannot be overridden with a segment override prefix.",
@@ -3375,8 +3378,8 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/SUBPD.html"
             };
 
-        case "VSUBPS":
         case "SUBPS":
+        case "VSUBPS":
             return {
                 "html": "<p>Performs a SIMD subtract of the packed single-precision floating-point values in the second Source operand from the First Source operand, and stores the packed single-precision floating-point results in the destination operand.</p><p>VEX.128 and EVEX.128 encoded versions: The second source operand is an XMM register or an 128-bit memory location. The first source operand and destination operands are XMM registers. Bits (MAXVL-1:128) of the corresponding destination register are zeroed.</p><p>VEX.256 and EVEX.256 encoded versions: The second source operand is an YMM register or an 256-bit memory location. The first source operand and destination operands are YMM registers. Bits (MAXVL-1:256) of the corresponding destination register are zeroed.</p><p>EVEX.512 encoded version: The second source operand is a ZMM register, a 512-bit memory location or a 512-bit vector broadcasted from a 32-bit memory location. The first source operand and destination operands are ZMM registers. The destination operand is conditionally updated according to the writemask.</p><p>128-bit Legacy SSE version: The second source can be an XMM register or an 128-bit memory location. The destination is not distinct from the first source XMM register and the upper Bits (MAXVL-1:128) of the corresponding register destination are unmodified.</p>",
                 "tooltip": "Performs a SIMD subtract of the packed single-precision floating-point values in the second Source operand from the First Source operand, and stores the packed single-precision floating-point results in the destination operand.",
@@ -3463,17 +3466,17 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/UCOMISD.html"
             };
 
-        case "VUCOMISS":
         case "UCOMISS":
+        case "VUCOMISS":
             return {
                 "html": "<p>Compares the single-precision floating-point values in the low doublewords of operand 1 (first operand) and operand 2 (second operand), and sets the ZF, PF, and CF flags in the EFLAGS register according to the result (unordered, greater than, less than, or equal). The OF, SF and AF flags in the EFLAGS register are set to 0. The unordered result is returned if either source operand is a NaN (QNaN or SNaN).</p><p>Operand 1 is an XMM register; operand 2 can be an XMM register or a 32 bit memory location.</p><p>The UCOMISS instruction differs from the COMISS instruction in that it signals a SIMD floating-point invalid operation exception (#I) only if a source operand is an SNaN. The COMISS instruction signals an invalid numeric exception when a source operand is either a QNaN or SNaN.</p><p>The EFLAGS register is not updated if an unmasked SIMD floating-point exception is generated.</p><p>Note: VEX.vvvv and EVEX.vvvv are reserved and must be 1111b, otherwise instructions will #UD.</p>",
                 "tooltip": "Compares the single-precision floating-point values in the low doublewords of operand 1 (first operand) and operand 2 (second operand), and sets the ZF, PF, and CF flags in the EFLAGS register according to the result (unordered, greater than, less than, or equal). The OF, SF and AF flags in the EFLAGS register are set to 0. The unordered result is returned if either source operand is a NaN (QNaN or SNaN).",
                 "url": "http://www.felixcloutier.com/x86/UCOMISS.html"
             };
 
-        case "UD2":
         case "UD01":
         case "UD1":
+        case "UD2":
             return {
                 "html": "<p>Generates an invalid opcode exception. This instruction is provided for software testing to explicitly generate an invalid opcode exception. The opcodes for this instruction are reserved for this purpose.</p><p>Other than raising the invalid opcode exception, this instruction has no effect on processor state or memory.</p><p>Even though it is the execution of the UD instruction that causes the invalid opcode exception, the instruction pointer saved by delivery of the exception references the UD instruction (and not the following instruction).</p><p>This instruction\u2019s operation is the same in non-64-bit modes and 64-bit mode.</p>",
                 "tooltip": "Generates an invalid opcode exception. This instruction is provided for software testing to explicitly generate an invalid opcode exception. The opcodes for this instruction are reserved for this purpose.",
@@ -3494,24 +3497,24 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/UMWAIT.html"
             };
 
-        case "VUNPCKHPD":
         case "UNPCKHPD":
+        case "VUNPCKHPD":
             return {
                 "html": "<p>Performs an interleaved unpack of the high double-precision floating-point values from the first source operand and the second source operand. See <a href=\"http://www.felixcloutier.com/x86/PSHUFB.html#fig-4-15\" rel=\"noreferrer noopener\" target=\"_blank\">Figure 4-15</a> in the Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 2B.</p><p>128-bit Legacy SSE version: The second source can be an XMM register or an 128-bit memory location. The destination is not distinct from the first source XMM register and the upper bits (MAXVL-1:128) of the corresponding ZMM register destination are unmodified. When unpacking from a memory operand, an implementation may fetch only the appropriate 64 bits; however, alignment to 16-byte boundary and normal segment checking will still be enforced.</p><p>VEX.128 encoded version: The first source operand is a XMM register. The second source operand can be a XMM register or a 128-bit memory location. The destination operand is a XMM register. The upper bits (MAXVL-1:128) of the corresponding ZMM register destination are zeroed.</p><p>VEX.256 encoded version: The first source operand is a YMM register. The second source operand can be a YMM register or a 256-bit memory location. The destination operand is a YMM register.</p><p>EVEX.512 encoded version: The first source operand is a ZMM register. The second source operand is a ZMM register, a 512-bit memory location, or a 512-bit vector broadcasted from a 64-bit memory location. The destination operand is a ZMM register, conditionally updated using writemask k1.</p>",
                 "tooltip": "Performs an interleaved unpack of the high double-precision floating-point values from the first source operand and the second source operand. See Figure 4-15 in the Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 2B.",
                 "url": "http://www.felixcloutier.com/x86/UNPCKHPD.html"
             };
 
-        case "VUNPCKHPS":
         case "UNPCKHPS":
+        case "VUNPCKHPS":
             return {
                 "html": "<p>Performs an interleaved unpack of the high single-precision floating-point values from the first source operand and the second source operand.</p><p>128-bit Legacy SSE version: The second source can be an XMM register or an 128-bit memory location. The destination is not distinct from the first source XMM register and the upper bits (MAXVL-1:128) of the corresponding ZMM register destination are unmodified. When unpacking from a memory operand, an implementation may fetch only the appropriate 64 bits; however, alignment to 16-byte boundary and normal segment checking will still be enforced.</p><p>VEX.128 encoded version: The first source operand is a XMM register. The second source operand can be a XMM register or a 128-bit memory location. The destination operand is a XMM register. The upper bits (MAXVL-1:128) of the corresponding ZMM register destination are zeroed.</p><p>VEX.256 encoded version: The second source operand is an YMM register or an 256-bit memory location. The first source operand and destination operands are YMM registers.</p><p>EVEX.512 encoded version: The first source operand is a ZMM register. The second source operand is a ZMM register, a 512-bit memory location, or a 512-bit vector broadcasted from a 32-bit memory location. The destination operand is a ZMM register, conditionally updated using writemask k1.</p>",
                 "tooltip": "Performs an interleaved unpack of the high single-precision floating-point values from the first source operand and the second source operand.",
                 "url": "http://www.felixcloutier.com/x86/UNPCKHPS.html"
             };
 
-        case "VUNPCKLPD":
         case "UNPCKLPD":
+        case "VUNPCKLPD":
             return {
                 "html": "<p>Performs an interleaved unpack of the low double-precision floating-point values from the first source operand and the second source operand.</p><p>128-bit Legacy SSE version: The second source can be an XMM register or an 128-bit memory location. The destination is not distinct from the first source XMM register and the upper bits (MAXVL-1:128) of the corresponding ZMM register destination are unmodified. When unpacking from a memory operand, an implementation may fetch only the appropriate 64 bits; however, alignment to 16-byte boundary and normal segment checking will still be enforced.</p><p>VEX.128 encoded version: The first source operand is a XMM register. The second source operand can be a XMM register or a 128-bit memory location. The destination operand is a XMM register. The upper bits (MAXVL-1:128) of the corresponding ZMM register destination are zeroed.</p><p>VEX.256 encoded version: The first source operand is a YMM register. The second source operand can be a YMM register or a 256-bit memory location. The destination operand is a YMM register.</p><p>EVEX.512 encoded version: The first source operand is a ZMM register. The second source operand is a ZMM register, a 512-bit memory location, or a 512-bit vector broadcasted from a 64-bit memory location. The destination operand is a ZMM register, conditionally updated using writemask k1.</p>",
                 "tooltip": "Performs an interleaved unpack of the low double-precision floating-point values from the first source operand and the second source operand.",
@@ -3526,8 +3529,8 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/UNPCKLPS.html"
             };
 
-        case "V4FNMADDPS":
         case "V4FMADDPS":
+        case "V4FNMADDPS":
             return {
                 "html": "<p>This instruction computes 4 sequential packed fused single-precision floating-point multiply-add instructions with a sequentially selected memory operand in each of the four steps.</p><p>In the above box, the notation of \u201c+3\u201d is used to denote that the instruction accesses 4 source registers based on that operand; sources are consecutive, start in a multiple of 4 boundary, and contain the encoded register operand.</p><p>This instruction supports memory fault suppression. The entire memory operand is loaded if any of the 16 lowest significant mask bits is set to 1 or if a \u201cno masking\u201d encoding is used.</p><p>The tuple type Tuple1_4X implies that four 32-bit elements (16 bytes) are referenced by the memory operation portion of this instruction.</p><p>Rounding is performed at every FMA (fused multiply and add) boundary. Exceptions are also taken sequentially. Pre- and post-computational exceptions of the first FMA take priority over the pre- and post-computational exceptions of the second FMA, etc.</p>",
                 "tooltip": "This instruction computes 4 sequential packed fused single-precision floating-point multiply-add instructions with a sequentially selected memory operand in each of the four steps.",
@@ -3550,21 +3553,21 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/VALIGND%3AVALIGNQ.html"
             };
 
-        case "VBLENDMPS":
         case "VBLENDMPD":
+        case "VBLENDMPS":
             return {
                 "html": "<p>Performs an element-by-element blending between float64/float32 elements in the first source operand (the second operand) with the elements in the second source operand (the third operand) using an opmask register as select control. The blended result is written to the destination register.</p><p>The destination and first source operands are ZMM/YMM/XMM registers. The second source operand can be a ZMM/YMM/XMM register, a 512/256/128-bit memory location or a 512/256/128-bit vector broadcasted from a 64-bit memory location.</p><p>The opmask register is not used as a writemask for this instruction. Instead, the mask is used as an element selector: every element of the destination is conditionally selected between first source or second source using the value of the related mask bit (0 for first source operand, 1 for second source operand).</p><p>If EVEX.z is set, the elements with corresponding mask bit value of 0 in the destination operand are zeroed.</p>",
                 "tooltip": "Performs an element-by-element blending between float64/float32 elements in the first source operand (the second operand) with the elements in the second source operand (the third operand) using an opmask register as select control. The blended result is written to the destination register.",
                 "url": "http://www.felixcloutier.com/x86/VBLENDMPD%3AVBLENDMPS.html"
             };
 
-        case "VBROADCASTF32X4":
         case "VBROADCASTF128":
         case "VBROADCASTF32X2":
-        case "VBROADCASTF64X4":
+        case "VBROADCASTF32X4":
         case "VBROADCASTF32X8":
-        case "VBROADCASTSD":
         case "VBROADCASTF64X2":
+        case "VBROADCASTF64X4":
+        case "VBROADCASTSD":
         case "VBROADCASTSS":
             return {
                 "html": "<p>VBROADCASTSD/VBROADCASTSS/VBROADCASTF128 load floating-point values as one tuple from the source operand (second operand) in memory and broadcast to all elements of the destination operand (first operand).</p><p>VEX256-encoded versions: The destination operand is a YMM register. The source operand is either a 32-bit, 64-bit, or 128-bit memory location. Register source encodings are reserved and will #UD. Bits (MAXVL-1:256) of the destination register are zeroed.</p><p>EVEX-encoded versions: The destination operand is a ZMM/YMM/XMM register and updated according to the writemask k1. The source operand is either a 32-bit, 64-bit memory location or the low doubleword/quadword element of an XMM register.</p><p>VBROADCASTF32X2/VBROADCASTF32X4/VBROADCASTF64X2/VBROADCASTF32X8/VBROADCASTF64X4 load floating-point values as tuples from the source operand (the second operand) in memory or register and broadcast to all elements of the destination operand (the first operand). The destination operand is a YMM/ZMM register updated according to the writemask k1. The source operand is either a register or 64-bit/128-bit/256-bit memory location.</p><p>VBROADCASTSD and VBROADCASTF128,F32x4 and F64x2 are only supported as 256-bit and 512-bit wide versions and up. VBROADCASTSS is supported in 128-bit, 256-bit and 512-bit wide versions. F32x8 and F64x4 are only supported as 512-bit wide versions.</p>",
@@ -3775,8 +3778,8 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/VDBPSADBW.html"
             };
 
-        case "VERW":
         case "VERR":
+        case "VERW":
             return {
                 "html": "<p>Verifies whether the code or data segment specified with the source operand is readable (VERR) or writable (VERW) from the current privilege level (CPL). The source operand is a 16-bit register or a memory location that contains the segment selector for the segment to be verified. If the segment is accessible and readable (VERR) or writable (VERW), the ZF flag is set; otherwise, the ZF flag is cleared. Code segments are never verified as writable. This check cannot be performed on system segments.</p><p>To set the ZF flag, the following conditions must be met:</p><p>The validation performed is the same as is performed when a segment selector is loaded into the DS, ES, FS, or GS register, and the indicated access (read or write) is performed. The segment selector's value cannot result in a protection exception, enabling the software to anticipate possible segment access problems.</p><p>This instruction\u2019s operation is the same in non-64-bit modes and 64-bit mode. The operand size is fixed at 16 bits.</p>",
                 "tooltip": "Verifies whether the code or data segment specified with the source operand is readable (VERR) or writable (VERW) from the current privilege level (CPL). The source operand is a 16-bit register or a memory location that contains the segment selector for the segment to be verified. If the segment is accessible and readable (VERR) or writable (VERW), the ZF flag is set; otherwise, the ZF flag is cleared. Code segments are never verified as writable. This check cannot be performed on system segments.",
@@ -3811,20 +3814,20 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/VEXPANDPS.html"
             };
 
-        case "VEXTRACTF64X2":
-        case "VEXTRACTF32X8":
-        case "VEXTRACTF32X4":
         case "VEXTRACTF128":
+        case "VEXTRACTF32X4":
+        case "VEXTRACTF32X8":
+        case "VEXTRACTF64X2":
             return {
                 "html": "<p>VEXTRACTF128/VEXTRACTF32x4 and VEXTRACTF64x2 extract 128-bits of single-precision floating-point values from the source operand (the second operand) and store to the low 128-bit of the destination operand (the first operand). The 128-bit data extraction occurs at an 128-bit granular offset specified by imm8[0] (256-bit) or imm8[1:0] as the multiply factor. The destination may be either a vector register or an 128-bit memory location.</p><p>VEXTRACTF32x4: The low 128-bit of the destination operand is updated at 32-bit granularity according to the writemask.</p><p>VEXTRACTF32x8 and VEXTRACTF64x4 extract 256-bits of double-precision floating-point values from the source operand (second operand) and store to the low 256-bit of the destination operand (the first operand). The 256-bit data extraction occurs at an 256-bit granular offset specified by imm8[0] (256-bit) or imm8[0] as the multiply factor The destination may be either a vector register or a 256-bit memory location.</p><p>VEXTRACTF64x4: The low 256-bit of the destination operand is updated at 64-bit granularity according to the writemask.</p><p>VEX.vvvv and EVEX.vvvv are reserved and must be 1111b otherwise instructions will #UD.</p>",
                 "tooltip": "VEXTRACTF128/VEXTRACTF32x4 and VEXTRACTF64x2 extract 128-bits of single-precision floating-point values from the source operand (the second operand) and store to the low 128-bit of the destination operand (the first operand). The 128-bit data extraction occurs at an 128-bit granular offset specified by imm8[0] (256-bit) or imm8[1:0] as the multiply factor. The destination may be either a vector register or an 128-bit memory location.",
                 "url": "http://www.felixcloutier.com/x86/VEXTRACTF128%3AVEXTRACTF32x4%3AVEXTRACTF64x2%3AVEXTRACTF32x8%3AVEXTRACTF64x4.html"
             };
 
-        case "VEXTRACTI64X2":
-        case "VEXTRACTI32X8":
-        case "VEXTRACTI32X4":
         case "VEXTRACTI128":
+        case "VEXTRACTI32X4":
+        case "VEXTRACTI32X8":
+        case "VEXTRACTI64X2":
             return {
                 "html": "<p>VEXTRACTI128/VEXTRACTI32x4 and VEXTRACTI64x2 extract 128-bits of doubleword integer values from the source operand (the second operand) and store to the low 128-bit of the destination operand (the first operand). The 128-bit data extraction occurs at an 128-bit granular offset specified by imm8[0] (256-bit) or imm8[1:0] as the multiply factor. The destination may be either a vector register or an 128-bit memory location.</p><p>VEXTRACTI32x4: The low 128-bit of the destination operand is updated at 32-bit granularity according to the writemask.</p><p>VEXTRACTI64x2: The low 128-bit of the destination operand is updated at 64-bit granularity according to the writemask.</p><p>VEXTRACTI32x8 and VEXTRACTI64x4 extract 256-bits of quadword integer values from the source operand (the second operand) and store to the low 256-bit of the destination operand (the first operand). The 256-bit data extraction occurs at an 256-bit granular offset specified by imm8[0] (256-bit) or imm8[0] as the multiply factor The destination may be either a vector register or a 256-bit memory location.</p><p>VEXTRACTI32x8: The low 256-bit of the destination operand is updated at 32-bit granularity according to the writemask.</p>",
                 "tooltip": "VEXTRACTI128/VEXTRACTI32x4 and VEXTRACTI64x2 extract 128-bits of doubleword integer values from the source operand (the second operand) and store to the low 128-bit of the destination operand (the first operand). The 128-bit data extraction occurs at an 128-bit granular offset specified by imm8[0] (256-bit) or imm8[1:0] as the multiply factor. The destination may be either a vector register or an 128-bit memory location.",
@@ -3859,17 +3862,17 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/VFIXUPIMMSS.html"
             };
 
-        case "VFMADD231PD":
-        case "VFMADD213PD":
         case "VFMADD132PD":
+        case "VFMADD213PD":
+        case "VFMADD231PD":
             return {
                 "html": "<p>Performs a set of SIMD multiply-add computation on packed double-precision floating-point values using three source operands and writes the multiply-add results in the destination operand. The destination operand is also the first source operand. The second operand must be a SIMD register. The third source operand can be a SIMD register or a memory location.</p><p>VFMADD132PD: Multiplies the two, four or eight packed double-precision floating-point values from the first source operand to the two, four or eight packed double-precision floating-point values in the third source operand, adds the infinite precision intermediate result to the two, four or eight packed double-precision floating-point values in the second source operand, performs rounding and stores the resulting two, four or eight packed double-precision floating-point values to the destination operand (first source operand).</p><p>VFMADD213PD: Multiplies the two, four or eight packed double-precision floating-point values from the second source operand to the two, four or eight packed double-precision floating-point values in the first source operand, adds the infinite precision intermediate result to the two, four or eight packed double-precision floating-point values in the third source operand, performs rounding and stores the resulting two, four or eight packed double-precision floating-point values to the destination operand (first source operand).</p><p>VFMADD231PD: Multiplies the two, four or eight packed double-precision floating-point values from the second source to the two, four or eight packed double-precision floating-point values in the third source operand, adds the infinite precision intermediate result to the two, four or eight packed double-precision floating-point values in the first source operand, performs rounding and stores the resulting two, four or eight packed double-precision floating-point values to the destination operand (first source operand).</p><p>EVEX encoded versions: The destination operand (also first source operand) is a ZMM register and encoded in reg_field. The second source operand is a ZMM register and encoded in EVEX.vvvv. The third source operand is a ZMM register, a 512-bit memory location, or a 512-bit vector broadcasted from a 64-bit memory location. The destination operand is conditionally updated with write mask k1.</p>",
                 "tooltip": "Performs a set of SIMD multiply-add computation on packed double-precision floating-point values using three source operands and writes the multiply-add results in the destination operand. The destination operand is also the first source operand. The second operand must be a SIMD register. The third source operand can be a SIMD register or a memory location.",
                 "url": "http://www.felixcloutier.com/x86/VFMADD132PD%3AVFMADD213PD%3AVFMADD231PD.html"
             };
 
-        case "VFMADD213PS":
         case "VFMADD132PS":
+        case "VFMADD213PS":
         case "VFMADD231PS":
             return {
                 "html": "<p>Performs a set of SIMD multiply-add computation on packed single-precision floating-point values using three source operands and writes the multiply-add results in the destination operand. The destination operand is also the first source operand. The second operand must be a SIMD register. The third source operand can be a SIMD register or a memory location.</p><p>VFMADD132PS: Multiplies the four, eight or sixteen packed single-precision floating-point values from the first source operand to the four, eight or sixteen packed single-precision floating-point values in the third source operand, adds the infinite precision intermediate result to the four, eight or sixteen packed single-precision floating-point values in the second source operand, performs rounding and stores the resulting four, eight or sixteen packed single-precision floating-point values to the destination operand (first source operand).</p><p>VFMADD213PS: Multiplies the four, eight or sixteen packed single-precision floating-point values from the second source operand to the four, eight or sixteen packed single-precision floating-point values in the first source operand, adds the infinite precision intermediate result to the four, eight or sixteen packed single-precision floating-point values in the third source operand, performs rounding and stores the resulting the four, eight or sixteen packed single-precision floating-point values to the destination operand (first source operand).</p><p>VFMADD231PS: Multiplies the four, eight or sixteen packed single-precision floating-point values from the second source operand to the four, eight or sixteen packed single-precision floating-point values in the third source operand, adds the infinite precision intermediate result to the four, eight or sixteen packed single-precision floating-point values in the first source operand, performs rounding and stores the resulting four, eight or sixteen packed single-precision floating-point values to the destination operand (first source operand).</p><p>EVEX encoded versions: The destination operand (also first source operand) is a ZMM register and encoded in reg_field. The second source operand is a ZMM register and encoded in EVEX.vvvv. The third source operand is a ZMM register, a 512-bit memory location, or a 512-bit vector broadcasted from a 32-bit memory location. The destination operand is conditionally updated with write mask k1.</p>",
@@ -3877,8 +3880,8 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/VFMADD132PS%3AVFMADD213PS%3AVFMADD231PS.html"
             };
 
-        case "VFMADD213SD":
         case "VFMADD132SD":
+        case "VFMADD213SD":
         case "VFMADD231SD":
             return {
                 "html": "<p>Performs a SIMD multiply-add computation on the low double-precision floating-point values using three source operands and writes the multiply-add result in the destination operand. The destination operand is also the first source operand. The first and second operand are XMM registers. The third source operand can be an XMM register or a 64-bit memory location.</p><p>VFMADD132SD: Multiplies the low double-precision floating-point value from the first source operand to the low double-precision floating-point value in the third source operand, adds the infinite precision intermediate result to the low double-precision floating-point values in the second source operand, performs rounding and stores the resulting double-precision floating-point value to the destination operand (first source operand).</p><p>VFMADD213SD: Multiplies the low double-precision floating-point value from the second source operand to the low double-precision floating-point value in the first source operand, adds the infinite precision intermediate result to the low double-precision floating-point value in the third source operand, performs rounding and stores the resulting double-precision floating-point value to the destination operand (first source operand).</p><p>VFMADD231SD: Multiplies the low double-precision floating-point value from the second source to the low double-precision floating-point value in the third source operand, adds the infinite precision intermediate result to the low double-precision floating-point value in the first source operand, performs rounding and stores the resulting double-precision floating-point value to the destination operand (first source operand).</p><p>VEX.128 and EVEX encoded version: The destination operand (also first source operand) is encoded in reg_field. The second source operand is encoded in VEX.vvvv/EVEX.vvvv. The third source operand is encoded in rm_field. Bits 127:64 of the destination are unchanged. Bits MAXVL-1:128 of the destination register are zeroed.</p>",
@@ -3886,9 +3889,9 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/VFMADD132SD%3AVFMADD213SD%3AVFMADD231SD.html"
             };
 
+        case "VFMADD132SS":
         case "VFMADD213SS":
         case "VFMADD231SS":
-        case "VFMADD132SS":
             return {
                 "html": "<p>Performs a SIMD multiply-add computation on single-precision floating-point values using three source operands and writes the multiply-add results in the destination operand. The destination operand is also the first source operand. The first and second operands are XMM registers. The third source operand can be a XMM register or a 32-bit memory location.</p><p>VFMADD132SS: Multiplies the low single-precision floating-point value from the first source operand to the low single-precision floating-point value in the third source operand, adds the infinite precision intermediate result to the low single-precision floating-point value in the second source operand, performs rounding and stores the resulting single-precision floating-point value to the destination operand (first source operand).</p><p>VFMADD213SS: Multiplies the low single-precision floating-point value from the second source operand to the low single-precision floating-point value in the first source operand, adds the infinite precision intermediate result to the low single-precision floating-point value in the third source operand, performs rounding and stores the resulting single-precision floating-point value to the destination operand (first source operand).</p><p>VFMADD231SS: Multiplies the low single-precision floating-point value from the second source operand to the low single-precision floating-point value in the third source operand, adds the infinite precision intermediate result to the low single-precision floating-point value in the first source operand, performs rounding and stores the resulting single-precision floating-point value to the destination operand (first source operand).</p><p>VEX.128 and EVEX encoded version: The destination operand (also first source operand) is encoded in reg_field. The second source operand is encoded in VEX.vvvv/EVEX.vvvv. The third source operand is encoded in rm_field. Bits 127:32 of the destination are unchanged. Bits MAXVL-1:128 of the destination register are zeroed.</p>",
                 "tooltip": "Performs a SIMD multiply-add computation on single-precision floating-point values using three source operands and writes the multiply-add results in the destination operand. The destination operand is also the first source operand. The first and second operands are XMM registers. The third source operand can be a XMM register or a 32-bit memory location.",
@@ -3896,17 +3899,17 @@ export function getAsmOpcode(opcode) {
             };
 
         case "VFMADDSUB132PD":
-        case "VFMADDSUB231PD":
         case "VFMADDSUB213PD":
+        case "VFMADDSUB231PD":
             return {
                 "html": "<p>VFMADDSUB132PD: Multiplies the two, four, or eight packed double-precision floating-point values from the first source operand to the two or four packed double-precision floating-point values in the third source operand. From the infinite precision intermediate result, adds the odd double-precision floating-point elements and subtracts the even double-precision floating-point values in the second source operand, performs rounding and stores the resulting two or four packed double-precision floating-point values to the destination operand (first source operand).</p><p>VFMADDSUB213PD: Multiplies the two, four, or eight packed double-precision floating-point values from the second source operand to the two or four packed double-precision floating-point values in the first source operand. From the infinite precision intermediate result, adds the odd double-precision floating-point elements and subtracts the even double-precision floating-point values in the third source operand, performs rounding and stores the resulting two or four packed double-precision floating-point values to the destination operand (first source operand).</p><p>VFMADDSUB231PD: Multiplies the two, four, or eight packed double-precision floating-point values from the second source operand to the two or four packed double-precision floating-point values in the third source operand. From the infinite precision intermediate result, adds the odd double-precision floating-point elements and subtracts the even double-precision floating-point values in the first source operand, performs rounding and stores the resulting two or four packed double-precision floating-point values to the destination operand (first source operand).</p><p>EVEX encoded versions: The destination operand (also first source operand) and the second source operand are ZMM/YMM/XMM register. The third source operand is a ZMM/YMM/XMM register, a 512/256/128-bit memory location or a 512/256/128-bit vector broadcasted from a 64-bit memory location. The destination operand is conditionally updated with write mask k1.</p><p>VEX.256 encoded version: The destination operand (also first source operand) is a YMM register and encoded in reg_field. The second source operand is a YMM register and encoded in VEX.vvvv. The third source operand is a YMM register or a 256-bit memory location and encoded in rm_field.</p>",
                 "tooltip": "VFMADDSUB132PD: Multiplies the two, four, or eight packed double-precision floating-point values from the first source operand to the two or four packed double-precision floating-point values in the third source operand. From the infinite precision intermediate result, adds the odd double-precision floating-point elements and subtracts the even double-precision floating-point values in the second source operand, performs rounding and stores the resulting two or four packed double-precision floating-point values to the destination operand (first source operand).",
                 "url": "http://www.felixcloutier.com/x86/VFMADDSUB132PD%3AVFMADDSUB213PD%3AVFMADDSUB231PD.html"
             };
 
-        case "VFMADDSUB231PS":
         case "VFMADDSUB132PS":
         case "VFMADDSUB213PS":
+        case "VFMADDSUB231PS":
             return {
                 "html": "<p>VFMADDSUB132PS: Multiplies the four, eight or sixteen packed single-precision floating-point values from the first source operand to the corresponding packed single-precision floating-point values in the third source operand. From the infinite precision intermediate result, adds the odd single-precision floating-point elements and subtracts the even single-precision floating-point values in the second source operand, performs rounding and stores the resulting packed single-precision floating-point values to the destination operand (first source operand).</p><p>VFMADDSUB213PS: Multiplies the four, eight or sixteen packed single-precision floating-point values from the second source operand to the corresponding packed single-precision floating-point values in the first source operand. From the infinite precision intermediate result, adds the odd single-precision floating-point elements and subtracts the even single-precision floating-point values in the third source operand, performs rounding and stores the resulting packed single-precision floating-point values to the destination operand (first source operand).</p><p>VFMADDSUB231PS: Multiplies the four, eight or sixteen packed single-precision floating-point values from the second source operand to the corresponding packed single-precision floating-point values in the third source operand. From the infinite precision intermediate result, adds the odd single-precision floating-point elements and subtracts the even single-precision floating-point values in the first source operand, performs rounding and stores the resulting packed single-precision floating-point values to the destination operand (first source operand).</p><p>EVEX encoded versions: The destination operand (also first source operand) and the second source operand are ZMM/YMM/XMM register. The third source operand is a ZMM/YMM/XMM register, a 512/256/128-bit memory location or a 512/256/128-bit vector broadcasted from a 32-bit memory location. The destination operand is conditionally updated with write mask k1.</p><p>VEX.256 encoded version: The destination operand (also first source operand) is a YMM register and encoded in reg_field. The second source operand is a YMM register and encoded in VEX.vvvv. The third source operand is a YMM register or a 256-bit memory location and encoded in rm_field.</p>",
                 "tooltip": "VFMADDSUB132PS: Multiplies the four, eight or sixteen packed single-precision floating-point values from the first source operand to the corresponding packed single-precision floating-point values in the third source operand. From the infinite precision intermediate result, adds the odd single-precision floating-point elements and subtracts the even single-precision floating-point values in the second source operand, performs rounding and stores the resulting packed single-precision floating-point values to the destination operand (first source operand).",
@@ -3922,17 +3925,17 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/VFMSUB132PD%3AVFMSUB213PD%3AVFMSUB231PD.html"
             };
 
+        case "VFMSUB132PS":
         case "VFMSUB213PS":
         case "VFMSUB231PS":
-        case "VFMSUB132PS":
             return {
                 "html": "<p>Performs a set of SIMD multiply-subtract computation on packed single-precision floating-point values using three source operands and writes the multiply-subtract results in the destination operand. The destination operand is also the first source operand. The second operand must be a SIMD register. The third source operand can be a SIMD register or a memory location.</p><p>VFMSUB132PS: Multiplies the four, eight or sixteen packed single-precision floating-point values from the first source operand to the four, eight or sixteen packed single-precision floating-point values in the third source operand. From the infinite precision intermediate result, subtracts the four, eight or sixteen packed single-precision floating-point values in the second source operand, performs rounding and stores the resulting four, eight or sixteen packed single-precision floating-point values to the destination operand (first source operand).</p><p>VFMSUB213PS: Multiplies the four, eight or sixteen packed single-precision floating-point values from the second source operand to the four, eight or sixteen packed single-precision floating-point values in the first source operand. From the infinite precision intermediate result, subtracts the four, eight or sixteen packed single-precision floating-point values in the third source operand, performs rounding and stores the resulting four, eight or sixteen packed single-precision floating-point values to the destination operand (first source operand).</p><p>VFMSUB231PS: Multiplies the four, eight or sixteen packed single-precision floating-point values from the second source to the four, eight or sixteen packed single-precision floating-point values in the third source operand. From the infinite precision intermediate result, subtracts the four, eight or sixteen packed single-precision floating-point values in the first source operand, performs rounding and stores the resulting four, eight or sixteen packed single-precision floating-point values to the destination operand (first source operand).</p><p>EVEX encoded versions: The destination operand (also first source operand) and the second source operand are ZMM/YMM/XMM register. The third source operand is a ZMM/YMM/XMM register, a 512/256/128-bit memory location or a 512/256/128-bit vector broadcasted from a 32-bit memory location. The destination operand is conditionally updated with write mask k1.</p>",
                 "tooltip": "Performs a set of SIMD multiply-subtract computation on packed single-precision floating-point values using three source operands and writes the multiply-subtract results in the destination operand. The destination operand is also the first source operand. The second operand must be a SIMD register. The third source operand can be a SIMD register or a memory location.",
                 "url": "http://www.felixcloutier.com/x86/VFMSUB132PS%3AVFMSUB213PS%3AVFMSUB231PS.html"
             };
 
-        case "VFMSUB213SD":
         case "VFMSUB132SD":
+        case "VFMSUB213SD":
         case "VFMSUB231SD":
             return {
                 "html": "<p>Performs a SIMD multiply-subtract computation on the low packed double-precision floating-point values using three source operands and writes the multiply-subtract result in the destination operand. The destination operand is also the first source operand. The second operand must be a XMM register. The third source operand can be a XMM register or a 64-bit memory location.</p><p>VFMSUB132SD: Multiplies the low packed double-precision floating-point value from the first source operand to the low packed double-precision floating-point value in the third source operand. From the infinite precision intermediate result, subtracts the low packed double-precision floating-point values in the second source operand, performs rounding and stores the resulting packed double-precision floating-point value to the destination operand (first source operand).</p><p>VFMSUB213SD: Multiplies the low packed double-precision floating-point value from the second source operand to the low packed double-precision floating-point value in the first source operand. From the infinite precision intermediate result, subtracts the low packed double-precision floating-point value in the third source operand, performs rounding and stores the resulting packed double-precision floating-point value to the destination operand (first source operand).</p><p>VFMSUB231SD: Multiplies the low packed double-precision floating-point value from the second source to the low packed double-precision floating-point value in the third source operand. From the infinite precision intermediate result, subtracts the low packed double-precision floating-point value in the first source operand, performs rounding and stores the resulting packed double-precision floating-point value to the destination operand (first source operand).</p><p>VEX.128 and EVEX encoded version: The destination operand (also first source operand) is encoded in reg_field. The second source operand is encoded in VEX.vvvv/EVEX.vvvv. The third source operand is encoded in rm_field. Bits 127:64 of the destination are unchanged. Bits MAXVL-1:128 of the destination register are zeroed.</p>",
@@ -3940,17 +3943,17 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/VFMSUB132SD%3AVFMSUB213SD%3AVFMSUB231SD.html"
             };
 
-        case "VFMSUB231SS":
-        case "VFMSUB213SS":
         case "VFMSUB132SS":
+        case "VFMSUB213SS":
+        case "VFMSUB231SS":
             return {
                 "html": "<p>Performs a SIMD multiply-subtract computation on the low packed single-precision floating-point values using three source operands and writes the multiply-subtract result in the destination operand. The destination operand is also the first source operand. The second operand must be a XMM register. The third source operand can be a XMM register or a 32-bit memory location.</p><p>VFMSUB132SS: Multiplies the low packed single-precision floating-point value from the first source operand to the low packed single-precision floating-point value in the third source operand. From the infinite precision intermediate result, subtracts the low packed single-precision floating-point values in the second source operand, performs rounding and stores the resulting packed single-precision floating-point value to the destination operand (first source operand).</p><p>VFMSUB213SS: Multiplies the low packed single-precision floating-point value from the second source operand to the low packed single-precision floating-point value in the first source operand. From the infinite precision intermediate result, subtracts the low packed single-precision floating-point value in the third source operand, performs rounding and stores the resulting packed single-precision floating-point value to the destination operand (first source operand).</p><p>VFMSUB231SS: Multiplies the low packed single-precision floating-point value from the second source to the low packed single-precision floating-point value in the third source operand. From the infinite precision intermediate result, subtracts the low packed single-precision floating-point value in the first source operand, performs rounding and stores the resulting packed single-precision floating-point value to the destination operand (first source operand).</p><p>VEX.128 and EVEX encoded version: The destination operand (also first source operand) is encoded in reg_field. The second source operand is encoded in VEX.vvvv/EVEX.vvvv. The third source operand is encoded in rm_field. Bits 127:32 of the destination are unchanged. Bits MAXVL-1:128 of the destination register are zeroed.</p>",
                 "tooltip": "Performs a SIMD multiply-subtract computation on the low packed single-precision floating-point values using three source operands and writes the multiply-subtract result in the destination operand. The destination operand is also the first source operand. The second operand must be a XMM register. The third source operand can be a XMM register or a 32-bit memory location.",
                 "url": "http://www.felixcloutier.com/x86/VFMSUB132SS%3AVFMSUB213SS%3AVFMSUB231SS.html"
             };
 
-        case "VFMSUBADD213PD":
         case "VFMSUBADD132PD":
+        case "VFMSUBADD213PD":
         case "VFMSUBADD231PD":
             return {
                 "html": "<p>VFMSUBADD132PD: Multiplies the two, four, or eight packed double-precision floating-point values from the first source operand to the two or four packed double-precision floating-point values in the third source operand. From the infinite precision intermediate result, subtracts the odd double-precision floating-point elements and adds the even double-precision floating-point values in the second source operand, performs rounding and stores the resulting two or four packed double-precision floating-point values to the destination operand (first source operand).</p><p>VFMSUBADD213PD: Multiplies the two, four, or eight packed double-precision floating-point values from the second source operand to the two or four packed double-precision floating-point values in the first source operand. From the infinite precision intermediate result, subtracts the odd double-precision floating-point elements and adds the even double-precision floating-point values in the third source operand, performs rounding and stores the resulting two or four packed double-precision floating-point values to the destination operand (first source operand).</p><p>VFMSUBADD231PD: Multiplies the two, four, or eight packed double-precision floating-point values from the second source operand to the two or four packed double-precision floating-point values in the third source operand. From the infinite precision intermediate result, subtracts the odd double-precision floating-point elements and adds the even double-precision floating-point values in the first source operand, performs rounding and stores the resulting two or four packed double-precision floating-point values to the destination operand (first source operand).</p><p>EVEX encoded versions: The destination operand (also first source operand) and the second source operand are ZMM/YMM/XMM register. The third source operand is a ZMM/YMM/XMM register, a 512/256/128-bit memory location or a 512/256/128-bit vector broadcasted from a 64-bit memory location. The destination operand is conditionally updated with write mask k1.</p><p>VEX.256 encoded version: The destination operand (also first source operand) is a YMM register and encoded in reg_field. The second source operand is a YMM register and encoded in VEX.vvvv. The third source operand is a YMM register or a 256-bit memory location and encoded in rm_field.</p>",
@@ -3958,27 +3961,27 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/VFMSUBADD132PD%3AVFMSUBADD213PD%3AVFMSUBADD231PD.html"
             };
 
-        case "VFMSUBADD231PS":
-        case "VFMSUBADD213PS":
         case "VFMSUBADD132PS":
+        case "VFMSUBADD213PS":
+        case "VFMSUBADD231PS":
             return {
                 "html": "<p>VFMSUBADD132PS: Multiplies the four, eight or sixteen packed single-precision floating-point values from the first source operand to the corresponding packed single-precision floating-point values in the third source operand. From the infinite precision intermediate result, subtracts the odd single-precision floating-point elements and adds the even single-precision floating-point values in the second source operand, performs rounding and stores the resulting packed single-precision floating-point values to the destination operand (first source operand).</p><p>VFMSUBADD213PS: Multiplies the four, eight or sixteen packed single-precision floating-point values from the second source operand to the corresponding packed single-precision floating-point values in the first source operand. From the infinite precision intermediate result, subtracts the odd single-precision floating-point elements and adds the even single-precision floating-point values in the third source operand, performs rounding and stores the resulting packed single-precision floating-point values to the destination operand (first source operand).</p><p>VFMSUBADD231PS: Multiplies the four, eight or sixteen packed single-precision floating-point values from the second source operand to the corresponding packed single-precision floating-point values in the third source operand. From the infinite precision intermediate result, subtracts the odd single-precision floating-point elements and adds the even single-precision floating-point values in the first source operand, performs rounding and stores the resulting packed single-precision floating-point values to the destination operand (first source operand).</p><p>EVEX encoded versions: The destination operand (also first source operand) and the second source operand are ZMM/YMM/XMM register. The third source operand is a ZMM/YMM/XMM register, a 512/256/128-bit memory location or a 512/256/128-bit vector broadcasted from a 32-bit memory location. The destination operand is conditionally updated with write mask k1.</p><p>VEX.256 encoded version: The destination operand (also first source operand) is a YMM register and encoded in reg_field. The second source operand is a YMM register and encoded in VEX.vvvv. The third source operand is a YMM register or a 256-bit memory location and encoded in rm_field.</p>",
                 "tooltip": "VFMSUBADD132PS: Multiplies the four, eight or sixteen packed single-precision floating-point values from the first source operand to the corresponding packed single-precision floating-point values in the third source operand. From the infinite precision intermediate result, subtracts the odd single-precision floating-point elements and adds the even single-precision floating-point values in the second source operand, performs rounding and stores the resulting packed single-precision floating-point values to the destination operand (first source operand).",
                 "url": "http://www.felixcloutier.com/x86/VFMSUBADD132PS%3AVFMSUBADD213PS%3AVFMSUBADD231PS.html"
             };
 
-        case "VFNMADD231PD":
         case "VFNMADD132PD":
         case "VFNMADD213PD":
+        case "VFNMADD231PD":
             return {
                 "html": "<p>VFNMADD132PD: Multiplies the two, four or eight packed double-precision floating-point values from the first source operand to the two, four or eight packed double-precision floating-point values in the third source operand, adds the negated infinite precision intermediate result to the two, four or eight packed double-precision floating-point values in the second source operand, performs rounding and stores the resulting two, four or eight packed double-precision floating-point values to the destination operand (first source operand).</p><p>VFNMADD213PD: Multiplies the two, four or eight packed double-precision floating-point values from the second source operand to the two, four or eight packed double-precision floating-point values in the first source operand, adds the negated infinite precision intermediate result to the two, four or eight packed double-precision floating-point values in the third source operand, performs rounding and stores the resulting two, four or eight packed double-precision floating-point values to the destination operand (first source operand).</p><p>VFNMADD231PD: Multiplies the two, four or eight packed double-precision floating-point values from the second source to the two, four or eight packed double-precision floating-point values in the third source operand, the negated infinite precision intermediate result to the two, four or eight packed double-precision floating-point values in the first source operand, performs rounding and stores the resulting two, four or eight packed double-precision floating-point values to the destination operand (first source operand).</p><p>EVEX encoded versions: The destination operand (also first source operand) and the second source operand are ZMM/YMM/XMM register. The third source operand is a ZMM/YMM/XMM register, a 512/256/128-bit memory location or a 512/256/128-bit vector broadcasted from a 64-bit memory location. The destination operand is conditionally updated with write mask k1.</p><p>VEX.256 encoded version: The destination operand (also first source operand) is a YMM register and encoded in reg_field. The second source operand is a YMM register and encoded in VEX.vvvv. The third source operand is a YMM register or a 256-bit memory location and encoded in rm_field.</p>",
                 "tooltip": "VFNMADD132PD: Multiplies the two, four or eight packed double-precision floating-point values from the first source operand to the two, four or eight packed double-precision floating-point values in the third source operand, adds the negated infinite precision intermediate result to the two, four or eight packed double-precision floating-point values in the second source operand, performs rounding and stores the resulting two, four or eight packed double-precision floating-point values to the destination operand (first source operand).",
                 "url": "http://www.felixcloutier.com/x86/VFNMADD132PD%3AVFNMADD213PD%3AVFNMADD231PD.html"
             };
 
-        case "VFNMADD231PS":
         case "VFNMADD132PS":
         case "VFNMADD213PS":
+        case "VFNMADD231PS":
             return {
                 "html": "<p>VFNMADD132PS: Multiplies the four, eight or sixteen packed single-precision floating-point values from the first source operand to the four, eight or sixteen packed single-precision floating-point values in the third source operand, adds the negated infinite precision intermediate result to the four, eight or sixteen packed single-precision floating-point values in the second source operand, performs rounding and stores the resulting four, eight or sixteen packed single-precision floating-point values to the destination operand (first source operand).</p><p>VFNMADD213PS: Multiplies the four, eight or sixteen packed single-precision floating-point values from the second source operand to the four, eight or sixteen packed single-precision floating-point values in the first source operand, adds the negated infinite precision intermediate result to the four, eight or sixteen packed single-precision floating-point values in the third source operand, performs rounding and stores the resulting the four, eight or sixteen packed single-precision floating-point values to the destination operand (first source operand).</p><p>VFNMADD231PS: Multiplies the four, eight or sixteen packed single-precision floating-point values from the second source operand to the four, eight or sixteen packed single-precision floating-point values in the third source operand, adds the negated infinite precision intermediate result to the four, eight or sixteen packed single-precision floating-point values in the first source operand, performs rounding and stores the resulting four, eight or sixteen packed single-precision floating-point values to the destination operand (first source operand).</p><p>EVEX encoded versions: The destination operand (also first source operand) and the second source operand are ZMM/YMM/XMM register. The third source operand is a ZMM/YMM/XMM register, a 512/256/128-bit memory location or a 512/256/128-bit vector broadcasted from a 32-bit memory location. The destination operand is conditionally updated with write mask k1.</p><p>VEX.256 encoded version: The destination operand (also first source operand) is a YMM register and encoded in reg_field. The second source operand is a YMM register and encoded in VEX.vvvv. The third source operand is a YMM register or a 256-bit memory location and encoded in rm_field.</p>",
                 "tooltip": "VFNMADD132PS: Multiplies the four, eight or sixteen packed single-precision floating-point values from the first source operand to the four, eight or sixteen packed single-precision floating-point values in the third source operand, adds the negated infinite precision intermediate result to the four, eight or sixteen packed single-precision floating-point values in the second source operand, performs rounding and stores the resulting four, eight or sixteen packed single-precision floating-point values to the destination operand (first source operand).",
@@ -3986,35 +3989,35 @@ export function getAsmOpcode(opcode) {
             };
 
         case "VFNMADD132SD":
-        case "VFNMADD231SD":
         case "VFNMADD213SD":
+        case "VFNMADD231SD":
             return {
                 "html": "<p>VFNMADD132SD: Multiplies the low packed double-precision floating-point value from the first source operand to the low packed double-precision floating-point value in the third source operand, adds the negated infinite precision intermediate result to the low packed double-precision floating-point values in the second source operand, performs rounding and stores the resulting packed double-precision floating-point value to the destination operand (first source operand).</p><p>VFNMADD213SD: Multiplies the low packed double-precision floating-point value from the second source operand to the low packed double-precision floating-point value in the first source operand, adds the negated infinite precision intermediate result to the low packed double-precision floating-point value in the third source operand, performs rounding and stores the resulting packed double-precision floating-point value to the destination operand (first source operand).</p><p>VFNMADD231SD: Multiplies the low packed double-precision floating-point value from the second source to the low packed double-precision floating-point value in the third source operand, adds the negated infinite precision intermediate result to the low packed double-precision floating-point value in the first source operand, performs rounding and stores the resulting packed double-precision floating-point value to the destination operand (first source operand).</p><p>VEX.128 and EVEX encoded version: The destination operand (also first source operand) is encoded in reg_field. The second source operand is encoded in VEX.vvvv/EVEX.vvvv. The third source operand is encoded in rm_field. Bits 127:64 of the destination are unchanged. Bits MAXVL-1:128 of the destination register are zeroed.</p><p>EVEX encoded version: The low quadword element of the destination is updated according to the writemask.</p>",
                 "tooltip": "VFNMADD132SD: Multiplies the low packed double-precision floating-point value from the first source operand to the low packed double-precision floating-point value in the third source operand, adds the negated infinite precision intermediate result to the low packed double-precision floating-point values in the second source operand, performs rounding and stores the resulting packed double-precision floating-point value to the destination operand (first source operand).",
                 "url": "http://www.felixcloutier.com/x86/VFNMADD132SD%3AVFNMADD213SD%3AVFNMADD231SD.html"
             };
 
-        case "VFNMADD231SS":
-        case "VFNMADD213SS":
         case "VFNMADD132SS":
+        case "VFNMADD213SS":
+        case "VFNMADD231SS":
             return {
                 "html": "<p>VFNMADD132SS: Multiplies the low packed single-precision floating-point value from the first source operand to the low packed single-precision floating-point value in the third source operand, adds the negated infinite precision intermediate result to the low packed single-precision floating-point value in the second source operand, performs rounding and stores the resulting packed single-precision floating-point value to the destination operand (first source operand).</p><p>VFNMADD213SS: Multiplies the low packed single-precision floating-point value from the second source operand to the low packed single-precision floating-point value in the first source operand, adds the negated infinite precision intermediate result to the low packed single-precision floating-point value in the third source operand, performs rounding and stores the resulting packed single-precision floating-point value to the destination operand (first source operand).</p><p>VFNMADD231SS: Multiplies the low packed single-precision floating-point value from the second source operand to the low packed single-precision floating-point value in the third source operand, adds the negated infinite precision intermediate result to the low packed single-precision floating-point value in the first source operand, performs rounding and stores the resulting packed single-precision floating-point value to the destination operand (first source operand).</p><p>VEX.128 and EVEX encoded version: The destination operand (also first source operand) is encoded in reg_field. The second source operand is encoded in VEX.vvvv/EVEX.vvvv. The third source operand is encoded in rm_field. Bits 127:32 of the destination are unchanged. Bits MAXVL-1:128 of the destination register are zeroed.</p><p>EVEX encoded version: The low doubleword element of the destination is updated according to the writemask.</p>",
                 "tooltip": "VFNMADD132SS: Multiplies the low packed single-precision floating-point value from the first source operand to the low packed single-precision floating-point value in the third source operand, adds the negated infinite precision intermediate result to the low packed single-precision floating-point value in the second source operand, performs rounding and stores the resulting packed single-precision floating-point value to the destination operand (first source operand).",
                 "url": "http://www.felixcloutier.com/x86/VFNMADD132SS%3AVFNMADD213SS%3AVFNMADD231SS.html"
             };
 
-        case "VFNMSUB231PD":
         case "VFNMSUB132PD":
         case "VFNMSUB213PD":
+        case "VFNMSUB231PD":
             return {
                 "html": "<p>VFNMSUB132PD: Multiplies the two, four or eight packed double-precision floating-point values from the first source operand to the two, four or eight packed double-precision floating-point values in the third source operand. From negated infinite precision intermediate results, subtracts the two, four or eight packed double-precision floating-point values in the second source operand, performs rounding and stores the resulting two, four or eight packed double-precision floating-point values to the destination operand (first source operand).</p><p>VFNMSUB213PD: Multiplies the two, four or eight packed double-precision floating-point values from the second source operand to the two, four or eight packed double-precision floating-point values in the first source operand. From negated infinite precision intermediate results, subtracts the two, four or eight packed double-precision floating-point values in the third source operand, performs rounding and stores the resulting two, four or eight packed double-precision floating-point values to the destination operand (first source operand).</p><p>VFNMSUB231PD: Multiplies the two, four or eight packed double-precision floating-point values from the second source to the two, four or eight packed double-precision floating-point values in the third source operand. From negated infinite precision intermediate results, subtracts the two, four or eight packed double-precision floating-point values in the first source operand, performs rounding and stores the resulting two, four or eight packed double-precision floating-point values to the destination operand (first source operand).</p><p>EVEX encoded versions: The destination operand (also first source operand) and the second source operand are ZMM/YMM/XMM register. The third source operand is a ZMM/YMM/XMM register, a 512/256/128-bit memory location or a 512/256/128-bit vector broadcasted from a 64-bit memory location. The destination operand is conditionally updated with write mask k1.</p><p>VEX.256 encoded version: The destination operand (also first source operand) is a YMM register and encoded in reg_field. The second source operand is a YMM register and encoded in VEX.vvvv. The third source operand is a YMM register or a 256-bit memory location and encoded in rm_field.</p>",
                 "tooltip": "VFNMSUB132PD: Multiplies the two, four or eight packed double-precision floating-point values from the first source operand to the two, four or eight packed double-precision floating-point values in the third source operand. From negated infinite precision intermediate results, subtracts the two, four or eight packed double-precision floating-point values in the second source operand, performs rounding and stores the resulting two, four or eight packed double-precision floating-point values to the destination operand (first source operand).",
                 "url": "http://www.felixcloutier.com/x86/VFNMSUB132PD%3AVFNMSUB213PD%3AVFNMSUB231PD.html"
             };
 
+        case "VFNMSUB132PS":
         case "VFNMSUB213PS":
         case "VFNMSUB231PS":
-        case "VFNMSUB132PS":
             return {
                 "html": "<p>VFNMSUB132PS: Multiplies the four, eight or sixteen packed single-precision floating-point values from the first source operand to the four, eight or sixteen packed single-precision floating-point values in the third source operand. From negated infinite precision intermediate results, subtracts the four, eight or sixteen packed single-precision floating-point values in the second source operand, performs rounding and stores the resulting four, eight or sixteen packed single-precision floating-point values to the destination operand (first source operand).</p><p>VFNMSUB213PS: Multiplies the four, eight or sixteen packed single-precision floating-point values from the second source operand to the four, eight or sixteen packed single-precision floating-point values in the first source operand. From negated infinite precision intermediate results, subtracts the four, eight or sixteen packed single-precision floating-point values in the third source operand, performs rounding and stores the resulting four, eight or sixteen packed single-precision floating-point values to the destination operand (first source operand).</p><p>VFNMSUB231PS: Multiplies the four, eight or sixteen packed single-precision floating-point values from the second source to the four, eight or sixteen packed single-precision floating-point values in the third source operand. From negated infinite precision intermediate results, subtracts the four, eight or sixteen packed single-precision floating-point values in the first source operand, performs rounding and stores the resulting four, eight or sixteen packed single-precision floating-point values to the destination operand (first source operand).</p><p>EVEX encoded versions: The destination operand (also first source operand) and the second source operand are ZMM/YMM/XMM register. The third source operand is a ZMM/YMM/XMM register, a 512/256/128-bit memory location or a 512/256/128-bit vector broadcasted from a 32-bit memory location. The destination operand is conditionally updated with write mask k1.</p><p>VEX.256 encoded version: The destination operand (also first source operand) is a YMM register and encoded in reg_field. The second source operand is a YMM register and encoded in VEX.vvvv. The third source operand is a YMM register or a 256-bit memory location and encoded in rm_field.</p>",
                 "tooltip": "VFNMSUB132PS: Multiplies the four, eight or sixteen packed single-precision floating-point values from the first source operand to the four, eight or sixteen packed single-precision floating-point values in the third source operand. From negated infinite precision intermediate results, subtracts the four, eight or sixteen packed single-precision floating-point values in the second source operand, performs rounding and stores the resulting four, eight or sixteen packed single-precision floating-point values to the destination operand (first source operand).",
@@ -4075,16 +4078,16 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/VGATHERDPD%3AVGATHERQPD.html"
             };
 
-        case "VGATHERQPS":
         case "VGATHERDPS":
+        case "VGATHERQPS":
             return {
                 "html": "<p>The instruction conditionally loads up to 4 or 8 single-precision floating-point values from memory addresses specified by the memory operand (the second operand) and using dword indices. The memory operand uses the VSIB form of the SIB byte to specify a general purpose register operand as the common base, a vector register for an array of indices relative to the base and a constant scale factor.</p><p>The mask operand (the third operand) specifies the conditional load operation from each memory address and the corresponding update of each data element of the destination operand (the first operand). Conditionality is specified by the most significant bit of each data element of the mask register. If an element\u2019s mask bit is not set, the corresponding element of the destination register is left unchanged. The width of data element in the destination register and mask register are identical. The entire mask register will be set to zero by this instruction unless the instruction causes an exception.</p><p>Using qword indices, the instruction conditionally loads up to 2 or 4 single-precision floating-point values from the VSIB addressing memory operand, and updates the lower half of the destination register. The upper 128 or 256 bits of the destination register are zero\u2019ed with qword indices.</p><p>This instruction can be suspended by an exception if at least one element is already gathered (i.e., if the exception is triggered by an element other than the rightmost one with its mask bit set). When this happens, the destination register and the mask operand are partially updated; those elements that have been gathered are placed into the destination register and have their mask bits set to zero. If any traps or interrupts are pending from already gathered elements, they will be delivered in lieu of the exception; in this case, EFLAG.RF is set to one so an instruction breakpoint is not re-triggered when the instruction is continued.</p><p>If the data size and index size are different, part of the destination register and part of the mask register do not correspond to any elements being gathered. This instruction sets those parts to zero. It may do this to one or both of those registers even if the instruction triggers an exception, and even if the instruction triggers the exception before gathering any elements.</p>",
                 "tooltip": "The instruction conditionally loads up to 4 or 8 single-precision floating-point values from memory addresses specified by the memory operand (the second operand) and using dword indices. The memory operand uses the VSIB form of the SIB byte to specify a general purpose register operand as the common base, a vector register for an array of indices relative to the base and a constant scale factor.",
                 "url": "http://www.felixcloutier.com/x86/VGATHERDPS%3AVGATHERQPS.html"
             };
 
-        case "VGATHERPF0DPS":
         case "VGATHERPF0DPD":
+        case "VGATHERPF0DPS":
         case "VGATHERPF0QPD":
         case "VGATHERPF0QPS":
             return {
@@ -4094,9 +4097,9 @@ export function getAsmOpcode(opcode) {
             };
 
         case "VGATHERPF1DPD":
-        case "VGATHERPF1QPS":
         case "VGATHERPF1DPS":
         case "VGATHERPF1QPD":
+        case "VGATHERPF1QPS":
             return {
                 "html": "<p>The instruction conditionally prefetches up to sixteen 32-bit or eight 64-bit integer byte data elements. The elements are specified via the VSIB (i.e., the index register is an zmm, holding packed indices). Elements will only be prefetched if their corresponding mask bit is one.</p><p>Lines prefetched are loaded into to a location in the cache hierarchy specified by a locality hint (T1):</p><p>[PS data] For dword indices, the instruction will prefetch sixteen memory locations. For qword indices, the instruction will prefetch eight values.</p><p>[PD data] For dword and qword indices, the instruction will prefetch eight memory locations.</p>",
                 "tooltip": "The instruction conditionally prefetches up to sixteen 32-bit or eight 64-bit integer byte data elements. The elements are specified via the VSIB (i.e., the index register is an zmm, holding packed indices). Elements will only be prefetched if their corresponding mask bit is one.",
@@ -4159,22 +4162,22 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/VGETMANTSS.html"
             };
 
+        case "VINSERTF128":
+        case "VINSERTF32X4":
         case "VINSERTF32X8":
         case "VINSERTF64X2":
-        case "VINSERTF128":
         case "VINSERTF64X4":
-        case "VINSERTF32X4":
             return {
                 "html": "<p>VINSERTF128/VINSERTF32x4 and VINSERTF64x2 insert 128-bits of packed floating-point values from the second source operand (the third operand) into the destination operand (the first operand) at an 128-bit granularity offset multiplied by imm8[0] (256-bit) or imm8[1:0]. The remaining portions of the destination operand are copied from the corresponding fields of the first source operand (the second operand). The second source operand can be either an XMM register or a 128-bit memory location. The destination and first source operands are vector registers.</p><p>VINSERTF32x4: The destination operand is a ZMM/YMM register and updated at 32-bit granularity according to the writemask. The high 6/7 bits of the immediate are ignored.</p><p>VINSERTF64x2: The destination operand is a ZMM/YMM register and updated at 64-bit granularity according to the writemask. The high 6/7 bits of the immediate are ignored.</p><p>VINSERTF32x8 and VINSERTF64x4 inserts 256-bits of packed floating-point values from the second source operand (the third operand) into the destination operand (the first operand) at a 256-bit granular offset multiplied by imm8[0]. The remaining portions of the destination are copied from the corresponding fields of the first source operand (the second operand). The second source operand can be either an YMM register or a 256-bit memory location. The high 7 bits of the immediate are ignored. The destination operand is a ZMM register and updated at 32/64-bit granularity according to the writemask.</p>",
                 "tooltip": "VINSERTF128/VINSERTF32x4 and VINSERTF64x2 insert 128-bits of packed floating-point values from the second source operand (the third operand) into the destination operand (the first operand) at an 128-bit granularity offset multiplied by imm8[0] (256-bit) or imm8[1:0]. The remaining portions of the destination operand are copied from the corresponding fields of the first source operand (the second operand). The second source operand can be either an XMM register or a 128-bit memory location. The destination and first source operands are vector registers.",
                 "url": "http://www.felixcloutier.com/x86/VINSERTF128%3AVINSERTF32x4%3AVINSERTF64x2%3AVINSERTF32x8%3AVINSERTF64x4.html"
             };
 
-        case "VINSERTI32X8":
         case "VINSERTI128":
+        case "VINSERTI32X4":
+        case "VINSERTI32X8":
         case "VINSERTI64X2":
         case "VINSERTI64X4":
-        case "VINSERTI32X4":
             return {
                 "html": "<p>VINSERTI32x4 and VINSERTI64x2 inserts 128-bits of packed integer values from the second source operand (the third operand) into the destination operand (the first operand) at an 128-bit granular offset multiplied by imm8[0] (256-bit) or imm8[1:0]. The remaining portions of the destination are copied from the corresponding fields of the first source operand (the second operand). The second source operand can be either an XMM register or a 128-bit memory location. The high 6/7bits of the immediate are ignored. The destination operand is a ZMM/YMM register and updated at 32 and 64-bit granularity according to the writemask.</p><p>VINSERTI32x8 and VINSERTI64x4 inserts 256-bits of packed integer values from the second source operand (the third operand) into the destination operand (the first operand) at a 256-bit granular offset multiplied by imm8[0]. The remaining portions of the destination are copied from the corresponding fields of the first source operand (the second operand). The second source operand can be either an YMM register or a 256-bit memory location. The upper bits of the immediate are ignored. The destination operand is a ZMM register and updated at 32 and 64-bit granularity according to the writemask.</p><p>VINSERTI128 inserts 128-bits of packed integer data from the second source operand (the third operand) into the destination operand (the first operand) at a 128-bit granular offset multiplied by imm8[0]. The remaining portions of the destination are copied from the corresponding fields of the first source operand (the second operand). The second source operand can be either an XMM register or a 128-bit memory location. The high 7 bits of the immediate are ignored. VEX.L must be 1, otherwise attempt to execute this instruction with VEX.L=0 will cause #UD.</p>",
                 "tooltip": "VINSERTI32x4 and VINSERTI64x2 inserts 128-bits of packed integer values from the second source operand (the third operand) into the destination operand (the first operand) at an 128-bit granular offset multiplied by imm8[0] (256-bit) or imm8[1:0]. The remaining portions of the destination are copied from the corresponding fields of the first source operand (the second operand). The second source operand can be either an XMM register or a 128-bit memory location. The high 6/7bits of the immediate are ignored. The destination operand is a ZMM/YMM register and updated at 32 and 64-bit granularity according to the writemask.",
@@ -4218,39 +4221,39 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/VPBLENDMB%3AVPBLENDMW.html"
             };
 
-        case "VPBLENDMQ":
         case "VPBLENDMD":
+        case "VPBLENDMQ":
             return {
                 "html": "<p>Performs an element-by-element blending of dword/qword elements between the first source operand (the second operand) and the elements of the second source operand (the third operand) using an opmask register as select control. The blended result is written into the destination.</p><p>The destination and first source operands are ZMM registers. The second source operand can be a ZMM register, a 512-bit memory location or a 512-bit vector broadcasted from a 32-bit memory location.</p><p>The opmask register is not used as a writemask for this instruction. Instead, the mask is used as an element selector: every element of the destination is conditionally selected between first source or second source using the value of the related mask bit (0 for the first source operand, 1 for the second source operand).</p><p>If EVEX.z is set, the elements with corresponding mask bit value of 0 in the destination operand are zeroed.</p>",
                 "tooltip": "Performs an element-by-element blending of dword/qword elements between the first source operand (the second operand) and the elements of the second source operand (the third operand) using an opmask register as select control. The blended result is written into the destination.",
                 "url": "http://www.felixcloutier.com/x86/VPBLENDMD%3AVPBLENDMQ.html"
             };
 
-        case "VPBROADCASTQ":
+        case "VBROADCASTI128":
         case "VBROADCASTI32X4":
-        case "VPBROADCASTD":
         case "VBROADCASTI32X8":
-        case "VBROADCASTI128":
-        case "VPBROADCASTW":
-        case "VPBROADCASTB":
         case "VBROADCASTI64X2":
         case "VBROADCASTI64X4":
+        case "VPBROADCASTB":
+        case "VPBROADCASTD":
+        case "VPBROADCASTQ":
+        case "VPBROADCASTW":
             return {
                 "html": "<p>Load integer data from the source operand (the second operand) and broadcast to all elements of the destination operand (the first operand).</p><p>VEX256-encoded VPBROADCASTB/W/D/Q: The source operand is 8-bit, 16-bit, 32-bit, 64-bit memory location or the low 8-bit, 16-bit 32-bit, 64-bit data in an XMM register. The destination operand is a YMM register. VPBROADCASTI128 support the source operand of 128-bit memory location. Register source encodings for VPBROADCASTI128 is reserved and will #UD. Bits (MAXVL-1:256) of the destination register are zeroed.</p><p>EVEX-encoded VPBROADCASTD/Q: The source operand is a 32-bit, 64-bit memory location or the low 32-bit, 64-bit data in an XMM register. The destination operand is a ZMM/YMM/XMM register and updated according to the writemask k1.</p><p>VPBROADCASTI32X4 and VPBROADCASTI64X4: The destination operand is a ZMM register and updated according to the writemask k1. The source operand is 128-bit or 256-bit memory location. Register source encodings for VBROADCASTI32X4 and VBROADCASTI64X4 are reserved and will #UD.</p><p>Note: VEX.vvvv and EVEX.vvvv are reserved and must be 1111b otherwise instructions will #UD.</p>",
                 "tooltip": "Load integer data from the source operand (the second operand) and broadcast to all elements of the destination operand (the first operand).",
                 "url": "http://www.felixcloutier.com/x86/VPBROADCAST.html"
             };
 
-        case "VPBROADCASTMW2D":
         case "VPBROADCASTMB2Q":
+        case "VPBROADCASTMW2D":
             return {
                 "html": "<p>Broadcasts the zero-extended 64/32 bit value of the low byte/word of the source operand (the second operand) to each 64/32 bit element of the destination operand (the first operand). The source operand is an opmask register. The destination operand is a ZMM register (EVEX.512), YMM register (EVEX.256), or XMM register (EVEX.128).</p><p>EVEX.vvvv is reserved and must be 1111b otherwise instructions will #UD.</p>",
                 "tooltip": "Broadcasts the zero-extended 64/32 bit value of the low byte/word of the source operand (the second operand) to each 64/32 bit element of the destination operand (the first operand). The source operand is an opmask register. The destination operand is a ZMM register (EVEX.512), YMM register (EVEX.256), or XMM register (EVEX.128).",
                 "url": "http://www.felixcloutier.com/x86/VPBROADCASTM.html"
             };
 
-        case "VPCMPUB":
         case "VPCMPB":
+        case "VPCMPUB":
             return {
                 "html": "<p>Performs a SIMD compare of the packed byte values in the second source operand and the first source operand and returns the results of the comparison to the mask destination operand. The comparison predicate operand (immediate byte) specifies the type of comparison performed on each pair of packed values in the two source operands. The result of each comparison is a single mask bit result of 1 (comparison true) or 0 (comparison false).</p><p>VPCMPB performs a comparison between pairs of signed byte values.</p><p>VPCMPUB performs a comparison between pairs of unsigned byte values.</p><p>The first source operand (second operand) is a ZMM/YMM/XMM register. The second source operand can be a ZMM/YMM/XMM register or a 512/256/128-bit memory location. The destination operand (first operand) is a mask register k1. Up to 64/32/16 comparisons are performed with results written to the destination operand under the writemask k2.</p><p>The comparison predicate operand is an 8-bit immediate: bits 2:0 define the type of comparison to be performed. Bits 3 through 7 of the immediate are reserved. Compiler can implement the pseudo-op mnemonic listed in Table 5-17.</p>",
                 "tooltip": "Performs a SIMD compare of the packed byte values in the second source operand and the first source operand and returns the results of the comparison to the mask destination operand. The comparison predicate operand (immediate byte) specifies the type of comparison performed on each pair of packed values in the two source operands. The result of each comparison is a single mask bit result of 1 (comparison true) or 0 (comparison false).",
@@ -4265,16 +4268,16 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/VPCMPD%3AVPCMPUD.html"
             };
 
-        case "VPCMPUQ":
         case "VPCMPQ":
+        case "VPCMPUQ":
             return {
                 "html": "<p>Performs a SIMD compare of the packed integer values in the second source operand and the first source operand and returns the results of the comparison to the mask destination operand. The comparison predicate operand (immediate byte) specifies the type of comparison performed on each pair of packed values in the two source operands. The result of each comparison is a single mask bit result of 1 (comparison true) or 0 (comparison false).</p><p>VPCMPQ/VPCMPUQ performs a comparison between pairs of signed/unsigned quadword integer values.</p><p>The first source operand (second operand) is a ZMM/YMM/XMM register. The second source operand can be a ZMM/YMM/XMM register or a 512/256/128-bit memory location or a 512-bit vector broadcasted from a 64-bit memory location. The destination operand (first operand) is a mask register k1. Up to 8/4/2 comparisons are performed with results written to the destination operand under the writemask k2.</p><p>The comparison predicate operand is an 8-bit immediate: bits 2:0 define the type of comparison to be performed. Bits 3 through 7 of the immediate are reserved. Compiler can implement the pseudo-op mnemonic listed in Table 5-17.</p>",
                 "tooltip": "Performs a SIMD compare of the packed integer values in the second source operand and the first source operand and returns the results of the comparison to the mask destination operand. The comparison predicate operand (immediate byte) specifies the type of comparison performed on each pair of packed values in the two source operands. The result of each comparison is a single mask bit result of 1 (comparison true) or 0 (comparison false).",
                 "url": "http://www.felixcloutier.com/x86/VPCMPQ%3AVPCMPUQ.html"
             };
 
-        case "VPCMPW":
         case "VPCMPUW":
+        case "VPCMPW":
             return {
                 "html": "<p>Performs a SIMD compare of the packed integer word in the second source operand and the first source operand and returns the results of the comparison to the mask destination operand. The comparison predicate operand (immediate byte) specifies the type of comparison performed on each pair of packed values in the two source operands. The result of each comparison is a single mask bit result of 1 (comparison true) or 0 (comparison false).</p><p>VPCMPW performs a comparison between pairs of signed word values.</p><p>VPCMPUW performs a comparison between pairs of unsigned word values.</p><p>The first source operand (second operand) is a ZMM/YMM/XMM register. The second source operand can be a ZMM/YMM/XMM register or a 512/256/128-bit memory location. The destination operand (first operand) is a mask register k1. Up to 32/16/8 comparisons are performed with results written to the destination operand under the writemask k2.</p><p>The comparison predicate operand is an 8-bit immediate: bits 2:0 define the type of comparison to be performed. Bits 3 through 7 of the immediate are reserved. Compiler can implement the pseudo-op mnemonic listed in Table 5-17.</p>",
                 "tooltip": "Performs a SIMD compare of the packed integer word in the second source operand and the first source operand and returns the results of the comparison to the mask destination operand. The comparison predicate operand (immediate byte) specifies the type of comparison performed on each pair of packed values in the two source operands. The result of each comparison is a single mask bit result of 1 (comparison true) or 0 (comparison false).",
@@ -4324,8 +4327,8 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/VPERMB.html"
             };
 
-        case "VPERMW":
         case "VPERMD":
+        case "VPERMW":
             return {
                 "html": "<p>Copies doublewords (or words) from the second source operand (the third operand) to the destination operand (the first operand) according to the indices in the first source operand (the second operand). Note that this instruction permits a doubleword (word) in the source operand to be copied to more than one location in the destination operand.</p><p>VEX.256 encoded VPERMD: The first and second operands are YMM registers, the third operand can be a YMM register or memory location. Bits (MAXVL-1:256) of the corresponding destination register are zeroed.</p><p>EVEX encoded VPERMD: The first and second operands are ZMM/YMM registers, the third operand can be a ZMM/YMM register, a 512/256-bit memory location or a 512/256-bit vector broadcasted from a 32-bit memory location. The elements in the destination are updated using the writemask k1.</p><p>VPERMW: first and second operands are ZMM/YMM/XMM registers, the third operand can be a ZMM/YMM/XMM register, or a 512/256/128-bit memory location. The destination is updated using the writemask k1.</p><p>EVEX.128 encoded versions: Bits (MAXVL-1:128) of the corresponding ZMM register are zeroed.</p>",
                 "tooltip": "Copies doublewords (or words) from the second source operand (the third operand) to the destination operand (the first operand) according to the indices in the first source operand (the second operand). Note that this instruction permits a doubleword (word) in the source operand to be copied to more than one location in the destination operand.",
@@ -4340,9 +4343,9 @@ export function getAsmOpcode(opcode) {
             };
 
         case "VPERMI2D":
-        case "VPERMI2Q":
         case "VPERMI2PD":
         case "VPERMI2PS":
+        case "VPERMI2Q":
         case "VPERMI2W":
             return {
                 "html": "<p>Permutes 16-bit/32-bit/64-bit values in the second operand (the first source operand) and the third operand (the second source operand) using indices in the first operand to select elements from the second and third operands. The selected elements are written to the destination operand (the first operand) according to the writemask k1.</p><p>The first and second operands are ZMM/YMM/XMM registers. The first operand contains input indices to select elements from the two input tables in the 2nd and 3rd operands. The first operand is also the destination of the result.</p><p>D/Q/PS/PD element versions: The second source operand can be a ZMM/YMM/XMM register, a 512/256/128-bit memory location or a 512/256/128-bit vector broadcasted from a 32/64-bit memory location. Broadcast from the low 32/64-bit memory location is performed if EVEX.b and the id bit for table selection are set (selecting table_2).</p><p>Dword/PS versions: The id bit for table selection is bit 4/3/2, depending on VL=512, 256, 128. Bits [3:0]/[2:0]/[1:0] of each element in the input index vector select an element within the two source operands, If the id bit is 0, table_1 (the first source) is selected; otherwise the second source operand is selected.</p><p>Qword/PD versions: The id bit for table selection is bit 3/2/1, and bits [2:0]/[1:0] /bit 0 selects element within each input table.</p>",
@@ -4393,10 +4396,10 @@ export function getAsmOpcode(opcode) {
             };
 
         case "VPERMT2D":
-        case "VPERMT2PS":
-        case "VPERMT2W":
         case "VPERMT2PD":
+        case "VPERMT2PS":
         case "VPERMT2Q":
+        case "VPERMT2W":
             return {
                 "html": "<p>Permutes 16-bit/32-bit/64-bit values in the first operand and the third operand (the second source operand) using indices in the second operand (the first source operand) to select elements from the first and third operands. The selected elements are written to the destination operand (the first operand) according to the writemask k1.</p><p>The first and second operands are ZMM/YMM/XMM registers. The second operand contains input indices to select elements from the two input tables in the 1st and 3rd operands. The first operand is also the destination of the result.</p><p>D/Q/PS/PD element versions: The second source operand can be a ZMM/YMM/XMM register, a 512/256/128-bit memory location or a 512/256/128-bit vector broadcasted from a 32/64-bit memory location. Broadcast from the low 32/64-bit memory location is performed if EVEX.b and the id bit for table selection are set (selecting table_2).</p><p>Dword/PS versions: The id bit for table selection is bit 4/3/2, depending on VL=512, 256, 128. Bits [3:0]/[2:0]/[1:0] of each element in the input index vector select an element within the two source operands, If the id bit is 0, table_1 (the first source) is selected; otherwise the second source operand is selected.</p><p>Qword/PD versions: The id bit for table selection is bit 3/2/1, and bits [2:0]/[1:0] /bit 0 selects element within each input table.</p>",
                 "tooltip": "Permutes 16-bit/32-bit/64-bit values in the first operand and the third operand (the second source operand) using indices in the second operand (the first source operand) to select elements from the first and third operands. The selected elements are written to the destination operand (the first operand) according to the writemask k1.",
@@ -4417,8 +4420,8 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/VPEXPANDQ.html"
             };
 
-        case "VPGATHERDQ":
         case "VPGATHERDD":
+        case "VPGATHERDQ":
             return {
                 "html": "<p>A set of 16 or 8 doubleword/quadword memory locations pointed to by base address BASE_ADDR and index vector VINDEX with scale SCALE are gathered. The result is written into vector zmm1. The elements are specified via the VSIB (i.e., the index register is a zmm, holding packed indices). Elements will only be loaded if their corresponding mask bit is one. If an element\u2019s mask bit is not set, the corresponding element of the destination register (zmm1) is left unchanged. The entire mask register will be set to zero by this instruction unless it triggers an exception.</p><p>This instruction can be suspended by an exception if at least one element is already gathered (i.e., if the exception is triggered by an element other than the rightmost one with its mask bit set). When this happens, the destination register and the mask register (k1) are partially updated; those elements that have been gathered are placed into the destination register and have their mask bits set to zero. If any traps or interrupts are pending from already gathered elements, they will be delivered in lieu of the exception; in this case, EFLAG.RF is set to one so an instruction breakpoint is not re-triggered when the instruction is continued.</p><p>If the data element size is less than the index element size, the higher part of the destination register and the mask register do not correspond to any elements being gathered. This instruction sets those higher parts to zero. It may update these unused elements to one or both of those registers even if the instruction triggers an exception, and even if the instruction triggers the exception before gathering any elements.</p>",
                 "tooltip": "A set of 16 or 8 doubleword/quadword memory locations pointed to by base address BASE_ADDR and index vector VINDEX with scale SCALE are gathered. The result is written into vector zmm1. The elements are specified via the VSIB (i.e., the index register is a zmm, holding packed indices). Elements will only be loaded if their corresponding mask bit is one. If an element\u2019s mask bit is not set, the corresponding element of the destination register (zmm1) is left unchanged. The entire mask register will be set to zero by this instruction unless it triggers an exception.",
@@ -4463,10 +4466,10 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/VPMASKMOV.html"
             };
 
-        case "VPMOVW2M":
+        case "VPMOVB2M":
         case "VPMOVD2M":
         case "VPMOVQ2M":
-        case "VPMOVB2M":
+        case "VPMOVW2M":
             return {
                 "html": "<p>Converts a vector register to a mask register. Each element in the destination register is set to 1 or 0 depending on the value of most significant bit of the corresponding element in the source register.</p><p>The source operand is a ZMM/YMM/XMM register. The destination operand is a mask register.</p><p>EVEX.vvvv is reserved and must be 1111b otherwise instructions will #UD.</p>",
                 "tooltip": "Converts a vector register to a mask register. Each element in the destination register is set to 1 or 0 depending on the value of most significant bit of the corresponding element in the source register.",
@@ -4482,8 +4485,8 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/VPMOVDB%3AVPMOVSDB%3AVPMOVUSDB.html"
             };
 
-        case "VPMOVSDW":
         case "VPMOVDW":
+        case "VPMOVSDW":
         case "VPMOVUSDW":
             return {
                 "html": "<p>VPMOVDW down converts 32-bit integer elements in the source operand (the second operand) into packed words using truncation. VPMOVSDW converts signed 32-bit integers into packed signed words using signed saturation. VPMOVUSDW convert unsigned double-word values into unsigned word values using unsigned saturation.</p><p>The source operand is a ZMM/YMM/XMM register. The destination operand is a YMM/XMM/XMM register or a 256/128/64-bit memory location.</p><p>Down-converted word elements are written to the destination operand (the first operand) from the least-significant word. Word elements of the destination operand are updated according to the writemask. Bits (MAXVL-1:256/128/64) of the register destination are zeroed.</p><p>EVEX.vvvv is reserved and must be 1111b otherwise instructions will #UD.</p>",
@@ -4491,9 +4494,9 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/VPMOVDW%3AVPMOVSDW%3AVPMOVUSDW.html"
             };
 
+        case "VPMOVM2B":
         case "VPMOVM2D":
         case "VPMOVM2Q":
-        case "VPMOVM2B":
         case "VPMOVM2W":
             return {
                 "html": "<p>Converts a mask register to a vector register. Each element in the destination register is set to all 1\u2019s or all 0\u2019s depending on the value of the corresponding bit in the source mask register.</p><p>The source operand is a mask register. The destination operand is a ZMM/YMM/XMM register.</p><p>EVEX.vvvv is reserved and must be 1111b otherwise instructions will #UD.</p>",
@@ -4501,9 +4504,9 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/VPMOVM2B%3AVPMOVM2W%3AVPMOVM2D%3AVPMOVM2Q.html"
             };
 
+        case "VPMOVQB":
         case "VPMOVSQB":
         case "VPMOVUSQB":
-        case "VPMOVQB":
             return {
                 "html": "<p>VPMOVQB down converts 64-bit integer elements in the source operand (the second operand) into packed byte elements using truncation. VPMOVSQB converts signed 64-bit integers into packed signed bytes using signed saturation. VPMOVUSQB convert unsigned quad-word values into unsigned byte values using unsigned saturation. The source operand is a vector register. The destination operand is an XMM register or a memory location.</p><p>Down-converted byte elements are written to the destination operand (the first operand) from the least-significant byte. Byte elements of the destination operand are updated according to the writemask. Bits (MAXVL-1:64) of the destination are zeroed.</p><p>EVEX.vvvv is reserved and must be 1111b otherwise instructions will #UD.</p>",
                 "tooltip": "VPMOVQB down converts 64-bit integer elements in the source operand (the second operand) into packed byte elements using truncation. VPMOVSQB converts signed 64-bit integers into packed signed bytes using signed saturation. VPMOVUSQB convert unsigned quad-word values into unsigned byte values using unsigned saturation. The source operand is a vector register. The destination operand is an XMM register or a memory location.",
@@ -4511,17 +4514,17 @@ export function getAsmOpcode(opcode) {
             };
 
         case "VPMOVQD":
-        case "VPMOVUSQD":
         case "VPMOVSQD":
+        case "VPMOVUSQD":
             return {
                 "html": "<p>VPMOVQW down converts 64-bit integer elements in the source operand (the second operand) into packed double-words using truncation. VPMOVSQW converts signed 64-bit integers into packed signed doublewords using signed saturation. VPMOVUSQW convert unsigned quad-word values into unsigned double-word values using unsigned saturation.</p><p>The source operand is a ZMM/YMM/XMM register. The destination operand is a YMM/XMM/XMM register or a 256/128/64-bit memory location.</p><p>Down-converted doubleword elements are written to the destination operand (the first operand) from the least-significant doubleword. Doubleword elements of the destination operand are updated according to the writemask. Bits (MAXVL-1:256/128/64) of the register destination are zeroed.</p><p>EVEX.vvvv is reserved and must be 1111b otherwise instructions will #UD.</p>",
                 "tooltip": "VPMOVQW down converts 64-bit integer elements in the source operand (the second operand) into packed double-words using truncation. VPMOVSQW converts signed 64-bit integers into packed signed doublewords using signed saturation. VPMOVUSQW convert unsigned quad-word values into unsigned double-word values using unsigned saturation.",
                 "url": "http://www.felixcloutier.com/x86/VPMOVQD%3AVPMOVSQD%3AVPMOVUSQD.html"
             };
 
-        case "VPMOVUSQW":
-        case "VPMOVSQW":
         case "VPMOVQW":
+        case "VPMOVSQW":
+        case "VPMOVUSQW":
             return {
                 "html": "<p>VPMOVQW down converts 64-bit integer elements in the source operand (the second operand) into packed words using truncation. VPMOVSQW converts signed 64-bit integers into packed signed words using signed saturation. VPMOVUSQW convert unsigned quad-word values into unsigned word values using unsigned saturation.</p><p>The source operand is a ZMM/YMM/XMM register. The destination operand is a XMM register or a 128/64/32-bit memory location.</p><p>Down-converted word elements are written to the destination operand (the first operand) from the least-significant word. Word elements of the destination operand are updated according to the writemask. Bits (MAXVL-1:128/64/32) of the register destination are zeroed.</p><p>EVEX.vvvv is reserved and must be 1111b otherwise instructions will #UD.</p>",
                 "tooltip": "VPMOVQW down converts 64-bit integer elements in the source operand (the second operand) into packed words using truncation. VPMOVSQW converts signed 64-bit integers into packed signed words using signed saturation. VPMOVUSQW convert unsigned quad-word values into unsigned word values using unsigned saturation.",
@@ -4529,8 +4532,8 @@ export function getAsmOpcode(opcode) {
             };
 
         case "VPMOVSWB":
-        case "VPMOVWB":
         case "VPMOVUSWB":
+        case "VPMOVWB":
             return {
                 "html": "<p>VPMOVWB down converts 16-bit integers into packed bytes using truncation. VPMOVSWB converts signed 16-bit integers into packed signed bytes using signed saturation. VPMOVUSWB convert unsigned word values into unsigned byte values using unsigned saturation.</p><p>The source operand is a ZMM/YMM/XMM register. The destination operand is a YMM/XMM/XMM register or a 256/128/64-bit memory location.</p><p>Down-converted byte elements are written to the destination operand (the first operand) from the least-significant byte. Byte elements of the destination operand are updated according to the writemask. Bits (MAXVL-1:256/128/64) of the register destination are zeroed.</p><p>Note: EVEX.vvvv is reserved and must be 1111b otherwise instructions will #UD.</p>",
                 "tooltip": "VPMOVWB down converts 16-bit integers into packed bytes using truncation. VPMOVSWB converts signed 16-bit integers into packed signed bytes using signed saturation. VPMOVUSWB convert unsigned word values into unsigned byte values using unsigned saturation.",
@@ -4544,18 +4547,18 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/VPMULTISHIFTQB.html"
             };
 
+        case "VPROLD":
+        case "VPROLQ":
         case "VPROLVD":
         case "VPROLVQ":
-        case "VPROLQ":
-        case "VPROLD":
             return {
                 "html": "<p>Rotates the bits in the individual data elements (doublewords, or quadword) in the first source operand to the left by the number of bits specified in the count operand. If the value specified by the count operand is greater than 31 (for doublewords), or 63 (for a quadword), then the count operand modulo the data size (32 or 64) is used.</p><p>EVEX.128 encoded version: The destination operand is a XMM register. The source operand is a XMM register or a memory location (for immediate form). The count operand can come either from an XMM register or a memory location or an 8-bit immediate. Bits (MAXVL-1:128) of the corresponding ZMM register are zeroed.</p><p>EVEX.256 encoded version: The destination operand is a YMM register. The source operand is a YMM register or a memory location (for immediate form). The count operand can come either from an XMM register or a memory location or an 8-bit immediate. Bits (MAXVL-1:256) of the corresponding ZMM register are zeroed.</p><p>EVEX.512 encoded version: The destination operand is a ZMM register updated according to the writemask. For the count operand in immediate form, the source operand can be a ZMM register, a 512-bit memory location or a 512-bit vector broadcasted from a 32/64-bit memory location, the count operand is an 8-bit immediate. For the count operand in variable form, the first source operand (the second operand) is a ZMM register and the counter operand (the third operand) is a ZMM register, a 512-bit memory location or a 512-bit vector broadcasted from a 32/64-bit memory location.</p>",
                 "tooltip": "Rotates the bits in the individual data elements (doublewords, or quadword) in the first source operand to the left by the number of bits specified in the count operand. If the value specified by the count operand is greater than 31 (for doublewords), or 63 (for a quadword), then the count operand modulo the data size (32 or 64) is used.",
                 "url": "http://www.felixcloutier.com/x86/VPROLD%3AVPROLVD%3AVPROLQ%3AVPROLVQ.html"
             };
 
-        case "VPRORQ":
         case "VPRORD":
+        case "VPRORQ":
         case "VPRORVD":
         case "VPRORVQ":
             return {
@@ -4564,8 +4567,8 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/VPRORD%3AVPRORVD%3AVPRORQ%3AVPRORVQ.html"
             };
 
-        case "VPSCATTERDQ":
         case "VPSCATTERDD":
+        case "VPSCATTERDQ":
         case "VPSCATTERQD":
         case "VPSCATTERQQ":
             return {
@@ -4575,26 +4578,26 @@ export function getAsmOpcode(opcode) {
             };
 
         case "VPSLLVD":
-        case "VPSLLVW":
         case "VPSLLVQ":
+        case "VPSLLVW":
             return {
                 "html": "<p>Shifts the bits in the individual data elements (words, doublewords or quadword) in the first source operand to the left by the count value of respective data elements in the second source operand. As the bits in the data elements are shifted left, the empty low-order bits are cleared (set to 0).</p><p>The count values are specified individually in each data element of the second source operand. If the unsigned integer value specified in the respective data element of the second source operand is greater than 15 (for word), 31 (for doublewords), or 63 (for a quadword), then the destination data element are written with 0.</p><p>VEX.128 encoded version: The destination and first source operands are XMM registers. The count operand can be either an XMM register or a 128-bit memory location. Bits (MAXVL-1:128) of the corresponding destination register are zeroed.</p><p>VEX.256 encoded version: The destination and first source operands are YMM registers. The count operand can be either an YMM register or a 256-bit memory. Bits (MAXVL-1:256) of the corresponding ZMM register are zeroed.</p><p>EVEX encoded VPSLLVD/Q: The destination and first source operands are ZMM/YMM/XMM registers. The count operand can be either a ZMM/YMM/XMM register, a 512/256/128-bit memory location or a 512-bit vector broadcasted from a 32/64-bit memory location. The destination is conditionally updated with writemask k1.</p>",
                 "tooltip": "Shifts the bits in the individual data elements (words, doublewords or quadword) in the first source operand to the left by the count value of respective data elements in the second source operand. As the bits in the data elements are shifted left, the empty low-order bits are cleared (set to 0).",
                 "url": "http://www.felixcloutier.com/x86/VPSLLVW%3AVPSLLVD%3AVPSLLVQ.html"
             };
 
-        case "VPSRAVW":
         case "VPSRAVD":
         case "VPSRAVQ":
+        case "VPSRAVW":
             return {
                 "html": "<p>Shifts the bits in the individual data elements (word/doublewords/quadword) in the first source operand (the second operand) to the right by the number of bits specified in the count value of respective data elements in the second source operand (the third operand). As the bits in the data elements are shifted right, the empty high-order bits are set to the MSB (sign extension).</p><p>The count values are specified individually in each data element of the second source operand. If the unsigned integer value specified in the respective data element of the second source operand is greater than 15 (for words), 31 (for doublewords), or 63 (for a quadword), then the destination data element is filled with the corresponding sign bit of the source element.</p><p>VEX.128 encoded version: The destination and first source operands are XMM registers. The count operand can be either an XMM register or a 128-bit memory location. Bits (MAXVL-1:128) of the corresponding destination register are zeroed.</p><p>VEX.256 encoded version: The destination and first source operands are YMM registers. The count operand can be either an YMM register or a 256-bit memory. Bits (MAXVL-1:256) of the corresponding destination register are zeroed.</p><p>EVEX.512/256/128 encoded VPSRAVD/W: The destination and first source operands are ZMM/YMM/XMM registers. The count operand can be either a ZMM/YMM/XMM register, a 512/256/128-bit memory location or a 512/256/128-bit vector broadcasted from a 32/64-bit memory location. The destination is conditionally updated with writemask k1.</p>",
                 "tooltip": "Shifts the bits in the individual data elements (word/doublewords/quadword) in the first source operand (the second operand) to the right by the number of bits specified in the count value of respective data elements in the second source operand (the third operand). As the bits in the data elements are shifted right, the empty high-order bits are set to the MSB (sign extension).",
                 "url": "http://www.felixcloutier.com/x86/VPSRAVW%3AVPSRAVD%3AVPSRAVQ.html"
             };
 
-        case "VPSRLVW":
         case "VPSRLVD":
         case "VPSRLVQ":
+        case "VPSRLVW":
             return {
                 "html": "<p>Shifts the bits in the individual data elements (words, doublewords or quadword) in the first source operand to the right by the count value of respective data elements in the second source operand. As the bits in the data elements are shifted right, the empty high-order bits are cleared (set to 0).</p><p>The count values are specified individually in each data element of the second source operand. If the unsigned integer value specified in the respective data element of the second source operand is greater than 15 (for word), 31 (for doublewords), or 63 (for a quadword), then the destination data element are written with 0.</p><p>VEX.128 encoded version: The destination and first source operands are XMM registers. The count operand can be either an XMM register or a 128-bit memory location. Bits (MAXVL-1:128) of the corresponding destination register are zeroed.</p><p>VEX.256 encoded version: The destination and first source operands are YMM registers. The count operand can be either an YMM register or a 256-bit memory. Bits (MAXVL-1:256) of the corresponding ZMM register are zeroed.</p><p>EVEX encoded VPSRLVD/Q: The destination and first source operands are ZMM/YMM/XMM registers. The count operand can be either a ZMM/YMM/XMM register, a 512/256/128-bit memory location or a 512-bit vector broadcasted from a 32/64-bit memory location. The destination is conditionally updated with writemask k1.</p>",
                 "tooltip": "Shifts the bits in the individual data elements (words, doublewords or quadword) in the first source operand to the right by the count value of respective data elements in the second source operand. As the bits in the data elements are shifted right, the empty high-order bits are cleared (set to 0).",
@@ -4611,18 +4614,18 @@ export function getAsmOpcode(opcode) {
 
         case "VPTESTMB":
         case "VPTESTMD":
-        case "VPTESTMW":
         case "VPTESTMQ":
+        case "VPTESTMW":
             return {
                 "html": "<p>Performs a bitwise logical AND operation on the first source operand (the second operand) and second source operand (the third operand) and stores the result in the destination operand (the first operand) under the writemask. Each bit of the result is set to 1 if the bitwise AND of the corresponding elements of the first and second src operands is non-zero; otherwise it is set to 0.</p><p>VPTESTMD/VPTESTMQ: The first source operand is a ZMM/YMM/XMM register. The second source operand can be a ZMM/YMM/XMM register, a 512/256/128-bit memory location or a 512/256/128-bit vector broadcasted from a 32/64-bit memory location. The destination operand is a mask register updated under the writemask.</p><p>VPTESTMB/VPTESTMW: The first source operand is a ZMM/YMM/XMM register. The second source operand can be a ZMM/YMM/XMM register or a 512/256/128-bit memory location. The destination operand is a mask register updated under the writemask.</p>",
                 "tooltip": "Performs a bitwise logical AND operation on the first source operand (the second operand) and second source operand (the third operand) and stores the result in the destination operand (the first operand) under the writemask. Each bit of the result is set to 1 if the bitwise AND of the corresponding elements of the first and second src operands is non-zero; otherwise it is set to 0.",
                 "url": "http://www.felixcloutier.com/x86/VPTESTMB%3AVPTESTMW%3AVPTESTMD%3AVPTESTMQ.html"
             };
 
-        case "VPTESTNMW":
         case "VPTESTNMB":
         case "VPTESTNMD":
         case "VPTESTNMQ":
+        case "VPTESTNMW":
             return {
                 "html": "<p>Performs a bitwise logical NAND operation on the byte/word/doubleword/quadword element of the first source operand (the second operand) with the corresponding element of the second source operand (the third operand) and stores the logical comparison result into each bit of the destination operand (the first operand) according to the writemask k1. Each bit of the result is set to 1 if the bitwise AND of the corresponding elements of the first and second src operands is zero; otherwise it is set to 0.</p><p>EVEX encoded VPTESTNMD/Q: The first source operand is a ZMM/YMM/XMM registers. The second source operand can be a ZMM/YMM/XMM register, a 512/256/128-bit memory location, or a 512/256/128-bit vector broadcasted from a 32/64-bit memory location. The destination is updated according to the writemask.</p><p>EVEX encoded VPTESTNMB/W: The first source operand is a ZMM/YMM/XMM registers. The second source operand can be a ZMM/YMM/XMM register, a 512/256/128-bit memory location. The destination is updated according to the writemask.</p>",
                 "tooltip": "Performs a bitwise logical NAND operation on the byte/word/doubleword/quadword element of the first source operand (the second operand) with the corresponding element of the second source operand (the third operand) and stores the logical comparison result into each bit of the destination operand (the first operand) according to the writemask k1. Each bit of the result is set to 1 if the bitwise AND of the corresponding elements of the first and second src operands is zero; otherwise it is set to 0.",
@@ -4853,10 +4856,10 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/VSCALEFSS.html"
             };
 
+        case "VSCATTERDPD":
         case "VSCATTERDPS":
         case "VSCATTERQPD":
         case "VSCATTERQPS":
-        case "VSCATTERDPD":
             return {
                 "html": "<p>Stores up to 16 elements (or 8 elements) in doubleword/quadword vector zmm1 to the memory locations pointed by base address BASE_ADDR and index vector VINDEX, with scale SCALE. The elements are specified via the VSIB (i.e., the index register is a vector register, holding packed indices). Elements will only be stored if their corresponding mask bit is one. The entire mask register will be set to zero by this instruction unless it triggers an exception.</p><p>This instruction can be suspended by an exception if at least one element is already scattered (i.e., if the exception is triggered by an element other than the rightmost one with its mask bit set). When this happens, the destination register and the mask register (k1) are partially updated. If any traps or interrupts are pending from already scattered elements, they will be delivered in lieu of the exception; in this case, EFLAG.RF is set to one so an instruction breakpoint is not re-triggered when the instruction is continued.</p>",
                 "tooltip": "Stores up to 16 elements (or 8 elements) in doubleword/quadword vector zmm1 to the memory locations pointed by base address BASE_ADDR and index vector VINDEX, with scale SCALE. The elements are specified via the VSIB (i.e., the index register is a vector register, holding packed indices). Elements will only be stored if their corresponding mask bit is one. The entire mask register will be set to zero by this instruction unless it triggers an exception.",
@@ -4864,19 +4867,19 @@ export function getAsmOpcode(opcode) {
             };
 
         case "VSCATTERPF0DPD":
-        case "VSCATTERPF0QPS":
-        case "VSCATTERPF0QPD":
         case "VSCATTERPF0DPS":
+        case "VSCATTERPF0QPD":
+        case "VSCATTERPF0QPS":
             return {
                 "html": "<p>The instruction conditionally prefetches up to sixteen 32-bit or eight 64-bit integer byte data elements. The elements are specified via the VSIB (i.e., the index register is an zmm, holding packed indices). Elements will only be prefetched if their corresponding mask bit is one.</p><p>cache lines will be brought into exclusive state (RFO) specified by a locality hint (T0):</p><p>[PS data] For dword indices, the instruction will prefetch sixteen memory locations. For qword indices, the instruction will prefetch eight values.</p><p>[PD data] For dword and qword indices, the instruction will prefetch eight memory locations.</p>",
                 "tooltip": "The instruction conditionally prefetches up to sixteen 32-bit or eight 64-bit integer byte data elements. The elements are specified via the VSIB (i.e., the index register is an zmm, holding packed indices). Elements will only be prefetched if their corresponding mask bit is one.",
                 "url": "http://www.felixcloutier.com/x86/VSCATTERPF0DPS%3AVSCATTERPF0QPS%3AVSCATTERPF0DPD%3AVSCATTERPF0QPD.html"
             };
 
-        case "VSCATTERPF1QPS":
-        case "VSCATTERPF1DPS":
         case "VSCATTERPF1DPD":
+        case "VSCATTERPF1DPS":
         case "VSCATTERPF1QPD":
+        case "VSCATTERPF1QPS":
             return {
                 "html": "<p>The instruction conditionally prefetches up to sixteen 32-bit or eight 64-bit integer byte data elements. The elements are specified via the VSIB (i.e., the index register is an zmm, holding packed indices). Elements will only be prefetched if their corresponding mask bit is one.</p><p>cache lines will be brought into exclusive state (RFO) specified by a locality hint (T1):</p><p>[PS data] For dword indices, the instruction will prefetch sixteen memory locations. For qword indices, the instruction will prefetch eight values.</p><p>[PD data] For dword and qword indices, the instruction will prefetch eight memory locations.</p>",
                 "tooltip": "The instruction conditionally prefetches up to sixteen 32-bit or eight 64-bit integer byte data elements. The elements are specified via the VSIB (i.e., the index register is an zmm, holding packed indices). Elements will only be prefetched if their corresponding mask bit is one.",
@@ -4884,9 +4887,9 @@ export function getAsmOpcode(opcode) {
             };
 
         case "VSHUFF32X4":
-        case "VSHUFI64X2":
-        case "VSHUFI32X4":
         case "VSHUFF64X2":
+        case "VSHUFI32X4":
+        case "VSHUFI64X2":
             return {
                 "html": "<p>256-bit Version: Moves one of the two 128-bit packed single-precision floating-point values from the first source operand (second operand) into the low 128-bit of the destination operand (first operand); moves one of the two packed 128-bit floating-point values from the second source operand (third operand) into the high 128-bit of the destination operand. The selector operand (third operand) determines which values are moved to the destination operand.</p><p>512-bit Version: Moves two of the four 128-bit packed single-precision floating-point values from the first source operand (second operand) into the low 256-bit of each double qword of the destination operand (first operand); moves two of the four packed 128-bit floating-point values from the second source operand (third operand) into the high 256-bit of the destination operand. The selector operand (third operand) determines which values are moved to the destination operand.</p><p>The first source operand is a vector register. The second source operand can be a ZMM register, a 512-bit memory location or a 512-bit vector broadcasted from a 32/64-bit memory location. The destination operand is a vector register.</p><p>The writemask updates the destination operand with the granularity of 32/64-bit data elements.</p>",
                 "tooltip": "256-bit Version: Moves one of the two 128-bit packed single-precision floating-point values from the first source operand (second operand) into the low 128-bit of the destination operand (first operand); moves one of the two packed 128-bit floating-point values from the second source operand (third operand) into the high 128-bit of the destination operand. The selector operand (third operand) determines which values are moved to the destination operand.",
@@ -4915,8 +4918,8 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/VZEROUPPER.html"
             };
 
-        case "WAIT":
         case "FWAIT":
+        case "WAIT":
             return {
                 "html": "<p>Causes the processor to check for and handle pending, unmasked, floating-point exceptions before proceeding. (FWAIT is an alternate mnemonic for WAIT.)</p><p>This instruction is useful for synchronizing exceptions in critical sections of code. Coding a WAIT instruction after a floating-point instruction ensures that any unmasked floating-point exceptions the instruction may raise are handled before the processor can modify the instruction\u2019s results. See the section titled \u201cFloating-Point Exception Synchronization\u201d in Chapter 8 of the <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>, for more information on using the WAIT/FWAIT instruction.</p><p>This instruction\u2019s operation is the same in non-64-bit modes and 64-bit mode.</p>",
                 "tooltip": "Causes the processor to check for and handle pending, unmasked, floating-point exceptions before proceeding. (FWAIT is an alternate mnemonic for WAIT.)",
@@ -4959,8 +4962,8 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/XABORT.html"
             };
 
-        case "XRELEASE":
         case "XACQUIRE":
+        case "XRELEASE":
             return {
                 "html": "<p>The XACQUIRE prefix is a hint to start lock elision on the memory address specified by the instruction and the XRELEASE prefix is a hint to end lock elision on the memory address specified by the instruction.</p><p>The XACQUIRE prefix hint can only be used with the following instructions (these instructions are also referred to as XACQUIRE-enabled when used with the XACQUIRE prefix):</p><p>The XRELEASE prefix hint can only be used with the following instructions (also referred to as XRELEASE-enabled when used with the XRELEASE prefix):</p><p>The lock variables must satisfy the guidelines described in <em>Intel\u00ae 64 and IA-32 Architectures Software Developer\u2019s Manual, Volume 1</em>, Section 16.3.3, for elision to be successful, otherwise an HLE abort may be signaled.</p><p>If an encoded byte sequence that meets XACQUIRE/XRELEASE requirements includes both prefixes, then the HLE semantic is determined by the prefix byte that is placed closest to the instruction opcode. For example, an F3F2C6 will not be treated as a XRELEASE-enabled instruction since the F2H (XACQUIRE) is closest to the instruction opcode C6. Similarly, an F2F3F0 prefixed instruction will be treated as a XRELEASE-enabled instruction since F3H (XRELEASE) is closest to the instruction opcode.</p>",
                 "tooltip": "The XACQUIRE prefix is a hint to start lock elision on the memory address specified by the instruction and the XRELEASE prefix is a hint to end lock elision on the memory address specified by the instruction.",
@@ -5002,8 +5005,8 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/XGETBV.html"
             };
 
-        case "XLATB":
         case "XLAT":
+        case "XLATB":
             return {
                 "html": "<p>Locates a byte entry in a table in memory, using the contents of the AL register as a table index, then copies the contents of the table entry back into the AL register. The index in the AL register is treated as an unsigned integer. The XLAT and XLATB instructions get the base address of the table in memory from either the DS:EBX or the DS:BX registers (depending on the address-size attribute of the instruction, 32 or 16, respectively). (The DS segment may be overridden with a segment override prefix.)</p><p>At the assembly-code level, two forms of this instruction are allowed: the \u201cexplicit-operand\u201d form and the \u201cno-operand\u201d form. The explicit-operand form (specified with the XLAT mnemonic) allows the base address of the table to be specified explicitly with a symbol. This explicit-operands form is provided to allow documentation; however, note that the documentation provided by this form can be misleading. That is, the symbol does not have to specify the correct base address. The base address is always specified by the DS:(E)BX registers, which must be loaded correctly before the XLAT instruction is executed.</p><p>The no-operands form (XLATB) provides a \u201cshort form\u201d of the XLAT instructions. Here also the processor assumes that the DS:(E)BX registers contain the base address of the table.</p><p>In 64-bit mode, operation is similar to that in legacy or compatibility mode. AL is used to specify the table index (the operand size is fixed at 8 bits). RBX, however, is used to specify the table\u2019s base address. See the summary chart at the beginning of this section for encoding data and limits.</p>",
                 "tooltip": "Locates a byte entry in a table in memory, using the contents of the AL register as a table index, then copies the contents of the table entry back into the AL register. The index in the AL register is treated as an unsigned integer. The XLAT and XLATB instructions get the base address of the table in memory from either the DS:EBX or the DS:BX registers (depending on the address-size attribute of the instruction, 32 or 16, respectively). (The DS segment may be overridden with a segment override prefix.)",
@@ -5025,8 +5028,8 @@ export function getAsmOpcode(opcode) {
                 "url": "http://www.felixcloutier.com/x86/XORPD.html"
             };
 
-        case "XORPS":
         case "VXORPS":
+        case "XORPS":
             return {
                 "html": "<p>Performs a bitwise logical XOR of the four, eight or sixteen packed single-precision floating-point values from the first source operand and the second source operand, and stores the result in the destination operand</p><p>EVEX.512 encoded version: The first source operand is a ZMM register. The second source operand can be a ZMM register or a vector memory location. The destination operand is a ZMM register conditionally updated with writemask k1.</p><p>VEX.256 and EVEX.256 encoded versions: The first source operand is a YMM register. The second source operand is a YMM register or a 256-bit memory location. The destination operand is a YMM register (conditionally updated with writemask k1 in case of EVEX). The upper bits (MAXVL-1:256) of the corresponding ZMM register destination are zeroed.</p><p>VEX.128 and EVEX.128 encoded versions: The first source operand is an XMM register. The second source operand is an XMM register or 128-bit memory location. The destination operand is an XMM register (conditionally updated with writemask k1 in case of EVEX). The upper bits (MAXVL-1:128) of the corresponding ZMM register destination are zeroed.</p><p>128-bit Legacy SSE version: The second source can be an XMM register or an 128-bit memory location. The destination is not distinct from the first source XMM register and the upper bits (MAXVL-1:128) of the corresponding register destination are unmodified.</p>",
                 "tooltip": "Performs a bitwise logical XOR of the four, eight or sixteen packed single-precision floating-point values from the first source operand and the second source operand, and stores the result in the destination operand",
author	Jeremy Rifkin <51220084+jeremy-rifkin@users.noreply.github.com>	2022-04-23 16:04:09 -0400
committer	GitHub <noreply@github.com>	2022-04-23 22:04:09 +0200
commit	16b071b9f0c97faa82149e983dd708f0ecfcb8ef (patch)
tree	71b336739d85d25b8d7720e919af668d11b4e084 /lib/asm-docs/generated
parent	ac8a45ce8a4fe786483e4971fe05b94169f875cd (diff)
download	compiler-explorer-16b071b9f0c97faa82149e983dd708f0ecfcb8ef.tar.gz compiler-explorer-16b071b9f0c97faa82149e983dd708f0ecfcb8ef.zip