mirror of https://github.com/golang/go.git
doc: update the architecture-specific information in asm.html
Still to do: ARM64 and PPC64. These architectures are woefully underdocumented.

Change-Id: Iedcf767a7e0e1c931812351940bc08f0c3821212
Reviewed-on: https://go-review.googlesource.com/12110
Reviewed-by: Russ Cox <rsc@golang.org>
parent 1bd1880906
commit 3c5eb96001
186
doc/asm.html
|
|
@@ -514,42 +514,61 @@ even pointers to stack data must not be kept in local variables.
|
||||||
|
|
||||||
<p>
|
<p>
|
||||||
It is impractical to list all the instructions and other details for each machine.
|
It is impractical to list all the instructions and other details for each machine.
|
||||||
To see what instructions are defined for a given machine, say 32-bit Intel x86,
|
To see what instructions are defined for a given machine, say ARM,
|
||||||
look in the top-level header file for the corresponding linker, in this case <code>8l</code>.
|
look in the source for the <code>obj</code> support library for
|
||||||
That is, the file <code>$GOROOT/src/cmd/8l/8.out.h</code> contains a C enumeration, called <code>as</code>,
|
that architecture, located in the directory <code>src/cmd/internal/obj/arm</code>.
|
||||||
of the instructions and their spellings as known to the assembler and linker for that architecture.
|
In that directory is a file <code>a.out.go</code>; it contains
|
||||||
In that file you'll find a declaration that begins
|
a long list of constants starting with <code>A</code>, like this:
|
||||||
</p>
|
</p>
|
||||||
|
|
||||||
<pre>
|
<pre>
|
||||||
enum as
|
const (
|
||||||
{
|
AAND = obj.ABaseARM + obj.A_ARCHSPECIFIC + iota
|
||||||
AXXX,
|
AEOR
|
||||||
AAAA,
|
ASUB
|
||||||
AAAD,
|
ARSB
|
||||||
AAAM,
|
AADD
|
||||||
AAAS,
|
|
||||||
AADCB,
|
|
||||||
...
|
...
|
||||||
</pre>
|
</pre>
|
||||||
|
|
||||||
<p>
|
<p>
|
||||||
Each instruction begins with a initial capital <code>A</code> in this list, so <code>AADCB</code>
|
This is the list of instructions and their spellings as known to the assembler and linker for that architecture.
|
||||||
represents the <code>ADCB</code> (add carry byte) instruction.
|
Each instruction begins with an initial capital <code>A</code> in this list, so <code>AAND</code>
|
||||||
The enumeration is in alphabetical order, plus some late additions (<code>AXXX</code> occupies
|
represents the bitwise and instruction,
|
||||||
the zero slot as an invalid instruction).
|
<code>AND</code> (without the leading <code>A</code>),
|
||||||
The sequence has nothing to do with the actual encoding of the machine instructions.
|
and is written in assembly source as <code>AND</code>.
|
||||||
Again, the linker takes care of that detail.
|
The enumeration is mostly in alphabetical order.
|
||||||
|
(The architecture-independent <code>AXXX</code>, defined in the
|
||||||
|
<code>cmd/internal/obj</code> package,
|
||||||
|
represents an invalid instruction).
|
||||||
|
The sequence of the <code>A</code> names has nothing to do with the actual
|
||||||
|
encoding of the machine instructions.
|
||||||
|
The <code>cmd/internal/obj</code> package takes care of that detail.
|
||||||
|
</p>
|
||||||
|
|
||||||
|
<p>
|
||||||
|
The instructions for both the 386 and AMD64 architectures are listed in
|
||||||
|
<code>cmd/internal/obj/x86/a.out.go</code>.
|
||||||
|
</p>
|
||||||
|
|
||||||
|
<p>
|
||||||
|
The architectures share syntax for common addressing modes such as
|
||||||
|
<code>(R1)</code> (register indirect),
|
||||||
|
<code>4(R1)</code> (register indirect with offset), and
|
||||||
|
<code>$foo(SB)</code> (absolute address).
|
||||||
|
The assembler also supports some (not necessarily all) addressing modes
|
||||||
|
specific to each architecture.
|
||||||
|
The sections below list these.
|
||||||
</p>
|
</p>
|
||||||
|
|
||||||
<p>
|
<p>
|
||||||
One detail evident in the examples from the previous sections is that data in the instructions flows from left to right:
|
One detail evident in the examples from the previous sections is that data in the instructions flows from left to right:
|
||||||
<code>MOVQ</code> <code>$0,</code> <code>CX</code> clears <code>CX</code>.
|
<code>MOVQ</code> <code>$0,</code> <code>CX</code> clears <code>CX</code>.
|
||||||
This convention applies even on architectures where the usual mode is the opposite direction.
|
This rule applies even on architectures where the conventional notation uses the opposite direction.
|
||||||
</p>
|
</p>
|
||||||
|
|
||||||
<p>
|
<p>
|
||||||
Here follows some descriptions of key Go-specific details for the supported architectures.
|
Here follow some descriptions of key Go-specific details for the supported architectures.
|
||||||
</p>
|
</p>
|
||||||
|
|
||||||
<h3 id="x86">32-bit Intel 386</h3>
|
<h3 id="x86">32-bit Intel 386</h3>
|
||||||
|
|
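The naming scheme described in this hunk can be sketched in ordinary Go. This is a minimal illustration, not the contents of `a.out.go`: the type `As`, the base value `ABaseSketch` (standing in for `obj.ABaseARM + obj.A_ARCHSPECIFIC`), the `anames` table, and the `Mnemonic` helper are all hypothetical names chosen for the example. It shows only that the constants are consecutive `iota` values and that the assembly spelling is the constant name without its leading `A`.

```go
package main

import (
	"fmt"
	"strings"
)

// As is a hypothetical stand-in for the assembler's opcode type.
type As int

// ABaseSketch is an assumed base value for illustration only; the real
// base is obj.ABaseARM + obj.A_ARCHSPECIFIC in cmd/internal/obj.
const ABaseSketch As = 1000

// The first few ARM opcodes, in the order shown in the diff.
const (
	AAND As = ABaseSketch + iota
	AEOR
	ASUB
	ARSB
	AADD
)

// anames lists the constant names in declaration order (hypothetical table).
var anames = []string{"AAND", "AEOR", "ASUB", "ARSB", "AADD"}

// Mnemonic returns the spelling used in assembly source for op:
// the constant name with the leading "A" removed.
func Mnemonic(op As) (string, bool) {
	i := int(op - ABaseSketch)
	if i < 0 || i >= len(anames) {
		return "", false
	}
	return strings.TrimPrefix(anames[i], "A"), true
}

func main() {
	for _, op := range []As{AAND, AEOR, ASUB, ARSB, AADD} {
		name, _ := Mnemonic(op)
		fmt.Printf("%d %s\n", op, name)
	}
}
```

The numeric values printed here are artifacts of the sketch. As the text says, the order of the `A` names has nothing to do with the machine encoding.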
@@ -558,11 +577,11 @@ Here follows some descriptions of key Go-specific details for the supported arch
|
||||||
The runtime pointer to the <code>g</code> structure is maintained
|
The runtime pointer to the <code>g</code> structure is maintained
|
||||||
through the value of an otherwise unused (as far as Go is concerned) register in the MMU.
|
through the value of an otherwise unused (as far as Go is concerned) register in the MMU.
|
||||||
A OS-dependent macro <code>get_tls</code> is defined for the assembler if the source includes
|
A OS-dependent macro <code>get_tls</code> is defined for the assembler if the source includes
|
||||||
an architecture-dependent header file, like this:
|
a special header, <code>go_asm.h</code>:
|
||||||
</p>
|
</p>
|
||||||
|
|
||||||
<pre>
|
<pre>
|
||||||
#include "zasm_GOOS_GOARCH.h"
|
#include "go_asm.h"
|
||||||
</pre>
|
</pre>
|
||||||
|
|
||||||
<p>
|
<p>
|
||||||
|
|
@@ -575,21 +594,39 @@ The sequence to load <code>g</code> and <code>m</code> using <code>CX</code> loo
|
||||||
<pre>
|
<pre>
|
||||||
get_tls(CX)
|
get_tls(CX)
|
||||||
MOVL g(CX), AX // Move g into AX.
|
MOVL g(CX), AX // Move g into AX.
|
||||||
MOVL g_m(AX), BX // Move g->m into BX.
|
MOVL g_m(AX), BX // Move g.m into BX.
|
||||||
</pre>
|
</pre>
|
||||||
|
|
||||||
|
<p>
|
||||||
|
Addressing modes:
|
||||||
|
</p>
|
||||||
|
|
||||||
|
<ul>
|
||||||
|
|
||||||
|
<li>
|
||||||
|
<code>(DI)(BX*2)</code>: The location at address <code>DI</code> plus <code>BX*2</code>.
|
||||||
|
</li>
|
||||||
|
|
||||||
|
<li>
|
||||||
|
<code>64(DI)(BX*2)</code>: The location at address <code>DI</code> plus <code>BX*2</code> plus 64.
|
||||||
|
These modes accept only 1, 2, 4, and 8 as scale factors.
|
||||||
|
</li>
|
||||||
|
|
||||||
|
</ul>
|
||||||
|
|
||||||
<h3 id="amd64">64-bit Intel 386 (a.k.a. amd64)</h3>
|
<h3 id="amd64">64-bit Intel 386 (a.k.a. amd64)</h3>
|
||||||
|
|
||||||
<p>
|
<p>
|
||||||
The assembly code to access the <code>m</code> and <code>g</code>
|
The two architectures behave largely the same at the assembler level.
|
||||||
pointers is the same as on the 386, except it uses <code>MOVQ</code> rather than
|
Assembly code to access the <code>m</code> and <code>g</code>
|
||||||
<code>MOVL</code>:
|
pointers on the 64-bit version is the same as on the 32-bit 386,
|
||||||
|
except it uses <code>MOVQ</code> rather than <code>MOVL</code>:
|
||||||
</p>
|
</p>
|
||||||
|
|
||||||
<pre>
|
<pre>
|
||||||
get_tls(CX)
|
get_tls(CX)
|
||||||
MOVQ g(CX), AX // Move g into AX.
|
MOVQ g(CX), AX // Move g into AX.
|
||||||
MOVQ g_m(AX), BX // Move g->m into BX.
|
MOVQ g_m(AX), BX // Move g.m into BX.
|
||||||
</pre>
|
</pre>
|
||||||
|
|
||||||
<h3 id="arm">ARM</h3>
|
<h3 id="arm">ARM</h3>
|
||||||
|
|
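The scaled-index addressing modes added for the 386 in this hunk amount to simple arithmetic. The sketch below models it in Go under stated assumptions: `EffectiveAddress` is a hypothetical helper, register contents are passed as plain 32-bit values, and overflow wraps as it would in 32-bit address arithmetic. It is not how the assembler itself represents operands.

```go
package main

import "fmt"

// EffectiveAddress models an operand such as 64(DI)(BX*2):
// displacement + base register + index register * scale.
// Only 1, 2, 4, and 8 are accepted as scale factors.
func EffectiveAddress(disp int32, base, index uint32, scale uint8) (uint32, error) {
	switch scale {
	case 1, 2, 4, 8:
		// valid scale factor
	default:
		return 0, fmt.Errorf("invalid scale factor %d", scale)
	}
	// Unsigned arithmetic wraps modulo 2^32, so a negative
	// displacement converted to uint32 still subtracts correctly.
	return base + index*uint32(scale) + uint32(disp), nil
}

func main() {
	// (DI)(BX*2) with DI = 0x1000 and BX = 3.
	a, _ := EffectiveAddress(0, 0x1000, 3, 2)
	fmt.Printf("%#x\n", a)

	// 64(DI)(BX*2) with the same register values.
	b, _ := EffectiveAddress(64, 0x1000, 3, 2)
	fmt.Printf("%#x\n", b)

	// A scale of 3 is rejected.
	_, err := EffectiveAddress(0, 0x1000, 3, 3)
	fmt.Println(err)
}
```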
@@ -626,6 +663,85 @@ The name <code>SP</code> always refers to the virtual stack pointer described ea
|
||||||
For the hardware register, use <code>R13</code>.
|
For the hardware register, use <code>R13</code>.
|
||||||
</p>
|
</p>
|
||||||
|
|
||||||
|
<p>
|
||||||
|
Addressing modes:
|
||||||
|
</p>
|
||||||
|
|
||||||
|
<ul>
|
||||||
|
|
||||||
|
<li>
|
||||||
|
<code>R0->16</code>
|
||||||
|
<br>
|
||||||
|
<code>R0>>16</code>
|
||||||
|
<br>
|
||||||
|
<code>R0<<16</code>
|
||||||
|
<br>
|
||||||
|
<code>R0@>16</code>:
|
||||||
|
For <code><<</code>, left shift <code>R0</code> by 16 bits.
|
||||||
|
The other codes are <code>-></code> (arithmetic right shift),
|
||||||
|
<code>>></code> (logical right shift), and
|
||||||
|
<code>@></code> (rotate right).
|
||||||
|
</li>
|
||||||
|
|
||||||
|
<li>
|
||||||
|
<code>R0->R1</code>
|
||||||
|
<br>
|
||||||
|
<code>R0>>R1</code>
|
||||||
|
<br>
|
||||||
|
<code>R0<<R1</code>
|
||||||
|
<br>
|
||||||
|
<code>R0@>R1</code>:
|
||||||
|
For <code><<</code>, left shift <code>R0</code> by the count in <code>R1</code>.
|
||||||
|
The other codes are <code>-></code> (arithmetic right shift),
|
||||||
|
<code>>></code> (logical right shift), and
|
||||||
|
<code>@></code> (rotate right).
|
||||||
|
|
||||||
|
</li>
|
||||||
|
|
||||||
|
<li>
|
||||||
|
<code>[R0,g,R12-R15]</code>: For multi-register instructions, the set comprising
|
||||||
|
<code>R0</code>, <code>g</code>, and <code>R12</code> through <code>R15</code> inclusive.
|
||||||
|
</li>
|
||||||
|
|
||||||
|
</ul>
|
||||||
|
|
||||||
|
<h3 id="arm64">ARM64</h3>
|
||||||
|
|
||||||
|
<p>
|
||||||
|
TODO
|
||||||
|
</p>
|
||||||
|
|
||||||
|
<p>
|
||||||
|
Addressing modes:
|
||||||
|
</p>
|
||||||
|
|
||||||
|
<ul>
|
||||||
|
|
||||||
|
<li>
|
||||||
|
TODO
|
||||||
|
</li>
|
||||||
|
|
||||||
|
</ul>
|
||||||
|
|
||||||
|
<h3 id="ppc64">Power64, a.k.a. ppc64</h3>
|
||||||
|
|
||||||
|
<p>
|
||||||
|
TODO
|
||||||
|
</p>
|
||||||
|
|
||||||
|
<p>
|
||||||
|
Addressing modes:
|
||||||
|
</p>
|
||||||
|
|
||||||
|
<ul>
|
||||||
|
|
||||||
|
<li>
|
||||||
|
<code>(R5)(R6*1)</code>: The location at <code>R5</code> plus <code>R6</code>. It is a scaled
|
||||||
|
mode like on the x86, but the only scale allowed is <code>1</code>.
|
||||||
|
</li>
|
||||||
|
|
||||||
|
</ul>
|
||||||
|
|
||||||
<h3 id="unsupported_opcodes">Unsupported opcodes</h3>
|
<h3 id="unsupported_opcodes">Unsupported opcodes</h3>
|
||||||
|
|
||||||
<p>
|
<p>
|
||||||
|
|
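The four ARM shift codes introduced in this hunk (`<<`, `>>`, `->`, `@>`) correspond to familiar 32-bit operations. The following Go sketch evaluates them for a constant count; `ShiftOperand` is a hypothetical helper, and the restriction to counts 1 through 31 is an assumption made to sidestep the special encodings ARM gives to other counts.

```go
package main

import (
	"fmt"
	"math/bits"
)

// ShiftOperand evaluates a shifted-register operand such as R0->16,
// given the value held in the register, the shift code, and the count.
// This sketch accepts only counts from 1 to 31.
func ShiftOperand(v uint32, op string, n uint) (uint32, error) {
	if n < 1 || n > 31 {
		return 0, fmt.Errorf("shift count %d outside the range this sketch handles", n)
	}
	switch op {
	case "<<": // left shift
		return v << n, nil
	case ">>": // logical right shift: zero-fills from the top
		return v >> n, nil
	case "->": // arithmetic right shift: replicates the sign bit
		return uint32(int32(v) >> n), nil
	case "@>": // rotate right, expressed as a rotate left by -n
		return bits.RotateLeft32(v, -int(n)), nil
	}
	return 0, fmt.Errorf("unknown shift code %q", op)
}

func main() {
	for _, op := range []string{"<<", ">>", "->", "@>"} {
		r, err := ShiftOperand(0x80000001, op, 16)
		if err != nil {
			fmt.Println(err)
			continue
		}
		fmt.Printf("R0%s16 = %#08x\n", op, r)
	}
}
```

The difference between `->` and `>>` shows up only when the top bit of the register is set, which is why the usage example picks a value with bit 31 set.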
@@ -644,11 +760,17 @@ Here's how the 386 runtime defines the 64-bit atomic load function.
|
||||||
// uint64 atomicload64(uint64 volatile* addr);
|
// uint64 atomicload64(uint64 volatile* addr);
|
||||||
// so actually
|
// so actually
|
||||||
// void atomicload64(uint64 *res, uint64 volatile *addr);
|
// void atomicload64(uint64 *res, uint64 volatile *addr);
|
||||||
TEXT runtime·atomicload64(SB), NOSPLIT, $0-8
|
TEXT runtime·atomicload64(SB), NOSPLIT, $0-12
|
||||||
MOVL ptr+0(FP), AX
|
MOVL ptr+0(FP), AX
|
||||||
|
TESTL $7, AX
|
||||||
|
JZ 2(PC)
|
||||||
|
MOVL 0, AX // crash with nil ptr deref
|
||||||
LEAL ret_lo+4(FP), BX
|
LEAL ret_lo+4(FP), BX
|
||||||
BYTE $0x0f; BYTE $0x6f; BYTE $0x00 // MOVQ (%EAX), %MM0
|
// MOVQ (%EAX), %MM0
|
||||||
BYTE $0x0f; BYTE $0x7f; BYTE $0x03 // MOVQ %MM0, 0(%EBX)
|
BYTE $0x0f; BYTE $0x6f; BYTE $0x00
|
||||||
BYTE $0x0F; BYTE $0x77 // EMMS
|
// MOVQ %MM0, 0(%EBX)
|
||||||
|
BYTE $0x0f; BYTE $0x7f; BYTE $0x03
|
||||||
|
// EMMS
|
||||||
|
BYTE $0x0F; BYTE $0x77
|
||||||
RET
|
RET
|
||||||
</pre>
|
</pre>
|
||||||
|
|
|
||||||
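Two details of the last hunk can be checked with a little arithmetic: the argument size in the `TEXT` line changes from 8 to 12, and the new `TESTL $7, AX` / `JZ 2(PC)` pair skips the deliberate crash only when the pointer is 8-byte aligned. The Go sketch below restates both; the constant and function names are hypothetical, and the sizes assume the 386, where a pointer occupies 4 bytes.

```go
package main

import "fmt"

const (
	ptrSize386 = 4 // bytes for the pointer argument at ptr+0(FP) on 386
	resultSize = 8 // bytes for the uint64 result starting at ret_lo+4(FP)
)

// ArgSize returns the combined size of arguments and results,
// the number that follows the dash in a frame spec such as $0-12.
func ArgSize() int {
	return ptrSize386 + resultSize
}

// Aligned8 reports whether addr is a multiple of 8. It mirrors
// TESTL $7, AX: the low three bits must all be zero.
func Aligned8(addr uint32) bool {
	return addr&7 == 0
}

func main() {
	fmt.Printf("$0-%d\n", ArgSize())
	for _, addr := range []uint32{0x1000, 0x1004, 0x1008} {
		fmt.Printf("%#x aligned: %v\n", addr, Aligned8(addr))
	}
}
```

When `Aligned8` would return false, the assembly falls through to `MOVL 0, AX`, which faults by loading from address zero.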