mirror/go - go - Git Fam. Sieh

Commit Graph

Author	SHA1	Message	Date
Russ Cox	80a153dd51	cmd/6l, cmd/8l: fix MOVL MOVQ optab The entry for LEAL/LEAQ in these optabs was listed as having two data bytes in the y array. In fact they had and expect no data bytes. However, the general loop expects to be able to look at at least one data byte, to make sure it is not 0x0f. So give them each a single data byte set to 0 (not 0x0f). Since the MOV instructions have the largest optab cases, this requires growing the size of the data array. Clang found this bug because the general o->op[z] == 0x0f test was using z == 22, which was out of bounds. In practice the next byte in memory was probably not 0x0f so it wasn't truly broken. But might as well be clean. Update #5764 R=ken2 CC=golang-dev https://golang.org/cl/13241050	2013-09-10 14:53:41 -04:00
Russ Cox	567818224e	cmd/5l, cmd/6l, cmd/8l: accept PCDATA instruction in input The portable code in cmd/ld already knows how to process it, we just have to ignore it during code generation. R=ken2 CC=golang-dev https://golang.org/cl/11363043	2013-07-16 16:23:11 -04:00
Russ Cox	5d363c6357	cmd/ld, runtime: new in-memory symbol table format Design at http://golang.org/s/go12symtab. This enables some cleanup of the garbage collector metadata that will be done in future CLs. This CL does not move the old symtab and pclntab back into an unmapped section of the file. That's a bit tricky and will be done separately. Fixes #4020. R=golang-dev, dave, cshapiro, iant, r CC=golang-dev, nigeltao https://golang.org/cl/11085043	2013-07-16 09:41:38 -04:00
Russ Cox	aad4720b51	cmd/6l, cmd/8l: use one-byte XCHG forms when possible Pointed out by khr. R=ken2 CC=golang-dev https://golang.org/cl/11145044	2013-07-12 20:58:38 -04:00
Adam Langley	6bea504b94	cmd/6a, cmd/6l: add PCLMULQDQ instruction. This Intel instruction implements multiplication in binary fields. R=golang-dev, minux.ma, dave, rsc CC=golang-dev https://golang.org/cl/10428043	2013-06-21 15:17:13 -04:00
Russ Cox	26d43a0f22	cmd/6l: accept NOP of $x+10(SP) and of X0 Needed to link code compiled with 6c -N. R=ken2 CC=golang-dev https://golang.org/cl/10043044	2013-06-05 10:38:52 -04:00
Carl Shapiro	4e0a51c210	cmd/5l, cmd/6l, cmd/8l, cmd/gc, runtime: generate and use bitmaps of argument pointer locations With this change the compiler emits a bitmap for each function covering its stack frame arguments area. If an argument word is known to contain a pointer, a bit is set. The garbage collector reads this information when scanning the stack by frames and uses it to ignore locations known to not contain a pointer. R=golang-dev, bradfitz, daniel.morsing, dvyukov, khr, khr, iant, cshapiro CC=golang-dev https://golang.org/cl/9223046	2013-05-28 17:59:10 -07:00
Russ Cox	b505ff6279	crypto/rc4: faster amd64 implementation XOR key into data 128 bits at a time instead of 64 bits and pipeline half of state loads. Rotate loop to allow single-register indexing for state[i]. On a MacBookPro10,2 (Core i5): benchmark old ns/op new ns/op delta BenchmarkRC4_128 412 224 -45.63% BenchmarkRC4_1K 3179 1613 -49.26% BenchmarkRC4_8K 25223 12545 -50.26% benchmark old MB/s new MB/s speedup BenchmarkRC4_128 310.51 570.42 1.84x BenchmarkRC4_1K 322.09 634.48 1.97x BenchmarkRC4_8K 320.97 645.32 2.01x For comparison, on the same machine, openssl 0.9.8r reports its rc4 speed as somewhat under 350 MB/s for both 1K and 8K (it is operating 64 bits at a time). On an Intel Xeon E5520: benchmark old ns/op new ns/op delta BenchmarkRC4_128 418 259 -38.04% BenchmarkRC4_1K 3200 1884 -41.12% BenchmarkRC4_8K 25173 14529 -42.28% benchmark old MB/s new MB/s speedup BenchmarkRC4_128 306.04 492.48 1.61x BenchmarkRC4_1K 319.93 543.26 1.70x BenchmarkRC4_8K 321.61 557.20 1.73x For comparison, on the same machine, openssl 1.0.1 reports its rc4 speed as 587 MB/s for 1K and 601 MB/s for 8K. R=agl CC=golang-dev https://golang.org/cl/7865046	2013-03-21 16:38:57 -04:00
Keith Randall	297bb12809	cmd/6a, cmd/8a, cmd/6l, cmd/8l: add AES instructions Instructions for use in AES hashing. See CL#7543043 R=rsc CC=golang-dev https://golang.org/cl/7548043	2013-03-07 12:54:00 -08:00
Russ Cox	1d5dc4fd48	cmd/gc: emit explicit type information for local variables The type information is (and for years has been) included as an extra field in the address chunk of an instruction. Unfortunately, suppose there is a string at a+24(FP) and we have an instruction reading its length. It will say: MOVQ x+32(FP), AX and the type of that argument is int (not slice), because it is the length being read. This confuses the picture seen by debuggers and now, worse, by the garbage collector. Instead of attaching the type information to all uses, emit an explicit list of TYPE instructions with the information. The TYPE instructions are no-ops whose only role is to provide an address to attach type information to. For example, this function: func f(x, y, z int) (a, b string) { return } now compiles into: --- prog list "f" --- 0000 (/Users/rsc/x.go:3) TEXT f+0(SB),$0-56 0001 (/Users/rsc/x.go:3) LOCALS , 0002 (/Users/rsc/x.go:3) TYPE x+0(FP){int},$8 0003 (/Users/rsc/x.go:3) TYPE y+8(FP){int},$8 0004 (/Users/rsc/x.go:3) TYPE z+16(FP){int},$8 0005 (/Users/rsc/x.go:3) TYPE a+24(FP){string},$16 0006 (/Users/rsc/x.go:3) TYPE b+40(FP){string},$16 0007 (/Users/rsc/x.go:3) MOVQ $0,b+40(FP) 0008 (/Users/rsc/x.go:3) MOVQ $0,b+48(FP) 0009 (/Users/rsc/x.go:3) MOVQ $0,a+24(FP) 0010 (/Users/rsc/x.go:3) MOVQ $0,a+32(FP) 0011 (/Users/rsc/x.go:4) RET , The { } show the formerly hidden type information. The { } syntax is used when printing from within the gc compiler. It is not accepted by the assemblers. The same type information is now included on global variables: 0055 (/Users/rsc/x.go:15) GLOBL slice+0(SB){[]string},$24(AL*0) This more accurate type information fixes a bug in the garbage collector's precise heap collection. The linker only cares about globals right now, but having the local information should make things a little nicer for Carl in the future. Fixes #4907. R=ken2 CC=golang-dev https://golang.org/cl/7395056	2013-02-25 12:13:47 -05:00
Russ Cox	d57fcbf05c	cmd/5l, cmd/6l, cmd/8l: accept CALL reg, reg The new src argument is ignored during linking (that is, CALL r1, r2 is identical to CALL r2 for linking), but it serves as a hint to the 5g/6g/8g optimizer that the src register is live on entry to the called function and must be preserved. It is possible to avoid exposing this fact to the rest of the toolchain, keeping it entirely within 5g/6g/8g, but I think it will help to be able to look in object files and assembly listings and linker -a / -W output to see which CALL instructions are "Go func value" calls and which are "C function pointer" calls. R=ken2 CC=golang-dev https://golang.org/cl/7364045	2013-02-22 14:23:21 -05:00
Carl Shapiro	f466617a62	cmd/5g, cmd/5l, cmd/6l, cmd/8l, cmd/gc, cmd/ld, runtime: accurate args and locals information Previously, the func structure contained an inaccurate value for the args member and a 0 value for the locals member. This change populates the func structure with args and locals values computed by the compiler. The number of args was already available in the ATEXT instruction. The number of locals is now passed through in the new ALOCALS instruction. This change also switches the unit of args and locals to be bytes, just like the frame member, instead of 32-bit words. R=golang-dev, bradfitz, cshapiro, dave, rsc CC=golang-dev https://golang.org/cl/7399045	2013-02-21 12:52:26 -08:00
Russ Cox	3d40062c68	cmd/gc, cmd/ld: struct field tracking This is an experiment in static analysis of Go programs to understand which struct fields a program might use. It is not part of the Go language specification, it must be enabled explicitly when building the toolchain, and it may be removed at any time. After building the toolchain with GOEXPERIMENT=fieldtrack, a specific field can be marked for tracking by including `go:"track"` in the field tag: package pkg type T struct { F int `go:"track"` G int // untracked } To simplify usage, only named struct types can have tracked fields, and only exported fields can be tracked. The implementation works by making each function begin with a sequence of no-op USEFIELD instructions declaring which tracked fields are accessed by a specific function. After the linker's dead code elimination removes unused functions, the fields referred to by the remaining USEFIELD instructions are the ones reported as used by the binary. The -k option to the linker specifies the fully qualified symbol name (such as my/pkg.list) of a string variable that should be initialized with the field tracking information for the program. The field tracking string is a sequence of lines, each terminated by a \n and describing a single tracked field referred to by the program. Each line is made up of one or more tab-separated fields. The first field is the name of the tracked field, fully qualified, as in "my/pkg.T.F". Subsequent fields give a shortest path of reverse references from that field to a global variable or function, corresponding to one way in which the program might reach that field. A common source of false positives in field tracking is types with large method sets, because a reference to the type descriptor carries with it references to all methods. 
To address this problem, the CL also introduces a comment annotation //go:nointerface that marks an upcoming method declaration as unavailable for use in satisfying interfaces, both statically and dynamically. Such a method is also invisible to package reflect. Again, all of this is disabled by default. It only turns on if you have GOEXPERIMENT=fieldtrack set during make.bash. R=iant, ken CC=golang-dev https://golang.org/cl/6749064	2012-11-02 00:17:21 -04:00
Shenghou Ma	e039c405c8	cmd/6a, cmd/6l: add support for AES-NI instructions and PSHUFD This CL adds support for these 7 new instructions to 6a/6l in preparation for the upcoming CL for AES-NI accelerated crypto/aes: AESENC, AESENCLAST, AESDEC, AESDECLAST, AESIMC, AESKEYGENASSIST, and PSHUFD. R=golang-dev, rsc CC=golang-dev https://golang.org/cl/5970055	2012-09-27 01:53:08 +08:00
Adam Langley	2c5b53866c	undo CL 6498092 / 4ff71bc1a199 Broke tests on 386. ««« original CL description 6l/8l: emit correct opcodes to F(SUB\|DIV)R?D. When the destination was not F0, 6l and 8l swapped FSUBD/FSUBRD and FDIVD/FDIVRD. R=golang-dev, dave, rsc CC=golang-dev https://golang.org/cl/6498092 »»» R=golang-dev CC=golang-dev https://golang.org/cl/6492100	2012-09-10 15:52:36 -04:00
Adam Langley	72fa142fc5	6l/8l: emit correct opcodes to F(SUB\|DIV)R?D. When the destination was not F0, 6l and 8l swapped FSUBD/FSUBRD and FDIVD/FDIVRD. R=golang-dev, dave, rsc CC=golang-dev https://golang.org/cl/6498092	2012-09-10 15:35:39 -04:00
Russ Cox	f2bd3a977d	cmd/6l, cmd/8l, cmd/5l: add AUNDEF instruction On 6l and 8l, this is a real instruction, guaranteed to cause an 'undefined instruction' exception. On 5l, we simulate it as BL to address 0. The plan is to use it as a signal to the linker that this point in the instruction stream cannot be reached (hence the changes to nofollow). This will help the compiler explain that panicindex and friends do not return without having to put a list of these functions in the linker. R=ken2 CC=golang-dev https://golang.org/cl/6255064	2012-05-30 16:47:56 -04:00
Russ Cox	fefae6eed1	cmd/6g, cmd/8g: move panicindex calls out of line The old code generated for a bounds check was CMP JLT ok CALL panicindex ok: ... The new code is (once the linker finishes with it): CMP JGE panic ... panic: CALL panicindex which moves the calls out of line, putting more useful code in each cache line. This matters especially in tight loops, such as in Fannkuch. The benefit is more modest elsewhere, but real. From test/bench/go1, amd64: benchmark old ns/op new ns/op delta BenchmarkBinaryTree17 6096092000 6088808000 -0.12% BenchmarkFannkuch11 6151404000 4020463000 -34.64% BenchmarkGobDecode 28990050 28894630 -0.33% BenchmarkGobEncode 12406310 12136730 -2.17% BenchmarkGzip 179923 179903 -0.01% BenchmarkGunzip 11219 11130 -0.79% BenchmarkJSONEncode 86429350 86515900 +0.10% BenchmarkJSONDecode 334593800 315728400 -5.64% BenchmarkRevcomp25M 1219763000 1180767000 -3.20% BenchmarkTemplate 492947600 483646800 -1.89% And 386: benchmark old ns/op new ns/op delta BenchmarkBinaryTree17 6354902000 6243000000 -1.76% BenchmarkFannkuch11 8043769000 7326965000 -8.91% BenchmarkGobDecode 19010800 18941230 -0.37% BenchmarkGobEncode 14077500 13792460 -2.02% BenchmarkGzip 194087 193619 -0.24% BenchmarkGunzip 12495 12457 -0.30% BenchmarkJSONEncode 125636400 125451400 -0.15% BenchmarkJSONDecode 696648600 685032800 -1.67% BenchmarkRevcomp25M 2058088000 2052545000 -0.27% BenchmarkTemplate 602140000 589876800 -2.04% To implement this, two new instruction forms: JLT target // same as always JLT $0, target // branch expected not taken JLT $1, target // branch expected taken The linker could also emit the prediction prefixes, but it does not: expected taken branches are reversed so that the expected case is not taken (as in example above), and the default expectaton for such a jump is not taken already. R=golang-dev, gri, r, dave CC=golang-dev https://golang.org/cl/6248049	2012-05-29 12:09:27 -04:00
Russ Cox	ed480128a6	cmd/6a, cmd/6l: add BSWAPL, BSWAPQ R=ken2 CC=golang-dev https://golang.org/cl/6209087	2012-05-22 00:12:58 -04:00
Russ Cox	e530d6a1e0	6c, 6g, 6l: add MOVQL to make truncation explicit Without an explicit signal for a truncation, copy propagation will sometimes propagate a 32-bit truncation and end up overwriting uses of the original 64-bit value. The case that arose in practice is in C but I believe that the same could plausibly happen in Go. The main reason we didn't run into the same in Go is that I (perhaps incorrectly?) drop MOVL AX, AX during gins, so the truncation was never generated, so it didn't confuse the optimizer. Fixes #1315. Fixes #3488. R=ken2 CC=golang-dev https://golang.org/cl/6002043	2012-04-10 12:51:59 -04:00
Russ Cox	35d260fa4c	6a, 6l: add PREFETCH instructions R=ken2 CC=golang-dev https://golang.org/cl/5989073	2012-04-10 10:09:09 -04:00
Adam Langley	36d3707009	6a/6l: add IMUL3Q and SHLDL Although Intel considers the three-argument form of IMUL to be a variant of IMUL, I couldn't make 6l able to differentiate it without huge changes, so I called it IMUL3. R=rsc CC=golang-dev https://golang.org/cl/5686055	2012-02-23 10:51:04 -05:00
Michał Derkacz	17105870ff	6l: add MOVQ xmm_reg, xmm_reg Added handler for: MOVQ xmm_reg, xmm_reg/mem64 MOVQ xmm_reg/mem64, xmm_reg using native MOVQ (it takes precedence over REX.W MOVD) I don't understand the 6l code well enough to be sure that my small changes didn't break it. But now 6l works with MOVQ xmm_reg, xmm_reg and all.bash reports "0 unexpected bugs". Here is the test assembly source: MOVQ X0, X1 MOVQ AX, X1 MOVQ X1, AX MOVQ xxx+8(FP), X2 MOVQ X2, xxx+8(FP) and generated code (gdb disassemble /r): 0x000000000040f112 <+0>: f3 0f 7e c8 movq %xmm0,%xmm1 0x000000000040f116 <+4>: 66 48 0f 6e c8 movq %rax,%xmm1 0x000000000040f11b <+9>: 66 48 0f 7e c8 movq %xmm1,%rax 0x000000000040f120 <+14>: f3 0f 7e 54 24 10 movq 0x10(%rsp),%xmm2 0x000000000040f126 <+20>: 66 0f d6 54 24 10 movq %xmm2,0x10(%rsp) Fixes #2418. R=golang-dev, rsc CC=golang-dev https://golang.org/cl/5316076	2011-11-09 16:01:17 -05:00
Michał Derkacz	c8a2be8c38	6l: Fixes opcode for PSLLQ imm8, xmm_reg R=golang-dev, rsc CC=golang-dev https://golang.org/cl/5340056	2011-11-09 16:00:24 -05:00
Jaroslavas Počepko	a88994f804	6l, 8l: remove JCXZ; add JCXZW, JCXZL, and JCXZQ R=golang-dev CC=golang-dev, rsc https://golang.org/cl/4950050	2011-08-26 17:45:19 -04:00
Dmitriy Vyukov	4e5086b993	runtime: improve Linux mutex The implementation is hybrid active/passive spin/blocking mutex. The design minimizes amount of context switches and futex calls. The idea is that all critical sections in runtime are intentially small, so pure blocking mutex behaves badly causing a lot of context switches, thread parking/unparking and kernel calls. Note that some synthetic benchmarks become somewhat slower, that's due to increased contention on other data structures, it should not affect programs that do any real work. On 2 x Intel E5620, 8 HT cores, 2.4GHz benchmark old ns/op new ns/op delta BenchmarkSelectContended 521.00 503.00 -3.45% BenchmarkSelectContended-2 661.00 320.00 -51.59% BenchmarkSelectContended-4 1139.00 629.00 -44.78% BenchmarkSelectContended-8 2870.00 878.00 -69.41% BenchmarkSelectContended-16 5276.00 818.00 -84.50% BenchmarkChanContended 112.00 103.00 -8.04% BenchmarkChanContended-2 631.00 174.00 -72.42% BenchmarkChanContended-4 682.00 272.00 -60.12% BenchmarkChanContended-8 1601.00 520.00 -67.52% BenchmarkChanContended-16 3100.00 372.00 -88.00% BenchmarkChanSync 253.00 239.00 -5.53% BenchmarkChanSync-2 5030.00 4648.00 -7.59% BenchmarkChanSync-4 4826.00 4694.00 -2.74% BenchmarkChanSync-8 4778.00 4713.00 -1.36% BenchmarkChanSync-16 5289.00 4710.00 -10.95% BenchmarkChanProdCons0 273.00 254.00 -6.96% BenchmarkChanProdCons0-2 599.00 400.00 -33.22% BenchmarkChanProdCons0-4 1168.00 659.00 -43.58% BenchmarkChanProdCons0-8 2831.00 1057.00 -62.66% BenchmarkChanProdCons0-16 4197.00 1037.00 -75.29% BenchmarkChanProdCons10 150.00 140.00 -6.67% BenchmarkChanProdCons10-2 607.00 268.00 -55.85% BenchmarkChanProdCons10-4 1137.00 404.00 -64.47% BenchmarkChanProdCons10-8 2115.00 828.00 -60.85% BenchmarkChanProdCons10-16 4283.00 855.00 -80.04% BenchmarkChanProdCons100 117.00 110.00 -5.98% BenchmarkChanProdCons100-2 558.00 218.00 -60.93% BenchmarkChanProdCons100-4 722.00 287.00 -60.25% BenchmarkChanProdCons100-8 1840.00 431.00 -76.58% 
BenchmarkChanProdCons100-16 3394.00 448.00 -86.80% BenchmarkChanProdConsWork0 2014.00 1996.00 -0.89% BenchmarkChanProdConsWork0-2 1207.00 1127.00 -6.63% BenchmarkChanProdConsWork0-4 1913.00 611.00 -68.06% BenchmarkChanProdConsWork0-8 3016.00 949.00 -68.53% BenchmarkChanProdConsWork0-16 4320.00 1154.00 -73.29% BenchmarkChanProdConsWork10 1906.00 1897.00 -0.47% BenchmarkChanProdConsWork10-2 1123.00 1033.00 -8.01% BenchmarkChanProdConsWork10-4 1076.00 571.00 -46.93% BenchmarkChanProdConsWork10-8 2748.00 1096.00 -60.12% BenchmarkChanProdConsWork10-16 4600.00 1105.00 -75.98% BenchmarkChanProdConsWork100 1884.00 1852.00 -1.70% BenchmarkChanProdConsWork100-2 1235.00 1146.00 -7.21% BenchmarkChanProdConsWork100-4 1217.00 619.00 -49.14% BenchmarkChanProdConsWork100-8 1534.00 509.00 -66.82% BenchmarkChanProdConsWork100-16 4126.00 918.00 -77.75% BenchmarkSyscall 34.40 33.30 -3.20% BenchmarkSyscall-2 160.00 121.00 -24.38% BenchmarkSyscall-4 131.00 136.00 +3.82% BenchmarkSyscall-8 139.00 131.00 -5.76% BenchmarkSyscall-16 161.00 168.00 +4.35% BenchmarkSyscallWork 950.00 950.00 +0.00% BenchmarkSyscallWork-2 481.00 480.00 -0.21% BenchmarkSyscallWork-4 268.00 270.00 +0.75% BenchmarkSyscallWork-8 156.00 169.00 +8.33% BenchmarkSyscallWork-16 188.00 184.00 -2.13% BenchmarkSemaSyntNonblock 36.40 35.60 -2.20% BenchmarkSemaSyntNonblock-2 81.40 45.10 -44.59% BenchmarkSemaSyntNonblock-4 126.00 108.00 -14.29% BenchmarkSemaSyntNonblock-8 112.00 112.00 +0.00% BenchmarkSemaSyntNonblock-16 110.00 112.00 +1.82% BenchmarkSemaSyntBlock 35.30 35.30 +0.00% BenchmarkSemaSyntBlock-2 118.00 124.00 +5.08% BenchmarkSemaSyntBlock-4 105.00 108.00 +2.86% BenchmarkSemaSyntBlock-8 101.00 111.00 +9.90% BenchmarkSemaSyntBlock-16 112.00 118.00 +5.36% BenchmarkSemaWorkNonblock 810.00 811.00 +0.12% BenchmarkSemaWorkNonblock-2 476.00 414.00 -13.03% BenchmarkSemaWorkNonblock-4 238.00 228.00 -4.20% BenchmarkSemaWorkNonblock-8 140.00 126.00 -10.00% BenchmarkSemaWorkNonblock-16 117.00 116.00 -0.85% 
BenchmarkSemaWorkBlock 810.00 811.00 +0.12% BenchmarkSemaWorkBlock-2 454.00 466.00 +2.64% BenchmarkSemaWorkBlock-4 243.00 241.00 -0.82% BenchmarkSemaWorkBlock-8 145.00 137.00 -5.52% BenchmarkSemaWorkBlock-16 132.00 123.00 -6.82% BenchmarkContendedSemaphore 123.00 102.00 -17.07% BenchmarkContendedSemaphore-2 34.80 34.90 +0.29% BenchmarkContendedSemaphore-4 34.70 34.80 +0.29% BenchmarkContendedSemaphore-8 34.70 34.70 +0.00% BenchmarkContendedSemaphore-16 34.80 34.70 -0.29% BenchmarkMutex 26.80 26.00 -2.99% BenchmarkMutex-2 108.00 45.20 -58.15% BenchmarkMutex-4 103.00 127.00 +23.30% BenchmarkMutex-8 109.00 147.00 +34.86% BenchmarkMutex-16 102.00 152.00 +49.02% BenchmarkMutexSlack 27.00 26.90 -0.37% BenchmarkMutexSlack-2 149.00 165.00 +10.74% BenchmarkMutexSlack-4 121.00 209.00 +72.73% BenchmarkMutexSlack-8 101.00 158.00 +56.44% BenchmarkMutexSlack-16 97.00 129.00 +32.99% BenchmarkMutexWork 792.00 794.00 +0.25% BenchmarkMutexWork-2 407.00 409.00 +0.49% BenchmarkMutexWork-4 220.00 209.00 -5.00% BenchmarkMutexWork-8 267.00 160.00 -40.07% BenchmarkMutexWork-16 315.00 300.00 -4.76% BenchmarkMutexWorkSlack 792.00 793.00 +0.13% BenchmarkMutexWorkSlack-2 406.00 404.00 -0.49% BenchmarkMutexWorkSlack-4 225.00 212.00 -5.78% BenchmarkMutexWorkSlack-8 268.00 136.00 -49.25% BenchmarkMutexWorkSlack-16 300.00 300.00 +0.00% BenchmarkRWMutexWrite100 27.10 27.00 -0.37% BenchmarkRWMutexWrite100-2 33.10 40.80 +23.26% BenchmarkRWMutexWrite100-4 113.00 88.10 -22.04% BenchmarkRWMutexWrite100-8 119.00 95.30 -19.92% BenchmarkRWMutexWrite100-16 148.00 109.00 -26.35% BenchmarkRWMutexWrite10 29.60 29.40 -0.68% BenchmarkRWMutexWrite10-2 111.00 61.40 -44.68% BenchmarkRWMutexWrite10-4 270.00 208.00 -22.96% BenchmarkRWMutexWrite10-8 204.00 185.00 -9.31% BenchmarkRWMutexWrite10-16 261.00 190.00 -27.20% BenchmarkRWMutexWorkWrite100 1040.00 1036.00 -0.38% BenchmarkRWMutexWorkWrite100-2 593.00 580.00 -2.19% BenchmarkRWMutexWorkWrite100-4 470.00 365.00 -22.34% BenchmarkRWMutexWorkWrite100-8 468.00 289.00 
-38.25% BenchmarkRWMutexWorkWrite100-16 604.00 374.00 -38.08% BenchmarkRWMutexWorkWrite10 951.00 951.00 +0.00% BenchmarkRWMutexWorkWrite10-2 1001.00 928.00 -7.29% BenchmarkRWMutexWorkWrite10-4 1555.00 1006.00 -35.31% BenchmarkRWMutexWorkWrite10-8 2085.00 1171.00 -43.84% BenchmarkRWMutexWorkWrite10-16 2082.00 1614.00 -22.48% R=rsc, iant, msolo, fw, iant CC=golang-dev https://golang.org/cl/4711045	2011-07-29 12:44:06 -04:00
Adam Langley	9f4c288c16	hash/crc32: add SSE4.2 support Using the CRC32 instruction speeds up the Castagnoli computation by about 20x on a modern Intel CPU. R=rsc CC=golang-dev https://golang.org/cl/4650072	2011-07-12 09:29:24 -04:00
Evan Shaw	4d429c7fe5	6l: More SSE instruction fixes PSADBW and PSHUFL had the wrong prefixes. R=rsc CC=golang-dev https://golang.org/cl/2836041	2010-11-05 13:59:53 -04:00
Evan Shaw	884dceca1f	6a/6l: fix MOVOU encoding The andproto field was set incorrectly, causing 6a to encode illegal instructions. R=rsc CC=golang-dev https://golang.org/cl/2781042	2010-11-01 16:14:43 -04:00
Russ Cox	285312a05c	6l: drop confusing comment R=ken2 CC=golang-dev https://golang.org/cl/1693047	2010-07-01 12:51:00 -07:00
Russ Cox	bf10982739	6l: implement MOVLQZX as "mov", not "movsxd" (Here, quoted strings are the official AMD names.) The amd64 "movsxd" instruction, when invoked with a 64-bit REX prefix, moves and sign extends a 32-bit value from register or memory into a 64-bit register. 6.out.h spells this MOVLQSX. 6.out.h also includes MOVLQZX, the zero extending version, which it implements as "movsxd" without the REX prefix. Without the REX prefix it's only sign extending 32 bits to 32 bits (i.e., not doing anything to the bits) and then storing in a 32-bit register. Any write to a 32-bit register zeros the top half of the corresponding 64-bit register, giving the advertised effect. This particular implementation of the functionality is non-standard, because an ordinary 32-bit "mov" would do the same thing. Because it is non-standard, it is often mishandled or not handled by binary translation tools like valgrind. Switching to the standard "mov" makes the binaries work better with those tools. It's probably useful in 6c and 6g to have an explicit instruction, though, so that the intent of the size change is clear. Thus we leave the concept of MOVLQZX and just implement it by the standard "mov" instead of the non-standard 32-bit "movsxd". Fixes #896. R=ken2 CC=golang-dev https://golang.org/cl/1733046	2010-07-01 12:18:35 -07:00
Ken Thompson	3f982aeaf6	morestack magic number automatically generated in 6g and 6c, manually set in 6a. format is TEXT a(SB),, $a-b where a is auto size and b is parameter size SVN=126946	2008-07-12 17:16:22 -07:00
Ken Thompson	ddba96aed8	stack offset SVN=123521	2008-06-18 22:07:09 -07:00
Ken Thompson	f997bc6eb6	stack offset table marker tacked above each TEXT entry SVN=123496	2008-06-18 17:51:56 -07:00
Rob Pike	0cafb9ea3d	Add compiler source to new directory structure SVN=121164	2008-06-04 14:37:38 -07:00
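
The distinction drawn in commit bf10982739 between MOVLQZX (zero extending) and MOVLQSX (sign extending) can be illustrated in plain Go. This is only a sketch of the arithmetic effect of the two instructions, not of how 6l encodes them; the function names zeroExtend and signExtend are made up for this illustration.

```go
package main

import "fmt"

// zeroExtend models the advertised effect of MOVLQZX: the low 32 bits are
// kept and the top half of the 64-bit result is cleared.
// (Hypothetical helper name, for illustration only.)
func zeroExtend(x uint32) uint64 {
	return uint64(x)
}

// signExtend models the advertised effect of MOVLQSX: bit 31 of the input
// is copied into the upper 32 bits of the result.
// (Hypothetical helper name, for illustration only.)
func signExtend(x uint32) int64 {
	return int64(int32(x))
}

func main() {
	x := uint32(0x80000000) // bit 31 set, so the two extensions differ

	fmt.Printf("zero extended: %#x\n", zeroExtend(x))
	fmt.Printf("sign extended: %#x\n", uint64(signExtend(x)))
}
```

For an input with bit 31 clear, both helpers return the same value, which is why the difference only shows up for values such as 0x80000000.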

35 Commits
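
The field tracking string described in commit 3d40062c68 (lines terminated by \n, each made up of tab-separated fields, the first being the fully qualified tracked field and the rest a path of reverse references) could be read back with a few lines of Go. This is a minimal sketch under that description only: the type TrackedField, the function ParseFieldTracking, and the sample data are assumptions made for illustration and are not part of the toolchain.

```go
package main

import (
	"fmt"
	"strings"
)

// TrackedField is a hypothetical record for one line of the tracking string.
type TrackedField struct {
	Name string   // fully qualified tracked field, e.g. "my/pkg.T.F"
	Path []string // reverse references leading to a global variable or function
}

// ParseFieldTracking splits the tracking string into one record per line.
// (Hypothetical helper, written against the format in the commit message.)
func ParseFieldTracking(s string) []TrackedField {
	var out []TrackedField
	for _, line := range strings.Split(s, "\n") {
		if line == "" {
			// Every line is \n-terminated, so the last split piece is empty.
			continue
		}
		fields := strings.Split(line, "\t")
		out = append(out, TrackedField{Name: fields[0], Path: fields[1:]})
	}
	return out
}

func main() {
	// Made-up sample in the described format; real output comes from the
	// string variable named with the linker's -k option.
	sample := "my/pkg.T.F\tmy/pkg.(*T).Get\tmain.main\nmy/pkg.U.G\n"

	for _, f := range ParseFieldTracking(sample) {
		fmt.Printf("%s reached via %v\n", f.Name, f.Path)
	}
}
```

A line with only one field yields an empty Path, which matches the description that each line has "one or more" tab-separated fields.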