- updated doc

SVN=125468
This commit is contained in:
Robert Griesemer 2008-07-01 08:48:24 -07:00
parent 0b6e6afb76
commit 8af8dff65b
1 changed files with 166 additions and 100 deletions

View File

@ -4,13 +4,14 @@ The Go Programming Language (DRAFT)
Robert Griesemer, Rob Pike, Ken Thompson Robert Griesemer, Rob Pike, Ken Thompson
---- ----
(June 12, 2008) (July 1, 2008)
This document is a semi-informal specification/proposal for a new This document is a semi-formal specification/proposal for a new
systems programming language. The document is under active systems programming language. The document is under active
development; any piece may change substantially as design progresses; development; any piece may change substantially as design progresses;
also there remain a number of unresolved issues. also there remain a number of unresolved issues.
Guiding principles Guiding principles
---- ----
@ -34,42 +35,50 @@ The language should be strong enough that the compiler and run time can be
written in itself. written in itself.
Modularity, identifiers and scopes
----
A Go program consists of one or more `packages' compiled separately, though
not independently. A single package may make
individual identifiers visible to other files by marking them as
exported; there is no ``header file''.
A package collects types, constants, functions, and so on into a named
entity that may be exported to enable its constituents be used in
another compilation unit.
Because there are no header files, all identifiers in a package are either
declared explicitly within the package or arise from an import statement.
Scoping is essentially the same as in C.
Program structure Program structure
---- ----
A compilation unit (usually a single source file) A Go program consists of a number of ``packages''.
consists of a package specifier followed by import
declarations followed by other declarations. There are no statements
at the top level of a file.
A program consists of a number of packages. By convention, one A package is built from one or more source files, each of which consists
package, by default called main, is the starting point for execution. of a package specifier followed by import declarations followed by other
It contains a function, also called main, that is the first function invoked declarations. There are no statements at the top level of a file.
by the run time system.
By convention, one package, by default called main, is the starting point for
execution. It contains a function, also called main, that is the first function
invoked by the run time system.
If any package within the program If any package within the program
contains a function init(), that function will be executed contains a function init(), that function will be executed
before main.main() is called. The details of initialization are before main.main() is called. The details of initialization are
still under development. still under development.
Source files can be compiled separately (without the source
code of packages they depend on), but not independently (the compiler does
check dependencies by consulting the symbol information in compiled packages).
Modularity, identifiers and scopes
----
A package is a collection of import, constant, type, variable, and function
declarations. Each declaration associates an ``identifier'' with a program
entity (such as a type).
In particular, all identifiers in a package are either
declared explicitly within the package, arise from an import statement,
or belong to a small set of predefined identifiers (such as "int32").
A package may make explicitly declared identifiers visible to other
packages by marking them as exported; there is no ``header file''.
Imported identifiers cannot be re-exported.
Scoping is essentially the same as in C: The scope of an identifier declared
within a ``block'' extends from the declaration of the identifier (that is, the
position immediately after the identifier) to the end of the block. An identifier
shadows identifiers with the same name declared in outer scopes. Within a
block, a particular identifier must be declared at most once.
Typing, polymorphism, and object-orientation Typing, polymorphism, and object-orientation
---- ----
@ -78,6 +87,22 @@ Go programs are strongly typed. Certain values can also be
polymorphic. The language provides mechanisms to make use of such polymorphic. The language provides mechanisms to make use of such
polymorphic values type-safe. polymorphic values type-safe.
Interface types provide the mechanisms to support object-oriented
programming. Different interface types are independent of each
other and no explicit hierarchy is required (such as single or
multiple inheritance explicitly specified through respective type
declarations). Interface types only define a set of methods that a
corresponding implementation must provide. Thus interface and
implementation are strictly separated.
An interface is implemented by associating methods with types.
If a type defines all methods of an interface, it
implements that interface and thus can be used where that interface is
required. Unless used through a variable of interface type, methods
can always be statically bound (they are not ``virtual''), and incur no
runtime overhead compared to an ordinary function.
[OLD
Interface types, building on structures with methods, provide Interface types, building on structures with methods, provide
the mechanisms to support object-oriented programming. the mechanisms to support object-oriented programming.
Different interface types are independent of each Different interface types are independent of each
@ -93,6 +118,7 @@ implements that interface and thus can be used where that interface is
required. Unless used through a variable of interface type, methods required. Unless used through a variable of interface type, methods
can always be statically bound (they are not ``virtual''), and incur no can always be statically bound (they are not ``virtual''), and incur no
runtime overhead compared to an ordinary function. runtime overhead compared to an ordinary function.
END]
Go has no explicit notion of classes, sub-classes, or inheritance. Go has no explicit notion of classes, sub-classes, or inheritance.
These concepts are trivially modeled in Go through the use of These concepts are trivially modeled in Go through the use of
@ -249,7 +275,11 @@ In the grammar we use the notation
utf8_char utf8_char
to refer to an arbitrary Unicode code point encoded in UTF-8. to refer to an arbitrary Unicode code point encoded in UTF-8. We use
non_ascii
to refer to the subset of "utf8_char" code points with values >= 128.
Digits and Letters Digits and Letters
@ -259,10 +289,9 @@ Digits and Letters
dec_digit = { "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9" } . dec_digit = { "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9" } .
hex_digit = { "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9" | "a" | hex_digit = { "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9" | "a" |
"A" | "b" | "B" | "c" | "C" | "d" | "D" | "e" | "E" | "f" | "F" } . "A" | "b" | "B" | "c" | "C" | "d" | "D" | "e" | "E" | "f" | "F" } .
letter = "A" | "a" | ... "Z" | "z" | "_" . letter = "A" | "a" | ... "Z" | "z" | "_" | non_ascii .
For simplicity, letters and digits are ASCII. We may in time allow All non-ASCII code points are considered letters; digits are always ASCII.
Unicode identifiers.
Identifiers Identifiers
@ -278,6 +307,21 @@ type, a function, etc. An identifier must not be a reserved word.
ThisIsVariable9 ThisIsVariable9
Reserved words
----
break fallthrough import return
case false interface select
const for map struct
continue func new switch
default go nil true
else goto package type
export if range var
TODO: "len" is currently also a reserved word - it shouldn't be.
Types Types
---- ----
@ -342,9 +386,10 @@ corresponding boolean constant values.
Strings are described in a later section. Strings are described in a later section.
[OLD
The polymorphic ``any'' type can represent a value of any type. The polymorphic ``any'' type can represent a value of any type.
TODO: we need a section about any TODO: we need a section about any
END]
Numeric literals Numeric literals
@ -353,7 +398,8 @@ Numeric literals
Integer literals take the usual C form, except for the absence of the Integer literals take the usual C form, except for the absence of the
'U', 'L', etc. suffixes, and represent integer constants. Character 'U', 'L', etc. suffixes, and represent integer constants. Character
literals are also integer constants. Similarly, floating point literals are also integer constants. Similarly, floating point
literals are also C-like, without suffixes and decimal only. literals are also C-like, without suffixes and in decimal representation
only.
An integer constant represents an abstract integer value of arbitrary An integer constant represents an abstract integer value of arbitrary
precision. Only when an integer constant (or arithmetic expression precision. Only when an integer constant (or arithmetic expression
@ -532,15 +578,13 @@ More about types
---- ----
The static type of a variable is the type defined by the variable's The static type of a variable is the type defined by the variable's
declaration. At run-time, some variables, in particular those of declaration. The dynamic type of a variable is the actual type of the
interface types, can assume a dynamic type, which may be value stored in a variable at runtime. Except for variables of interface
different at different times during execution. The dynamic type type, the static and dynamic type of variables is always the same.
of a variable is always compatible with the static type of the
variable.
At any given time, a variable or value has exactly one dynamic Variables of interface type may hold values of different types during
type, which may be the same as the static type. (They will execution. However, the dynamic type of the variable is always compatible
differ only if the variable has an interface type or "any" type.) with the static type of the variable.
Types may be composed from other types by assembling arrays, maps, Types may be composed from other types by assembling arrays, maps,
channels, structures, and functions. They are called composite types. channels, structures, and functions. They are called composite types.
@ -591,7 +635,9 @@ called (key, value) pairs. For a given map,
the keys and values must each be of a specific type. the keys and values must each be of a specific type.
Upon creation, a map is empty and values may be added and removed Upon creation, a map is empty and values may be added and removed
during execution. The number of entries in a map is called its length. during execution. The number of entries in a map is called its length.
[OLD
A map whose value type is 'any' can store values of all types. A map whose value type is 'any' can store values of all types.
END]
MapType = "map" "[" KeyType "]" ValueType . MapType = "map" "[" KeyType "]" ValueType .
KeyType = Type . KeyType = Type .
@ -601,6 +647,8 @@ A map whose value type is 'any' can store values of all types.
map [struct { pid int; name string }] *chan Buffer map [struct { pid int; name string }] *chan Buffer
map [string] any map [string] any
Implementation restriction: Currently, only pointers to maps are supported.
Struct types Struct types
---- ----
@ -633,6 +681,8 @@ Literals for compound data structures consist of the type of the constant
followed by a parenthesized expression list. In effect, they are a followed by a parenthesized expression list. In effect, they are a
conversion from expression list to compound value. conversion from expression list to compound value.
TODO: Needs to be updated.
Pointer types Pointer types
---- ----
@ -693,7 +743,10 @@ Function types
A function type denotes the set of all functions with the same signature. A function type denotes the set of all functions with the same signature.
A method is a function with a receiver, which is of type pointer to struct. A method is a function with a receiver declaration.
[OLD
, which is of type pointer to struct.
END]
Functions can return multiple values simultaneously. Functions can return multiple values simultaneously.
@ -722,6 +775,9 @@ In particular, v := func() {} creates a variable of type *func(). To call the
function referenced by v, one writes v(). It is illegal to dereference a function referenced by v, one writes v(). It is illegal to dereference a
function pointer. function pointer.
TODO: For consistency, we should require the use of & to get the pointer to
a function: &func() {}.
Function Literals Function Literals
---- ----
@ -731,10 +787,6 @@ Function literals represent anonymous functions.
FunctionLit = FunctionType Block . FunctionLit = FunctionType Block .
Block = "{" [ StatementList [ ";" ] ] "}" . Block = "{" [ StatementList [ ";" ] ] "}" .
The scope of an identifier declared within a block extends
from the declaration of the identifier (that is, the position
immediately after the identifier) to the end of the block.
A function literal can be invoked A function literal can be invoked
or assigned to a variable of the corresponding function pointer type. or assigned to a variable of the corresponding function pointer type.
For now, a function literal can reference only its parameters, global For now, a function literal can reference only its parameters, global
@ -752,9 +804,8 @@ Unresolved issues: Are there method literals? How do you use them?
Methods Methods
---- ----
A method is a function bound to a particular struct type T. When defined, A method is a function bound to a particular type T, where T is the
a method indicates the type of the struct by declaring a receiver of type type of the receiver. For instance, given type Point
*T. For instance, given type Point
type Point struct { x, y float } type Point struct { x, y float }
@ -764,9 +815,8 @@ the declaration
return scale * (p.x*p.x + p.y*p.y); return scale * (p.x*p.x + p.y*p.y);
} }
creates a method of type Point. Note that methods are not declared creates a method of type *Point. Note that methods may appear anywhere
within their struct type declaration. They may appear anywhere and after the declaration of the receiver type and may be forward-declared.
may be forward-declared for commentary.
When invoked, a method behaves like a function whose first argument When invoked, a method behaves like a function whose first argument
is the receiver, but at the call site the receiver is bound to the method is the receiver, but at the call site the receiver is bound to the method
@ -774,16 +824,16 @@ using the notation
receiver.method() receiver.method()
For instance, given a Point variable pt, one may call For instance, given a *Point variable pt, one may call
pt.distance(3.5) pt.distance(3.5)
Interface of a struct Interface of a type
---- ----
The interface of a struct is defined to be the unordered set of methods The interface of a type is defined to be the unordered set of methods
associated with that struct. associated with that type.
Interface types Interface types
@ -802,22 +852,23 @@ An interface type denotes a set of methods.
Close(); Close();
} }
Any struct whose interface has, possibly as a subset, the complete Any type whose interface has, possibly as a subset, the complete
set of methods of an interface I is said to implement interface I. set of methods of an interface I is said to implement interface I.
For instance, if two struct types S1 and S2 have the methods For instance, if two types S1 and S2 have the methods
func (p *T) Read(b Buffer) bool { return ... } func (p T) Read(b Buffer) bool { return ... }
func (p *T) Write(b Buffer) bool { return ... } func (p T) Write(b Buffer) bool { return ... }
func (p *T) Close() { ... } func (p T) Close() { ... }
then the File interface is implemented by both S1 and S2, regardless of (where T stands for either S1 or S2) then the File interface is
what other methods S1 and S2 may have or share. implemented by both S1 and S2, regardless of what other methods
S1 and S2 may have or share.
All struct types implement the empty interface: All types implement the empty interface:
interface {} interface {}
In general, a struct type implements an arbitrary number of interfaces. In general, a type implements an arbitrary number of interfaces.
For instance, if we have For instance, if we have
type Lock interface { type Lock interface {
@ -827,17 +878,20 @@ For instance, if we have
and S1 and S2 also implement and S1 and S2 also implement
func (p *T) lock() { ... } func (p T) lock() { ... }
func (p *T) unlock() { ... } func (p T) unlock() { ... }
they implement the Lock interface as well as the File interface. they implement the Lock interface as well as the File interface.
[OLD
It is legal to assign a pointer to a struct to a variable of It is legal to assign a pointer to a struct to a variable of
compatible interface type. It is legal to assign an interface compatible interface type. It is legal to assign an interface
variable to any struct pointer variable but if the struct type is variable to any struct pointer variable but if the struct type is
incompatible the result will be nil. incompatible the result will be nil.
END]
[OLD
The polymorphic "any" type The polymorphic "any" type
---- ----
@ -861,12 +915,15 @@ is a special case that can match any struct type, while type
can match any type at all, including basic types, arrays, etc. can match any type at all, including basic types, arrays, etc.
TODO: details about reflection TODO: details about reflection
END]
Equivalence of types Equivalence of types
--- ---
Types are structurally equivalent: Two types are equivalent ('equal') if they TODO: We may need to rethink this because of the new ways interfaces work.
Types are structurally equivalent: Two types are equivalent (``equal'') if they
are constructed the same way from equivalent types. are constructed the same way from equivalent types.
For instance, all variables declared as "*int" have equivalent type, For instance, all variables declared as "*int" have equivalent type,
@ -1002,7 +1059,7 @@ The syntax
is shorthand for is shorthand for
var identifer = Expression. var identifier = Expression.
i := 0 i := 0
f := func() int { return 7; } f := func() int { return 7; }
@ -1011,15 +1068,15 @@ is shorthand for
Also, in some contexts such as "if", "for", or "switch" statements, Also, in some contexts such as "if", "for", or "switch" statements,
this construct can be used to declare local temporary variables. this construct can be used to declare local temporary variables.
TODO: var a, b = 1, "x"; is permitted by grammar but not by current compiler
Function and method declarations Function and method declarations
---- ----
Functions and methods have a special declaration syntax, slightly Functions and methods have a special declaration syntax, slightly
different from the type syntax because an identifier must be present different from the type syntax because an identifier must be present
in the signature. Functions and methods can only be declared in the signature.
Implementation restriction: Functions and methods can only be declared
at the global level. at the global level.
FunctionDecl = "func" NamedSignature ( ";" | Block ) . FunctionDecl = "func" NamedSignature ( ";" | Block ) .
@ -1064,10 +1121,10 @@ Initial values
When memory is allocated to store a value, either through a declaration When memory is allocated to store a value, either through a declaration
or new(), and no explicit initialization is provided, the memory is or new(), and no explicit initialization is provided, the memory is
given a default initialization. Each element of such a value is given a default initialization. Each element of such a value is
set to the ``zero'' for that type: 0 for integers, 0.0 for floats, and set to the ``zero'' for that type: "false" for booleans, "0" for integers,
nil for pointers. This intialization is done recursively, so for "0.0" for floats, '''' for strings, and nil for pointers. This intialization
instance each element of an array of integers will be set to 0 if no is done recursively, so for instance each element of an array of integers will
other value is specified. be set to 0 if no other value is specified.
These two simple declarations are equivalent: These two simple declarations are equivalent:
@ -1094,7 +1151,7 @@ exported identifer visible outside the package. Another package may
then import the identifier to use it. then import the identifier to use it.
Export declarations must only appear at the global level of a Export declarations must only appear at the global level of a
compilation unit and can name only globally-visible identifiers. source file and can name only globally-visible identifiers.
That is, one can export global functions, types, and so on but not That is, one can export global functions, types, and so on but not
local variables or structure fields. local variables or structure fields.
@ -1236,11 +1293,15 @@ pointer or interface value.
By default, pointers are initialized to nil. By default, pointers are initialized to nil.
TODO: This needs to be revisited.
[OLD
TODO: how does this definition jibe with using nil to specify TODO: how does this definition jibe with using nil to specify
conversion failure if the result is not of pointer type, such conversion failure if the result is not of pointer type, such
as an any variable holding an int? as an any variable holding an int?
TODO: if interfaces were explicitly pointers, this gets simpler. TODO: if interfaces were explicitly pointers, this gets simpler.
END]
Allocation Allocation
@ -1275,6 +1336,10 @@ TODO: argument order for dimensions in multidimensional arrays
Conversions Conversions
---- ----
TODO: gri believes this section is too complicated. Instead we should
replace this with: 1) proper conversions of basic types, 2) compound
literals, and 3) type assertions.
Conversions create new values of a specified type derived from the Conversions create new values of a specified type derived from the
elements of a list of expressions of a different type. elements of a list of expressions of a different type.
@ -1414,20 +1479,16 @@ a set of related constants:
TODO: should iota work in var, type, func decls too? TODO: should iota work in var, type, func decls too?
Statements Statements
---- ----
Statements control execution. Statements control execution.
Statement = Statement =
[ LabelDecl ] ( StructuredStat | UnstructuredStat ) . Declaration |
SimpleStat | GoStat | ReturnStat | BreakStat | ContinueStat | GotoStat |
StructuredStat = Block | IfStat | SwitchStat | SelectStat | ForStat | RangeStat |
Block | IfStat | SwitchStat | SelectStat | ForStat | RangeStat .
UnstructuredStat =
Declaration | SimpleVarDecl |
SimpleStat | GoStat | ReturnStat | BreakStat | ContinueStat | GotoStat .
SimpleStat = SimpleStat =
ExpressionStat | IncDecStat | Assignment | SimpleVarDecl . ExpressionStat | IncDecStat | Assignment | SimpleVarDecl .
@ -1437,15 +1498,13 @@ Statement lists
---- ----
Semicolons are used to separate individual statements of a statement list. Semicolons are used to separate individual statements of a statement list.
They are optional after a statement that ends with a closing curly brace '}'. They are optional immediately before or after a closing curly brace "}",
immediately after "++" or "--", and immediately before a reserved word.
StatementList = StatementList = Statement { [ ";" ] Statement } .
StructuredStat |
UnstructuredStat |
StructuredStat [ ";" ] StatementList |
UnstructuredStat ";" StatementList .
TODO: define optional semicolons precisely
TODO: This still seems to be more complicated then necessary.
Expression statements Expression statements
@ -1478,7 +1537,7 @@ Assignments
assign_op = [ add_op | mul_op ] "=" . assign_op = [ add_op | mul_op ] "=" .
The left-hand side must be an l-value such as a variable, pointer indirection, The left-hand side must be an l-value such as a variable, pointer indirection,
or an array indexing. or an array index.
x = 1 x = 1
*p = f() *p = f()
@ -1604,13 +1663,21 @@ the variable is initialized once before the statement is entered.
} }
TODO: We should fix this and move to:
IfStat =
"if" [ [ Simplestat ] ";" ] [ Condition ] Block
{ "else" "if" Condition Block }
[ "else" Block ] .
Switch statements Switch statements
---- ----
Switches provide multi-way execution. Switches provide multi-way execution.
SwitchStat = "switch" [ [ Simplestat ] ";" ] [ Expression ] "{" { CaseClause } "}" . SwitchStat = "switch" [ [ Simplestat ] ";" ] [ Expression ] "{" { CaseClause } "}" .
CaseClause = CaseList StatementList [ ";" ] [ "fallthrough" [ ";" ] ] . CaseClause = CaseList [ StatementList [ ";" ] ] [ "fallthrough" [ ";" ] ] .
CaseList = Case { Case } . CaseList = Case { Case } .
Case = ( "case" ExpressionList | "default" ) ":" . Case = ( "case" ExpressionList | "default" ) ":" .
@ -1902,7 +1969,7 @@ an error if the import introduces name conflicts.
Program Program
---- ----
A program is package clause, optionally followed by import declarations, A program is a package clause, optionally followed by import declarations,
followed by a series of declarations. followed by a series of declarations.
Program = PackageClause { ImportDecl [ ";" ] } { Declaration [ ";" ] } . Program = PackageClause { ImportDecl [ ";" ] } { Declaration [ ";" ] } .
@ -1913,5 +1980,4 @@ TODO
- TODO: type switch? - TODO: type switch?
- TODO: words about slices - TODO: words about slices
- TODO: I (gri) would like to say that sizeof(int) == sizeof(pointer), always.
- TODO: really lock down semicolons - TODO: really lock down semicolons