Common Language Infrastructure (CLI)
Partition II:
Metadata Definition and Semantics
Table of contents
5.4 Labels and Lists of Labels
6 Assemblies, Manifests and Modules
6.1 Overview of Modules, Assemblies, and Files
6.2.1 Information about the Assembly (<asmDecl>)
6.6 Declarations inside a Module or Assembly
7.3 References to User-defined Types (<typeReference>)
8 Visibility, Accessibility and Hiding
8.1 Visibility of Top-Level Types and Accessibility of Nested Types
9.1.1 Visibility and Accessibility Attributes
9.1.3 Type Semantics Attributes
9.1.5 Interoperation Attributes
9.1.6 Special Handling Attributes
9.3 Introducing and Overriding Virtual Methods
9.3.1 Introducing a Virtual Method
9.3.3 Accessibility and Overriding
9.4 Method Implementation Requirements
9.7 Controlling Instance Layout
11.2 Implementing Virtual Methods on Interfaces
13.6.1 Synchronous Calls to Delegates
13.6.2 Asynchronous Calls to Delegates
14 Defining, Referencing, and Calling Methods
14.2 Static, Instance, and Virtual Methods
14.4.2 Predefined Attributes on Methods
14.4.3 Implementation Attributes of Methods
14.5.1 Method Transition Thunks
14.5.6 Managed Native Calling Conventions (x86)
15 Defining and Referencing Fields
15.1.1 Accessibility Information
15.1.2 Field Contract Attributes
15.1.3 Interoperation Attributes
15.3 Embedding Data in a PE File
15.3.2 Accessing Data from the PE File
15.3.3 Unmanaged Thread-local Storage
15.4 Initialization of Non-Literal Static Data
15.4.1 Data Known at Link Time
20.1 CLS Conventions: Custom Attribute Usage
20.2 Attributes Used by the CLI
20.2.1 Pseudo Custom Attributes
20.2.2 Custom Attributes Defined by the CLS
20.2.3 Custom Attributes for CIL-to-Native-Code Compiler and Debugger
20.2.4 Custom Attributes for Remoting
20.2.5 Custom Attributes for Security
20.2.6 Custom Attributes for TLS
20.2.7 Pseudo Custom Attributes for the Assembly Linker
20.2.8 Custom Attributes Provided for Interoperation with Unmanaged Code
20.2.9 Custom Attributes, Various
21 Metadata Logical Format: Tables
21.1 Metadata Validation Rules
21.7 AssemblyRefProcessor : 0x24
22 Metadata Logical Format: Other Structures
22.1.1 Values for AssemblyHashAlgorithm
22.1.2 Values for AssemblyFlags
22.1.4 Flags for Events [EventAttributes]
22.1.5 Flags for Fields [FieldAttributes]
22.1.6 Flags for Files [FileAttributes]
22.1.7 Flags for ImplMap [PInvokeAttributes]
22.1.8 Flags for ManifestResource [ManifestResourceAttributes]
22.1.9 Flags for Methods [MethodAttributes]
22.1.10 Flags for Methods [MethodImplAttributes]
22.1.11 Flags for MethodSemantics [MethodSemanticsAttributes]
22.1.12 Flags for Params [ParamAttributes]
22.1.13 Flags for Properties [PropertyAttributes]
22.1.14 Flags for Types [TypeAttributes]
22.1.15 Element Types used in Signatures
24 File Format Extensions to PE
24.1 Structure of the Runtime File Format
24.3.1 Import Table and Import Address Table (IAT)
24.4 Common Intermediate Language Physical Layout
24.4.1 Method Header Type Values
24.4.4 Flags for Method Headers
24.4.6 Exception Handling Clauses
Partition I of the Common Language Infrastructure (CLI) describes the overall architecture of the CLI, and provides the normative description of the Common Type System (CTS), the Virtual Execution System (VES), and the Common Language Specification (CLS). It also provides a non-normative description of the metadata and a comprehensive set of abbreviations, acronyms, and definitions, included by reference from all other Partitions.
Partition II (this specification) provides the normative description of the metadata: its physical layout (as a file format), its logical contents (as a set of tables and their relationships), and its semantics (as seen from a hypothetical assembler, ilasm).
This document focuses on the structure and semantics of metadata. The semantics of metadata, which dictate much of the operation of the VES, are described using the syntax of ilasm, an assembler language for CIL. The ilasm syntax itself is considered a normative part of this ECMA standard. This constitutes Chapters 5 through 20. A complete syntax for ilasm is included in Partition V. The structure (both logical and physical) is covered in Chapters 21 through 24.
Rationale: An assembly language is really just syntax for specifying the metadata in a file and the CIL instructions in that file. Specifying ilasm provides a means of interchanging programs written directly for the CLI without the use of a higher-level language and also provides a convenient way to express examples.
The semantics of the metadata can also be described independently of the actual format in which the metadata is stored. This point is important because the storage format as specified in Chapters 21 through 24 is engineered to be efficient for both storage space and access time, but this comes at the cost of the simplicity desirable for describing its semantics.
Validation refers to a set of tests that can be performed on any file to check that the file format, metadata, and CIL are self-consistent. These tests are intended to ensure that the file conforms to the mandatory requirements of this specification. The behavior of conforming implementations of the CLI when presented with non-conforming files is unspecified.
Verification refers to a check of both CIL and its related metadata to ensure that the CIL code sequences do not permit any access to memory outside the program’s logical address space. In conjunction with the validation tests, verification ensures that the program cannot access memory or other resources to which it is not granted access.
Partition III specifies the rules for both valid and verifiable use of CIL instructions. Partition III also provides an informative description of rules for validating the internal consistency of metadata (the rules follow, albeit indirectly, from the specification in this Partition) as well as containing a normative description of the verification algorithm. A mathematical proof of soundness of the underlying type system is possible, and provides the basis for the verification requirements. Aside from these rules this standard does not specify:
· at what time (if ever) such an algorithm should be performed
· what a conforming implementation should do in case of failure of verification.
The following graph makes this relationship clearer (see next paragraph for a description):

Figure 1: Relationship between valid and verifiable CIL
In the above figure, the outer circle contains all code permitted by the ilasm syntax. The next circle represents all code that is valid CIL. The dotted inner circle represents all typesafe code. Finally, the black innermost circle contains all code that is verifiable. (The difference between typesafe code and verifiable code is one of provability: code which passes the VES verification algorithm is, by definition, verifiable; but that simple algorithm rejects certain code, even though a deeper analysis would reveal it as genuinely typesafe.) Note that even if a program follows the syntax described in Partition V, the code may still not be valid, because valid code shall adhere to restrictions presented in this document and in Partition III.
Verification is a very stringent test. There are many programs that will pass validation but will fail verification. The VES cannot guarantee that these programs do not access memory or resources to which they are not granted access. Nonetheless, they may have been correctly constructed so that they do not access these resources. It is thus a matter of trust, rather than mathematical proof, whether it is safe to run these programs. A conforming implementation of the CLI may allow unverifiable code (valid code that does not pass verification) to be executed, although this may be subject to administrative trust controls that are not part of this standard. A conforming implementation of the CLI shall allow the execution of verifiable code, although this may be subject to additional implementation-specified trust controls.
This section and its subsections contain only informative text.
Before diving into the details, it is useful to see an introductory sample program to get a feeling for the ilasm assembly language. The next section shows the famous Hello World program, this time in the ilasm assembly language.
This section gives a simple example to illustrate the general feel of ilasm. Below is code that prints the well known “Hello world!” salutation. The salutation is written by calling WriteLine, a static method found in the class System.Console that is part of the assembly mscorlib (see Partition IV).
Example (informative):
.assembly extern mscorlib {}
.assembly hello {}
.method static public void main() cil managed
{ .entrypoint
.maxstack 1
ldstr "Hello world!"
call void [mscorlib]System.Console::WriteLine(class System.String)
ret
}
The .assembly extern declaration references an external assembly, mscorlib, which defines System.Console. The .assembly declaration in the second line declares the name of the assembly for this program. (Assemblies are the deployment unit for executable content for the CLI.) The .method declaration defines the global method main. The body of the method is enclosed in braces. The first line in the body indicates that this method is the entry point for the assembly (.entrypoint), and the second line in the body specifies that it requires at most one stack slot (.maxstack).
The method contains only three instructions. The ldstr instruction pushes the string constant "Hello world!" onto the stack and the call instruction invokes System.Console::WriteLine, passing the string as its only argument (note that string literals in CIL are instances of the standard class System.String). As shown, call instructions shall include the full signature of the called method. Finally, the last instruction returns (ret) from main.
This document contains integrated examples for most features of the CLI metadata. Many sections conclude with an example showing a typical use of the feature. All these examples are written using the ilasm assembly language. In addition, Partition V contains a longer example of a program written in the ilasm assembly language. All examples are, of course, informative only.
End informative text
This section describes aspects of the ilasm syntax that are common to many parts of the grammar. The term “ASCII” refers to the American Standard Code for Information Interchange, a standard seven-bit code that was proposed by ANSI in 1963, and finalized in 1968. The ASCII repertoire of Unicode is the set of 128 Unicode characters from U+0000 to U+007F.
This document uses a modified form of the BNF syntax notation. The following is a brief summary of this notation.
Bold items are terminals. Items placed in angle brackets (e.g. <int64>) are names of syntax classes and shall be replaced by actual instances of the class. Items placed in square brackets (e.g. [<float>]) are optional, and any item followed by * can appear zero or more times. The character “|” means that the items on either side of it are acceptable. The options are sorted in alphabetical order (to be more specific: in ASCII order, ignoring “<” for syntax classes, and case-insensitive). If a rule starts with an optional term, the optional term is not considered for sorting purposes.
ilasm is a case-sensitive language. All terminals shall be used with the same case as specified in this reference.
Example (informative):
A grammar such as
<top> ::= <int32> | float <float> |
floats [<float> [, <float>]*] | else <QSTRING>
would consider the following all to be legal:
12
float 3
float -4.3e7
floats
floats 2.4
floats 2.4, 3.7
else "Something \t weird"
but all of the following to be illegal:
else 3
3, 4
float 4.3, 2.4
float else
stuff
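Informative sketch: the notation above can be exercised with a small recognizer for the sample <top> grammar, written here in Python rather than ilasm. The lexical patterns for <int32>, <float>, and <QSTRING> are assumptions made for this illustration (in particular, <float> is not defined in the text above, and the 32-bit range constraint is not checked), and the name is_top is hypothetical.

```python
import re

# Assumed lexical forms for the syntax classes used by <top>.
INT32 = r"(?:0x[0-9A-Fa-f]+|-?[0-9]+)"
FLOAT = r"-?[0-9]+(?:\.[0-9]+)?(?:[eE][+-]?[0-9]+)?"
QSTRING = r'"(?:[^"\\]|\\.)*"'

# One alternative per "|" in the grammar; [...] becomes an optional group
# and * becomes zero-or-more repetition.
TOP = re.compile(
    rf"{INT32}"
    rf"|float\s+{FLOAT}"
    rf"|floats(?:\s+{FLOAT}(?:\s*,\s*{FLOAT})*)?"
    rf"|else\s+{QSTRING}"
)


def is_top(text: str) -> bool:
    """Return True if the whole of `text` matches the sample <top> grammar."""
    return TOP.fullmatch(text.strip()) is not None
```

Because terminals such as float and else are matched literally, the recognizer is case-sensitive, in line with the rule that all terminals shall be used with the case shown in the grammar.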
The basic syntax classes used in the grammar describe syntactic constraints on the input. These constraints are intended to convey logical restrictions on the information encoded in the metadata.
The syntactic constraints described in this clause are informative only. The semantic constraints (e.g. “shall be represented in 32 bits”) are normative.
<int32> is either a decimal number or “0x” followed by a hexadecimal number, and shall be represented in 32 bits.
<int64> is either a decimal number or “0x” followed by a hexadecimal number, and shall be represented in 64 bits.
<hexbyte> is a 2-digit hexadecimal number that fits into one byte.
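Informative sketch: the size constraints on the integer classes can be checked mechanically. The Python fragment below is only an illustration; the names parse_int and parse_hexbyte are hypothetical, and accepting a leading minus sign together with any signed or unsigned bit pattern of the given width is an assumption, since the text requires only that the value be representable in the stated number of bits.

```python
import re

_DEC = re.compile(r"-?[0-9]+")
_HEX = re.compile(r"0x[0-9A-Fa-f]+")


def parse_int(text: str, bits: int) -> int:
    """Parse a decimal or 0x-prefixed hexadecimal literal that fits in `bits` bits."""
    if _HEX.fullmatch(text):
        value = int(text[2:], 16)
    elif _DEC.fullmatch(text):
        value = int(text, 10)
    else:
        raise ValueError(f"not an integer literal: {text!r}")
    # Assumption: both signed and unsigned `bits`-wide values are acceptable.
    if not -(1 << (bits - 1)) <= value < (1 << bits):
        raise ValueError(f"{text!r} cannot be represented in {bits} bits")
    return value


def parse_hexbyte(text: str) -> int:
    """Parse a <hexbyte>: exactly two hexadecimal digits."""
    if not re.fullmatch(r"[0-9A-Fa-f]{2}", text):
        raise ValueError(f"not a hexbyte: {text!r}")
    return int(text, 16)
```

With these definitions, parse_int(text, 32) plays the role of <int32> and parse_int(text, 64) the role of <int64>.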
<realnumber> is any syntactic representation for a floating point number that is distinct from that for all other terminal nodes. In this document, a period (.) is used to separate the integer and fractional parts, and “e” or “E” separates the mantissa from the exponent. Either (but not both) may be omitted.
Note: A complete assembler may also provide syntax for infinities and NaNs.
<QSTRING> is a string surrounded by double quote (″) marks. Within the quoted string the character “\” can be used as an escape character, with “\t” for a tab character, “\n” for a new line character, or followed by three octal digits in order to insert an arbitrary byte into the string. The “+” operator can be used to concatenate string literals. This way, a long string can be broken across multiple lines by using “+” and a new string on each line. An alternative is using “\” as the last character in a line, in which case the line break is not entered into the generated string. Any white space characters (space, line feed, carriage return, and tab) between the “\” and the first character on the next line are ignored. See also examples below.
Note: A complete assembler will need to deal with the full set of issues required to support Unicode encodings, see Partition I (especially CLS Rule 4).
<SQSTRING> is similar to <QSTRING> with the difference that it is surrounded by single quote (′) marks instead of double quote marks.
<ID> is a contiguous string of characters which starts with either an alphabetic character or one of “_”, “$”, “@” or “?” and is followed by any number of alphanumeric characters or any of “_”, “$”, “@”, or “?”. An <ID> is used in only two ways:
· As a label of a CIL instruction
· As an <id> which can either be an <ID> or an <SQSTRING>, so that special characters can be included.
Example (informative):
The following examples show the breaking of strings:
ldstr "Hello " + "World " +
"from CIL!"
and
ldstr "Hello World\
\040from CIL!"
both become "Hello World from CIL!".
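Informative sketch: the escape and line-continuation rules for <QSTRING> can be illustrated by a small Python decoder. It operates on the text between the quote marks. The name decode_qstring_body is hypothetical, and rejecting any escape not listed in the description of <QSTRING> is a simplification assumed for this sketch, not a rule of the grammar.

```python
def decode_qstring_body(body: str) -> str:
    """Decode the escapes described for <QSTRING> (text between the quotes)."""
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch != "\\":
            out.append(ch)
            i += 1
            continue
        rest = body[i + 1:]
        if rest[:1] == "t":
            out.append("\t")
            i += 2
        elif rest[:1] == "n":
            out.append("\n")
            i += 2
        elif len(rest) >= 3 and all(c in "01234567" for c in rest[:3]):
            code = int(rest[:3], 8)
            if code > 0xFF:
                raise ValueError("octal escape does not fit in one byte")
            out.append(chr(code))
            i += 4
        elif rest[:1] in ("\n", "\r"):
            # "\" at the end of a line: drop the line break and any white
            # space up to the first character on the next line.
            i += 1
            while i < len(body) and body[i] in " \t\r\n":
                i += 1
        else:
            # Assumption: escapes not described in the text are rejected.
            raise ValueError(f"unsupported escape at offset {i}")
    return "".join(out)
```

Concatenation with “+” then amounts to joining the decoded pieces, so the two ldstr forms shown in the example produce the same string.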
Identifiers are used to name entities. Simple identifiers are just equivalent to an <ID>. However, the ilasm syntax allows the use of any identifier that can be formed using the Unicode character set (see Partition I). To achieve this an identifier is placed within single quotation marks. This is summarized in the following grammar.
<id> ::=
    <ID>
  | <SQSTRING>
Keywords may only be used as identifiers if they appear in single quotes (see Partition V for a list of all keywords).
Several <id>’s may be combined to form a larger <id>. The <id>’s are separated by a dot (.). An <id> formed in this way is called a <dottedname>.
<dottedname> ::= <id> [. <id>]*
Rationale: <dottedname> is provided for convenience, since “.” can be included in an <id> using the <SQSTRING> syntax. <dottedname> is used in the grammar where “.” is considered a common character (e.g. fully qualified type names).
Implementation Specific (Microsoft)
Names that end with $PST followed by a hexadecimal number have a special meaning. The assembler will automatically truncate the part starting with the $PST. This is in support of compiler-controlled accessibility, see Partition I. Also, the first release of the CLI limits the length of identifiers; see Chapter 21 for details.
Examples (informative):
The following shows some simple identifiers:
A
Test
$Test
@Foo?
?_X_
The following shows identifiers in single quotes:
'Weird Identifier'
'Odd\102Char'
'Embedded\nReturn'
The following shows dotted names:
System.Console
A.B.C
'My Project'.'My Component'.'My Name'
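Informative sketch: the distinction between a simple <ID> and a single-quoted <id> can be expressed as a short Python check. Treating “alphabetic” and “alphanumeric” as the ASCII letters and digits is an assumption of this sketch, as are the names is_simple_id, as_id, and as_dottedname. The keyword set must be supplied by the caller, because the keyword list is given in Partition V and is not reproduced here.

```python
import re

# Assumption: "alphabetic" and "alphanumeric" mean ASCII letters and digits.
_SIMPLE_ID = re.compile(r"[A-Za-z_$@?][A-Za-z0-9_$@?]*")


def is_simple_id(name: str) -> bool:
    """Return True if `name` has the form of an <ID>."""
    return _SIMPLE_ID.fullmatch(name) is not None


def as_id(name: str, keywords=frozenset()) -> str:
    """Write `name` as an <id>: bare when possible, single-quoted otherwise."""
    if is_simple_id(name) and name not in keywords:
        return name
    if "'" in name or "\\" in name:
        # Escaping inside <SQSTRING> is outside the scope of this sketch.
        raise ValueError("name needs escaping, which this sketch does not do")
    return "'" + name + "'"


def as_dottedname(parts, keywords=frozenset()) -> str:
    """Join several <id>s with dots to form a <dottedname>."""
    return ".".join(as_id(part, keywords) for part in parts)
```

For instance, as_dottedname(["My Project", "My Component", "My Name"]) yields the quoted dotted name shown in the examples, while as_dottedname(["System", "Console"]) needs no quoting.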
Labels are provided as a programming convenience; they represent a number that is encoded in the metadata. The value represented by a label is typically an offset in bytes from the beginning of the current method, although the precise encoding differs depending on where in the logical metadata structure or CIL stream the label occurs. For details of how labels are encoded in the metadata, see Chapters 21 through 24; for their encoding in CIL instructions see Partition III.
A simple label is a special name that represents an address. Syntactically, a label is equivalent to an <id>. Thus, labels may also be single quoted and may contain Unicode characters.
A list of labels is comma separated, and can be any combination of these simple labels.
<labeloroffset> ::= <id>
<labels> ::= <labeloroffset> [, <labeloroffset>]*
Rationale: In a real assembler the syntax for <labeloroffset> might allow the direct specification of a number rather than requiring symbolic labels.
Implementation Specific (Microsoft)
The following syntax is also supported, for round-tripping purposes:
<labeloroffset> ::= <int32> | <label>
ilasm distinguishes between two kinds of labels: code labels and data labels. Code labels are followed by a colon (“:”) and represent the address of an instruction to be executed. Code labels appear before an instruction and they represent the address of the instruction that immediately follows the label. A particular code label name may not be declared more than once in a method.
In contrast to code labels, data labels specify the location of a piece of data and do not include the colon character. A data label may not be used as a code label, and a code label may not be used as a data label. A particular data label name may not be declared more than once in a module.
<codeLabel> ::= <id> :
<dataLabel> ::= <id>
Example (informative):
The following defines a code label, ldstr_label, that represents the address of the ldstr instruction:
ldstr_label: ldstr "A label"
A list of bytes consists simply of one or more hex bytes. Hex bytes are pairs of characters 0 – 9, a – f, and A – F.
<bytes> ::= <hexbyte> [<hexbyte>*]
There are two different ways to specify a floating-point number:
1. Use the dot (“.”) for the decimal point and “e” or “E” in front of the exponent. Both the decimal point and the exponent are optional.
2. Indicate that the floating-point value is derived from an integer using the keyword float32 or float64 and indicating the integer in parentheses.
<float64> ::=
    float32 ( <int32> )
  | float64 ( <int64> )
  | <realnumber>
Example (informative):
5.5
1.1e10
float64(128) // note: this converts the integer 128 to its fp value
The metadata does not encode information about the lexical scope of variables or the mapping from source line numbers to CIL instructions. Nonetheless, it is useful to specify an assembler syntax for providing this information for use in creating alternate encodings of the information.
Implementation Specific (Microsoft)
Source line information is stored in the PDB (Portable Debug) file associated with each module.
.line takes a line number, an optional column number (preceded by a colon), and an optional single-quoted string that specifies the name of the file to which the line number refers.
<externSourceDecl> ::= .line <int32> [ : <int32> ] [<SQSTRING>]
Implementation Specific (Microsoft)
For compatibility reasons, ilasm allows the following:
<externSourceDecl> ::= … | #line <int32> <QSTRING>
Notice that this form requires the file name, and that the name shall be double quoted, not single quoted as with .line.
Some grammar elements require that a file name be supplied. A file name is like any other name where “.” is considered a normal constituent character. The specific syntax for file names follows the specifications of the underlying operating system.
<filename> ::=        Section
    <dottedname>      5.3
Attributes of types and their members attach descriptive information to their definition. The most common attributes are predefined and have a specific encoding in the metadata associated with them (see Chapter 22). In addition, the metadata provides a way of attaching user-defined attributes to metadata, using several different encodings.
From a syntactic point of view, there are several ways of specifying attributes in ilasm:
· Using special syntax built into ilasm. For example the keyword private in a <classAttr> specifies that the visibility attribute on a type should be set to allow access only within the defining assembly.
· Using a general-purpose syntax in ilasm. The non-terminal <customDecl> describes this grammar (see Chapter 20). For some attributes, called pseudo-custom attributes, this grammar actually results in setting special encodings within the metadata (see clause 20.2.1).
· Some attributes are required to be set based on the settings of other attributes or information within the metadata and are not visible from the syntax of ilasm at all. These are called hidden attributes.
· Security attributes are treated specially. There is special syntax in ilasm that allows the XML representing security attributes to be described directly (see Chapter 19). While all other attributes defined either in the standard library or by user-provided extension are encoded in the metadata using one common mechanism described in Section 21.10, security attributes (distinguished by the fact that they inherit, directly or indirectly, from System.Security.Permissions.SecurityAttribute, see Partition IV) shall be encoded as described in Section 21.11.
An input to ilasm is a sequence of declarations, defined as follows:
<ILFile> ::=          Reference
    <decl>*           5.10
The complete grammar for a top level declaration is shown below. The following sections will concentrate on the various parts of this grammar.
<decl> ::=                                                            Reference
    .assembly <dottedname> { <asmDecl>* }                             6.1
  | .assembly extern <dottedname> { <asmRefDecl>* }                   6.3
  | .class <classHead> { <classMember>* }                             9
  | .class extern <exportAttr> <dottedname> { <externClassDecl>* }    6.7
  | .corflags <int32>                                                 6.1
  | .custom <customDecl>                                              20
  | .data <datadecl>                                                  15.3.1
  | .field <fieldDecl>                                                15
  | .file [nometadata] <filename> [.hash = ( <bytes> )]               6.2.3
  | .mresource [public | private] <dottedname>                        6.2.2
  | .method <methodHead> { <methodBodyItem>* }                        14
  | .module [<filename>]                                              6.4
  | .module extern <filename>                                         6.5
  | .subsystem <int32>                                                6.2
  | .vtfixup <vtfixupDecl>                                            14.5.1
  | <externSourceDecl>                                                5.7
  | <securityDecl>                                                    18
Implementation Specific (Microsoft)
The grammar for declarations also includes the following. These are described in a separate product specification.
<decl> ::=
    .file alignment <int32>
  | .imagebase <int64>
  | .language <languageDecl>
  | .namespace <id>
  | …
Assemblies and modules are grouping constructs, each playing a different role in the CLI.
An assembly is a set of one or more files deployed as a unit. An assembly always contains a manifest that specifies (see Section 6.1):
· Version, name, culture, and security requirements for the assembly.
· Which other files, if any, belong to the assembly along with a cryptographic hash of each file. The manifest itself resides in the metadata part of a file and that file is always part of the assembly.
· Which of the types defined in other files of the assembly are to be exported from the assembly. Types defined in the same file as the manifest are exported based on attributes of the type itself.
· Optionally, a digital signature for the manifest itself and the public key used to compute it.
A module is a single file containing executable content in the format specified here. If the module contains a manifest then it also specifies the modules (including itself) that constitute the assembly. An assembly shall contain only one manifest amongst all its constituent files. For an assembly to be executed (rather than dynamically loaded) the manifest shall reside in the module that contains the entry point.
While some programming languages introduce the concept of a namespace, there is no support in the CLI for this concept. Type names are always specified by their full name relative to the assembly in which they are defined.
This section contains informative text only.
The following picture should clarify the various forms of references:

Figure 2: References
Eight files are shown in the picture. The name of each file is shown below the file. Files that declare a module have an additional border around them and have names beginning with M. The other two files have a name beginning with F. These files may be resource files, like bitmaps, or other files that do not contain CIL code.
Files M1 and M4 declare an assembly in addition to the module declaration, namely assemblies A and B, respectively. The assembly declaration in M1 and M4 references other modules, shown with straight lines. Assembly A references M2 and M3. Assembly B references M3 and M5. Thus, both assemblies reference M3.
Usually, a module belongs only to one assembly, but it is possible to share it across assemblies. When Assembly A is loaded at runtime, an instance of M3 will be loaded for it. When Assembly B is loaded into the same application domain, possibly simultaneously with Assembly A, M3 will be shared for both assemblies. Both assemblies also reference F2, for which similar rules apply.
The module M2 references F1, shown by dotted lines. As a consequence F1 will be loaded as part of Assembly A, when A is executed. Thus, the file reference shall also appear with the assembly declaration. Similarly, M5 references another module, M6, which becomes part of B when B is executed. It follows that assembly B shall also have a module reference to M6.
End informative text
An assembly is specified as a module that contains a manifest in the metadata; see Section 21.2. The information for the manifest is created from the following portions of the grammar:
<decl> ::=                                                                        Section
    .assembly <dottedname> { <asmDecl>* }                                         6.2
  | .assembly extern <dottedname> { <asmRefDecl>* }                               6.3
  | .corflags <int32>                                                             6.2
  | .file [nometadata] <filename> .hash = ( <bytes> )                             6.2.3
  | .module extern <filename>                                                     6.5
  | .mresource [public | private] <dottedname> [( <QSTRING> )] { <manResDecl>* }  6.2.2
  | .subsystem <int32>                                                            6.2
  | …
The .assembly directive declares the manifest and specifies to which assembly the current module belongs. A module shall contain at most one .assembly directive. The <dottedname> specifies the name of the assembly.
Note: Since some platforms treat names in a case insensitive manner, two assemblies that have names that differ only in case should not be declared.
The .corflags directive sets a field in the CLI header of the output PE file (see clause 24.3.3.1). A conforming implementation of the CLI shall expect it to be 1. For backwards compatibility, the three least significant bits are reserved. Future versions of this standard may provide definitions for values between 8 and 65,535. Experimental and non-standard uses should thus use values greater than 65,535.
The .subsystem directive is used only when the assembly is directly executed (as opposed to used as a library for another program). It specifies the kind of application environment required for the program, by storing the specified value in the PE file header (see clause 24.2.2). While a full 32 bit integer may be supplied, a conforming implementation of the CLI need only respect two possible values:
· If the value is 2, the program should be run using whatever conventions are appropriate for an application that has a graphical user interface.
· If the value is 3, the program should be run using whatever conventions are appropriate for an application that has a direct console attached.
Implementation Specific (Microsoft)
<decl> ::= … | .file alignment <int32> | .imagebase <int64>
The .file alignment directive sets the file alignment field in the PE header of the output file. Legal values are multiples of 512. (Different sections of the PE file are aligned, on disk, at the specified value, in bytes.)
The .imagebase directive sets the imagebase field in the PE header of the output file. This value specifies the virtual address at which this PE file will be loaded into the process.
See clause 24.2.3.2
Example (informative):
.assembly CountDown
{ .hash algorithm 32772
.ver 1:0:0:0
}
.file Counter.dll .hash = (BA D9 7D 77 31 1C 85 4C 26 9C 49 E7 02 BE E7 52 3A CB 17 AF)
The following grammar shows the information that can be specified about an assembly.
<asmDecl> ::=                                   Description                                         Section
    .custom <customDecl>                        Custom attributes                                   20
  | .hash algorithm <int32>                     Hash algorithm used in the .file directive          6.2.1.1
  | .culture <QSTRING>                          Culture for which this assembly is built            6.2.1.2
  | .publickey = ( <bytes> )                    The originator's public key.                        6.2.1.3
  | .ver <int32> : <int32> : <int32> : <int32>  Major version, minor version, revision, and build   6.2.1.4
  | <securityDecl>                              Permissions needed, desired, or prohibited          19
<asmDecl> ::= .hash algorithm <int32> | …
When an assembly consists of more than one file (see clause 6.2.3), the manifest for the assembly specifies both the name of the file and the cryptographic hash of the contents of the file. The algorithm used to compute the hash can be specified, and shall be the same for all files included in the assembly. All values are reserved for future use, and conforming implementations of the CLI shall use the SHA1 (see Partition I) hash function and shall specify this algorithm by using a value of 32772 (0x8004).
Rationale: SHA1 was chosen as the best widely available technology at the time of standardization (see Partition I). A single algorithm is chosen since all conforming implementations of the CLI would otherwise be required to implement all algorithms to ensure portability of executable images.
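Informative sketch: the relationship between a file's contents and the bytes written in a .file directive can be illustrated in Python using the standard hashlib module. The name file_directive is hypothetical, and hashing the complete file contents as a single byte string is an assumption of this sketch. The file name is written bare, which presumes it already has the form of a <dottedname>.

```python
import hashlib

# Value given in the text for the SHA1 hash algorithm.
SHA1_ALGORITHM_ID = 0x8004  # 32772 decimal


def file_directive(filename: str, contents: bytes) -> str:
    """Format a .file directive carrying the SHA1 hash of `contents`."""
    digest = hashlib.sha1(contents).digest()
    # <bytes> is a list of two-digit hexadecimal numbers.
    hex_bytes = " ".join(f"{b:02X}" for b in digest)
    return f".file {filename} .hash = ({hex_bytes})"
```

An SHA1 digest is always 20 bytes long, which matches the 20 hex bytes in the .file Counter.dll example shown earlier.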
<asmDecl> ::= .culture <QSTRING> | …
When present, this indicates that the assembly has been customized for a specific culture. The strings that shall be used here are those specified in Partition IV as acceptable with the class System.Globalization.CultureInfo. When used for comparison between an assembly reference and an assembly definition these strings shall be compared in a case-insensitive manner.
Implementation Specific (Microsoft)
The product versions of ilasm and ildasm use .locale rather than .culture.
Note: The culture names follow the IETF RFC 1766 names. The format is “<language>-<country/region>”, where <language> is a lowercase two-letter code in ISO 639-1 and <country/region> is an uppercase two-letter code in ISO 3166.
<asmDecl> ::= .publickey = ( <bytes> ) | …
The CLI metadata allows the producer of an assembly to compute a cryptographic hash of the assembly (using the SHA1 hash function) and then encrypt it using the RSA algorithm (see Partition I) and a public/private key pair of the producer’s choosing. The results of this (an “SHA1/RSA digital signature”) can then be stored in the metadata along with the public part of the key pair required by the RSA algorithm. The .publickey directive is used to specify the public key that was used to compute the signature. To calculate the hash, the signature is zeroed, the hash calculated, then the result stored into the signature.
A reference to an assembly (see Section 6.3) captures some of this information at compile time. At runtime, the information contained in the assembly reference can be combined with the information from the manifest of the assembly located at runtime to ensure that the same private key was used to create both the assembly seen when the reference was created (compile time) and when it is resolved (runtime).
|
<asmDecl> ::= .ver <int32> : <int32> : <int32> : <int32> | … |
The version number of the assembly, specified as four 32-bit integers. This version number shall be captured at compile time and used as part of all references to the assembly within the compiled module. This standard places no other requirement on the use of the version numbers.
Note: A conforming implementation may ignore version numbers entirely, or it may require that they match precisely when binding a reference, or any other behavior deemed appropriate. By convention:
· the first of these is considered the major version number, and assemblies with the same name but different major versions are not interchangeable. This would be appropriate, for example, for a major rewrite of a product where backwards compatibility cannot be assumed.
· the second of these is considered the minor version number, and assemblies with the same name and major version but different minor versions indicate significant enhancements that are nevertheless intended to be backward compatible. This would be appropriate, for example, on a “point release” of a product or a fully backward compatible new version of a product.
· the third of these is considered the revision number, and assemblies with the same name, major and minor version number but different revisions are intended to be fully interchangeable. This would be appropriate, for example, to fix a security hole in a previously released assembly.
· the fourth of these is considered the build number, and assemblies that differ only by build number are intended to represent a recompilation from the same source. This would be appropriate, for example, because of processor, platform, or compiler changes.
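Informative sketch: the convention above can be expressed as a small classifier. The Python below is illustrative only; the function name and the returned labels are hypothetical and not part of this standard, and a conforming implementation remains free to ignore version numbers entirely.

```python
from typing import Tuple

# (major, minor, revision, build), in the order written after .ver
Version = Tuple[int, int, int, int]

def classify_by_convention(a: Version, b: Version) -> str:
    """Describe how two same-named assemblies relate under the informal convention."""
    if a[0] != b[0]:
        return "not interchangeable"            # major versions differ
    if a[1] != b[1]:
        return "intended backward compatible"   # minor versions differ
    if a[2] != b[2]:
        return "fully interchangeable"          # revisions differ
    if a[3] != b[3]:
        return "recompilation of same source"   # only build numbers differ
    return "identical"
```

The comparison proceeds from the most significant component to the least, so the first differing component determines the label.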
A manifest resource is simply a named item of data associated with an assembly. A manifest resource is introduced using the .mresource directive, which adds the manifest resource to the assembly manifest begun by a preceding .assembly declaration.
|
<decl> ::= |
Section |
|
.mresource [public | private] <dottedname> { <manResDecl>* } |
|
|
| … |
5.10 |
If the manifest resource is declared public it is exported from the assembly. If it is declared private it is not exported and hence only available from within the assembly. The <dottedname> is the name of the resource, and the optional quoted string is a description of the resource.
|
<manResDecl> ::= |
Description |
Section |
|
.assembly extern <dottedname> |
Manifest resource is in external assembly with name <dottedname>. |
6.3 |
|
| .custom <customDecl> |
Custom attribute. |
20 |
|
| .file <dottedname> at <int32> |
Manifest resource is in file <dottedname> at byte offset <int32>. |
|
For a resource stored in a file that is not a module (for example, an attached text file), the file shall be declared in the manifest using a separate (top-level) .file declaration (see clause 6.2.3) and the byte offset shall be zero. Similarly, a resource that is defined in another assembly is referenced using .assembly extern, which requires that the assembly has been defined in a separate (top-level) .assembly extern directive (see Section 6.3).
Assemblies may be associated with other files, e.g. documentation and other files that are used during execution. The declaration .file is used to add a reference to such a file to the manifest of the assembly (see Section 21.19):
|
<decl> ::= |
Section |
|
.file [nometadata] <filename> .hash = ( <bytes> ) [.entrypoint] |
|
|
| … |
5.10 |
The attribute nometadata is specified if the file is not a module according to this specification. Files that are marked as nometadata may have any format; they are considered pure data files.
The <bytes> after the .hash specify a hash value computed for the file. The VES shall recompute this hash value prior to accessing this file and shall generate an exception if it does not match. The algorithm used to calculate this hash value is specified with .hash algorithm (see clause 6.2.1.1).
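Informative sketch: the check described above amounts to recomputing the hash and comparing it with the bytes recorded in the manifest. The Python below assumes SHA1 (algorithm value 0x8004) computed over the raw contents of the file; the function name and the sample data are hypothetical.

```python
import hashlib

def file_hash_matches(file_contents: bytes, manifest_hash: bytes) -> bool:
    """Recompute the SHA1 hash of a file and compare it with the .hash bytes."""
    return hashlib.sha1(file_contents).digest() == manifest_hash

# Hypothetical file contents and the hash a .hash = ( <bytes> ) entry would hold.
contents = b"example resource data"
recorded = hashlib.sha1(contents).digest()

unchanged_ok = file_hash_matches(contents, recorded)          # file is intact
tampered_ok = file_hash_matches(contents + b"!", recorded)    # file was altered
```

In this sketch a mismatch is reported as a false result; the VES is required to generate an exception instead.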
If specified, the .entrypoint directive indicates that the entrypoint of a multi-module assembly is contained in this file.
Implementation Specific (Microsoft)
If the hash value is not specified, it will be automatically computed by the assembly linker al when an assembly file is created using al. Even though the hash value is optional in the grammar for ilasm, it is required at runtime.
|
<asmRefDecl> ::= .assembly
extern <dottedname> [ as <dottedname> ] |
An assembly mediates all accesses from the files that it contains to other assemblies. This is done through the metadata by requiring that the manifest for the executing assembly contain a declaration for any assembly referenced by the executing code. The syntax .assembly extern as a top-level declaration is used for this purpose. The optional as clause provides an alias which allows ilasm to address external assemblies that have the same name but differ in version, culture, etc.
The dotted name used in .assembly extern shall exactly match the name of the assembly as declared with the .assembly directive, in a case-sensitive manner. (So, even though an assembly might be stored within a file, within a filesystem that is case-blind, the names stored internally within metadata are case-sensitive, and shall match exactly.)
Implementation Specific (Microsoft)
The assembly mscorlib contains many of the types and methods in the Base Class Library. For convenience, ilasm automatically inserts a .assembly extern mscorlib declaration if required.
|
<asmRefDecl> ::= |
Description |
Section |
|
.hash = ( <bytes> ) |
Hash of referenced assembly |
6.2.3 |
|
| .custom <customDecl> |
Custom attributes |
20 |
|
| .culture <QSTRING> |
Culture of the referenced assembly |
6.2.1.2 |
|
| .publickeytoken = ( <bytes> ) |
The low 8 bytes of the SHA1 hash of the originator's public key. |
6.3 |
|
| .publickey = ( <bytes> ) |
The originator’s full public key |
6.2.1.3 |
|
| .ver <int32> : <int32> : <int32> : <int32> |
Major version, minor version, revision, and build |
6.2.1.4 |
These declarations are the same as those for .assembly declarations (clause 6.2.1), except for the addition of .publickeytoken. This declaration is used to store the low 8 bytes of the SHA1 hash of the originator’s public key in the assembly reference, rather than the full public key.
An assembly reference can store either a full public key or an 8-byte “publickeytoken.” Either can be used to validate that the same private key used to sign the assembly at compile time signed the assembly used at runtime. Neither is required to be present, and while both can be stored, this is not useful.
A conforming implementation of the CLI need not perform this validation, but it is permitted to do so, and it may refuse to load an assembly for which the validation fails. A conforming implementation of the CLI may also refuse to permit access to an assembly unless the assembly reference contains either the public key or the public key token. A conforming implementation of the CLI shall make the same access decision independent of whether a public key or a token is used.
Rationale: The full public key is cryptographically safer, but requires more storage space in the assembly reference.
Example (informative):
.assembly extern MyComponents
{ .publickey = (BB AA BB EE 11 22 33 00)
.hash = (2A 71 E9 47 F5 15 E6 07 35 E4 CB E3 B4 A1 D3 7F 7F A0 9C 24)
.ver 2:10:2002:0
}
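Informative sketch: a .publickeytoken value can be derived from a full public key as described above. The Python below reads “low 8 bytes” as the final 8 bytes of the 20-byte SHA1 digest; that reading, and the function name, are assumptions, and an implementation may select or order the bytes differently. The key bytes are the illustrative ones from the example above; real public keys are considerably longer.

```python
import hashlib

def public_key_token(public_key: bytes) -> bytes:
    """Return an 8-byte token derived from the SHA1 hash of a public key.

    Assumption: the token is taken to be the last 8 bytes of the digest.
    """
    digest = hashlib.sha1(public_key).digest()  # 20 bytes
    return digest[-8:]

key = bytes.fromhex("BBAABBEE11223300")  # illustrative key bytes only
token = public_key_token(key)
```

Because the token is a truncated hash, it takes less space in the assembly reference than the full key, which is the trade-off noted in the rationale.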
All CIL files are modules and are referenced by a logical name carried in the metadata rather than their file name. See Section 21.16.
|
<decl> ::= |
Section |
|
| .module <filename> |
|
|
| … |
5.10 |
Example (informative):
.module CountDown.exe
Implementation Specific (Microsoft)
If the .module directive is missing, ilasm will automatically add a .module directive and set the module name to be the file name, including its extension in capital letters. For example, if the file is called foo and compiled into an exe, the module name will become “Foo.EXE”.
Note that ilasm also generates a required GUID to uniquely identify this instance of the module and emits that into the Mvid metadata field: see clause 21.27.
When an item is in the current assembly but part of a different module than the one containing the manifest, the defining module shall be declared in the manifest of the assembly using the .module extern directive. The name used in the .module extern directive of the referencing assembly shall exactly match the name used in the .module directive (see Section 6.4) of the defining module. See Section 21.28.
|
<decl> ::= |
Section |
|
| .module extern <filename> |
|
|
| … |
5.10 |
Example (informative):
.module extern Counter.dll
Declarations inside a module or assembly are specified by the following grammar. More information on each option can be found in the corresponding section.
|
<decl> ::= |
Section |
|
| .class <classHead> { <classMember>* } |
9 |
|
| .custom <customDecl> |
20 |
|
| .data <datadecl> |
15.3.1 |
|
| .field <fieldDecl> |
15 |
|
| .method <methodHead> { <methodBodyItem>* } |
14 |
|
| <externSourceDecl> |
5.7 |
|
| <securityDecl> |
18 |
|
| … |
|
The manifest module, of which there can only be one per assembly, includes the .assembly statement. To export a type defined in any other module of an assembly requires an entry in the assembly’s manifest. The following grammar is used to construct such an entry in the manifest:
|
<decl> ::= |
Section |
|
.class extern <exportAttr> <dottedname> { <externClassDecl>* } |
|
|
<externClassDecl> ::= |
Section |
|
.file <dottedname> | .class extern <dottedname> | .custom <customDecl> |
20 |
The <exportAttr> value shall be either public or nested public and shall match the visibility of the type.
For example, suppose an assembly consists of two modules A.EXE and B.DLL. A.EXE contains the manifest. A public class “Foo” is defined in B.DLL. In order to export it (that is, to make it visible to, and usable from, other assemblies) a .class extern statement shall be included in A.EXE.
Conversely, a public class “Bar” defined in A.EXE does not need any .class extern statement.
Rationale: Tools should be able to retrieve a single module, the manifest module, to determine the complete set of types defined by the assembly. Therefore, information from other modules within the assembly is replicated in the manifest module. By convention, the manifest module is also known as the assembly.
The metadata provides mechanisms to both define types and reference types. Chapter 9 describes the metadata associated with a type definition, regardless of whether the type is an interface, class or a value type.
The mechanism used to reference types is divided into two parts. The first is the creation of a logical description of user-defined types that are referenced but (typically) not defined in the current module. These are stored in a logical table in the metadata (see Section 21.35).
The second is a signature that encodes one or more type references, along with a variety of modifiers. The grammar non-terminal <type> describes an individual entry in a signature. The encoding of a signature is specified in Section 22.1.15.
The following grammar completely specifies all built-in types including pointer types of the CLI system. It also shows the syntax for user defined types that can be defined in the CLI system:
|
<type> ::= |
Description |
Section |
|
bool |
Boolean |
7.2 |
|
| boxed <typeReference> |
Boxed user-defined value type |
|
|
| char |
16-bit Unicode code point |
7.2 |
|
| class <typeReference> |
User defined reference type. |
7.3 |
|
| float32 |
32-bit floating point number |
7.2 |
|
| float64 |
64-bit floating point number |
7.2 |
|
| int8 |
Signed 8-bit integer |
7.2 |
|
| int16 |
Signed 16-bit integer |
7.2 |
|
| int32 |
Signed 32-bit integer |
7.2 |
|
| int64 |
Signed 64-bit integer |
7.2 |
|
| method <callConv> <type> * ( <parameters> ) |
Method pointer |
13.5 |
|
| native int |
Signed integer whose size varies depending on platform (32- or 64-bit) |
7.2 |
|
| native unsigned int |
Unsigned integer whose size varies depending on platform (32- or 64-bit) |
7.2 |
|
| object |
See System.Object in Partition IV |
|
|
| string |
See System.String in Partition IV |
|
|
| <type> & |
Managed pointer to <type>. <type> shall not be a managed pointer type or typedref |
13.4 |
|
| <type> * |
Unmanaged pointer to <type> |
13.4 |
|
| <type> [ [<bound> [,<bound>]*] ] |
Array of <type> with optional rank (number of dimensions) and bounds. |
13.1 and 13.2 |
|
| <type> modopt ( <typeReference> ) |
Custom modifier that may be ignored by the caller. |
7.1.1 |
|
| <type> modreq ( <typeReference> ) |
Custom modifier that the caller shall understand. |
7.1.1 |
|
| <type> pinned |
For local variables only. The garbage collector shall not move the referenced value. |
7.1.2 |
|
| typedref |
Typed reference, created by mkrefany and used by refanytype or refanyval. |
7.2 |
|
| valuetype <typeReference> |
User defined value type (unboxed) |
12 |
|
| unsigned int8 |
Unsigned 8-bit integers |
7.2 |
|
| unsigned int16 |
Unsigned 16-bit integers |
7.2 |
|
| unsigned int32 |
Unsigned 32-bit integers |
7.2 |
|
| unsigned int64 |
Unsigned 64-bit integers |
7.2 |
|
| void |
No type. Only allowed as a return type or as part of void * |
7.2 |
In several situations the grammar permits the use of a slightly simpler mechanism for specifying types, by just allowing type names (e.g. “System.GC”) to be used instead of the full algebra (e.g. “class System.GC”). These are called type specifications:
|
<typeSpec> ::= |
Section |
|
[ [.module] <dottedname> ] |
7.3 |
|
| <typeReference> |
7.2 |
|
| <type> |
7.1 |
Custom modifiers, defined using modreq (“required modifier”) and modopt (“optional modifier”), are similar to custom attributes (see Chapter 20) except that modifiers are part of a signature rather than attached to a declaration. Each modifier associates a type reference with an item in the signature.
The CLI itself shall treat required and optional modifiers in the same manner. Two signatures that differ only by the addition of a custom modifier (required or optional) shall not be considered to match. Custom modifiers have no other effect on the operation of the VES.
Rationale: The distinction between required and optional modifiers is important to tools other than the CLI that deal with the metadata, typically compilers and program analysers. A required modifier indicates that there is a special semantics to the modified item that should not be ignored, while an optional modifier can simply be ignored.
For example, the concept of const in the C programming language can be modelled with an optional modifier since the caller of a method that has a constant parameter need not treat it in any special way. On the other hand, a parameter that shall be copy constructed in C++ shall be marked with a required custom modifier since it is the caller who makes the copy.
The signature encoding for pinned shall appear only in signatures that describe local variables (see clause 14.4.1.3). While a method with a pinned local variable is executing the VES shall not relocate the object to which the local refers. That is, if the implementation of the CLI uses a garbage collector that moves objects, the collector shall not move objects that are referenced by an active pinned local variable.
Rationale: If unmanaged pointers are used to dereference managed objects, these objects shall be pinned. This happens, for example, when a managed object is passed to a method designed to operate with unmanaged data.
The CLI built-in types have corresponding value types defined in the Base Class Library. They shall be referenced in signatures only using their special encodings (i.e. not using the general purpose valuetype <typeReference> syntax). Partition I specifies the built-in types.
User-defined types are referenced either using their full name and a resolution scope or (if one is available in the same module) a type definition (see Chapter 9).
A <typeReference> is used to capture the full name and resolution scope.
|
<typeReference> ::= |
|
[<resolutionScope>] <dottedname> [/ <dottedname>]* |
|
<resolutionScope> ::= |
|
[ .module <filename> ] |
|
| [ <assemblyRefName> ] |
|
<assemblyRefName> ::= |
Section |
|
<dottedname> |
5.1 |
The following resolution scopes are specified for un-nested types:
· Current module (and, hence, assembly). This is the most common case and is the default if no resolution scope is specified. The type shall be resolved to a definition only if the definition occurs in the same module as the reference.
Note: A type reference that refers to a type in the same module and assembly is better represented using a type definition. Where this is not possible (for example, when referencing a nested type that has compiler-controlled accessibility) or convenient (for example, in some one-pass compilers) a type reference is equivalent and may be used.
· Different module, current assembly. The resolution scope shall be a module reference syntactically represented using the notation [.module <filename>]. The type shall be resolved to a definition only if the referenced module (see Section 6.4) and type (see Section 6.7) have been declared by the current assembly and hence have entries in the assembly’s manifest. Note that in this case the manifest is not physically stored with the referencing module.
· Different assembly. The resolution scope shall be an assembly reference syntactically represented using the notation [<assemblyRefName>]. The referenced assembly shall be declared in the manifest for the current assembly (see Section 6.3), the type shall be declared in the referenced assembly’s manifest, and the type shall be marked as exported from that assembly (see Section 6.7 and clause 9.1.1).
· For nested types, the resolution scope is always the enclosing type (see Section 9.6). This is indicated syntactically by using a slash (“/”) to separate the enclosing type name from the nested type’s name.
Example (informative):
The proper way to refer to a type defined in the base class library. The name of the type is System.Console and it is found in the assembly named mscorlib.
.assembly extern mscorlib { }
.class [mscorlib]System.Console
A reference to the type named C.D in the module named x in the current assembly.
.module extern x
.class [.module x]C.D
A reference to the type named C nested inside of the type named Foo.Bar in another assembly, named MyAssembly.
.assembly extern MyAssembly { }
.class [MyAssembly]Foo.Bar/C
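Informative sketch: the textual form of a type reference can be split into its resolution scope and type names. The Python below is a simplified illustration rather than a full implementation of the grammar; the TypeRef record, its field names, and the scope labels are hypothetical.

```python
import re
from typing import List, NamedTuple, Optional

class TypeRef(NamedTuple):
    scope_kind: str            # "current module", "module", or "assembly"
    scope_name: Optional[str]  # module file name or assembly name, if any
    names: List[str]           # outermost type first, then nested types

# Matches a leading "[name]" or "[.module name]" resolution scope.
_SCOPE = re.compile(r"^\[\s*(\.module\s+)?([^\]\s]+)\s*\]")

def parse_type_reference(text: str) -> TypeRef:
    """Split '[scope]Outer/Inner' into its resolution scope and type names."""
    match = _SCOPE.match(text)
    if match is None:
        # No explicit scope: the default is the current module.
        return TypeRef("current module", None, text.split("/"))
    kind = "module" if match.group(1) else "assembly"
    return TypeRef(kind, match.group(2), text[match.end():].split("/"))
```

Applied to the three references in the example above, the sketch yields an assembly scope for [mscorlib]System.Console, a module scope for [.module x]C.D, and the names Foo.Bar and C (enclosing type first) for [MyAssembly]Foo.Bar/C.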
Some implementations of the CLI will be hosted on top of existing operating systems or runtime platforms that specify data types required to perform certain functions. The metadata allows interaction with these native data types by specifying how the built-in and user-defined types of the CLI are to be marshalled to and from native data types. This marshalling information can be specified (using the keyword marshal) for
· the return type of a method, indicating that a native data type is actually returned and shall be marshalled back into the specified CLI data type
· a parameter to a method, indicating that the CLI data type provided by the caller shall be marshalled into the specified native data type (if the parameter is passed by reference the updated value shall be marshalled back from the native data type into the CLI data type when the call is completed)
· a field of a user-defined type, indicating that any attempt to pass the object in which it occurs to platform methods shall make a copy of the object, replacing the field by the specified native data type (if the object is passed by reference then the updated value shall be marshalled back when the call is completed)
The following table lists all native types supported by the CLI and provides a description for each of them. A more complete description can be found in Partition IV in the definition of the enum System.Runtime.InteropServices.UnmanagedType, which provides the actual values used to encode the types. All encoding values from 0 through 63 are reserved for backward compatibility with existing implementations of the CLI. Values 64 through 127 are reserved for future use in this and related Standards.
|
<nativeType> ::= |
Description |
Name in UnmanagedType |
|
[ ] |
Native array. Type and size are determined at runtime from the actual marshaled array. |
LPArray |
|
| bool |
Boolean. 4-byte integer value where a non-zero value represents TRUE and 0 represents FALSE. |
Bool |
|
| float32 |
32-bit floating point number. |
FLOAT32 |
|
| float64 |
64-bit floating point number. |
FLOAT64 |
|
| [unsigned] int |
Signed or unsigned integer, sized to hold a pointer on the platform |
SysUInt or SysInt |
|
| [unsigned] int8 |
Signed or unsigned 8-bit integer |
unsigned int8 or int8 |
|
| [unsigned] int16 |
Signed or unsigned 16-bit integer |
unsigned int16 or int16 |
|
| [unsigned] int32 |
Signed or unsigned 32-bit integer |
unsigned int32 or int32 |
|
| [unsigned] int64 |
Signed or unsigned 64-bit integer |
unsigned int64 or int64 |
|
| lpstr |
A pointer to a null terminated array of ANSI characters. Code page is implementation specific. |
LPStr |
|
| lptstr |
A pointer to a null terminated array of platform characters (ANSI or Unicode). Code page and character encoding are implementation specific. |
LPTStr |
|
| lpvoid |
An untyped pointer, platform specifies size. |
LPVoid |
|
| lpwstr |
A pointer to a null terminated array of Unicode characters. Character encoding is implementation specific. |
LPWStr |
|
| method |
A function pointer. |
FunctionPtr |
|
| <nativeType> [ ] |
Array of <nativeType>. The length is determined at runtime by the size of the actual marshaled array. |
LPArray |
|
| <nativeType> [ <int32> ] |
Array of <nativeType> of length <int32>. |
LPArray |
|
| <nativeType> [ + <int32> ] |
Array of <nativeType> with runtime supplied element size. The int32 specifies a parameter to the current method (counting from parameter number 0) that, at runtime, will contain the size of an element of the array in bytes. Can only be applied to methods, not fields. |
LPArray |
|
| <nativeType> [ <int32> + <int32> ] |
Array of <nativeType> with runtime supplied element size. The first int32 specifies the number of elements in the array. The second int32 specifies which parameter to the current method (counting from parameter number 1) will specify the additional number of elements in the array. Can only be applied to methods, not fields. |
LPArray |
Implementation Specific (Microsoft)
The Microsoft implementation supports a richer set of types to describe marshalling between Windows native types and COM. These additional options are listed in the following table:
|
<nativeType> ::= |
Description |
Name in UnmanagedType |
|
| as any |
Determines the type of an object at runtime and marshals the Object as that type. |
AsAny |
|
| byvalstr |
A string in a fixed length buffer. |
VBByRefStr |
|
| custom ( <QSTRING>, <QSTRING> ) |
Custom marshaler. The 1st string is the name of the marshalling class, using the string conventions of Reflection.Emit to specify the assembly and/or module. The 2nd is an arbitrary string passed to the marshaller at runtime to identify the form of marshalling required. |
CustomMarshaler |
|
| fixed array [ <int32> ] |
A fixed size array of length <int32> bytes |
ByValArray |
|
| fixed sysstring [ <int32> ] |
A fixed size system string of length <int32>. This can only be applied to fields, and a separate attribute specifies the encoding of the string. |
ByValTStr |
|
| lpstruct |
A pointer to a C-style structure. Used to marshal managed formatted types. |
LPStruct |
|
| struct |
A C-style structure, used to marshal managed formatted types. |
Struct |
Example (informative):
.method int32 M1( int32 marshal(int32), bool[] marshal(bool[5]) )
Method M1 takes two arguments: an int32, and an array of 5 bools
++++++++++
.method int32 M2( int32 marshal(int32), bool[] marshal(bool[+1]) )
Method M2 takes two arguments: an int32, and an array of bools: the number of elements in that array is given by the value of the first parameter
++++++++++
.method int32 M3( int32 marshal(int32), bool[] marshal(bool[7+1]) )
Method M3 takes two arguments: an int32, and an array of bools: the number of elements in that array is given as 7 plus the value of the first parameter
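Informative sketch: the element counts described for M1, M2, and M3 can be computed as a fixed part plus the runtime value of a designated parameter. The Python below follows the examples in numbering parameters from 1; that numbering, the function name, and the argument value 3 are assumptions made for illustration.

```python
from typing import Optional, Sequence

def marshalled_element_count(fixed: int, size_param: Optional[int],
                             args: Sequence[int]) -> int:
    """Return the number of array elements to marshal.

    fixed      -- the constant element count (0 if none is given)
    size_param -- 1-based index of the parameter holding the additional
                  element count, or None if the size is purely fixed
    args       -- the runtime values of the method's integer parameters
    """
    count = fixed
    if size_param is not None:
        count += args[size_param - 1]  # 1-based, following the examples
    return count

# With the first argument equal to 3 at runtime:
m1 = marshalled_element_count(5, None, [3])  # bool[5]
m2 = marshalled_element_count(0, 1, [3])     # bool[+1]
m3 = marshalled_element_count(7, 1, [3])     # bool[7+1]
```

Under these assumptions M1 always marshals 5 elements, while M2 and M3 marshal 3 and 10 elements respectively when the first argument is 3.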
Partition I specifies visibility and accessibility. In addition to these attributes, the metadata stores information about method name hiding. Hiding controls which method names inherited from a base type are available for compile-time name binding.
Visibility is attached only to top-level types, and there are only two possibilities: visible to types within the same assembly, or visible to types regardless of assembly. For nested types (i.e. types that are members of another type) the nested type has an accessibility that further refines the set of methods that can reference the type. A nested type may have any of the 7 accessibility modes (see Partition I), but has no direct visibility attribute of its own, using the visibility of its enclosing type instead.
Because the visibility of a top-level type controls the visibility of the names of all of its members, a nested type cannot be more visible than the type in which it is nested. That is, if the enclosing type is visible only within an assembly then a nested type with public accessibility is still only available within the assembly. By contrast, a nested type that has assembly accessibility is restricted to use within the assembly even if the enclosing type is visible outside the assembly.
To make the encoding of all types consistent and compact, the visibility of a top-level type and the accessibility of a nested type are encoded using the same mechanism in the logical model of clause 22.1.14.
Accessibility is encoded directly in the metadata. See, for example, clause 21.24.
Hiding is a compile-time concept that applies to individual methods of a type. The CTS specifies two mechanisms for hiding, specified by a single bit:
· hide-by-name, meaning that the introduction of a name in a given type hides all inherited members of the same kind (method or field) with the same name.
· hide-by-name-and-sig, meaning that the introduction of a name in a given type hides any inherited member of the same kind but with precisely the same type (for fields) or signature (for methods, properties, and events).
There is no runtime support for hiding. A conforming implementation of the CLI treats all references as though the names were marked hide-by-name-and-sig. Compilers that desire the effect of hide-by-name can do so by marking method definitions with the newslot attribute (see clause 14.4.2.3) and correctly choosing the type used to resolve a method reference (see clause 14.1.3).
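Informative sketch: the two hiding rules differ only in whether the signature takes part in the comparison. The Python below models what a compiler might do during name binding; the Member record, its fields, and the function name are hypothetical, and signatures are treated as opaque text.

```python
from typing import List, NamedTuple

class Member(NamedTuple):
    kind: str       # "method" or "field"
    name: str
    signature: str  # field type or method signature, as opaque text

def hidden_members(inherited: List[Member], introduced: Member,
                   hide_by_sig: bool) -> List[Member]:
    """Return the inherited members hidden by a newly introduced member."""
    same_name = [m for m in inherited
                 if m.kind == introduced.kind and m.name == introduced.name]
    if not hide_by_sig:
        return same_name  # hide-by-name: every same-kind, same-name member
    # hide-by-name-and-sig: the signature must match as well
    return [m for m in same_name if m.signature == introduced.signature]
```

Given inherited methods M(int32) and M(string), introducing M(int32) hides both under hide-by-name but only M(int32) under hide-by-name-and-sig; an inherited field named M is unaffected in either case because it is a different kind of member.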
Types (i.e., classes, value types, and interfaces) may be defined at the top-level of a module:
|
<decl> ::= |
Section |
|
.class <classHead> { <classMember>* } |
9 |
|
| … |
|
The logical metadata table created by this declaration is specified in Section 21.34.
Rationale: For historical reasons, many of the syntactic classes used for defining types incorrectly use “class” instead of “type” in their name. All classes are types, but “types” is a broader term encompassing value types and interfaces.
A type header consists of
· any number of type attributes
· a name (an <id>)
· a base type (or parent type), which defaults to [mscorlib]System.Object
· an optional list of interfaces whose contract this type and all its descendent types shall satisfy
|
<classHead> ::= |
|
<classAttr>* <id> [extends <typeReference>] [implements <typeReference> [, <typeReference>]*] |
The extends keyword defines the base type of a type. A type shall extend from exactly one other type. If no type is specified, ilasm will add an extends clause to make the type inherit from System.Object.
The implements keyword defines the interfaces of a type. By listing an interface here, a type declares that all of its concrete implementations will support the contract of that interface, including providing implementations of any virtual methods the interface declares. See also Chapter 10 and Chapter 11.
Example (informative):
.class private auto autochar CounterTextBox
extends [System.Windows.Forms]System.Windows.Forms.TextBox
implements [.module Counter]CountDisplay
{ // body of the class
}
This code declares the class CounterTextBox, which extends the class System.Windows.Forms.TextBox in the assembly System.Windows.Forms and implements the interface CountDisplay in the module Counter of the current assembly. The attributes private, auto and autochar are described in the following sections.
A type can have any number of custom attributes attached. Custom attributes are attached as described in Chapter 20. The other (predefined) attributes of a type may be grouped into attributes that specify visibility, type layout information, type semantics information, inheritance rules, interoperation information, and information on special handling. The following subsections provide additional information on each group of predefined attributes.
|
<classAttr> ::= |
Description |
Section |
|
abstract |
Type is abstract. |
9.1.4 |
|
| ansi |
Marshal strings to platform as ANSI. |
9.1.5 |
|
| auto |
Auto layout of type. |
9.1.2 |
|
| autochar |
Marshal strings to platform based on platform. |
9.1.5 |
|
| beforefieldinit |
Calling static methods does not initialize type. |
9.1.6 |
|
| explicit |
Layout of fields is provided explicitly. |
9.1.2 |
|
| interface |
Interface declaration. |
9.1.3 |
|
| nested assembly |
Assembly accessibility for nested type. |
9.1.1 |
|
| nested famandassem |
Family and Assembly accessibility for nested type. |
9.1.1 |
|
| nested family |
Family accessibility for nested type. |
9.1.1 |
|
| nested famorassem |
Family or Assembly accessibility for nested type. |
9.1.1 |
|
| nested private |
Private accessibility for nested type. |
9.1.1 |
|
| nested public |
Public accessibility for nested type. |
9.1.1 |
|
| private |
Private visibility of top-level type. |
9.1.1 |
|
| public |
Public visibility of top-level type. |
9.1.1 |
|
| rtspecialname |
Special treatment by runtime. |
9.1.6 |
|
| sealed |
The type cannot be subclassed. |
9.1.4 |
|
| sequential |
The type is laid out sequentially. |
9.1.2 |
|
| serializable |
Type may be serialized. |
9.1.6 |
|
| specialname |
Special treatment by tools. |
9.1.6 |
|
| unicode |
Marshal strings to platform as Unicode. |
9.1.5 |
Implementation Specific (Microsoft)
The above grammar also includes
<classAttr> ::= import
to indicate that the type is imported from a COM type library
|
<classAttr> ::= … |
|
| nested assembly |
|
| nested famandassem |
|
| nested family |
|
| nested famorassem |
|
| nested private |
|
| nested public |
|
| private |
|
| public |
See Partition I. A type that is not nested inside another shall have exactly one visibility (private or public) and shall not have an accessibility. Nested types shall have no visibility, but instead shall have exactly one of the accessibility attributes (nested assembly, nested famandassem, nested family, nested famorassem, nested private, or nested public). The default visibility for top-level types is private. The default accessibility for nested types is nested private.
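Informative sketch: the rules above can be checked mechanically when processing the attribute list of a type. The Python below is illustrative only; the function name and error messages are hypothetical, and attributes are represented as plain strings.

```python
from typing import List

VISIBILITY = {"private", "public"}
ACCESSIBILITY = {"nested assembly", "nested famandassem", "nested family",
                 "nested famorassem", "nested private", "nested public"}

def resolve_visibility(attrs: List[str], is_nested: bool) -> str:
    """Return the single visibility (top-level) or accessibility (nested) attribute."""
    if is_nested:
        allowed, other, default = ACCESSIBILITY, VISIBILITY, "nested private"
    else:
        allowed, other, default = VISIBILITY, ACCESSIBILITY, "private"
    if any(a in other for a in attrs):
        raise ValueError("attribute not permitted for this kind of type")
    chosen = [a for a in attrs if a in allowed]
    if len(chosen) > 1:
        raise ValueError("more than one visibility or accessibility attribute")
    return chosen[0] if chosen else default  # apply the default when none is given
```

Attributes unrelated to visibility (for example auto or ansi) are ignored by this sketch, so a top-level type declared with only those attributes resolves to private.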
|
<classAttr> ::= … |
|
| auto |
|
| explicit |
|
| sequential |
The type layout specifies how the fields of an instance of a type are arranged. A given type shall have only one layout attribute specified. By convention, ilasm supplies auto if no layout attribute is specified.
auto: the layout shall be done by the CLI, with no user-supplied constraints.
explicit: the layout of the fields is explicitly provided (see Section 9.7).
sequential: the CLI shall lay out the fields in sequential order, based on the order of the fields in the logical metadata table (see Section 21.15).
Rationale: The default auto layout should provide the best layout for the platform on which the code is executing. sequential layout is intended to instruct the CLI to match layout rules commonly followed by languages like C and C++ on an individual platform, where this is possible while still guaranteeing verifiable layout. explicit layout allows the CIL generator to specify the precise layout semantics; specific rules govern which explicit layouts are verifiable.
<classAttr> ::= …
  | interface
The type semantic attributes specify whether an interface, class, or value type shall be defined. The interface attribute specifies an interface. If this attribute is not present and the definition extends (directly or indirectly) System.ValueType a value type shall be defined (see Chapter 12). Otherwise, a class shall be defined (see Chapter 10).
Note that the runtime size of a value type shall not exceed 1 MByte (0x100000 bytes).
Implementation Specific (Microsoft)
The current implementation allows 0x3F0000 bytes, but this limit may be reduced in the future.
<classAttr> ::= …
  | abstract
  | sealed
Attributes that specify special semantics are abstract and sealed. These attributes may be used together.
abstract specifies that this type shall not be instantiated. If a type contains abstract methods, the type shall be declared as an abstract type.
sealed specifies that a type shall not have subclasses. All value types shall be sealed.
Rationale: Virtual methods of sealed types are effectively instance methods, since they cannot be overridden. Framework authors should use sealed classes sparingly since they do not provide a convenient building block for user extensibility. Sealed classes may be necessary when the implementation of a set of virtual methods for a single class (typically inherited from different interfaces) becomes interdependent or depends critically on implementation details not visible to potential subclasses.
A type that is both abstract and sealed should have only static members, and serves as what some languages call a namespace.
<classAttr> ::= …
  | ansi
  | autochar
  | unicode
These attributes are for interoperation with unmanaged code. They specify the default behavior to be used when calling a method (static, instance, or virtual) on the class that has an argument or return type of System.String and does not itself specify marshalling behavior. Only one value shall be specified for any type, and the default value is ansi.
ansi specifies that marshalling shall be to and from ANSI strings.
unicode specifies that marshalling shall be to and from Unicode strings.
autochar specifies either ANSI or Unicode behavior, depending on the platform on which the CLI is running.
<classAttr> ::= …
  | beforefieldinit
  | serializable
  | specialname
  | rtspecialname
These attributes may be combined in any way.
beforefieldinit instructs the CLI that it need not initialize the type before a static method is called. See clause 9.5.3.
Implementation Specific (Microsoft)
serializable indicates that the fields of the type may be serialized into a data stream by the CLI serializer. See Partition IV.
specialname indicates that the name of this item may have special significance to tools other than the CLI. See, for example, Partition I.
rtspecialname indicates that the name of this item has special significance to the CLI. There are no currently defined special type names; this is for future use. Any item marked rtspecialname shall also be marked specialname.
Rationale: If an item is treated specially by the CLI, then tools should also be made aware of that. The converse is not true.
A type may contain any number of further declarations. The directives .event, .field, .method, and .property are used to declare members of a type. The directive .class inside a type declaration is used to create a nested type, which is discussed in further detail in Section 9.6.
| <classMember> ::= | Description | Section |
| .class <classHead> { <classMember>* } | Defines a nested type. | 9.6 |
| .custom <customDecl> | Custom attribute. | 20 |
| .data <datadecl> | Defines static data associated with the type. | 15.3 |
| .event <eventHead> { <eventMember>* } | Declares an event. | 17 |
| .field <fieldDecl> | Declares a field belonging to the type. | 15 |
| .method <methodHead> { <methodBodyItem>* } | Declares a method of the type. | 14 |
| .override <typeSpec> :: <methodName> with <callConv> <type> <typeSpec> :: <methodName> ( <parameters> ) | Specifies that the first method is overridden by the definition of the second method. | 9.3.2 |
| .pack <int32> | Used for explicit layout of fields. | 9.7 |
| .property <propHead> { <propMember>* } | Declares a property of the type. | 16 |
| .size <int32> | Used for explicit layout of fields. | 9.7 |
| <externSourceDecl> | .line | 5.7 |
| <securityDecl> | .permission or .capability | 19 |
A virtual method of a base type is overridden by providing a direct implementation of the method (using a method definition, see Section 14.4) and not specifying it to be newslot (see clause 14.4.2.3). An existing method body may also be used to implement a given virtual declaration using the .override directive (see clause 9.3.2).
A virtual method is introduced in the inheritance hierarchy by defining a virtual method (see Section 14.4). The versioning semantics differ depending on whether or not the definition is marked as newslot (see clause 14.4.2.3):
If the definition is marked newslot then the definition always creates a new virtual method, even if a base class provides a matching virtual method. Any reference to the virtual method created before the new virtual function was defined will continue to refer to the original definition.
If the definition is not marked newslot then it creates a new virtual method only if there is no virtual method of the same name and signature inherited from a base class. If the inheritance hierarchy changes so that the definition matches an inherited virtual function the definition will be treated as a new implementation of the inherited function.
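The two cases above can be modelled with a small sketch. It is illustrative only: the names VirtualMethod and build_slots, and the list used as a slot table, are assumptions of this sketch and are not structures defined by this specification. Binding to the most recently introduced matching slot is likewise an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VirtualMethod:
    owner: str             # type that supplies this definition
    name: str
    signature: str
    newslot: bool = False


def build_slots(inherited, defined):
    """Return the virtual slots of a type, given the slots it inherits
    and the virtual methods it defines itself."""
    slots = list(inherited)
    for method in defined:
        target = None
        if not method.newslot:
            # Assumption of this sketch: bind to the most recently
            # introduced inherited slot with the same name and signature.
            for index in range(len(inherited) - 1, -1, -1):
                candidate = inherited[index]
                if (candidate.name, candidate.signature) == (method.name, method.signature):
                    target = index
                    break
        if target is None:
            slots.append(method)      # introduces a new virtual method
        else:
            slots[target] = method    # new implementation of the inherited method
    return slots


base = build_slots([], [VirtualMethod("Base", "M", "void()")])
plain = build_slots(base, [VirtualMethod("Derived", "M", "void()")])
fresh = build_slots(base, [VirtualMethod("Derived", "M", "void()", newslot=True)])
```

In this model, plain has a single slot now implemented by Derived, whereas fresh keeps the original Base slot and adds a second, unrelated slot for the newslot definition.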
The .override directive specifies that a virtual method should be implemented (overridden), in this type, by a virtual method with a different name but with the same signature. It can be used to provide an implementation for a virtual method inherited from a base class or a virtual method specified in an interface implemented by this type. The .override directive specifies a Method Implementation (MethodImpl) in the metadata (see clause 14.1.4).
| <classMember> ::= | Section |
| .override <typeSpec> :: <methodName> with <callConv> <type> <typeSpec> :: <methodName> ( <parameters> ) | |
| … | 9.2 |
The first <typeSpec> :: <methodName> pair specifies the virtual method that is being overridden. It shall reference either an inherited virtual method or a virtual method on an interface that the current type implements. The remaining information specifies the virtual method that provides the implementation.
While the syntax specified here and the actual metadata format (see Section 21.25) allow any virtual method to be used to provide an implementation, a conforming program shall provide a virtual method actually implemented directly on the type containing the .override directive.
Rationale: The metadata is designed to be more expressive than can be expected of all implementations of the VES.
Example (informative):
The following example shows a typical use of the .override directive. A method implementation is provided for a method declared in an interface (see Chapter 11).
.class interface I
{ .method public virtual abstract void m() cil managed {}
}
.class C implements I
{ .method virtual public void m2()
  { // body of m2
  }
  .override I::m with instance void C::m2()
}
The .override directive specifies that the C::m2 body shall be used to implement I::m on objects of class C.
If a type overrides an inherited method, it may widen, but it shall not narrow, the accessibility of that method. As a principle, if a client of a type is allowed to access a method of that type, then it should also be able to access that method (identified by name and signature) in any derived type. Table 7.1 specifies narrow and widen in this context – a “Yes” denotes that the subclass can apply that accessibility, a “No” denotes it is illegal.
Table 7.1: Legal Widening of Access to a Virtual Method
| Subclass accessibility | Base: private | Base: family | Base: assembly | Base: famandassem | Base: famorassem | Base: public |
| private | Yes | No | No | No | No | No |
| family | Yes | Yes | No | No | If not in same assembly | No |
| assembly | Yes | No | Same assembly | No | No | No |
| famandassem | Yes | No | No | Same assembly | No | No |
| famorassem | Yes | Yes | Same assembly | Yes | Same assembly | No |
| public | Yes | Yes | Yes | Yes | Yes | Yes |
Note: A method may be overridden even if it may not be accessed by the subclass.
If a method has assembly accessibility, then it shall have public accessibility if it is being overridden by a method in a different assembly. A similar rule applies to famandassem, where also famorassem is allowed outside the assembly. In both cases assembly or famandassem, respectively, may be used inside the same assembly.
A special rule applies to famorassem, as shown in the table. This is the only case where the accessibility is apparently narrowed by the subclass. A famorassem method may be overridden with family accessibility by a type in another assembly.
Rationale: Because there is no way to specify “family or specific other assembly” it is not possible to specify that the accessibility should be unchanged. To avoid narrowing access, it would be necessary to specify an accessibility of public, which would force widening of access even when it is not desired. As a compromise, the minor narrowing of “family” alone is permitted.
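As an illustration, Table 7.1 can be transcribed into a lookup. This is a sketch only: the function name may_override and the rule codes "same" (same assembly only) and "other" (different assembly only) are assumptions of this sketch. The table entries themselves are copied from Table 7.1.

```python
ACCESS = ("private", "family", "assembly", "famandassem", "famorassem", "public")

# Rows are the accessibility chosen by the subclass; columns follow ACCESS
# and give the accessibility of the method in the base type.
RULES = {
    "private":     ("yes", "no",  "no",   "no",   "no",    "no"),
    "family":      ("yes", "yes", "no",   "no",   "other", "no"),
    "assembly":    ("yes", "no",  "same", "no",   "no",    "no"),
    "famandassem": ("yes", "no",  "no",   "same", "no",    "no"),
    "famorassem":  ("yes", "yes", "same", "yes",  "same",  "no"),
    "public":      ("yes", "yes", "yes",  "yes",  "yes",   "yes"),
}


def may_override(base_access, subclass_access, same_assembly):
    """Return True if Table 7.1 permits the override."""
    rule = RULES[subclass_access][ACCESS.index(base_access)]
    if rule == "same":
        return same_assembly
    if rule == "other":
        return not same_assembly
    return rule == "yes"
```

For example, may_override("famorassem", "family", same_assembly=False) is True, which is the special case described above, while the same call with same_assembly=True is False.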
A type (concrete or abstract) may provide
· implementations for instance, static, and virtual methods that it introduces
· implementations for methods declared in interfaces that it has specified it will implement, or that its base type has specified it will implement
· alternative implementations for virtual methods inherited from its parent
· implementations for virtual methods inherited from an abstract base type that did not provide an implementation
A concrete (i.e. non-abstract) type shall provide either directly or by inheritance an implementation for
· all methods declared by the type itself
· all virtual methods of interfaces implemented by the type
· all virtual methods that the type inherits from its base type
There are three special members, all methods, that can be defined as part of a type: instance constructors, instance finalizers, and type initializers.
Instance constructors initialize an instance of a type. An instance constructor is called when an instance of a type is created by the newobj instruction (see Partition III). Instance constructors shall be instance (not static or virtual) methods; they shall be named .ctor and marked both rtspecialname and specialname (see clause 14.4.2.6). Instance constructors may take parameters, but shall not return a value. Instance constructors may be overloaded (i.e. a type may have several instance constructors). Each instance constructor shall have a unique signature. Unlike other methods, instance constructors may write into fields of the type that are marked with the initonly attribute (see clause 15.1.2).
Example (informative):
The following shows the definition of an instance constructor that does not take any parameters:
.class X {
  .method public rtspecialname specialname instance void .ctor() cil managed
  { .maxstack 1
    // call super constructor
    ldarg.0 // load this pointer
    call instance void [mscorlib]System.Object::.ctor()
    // do other initialization work
    ret
  }
}
The behavior of finalizers is specified in Partition I. The finalize method for a particular type is specified by overriding the virtual method Finalize in System.Object.
Types may contain special methods called type initializers to initialize the type itself.
All types (classes, interfaces, and value types) may have a type initializer. This method shall be static, take no parameters, return no value, be marked with rtspecialname and specialname (see clause 14.4.2.6), and be named .cctor.
Like instance constructors, type initializers may write into static fields of their type that are marked with the initonly attribute (see clause 15.1.2).
Note: Type initializers are often simple methods that initialize the type’s static fields from stored constants or via simple computations. There are, however, no limitations on what code is permitted in a type initializer.
The CLI shall provide the following guarantees regarding type initialization (but see also clause 9.5.3.2 and clause 9.5.3.3):
1. The time at which type initializers are executed is specified in Partition I.
2. A type initializer shall run exactly once for any given type, unless explicitly called by user code
3. No method other than those called directly or indirectly from the type initializer will be able to access members of a type before its initializer completes execution.
A type can be marked with the attribute beforefieldinit (see clause 9.1.6) to indicate that all the guarantees specified in clause 9.5.3.1 are not required. In particular, the final requirement of guarantee 1 need not be provided: the type initializer need not run before a static method is called or referenced.
Rationale: When code can be executed in multiple application domains it becomes particularly expensive to ensure this final guarantee. At the same time, examination of large bodies of managed code has shown that this final guarantee is rarely required, since type initializers are almost always simple methods for initializing static fields. Leaving it up to the CIL generator (and hence, possibly, to the programmer) to decide whether this guarantee is required therefore provides efficiency when it is desired at the cost of consistency guarantees.
In addition to the type initialization guarantees specified in clause 9.5.3.1 the CLI shall ensure two further guarantees for code that is called from a type initializer:
1. Static variables of a type are in a known state prior to any access whatsoever.
2. Type initialization alone shall not create a deadlock unless some code called from a type initializer (directly or indirectly) explicitly invokes blocking operations.
Rationale:
Consider the following two class definitions:
.class public A extends [mscorlib]System.Object
{ .field static public class A a
  .field static public class B b
  .method public static rtspecialname specialname void .cctor ()
  { ldnull // b=null
    stsfld class B A::b
    ldsfld class A B::a // a=B.a
    stsfld class A A::a
    ret
  }
}
.class public B extends [mscorlib]System.Object
{ .field static public class A a
  .field static public class B b
  .method public static rtspecialname specialname void .cctor ()
  { ldnull // a=null
    stsfld class A B::a
    ldsfld class B A::b // b=A.b
    stsfld class B B::b
    ret
  }
}
After loading these two classes, an attempt to reference any of the static fields causes a problem, since the type initializer for each of A and B requires that the type initializer of the other be invoked first. Requiring that no access to a type be permitted until its initializer has completed would create a deadlock situation. Instead, the CLI provides a weaker guarantee: the initializer will have started to run, but it need not have completed. But this alone would allow the full uninitialized state of a type to be visible, which would make it difficult to guarantee repeatable results.
There are similar, but more complex, problems when type initialization takes place in a multi-threaded system. In these cases, for example, two separate threads might start attempting to access static variables of separate types (A and B) and then each would have to wait for the other to complete initialization.
A rough outline of the algorithm is as follows:
1. At class load time (hence prior to initialization time) store zero or null into all static fields of the type.
2. If the type is initialized you are done.
2.1. If the type is not yet initialized, try to take an initialization lock.
2.2. If successful, record this thread as responsible for initializing the type and proceed to step 2.3.
2.2.1. If not, see whether this thread or any thread waiting for this thread to complete already holds the lock.
2.2.2. If so, return since blocking would create a deadlock. This thread will now see an incompletely initialized state for the type, but no deadlock will arise.
2.2.3. If not, block until the type is initialized then return.
2.3. Initialize the parent type and then all interfaces implemented by this type.
2.4. Execute the type initialization code for this type.
2.5. Mark the type as initialized, release the initialization lock, awaken any threads waiting for this type to be initialized, and return.
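The outline above can be approximated by the following single-threaded sketch. It is a simplification under stated assumptions: the initialization lock is reduced to an in_progress flag, waiting threads (step 2.2.3) and interfaces (part of step 2.3) are not modelled, and the names TypeState, Runtime, ensure_initialized, and the integer fields x and y are hypothetical. It shows only how zeroed statics and the early return of step 2.2.2 let two mutually dependent initializers complete.

```python
class TypeState:
    def __init__(self, static_fields, cctor=None, parent=None):
        # Step 1: at load time every static field holds zero.
        self.statics = {name: 0 for name in static_fields}
        self.cctor = cctor            # callable taking the runtime, or None
        self.parent = parent          # name of the parent type, or None
        self.initialized = False
        self.in_progress = False      # stands in for the initialization lock


class Runtime:
    def __init__(self):
        self.types = {}

    def load(self, name, static_fields, cctor=None, parent=None):
        self.types[name] = TypeState(static_fields, cctor, parent)

    def ensure_initialized(self, name):
        state = self.types[name]
        if state.initialized:         # step 2
            return
        if state.in_progress:         # steps 2.2.1 and 2.2.2: do not block
            return
        state.in_progress = True      # steps 2.1 and 2.2
        if state.parent is not None:  # step 2.3 (parent type only)
            self.ensure_initialized(state.parent)
        if state.cctor is not None:   # step 2.4
            state.cctor(self)
        state.initialized = True      # step 2.5
        state.in_progress = False

    def ldsfld(self, type_name, field):
        self.ensure_initialized(type_name)
        return self.types[type_name].statics[field]

    def stsfld(self, type_name, field, value):
        self.ensure_initialized(type_name)
        self.types[type_name].statics[field] = value


def demo():
    rt = Runtime()
    # Hypothetical initializers that depend on each other.
    rt.load("A", ["x"], cctor=lambda r: r.stsfld("A", "x", r.ldsfld("B", "y") + 1))
    rt.load("B", ["y"], cctor=lambda r: r.stsfld("B", "y", r.ldsfld("A", "x") + 1))
    return rt.ldsfld("A", "x"), rt.ldsfld("B", "y")
```

In demo, the initializer of B runs while A is still being initialized, so it observes the zeroed value of A.x rather than blocking. Each initializer still runs exactly once.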
Nested types are specified in Partition I. Interfaces may be nested inside of classes and value types, but classes and value types shall not be nested inside of interfaces. For information about the logical tables associated with nested types, see Section 21.29.
Note: A nested type is not associated with an instance of its enclosing type. The nested type has its own base type and may be instantiated independent of the enclosing type. This means that the instance members of the enclosing type are not accessible using the this pointer of the nested type.
A nested type may access any members of its enclosing type, including private members, as long as the member is static or the nested type has a reference to an instance of the enclosing type. Thus, by using nested types a type may give access to its private members to another type.
Conversely, the enclosing type may not access any private or family members of the nested type. Only members with assembly, famorassem, or public accessibility can be accessed by the enclosing type.
Example (informative):
The following example shows a class declared inside another class. Both classes declare a field. The nested class may access both fields, while the enclosing class does not have access to the field b.
.class private auto autochar CounterTextBox
  extends [System.Windows.Forms]System.Windows.Forms.TextBox
  implements [.module Counter]ICountDisplay
{ .field static private int32 a
  /* Nested class. Declares the NonPositiveNumberException */
  .class nested assembly NonPositiveNumberException extends [mscorlib]System.Exception
  { .field static private int32 b
    // body of nested class
  } // end of nested class NonPositiveNumberException
}
The CLI supports both sequential and explicit layout control, see clause 9.1.2. For explicit layout it is also necessary to specify the precise layout of an instance, see also Section 21.18 and Section 21.16.
<fieldDecl> ::=
    [ '[' <int32> ']' ] <fieldAttr>* <type> <id>
The optional int32 specified in brackets at the beginning of the declaration specifies the byte offset from the beginning of the instance of the type. This form of explicit layout control shall not be used with global fields specified using the at notation (see clause 15.3.2).
Offset values shall be 0 or greater; they cannot be negative. It is possible to overlap fields in this way, even though it is not recommended. The field may be accessed using pointer arithmetic and ldind to load the field indirectly or stind to store the field indirectly (see Partition III). See Section 21.18 and Section 21.16 for encoding of this information. For explicit layout, every field shall be assigned an offset.
The .pack directive specifies that fields should be placed within the runtime object at addresses which are a multiple of the specified number, or at natural alignment for that field type, whichever is smaller. For example, .pack 2 would allow 32-bit-wide fields to be started on even addresses, whereas without any .pack directive they would be naturally aligned, that is to say, placed on addresses that are a multiple of 4. The integer following .pack shall be one of 0, 1, 2, 4, 8, 16, 32, 64, or 128. (A value of zero indicates that the pack size used should match the default for the current platform.) The .pack directive shall not be supplied for any type with explicit layout control.
The directive .size specifies that a memory block of the specified number of bytes shall be allocated for an instance of the type. For example, .size 32 would create a block of 32 bytes for the instance. The value specified shall be greater than or equal to the calculated size of the class, based upon its field sizes and any .pack directive. Note that if this directive applies to a value type, then the size shall be less than 1 MByte.
Note: Metadata that controls instance layout is not a “hint,” it is an integral part of the VES that shall be supported by all conforming implementations of the CLI.
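The effect of .pack on sequential layout can be sketched as follows. This is an illustration under assumptions, not a normative algorithm: the function name sequential_offsets is hypothetical, natural alignment is taken to equal the field size for the integer types listed, and the platform default used for a pack value of 0 is an arbitrary parameter of the sketch.

```python
FIELD_SIZES = {"int8": 1, "int16": 2, "int32": 4, "int64": 8}
LEGAL_PACK = {0, 1, 2, 4, 8, 16, 32, 64, 128}


def sequential_offsets(fields, pack=None, platform_default=8):
    """Return {field name: byte offset} for fields laid out in order.

    pack=None models the absence of a .pack directive (natural alignment).
    """
    if pack is not None and pack not in LEGAL_PACK:
        raise ValueError("illegal .pack value: %r" % (pack,))
    if pack == 0:
        pack = platform_default       # assumption: stand-in for the platform default
    offsets = {}
    position = 0
    for name, field_type in fields:
        size = FIELD_SIZES[field_type]
        # Align to the pack size or the natural alignment, whichever is smaller.
        alignment = size if pack is None else min(pack, size)
        position = (position + alignment - 1) // alignment * alignment
        offsets[name] = position
        position += size
    return offsets
```

With fields [("a", "int8"), ("b", "int32")], a pack value of 2 places b at offset 2, whereas natural alignment places it at offset 4.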
Example (informative):
The following class uses sequential layout of its fields:
.class sequential public SequentialClass
{ .field public int32 a // store at offset 0 bytes
.field public int32 b // store at offset 4 bytes
}
The following class uses explicit layout of its fields:
.class explicit public ExplicitClass
{ .field [0] public int32 a // store at offset 0 bytes
.field [6] public int32 b // store at offset 6 bytes
}
The following value type uses .pack to pack its fields together:
.class value sealed public MyClass extends [mscorlib]System.ValueType
{ .pack 2
.field public int8 a // store at offset 0 bytes
.field public int32 b // store at offset 2 bytes (not 4)
}
The following class specifies a contiguous block of 16 bytes:
.class public BlobClass
{ .size 16
}
In addition to types with static members, many languages have the notion of data and methods that are not part of a type at all. These are referred to as global fields and methods.
It is simplest to understand global fields and methods in the CLI by imagining that they are simply members of an invisible abstract public class. In fact, the CLI defines such a special class, named '<Module>', that does not have a base type and does not implement any interfaces. The only noticeable difference is in how definitions of this special class are treated when multiple modules are combined together, as is done by a class loader. This process is known as metadata merging.
For an ordinary type, if the metadata merges two definitions of the same type, it simply discards one definition on the assumption they are equivalent and that any anomaly will be discovered when the type is used. For the special class that holds global members, however, members are unioned across all modules at merge time. If the same name appears to be defined for cross-module use in multiple modules then there is an error. In detail:
· If no member of the same kind (field or method), name, and signature exists, then add this member to the output class.
· If there are duplicates and no more than one has an accessibility other than compilercontrolled, then add them all to the output class.
· If there are duplicates and two or more have an accessibility other than compilercontrolled, an error has occurred.
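The three merge rules above can be sketched as follows. The function name merge_globals and the tuple representation of a member are assumptions of this sketch, not part of the metadata format.

```python
def merge_globals(modules):
    """Merge global members from several modules.

    Each module is a list of (kind, name, signature, accessibility) tuples,
    where kind is "field" or "method".
    """
    groups = {}
    for members in modules:
        for kind, name, signature, accessibility in members:
            groups.setdefault((kind, name, signature), []).append(accessibility)

    merged = []
    for (kind, name, signature), accesses in groups.items():
        visible = [a for a in accesses if a != "compilercontrolled"]
        if len(visible) >= 2:
            # Two or more duplicates are usable across modules: an error.
            raise ValueError("duplicate global %s %s" % (kind, name))
        # A unique member, or duplicates of which at most one is not
        # compilercontrolled: all of them go to the output class.
        merged.extend((kind, name, signature, a) for a in accesses)
    return merged
```

Two modules that each define a compilercontrolled helper with the same name and signature merge without error, while two modules that each define a public field x of the same type do not.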
Classes, as specified in Partition I, define types in an inheritance hierarchy. A class (except for the built-in class System.Object) shall declare exactly one parent class. A class shall declare zero or more interfaces that it implements (see Chapter 11). A concrete class may be instantiated to create an object, but an abstract class (see clause 9.1.4) shall not be instantiated. A class may define fields (static or instance), methods (static, instance, or virtual), events, properties, and nested types (classes, value types, or interfaces).
Instances of a class (objects) are created only by explicitly using the newobj instruction (see Partition III). When a variable or field that has a class as its type is created (for example, by calling a method that has a local variable of a class type) the value shall initially be null, a special value that is assignment compatible with all class types even though it is not an instance of any particular class.
Interfaces, as specified in Partition I, define a contract that other types may implement. Interfaces may have static fields and methods, but they shall not have instance fields or methods. Interfaces may define virtual methods, but only if they are abstract (see Partition I and clause 14.4.2.4).
Rationale: Interfaces cannot define instance fields for the same reason that the CLI does not support multiple inheritance of base types: in the presence of dynamic loading of data types there is no known implementation technique that is both efficient when used and has no cost when not used. By contrast, providing static fields and methods need not affect the layout of instances and therefore does not raise these issues.
Interfaces may be nested inside any type (interface, class, or value type). Classes and value types shall not be nested inside of interfaces.
Classes and value types shall implement zero or more interfaces. Implementing an interface implies that all concrete instances of the class or value type shall provide an implementation for each abstract virtual method declared in the interface. In order to implement an interface, a class or value type shall either explicitly declare that it does so (using the implements attribute in its type definition, see Section 9.1) or shall be derived from a base class that implements the interface.
Note: An abstract class (since it cannot be instantiated) need not provide implementations of the virtual methods of interfaces it implements, but any concrete class derived from it shall provide the implementation.
Merely providing implementations for all of the abstract methods of an interface is not sufficient to have a type implement that interface. Conceptually, this reflects the fact that an interface represents a contract that may have more requirements than are captured in the set of abstract methods. From an implementation point of view, this allows the layout of types to be constrained only by those interfaces that are explicitly declared.
Interfaces shall declare that they require the implementation of zero or more other interfaces. If one interface, A, declares that it requires the implementation of another interface, B, then A implicitly declares that it requires the implementation of all interfaces required by B. If a class or value type declares that it implements A, then all concrete instances shall provide implementations of the virtual methods declared in A and all of the interfaces A requires.
Example (informative):
The following class implements the interface IStartStopEventSource defined in the module Counter.
.class private auto autochar StartStopButton
extends [System.Windows.Forms]System.Windows.Forms.Button
implements [.module Counter]IStartStopEventSource
{ // body of class
}
Classes that implement an interface (see Section 11.1) are required to provide implementations for the abstract virtual methods defined by the interface. There are three mechanisms for providing this implementation:
· directly specifying an implementation, using the same name and signature as appears in the interface
· inheritance of an existing implementation from the base type
· use of an explicit MethodImpl (see clause 14.1.4).
The Virtual Execution System shall determine the appropriate implementation of a virtual method to be used for an interface abstract method using the following algorithm.
· If the parent class implements the interface, start with the same virtual methods that it provides, otherwise create an interface that has empty slots for all virtual functions.
· If this class explicitly specifies that it implements the interface
o if the class defines any public virtual newslot functions whose name and signature match a virtual method on the interface, then use these new virtual methods to implement the corresponding interface method.
· If there are any virtual methods in the interface that still have empty slots, see if there are any public virtual methods available on this class (directly or inherited) and use these to implement the corresponding methods on the interface.
· Apply all MethodImpls that are specified for this class, thereby placing explicitly specified virtual methods into the interface in preference to those inherited or chosen by name matching.
· If the current class is not abstract and there are any interface methods that still have empty slots, then the program is not valid.
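The steps above can be sketched as follows. This is an illustrative model only: the names Method and resolve_interface, and the dictionary keyed by (name, signature) that stands in for the interface's slots, are assumptions of this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Method:
    owner: str
    name: str
    signature: str
    public: bool = True
    virtual: bool = True
    newslot: bool = False

    @property
    def key(self):
        return (self.name, self.signature)


def resolve_interface(interface_methods, parent_map, declares_interface,
                      own_methods, inherited_methods, method_impls, is_abstract):
    """Map each interface method key to the class method that implements it."""
    # Start from the parent's mapping, or from empty slots.
    slots = {key: None for key in interface_methods}
    if parent_map is not None:
        slots.update(parent_map)
    # Public virtual newslot methods of a class that declares the interface.
    if declares_interface:
        for method in own_methods:
            if method.public and method.virtual and method.newslot and method.key in slots:
                slots[method.key] = method
    # Fill the remaining empty slots by name and signature.
    for key in list(slots):
        if slots[key] is None:
            for method in list(own_methods) + list(inherited_methods):
                if method.public and method.virtual and method.key == key:
                    slots[key] = method
                    break
    # Explicit MethodImpls take preference over all earlier choices.
    for key, method in method_impls.items():
        slots[key] = method
    if not is_abstract and any(method is None for method in slots.values()):
        raise ValueError("interface method without implementation")
    return slots


I_M = ("m", "void()")
m2 = Method("C", "m2", "void()")
mapping = resolve_interface([I_M], None, True, [m2], [], {I_M: m2}, is_abstract=False)
```

Here mapping associates I::m with C::m2 through the explicit MethodImpl, mirroring the .override example given earlier. Without the MethodImpl entry the call raises an error, because no public virtual method named m is available and C is not abstract.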
Rationale: Interfaces can be thought of as specifying, primarily, a set of virtual methods that shall be implemented by any class that implements the interface. The class specifies a mapping from its own virtual methods to those of the interface. Thus it is virtual methods, not specific implementations of those methods, that are associated with interfaces. Overriding a virtual method on a class with a specific implementation will thus affect not only the virtual method named in the class but also any interface virtual methods to which that same virtual method has been mapped.
In contrast to classes, value types (see Partition I) are not accessed by using a reference but are stored directly in the location of that type.
Rationale: Value types are used to describe the type of small data items. They can be compared to struct (as opposed to pointers to struct) types in C++. Compared to reference types, value types are accessed faster since there is no additional indirection involved. As elements of arrays they do not require allocating memory for the pointers as well as for the data itself. Typical value types are complex numbers, geometric points, or dates.
Like other types, value types may have fields (static or instance), methods (static, instance, or virtual), properties, events, and nested types. A value type may be converted into a corresponding reference type (its boxed form, a class automatically created for this purpose by the VES when a value type is defined) by a process called boxing. A boxed value type may be converted back into its value type representation, the unboxed form, by a process called unboxing. Value types shall be sealed, and they shall have a base type of either System.ValueType or System.Enum (see Partition IV). Value types shall implement zero or more interfaces, but this has meaning only in their boxed form (see Section 12.3).
Unboxed value types are not considered subtypes of another type and it is not valid to use the isinst instruction (see Partition III) on unboxed value types. The isinst instruction may be used for boxed value types. Unboxed value types shall not be assigned the value null and they shall not be compared to null.
Value types support layout control in the same way as reference types do (see Section 9.7). This is especially important when values are imported from native code.
The unboxed form of a value type shall be referred to by using the valuetype keyword followed by a type reference. The boxed form of a value type shall be referred to by using the boxed keyword followed by a type reference.
<valueTypeReference> ::=
    boxed <typeReference>
  | valuetype <typeReference>
Implementation Specific (Microsoft)
For historical reasons “value class” may be used instead of “valuetype” although the latter is preferred. V1 of the CLI does not support direct references to boxed value types; they should be treated as object instead.
Like classes, value types may have both instance constructors (see clause 9.5.1) and type initializers (see clause 9.5.3). Unlike classes that are automatically initialized to null, however, the following rules constitute the only guarantee about the initialization of (unboxed) value types:
· Static variables shall be initialized to zero when a type is loaded (see clause 9.5.3.3), hence statics whose type is a value type are zero-initialized when the type is loaded.
· Local variables shall be initialized to zero if the appropriate bit in the method header (see clause 24.4.4) is set.
· Arrays shall be zero initialized.
· Instances of classes (i.e. objects) shall be zero initialized prior to calling their instance constructor.
Rationale: Guaranteeing automatic initialization of unboxed value types is both difficult and expensive, especially on platforms that support thread-local storage and allow threads to be created outside of the CLI and then passed to the CLI for management.
Note: Boxed value types are classes and follow the rules for classes.
The instruction initobj (see Partition III) performs zero-initialization under program control. If a value type has a constructor, an instance of its unboxed type can be created as is done with classes. The newobj instruction (see Partition III) is used along with the initializer and its parameters to allocate and initialize the instance. The instance of the value type will be allocated on the stack. The Base Class Library provides the method System.Array.Initialize (see Partition IV) to zero all instances in an array of unboxed value types.
Example (informative):
The following code declares and initializes three value type variables. The first variable is zero-initialized, the second is initialized by calling an instance constructor, and the third by creating the object on the stack and storing it into the local.
.assembly Test { }
.assembly extern System.Drawing {
.ver 1:0:3102:0
.publickeytoken = (b03f5f7f11d50a3a)
}
.method public static void Start()
{ .maxstack 3
.entrypoint
.locals init (valuetype [System.Drawing]System.Drawing.Size Zero,
valuetype [System.Drawing]System.Drawing.Size Init,
valuetype [System.Drawing]System.Drawing.Size Store)
// Zero initialize the local named Zero
ldloca Zero // load address of local variable
initobj valuetype [System.Drawing]System.Drawing.Size
// Call the initializer on the local named Init
ldloca Init // load address of local variable
ldc.i4 425 // load argument 1 (width)
ldc.i4 300 // load argument 2 (height)
call instance void [System.Drawing]System.Drawing.Size::.ctor(int32, int32)
// Create a new instance on the stack and store into Store. Note that
// stobj is used here – but one could equally well use stloc, stfld, etc.
ldloca Store
ldc.i4 425 // load argument 1 (width)
ldc.i4 300 // load argument 2 (height)
newobj instance void [System.Drawing]System.Drawing.Size::.ctor(int32, int32)
stobj valuetype [System.Drawing]System.Drawing.Size
ret
}
Value types may have static, instance and virtual methods. static methods of value types are defined and called the same way as static methods of class types. As with classes, both instance and virtual methods of a boxed or unboxed value type may be called using the call instruction. The callvirt instruction shall not be used with unboxed value types, but it may be used on boxed value types.
Instance and virtual methods of classes shall be coded to expect a reference to an instance of the class as the this pointer. By contrast, instance and virtual methods of value types shall be coded to expect a managed pointer (see Partition I) to an unboxed instance of the value type. The CLI shall convert a boxed value type into a managed pointer to the unboxed value type when a boxed value type is passed as the this pointer to a virtual method whose implementation is provided by the unboxed value type.
Note: This operation is the same as unboxing the instance, since the unbox instruction (see Partition III) is defined to return a managed pointer to the value type that shares memory with the original boxed instance.
The following diagrams may help understand the relationship between the boxed and unboxed representations of a value type.


Rationale: An important use of instance methods on value types is to change internal state of the instance. This cannot be done if an instance of the unboxed value type is used for the this pointer, since it would be operating on a copy of the value, not the original value: unboxed value types are copied when they are passed as arguments.
Virtual methods are used to allow multiple types to share implementation code, and this requires that all classes that implement the virtual method share a common representation defined by the class that first introduces the method. Since value types can (and in the Base Class Library do) implement interfaces and virtual methods defined on System.Object, it is important that the virtual method be callable using a boxed value type so it can be manipulated as would any other type that implements the interface. This leads to the requirement that the EE automatically unbox value types on virtual calls.
Table 1: Type of this given CIL instruction and declaring type of instance method.
| Instruction | Value Type (Boxed or Unboxed) | Interface | Class Type |
|---|---|---|---|
| call | managed pointer to value type | illegal | object reference |
| callvirt | managed pointer to value type | object reference | object reference |
Example (informative):
The following converts an integer of the value type int32 into a string. Recall that int32 corresponds to the unboxed value type System.Int32 defined in the Base Class Library. Suppose the integer is declared as:
.locals init (int32 x)
Then the call is made as shown below:
ldloca x // load managed pointer to local variable
call instance string valuetype [mscorlib]System.Int32::ToString()
However, if System.Object (a class) is used as the type reference rather than System.Int32 (a value type), the value of x shall be boxed before the call is made and the code becomes:
ldloc x
box valuetype [mscorlib]System.Int32
callvirt instance string [mscorlib]System.Object::ToString()
Special Types are those that are referenced from CIL, but for which no definition is supplied: the VES supplies the definitions automatically based on information available from the reference.
<type> ::= …
  | <type> [ ]
Vectors are single-dimension arrays with a zero lower bound. They have direct support in CIL instructions (newarr, ldelem, stelem, and ldelema, see Partition III). The CIL Framework also provides methods that deal with multidimensional arrays, or single-dimension arrays with a non-zero lower bound (see Section 13.2). Two vectors are the same type if their element types are the same, regardless of their actual upper bounds.
Vectors have a fixed size and element type, determined when they are created. All CIL instructions shall respect these values. That is, they shall reliably detect attempts to index beyond the end of the vector, attempts to store the incorrect type of data into an element of a vector, and attempts to take addresses of elements of a vector with an incorrect data type. See Partition III.
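Example (informative):
A minimal sketch of the range check described above: a store to the last valid index succeeds, while a load from one position past the end is detected and raises System.IndexOutOfRangeException. The assembly name VectorDemo, the local v, and the label Done are illustrative assumptions.

```il
.assembly extern mscorlib { }
.assembly VectorDemo { }
.method public static void Start()
{ .maxstack 3
  .entrypoint
  .locals init (int32[] v)
  ldc.i4.4
  newarr [mscorlib]System.Int32   // vector of 4 elements, indexes 0..3
  stloc v
  ldloc v
  ldc.i4.3
  ldc.i4 99
  stelem.i4                       // v[3] = 99, within bounds
  .try
  { ldloc v
    ldc.i4.4                      // one past the last valid index
    ldelem.i4                     // the out-of-range access is detected here
    pop
    leave.s Done
  }
  catch [mscorlib]System.IndexOutOfRangeException
  { pop                           // discard the exception object
    ldstr "index out of range"
    call void [mscorlib]System.Console::WriteLine(string)
    leave.s Done
  }
Done:
  ret
}
```

When run, this sketch should print "index out of range".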
Example (informative):
Declaring a vector of Strings:
.field string[] errorStrings
Declaring a vector of function pointers:
.field method instance void*(int32) [] myVec
Create a vector of 4 strings, and store it into the field errorStrings. The four strings lie at errorStrings[0] through errorStrings[3]:
ldarg.0 // this pointer (assumes the code is in an instance method of CountDownForm)
ldc.i4.4
newarr string
stfld string[] CountDownForm::errorStrings
Store the string "First" into errorStrings[0]:
ldarg.0 // this pointer (assumes the code is in an instance method of CountDownForm)
ldfld string[] CountDownForm::errorStrings
ldc.i4.0
ldstr "First"
stelem.ref
Vectors are subtypes of System.Array, an abstract class pre-defined by the CLI. It provides several methods that can be applied to all vectors. See Partition IV.
While vectors (see Section 13.1) have direct support through CIL instructions, all other arrays are supported by the VES by creating subtypes of the abstract class System.Array (see Partition IV).
<type> ::= …
  | <type> [ [<bound> [,<bound>]*] ]
The rank of an array is the number of dimensions. The CLI does not support arrays with rank 0. The type of an array (other than a vector) shall be determined by the type of its elements and the number of dimensions.
<bound> ::=                 Description
    ...                     lower and upper bounds unspecified. In the case of multi-dimensional arrays, the ellipsis may be omitted
  | <int32>                 zero lower bound, <int32> upper bound
  | <int32> ...             lower bound only specified
  | <int32> ... <int32>     both bounds specified
The fundamental operations provided by the CIL instruction set for vectors are provided by methods on the class created by the VES.
The VES shall provide two constructors for arrays. One takes a sequence of numbers giving the number of elements in each dimension (a lower bound of zero is assumed). The second takes twice as many arguments: a sequence of lower bounds, one for each dimension; followed by a sequence of lengths, one for each dimension (where length is the number of elements required).
In addition to array constructors, the VES shall provide the instance methods Get, Set, and Address to access specific elements and compute their addresses. These methods take a number for each dimension, to specify the target element. In addition, Set takes an additional final argument specifying the value to store into the target element.
Example (informative):
Creates an array, myArray, of strings with two dimensions, with indexes 5..10 and 3..7. Stores the string "One" into myArray[5, 3], retrieves it and prints it out. Then computes the address of myArray[5, 4], stores "Test" into it, retrieves it, and prints it out.
.assembly Test { }
.assembly extern mscorlib { }
.method public static void Start()
{ .maxstack 5
.entrypoint
.locals (class [mscorlib]System.String[,] myArray)
ldc.i4.5 // load lower bound for dim 1
ldc.i4.6 // load (upper bound - lower bound + 1) for dim 1
ldc.i4.3 // load lower bound for dim 2
ldc.i4.5 // load (upper bound - lower bound + 1) for dim 2
newobj instance void string[,]::.ctor(int32,
int32, int32, int32)
stloc myArray
ldloc myArray
ldc.i4.5
ldc.i4.3
ldstr "One"
call instance void string[,]::Set(int32, int32, string)
ldloc myArray
ldc.i4.5
ldc.i4.3
call instance string string[,]::Get(int32, int32)
call void [mscorlib]System.Console::WriteLine(string)
ldloc myArray
ldc.i4.5
ldc.i4.4
call instance string & string[,]::Address(int32, int32)
ldstr "Test"
stind.ref
ldloc myArray
ldc.i4.5
ldc.i4.4
call instance string string[,]::Get(int32, int32)
call void [mscorlib]System.Console::WriteLine(string)
ret
}
The following text is informative
Whilst the elements of multi-dimensional arrays can be thought of as laid out in contiguous memory, arrays of arrays are different – each dimension (except the last) holds an array reference. The following picture illustrates the difference:

On the left is a [6, 10] rectangular array. On the right is not one, but a total of five arrays. The vertical array is an array of arrays, and references the four horizontal arrays. Note how the first and second elements of the vertical array both reference the same horizontal array.
Note that all dimensions of a multi-dimensional array shall be of the same size. But in an array of arrays, it is possible to reference arrays of different sizes. For example, the figure on the right shows the vertical array referencing arrays of lengths 8, 8, 3, null, 6 and 1.
There is no special support for these so-called jagged arrays in either the CIL instruction set or the VES. They are simply vectors whose elements are themselves either the base elements or (recursively) jagged arrays.
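A minimal sketch of a jagged array built only from vector instructions, assuming the illustrative names JaggedDemo and rows: the outer vector holds references to two inner vectors of different lengths.

```il
.assembly extern mscorlib { }
.assembly JaggedDemo { }
.method public static void Start()
{ .maxstack 3
  .entrypoint
  .locals init (int32[][] rows)
  ldc.i4.2
  newarr int32[]                  // outer vector: 2 elements of type int32[]
  stloc rows
  ldloc rows
  ldc.i4.0
  ldc.i4.3
  newarr [mscorlib]System.Int32   // rows[0] is a vector of 3 elements
  stelem.ref
  ldloc rows
  ldc.i4.1
  ldc.i4.5
  newarr [mscorlib]System.Int32   // rows[1] is a vector of 5 elements
  stelem.ref
  ldloc rows
  ldc.i4.1
  ldelem.ref                      // load the inner vector rows[1]
  ldlen                           // its length, as a native unsigned int
  conv.i4
  call void [mscorlib]System.Console::WriteLine(int32)
  ret
}
```

When run, this sketch should print 5, the length of the second inner vector.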
End of informative text
An enum, short for enumeration, defines a set of symbols that all have the same type. A type shall be an enum if and only if it has an immediate base type of System.Enum. Since System.Enum itself has an immediate base type of System.ValueType (see Partition IV), enums are value types (see Chapter 12). The symbols of an enum are represented by an underlying type: one of { bool, char, int8, unsigned int8, int16, unsigned int16, int32, unsigned int32, int64, unsigned int64, float32, float64, native int, unsigned native int }.
Note: The CLI does not provide a guarantee that values of the enum type are integers corresponding to one of the symbols (unlike Pascal). In fact, the CLS (see Partition I, CLS) defines a convention for using enums to represent bit flags which can be combined to form integral values that are not named by the enum type itself.
Enums obey additional restrictions beyond those on other value types. Enums shall contain only fields as members (they shall not even define type initializers or instance constructors); they shall not implement any interfaces; they shall have auto field layout (see clause 9.1.2); they shall have exactly one instance field and it shall be of the underlying type of the enum; all other fields shall be static and literal (see Section 15.1); and they shall not be initialized with the initobj instruction.
Rationale: These restrictions allow a very efficient implementation of enums.
The single, required, instance field stores the value of an instance of the enum. The static literal fields of an enum declare the mapping of the symbols of the enum to the underlying values. All of these fields shall have the type of the enum and shall have field init metadata that assigns them a value (see Section 15.2).
For binding purposes (e.g. for locating a method definition from the method reference used to call it) enums shall be distinct from their underlying type. For all other purposes, including verification and execution of code, an unboxed enum freely interconverts with its underlying type. Enums can be boxed (see Chapter 12) to a corresponding boxed instance type, but this type is not the same as the boxed type of the underlying type, so boxing does not lose the original type of the enum.
Example (informative):
Declare an enum type, then create a local variable of that type. Store a constant of the underlying type into the enum (showing automatic coercion from the underlying type to the enum type). Load the enum back and print it as the underlying type (showing automatic coercion back). Finally, load the address of the enum and extract the contents of the instance field and print that out as well.
.assembly Test { }
.assembly extern mscorlib { }
.class sealed public ErrorCodes extends [mscorlib]System.Enum
{ .field public unsigned int8 MyValue
.field public static literal valuetype ErrorCodes no_error = int8(0)
.field public static literal valuetype ErrorCodes format_error =
int8(1)
.field public static literal valuetype ErrorCodes overflow_error =
int8(2)
.field public static literal valuetype ErrorCodes nonpositive_error =
int8(3)
}
.method public static void Start()
{ .maxstack 5
.entrypoint
.locals init (valuetype ErrorCodes errorCode)
ldc.i4.1 // load 1 (= format_error)
stloc errorCode // store in local, note conversion to enum
ldloc errorCode
call void [mscorlib]System.Console::WriteLine(int32)
ldloca errorCode // address of enum
ldfld unsigned int8 valuetype ErrorCodes::MyValue
call void [mscorlib]System.Console::WriteLine(int32)
ret
}
<type> ::= …          Section
  | <type> &          13.4.2
  | <type> *          13.4.1
A pointer type shall be defined by specifying a signature that includes the type for the location it points at. A pointer may be managed (reported to the CLI garbage collector, denoted by &, see clause 13.4.2) or unmanaged (not reported, denoted by *, see clause 13.4.1).
Pointers may contain the address of a field (of an object or value type) or an element of an array. Pointers differ from object references in that they do not point to an entire type instance, but rather to the interior of an instance. The CLI provides two type-safe operations on pointers:
· loading the value from the location referenced by the pointer
· storing an assignment-compatible value into the location referenced by the pointer
For pointers into the same array or object (see Partition I) the following arithmetic operations are supported:
· Adding an integer value to a pointer, where that value is interpreted as a number of bytes, results in a pointer of the same kind
· Subtracting an integer value (number of bytes) from a pointer results in a pointer of the same kind. Note that subtracting a pointer from an integer value is not permitted.
· Two pointers, regardless of kind, can be subtracted from one another, producing an integer value that specifies the number of bytes between the addresses they reference.
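Example (informative):
A minimal sketch of byte-wise pointer arithmetic within one array, assuming the illustrative names PointerDemo and v. Adding 4 (the size of an int32 in bytes) to a managed pointer to element 0 yields a managed pointer to element 1. Code of this kind is not verifiable.

```il
.assembly extern mscorlib { }
.assembly PointerDemo { }
.method public static void Start()
{ .maxstack 3
  .entrypoint
  .locals init (int32[] v)
  ldc.i4.2
  newarr [mscorlib]System.Int32
  stloc v
  ldloc v
  ldc.i4.1
  ldc.i4.7
  stelem.i4                        // v[1] = 7
  ldloc v
  ldc.i4.0
  ldelema [mscorlib]System.Int32   // managed pointer to v[0]
  ldc.i4.4                         // offset in bytes: one int32 element
  add                              // result is a managed pointer, now to v[1]
  ldind.i4                         // load the int32 it points at
  call void [mscorlib]System.Console::WriteLine(int32)
  ret
}
```

When run, this sketch should print 7.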
The following is informative text
Pointers are compatible with unsigned int32 on 32-bit architectures, and with unsigned int64 on 64-bit architectures. They are best considered as unsigned int, whose size varies depending upon the runtime machine architecture.
The CIL instruction set (see Partition III) contains instructions to compute addresses of fields, local variables, arguments, and elements of vectors:
| Instruction | Description |
|---|---|
| ldarga | Load address of argument |
| ldelema | Load address of vector element |
| ldflda | Load address of field |
| ldloca | Load address of local variable |
| ldsflda | Load address of static field |
Once a pointer is loaded onto the stack, the ldind class of instructions may be used to load the data item to which it points. Similarly, the stind class of instructions can be used to store data into the location.
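A minimal sketch of storing and loading through a pointer, assuming the illustrative names IndirectDemo and x: the address of a local is taken with ldloca, written with stind.i4, and read back with ldind.i4.

```il
.assembly extern mscorlib { }
.assembly IndirectDemo { }
.method public static void Start()
{ .maxstack 2
  .entrypoint
  .locals init (int32 x)
  ldloca x        // managed pointer to the local x
  ldc.i4 17
  stind.i4        // store 17 into the location the pointer references
  ldloca x
  ldind.i4        // load the int32 back through the pointer
  call void [mscorlib]System.Console::WriteLine(int32)
  ret
}
```

When run, this sketch should print 17.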
Note that the CLI will throw an InvalidOperationException for an ldflda instruction if the address is not within the current application domain. This situation arises typically only from the use of objects with a base type of System.MarshalByRefObject (see Partition IV).
Unmanaged pointers (*) are the traditional pointers used in languages like C and C++. There are no restrictions on their use, although for the most part they result in code that cannot be verified. While it is perfectly legal to mark locations that contain unmanaged pointers as though they were unsigned integers (and this is, in fact, how they are treated by the VES), it is often better to mark them as unmanaged pointers to a specific type of data. This is done by using * in a signature for a return value, local variable or an argument or by using a pointer type for a field or array element.
· Unmanaged pointers are not reported to the garbage collector and can be used in any way that an integer can be used.
· Verifiable code cannot dereference unmanaged pointers.
· Unverified code can pass an unmanaged pointer to a method that expects a managed pointer. This is safe only if one of the following is true:
a. The unmanaged pointer refers to memory that is not in memory used by the CLI for storing instances of objects (“garbage collected memory” or “managed memory”).
b. The unmanaged pointer contains the address of a field within an object.
c. The unmanaged pointer contains the address of an element within an array.
d. The unmanaged pointer contains the address where the element following the last element in an array would be located.
Managed pointers (&) may point to an instance of a value type, a field of an object, a field of a value type, an element of an array, or the address where an element just past the end of an array would be stored (for pointer indexes into managed arrays). Managed pointers cannot be null, and they shall be reported to the garbage collector even if they do not point to managed memory.
Managed pointers are specified by using & in a signature for a return value, local variable or an argument or by using a by-ref type for a field or array element.
· Managed pointers can be passed as arguments, stored in local variables, and returned as values.
· If a parameter is passed by reference, the corresponding argument is a managed pointer.
· Managed pointers cannot be stored in static variables, array elements, or fields of objects or value types.
· Managed pointers are not interchangeable with object references.
· A managed pointer cannot point to another managed pointer, but it can point to an object reference or a value type.
· A managed pointer can point to a local variable, or a method argument.
· Managed pointers that do not point to managed memory can be converted (using conv.u or conv.ovf.u) into unmanaged pointers, but this is not verifiable.
· Unverified code that erroneously converts a managed pointer into an unmanaged pointer can seriously compromise the integrity of the CLI. See Partition III (Managed Pointers) for more details.
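Example (informative):
A minimal sketch of a parameter passed by reference, assuming the illustrative names ByRefDemo, Increment, target, and counter. The caller passes a managed pointer to its local variable, and the callee updates the local through that pointer.

```il
.assembly extern mscorlib { }
.assembly ByRefDemo { }
.method public static void Increment(int32& target)
{ .maxstack 3
  ldarg target    // address to store to
  ldarg target
  ldind.i4        // current value at that address
  ldc.i4.1
  add
  stind.i4        // write the incremented value back
  ret
}
.method public static void Start()
{ .maxstack 1
  .entrypoint
  .locals init (int32 counter)
  ldc.i4.5
  stloc counter
  ldloca counter                // managed pointer to the local
  call void Increment(int32&)
  ldloc counter
  call void [mscorlib]System.Console::WriteLine(int32)
  ret
}
```

When run, this sketch should print 6.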
<type> ::= …
  | method <callConv> <type> * ( <parameters> )
Variables of type method pointer shall store the address of the entry point to a method with compatible signature. A pointer to a static or instance method is obtained with the ldftn instruction, while a pointer to a virtual method is obtained with the ldvirtftn instruction. A method may be called by using a method pointer with the calli instruction. See Partition III for the specification of these instructions.
Note: Like other pointers, method pointers are compatible with unsigned int64 on 64-bit architectures and with unsigned int32 on 32-bit architectures. The preferred usage, however, is unsigned native int, which works on both 32- and 64-bit architectures.
Example (informative):
Call a method using a pointer. The method MakeDecision::Decide returns a method pointer to either AddOne or Negate, alternating on each call. The main program calls MakeDecision::Decide three times and after each call uses a calli instruction to call the method specified. The output printed is "-1 2 -1", indicating successful alternating calls.
.assembly Test { }
.assembly extern mscorlib { }
.method public static int32 AddOne(int32 Input)
{ .maxstack 5
ldarg Input
ldc.i4.1
add
ret
}
.method public static int32 Negate(int32 Input)
{ .maxstack 5
ldarg Input
neg
ret
}
.class value sealed public MakeDecision extends
[mscorlib]System.ValueType
{ .field static bool Oscillate
.method public static method int32 *(int32) Decide()
{ ldsfld bool valuetype MakeDecision::Oscillate
dup
not
stsfld bool valuetype MakeDecision::Oscillate
brfalse NegateIt
ldftn int32 AddOne(int32)
ret
NegateIt:
ldftn int32 Negate(int32)
ret
}
}
.method public static void Start()
{ .maxstack 2
.entrypoint
ldc.i4.1
call method int32 *(int32) valuetype MakeDecision::Decide()
calli int32(int32)
call void [mscorlib]System.Console::WriteLine(int32)
ldc.i4.1
call method int32 *(int32) valuetype MakeDecision::Decide()
calli int32(int32)
call void [mscorlib]System.Console::WriteLine(int32)
ldc.i4.1
call method int32 *(int32) valuetype MakeDecision::Decide()
calli int32(int32)
call void [mscorlib]System.Console::WriteLine(int32)
ret
}
Delegates (see Partition I) are the object-oriented equivalent of function pointers. Unlike function pointers, delegates are object-oriented, type-safe, and secure. Delegates are reference types, and are declared in the form of classes. Delegates shall have an immediate base type of System.MulticastDelegate, which in turn has an immediate base type of System.Delegate (see Partition IV).
Delegates shall be declared sealed, and the only members a Delegate shall have are either two or four methods as specified here. These methods shall be declared runtime and managed (see clause 14.4.3). They shall not have a body, since it shall be automatically created by the VES. Other methods available on delegates are inherited from the classes System.Delegate and System.MulticastDelegate in the Base Class Library (see Partition IV).
Rationale: A better design would be to simply have delegate classes derive directly from System.Delegate. Unfortunately, backward compatibility with an existing CLI does not permit this design.
The instance constructor (named .ctor and marked specialname and rtspecialname, see clause 9.5.1) shall take exactly two parameters. The first parameter shall be of type System.Object and the second parameter shall be of type System.IntPtr. When actually called (via a newobj instruction, see Partition III), the first argument shall be an instance of the class (or one of its subclasses) that defines the target method and the second argument shall be a method pointer to the method to be called.
The Invoke method shall be virtual and have the same signature (return type, parameter types, calling convention, and modifiers, see Section 7.1) as the target method. When actually called the arguments passed shall match the types specified in this signature.
The BeginInvoke method (see clause 13.6.2.1), if present, shall be virtual and have a signature related to, but not the same as, that of the Invoke method. There are two differences in the signature. First, the return type shall be System.IAsyncResult (see Partition IV). Second, there shall be two additional parameters that follow those of Invoke: the first of type System.AsyncCallback and the second of type System.Object.
The EndInvoke method (see clause 13.6.2) shall be virtual and have the same return type as the Invoke method. It shall take as parameters exactly those parameters of Invoke that are managed pointers, in the same order they occur in the signature for Invoke. In addition, there shall be an additional parameter of type System.IAsyncResult.
Example (informative):
The following example declares a Delegate used to call functions that take a single integer and return void. It provides all four methods so it can be called either synchronously or asynchronously. Because there are no parameters that are passed by reference (i.e. as managed pointers) there are no additional arguments to EndInvoke.
.assembly Test { }
.assembly extern mscorlib { }
.class private sealed StartStopEventHandler
extends [mscorlib]System.MulticastDelegate
{ .method public specialname rtspecialname instance
void .ctor(object Instance, native int Method)
runtime managed {}
.method public virtual void Invoke(int32 action) runtime managed {}
.method public virtual
class [mscorlib]System.IAsyncResult
BeginInvoke(int32 action,
class [mscorlib]System.AsyncCallback callback,
object Instance) runtime managed {}
.method public virtual
void EndInvoke(class [mscorlib]System.IAsyncResult result)
runtime managed {}
}
As with any class, an instance is created using the newobj instruction in conjunction with the instance constructor. The first argument to the constructor shall be the object on which the method is to be called, or it shall be null if the method is a static method. The second argument shall be a method pointer to a method on the corresponding class and with a signature that matches that of the delegate class being instantiated.
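Example (informative):
A minimal sketch of a delegate bound to a static method, where the first constructor argument is null. The names StaticDelegateDemo, IntHandler, and PrintIt are illustrative assumptions and do not appear elsewhere in this specification.

```il
.assembly extern mscorlib { }
.assembly StaticDelegateDemo { }
.class private sealed IntHandler
       extends [mscorlib]System.MulticastDelegate
{ .method public specialname rtspecialname instance
    void .ctor(object Instance, native int Method)
    runtime managed {}
  .method public virtual void Invoke(int32 action) runtime managed {}
}
.method public static void PrintIt(int32 action)
{ .maxstack 1
  ldarg action
  call void [mscorlib]System.Console::WriteLine(int32)
  ret
}
.method public static void Start()
{ .maxstack 2
  .entrypoint
  ldnull                            // no target object: PrintIt is static
  ldftn void PrintIt(int32)         // method pointer with a matching signature
  newobj instance void IntHandler::.ctor(object, native int)
  ldc.i4 100
  callvirt instance void IntHandler::Invoke(int32)
  ret
}
```

When run, this sketch should print 100.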
Implementation-Specific (Microsoft)
The Microsoft implementation of the CLI allows the programmer to add more methods to a delegate, on the condition that they provide an implementation for those methods (i.e., they cannot be marked runtime). Note that such use makes the resulting assembly non-portable.
The synchronous mode of calling delegates corresponds to regular method calls and is performed by calling the virtual method named Invoke on the delegate. The delegate itself is the first argument to this call (it serves as the this pointer), followed by the other arguments as specified in the signature. When this call is made, the caller shall block until the called method returns. The called method shall be executed on the same thread as the caller.
Continuing the previous example, define a class Test that declares a method, onStartStop, appropriate for use as the target for the delegate.
.class public Test
{ .field public int32 MyData
.method public void onStartStop(int32 action)
{ ret // put your code here
}
.method public specialname rtspecialname
instance void .ctor(int32 Data)
{ ret // call parent constructor, store state, etc.
}
}
Then define a main program. This one constructs an instance of Test and then a delegate that targets the onStartStop method of that instance. Finally, call the delegate.
.method public static void Start()
{ .maxstack 3
.entrypoint
.locals (class StartStopEventHandler DelegateOne,
class Test InstanceOne)
// Create instance of Test class
ldc.i4.1
newobj instance void Test::.ctor(int32)
stloc InstanceOne
// Create delegate to onStartStop method of that class
ldloc InstanceOne
ldftn instance void Test::onStartStop(int32)
newobj instance void StartStopEventHandler::.ctor(object, native int)
stloc DelegateOne
// Invoke the delegate, passing 100 as an argument
ldloc DelegateOne
ldc.i4 100
callvirt instance void StartStopEventHandler::Invoke(int32)
ret
}
// Note that the example above creates a delegate to a non-virtual
// function. If onStartStop had instead been a virtual function, use
// the following code sequence instead :
ldloc InstanceOne
dup
ldvirtftn instance void Test::onStartStop(int32)
newobj instance void StartStopEventHandler::.ctor(object, native int)
stloc DelegateOne
// Invoke the delegate, passing 100 as an argument
ldloc DelegateOne
ldc.i4 100
callvirt instance void StartStopEventHandler::Invoke(int32)
Note: The code sequence above shall use dup, not ldloc InstanceOne twice. The dup code sequence is easily recognized as typesafe, whereas alternatives would require more complex analysis. Verifiability of code is discussed in Partition III.
In the asynchronous mode, the call is dispatched, and the caller shall continue execution without waiting for the method to return. The called method shall be executed on a separate thread.
To call delegates asynchronously, the BeginInvoke and EndInvoke methods are used.
Note: If the caller thread terminates before the callee completes, the callee thread is unaffected. The callee thread continues execution and terminates silently.
Note: the callee may throw exceptions. Any unhandled exception propagates to the caller via the EndInvoke method.
An asynchronous call to a delegate shall begin by making a virtual call to the BeginInvoke method. BeginInvoke is similar to the Invoke method (see clause 13.6.1), but has two differences:
· It has two additional parameters, appended to the list, of type System.AsyncCallback and System.Object
· The return type of the method is System.IAsyncResult
Although the BeginInvoke method therefore includes parameters that represent return values, these values are not updated by this method. The results instead are obtained from the EndInvoke method (see below).
Unlike a synchronous call, an asynchronous call shall provide a way for the caller to determine when the call has been completed. The CLI provides two such mechanisms. The first is through the result returned from the call. This object, an instance of the interface System.IAsyncResult, can be used to wait for the result to be computed, it can be queried for the current status of the method call, and it contains the System.Object value that was passed to the call to BeginInvoke. See Partition IV.
The second mechanism is through the System.AsyncCallback delegate passed to BeginInvoke. The VES shall call this delegate when the value is computed or an exception has been raised indicating that the result will not be available. The value passed to this callback is the same value passed to the call to BeginInvoke. A value of null may be passed for System.AsyncCallback to indicate that the VES need not provide the callback.
Rationale: This model supports both a polling approach (by checking the status of the returned System.IAsyncResult) and an event-driven approach (by supplying a System.AsyncCallback) to asynchronous calls.
A synchronous call returns information both through its return value and through output parameters. Output parameters are represented in the CLI as parameters with managed pointer type. Both the returned value and the values of the output parameters are not available until the VES signals that the asynchronous call has completed successfully. They are retrieved by calling the EndInvoke method on the delegate that began the asynchronous call.
The EndInvoke method can be called at any time after BeginInvoke. It shall suspend the thread that calls it until the asynchronous call completes. If the call completes successfully, EndInvoke will return the value that would have been returned had the call been made synchronously, and its managed pointer arguments will point to values that would have been returned to the out parameters of the synchronous call.
EndInvoke requires as parameters the value returned by the originating call to BeginInvoke (so that different calls to the same delegate can be distinguished, since they may execute concurrently) as well as any managed pointers that were passed as arguments (so their return values can be provided).
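The relationship between the three delegate methods can be sketched in ilasm as follows. This is an illustrative sketch only: the names AsyncSketch, Halver, and Halve are hypothetical, and the ordering of the EndInvoke parameters (managed pointers first, then the System.IAsyncResult) is an assumption based on common compiler output rather than something stated in this clause. The callback and state arguments are both null, so completion is observed only through EndInvoke.

```cil
.assembly extern mscorlib {}
.assembly AsyncSketch {}

// Hypothetical delegate type for methods of shape: int32 (int32, int32&)
.class public sealed Halver extends [mscorlib]System.MulticastDelegate
{
  .method public specialname rtspecialname instance void
          .ctor(object Instance, native int Method) runtime managed {}

  // Synchronous call: same signature as the target method.
  .method public virtual instance int32
          Invoke(int32 n, int32& remainder) runtime managed {}

  // Invoke's parameters, then AsyncCallback and Object; returns IAsyncResult.
  .method public virtual instance class [mscorlib]System.IAsyncResult
          BeginInvoke(int32 n, int32& remainder,
                      class [mscorlib]System.AsyncCallback callback,
                      object state) runtime managed {}

  // Managed pointer parameters plus the IAsyncResult; returns Invoke's type.
  .method public virtual instance int32
          EndInvoke(int32& remainder,
                    class [mscorlib]System.IAsyncResult result) runtime managed {}
}

// Hypothetical target: returns n / 2 and stores n % 2 through the pointer.
.method public static int32 Halve(int32 n, int32& remainder) cil managed
{
  .maxstack 3
  ldarg.1
  ldarg.0
  ldc.i4.2
  rem
  stind.i4
  ldarg.0
  ldc.i4.2
  div
  ret
}

.method public static void Main() cil managed
{
  .entrypoint
  .maxstack 5
  .locals init (class Halver del, int32 remainder,
                class [mscorlib]System.IAsyncResult asyncResult)
  ldnull                                  // no target object: Halve is static
  ldftn int32 Halve(int32, int32&)
  newobj instance void Halver::.ctor(object, native int)
  stloc.0

  ldloc.0
  ldc.i4.7
  ldloca.s remainder
  ldnull                                  // no AsyncCallback requested
  ldnull                                  // no state object
  callvirt instance class [mscorlib]System.IAsyncResult
           Halver::BeginInvoke(int32, int32&,
                               class [mscorlib]System.AsyncCallback, object)
  stloc.2

  ldloc.0
  ldloca.s remainder
  ldloc.2                                 // identifies which call to complete
  callvirt instance int32
           Halver::EndInvoke(int32&, class [mscorlib]System.IAsyncResult)
  call void [mscorlib]System.Console::WriteLine(int32)
  ldloc.1
  call void [mscorlib]System.Console::WriteLine(int32)
  ret
}
```

Passing a non-null System.AsyncCallback in place of the first ldnull would select the event-driven approach instead. The call to EndInvoke would still be needed to retrieve the return value and the value of the output parameter.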
Methods may be defined at the global level (outside of any type):
<decl> ::= …
  | .method <methodHead> { <methodBodyItem>* }
as well as inside a type:
<classMember> ::= …
  | .method <methodHead> { <methodBodyItem>* }
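As an illustration of the two placements, the following sketch defines one method inside a type and one at the global level. The names MethodSketch, Counter, and AddOne are hypothetical and chosen only for this example.

```cil
.assembly extern mscorlib {}
.assembly MethodSketch {}

.class public Counter extends [mscorlib]System.Object
{
  // A method defined inside a type (a <classMember>).
  .method public static int32 AddOne(int32 x) cil managed
  {
    .maxstack 2
    ldarg.0
    ldc.i4.1
    add
    ret
  }
}

// A method defined at the global level (a <decl>), outside of any type.
.method public static void Main() cil managed
{
  .entrypoint
  .maxstack 1
  ldc.i4.s 41
  call int32 Counter::AddOne(int32)
  call void [mscorlib]System.Console::WriteLine(int32)
  ret
}
```

In both cases the text between .method and the opening brace is the <methodHead>, and the directives and instructions between the braces are <methodBodyItem>s.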
There are four constructs in ilasm connected with methods. These correspond with different metadata constructs, as described in Chapter 21.
A MethodDecl, or method declaration, supplies the method name and signature (parameter and return types), but not its body. That is, a method declaration provides a <methodHead> but no <methodBodyItem>s. These are used at call sites to specify the call target (call or callvirt instructions, see Partition III) or to declare an abstract method. A MethodDecl has no direct logical counterpart in the metadata; it can be either a Method or a MethodRef.
A Method, or method definition, supplies the method name, attributes, signature and body. That is, a method definition provides a <methodHead> as well as one or more <methodBodyItem>s. The body includes the method's CIL instructions, exception handlers, local variable information, and additional runtime or custom metadata about the method. See Chapter 11.
A MethodRef, or method reference, is a reference to a method. It is used when a method is called whose definition lies in another module or assembly. A MethodRef shall be resolved by the VES into a Method before the method is called at runtime. If a matching Method cannot be found, the VES shall throw a System.MissingMethodException. See Chapter 21.23.
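The distinction between a declaration, a definition, and a reference can be sketched as follows. The names RefSketch, Shape, Area, and Helper are hypothetical. The comments on which metadata construct each call site corresponds to reflect the description above, not the verified output of any particular assembler.

```cil
.assembly extern mscorlib {}
.assembly RefSketch {}

.class public abstract Shape extends [mscorlib]System.Object
{
  // Declaration of an abstract method: a <methodHead> with no body items.
  .method public abstract virtual instance float64 Area() cil managed {}
}

// Method definition: name, attributes, signature, and body.
.method public static void Helper() cil managed
{
  .maxstack 1
  ldstr "in Helper"
  call void [mscorlib]System.Console::WriteLine(string)
  ret
}

.method public static void Main() cil managed
{
  .entrypoint
  .maxstack 1
  // Call target defined in this module: the declaration names a Method.
  call void Helper()
  // Call target defined in another assembly: the declaration is a MethodRef,
  // which the VES resolves to a Method before the call is made.
  ldstr "done"
  call void [mscorlib]System.Console::WriteLine(string)
  ret
}
```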
A MethodImpl, or method implementation, supplies the executable body for an existing virtual method. It associates a Method (representing the body) with a MethodDecl or Method (representing the virtual method). A MethodImpl is used to provide an implementation for an inherited virtual method