Lexical structure
When several prefixes of the input being parsed match productions, the production that matches the longest prefix is selected.
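The longest-match rule can be observed in ordinary Ruby code. The following is a minimal sketch, not part of the specification; the constant names are chosen only for illustration.

```ruby
# "**" is read as one operator, not as two consecutive "*" tokens.
POWER = 2**3

# "..." is read as one punctuator, not as ".." followed by ".".
EXCLUSIVE_RANGE = (1...4).to_a

# "<=>" is read as one operator, not as "<=" followed by ">".
COMPARISON = 1<=>2
```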
8.1 Source text
Syntax
source-character ::
[ any character in ISO/IEC 646 ]
A program is represented as a sequence of characters. A conforming processor shall accept any conforming program which consists of characters in ISO/IEC 646, encoded with the octet values as specified in ISO/IEC 646. The support for any other character sets and encodings is implementation dependent.
Terminal symbols are sequences of those characters in ISO/IEC 646. Control characters in ISO/IEC 646 are represented by hexadecimal notation.
EXAMPLE "0x0a" represents a line feed character.
8.2 Line terminators
Syntax
line-terminator ::
0x0d? 0x0a
A line-terminator is ignored when it is used to separate tokens. For this reason, except in §8.4 and §8.5, line-terminators are omitted from productions. However, in some cases, the presence or absence of a line-terminator changes the meaning of a program.
A location of program text where a line-terminator shall occur is indicated by the notation "[ line-terminator here ]". A location of program text where a line-terminator shall not occur is indicated by the notation "[ no line-terminator here ]"; however, a conforming processor may ignore this notation where ignoring it does not introduce ambiguity.
EXAMPLE Statements are separated by separators (see §10.2). The syntax of the separators is as follows:
separator :: ; | [ line-terminator here ]
The source
x = 1 + 2
puts x
is therefore separated into two statements, x = 1 + 2 and puts x, by a line-terminator.
The source
x =
1+2
is parsed as a single statement x = 1 + 2 because x = is not a valid statement. However, the source
x
= 1 + 2
is not a valid Ruby program because a line-terminator shall not occur before = in a single-variable-assignment-expression, and = 1 + 2 is not a valid statement. The fact that a line-terminator shall not occur before = is indicated in the syntax of the single-variable-assignment-expression as follows (see §11.3.1.1.1):
single-variable-assignment-expression ::
variable [ no line-terminator here ] = operator-expression
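The three sources in the example above can be checked against a Ruby implementation. This is a sketch under the assumption that the processor reports an invalid program by raising SyntaxError from Kernel#eval, as common implementations do; valid_program? is a hypothetical helper and not part of the specification.

```ruby
# Hypothetical helper: returns true if the source parses as a program.
# The source is wrapped in a lambda body so that it is parsed but not run.
def valid_program?(source)
  eval("lambda {\n#{source}\n}")
  true
rescue SyntaxError
  false
end

# The line-terminator acts as a separator between two statements.
TWO_STATEMENTS_OK = valid_program?("x = 1 + 2\nputs x")

# "x =" is not a valid statement, so parsing continues on the next line.
SPLIT_AFTER_EQUALS_OK = valid_program?("x =\n1+2")

# A line-terminator before "=" is not allowed.
SPLIT_BEFORE_EQUALS_OK = valid_program?("x\n= 1 + 2")
```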
8.3 Whitespace
Syntax
whitespace ::
0x09 | 0x0b | 0x0c | 0x0d | 0x20 | \ 0x0d? 0x0a
whitespace is ignored when it is used to separate tokens. For this reason, except in §8.4 and §8.5, whitespace is omitted from productions. However, in some cases, the presence or absence of whitespace changes the meaning of a program.
A location of program text where whitespace shall occur is indicated by the notation "[ whitespace here ]". A location of program text where whitespace shall not occur is indicated by the notation "[ no whitespace here ]". A line-terminator shall not occur in the location where whitespace shall not occur. Therefore, this notation also indicates that a line-terminator shall not occur.
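As an illustration of whitespace changing the meaning of a program, the sketch below uses a hypothetical method named val with an optional parameter. The behaviour shown is that of common Ruby implementations, which may also warn about the ambiguous form when warnings are enabled.

```ruby
# Hypothetical method with an optional parameter.
def val(n = 5)
  n
end

# Whitespace on both sides of "-": binary subtraction, val() - 1.
SUBTRACTION = (val - 1)

# Whitespace before "-" but not after it: -1 is passed as an argument, val(-1).
ARGUMENT = (val -1)

# A backslash followed by a line-terminator is whitespace, so the
# expression continues on the next line.
JOINED_LINES = 1 \
  + 2
```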
8.4 Comments
Syntax
comment ::
single-line-comment
| multi-line-comment
single-line-comment ::
# comment-content?
comment-content ::
line-content
line-content ::
source-character+
multi-line-comment ::
multi-line-comment-begin-line multi-line-comment-line?
multi-line-comment-end-line
multi-line-comment-begin-line ::
[ beginning of a line ] =begin rest-of-begin-end-line? line-terminator
multi-line-comment-end-line ::
[ beginning of a line ] =end rest-of-begin-end-line?
( line-terminator | [ end of a program ] )
rest-of-begin-end-line ::
whitespace+ comment-content
line ::
comment-content line-terminator
multi-line-comment-line ::
line but not multi-line-comment-end-line
The notation "[ beginning of a line ]" indicates the beginning of a program or the position immediately after a line-terminator.
Any characters that are considered as line-terminators are not allowed within a line-content.
A comment is either a single-line-comment or a multi-line-comment. A comment is considered to be whitespace.
A single-line-comment begins with "#" and continues to the end of the line. A line-terminator at the end of the line is not considered to be a part of the comment. A single-line-comment can contain any characters except line-terminators.
A multi-line-comment begins with a line beginning with =begin, and continues until and including a line that begins with =end. Unlike single-line-comments, a line-terminator on a multi-line-comment-end-line, if any, is considered to be part of the comment.
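Both kinds of comment are shown in the short sketch below. The constant names are illustrative only; note that =begin and =end must start at the beginning of a line.

```ruby
VISIBLE = 1 # a single-line-comment runs to the end of the line

=begin
Everything up to and including the "=end" line is a multi-line-comment,
so the assignment below is never seen by the processor.
HIDDEN = 2
=end

AFTER_COMMENT = VISIBLE + 1
```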
8.5 Tokens
Syntax
token ::
reserved-word
| identifier
| punctuator
| operator
| literal
8.5.1 Reserved words
Syntax
reserved-word ::
__LINE__ | __ENCODING__ | __FILE__ | BEGIN | END | alias | and | begin
| break | case | class | def | defined? | do | else | elsif | end
| ensure | for | false | if | in | module | next | nil | not | or | redo
| rescue | retry | return | self | super | then | true | undef | unless
| until | when | while | yield
Reserved words are case-sensitive.
8.5.2 Identifiers
Syntax
identifier ::
local-variable-identifier
| global-variable-identifier
| class-variable-identifier
| instance-variable-identifier
| constant-identifier
| method-identifier
local-variable-identifier ::
( lowercase-character | _ ) identifier-character*
global-variable-identifier ::
$ identifier-start-character identifier-character*
class-variable-identifier ::
@@ identifier-start-character identifier-character*
instance-variable-identifier ::
@ identifier-start-character identifier-character*
constant-identifier ::
uppercase-character identifier-character*
method-identifier ::
method-only-identifier
| assignment-like-method-identifier
| constant-identifier
| local-variable-identifier
method-only-identifier ::
( constant-identifier | local-variable-identifier ) ( ! | ? )
assignment-like-method-identifier ::
( constant-identifier | local-variable-identifier ) =
identifier-character ::
lowercase-character
| uppercase-character
| decimal-digit
| _
identifier-start-character ::
lowercase-character
| uppercase-character
| _
uppercase-character ::
A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R
| S | T | U | V | W | X | Y | Z
lowercase-character ::
a | b | c | d | e | f | g | h | i | j | k | l | m | n | o | p | q | r
| s | t | u | v | w | x | y | z
decimal-digit ::
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
An identifier is a sequence of identifier-characters optionally prefixed by one of "$", "@@", or "@", and optionally postfixed by one of "?", "!", or "=".
A global-variable-identifier begins with "$". A class-variable-identifier starts with "@@". An instance-variable-identifier begins with "@". A constant-identifier begins with an uppercase-character.
A local-variable-identifier begins with a lowercase-character or "_". A method-identifier is a constant-identifier or a local-variable-identifier optionally followed by one of "?", "!", or "=".
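The identifier forms can be approximated with regular expressions. The classifier below is only a sketch: IDENTIFIER_KINDS and identifier_kind are hypothetical names, the patterns restate the productions above for ISO/IEC 646 letters, and reserved words are not excluded.

```ruby
# Hypothetical table mapping each identifier form to an approximating pattern.
IDENTIFIER_KINDS = {
  global_variable:   /\A\$[A-Za-z_][A-Za-z0-9_]*\z/,
  class_variable:    /\A@@[A-Za-z_][A-Za-z0-9_]*\z/,
  instance_variable: /\A@[A-Za-z_][A-Za-z0-9_]*\z/,
  constant:          /\A[A-Z][A-Za-z0-9_]*\z/,
  local_variable:    /\A[a-z_][A-Za-z0-9_]*\z/,
  method_only:       /\A[A-Za-z_][A-Za-z0-9_]*[!?]\z/,
  assignment_like:   /\A[A-Za-z_][A-Za-z0-9_]*=\z/
}

# Returns the first matching kind, or nil if the text is not an identifier.
def identifier_kind(text)
  IDENTIFIER_KINDS.each do |kind, pattern|
    return kind if pattern.match?(text)
  end
  nil
end
```

For example, identifier_kind("@@count") yields :class_variable, and identifier_kind("empty?") yields :method_only. A plain constant-identifier or local-variable-identifier is also a valid method-identifier, which this sketch does not report separately.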
8.5.3 Punctuators
Syntax
punctuator ::
[ | ] | ( | ) | { | } | :: | , | ; | .. | ... | ? | : | =>
8.5.4 Operators
Syntax
operator ::
operator-method-name
| assignment-operator
operator-method-name ::
^ | & | | | <=> | == | === | !~ | =~ | > | >= | < | <= | << | >> | +
| - | * | / | % | ** | ~ | +@ | -@ | [] | []= | `
assignment-operator ::
assignment-operator-name =
assignment-operator-name ::
+ | - | * | ** | / | ^ | % | << | >> | & | && | || | |
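An operator-method-name names an ordinary method, so a class can define it like any other method. The sketch below uses a hypothetical Meters class and a hypothetical bump method; the expansion of "+=" shown in the comment reflects how common implementations treat an assignment-operator and is stated here as an assumption rather than taken from this section.

```ruby
# Hypothetical value class whose operators are ordinary methods.
class Meters
  include Comparable
  attr_reader :length

  def initialize(length)
    @length = length
  end

  # Binary "+".
  def +(other)
    Meters.new(length + other.length)
  end

  # Unary minus is the method "-@".
  def -@
    Meters.new(-length)
  end

  # Comparable derives <, <=, ==, >= and > from "<=>".
  def <=>(other)
    length <=> other.length
  end
end

# Assumed expansion: counter += 1 behaves like counter = counter + 1.
def bump(counter)
  counter += 1
  counter
end
```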
8.5.5 Literals
Syntax
literal ::
numeric-literal
| string-literal
| array-literal
| regular-expression-literal
| symbol
8.5.5.1 Numeric literals
Syntax
numeric-literal ::
signed-number
| unsigned-number
unsigned-number ::
integer-literal
| float-literal
integer-literal ::
decimal-integer-literal
| binary-integer-literal
| octal-integer-literal
| hexadecimal-integer-literal
decimal-integer-literal ::
digit-decimal-integer-literal
| prefixed-decimal-integer-literal
digit-decimal-integer-literal ::
0
| decimal-digit-without-zero ( _? decimal-digit )*
prefixed-decimal-integer-literal ::
0 ( d | D ) digit-decimal-part
digit-decimal-part ::
decimal-digit ( _? decimal-digit )*
binary-integer-literal ::
0 ( b | B ) binary-digit ( _? binary-digit )*
octal-integer-literal ::
0 ( _ | o | O )? octal-digit ( _? octal-digit )*
hexadecimal-integer-literal ::
0 ( x | X ) hexadecimal-digit ( _? hexadecimal-digit )*
float-literal ::
decimal-float-literal
| exponent-float-literal
decimal-float-literal ::
digit-decimal-integer-literal . digit-decimal-part
exponent-float-literal ::
base-part exponent-part
base-part ::
decimal-float-literal
| digit-decimal-integer-literal
exponent-part ::
( e | E ) ( + | - )? digit-decimal-part
signed-number ::
( + | - ) unsigned-number
decimal-digit-without-zero ::
1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
octal-digit ::
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7
binary-digit ::
0 | 1
hexadecimal-digit ::
decimal-digit | a | b | c | d | e | f | A | B | C | D | E | F
Semantics
A numeric-literal evaluates to either an instance of the class Integer or a direct instance of the class Float.
An unsigned-number of the form integer-literal evaluates to an instance of the class Integer whose value is the value of one of the alternatives on the right-hand side.
An unsigned-number of the form float-literal evaluates to a direct instance of the class Float whose value is the value of one of the alternatives on the right-hand side.
A signed-number which begins with "+" evaluates to an instance represented by the unsigned-number. A signed-number which begins with "-" evaluates to an instance of the class Integer or a direct instance of the class Float whose value is the negated value of the instance represented by the unsigned-number.
The value of an integer-literal, a decimal-integer-literal, a float-literal, or a base-part is the value of one of the alternatives on the right-hand side.
The value of a digit-decimal-integer-literal is either 0 or the value of a sequence of characters, which consists of a decimal-digit-without-zero followed by a sequence of decimal-digits, ignoring interleaving "_"s, computed using base 10.
The value of a prefixed-decimal-integer-literal is the value of the digit-decimal-part.
The value of a digit-decimal-part is the value of the sequence of decimal-digits, ignoring interleaving "_"s, computed using base 10.
The value of a binary-integer-literal is the value of the sequence of binary-digits, ignoring interleaving "_"s, computed using base 2.
The value of an octal-integer-literal is the value of the sequence of octal-digits, ignoring interleaving "_"s, computed using base 8.
The value of a hexadecimal-integer-literal is the value of the sequence of hexadecimal-digits, ignoring interleaving "_"s, computed using base 16.
The value of a decimal-float-literal is the value of the digit-decimal-integer-literal plus the value of the digit-decimal-part times 10^(-n), where n is the number of decimal-digits of the digit-decimal-part.
The value of an exponent-float-literal is the value of the base-part times 10^n, where n is the value of the exponent-part.
The value of an exponent-part is the negative value of the digit-decimal-part if "-" occurs, otherwise, it is the value of the digit-decimal-part.
There is no limitation on the maximum magnitude for the value of an integer-literal. The precision of the value of a float-literal is implementation defined; however, if the underlying platform of a conforming processor supports IEC 60559:1989, the representation of an instance of the class Float should be the 64-bit double format as specified in §3.2.2 of IEC 60559:1989. The value of a float-literal is rounded to fit in the representation of an instance of the class Float in an implementation defined way.
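The following literals illustrate the value rules above. This is a sketch with illustrative constant names; the float comparisons assume the 64-bit double format, in which all the values used here are exactly representable.

```ruby
DECIMAL_PREFIXED = 0d255        # prefixed-decimal-integer-literal
BINARY_VALUE     = 0b1111_1111  # binary-integer-literal, "_" is ignored
OCTAL_PREFIXED   = 0o377        # octal-integer-literal with the "o" prefix
OCTAL_SHORT      = 0377         # octal-integer-literal without a prefix letter
HEX_VALUE        = 0xff         # hexadecimal-integer-literal
NEGATIVE_HEX     = -0x10        # signed-number
LARGE_INTEGER    = 1_267_650_600_228_229_401_496_703_205_376  # no maximum magnitude
FRACTION_VALUE   = 12.25        # 12 + 25 * 10^(-2)
EXPONENT_VALUE   = 1.5e3        # 1.5 * 10^3
SMALL_VALUE      = 25e-2        # 25 * 10^(-2)
```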
8.5.5.2 String literals
Syntax
string-literal ::
single-quoted-string
| double-quoted-string
| quoted-non-expanded-literal-string
| quoted-expanded-literal-string
| here-document
| external-command-execution
Semantics
A string-literal evaluates to a direct instance of the class String.
8.5.5.2.1 Single quoted strings
Syntax
single-quoted-string ::
' single-quoted-string-character* '
single-quoted-string-character ::
non-escaped-single-quoted-string-character
| single-quoted-escape-sequence
single-quoted-escape-sequence ::
single-escape-character-sequence
| non-escaped-single-quoted-string-character-sequence
single-escape-character-sequence ::
\ single-escaped-character
non-escaped-single-quoted-string-character-sequence ::
\ non-escaped-single-quoted-string-character
single-escaped-character ::
' | \
non-escaped-single-quoted-string-character ::
source-character but not single-escaped-character
Semantics
A single-quoted-string consists of zero or more characters enclosed by single quotes. The sequence of single-quoted-string-characters within the pair of single quotes represents the content of a string as it occurs in program text literally, except for single-escape-character-sequences. The sequence "\\" represents "\". The sequence "\'" represents "'".
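A short sketch of the two escape sequences recognised in a single-quoted-string, with illustrative constant names:

```ruby
APOSTROPHE = 'it\'s'  # "\'" represents "'"
BACKSLASH  = 'a\\b'   # "\\" represents "\"
LITERAL_N  = 'a\nb'   # "\n" is not an escape here: a backslash, then "n"
```

BACKSLASH therefore has three characters and LITERAL_N has four.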
8.5.5.2.2 Double quoted strings
Syntax
double-quoted-string ::
" double-quoted-string-character* "
double-quoted-string-character ::
source-character but not ( " | \ )
| double-escape-sequence
| interpolated-character-sequence
double-escape-sequence ::
simple-escape-sequence
| non-escaped-sequence
| line-terminator-escape-sequence
| octal-escape-sequence
| hex-escape-sequence
| control-escape-sequence
simple-escape-sequence ::
\ double-escaped-character
non-escaped-sequence ::
\ non-escaped-double-quoted-string-character
line-terminator-escape-sequence ::
\ line-terminator
non-escaped-double-quoted-string-character ::
source-character but not ( double-escaped-character | line-terminator )
double-escaped-character ::
\ | n | t | r | f | v | a | e | b | s
octal-escape-sequence ::
\ octal-digit ( octal-digit octal-digit? )?
hex-escape-sequence ::
\ x hexadecimal-digit hexadecimal-digit?
control-escape-sequence ::
\ ( C - | c ) control-escaped-character
control-escaped-character ::
double-escape-sequence
| ?
| source-character but not ( \ | ? )
interpolated-character-sequence ::
# global-variable-identifier
| # class-variable-identifier
| # instance-variable-identifier
| # { compound-statement }
Semantics
A double-quoted-string consists of zero or more characters enclosed by double quotes. The sequence of double-quoted-string-characters within the pair of double quotes represents the content of a string.
Except for a double-escape-sequence and an interpolated-character-sequence, a double-quoted-string-character represents a character as it occurs in program text.
A simple-escape-sequence represents a character as shown in Table 1.
An octal-escape-sequence represents a character the code of which is the value of the sequence of octal-digits computed using base 8.
A hex-escape-sequence represents a character the code of which is the value of the sequence of hexadecimal-digits computed using base 16.
A non-escaped-sequence represents a non-escaped-double-quoted-string-character.
A line-terminator-escape-sequence is used to break the content of a string into separate lines in program text without inserting a line-terminator into the string. A line-terminator-escape-sequence does not count as a character of the string.
A control-escape-sequence represents a character the code of which is computed by performing a bitwise AND operation between 0x9f and the code of the character represented by the control-escaped-character, except when the control-escaped-character is ?, in which case, the control-escape-sequence represents a character the code of which is 127.
An interpolated-character-sequence is a part of a string-literal which is dynamically evaluated when the string-literal in which it is embedded is evaluated. The interpolated-character-sequences within a string-literal are evaluated in the order in which they occur in program text.
The value of a string-literal which contains interpolated-character-sequences is a direct instance of the class String the content of which is made from the string-literal where each occurrence of interpolated-character-sequence is replaced by the content of an instance of the class String which is the dynamically evaluated value of the interpolated-character-sequence.
An interpolated-character-sequence is evaluated as follows:
- If it is of the form #global-variable-identifier, evaluate the global-variable-identifier (see §11.4.3.3). Let V be the resulting value.
- If it is of the form #class-variable-identifier, evaluate the class-variable-identifier (see §11.4.3.4). Let V be the resulting value.
- If it is of the form #instance-variable-identifier, evaluate the instance-variable-identifier (see §11.4.3.5). Let V be the resulting value.
- If it is of the form #{compound-statement}, evaluate the compound-statement (see §10.2). Let V be the resulting value.
- If V is an instance of the class String, V is the value of interpolated-character-sequence.
- Otherwise, invoke the method to_s on V with an empty list of arguments. Let S be the resulting value.
- If S is an instance of the class String, S is the value of interpolated-character-sequence.
- Otherwise, the value of interpolated-character-sequence is an instance of the class String, the content of which is implementation defined.
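The evaluation steps above, starting from the point where V has been obtained, can be sketched as follows. interpolated_content and Temperature are hypothetical names, and the placeholder string stands in for the implementation-defined case.

```ruby
# Hypothetical helper: maps an already evaluated value V to the String
# that takes the place of the interpolated-character-sequence.
def interpolated_content(v)
  return v if v.is_a?(String)
  s = v.to_s # invoked with an empty list of arguments
  return s if s.is_a?(String)
  'implementation defined' # placeholder: the content is left open
end

# Hypothetical class with its own to_s.
class Temperature
  def initialize(degrees)
    @degrees = degrees
  end

  def to_s
    "#{@degrees} degrees"
  end
end

INTERPOLATED_TEXT = "now: #{Temperature.new(21)}"
```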
Table 1: Simple escape sequences
| Escape sequence | Character code |
|---|---|
| \\ | 0x5c |
| \n | 0x0a |
| \t | 0x09 |
| \r | 0x0d |
| \f | 0x0c |
| \v | 0x0b |
| \a | 0x07 |
| \e | 0x1b |
| \b | 0x08 |
| \s | 0x20 |
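The escape sequences of a double-quoted-string can be tried out directly. The sketch below uses illustrative names, including a hypothetical global variable $greeting for the "#" global-variable-identifier form.

```ruby
NEWLINE_CODE  = "\n".ord    # 0x0a, a simple-escape-sequence
ESCAPE_CODE   = "\e".ord    # 0x1b, a simple-escape-sequence
OCTAL_A       = "\101"      # octal 101 is 65, the code of "A"
HEX_A         = "\x41"      # hexadecimal 41 is 65
CONTROL_A     = "\C-a".ord  # 0x61 & 0x9f is 0x01
CONTROL_SHORT = "\ca".ord   # the same character, short form
DELETE_CODE   = "\c?".ord   # the special case: 127
UNESCAPED_Q   = "\q"        # non-escaped-sequence: just "q"

# The escaped line-terminator does not become part of the string.
ONE_LINE = "first \
second"

$greeting = "hello"
GLOBAL_FORM = "#$greeting, world"
```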
8.5.5.2.3 Quoted non-expanded literal strings
Syntax
quoted-non-expanded-literal-string ::
%q literal-beginning-delimiter non-expanded-literal-string* literal-ending-delimiter
non-expanded-literal-string ::
non-expanded-literal-character
| non-expanded-delimited-string
non-expanded-delimited-string ::
literal-beginning-delimiter non-expanded-literal-string* literal-ending-delimiter
non-expanded-literal-character ::
non-escaped-literal-character
| non-expanded-literal-escape-sequence
non-escaped-literal-character ::
source-character but not quoted-literal-escape-character
non-expanded-literal-escape-sequence ::
non-expanded-literal-escape-character-sequence
| non-escaped-non-expanded-literal-character-sequence
non-expanded-literal-escape-character-sequence ::
\ non-expanded-literal-escaped-character
non-expanded-literal-escaped-character ::
literal-beginning-delimiter
| literal-ending-delimiter
| \
quoted-literal-escape-character ::
non-expanded-literal-escaped-character
non-escaped-non-expanded-literal-character-sequence ::
\ non-escaped-non-expanded-literal-character
non-escaped-non-expanded-literal-character ::
source-character but not non-expanded-literal-escaped-character
The literal-beginning-delimiter of a non-expanded-delimited-string shall be the same character as the literal-beginning-delimiter of the quoted-non-expanded-literal-string.
A literal-ending-delimiter shall be the same character as the corresponding literal-beginning-delimiter, except when the literal-beginning-delimiter is one of the characters on the left in Table 2. In that case, the literal-ending-delimiter is the corresponding character on the right in Table 2.
Table 2: Matching literal-beginning-delimiter and literal-ending-delimiter
| literal-beginning-delimiter | literal-ending-delimiter |
|---|---|
| { | } |
| ( | ) |
| [ | ] |
| < | > |
The production non-expanded-delimited-string applies only when the literal-beginning-delimiter is one of the characters of matching-literal-beginning-delimiter.
Semantics
A non-expanded-literal-string represents the content of a string as it occurs in program text literally, except for non-expanded-literal-escape-character-sequences.
A non-expanded-literal-escape-character-sequence represents a character as follows. The sequence "\\" represents "\"; the sequence \literal-beginning-delimiter, a literal-beginning-delimiter; the sequence \literal-ending-delimiter, a literal-ending-delimiter.
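A few %q literals, given as a sketch with illustrative constant names, show the delimiter and escape rules:

```ruby
Q_NESTED   = %q(a (nested) b)   # inner parentheses form a non-expanded-delimited-string
Q_BRACKETS = %q[it's "quoted"]  # quotes need no escaping
Q_ESCAPED  = %q(a\)b)           # "\)" represents the literal-ending-delimiter
Q_SAME     = %q!wow\! really!   # "!" is both the beginning and the ending delimiter
Q_LITERAL  = %q(a\nb #{1})      # no other escapes and no interpolation
```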
8.5.5.2.4 Quoted expanded literal strings
Syntax
quoted-expanded-literal-string ::
% Q? literal-beginning-delimiter expanded-literal-string* literal-ending-delimiter
expanded-literal-string ::
expanded-literal-character
| expanded-delimited-string
expanded-literal-character ::
non-escaped-literal-character
| double-escape-sequence
| interpolated-character-sequence
expanded-delimited-string ::
literal-beginning-delimiter expanded-literal-string* literal-ending-delimiter
literal-beginning-delimiter ::
source-character but not alpha-numeric-character-or-separator
alpha-numeric-character-or-separator ::
whitespace
| line-terminator
| uppercase-character
| lowercase-character
| decimal-digit
literal-ending-delimiter ::
[ depending on the literal-beginning-delimiter ]
matching-literal-beginning-delimiter ::
( | { | < | [
The literal-beginning-delimiter of an expanded-delimited-string shall be the same character as the literal-beginning-delimiter of the quoted-expanded-literal-string.
The literal-ending-delimiter shall match the literal-beginning-delimiter as described in §8.5.5.2.3.
The production expanded-delimited-string applies only when the literal-beginning-delimiter is one of the characters of matching-literal-beginning-delimiter.
Semantics
An expanded-literal-string represents the content of a string.
A character in an expanded-literal-string other than a double-escape-sequence or an interpolated-character-sequence represents a character as it occurs in program text. A double-escape-sequence and an interpolated-character-sequence represent characters as described in §8.5.5.2.2.
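For comparison with the non-expanded form, a sketch of quoted-expanded-literal-strings with illustrative constant names:

```ruby
EXPANDED_SUM    = %Q(1 + 1 = #{1 + 1})  # interpolation is evaluated
EXPANDED_NO_Q   = %(tab:\there)         # the "Q" is optional; "\t" is a tab
EXPANDED_ANGLES = %<a <b> c>            # nested matching delimiters are kept
```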
8.5.5.2.5 Here documents
Syntax
here-document ::
heredoc-start-line heredoc-body heredoc-end-line
heredoc-start-line ::
heredoc-signifier rest-of-line
heredoc-signifier ::
<< heredoc-delimiter-specifier
rest-of-line ::
line-content? line-terminator
heredoc-body ::
heredoc-body-line*
heredoc-body-line ::
line but not heredoc-end-line
Semantics
A here-document is represented by several lines of program text, and evaluates to a direct instance of the class String or the value of the invocation of the method `.
The heredoc-signifier, the heredoc-body, and the heredoc-end-line in a here-document are treated as a unit and considered to be a single token occurring at the place where the heredoc-signifier occurs. The first character of the rest-of-line becomes the head of the input after the here-document has been processed.
The object to which here-document evaluates is either a direct instance S of the class String whose content is represented by the heredoc-body or the value of the invocation of the method ` with S as the only argument.
The form of the heredoc-delimiter-specifier determines both the form of the heredoc-end-line and the way in which the here-document is processed, as described below.
Syntax
heredoc-delimiter-specifier ::
-? heredoc-delimiter
heredoc-delimiter ::
non-quoted-delimiter
| single-quoted-delimiter
| double-quoted-delimiter
| command-quoted-delimiter
non-quoted-delimiter ::
non-quoted-delimiter-identifier
non-quoted-delimiter-identifier ::
identifier-character*
single-quoted-delimiter ::
' single-quoted-delimiter-identifier* '
single-quoted-delimiter-identifier ::
source-character but not '
double-quoted-delimiter ::
" double-quoted-delimiter-identifier* "
double-quoted-delimiter-identifier ::
source-character but not "
command-quoted-delimiter ::
` command-quoted-delimiter-identifier* `
command-quoted-delimiter-identifier ::
source-character but not `
heredoc-end-line ::
indented-heredoc-end-line
| non-indented-heredoc-end-line
indented-heredoc-end-line ::
[ beginning of a line ] whitespace* heredoc-delimiter-identifier line-terminator
non-indented-heredoc-end-line ::
[ beginning of a line ] heredoc-delimiter-identifier line-terminator
heredoc-delimiter-identifier ::
non-quoted-delimiter-identifier
| single-quoted-delimiter-identifier
| double-quoted-delimiter-identifier
| command-quoted-delimiter-identifier
Semantics
The form of a heredoc-end-line depends on the presence or absence of the beginning "-" of the heredoc-delimiter-specifier.
If the heredoc-delimiter-specifier begins with "-", a line of the form indented-heredoc-end-line is treated as the heredoc-end-line, otherwise, a line of the form non-indented-heredoc-end-line is treated as the heredoc-end-line. In both forms, the heredoc-delimiter-identifier shall be the same sequence of characters as it occurs in the corresponding part of heredoc-delimiter.
If the heredoc-delimiter is of the form non-quoted-delimiter, the heredoc-delimiter-identifier shall be the same sequence of characters as the non-quoted-delimiter-identifier; if it is of the form single-quoted-delimiter, the single-quoted-delimiter-identifier; if it is of the form double-quoted-delimiter, the double-quoted-delimiter-identifier; if it is of the form command-quoted-delimiter, the command-quoted-delimiter-identifier.
The object to which a here-document evaluates is created as follows:
- Create a direct instance of the class String from the heredoc-body, the treatment of which depends on the form of the heredoc-delimiter as follows:
  - If heredoc-delimiter is of the form single-quoted-delimiter, the heredoc-body is treated as a sequence of source-characters as it occurs in program text literally.
  - If heredoc-delimiter is in any of the forms non-quoted-delimiter, double-quoted-delimiter, or command-quoted-delimiter, the heredoc-body is treated as a sequence of double-quoted-string-characters as described in §8.5.5.2.2.
  Let S be that instance of the class String.
- If the heredoc-delimiter is not of the form command-quoted-delimiter, let V be S.
- Otherwise, invoke the method ` on the current self with the list of arguments whose only element is S. Let V be the resulting value of the method invocation.
- V is the object to which the here-document evaluates.
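The forms of here-document that evaluate to a String are sketched below with illustrative names. The command-quoted form is left out because it would run an external command.

```ruby
# Non-quoted delimiter: the body is treated like a double-quoted-string.
HEREDOC_EXPANDED = <<EOS
two is #{1 + 1}
EOS

# Single-quoted delimiter: the body is taken literally.
HEREDOC_LITERAL = <<'EOS'
two is #{1 + 1}
EOS

# A leading "-" allows whitespace before the delimiter on the end line.
def heredoc_with_indented_end
  <<-EOS
  body keeps its own indentation
  EOS
end

# The here-document is one token at the position of the signifier, so the
# rest of that line (".upcase + ...") is parsed after the body is consumed.
HEREDOC_IN_EXPRESSION = <<EOS.upcase + "done"
shout
EOS
```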
8.5.5.2.6 External command execution
Syntax
external-command-execution ::
backquoted-external-command-execution
| quoted-external-command-execution
backquoted-external-command-execution ::
` double-quoted-string-character* `
quoted-external-command-execution ::
%x literal-beginning-delimiter expanded-literal-string* literal-ending-delimiter
The literal-ending-delimiter shall match the literal-beginning-delimiter as described in §8.5.5.2.3.
Semantics
An external-command-execution is a form to invoke the method "`".
An external-command-execution is evaluated as follows:
- If the external-command-execution is of the form backquoted-external-command-execution, construct a direct instance of the class String S by replacing the two "`" with """ and evaluating the resulting double-quoted-string as described in §8.5.5.2.2.
- If the external-command-execution is of the form quoted-external-command-execution, construct a direct instance of the class String S by replacing "%x" with "%Q" and evaluating the resulting quoted-expanded-literal-string as described in §8.5.5.2.4.
- Invoke the method "`" on the current self with a list of arguments whose only element is S.
- The resulting value is the value of the external-command-execution.
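Because an external-command-execution is an invocation of the method "`" on the current self, the evaluation can be observed without running any command by overriding that method. CommandRecorder below is a hypothetical class written for this sketch.

```ruby
class CommandRecorder
  # Overrides the method "`" so that no external command is actually run.
  def `(command)
    "would run: #{command}"
  end

  # backquoted-external-command-execution: the content is evaluated like
  # a double-quoted-string before the method is invoked.
  def backquoted
    `echo #{1 + 1}`
  end

  # quoted-external-command-execution.
  def quoted
    %x(ls -l)
  end
end
```

Here CommandRecorder.new.backquoted receives the already interpolated string "echo 2" as its only argument.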
8.5.5.3 Array literals
Syntax
array-literal ::
quoted-non-expanded-array-constructor
| quoted-expanded-array-constructor
quoted-non-expanded-array-constructor ::
%w literal-beginning-delimiter non-expanded-array-content literal-ending-delimiter
non-expanded-array-content ::
quoted-array-item-separator-list? non-expanded-array-item-list? quoted-array-item-separator-list?
non-expanded-array-item-list ::
non-expanded-array-item ( quoted-array-item-separator-list non-expanded-array-item )*
quoted-array-item-separator-list ::
quoted-array-item-separator+
quoted-array-item-separator ::
whitespace
| line-terminator
non-expanded-array-item ::
non-expanded-array-item-character+
non-expanded-array-item-character ::
non-escaped-array-item-character
| non-expanded-array-escape-sequence
non-escaped-array-item-character ::
non-escaped-array-character
| matching-literal-delimiter
non-escaped-array-character ::
non-escaped-literal-character but not quoted-array-item-separator
matching-literal-delimiter ::
( | { | < | [ | ) | } | > | ]
non-expanded-array-escape-sequence ::
non-expanded-literal-escape-sequence but not escaped-quoted-array-item-separator
| escaped-quoted-array-item-separator
escaped-quoted-array-item-separator ::
\ quoted-array-item-separator
quoted-expanded-array-constructor ::
%W literal-beginning-delimiter expanded-array-content literal-ending-delimiter
expanded-array-content ::
quoted-array-item-separator-list? expanded-array-item-list? quoted-array-item-separator-list?
expanded-array-item-list ::
expanded-array-item ( quoted-array-item-separator-list expanded-array-item )*
expanded-array-item ::
expanded-array-item-character+
expanded-array-item-character ::
non-escaped-array-item-character
| expanded-array-escape-sequence
| interpolated-character-sequence
expanded-array-escape-sequence ::
double-escape-sequence but not escaped-quoted-array-item-separator
| escaped-quoted-array-item-separator
The literal-ending-delimiter shall match the literal-beginning-delimiter as described in §8.5.5.2.3.
When the literal-beginning-delimiter is one of the matching-literal-beginning-delimiters, the quoted-non-expanded-array-constructor or the quoted-expanded-array-constructor is determined as follows.
Let N be 0. For each character C which appears after "%w" or "%W", take the following steps.
- If C is a literal-beginning-delimiter which is not prefixed by a "\", increment N by 1.
- If C is a literal-ending-delimiter which is not prefixed by a "\", decrement N by 1.
- If N is 0 and C is the literal-ending-delimiter, terminate these steps.
The literal-ending-delimiter in the last step is the literal-ending-delimiter of the quoted-non-expanded-array-constructor or the quoted-expanded-array-constructor.
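The counting steps above can be sketched as a small scanner. closing_delimiter_index is a hypothetical helper; it follows the steps literally, so a delimiter counts as escaped whenever the preceding character is a backslash.

```ruby
# Hypothetical helper: returns the index of the literal-ending-delimiter
# of a "%w" or "%W" literal that starts at index 0 of source.
def closing_delimiter_index(source, open_char, close_char)
  depth = 0 # N in the steps above
  (2...source.length).each do |i| # the characters after "%w" or "%W"
    c = source[i]
    escaped = source[i - 1] == "\\"
    depth += 1 if c == open_char && !escaped
    depth -= 1 if c == close_char && !escaped
    return i if depth.zero? && c == close_char
  end
  nil # no matching literal-ending-delimiter was found
end
```

For the text %w(a (b c) d) + x the helper returns 12, the position of the final ")".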
Semantics
An array-literal evaluates to a direct instance of the class Array.
A quoted-non-expanded-array-constructor is evaluated as follows:
- Create an empty direct instance of the class Array. Let A be the instance.
- If non-expanded-array-item-list occurs, for each non-expanded-array-item of the non-expanded-array-item-list, take the following steps:
  - Create a direct instance of the class String S, the content of which is represented by the sequence of non-expanded-array-item-characters. A non-expanded-array-item-character represents itself, except in the case of a non-expanded-array-escape-sequence. A non-expanded-array-escape-sequence represents a character as described in §8.5.5.2.3, except in the case of an escaped-quoted-array-item-separator. An escaped-quoted-array-item-separator represents a quoted-array-item-separator.
  - Append S to A.
- The value of the quoted-non-expanded-array-constructor is A.
A quoted-expanded-array-constructor is evaluated as follows:
- Create an empty direct instance of the class Array. Let A be the instance.
- If expanded-array-item-list occurs, process each expanded-array-item of the expanded-array-item-list as follows:
  - Create a direct instance of the class String S, the content of which is represented by the sequence of expanded-array-item-characters. An expanded-array-item-character represents itself, except in the case of an expanded-array-escape-sequence and an interpolated-character-sequence. An expanded-array-escape-sequence represents a character as described in §8.5.5.2.2, except in the case of an escaped-quoted-array-item-separator. An escaped-quoted-array-item-separator represents a quoted-array-item-separator. An interpolated-character-sequence represents a sequence of characters as described in §8.5.5.2.2.
  - Append S to A.
- The value of the quoted-expanded-array-constructor is A.
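The difference between the two array constructors is illustrated by the sketch below; the constant names are chosen for illustration only.

```ruby
WORDS            = %w(alpha beta gamma)
ESCAPED_SPACE    = %w(new\ york paris)  # "\ " keeps the separator inside the item
NOT_EXPANDED     = %w(a#{1} b\tc)       # items are taken literally
EXPANDED_ITEMS   = %W(a#{1} b\tc)       # interpolation and escapes apply
NESTED_IN_ITEMS  = %w(a (b c) d)        # inner "(" and ")" are item characters
```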
8.5.5.4 Regular expression literals
Syntax
regular-expression-literal ::
/ regular-expression-body / regular-expression-option*
| %r literal-beginning-delimiter expanded-literal-string* literal-ending-delimiter regular-expression-option*
regular-expression-body ::
regular-expression-character*
regular-expression-character ::
source-character but not ( / | \ )
| \\
| line-terminator-escape-sequence
| interpolated-character-sequence
regular-expression-option ::
i | m
Within an expanded-literal-string, a literal-beginning-delimiter shall be the same character as the literal-beginning-delimiter of a regular-expression-literal.
The literal-ending-delimiter shall match the literal-beginning-delimiter as described in §8.5.5.2.3.
If a regular-expression-literal of the form / regular-expression-body / regular-expression-option* is the first argument (see §11.2.1), the first character of the regular-expression-body shall not be whitespace.
Semantics
A regular-expression-literal evaluates to a direct instance of the class Regexp.
The pattern of an instance of the class Regexp resulting from a regular-expression-literal is the string which regular-expression-characters or expanded-literal-strings represent. If the string cannot be derived from the pattern (see §15.2.15.3), the evaluation of the program shall be terminated and a syntax error shall be reported.
A regular-expression-character other than the sequence \\, a line-terminator-escape-sequence, or an interpolated-character-sequence represents itself. An expanded-literal-string other than a line-terminator-escape-sequence or an interpolated-character-sequence represents itself.
The sequence \\ of regular-expression-character represents a single character \.
A line-terminator-escape-sequence in a regular-expression-character and an expanded-literal-string is ignored in the resulting pattern of an instance of the class Regexp.
An interpolated-character-sequence in a regular-expression-literal and an expanded-literal-string is evaluated as described in §8.5.5.2.2, and represents a string which is the content of the resulting instance of the class String.
A regular-expression-option specifies the ignorecase and the multiline properties of an instance of the class Regexp resulting from a regular-expression-literal. If i occurs in a regular-expression-option, the ignorecase property of the resulting instance of the class Regexp is set to true. If m occurs in a regular-expression-option, the multiline property of the resulting instance of the class Regexp is set to true.
The grammar for a pattern of an instance of the class Regexp created from a regular-expression-literal is described in §15.2.15.
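Both forms of regular-expression-literal and the two options are shown in the sketch below. The constant names are illustrative, and Regexp#match?, Regexp#source and Regexp#options are used here only to inspect the results.

```ruby
REGEX_WORD = "cat"

IGNORE_CASE_PATTERN  = /abc/i              # "i" sets the ignorecase property
MULTILINE_PATTERN    = /a.c/m              # "m" sets the multiline property
SINGLE_LINE_PATTERN  = /a.c/               # without "m", "." does not match a line feed
SLASH_PATTERN        = %r{/usr/local}      # "%r" avoids escaping "/"
INTERPOLATED_PATTERN = /#{REGEX_WORD}s?/   # evaluated when the literal is evaluated
```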
8.5.5.5 Symbol literals
Syntax
symbol ::
symbol-literal
| dynamic-symbol
symbol-literal ::
: symbol-name
dynamic-symbol ::
: single-quoted-string
| : double-quoted-string
| %s literal-beginning-delimiter non-expanded-literal-string* literal-ending-delimiter
symbol-name ::
method-identifier
| operator-method-name
| reserved-word
| instance-variable-identifier
| global-variable-identifier
| class-variable-identifier
single-quoted-strings, double-quoted-strings, and non-expanded-literal-strings shall not contain any sequences which represent the character 0x00.
Within a non-expanded-literal-string, literal-beginning-delimiter shall be the same character as the literal-beginning-delimiter of the dynamic-symbol.
The literal-ending-delimiter shall match the literal-beginning-delimiter as described in §8.5.5.2.3.
Semantics
A symbol evaluates to a direct instance of the class Symbol. A symbol-literal evaluates to a direct instance of the class Symbol whose name is the symbol-name. A dynamic-symbol evaluates to a direct instance of the class Symbol whose name is the content of an instance of the class String which is the value of the single-quoted-string (see §8.5.5.2.1), double-quoted-string (see §8.5.5.2.2), or non-expanded-literal-string (see §8.5.5.2.3).
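The symbol forms can be illustrated as follows. This is a sketch with illustrative constant names; each element of SYMBOL_NAMES uses one alternative of symbol-name.

```ruby
# symbol-literal: method-identifiers, an operator-method-name, a
# reserved-word, and the three kinds of variable identifier.
SYMBOL_NAMES = [:name, :name=, :empty?, :+, :[]=, :if, :@count, :@@total, :$stdout]

# dynamic-symbol: the name is the content of the evaluated string.
DOUBLE_QUOTED_SYMBOL = :"hello world"
SINGLE_QUOTED_SYMBOL = :'it\'s'
PERCENT_SYMBOL       = %s(abc def)
INTERPOLATED_SYMBOL  = :"id_#{1 + 1}"
```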