Notes on Apocalypse 5 (http://www.perl.com/lpt/a/2002/06/04/apo5.html)
| Syntax | Meaning |
| (...) | Capturing brackets |
| [...] | Grouping brackets (non-capturing) |
| {...} | Closure (return value ignored unless assigned) |
| <...> | Assertion (does this thing match?) |
| :... | Introduces a meta-syntactic token (a modifier) |
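The difference between capturing and grouping brackets can be tried out today with Python's `re` module. This is only an analogue, not Perl 6: Python spells a non-capturing group `(?:...)` where Apocalypse 5 uses `[...]`, and the sample pattern and input are made up for illustration.

```python
import re

# Perl 6 (...) captures; Python also writes this as (...).
# Perl 6 [...] groups without capturing; Python writes this as (?:...).
capturing = re.compile(r"(\d+)-(\d+)")
grouping = re.compile(r"(?:\d+)-(\d+)")

m1 = capturing.match("10-20")
m2 = grouping.match("10-20")

print(m1.groups())  # both numbers are captured
print(m2.groups())  # only the second number is captured
```

Both patterns match the same text; only the set of captures differs.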
$0 is the match object. Assigning to $0 affects the return value of the RE.
Modifiers now come at the beginning of the RE statement and as such, the leading m or s is required. For example, s:i/foo/bar/ replaces "foo", "Foo", "FOO", etc. with "bar". The modifier :i means "ignore case".
| Long Modifier | Short Modifier | Meaning |
| :cont | :c | Continue from where the previous match left off |
| :words | :w | Match a sequence of "words". Causes whitespace in the pattern (which is normally ignored) to be replaced by \s+ between identifiers and \s* anywhere else |
| :ignorecase | :i | Match all alphabetic characters in a case insensitive manner |
| :any | :a | Return a list of every place the pattern matches within the string, regardless of overlap |
| :each | :e | Apply the pattern iteratively ("each" time we can) |
| :once | :o | Match succeeds exactly once. To allow the RE to match again, execute the .reset() method on the RE object. |
| :perl5 | :p5 | Perl 5 matching. Causes the RE to be interpreted using the Perl 5 rules |
| ? | :u0 | Level 0 Unicode support. A . (dot) matches bytes |
| ? | :u1 | Level 1 Unicode support. A . (dot) matches code points |
| ? | :u2 | Level 2 Unicode support. A . (dot) matches a grapheme |
| ? | :u3 | Level 3 Unicode support. What . (dot) matches is language dependent |
| :nth(1) | :1st | Only match the first occurrence of the pattern |
| :nth(2) | :2nd | Only match the second occurrence of the pattern |
| :nth(3) | :3rd | Only match the third occurrence of the pattern |
| :nth(4) | :4th | Only match the fourth occurrence of the pattern |
| :nth(5) | :5th | Only match the fifth occurrence of the pattern |
| ... | ... | ... |
| :x(1) | :1x | Match one time |
| :x(2) | :2x | Match two times |
| :x(3) | :3x | Match three times |
| ... | ... | ... |
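A few of the modifiers above have rough counterparts in Python's `re` module, which makes their behaviour easier to picture. The sketch below is an approximation under stated assumptions, not Perl 6: `nth_match` is a hypothetical helper, the sample strings are invented, the lookahead trick finds at most one match per starting position (so it is weaker than :any as described), and `count=1` assumes that a substitution without :each stops after the first match.

```python
import re

text = "foo Foo FOO"

# :i analogue: ignore case. count=1 replaces only the first match
# (assumed behaviour of s:i/foo/bar/ without :each).
first_only = re.sub(r"foo", "bar", text, count=1, flags=re.IGNORECASE)

# :each analogue: re.sub replaces every match when count is omitted.
every = re.sub(r"foo", "bar", text, flags=re.IGNORECASE)


def nth_match(pattern, string, n):
    """Hypothetical :nth(n) analogue: the n-th (1-based) match, or None."""
    for i, m in enumerate(re.finditer(pattern, string), start=1):
        if i == n:
            return m
    return None


second = nth_match(r"o+", "fo foo fooo", 2)

# :any analogue (approximate): a zero-width lookahead lets matches overlap.
overlapping = [m.start() for m in re.finditer(r"(?=aba)", "ababa")]

# :words analogue: whitespace between words stands for \s+.
words = re.compile(r"\s+".join(["hello", "world"]))
spaced = words.search("say hello \t world")

print(first_only, every, second.group(), overlapping, bool(spaced))
```

The helper returns `None` when there are fewer than `n` matches, which is one plausible reading of an :nth match failing.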
| Assertion | Meaning |
| <[...]> | matches ... as a character class |
| <'...'> | matches ... as a literal string |
| <alpha> | matches any alphabetic character |
| <digit> | matches any numeric character |
| <sp> | matches a space character |
| <ws> | matches any sequence of whitespace (same as \s+) |
| <dot> | matches a literal . character (same as <'.'>) |
| <lt> | matches a literal < character (same as <'<'>) |
| <gt> | matches a literal > character (same as <'>'>) |
| <prior> | matches whatever the most recent successful match did |
| <after pattern> | matches only after pattern (zero-width) |
| <before pattern> | matches only before pattern (zero-width) |
| <commit> | fails the entire match if backtracked to |
| <cut> | fails the entire match if backtracked to and removes the portion of the string that matched to that point |
| <fail> | causes the match to fail if reached |
| <null> | matches nothing |
| <ident> | matches an "identifier" (same as [ [<alpha>|_] \w* ]) |
| <self> | matches the same pattern as the current rule (useful for recursion) |
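The zero-width assertions and <ident> in the table above map fairly directly onto constructs in Python's `re` module. The sketch below is an analogue only: the sample strings are invented, `(?=...)` and `(?<=...)` stand in for <before pattern> and <after pattern>, and the identifier pattern is an ASCII-only approximation of [ [<alpha>|_] \w* ].

```python
import re

# <before pattern> analogue: lookahead, matches only before " USD".
price = re.search(r"\d+(?= USD)", "cost: 42 USD")

# <after pattern> analogue: lookbehind, matches only after "$".
amount = re.search(r"(?<=\$)\d+", "total $15")

# <ident> analogue: a letter or underscore, then any word characters.
ident = re.compile(r"[A-Za-z_]\w*")

# <dot> analogue: an escaped dot matches a literal "." only.
dot = re.compile(r"\.")

print(price.group(), amount.group())
print(bool(ident.fullmatch("_tmp1")), bool(ident.fullmatch("1abc")))
print(bool(dot.fullmatch(".")), bool(dot.fullmatch("x")))
```

Because the lookarounds are zero-width, " USD" and "$" are checked but not included in the matched text.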