| 1 | ---
|
| 2 | title: YSH Expression Language (Oils Reference)
|
| 3 | all_docs_url: ..
|
| 4 | body_css_class: width40
|
| 5 | default_highlighter: oils-sh
|
| 6 | preserve_anchor_case: yes
|
| 7 | ---
|
| 8 |
|
| 9 | <div class="doc-ref-header">
|
| 10 |
|
| 11 | [Oils Reference](index.html) —
|
| 12 | Chapter **YSH Expression Language**
|
| 13 |
|
| 14 | </div>
|
| 15 |
|
| 16 | This chapter describes the YSH expression language, which includes [Egg
|
| 17 | Expressions]($xref:eggex).
|
| 18 |
|
| 19 | <div id="dense-toc">
|
| 20 | </div>
|
| 21 |
|
| 22 | ## Assignment
|
| 23 |
|
| 24 | ### assign
|
| 25 |
|
| 26 | The `=` operator is used with declaration and assignment keywords:
|
| 27 |
|
| 28 | var x = 42
|
| 29 | setvar x = 43
|
| 30 |
|
| 31 | const name = 'bob'
|
| 32 |
|
| 33 | setglobal a = [3, 4, 5]
|
| 34 |
|
| 35 | ### aug-assign
|
| 36 |
|
| 37 | The augmented assignment operators are:
|
| 38 |
|
| 39 | ```raw
|
| 40 | += -= *= /= **= //= %=
|
| 41 | &= |= ^= <<= >>=
|
| 42 | ```
|
| 43 |
|
| 44 | They are used with `setvar` and `setglobal`. For example, these two statements
|
| 45 | are the same:
|
| 46 |
|
| 47 | setvar x += 2
|
| 48 | setvar x = x + 2
|
| 49 |
|
| 50 | Likewise, these two are also the same:
|
| 51 |
|
| 52 | setglobal a[2] -= 5
|
| 53 | setglobal a[2] = a[2] - 5
|
| 54 |
|
| 55 | ## Literals
|
| 56 |
|
| 57 | ### atom-literal
|
| 58 |
|
| 59 | YSH uses JavaScript-like spellings for these three "atoms":
|
| 60 |
|
| 61 | ```raw
|
| 62 | null # type Null
|
| 63 | true false # type Bool
|
| 64 | ```
|
| 65 |
|
| 66 | Note: to signify "no value", you may sometimes use an empty string `''`,
|
| 67 | instead of `null`.
|
| 68 |
|
| 69 | - Related: [Null][] type, [Bool][] type
|
| 70 |
|
| 71 | [Null]: chap-type-method.html#Null
|
| 72 | [Bool]: chap-type-method.html#Bool
|
| 73 |
|
| 74 | ### int-literal
|
| 75 |
|
| 76 | There are several ways to write integers. Examples:
|
| 77 |
|
| 78 | var decimal = 42
|
| 79 | var big = 42_000
|
| 80 |
|
| 81 | var hex = 0x0010_ffff
|
| 82 |
|
| 83 | var octal = 0o755
|
| 84 |
|
| 85 | var binary = 0b0001_0000
|
| 86 |
|
| 87 | - Related: [Int][] type
|
| 88 |
|
| 89 | [Int]: chap-type-method.html#Int
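
The spellings above are also valid Python integer literals, so a short Python sketch can show what each one denotes. Python is used here only as a calculator. The snippet assumes YSH gives these literals the same values, which the examples suggest but which it does not itself prove.

```python
# Python accepts the same literal spellings, including `_` digit separators.
# Assumption: YSH assigns these literals the same integer values.
literals = {
    "decimal": 42,
    "big": 42_000,
    "hex": 0x0010_ffff,
    "octal": 0o755,
    "binary": 0b0001_0000,
}

for name, value in literals.items():
    print(f"{name:8} {value}")
```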
|
| 90 |
|
| 91 | ### float-literal
|
| 92 |
|
| 93 | Floating point numbers look like C, Python, or JavaScript:
|
| 94 |
|
| 95 | var myfloat = 3.14
|
| 96 |
|
| 97 | var f2 = -1.5e-100
|
| 98 |
|
| 99 | - Related: [Float][] type
|
| 100 |
|
| 101 | [Float]: chap-type-method.html#Float
|
| 102 |
|
| 103 | ### char-literal
|
| 104 |
|
| 105 | The expression language has 3 kinds of backslash escapes, denoting bytes or
|
| 106 | UTF-8:
|
| 107 |
|
| 108 | var backslash = \\
|
| 109 | var quotes = \' ++ \" # same as u'\'' ++ '"'
|
| 110 |
|
| 111 | var mu = \u{3bc} # same as u'\u{3bc}'
|
| 112 |
|
| 113 | var nul = \y00 # same as b'\y00'
|
| 114 |
|
| 115 | Notice that this is the same syntax that's available within quoted J8 strings.
|
| 116 | That is, the expression `\\` denotes the same thing as `u'\\'`.
|
| 117 |
|
| 118 | - Related: [Str][] type
|
| 119 |
|
| 120 | [Str]: chap-type-method.html#Str
|
| 121 |
|
| 122 | ### ysh-string
|
| 123 |
|
| 124 | YSH has single and double-quoted strings borrowed from Bourne shell, and
|
| 125 | C-style strings borrowed from J8 Notation.
|
| 126 |
|
| 127 | Double quoted strings respect `$` interpolation:
|
| 128 |
|
| 129 | var dq = "hello $name and $(hostname)"
|
| 130 |
|
| 131 | You can add a `$` before the left quote to be explicit: `$"x is $x"` rather
|
| 132 | than `"x is $x"`.
|
| 133 |
|
| 134 | Single quoted strings may be raw:
|
| 135 |
|
| 136 | var s = r'line\n' # raw string means \n is literal, NOT a newline
|
| 137 |
|
| 138 | Or *J8 strings* with backslash escapes:
|
| 139 |
|
| 140 | var s = u'line\n \u{3bc}' # unicode string means \n is a newline
|
| 141 | var s = b'line\n \u{3bc} \yff' # same thing, but also allows bytes
|
| 142 |
|
| 143 | Both `u''` and `b''` strings evaluate to the single `Str` type. The difference
|
| 144 | is that `b''` strings allow the `\yff` byte escape.
|
| 145 |
|
| 146 | #### Notes
|
| 147 |
|
| 148 | There's no way to express a single quote in raw strings. Use one of the other
|
| 149 | forms instead:
|
| 150 |
|
| 151 | var sq = "single quote: ' "
|
| 152 | var sq = u'single quote: \' '
|
| 153 |
|
| 154 | Sometimes you can omit the `r`, e.g. where there are no backslashes and thus no
|
| 155 | ambiguity:
|
| 156 |
|
| 157 | echo 'foo'
|
| 158 | echo r'foo' # same thing
|
| 159 |
|
| 160 | The `u''` and `b''` strings are called *J8 strings* because the syntax in YSH
|
| 161 | **code** matches JSON-like **data**.
|
| 162 |
|
| 163 | var strU = u'mu = \u{3bc}' # J8 string with escapes
|
| 164 | var strB = b'bytes \yff' # J8 string that can express byte strings
|
| 165 |
|
| 166 | More examples:
|
| 167 |
|
| 168 | var myRaw = r'[a-z]\n' # raw strings can be used for regexes (not
|
| 169 | # eggexes)
|
| 170 |
|
| 171 | ### triple-quoted
|
| 172 |
|
| 173 | Triple-quoted string literals have leading whitespace stripped on each line.
|
| 174 | They come in the same variants:
|
| 175 |
|
| 176 | var dq = """
|
| 177 | hello $name and $(hostname)
|
| 178 | no leading whitespace
|
| 179 | """
|
| 180 |
|
| 181 | var myRaw = r'''
|
| 182 | raw string
|
| 183 | no leading whitespace
|
| 184 | '''
|
| 185 |
|
| 186 | var strU = u'''
|
| 187 | string that happens to be unicode \u{3bc}
|
| 188 | no leading whitespace
|
| 189 | '''
|
| 190 |
|
| 191 | var strB = b'''
|
| 192 | string that happens to be bytes \u{3bc} \yff
|
| 193 | no leading whitespace
|
| 194 | '''
|
| 195 |
|
| 196 | Again, you can omit the `r` prefix if there's no backslash, because it's not
|
| 197 | ambiguous:
|
| 198 |
|
| 199 | var myRaw = '''
|
| 200 | raw string
|
| 201 | no leading whitespace
|
| 202 | '''
|
| 203 |
|
| 204 | [Expr]: chap-type-method.html#Expr
|
| 205 |
|
| 206 | ### list-literal
|
| 207 |
|
| 208 | Lists have a Python-like syntax:
|
| 209 |
|
| 210 | var mylist = ['one', 'two', [42, 43]]
|
| 211 |
|
| 212 | And a shell-like syntax:
|
| 213 |
|
| 214 | var list2 = :| one two |
|
| 215 |
|
| 216 | The shell-like syntax accepts the same syntax as a simple command:
|
| 217 |
|
| 218 | echo $name @ARGV *.py {foo,bar}@example.com
|
| 219 |
|
| 220 | # Rather than executing echo, evaluate words into a List
|
| 221 | var cmd = :| echo $name @ARGV *.py {foo,bar}@example.com |
|
| 222 |
|
| 223 | - Related: [List][] type
|
| 224 |
|
| 225 | [List]: chap-type-method.html#List
|
| 226 |
|
| 227 | ### dict-literal
|
| 228 |
|
| 229 | Dicts look like JavaScript.
|
| 230 |
|
| 231 | var d = {
|
| 232 | key1: 'value', # key can be unquoted if it looks like a var name
|
| 233 | 'key2': 42, # or quote it
|
| 234 |
|
| 235 | ['key2' ++ name]: 43, # bracketed expression
|
| 236 | }
|
| 237 |
|
| 238 | Omitting a value means that the corresponding key takes the value of a var of
|
| 239 | the same name:
|
| 240 |
|
| 241 | var x = 42
|
| 242 | var y = 43
|
| 243 |
|
| 244 | var d = {x, y} # values omitted
|
| 245 | = d # => (Dict) {x: 42, y: 43}
|
| 246 |
|
| 247 | - Related: [Dict][] type
|
| 248 |
|
| 249 | [Dict]: chap-type-method.html#Dict
|
| 250 |
|
| 251 | ### range
|
| 252 |
|
| 253 | A Range is a sequence of numbers that can be iterated over. The `..<` operator
|
| 254 | constructs half-open ranges.
|
| 255 |
|
| 256 | for i in (0 ..< 3) {
|
| 257 | echo $i
|
| 258 | }
|
| 259 | # => 0
|
| 260 | # => 1
|
| 261 | # => 2
|
| 262 |
|
| 263 | The `..=` operator constructs closed ranges:
|
| 264 |
|
| 265 | for i in (0 ..= 3) {
|
| 266 | echo $i
|
| 267 | }
|
| 268 | # => 0
|
| 269 | # => 1
|
| 270 | # => 2
|
| 271 | # => 3
|
| 272 |
|
| 273 | - Related: [Range][] type
|
| 274 |
|
| 275 | [Range]: chap-type-method.html#Range
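
For readers coming from Python, the two range operators can be sketched in terms of Python's `range()`. The helper names `half_open` and `closed` are hypothetical and exist only for this illustration.

```python
def half_open(start: int, end: int) -> range:
    """Model of `start ..< end`: the end value is excluded."""
    return range(start, end)


def closed(start: int, end: int) -> range:
    """Model of `start ..= end`: the end value is included."""
    return range(start, end + 1)


print(list(half_open(0, 3)))  # [0, 1, 2]
print(list(closed(0, 3)))     # [0, 1, 2, 3]
```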
|
| 276 |
|
| 277 | ### block-expr
|
| 278 |
|
| 279 | In YSH expressions, we use `^()` to create a [Command][] object:
|
| 280 |
|
| 281 | var myblock = ^(echo $PWD; ls *.txt)
|
| 282 |
|
| 283 | It's more common for [Command][] objects to be created with block arguments,
|
| 284 | which are not expressions:
|
| 285 |
|
| 286 | cd /tmp {
|
| 287 | echo $PWD
|
| 288 | ls *.txt
|
| 289 | }
|
| 290 |
|
| 291 | [Command]: chap-type-method.html#Command
|
| 292 |
|
| 293 | ### expr-literal
|
| 294 |
|
| 295 | An expression literal is an object that holds an unevaluated expression:
|
| 296 |
|
| 297 | var myexpr = ^[1 + 2*3]
|
| 298 |
|
| 299 | - Related: [Expr][] type
|
| 300 |
|
| 301 | [Expr]: chap-type-method.html#Expr
|
| 302 |
|
| 303 | ### str-template
|
| 304 |
|
| 305 | String templates use the same syntax as double-quoted strings:
|
| 306 |
|
| 307 | var mytemplate = ^"name = $name, age = $age"
|
| 308 |
|
| 309 | Related topics:
|
| 310 |
|
| 311 | - The type of a template is [Expr][].
|
| 312 | - [Str.replace](chap-type-method.html#replace)
|
| 313 | - [ysh-string](#ysh-string)
|
| 314 |
|
| 315 | ### expr-sub
|
| 316 |
|
| 317 | Turn an expression into a string.
|
| 318 |
|
| 319 | var x = $[3 * 2]
|
| 320 | = x # => (Str) '6'
|
| 321 |
|
| 322 | This is the same as [Word Language > expr-sub](chap-word-lang.html#expr-sub).
|
| 323 |
|
| 324 | ### expr-splice
|
| 325 |
|
| 326 | Turns each element of a List into a string.
|
| 327 |
|
| 328 | var mylist = [42, 43]
|
| 329 | var x = @[mylist]
|
| 330 | = x # => (List) ['42', '43']
|
| 331 |
|
| 332 | This is the same as [Word Language > expr-splice](chap-word-lang.html#expr-splice).
|
| 333 |
|
| 334 | ## Operators
|
| 335 |
|
| 336 | ### op-precedence
|
| 337 |
|
| 338 | YSH operator precedence is identical to Python's operator precedence.
|
| 339 |
|
| 340 | New operators:
|
| 341 |
|
| 342 | - `++` has the same precedence as `+`
|
| 343 | - `->` and `=>` have the same precedence as `.`
|
| 344 |
|
| 345 | <!-- TODO: show grammar -->
|
| 346 |
|
| 347 |
|
| 348 | <h3 id="concat">concat <code>++</code></h3>
|
| 349 |
|
| 350 | The concatenation operator works on `Str` objects:
|
| 351 |
|
| 352 | var s = 'hello'
|
| 353 | var t = s ++ ' world'
|
| 354 |
|
| 355 | = t # => (Str) "hello world"
|
| 356 |
|
| 357 | and `List` objects:
|
| 358 |
|
| 359 | var L = ['one', 'two']
|
| 360 | var M = L ++ ['three', '4']
|
| 361 |
|
| 362 | = M # => (List) ["one", "two", "three", "4"]
|
| 363 |
|
| 364 | and `Dict` objects:
|
| 365 |
|
| 366 | var d = {a: 2, b: 3}
|
| 367 | var other = {a: 42}
|
| 368 | = d ++ other # => (Dict) {a: 42, b:3}
|
| 369 | = d # => (Dict) {a: 2, b:3}
|
| 370 |
|
| 371 | String interpolation can be nicer than `++`:
|
| 372 |
|
| 373 | var t2 = "${s} world" # same as t
|
| 374 |
|
| 375 | Likewise, splicing lists can be nicer:
|
| 376 |
|
| 377 | var M2 = :| @L three 4 | # same as M
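
The `Dict` example shows two properties of `++`: the right operand wins on duplicate keys, and the left operand is not modified. A minimal Python model of that behavior follows. The function name `concat` is hypothetical, and rejecting mixed operand types is an assumption of this sketch rather than something stated above.

```python
def concat(left, right):
    """Hypothetical model of `left ++ right` for Str, List and Dict operands."""
    if isinstance(left, dict) and isinstance(right, dict):
        # Builds a new dict; keys from `right` override keys from `left`.
        return {**left, **right}
    if isinstance(left, (str, list)) and type(left) is type(right):
        # Builds a new string or list; neither operand is modified.
        return left + right
    # Assumption of this sketch: mixed operand types are an error.
    raise TypeError("operands must both be Str, List, or Dict")


d = {"a": 2, "b": 3}
other = {"a": 42}
merged = concat(d, other)

print(merged)  # {'a': 42, 'b': 3}
print(d)       # {'a': 2, 'b': 3}
```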
|
| 378 |
|
| 379 | ### ysh-equals
|
| 380 |
|
| 381 | YSH has strict equality:
|
| 382 |
|
| 383 | ```raw
|
| 384 | a === b # Python-like, without type conversion
|
| 385 | a !== b # negated
|
| 386 | ```
|
| 387 |
|
| 388 | And type converting equality:
|
| 389 |
|
| 390 | ```raw
|
| 391 | '3' ~== 3 # True, type conversion
|
| 392 | ```
|
| 393 |
|
| 394 | The `~==` operator expects a string as the left operand.
|
| 395 |
|
| 396 | ---
|
| 397 |
|
| 398 | Note that:
|
| 399 |
|
| 400 | - `3 === 3.0` is false because integers and floats are different types, and
|
| 401 | there is no type conversion.
|
| 402 | - `3 ~== 3.0` is an error, because the left operand isn't a string.
|
| 403 |
|
| 404 | You may want to use explicit `int()` and `float()` to convert numbers, and then
|
| 405 | compare them.
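
The contrast with Python's `==` can be sketched as follows. This models only the rule stated above, that `===` does no type conversion. The name `strict_equal` is hypothetical, and the sketch is not a description of how Oils implements the operator.

```python
def strict_equal(a, b) -> bool:
    """Hypothetical model of `a === b`: values of different types are never equal."""
    return type(a) is type(b) and a == b


print(3 == 3.0)                   # True: Python's == converts Int to Float
print(strict_equal(3, 3.0))       # False: different types, no conversion
print(strict_equal(3, int(3.0)))  # True: converted explicitly, then compared
```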
|
| 406 |
|
| 407 | ---
|
| 408 |
|
| 409 | Compare objects for identity with `is`:
|
| 410 |
|
| 411 | var d = {}
|
| 412 | var e = d
|
| 413 |
|
| 414 | = d is d # => (Bool) true
|
| 415 |
|
| 416 | = d is {other: 'dict'} # => (Bool) false
|
| 417 |
|
| 418 | To negate `is`, use `is not` (like Python):
|
| 419 |
|
| 420 | = d is not {other: 'dict'} # => (Bool) true
|
| 421 |
|
| 422 | ### ysh-in
|
| 423 |
|
| 424 | The `in` operator tests if a key is in a dictionary:
|
| 425 |
|
| 426 | var d = {k: 42}
|
| 427 | if ('k' in d) {
|
| 428 | echo yes
|
| 429 | } # => yes
|
| 430 |
|
| 431 | Unlike Python, `in` doesn't work on `Str` and `List` instances. This is because
|
| 432 | those operations take linear time rather than constant time (O(n) rather than
|
| 433 | O(1)).
|
| 434 |
|
| 435 | TODO: Use `includes() / contains()` methods instead.
|
| 436 |
|
| 437 | ### ysh-compare
|
| 438 |
|
| 439 | The comparison operators apply to integers or floats:
|
| 440 |
|
| 441 | = 4 < 4 # => false
|
| 442 | = 4 <= 4 # => true
|
| 443 |
|
| 444 | = 5.0 > 5.0 # => false
|
| 445 | = 5.0 >= 5.0 # => true
|
| 446 |
|
| 447 | Example in context:
|
| 448 |
|
| 449 | var x = -42
|
| 450 | if (x < 0) {
|
| 451 | echo 'x is negative'
|
| 452 | }
|
| 453 |
|
| 454 | ### ysh-logical
|
| 455 |
|
| 456 | The logical operators take boolean operands, and are spelled like Python:
|
| 457 |
|
| 458 | ```raw
|
| 459 | not
|
| 460 | and or
|
| 461 | ```
|
| 462 |
|
| 463 | Note that they are distinct from `! && ||`, which are part of the [command
|
| 464 | language](chap-cmd-lang.html).
|
| 465 |
|
| 466 | ### ysh-arith
|
| 467 |
|
| 468 | YSH supports most of the arithmetic operators from Python. Notably, `//` and `%`
|
| 469 | differ from Python as [they round toward zero, not negative
|
| 470 | infinity](https://www.oilshell.org/blog/2024/03/release-0.21.0.html#integers-dont-do-whatever-python-or-c-does).
|
| 471 |
|
| 472 | Use `+ - *` for `Int` or `Float` addition, subtraction and multiplication. If
|
| 473 | any of the operands are `Float`s, then the output will also be a `Float`.
|
| 474 |
|
| 475 | Use `/` and `//` for `Float` division and `Int` division, respectively. `/`
|
| 476 | will _always_ result in a `Float`, while `//` will _always_ result in an
|
| 477 | `Int`.
|
| 478 |
|
| 479 | = 1 / 2 # => (Float) 0.5
|
| 480 | = 1 // 2 # => (Int) 0
|
| 481 |
|
| 482 | Use `%` to compute the _remainder_ of integer division. The left operand must
|
| 483 | be an `Int` and the right a _positive_ `Int`.
|
| 484 |
|
| 485 | = 1 % 2 # => (Int) 1
|
| 486 | = -4 % 2 # => (Int) 0
|
| 487 |
|
| 488 | Use `**` for exponentiation. The left operand must be an `Int` and the right a
|
| 489 | _positive_ `Int`.
|
| 490 |
|
| 491 | All arithmetic operators may coerce either of their operands from strings to a
|
| 492 | number, provided those strings are formatted as numbers.
|
| 493 |
|
| 494 | = 10 + '1' # => (Int) 11
|
| 495 |
|
| 496 | Operators like `+ - * /` will coerce strings to _either_ an `Int` or `Float`.
|
| 497 |
|
| 498 | = '1.14' + '2' # => (Float) 3.14
|
| 499 |
|
| 500 | However, operators like `// ** %` and bit shifts will coerce strings _only_ to
|
| 501 | an `Int`.
|
| 502 |
|
| 503 | ```raw
|
| 504 | = '1.14' % '2' # Type Error: Left operand is a Str
|
| 505 | ```
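
To make "round toward zero, not negative infinity" concrete, here is a Python sketch that places Python's floor-based integer division and remainder next to a truncating version. The helpers `trunc_div` and `trunc_rem` are hypothetical names. They model the rounding rule described at the top of this section and are not Oils code.

```python
def trunc_div(a: int, b: int) -> int:
    """Integer division that rounds toward zero (hypothetical helper)."""
    quotient = abs(a) // abs(b)
    return quotient if (a < 0) == (b < 0) else -quotient


def trunc_rem(a: int, b: int) -> int:
    """Remainder paired with trunc_div, so a == b * trunc_div(a, b) + trunc_rem(a, b)."""
    return a - b * trunc_div(a, b)


print(-7 // 2, -7 % 2)                     # -4 1   (Python rounds toward negative infinity)
print(trunc_div(-7, 2), trunc_rem(-7, 2))  # -3 -1  (rounding toward zero)
```

The two rules agree whenever both operands are positive. They differ only when the exact quotient is negative and not a whole number.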
|
| 506 |
|
| 507 | ### ysh-unary
|
| 508 |
|
| 509 | YSH has unary `+` and `-` operators:
|
| 510 |
|
| 511 | var x = '3.14'
|
| 512 | = +x # => (Float) 3.14
|
| 513 | = -x # => (Float) -3.14
|
| 514 |
|
| 515 | Like binary `+` and `-`, these operators coerce `Str` values with decimal
|
| 516 | digits to either an `Int` or `Float`.
|
| 517 |
|
| 518 | ### ysh-bitwise
|
| 519 |
|
| 520 | Bitwise operators are like Python and C:
|
| 521 |
|
| 522 | ```raw
|
| 523 | ~ # unary complement
|
| 524 |
|
| 525 | & | ^ # binary and, or, xor
|
| 526 |
|
| 527 | >> << # bit shift
|
| 528 | ```
|
| 529 |
|
| 530 | ### ysh-ternary
|
| 531 |
|
| 532 | The ternary operator is borrowed from Python:
|
| 533 |
|
| 534 | var display = 'yes' if len(s) else 'empty'
|
| 535 |
|
| 536 | ### ysh-index
|
| 537 |
|
| 538 | `Str` objects can be indexed by byte:
|
| 539 |
|
| 540 | var mystr = 'cat'
|
| 541 | = mystr[1] # => (Str) 'a'
|
| 542 |
|
| 543 | # index from the end
|
| 544 | = mystr[-1] # => (Str) 't'
|
| 545 |
|
| 546 | `List` objects:
|
| 547 |
|
| 548 | var mylist = [1, 2, 3]
|
| 549 | = mylist[2] # => (Int) 3
|
| 550 |
|
| 551 | `Dict` objects are indexed by string key:
|
| 552 |
|
| 553 | var mydict = {'key': 42}
|
| 554 | = mydict['key'] # => (Int) 42
|
| 555 |
|
| 556 | ### ysh-attr
|
| 557 |
|
| 558 | The `.` operator looks up values on either `Dict` or `Obj` instances.
|
| 559 |
|
| 560 | On dicts, it looks for the value associated with a key. That is, the
|
| 561 | expression `mydict.key` is short for `mydict['key']` (like JavaScript, but
|
| 562 | unlike Python).
|
| 563 |
|
| 564 | ---
|
| 565 |
|
| 566 | On objects, the expression `obj.x` looks for attributes, with a special rule
|
| 567 | for bound methods. The rules are:
|
| 568 |
|
| 569 | 1. Search the properties of `obj` for a field named `x`.
|
| 570 | - If it exists, return the value literally. (It can be of any type: `Func`, `Int`,
|
| 571 | `Str`, ...)
|
| 572 | 2. Search up the prototype chain for a field named `x`.
|
| 573 | - If it exists, and is **not** a `Func`, return the value literally.
|
| 574 | - If it **is** a `Func`, return a **bound method**, which is an (object,
|
| 575 | function) pair.
|
| 576 |
|
| 577 | Later, when the bound method is called, the object is passed as the first
|
| 578 | argument to the function (`self`), making it a method call. This is how a
|
| 579 | method has access to the object's properties.
|
| 580 |
|
| 581 | Example of first rule:
|
| 582 |
|
| 583 | func Free(i) {
|
| 584 | return (i + 1)
|
| 585 | }
|
| 586 | var module = Object(null, {Free})
|
| 587 | echo $[module.Free(42)] # => 43
|
| 588 |
|
| 589 | Example of second rule:
|
| 590 |
|
| 591 | func method(self, i) {
|
| 592 | return (self.n + i)
|
| 593 | }
|
| 594 | var methods = Object(null, {method})
|
| 595 | var obj = Object(methods, {n: 10})
|
| 596 | echo $[obj.method(42)] # => 52
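
The two lookup rules can be modeled in a few lines of Python. This is a sketch under stated assumptions: the `Obj` class and `get_attr` function are hypothetical stand-ins for YSH objects and the `.` operator, and the sketch says nothing about how Oils implements them.

```python
from functools import partial


class Obj:
    """Hypothetical stand-in for a YSH Obj: own properties plus a prototype."""

    def __init__(self, prototype, props):
        self.prototype = prototype
        self.props = props


def get_attr(obj, name):
    """Model of `obj.name` following the two rules above."""
    # Rule 1: a field in the object's own properties is returned as-is.
    if name in obj.props:
        return obj.props[name]
    # Rule 2: walk up the prototype chain.
    proto = obj.prototype
    while proto is not None:
        if name in proto.props:
            value = proto.props[name]
            if callable(value):
                # Bound method: an (object, function) pair.
                return partial(value, obj)
            return value
        proto = proto.prototype
    raise AttributeError(name)


def free(i):
    return i + 1


def method(self, i):
    return get_attr(self, "n") + i


module = Obj(None, {"Free": free})
methods = Obj(None, {"method": method})
obj = Obj(methods, {"n": 10})

print(get_attr(module, "Free")(42))  # 43: rule 1, nothing is bound
print(get_attr(obj, "method")(42))   # 52: rule 2, obj is passed as self
```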
|
| 597 |
|
| 598 | ### ysh-slice
|
| 599 |
|
| 600 | Slicing gives you a subsequence of a `Str` or `List`, as in Python.
|
| 601 |
|
| 602 | Negative indices are relative to the end.
|
| 603 |
|
| 604 | String example:
|
| 605 |
|
| 606 | var s = 'spam eggs'
|
| 607 | = s[1:-1] # => (Str) "pam egg"
|
| 608 |
|
| 609 | echo "x $[s[2:]]" # => x am eggs
|
| 610 |
|
| 611 | List example:
|
| 612 |
|
| 613 | var foods = ['ale', 'bean', 'corn']
|
| 614 | = foods[-2:] # => (List) ["bean","corn"]
|
| 615 |
|
| 616 | write -- @[foods[:2]]
|
| 617 | # => ale
|
| 618 | # => bean
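
Since the slice syntax follows Python, the same expressions can be tried in Python for comparison. The strings here are ASCII-only, so byte positions and character positions coincide.

```python
s = "spam eggs"
print(s[1:-1])  # pam egg
print(s[2:])    # am eggs

foods = ["ale", "bean", "corn"]
print(foods[-2:])  # ['bean', 'corn']
print(foods[:2])   # ['ale', 'bean']
```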
|
| 619 |
|
| 620 | ### ysh-func-call
|
| 621 |
|
| 622 | Function calls are expressions:
|
| 623 |
|
| 624 | ```raw
|
| 625 | = f(x)
|
| 626 | echo $[f(x)] # expression sub
|
| 627 | ```
|
| 628 |
|
| 629 | Let's use this definition for examples:
|
| 630 |
|
| 631 | func myFunc(...args; ...kwargs) {
|
| 632 | return (args ++ keys(kwargs))
|
| 633 | }
|
| 634 |
|
| 635 | A semicolon `;` can be used after positional args and before named args:
|
| 636 |
|
| 637 | = myFunc('s', 't'; named=42)
|
| 638 |
|
| 639 | It isn't always required, because this works too:
|
| 640 |
|
| 641 | = myFunc('s', 't', named=42)
|
| 642 |
|
| 643 | In these cases, the `;` is necessary:
|
| 644 |
|
| 645 | var args = [3, 4]
|
| 646 | var kwargs = {name: 42}
|
| 647 |
|
| 648 | = myFunc(...args; ...kwargs)
|
| 649 |
|
| 650 | = myFunc(42, 43; ...kwargs)
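
For comparison, here is a rough Python counterpart of `myFunc`. Python tells the two kinds of spread apart with `*` and `**`, whereas YSH writes both as `...`, which is presumably why the `;` is needed in the last two calls. The name `my_func` is hypothetical.

```python
def my_func(*args, **kwargs):
    """Rough counterpart of myFunc: positional args followed by the named-arg keys."""
    return list(args) + list(kwargs)


print(my_func("s", "t", named=42))  # ['s', 't', 'named']

args = [3, 4]
kwargs = {"name": 42}
print(my_func(*args, **kwargs))   # [3, 4, 'name']
print(my_func(42, 43, **kwargs))  # [42, 43, 'name']
```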
|
| 651 |
|
| 652 | ### thin-arrow
|
| 653 |
|
| 654 | The thin arrow is for mutating methods:
|
| 655 |
|
| 656 | var mylist = ['bar']
|
| 657 | call mylist->pop()
|
| 658 |
|
| 659 | var mydict = {name: 'foo'}
|
| 660 | call mydict->erase('name')
|
| 661 |
|
| 662 | On `Obj` instances, `obj->mymethod` looks up the prototype chain for a function
|
| 663 | named `M/mymethod`. The `M/` prefix signals mutation.
|
| 664 |
|
| 665 | Example:
|
| 666 |
|
| 667 | func inc(self, n) {
|
| 668 | setvar self.i += n
|
| 669 | }
|
| 670 | var Counter_methods = Object(null, {'M/inc': inc})
|
| 671 | var c = Object(Counter_methods, {i: 0})
|
| 672 |
|
| 673 | call c->inc(5)
|
| 674 | echo $[c.i] # => 5
|
| 675 |
|
| 676 | It does **not** look in the properties of an object.
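
The lookup rule for `->` on objects can be sketched in Python. The `Obj` class and `lookup_mutating` function are hypothetical names for this illustration. They model only what is stated above: the prototype chain is searched for `M/` plus the method name, and the object's own properties are skipped.

```python
class Obj:
    """Hypothetical stand-in for a YSH Obj: own properties plus a prototype."""

    def __init__(self, prototype, props):
        self.prototype = prototype
        self.props = props


def lookup_mutating(obj, name):
    """Model of `obj->name`: search the prototype chain for 'M/' + name."""
    key = "M/" + name
    proto = obj.prototype  # the object's own properties are not searched
    while proto is not None:
        if key in proto.props:
            func = proto.props[key]
            # Pass the object as the first argument (self).
            return lambda *args: func(obj, *args)
        proto = proto.prototype
    raise AttributeError(key)


def inc(self, n):
    self.props["i"] += n


counter_methods = Obj(None, {"M/inc": inc})
c = Obj(counter_methods, {"i": 0})

lookup_mutating(c, "inc")(5)
print(c.props["i"])  # 5
```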
|
| 677 |
|
| 678 | ### fat-arrow
|
| 679 |
|
| 680 | The fat arrow is for function chaining:
|
| 681 |
|
| 682 | var x = myFunc() => list() => join()
|
| 683 |
|
| 684 | (Note: it also does method lookup like `s => startswith('y')`, but the `.`
|
| 685 | operator is usually preferred.)
|
| 686 |
|
| 687 | ### match-ops
|
| 688 |
|
| 689 | YSH has four pattern matching operators: `~ !~ ~~ !~~`.
|
| 690 |
|
| 691 | Does string match an **eggex**?
|
| 692 |
|
| 693 | var filename = 'x42.py'
|
| 694 | if (filename ~ / d+ /) {
|
| 695 | echo 'yes'
|
| 696 | } # => yes
|
| 697 |
|
| 698 | This performs a **search**. To change that, add `%start` or `%end` anchors to
|
| 699 | the pattern:
|
| 700 |
|
| 701 | if (filename ~ / %start d+ %end /) {
|
| 702 | echo 'yes'
|
| 703 | } # nothing printed
|
| 704 |
|
| 705 | ---
|
| 706 |
|
| 707 | Does a string match a POSIX regular expression (ERE syntax)?
|
| 708 |
|
| 709 | if (filename ~ '[[:digit:]]+') {
|
| 710 | echo 'number'
|
| 711 | }
|
| 712 |
|
| 713 | This is also a search, which can be anchored with `^` and `$`.
|
| 714 |
|
| 715 | ---
|
| 716 |
|
| 717 | Negate the result with the `!~` operator:
|
| 718 |
|
| 719 | if (filename !~ /space/ ) {
|
| 720 | echo 'no space'
|
| 721 | }
|
| 722 |
|
| 723 | if (filename !~ '[[:space:]]' ) {
|
| 724 | echo 'no space'
|
| 725 | }
|
| 726 |
|
| 727 | ---
|
| 728 |
|
| 729 | Does a string match a **glob**?
|
| 730 |
|
| 731 | if (filename ~~ '*.py') {
|
| 732 | echo 'Python'
|
| 733 | } # => Python
|
| 734 |
|
| 735 | if (filename !~~ '*.py') { # negation
|
| 736 | echo 'not Python'
|
| 737 | } # nothing printed
|
| 738 |
|
| 739 | Take care not to confuse glob patterns and regular expressions.
|
| 740 |
|
| 741 | For example, globs don't have `%start %end` or `^ $`. They are always
|
| 742 | "anchored".
|
| 743 |
|
| 744 | - Related doc: [YSH Regex API](../ysh-regex-api.html)
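
The difference between searching, anchoring, and glob matching can be seen with Python's standard library. This is an analogy, not YSH code: Python's `re` module doesn't support POSIX classes like `[[:digit:]]`, so the sketch writes `[0-9]` instead, and it uses `fnmatchcase` as a stand-in for `~~`.

```python
import re
from fnmatch import fnmatchcase

filename = "x42.py"

# Unanchored: the pattern may match anywhere in the string (a search).
searched = re.search(r"[0-9]+", filename)

# Anchored at both ends: the whole string must consist of digits.
anchored = re.search(r"^[0-9]+$", filename)

# Globs always apply to the whole string.
glob_full = fnmatchcase(filename, "*.py")
glob_partial = fnmatchcase(filename, "*.p")

print(searched is not None)  # True
print(anchored is not None)  # False
print(glob_full)             # True
print(glob_partial)          # False
```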
|
| 745 |
|
| 746 | ## Eggex
|
| 747 |
|
| 748 | ### re-literal
|
| 749 |
|
| 750 | An eggex literal looks like this:
|
| 751 |
|
| 752 | ```raw
|
| 753 | / expression ; flags ; translation preference /
|
| 754 | ```
|
| 755 |
|
| 756 | The flags and translation preference are both optional.
|
| 757 |
|
| 758 | Examples:
|
| 759 |
|
| 760 | var pat = / d+ / # => [[:digit:]]+
|
| 761 |
|
| 762 | You can specify flags passed to libc `regcomp()`:
|
| 763 |
|
| 764 | var pat = / d+ ; reg_icase reg_newline /
|
| 765 |
|
| 766 | You can specify a translation preference after a second semi-colon:
|
| 767 |
|
| 768 | var pat = / d+ ; ; ERE /
|
| 769 |
|
| 770 | Right now the translation preference does nothing. It could be used to
|
| 771 | translate eggex to PCRE or Python syntax.
|
| 772 |
|
| 773 | - Related doc: [Egg Expressions](../eggex.html)
|
| 774 |
|
| 775 | ### re-primitive
|
| 776 |
|
| 777 | There are two kinds of eggex primitives.
|
| 778 |
|
| 779 | "Zero-width assertions" match a position rather than a character:
|
| 780 |
|
| 781 | ```raw
|
| 782 | %start # translates to ^
|
| 783 | %end # translates to $
|
| 784 | ```
|
| 785 |
|
| 786 | Literal characters appear within **single** quotes:
|
| 787 |
|
| 788 | ```raw
|
| 789 | 'oh *really*' # translates to regex-escaped string
|
| 790 | ```
|
| 791 |
|
| 792 | Double-quoted strings are **not** eggex primitives. Instead, you can use
|
| 793 | splicing of strings:
|
| 794 |
|
| 795 | var dq = "hi $name"
|
| 796 | var eggex = / @dq /
|
| 797 |
|
| 798 | ### class-literal
|
| 799 |
|
| 800 | An eggex *character class literal* specifies a **set** of code points. It's
|
| 801 | enclosed in brackets:
|
| 802 |
|
| 803 | var vowels = / [a e i o u] / # A set of 5 vowels
|
| 804 |
|
| 805 | A class literal can have individual code points:
|
| 806 |
|
| 807 | ```raw
|
| 808 | [ a e i o u '?' '*' '+' ]
|
| 809 | ```
|
| 810 |
|
| 811 | It can also have ranges of code points, denoted with a hyphen:
|
| 812 |
|
| 813 | ```raw
|
| 814 | [ a-f A-F 0-9 ]
|
| 815 | ```
|
| 816 |
|
| 817 | To reduce the number of quotes, you can write a set of characters as a string:
|
| 818 |
|
| 819 | ```raw
|
| 820 | [ 'xyz' ] # any of 3 chars, NOT a sequence of 3 chars
|
| 821 | ```
|
| 822 |
|
| 823 | You can also use backslash escapes:
|
| 824 |
|
| 825 | ```raw
|
| 826 | [ \\ \' \" \0 ]
|
| 827 | [ \y7F \u{3bc} ] # a byte and a code point
|
| 828 |
|
| 829 | [ \y01 - \y7F ] # range of bytes
|
| 830 | [ \u{1} - \u{7F} ] # range of code points
|
| 831 | ```
|
| 832 |
|
| 833 | The `@` operator lets you refer to string variables:
|
| 834 |
|
| 835 | ```raw
|
| 836 | var str_var = 'xyz'
|
| 837 | [ @str_var ]
|
| 838 | ```
|
| 839 |
|
| 840 | Negation always uses `!`
|
| 841 |
|
| 842 | ```raw
|
| 843 | ![ a-f A-F 'xyz' @str_var ]
|
| 844 | ```
|
| 845 |
|
| 846 | ### re-chars
|
| 847 |
|
| 848 | Oils usually invokes `libc` in UTF-8 mode. In this mode, the regex engine
|
| 849 | can't match bytes like `0xFF`; it can only match code points.
|
| 850 |
|
| 851 | var x = / [ \y7F \u{3bc} ] / # a byte and a code point
|
| 852 |
|
| 853 | Oils translates Eggex to POSIX extended regex (ERE) syntax. Here are some
|
| 854 | restrictions when translating bytes and code points to ERE:
|
| 855 |
|
| 856 | - The `NUL` byte `\y00` isn't allowed.
|
| 857 | - Its synonym, code point zero `\u{0}`, also isn't allowed.
|
| 858 | - Bytes `\y80` to `\yFF` aren't allowed, because they're outside the ASCII
|
| 859 | range.
|
| 860 |
|
| 861 | Reminders:
|
| 862 |
|
| 863 | - In the ASCII range, bytes and code points are the same
|
| 864 | - That is, `\y01` to `\y7F` are synonyms for `\u{1}` to `\u{7F}`.
|
| 865 | - Outside of the ASCII range, they are different, so Eggex disallows them.
|
| 866 | - For example, `\u{FF}` is a code point, and `\yFF` is a byte, but they are
|
| 867 | not the same.
|
| 868 |
|
| 869 | ### named-class
|
| 870 |
|
| 871 | Perl-like shortcuts for sets of characters:
|
| 872 |
|
| 873 | ```raw
|
| 874 | [ dot ] # => .
|
| 875 | [ digit ] # => [[:digit:]]
|
| 876 | [ space ] # => [[:space:]]
|
| 877 | [ word ] # => [[:alpha:]][[:digit:]]_
|
| 878 | ```
|
| 879 |
|
| 880 | Abbreviations:
|
| 881 |
|
| 882 | ```raw
|
| 883 | [ d s w ] # Same as [ digit space word ]
|
| 884 | ```
|
| 885 |
|
| 886 | Valid POSIX classes:
|
| 887 |
|
| 888 | ```raw
|
| 889 | alnum cntrl lower space
|
| 890 | alpha digit print upper
|
| 891 | blank graph punct xdigit
|
| 892 | ```
|
| 893 |
|
| 894 | Negated:
|
| 895 |
|
| 896 | ```raw
|
| 897 | !digit !space !word
|
| 898 | !d !s !w
|
| 899 | !alnum # etc.
|
| 900 | ```
|
| 901 |
|
| 902 | ### re-repeat
|
| 903 |
|
| 904 | Eggex repetition looks like POSIX syntax:
|
| 905 |
|
| 906 | ```raw
|
| 907 | / 'a'? / # zero or one
|
| 908 | / 'a'* / # zero or more
|
| 909 | / 'a'+ / # one or more
|
| 910 | ```
|
| 911 |
|
| 912 | Counted repetitions:
|
| 913 |
|
| 914 | ```raw
|
| 915 | / 'a'{3} / # exactly 3 repetitions
|
| 916 | / 'a'{2,4} / # between 2 and 4 repetitions
|
| 917 | ```
|
| 918 |
|
| 919 | ### re-compound
|
| 920 |
|
| 921 | Sequence expressions with a space:
|
| 922 |
|
| 923 | ```raw
|
| 924 | / word digit digit / # Matches 3 characters in sequence
|
| 925 | # Examples: a42, b51
|
| 926 | ```
|
| 927 |
|
| 928 | (Compare `/ [ word digit ] /`, which is a set matching 1 character.)
|
| 929 |
|
| 930 | Alternation with `|`:
|
| 931 |
|
| 932 | ```raw
|
| 933 | / word | digit / # Matches 'a' OR '9', for example
|
| 934 | ```
|
| 935 |
|
| 936 | Grouping with parentheses:
|
| 937 |
|
| 938 | ```raw
|
| 939 | / (word digit) | \\ / # Matches a9 or \
|
| 940 | ```
|
| 941 |
|
| 942 | ### re-capture
|
| 943 |
|
| 944 | To retrieve a substring of a string that matches an Eggex, use a "capture
|
| 945 | group" like `<capture ...>`.
|
| 946 |
|
| 947 | Here's an eggex with a **positional** capture:
|
| 948 |
|
| 949 | var pat = / 'hi ' <capture d+> / # access with _group(1)
|
| 950 | # or Match.group(1)
|
| 951 |
|
| 952 | Captures can be **named**:
|
| 953 |
|
| 954 | ```raw
|
| 955 | <capture d+ as month> # access with _group('month')
|
| 956 | # or Match.group('month')
|
| 957 | ```
|
| 958 |
|
| 959 | Captures can also have a type **conversion func**:
|
| 960 |
|
| 961 | ```raw
|
| 962 | <capture d+ : int> # _group(1) returns Int
|
| 963 |
|
| 964 | <capture d+ as month: int> # _group('month') returns Int
|
| 965 | ```
|
| 966 |
|
| 967 | Related docs and help topics:
|
| 968 |
|
| 969 | - [YSH Regex API](../ysh-regex-api.html)
|
| 970 | - [`_group()`](chap-builtin-func.html#_group)
|
| 971 | - [`Match.group()`](chap-type-method.html#group)
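
As a rough analogy, a named capture with a conversion func resembles a Python named group whose text is passed through a converter on access. The `converters` table and `group` helper are hypothetical and are not the YSH API. `[0-9]+` stands in for `d+`.

```python
import re

# Hypothetical counterpart of / 'hi ' <capture d+ as month: int> /
pattern = re.compile(r"hi (?P<month>[0-9]+)")
converters = {"month": int}  # capture name -> conversion function


def group(match, name):
    """Return a captured substring, converted if the capture has a converter."""
    raw = match.group(name)
    convert = converters.get(name)
    return convert(raw) if convert else raw


m = pattern.search("oh hi 12!")
print(group(m, "month"))  # 12
print(m.group(1))         # 12, but as the string '12'
```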
|
| 972 |
|
| 973 | ### re-splice
|
| 974 |
|
| 975 | To build an eggex out of smaller expressions, you can **splice** eggexes
|
| 976 | together:
|
| 977 |
|
| 978 | var D = / [0-9][0-9] /
|
| 979 | var time = / @D ':' @D / # [0-9][0-9]:[0-9][0-9]
|
| 980 |
|
| 981 | If the variable begins with a capital letter, you can omit `@`:
|
| 982 |
|
| 983 | var ip = / D ':' D /
|
| 984 |
|
| 985 | You can also splice a string:
|
| 986 |
|
| 987 | var greeting = 'hi'
|
| 988 | var pat = / @greeting ' world' / # hi world
|
| 989 |
|
| 990 | Splicing is **not** string concatenation; it works on eggex subtrees.
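
One way to see why the distinction matters is a toy model in Python. The node classes and the `to_ere` function below are hypothetical and are not Oils' translator. The sketch assumes that an alternation spliced into a sequence has to be grouped, which plain string concatenation would get wrong.

```python
from dataclasses import dataclass


@dataclass
class Lit:
    text: str


@dataclass
class Seq:
    parts: list


@dataclass
class Alt:
    choices: list


def to_ere(node, in_sequence=False):
    """Translate a toy pattern tree to ERE-like syntax (illustration only)."""
    if isinstance(node, Lit):
        # Escape characters that are special in a regex.
        return "".join("\\" + c if c in ".[]()*+?{}|^$\\" else c for c in node.text)
    if isinstance(node, Seq):
        return "".join(to_ere(p, in_sequence=True) for p in node.parts)
    if isinstance(node, Alt):
        body = "|".join(to_ere(c) for c in node.choices)
        # Assumption: an alternation inside a sequence needs parentheses.
        return f"({body})" if in_sequence else body
    raise TypeError(f"unknown node: {node!r}")


ab = Alt([Lit("a"), Lit("b")])  # like / 'a' | 'b' /
spliced = Seq([ab, Lit("c")])   # like / @ab 'c' /

print(to_ere(spliced))   # (a|b)c
print(to_ere(ab) + "c")  # a|bc, which is what naive concatenation produces
```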
|
| 991 |
|
| 992 | ### re-flags
|
| 993 |
|
| 994 | Valid ERE flags, which are passed to libc's `regcomp()`:
|
| 995 |
|
| 996 | - `reg_icase` aka `i` - ignore case
|
| 997 | - `reg_newline` - 4 matching changes related to newlines
|
| 998 |
|
| 999 | See `man regcomp`.
|
| 1000 |
|
| 1001 | ### re-multiline
|
| 1002 |
|
| 1003 | Multi-line eggexes aren't yet implemented. Splicing makes it less necessary:
|
| 1004 |
|
| 1005 | var Name = / <capture [a-z]+ as name> /
|
| 1006 | var Num = / <capture d+ as num> /
|
| 1007 | var Space = / <capture s+ as space> /
|
| 1008 |
|
| 1009 | # For variables named like CapWords, splicing @Name doesn't require @
|
| 1010 | var lexer = / Name | Num | Space /
|