---
title: YSH Expression Language (Oils Reference)
all_docs_url: ..
body_css_class: width40
default_highlighter: oils-sh
preserve_anchor_case: yes
---

<div class="doc-ref-header">

[Oils Reference](index.html) &mdash;
Chapter **YSH Expression Language**

</div>

This chapter describes the YSH expression language, which includes [Egg
Expressions]($xref:eggex).

<div id="dense-toc">
</div>

## Assignment

### assign

The `=` operator is used with assignment keywords:

    var x = 42
    setvar x = 43

    const y = 'k'

    setglobal z = 'g'

### aug-assign

The augmented assignment operators are:

    +=   -=   *=   /=   **=   //=   %=
    &=   |=   ^=   <<=   >>=

They are used with `setvar` and `setglobal`.  For example:

    setvar x += 2

is the same as:

    setvar x = x + 2

Likewise, these are the same:

    setglobal a[i] -= 1

    setglobal a[i] = a[i] - 1

## Literals

### atom-literal

YSH uses JavaScript-like spellings for these three "atoms":

    null           # type Null
    true   false   # type Bool

Note: to signify "no value", you may sometimes use an empty string `''`,
instead of `null`.

- Related: [Null][] type, [Bool][] type

[Null]: chap-type-method.html#Null
[Bool]: chap-type-method.html#Bool

### int-literal

There are several ways to write integers.  Examples:

    var decimal = 42
    var big = 42_000

    var hex = 0x0010_ffff

    var octal = 0o755

    var binary = 0b0001_0000

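As a quick illustration, the sketch below prints the literals above in
decimal.  It assumes nothing beyond the syntax shown in this section; the
variable names are the same illustrative ones.

```ysh
# Underscores are digit separators and don't change the value.
var big = 42_000
var hex = 0x0010_ffff
var octal = 0o755
var binary = 0b0001_0000

echo "$big $hex $octal $binary"   # => 42000 1114111 493 16
```
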
- Related: [Int][] type

[Int]: chap-type-method.html#Int

### float-literal

Floating point numbers look like C, Python, or JavaScript:

    var myfloat = 3.14

    var f2 = -1.5e-100

- Related: [Float][] type

[Float]: chap-type-method.html#Float

### char-literal

The expression language has 3 kinds of backslash escapes, denoting bytes or
UTF-8:

    var backslash = \\
    var quotes = \' ++ \"       # same as u'\'' ++ '"'

    var mu = \u{3bc}            # same as u'\u{3bc}'

    var nul = \y00              # same as b'\y00'

Notice that this is the same syntax that's available within quoted J8 strings.
That is, the expression `\\` denotes the same thing as `u'\\'`.

- Related: [Str][] type

[Str]: chap-type-method.html#Str

### ysh-string

YSH has single and double-quoted strings borrowed from Bourne shell, and
C-style strings borrowed from J8 Notation.

Double quoted strings respect `$` interpolation:

    var dq = "hello $world and $(hostname)"

You can add a `$` before the left quote to be explicit: `$"x is $x"` rather
than `"x is $x"`.

Single quoted strings may be raw:

    var s = r'line\n'      # raw string means \n is literal, NOT a newline

Or *J8 strings* with backslash escapes:

    var s = u'line\n \u{3bc}'        # unicode string means \n is a newline
    var s = b'line\n \u{3bc} \yff'   # same thing, but also allows bytes

Both `u''` and `b''` strings evaluate to the single `Str` type.  The difference
is that `b''` strings allow the `\yff` byte escape.

#### Notes

There's no way to express a single quote in raw strings.  Use one of the other
forms instead:

    var sq = "single quote: ' "
    var sq = u'single quote: \' '

Sometimes you can omit the `r`, e.g. where there are no backslashes and thus no
ambiguity:

    echo 'foo'
    echo r'foo'  # same thing

The `u''` and `b''` strings are called *J8 strings* because the syntax in YSH
**code** matches JSON-like **data**.

    var strU = u'mu = \u{3bc}'  # J8 string with escapes
    var strB = b'bytes \yff'    # J8 string that can express byte strings

More examples:

    var myRaw = r'[a-z]\n'      # raw strings can be used for regexes (not
                                # eggexes)

### triple-quoted

Triple-quoted string literals have leading whitespace stripped on each line.
They come in the same variants:

    var dq = """
        hello $world and $(hostname)
        no leading whitespace
        """

    var myRaw = r'''
        raw string
        no leading whitespace
        '''

    var strU = u'''
        string that happens to be unicode \u{3bc}
        no leading whitespace
        '''

    var strB = b'''
        string that happens to be bytes \u{3bc} \yff
        no leading whitespace
        '''

Again, you can omit the `r` prefix if there's no backslash, because it's not
ambiguous:

    var myRaw = '''
        raw string
        no leading whitespace
        '''

[Expr]: chap-type-method.html#Expr

### list-literal

Lists have a Python-like syntax:

    var mylist = ['one', 'two', [42, 43]]

And a shell-like syntax:

    var list2 = :| one two |

The shell-like syntax accepts the same syntax as a simple command:

    ls $mystr @ARGV *.py {foo,bar}@example.com

    # Rather than executing ls, evaluate words into a List
    var cmd = :| ls $mystr @ARGV *.py {foo,bar}@example.com |

- Related: [List][] type

[List]: chap-type-method.html#List

### dict-literal

Dicts look like JavaScript.

    var d = {
      key1: 'value',  # key can be unquoted if it looks like a var name
      'key2': 42,     # or quote it

      ['key2' ++ suffix]: 43,   # bracketed expression
    }

Omitting a value means that the corresponding key takes the value of a var of
the same name:

    ysh$ var x = 42
    ysh$ var y = 43

    ysh$ var d = {x, y}  # values omitted
    ysh$ = d
    (Dict)  {x: 42, y: 43}

- Related: [Dict][] type

[Dict]: chap-type-method.html#Dict

### range

A Range is a sequence of numbers that can be iterated over.  The `..<` operator
constructs half-open ranges.

    for i in (0 ..< 3) {
      echo $i
    }
    => 0
    => 1
    => 2

The `..=` operator constructs closed ranges:

    for i in (0 ..= 3) {
      echo $i
    }
    => 0
    => 1
    => 2
    => 3

- Related: [Range][] type

[Range]: chap-type-method.html#Range

### block-expr

In YSH expressions, we use `^()` to create a [Command][] object:

    var myblock = ^(echo $PWD; ls *.txt)

It's more common for [Command][] objects to be created with block arguments,
which are not expressions:

    cd /tmp {
      echo $PWD
      ls *.txt
    }

[Command]: chap-type-method.html#Command

### expr-literal

An expression literal is an object that holds an unevaluated expression:

    var myexpr = ^[1 + 2*3]

- Related: [Expr][] type

[Expr]: chap-type-method.html#Expr

### str-template

String templates use the same syntax as double-quoted strings:

    var mytemplate = ^"name = $name, age = $age"

Related topics:

- The type of a template is [Expr][].
- [Str.replace](chap-type-method.html#replace)
- [ysh-string](#ysh-string)

### expr-sub

Turn an expression into a string.

    $ var x = $[3 * 2]
    $ = x
    (Str)   '6'

This is the same as [Word Language > expr-sub](chap-word-lang.html#expr-sub).

### expr-splice

Turns each element of a List into a string.

    $ var mylist = [42, 43]
    $ var x = @[mylist]
    $ = x
    (List)  ['42', '43']

This is the same as [Word Language > expr-splice](chap-word-lang.html#expr-splice).

## Operators

### op-precedence

YSH operator precedence is identical to Python's operator precedence.

New operators:

- `++` has the same precedence as `+`
- `->` and `=>` have the same precedence as `.`

<!-- TODO: show grammar -->


<h3 id="concat">concat <code>++</code></h3>

The concatenation operator works on `Str` objects:

    ysh$ var s = 'hello'
    ysh$ var t = s ++ ' world'

    ysh$ = t
    (Str)   "hello world"

and `List` objects:

    ysh$ var L = ['one', 'two']
    ysh$ var M = L ++ ['three', '4']

    ysh$ = M
    (List)   ["one", "two", "three", "4"]

and `Dict` objects:

    var d = {a: 2, b: 3}
    var other = {a: 42}
    = d ++ other     # => (Dict) {a: 42, b:3}
    = d              # => (Dict) {a: 2, b:3}

String interpolation can be nicer than `++`:

    var t2 = "${s} world"  # same as t

Likewise, splicing lists can be nicer:

    var M2 = :| @L three 4 |  # same as M

### ysh-equals

YSH has strict equality:

    a === b       # Python-like, without type conversion
    a !== b       # negated

And type converting equality:

    '3' ~== 3     # True, type conversion

The `~==` operator expects a string as the left operand.

---

Note that:

- `3 === 3.0` is false because integers and floats are different types, and
  there is no type conversion.
- `3 ~== 3.0` is an error, because the left operand isn't a string.

You may want to use explicit `int()` and `float()` to convert numbers, and then
compare them.

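A minimal sketch of the three comparisons described above, using a string
and an integer.  The variable names `s` and `n` are illustrative only.

```ysh
var s = '3'
var n = 3

= s === n          # => (Bool) false, Str and Int are different types
= s ~== n          # => (Bool) true, the string operand is converted
= int(s) === n     # => (Bool) true, explicit conversion, then strict equality
```
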
---

Compare objects for identity with `is`:

    ysh$ var d = {}
    ysh$ var e = d

    ysh$ = d is d
    (Bool)   true

    ysh$ = d is {other: 'dict'}
    (Bool)   false

To negate `is`, use `is not` (like Python):

    ysh$ = d is not {other: 'dict'}
    (Bool)   true

### ysh-in

The `in` operator tests if a key is in a dictionary:

    var d = {k: 42}
    if ('k' in d) {
      echo yes
    }  # => yes

Unlike Python, `in` doesn't work on `Str` and `List` instances.  This is
because those operations take linear time rather than constant time (O(n)
rather than O(1)).

TODO: Use `includes() / contains()` methods instead.

### ysh-compare

The comparison operators apply to integers or floats:

    4 < 4     # => false
    4 <= 4    # => true

    5.0 > 5.0   # => false
    5.0 >= 5.0  # => true

Example in context:

    if (x < 0) {
      echo 'x is negative'
    }

### ysh-logical

The logical operators take boolean operands, and are spelled like Python:

    not
    and  or

Note that they are distinct from `!  &&  ||`, which are part of the [command
language](chap-cmd-lang.html).

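For instance, here is a small sketch that combines comparisons with `and`
and `not`.  The names `x`, `in_range`, and `outside` are made up for this
example.

```ysh
var x = 5

# Comparisons bind tighter than `and`, as in Python
var in_range = x > 0 and x <= 10
var outside = not in_range

if (in_range and not outside) {
  echo 'x is between 1 and 10'
}
```
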
### ysh-arith

YSH supports most of the arithmetic operators from Python.  Notably, `//` and
`%` differ from Python as [they round toward zero, not negative
infinity](https://www.oilshell.org/blog/2024/03/release-0.21.0.html#integers-dont-do-whatever-python-or-c-does).

Use `+ - *` for `Int` or `Float` addition, subtraction and multiplication.  If
any of the operands are `Float`s, then the output will also be a `Float`.

Use `/` and `//` for `Float` division and `Int` division, respectively.  `/`
will _always_ result in a `Float`, while `//` will _always_ result in an
`Int`.

    = 1 / 2  # => (Float) 0.5
    = 1 // 2 # => (Int) 0

Use `%` to compute the _remainder_ of integer division.  The left operand must
be an `Int` and the right a _positive_ `Int`.

    = 1 % 2  # => (Int) 1
    = -4 % 2 # => (Int) 0

Use `**` for exponentiation.  The left operand must be an `Int` and the right a
_positive_ `Int`.

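The sketch below applies the integer operators to a negative operand.  The
expected values follow from the round-toward-zero rule stated at the top of
this section; the variable names are illustrative.

```ysh
var q = -7 // 2    # rounds toward zero: -3 (Python's floor division gives -4)
var r = -7 % 2     # the remainder that goes with that quotient: -1
var p = 2 ** 10    # 1024

echo "$q $r $p"    # => -3 -1 1024
```
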
All arithmetic operators may coerce either of their operands from strings to a
number, provided those strings are formatted as numbers.

    = 10 + '1' # => (Int) 11

Operators like `+ - * /` will coerce strings to _either_ an `Int` or `Float`.
However, operators like `// ** %` and bit shifts will coerce strings _only_ to
an `Int`.

    = '1.14' + '2'  # => (Float) 3.14
    = '1.14' % '2'  # Type Error: Left operand is a Str

### ysh-unary

YSH has unary `+` and `-` operators:

    var x = '3.14'
    = +x  # => (Float) 3.14
    = -x  # => (Float) -3.14

Like binary `+` and `-`, these operators coerce `Str` values with decimal
digits to either an `Int` or `Float`.

### ysh-bitwise

Bitwise operators are like Python and C:

    ~              # unary complement

    &   |   ^      # binary and, or, xor

    >>  <<         # bit shift

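As an illustration, this sketch applies each operator to two small
integers.  The variable names are hypothetical, and the comments show the
bit patterns being combined.

```ysh
var a = 0b1100               # 12
var b = 0b1010               # 10

var both    = a & b          # 0b1000 = 8
var either  = a | b          # 0b1110 = 14
var differ  = a ^ b          # 0b0110 = 6
var flipped = ~a             # -13, as in Python and C (two's complement)
var left    = 1 << 4         # 16
var right   = a >> 2         # 3

echo "$both $either $differ $flipped $left $right"
# => 8 14 6 -13 16 3
```
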
### ysh-ternary

The ternary operator is borrowed from Python:

    var display = 'yes' if len(s) else 'empty'

### ysh-index

`Str` objects can be indexed by byte:

    ysh$ var mystr = 'cat'
    ysh$ = mystr[1]
    (Str)   'a'

    ysh$ = mystr[-1]  # index from the end
    (Str)   't'

`List` objects:

    ysh$ var mylist = [1, 2, 3]
    ysh$ = mylist[2]
    (Int)   3

`Dict` objects are indexed by string key:

    ysh$ var mydict = {'key': 42}
    ysh$ = mydict['key']
    (Int)   42

### ysh-attr

The `.` operator looks up values on either `Dict` or `Obj` instances.

On dicts, it looks for the value associated with a key.  That is, the
expression `mydict.key` is short for `mydict['key']` (like JavaScript, but
unlike Python.)

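A minimal sketch of the dict case, showing that both spellings reach the
same value.  The dict contents are made up for illustration.

```ysh
var mydict = {key: 42}

echo $[mydict.key]       # => 42
echo $[mydict['key']]    # => 42
```
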
---

On objects, the expression `obj.x` looks for attributes, with a special rule
for bound methods.  The rules are:

1. Search the properties of `obj` for a field named `x`.
   - If it exists, return the value literally.  (It can be of any type: `Func`, `Int`,
     `Str`, ...)
2. Search up the prototype chain for a field named `x`.
   - If it exists, and is **not** a `Func`, return the value literally.
   - If it **is** a `Func`, return a **bound method**, which is an (object,
     function) pair.

Later, when the bound method is called, the object is passed as the first
argument to the function (`self`), making it a method call.  This is how a
method has access to the object's properties.

Example of first rule:

    func Free(i) {
      return (i + 1)
    }
    var module = Object(null, {Free})
    echo $[module.Free(42)]  # => 43

Example of second rule:

    func method(self, i) {
      return (self.n + i)
    }
    var methods = Object(null, {method})
    var obj = Object(methods, {n: 10})
    echo $[obj.method(42)]  # => 52

### ysh-slice

Slicing gives you a subsequence of a `Str` or `List`, as in Python.

Negative indices are relative to the end.

String example:

    $ var s = 'spam eggs'
    $ pp (s[1:-1])
    (Str)   "pam egg"

    $ echo "x $[s[2:]]"
    x am eggs

List example:

    $ var foods = ['ale', 'bean', 'corn']
    $ pp (foods[-2:])
    (List)   ["bean","corn"]

    $ write -- @[foods[:2]]
    ale
    bean

### ysh-func-call

A function call expression looks like Python:

    ysh$ = f('s', 't', named=42)

A semicolon `;` can be used after positional args and before named args, but
isn't always required:

    ysh$ = f('s', 't'; named=42)

In these cases, the `;` is necessary:

    ysh$ = f(...args; ...kwargs)

    ysh$ = f(42, 43; ...kwargs)

### thin-arrow

The thin arrow is for mutating methods:

    var mylist = ['bar']
    call mylist->pop()

    var mydict = {name: 'foo'}
    call mydict->erase('name')

On `Obj` instances, `obj->mymethod` looks up the prototype chain for a function
named `M/mymethod`.  The `M/` prefix signals mutation.

Example:

    func inc(self, n) {
      setvar self.i += n
    }
    var Counter_methods = Object(null, {'M/inc': inc})
    var c = Object(Counter_methods, {i: 0})

    call c->inc(5)
    echo $[c.i]  # => 5

It does **not** look in the properties of an object.

### fat-arrow

The fat arrow is for function chaining:

    var x = myFunc() => list() => join()

(Note: it also does method lookup like `s => startswith('y')`, but the `.`
operator is usually preferred.)

### match-ops

YSH has four pattern matching operators: `~   !~   ~~   !~~`.

Does string match an **eggex**?

    var filename = 'x42.py'
    if (filename ~ / d+ /) {
      echo 'yes'
    }  # => yes

This performs a **search**.  To change that, add `%start` or `%end` anchors to
the pattern:

    if (filename ~ / %start d+ %end /) {
      echo 'yes'
    }  # nothing printed

---

Does a string match a POSIX regular expression (ERE syntax)?

    if (filename ~ '[[:digit:]]+') {
      echo 'number'
    }

This is also a search, which can be anchored with `^` and `$`.

---

Negate the result with the `!~` operator:

    if (filename !~ /space/ ) {
      echo 'no space'
    }

    if (filename !~ '[[:space:]]' ) {
      echo 'no space'
    }

---

Does a string match a **glob**?

    if (filename ~~ '*.py') {
      echo 'Python'
    }  # => Python

    if (filename !~~ '*.py') {  # negation
      echo 'not Python'
    }  # nothing printed

Take care not to confuse glob patterns and regular expressions.

For example, globs don't have `%start %end` or `^ $`.  They are always
"anchored".

- Related doc: [YSH Regex API](../ysh-regex-api.html)

## Eggex

### re-literal

An eggex literal looks like this:

    / expression ; flags ; translation preference /

The flags and translation preference are both optional.

Examples:

    var pat = / d+ /  # => [[:digit:]]+

You can specify flags passed to libc `regcomp()`:

    var pat = / d+ ; reg_icase reg_newline /

You can specify a translation preference after a second semi-colon:

    var pat = / d+ ; ; ERE /

Right now the translation preference does nothing.  It could be used to
translate eggex to PCRE or Python syntax.

- Related doc: [Egg Expressions](../eggex.html)

### re-primitive

There are two kinds of eggex primitives.

"Zero-width assertions" match a position rather than a character:

    %start           # translates to ^
    %end             # translates to $

Literal characters appear within **single** quotes:

    'oh *really*'    # translates to regex-escaped string

Double-quoted strings are **not** eggex primitives.  Instead, you can use
splicing of strings:

    var dq = "hi $name"
    var eggex = / @dq /


### class-literal

An eggex *character class literal* specifies a **set** of code points.  It's
enclosed in brackets:

    var vowels = / [a e i o u] /    # A set of 5 vowels

A class literal can have individual code points:

    [ a e i o u '?' '*' '+' ]

It can also have ranges of code points, denoted with a hyphen:

    [ a-f A-F 0-9 ]

To reduce the number of quotes, you can write a set of characters as a string:

    [ 'xyz' ]              # any of 3 chars, NOT a sequence of 3 chars

You can also use backslash escapes:

    [ \\ \' \" \0 ]
    [ \y7F \u{3bc} ]       # a byte and a code point

    [ \y01 - \y7F ]        # range of bytes
    [ \u{1} - \u{7F} ]     # range of code points

The `@` operator lets you refer to string variables:

    var str_var = 'xyz'
    [ @str_var ]

Negation always uses `!`

    ![ a-f A-F 'xyz' @str_var ]

### re-chars

Oils usually invokes `libc` in UTF-8 mode.  In this mode, the regex engine
can't match bytes like `0xFF`; it can only match code points.

    var x = / [ \y7F \u{3bc} ] /       # a byte and a code point

Oils translates Eggex to POSIX extended regex (ERE) syntax.  Here are some
restrictions when translating bytes and code points to ERE:

- The `NUL` byte `\y00` isn't allowed.
  - Its synonym, code point zero `\u{0}`, also isn't allowed.
- Bytes `\y80` to `\yFF` aren't allowed, because they're outside the ASCII
  range.

Reminders:

- In the ASCII range, bytes and code points are the same
  - That is, `\y01` to `\y7F` are synonyms for `\u{1}` to `\u{7F}`.
- Outside of the ASCII range, they are different, so Eggex disallows them.
  - For example, `\u{FF}` is a code point, and `\yFF` is a byte, but they are
    not the same.

### named-class

Perl-like shortcuts for sets of characters:

    [ dot ]    # => .
    [ digit ]  # => [[:digit:]]
    [ space ]  # => [[:space:]]
    [ word ]   # => [[:alpha:]][[:digit:]]_

Abbreviations:

    [ d s w ]  # Same as [ digit space word ]

Valid POSIX classes:

    alnum   cntrl   lower   space
    alpha   digit   print   upper
    blank   graph   punct   xdigit

Negated:

    !digit   !space   !word
    !d   !s   !w
    !alnum  # etc.

### re-repeat

Eggex repetition looks like POSIX syntax:

    / 'a'? /      # zero or one
    / 'a'* /      # zero or more
    / 'a'+ /      # one or more

Counted repetitions:

    / 'a'{3} /    # exactly 3 repetitions
    / 'a'{2,4} /  # between 2 and 4 repetitions

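Here is a sketch of a counted repetition in use.  It anchors the pattern
with `%start` and `%end` so that the whole string must consist of 2 to 4
`a` characters; the test strings are made up.

```ysh
var pat = / %start 'a'{2,4} %end /

for s in a aa aaaa aaaaa {
  if (s ~ pat) {
    echo "$s matches"
  } else {
    echo "$s does not match"
  }
}
# Only 'aa' and 'aaaa' match
```

Without the anchors, `~` performs a search, so `aaaaa` would also match.
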
### re-compound

Sequence expressions with a space:

    / word digit digit /   # Matches 3 characters in sequence
                           # Examples: a42, b51

(Compare `/ [ word digit ] /`, which is a set matching 1 character.)

Alternation with `|`:

    / word | digit /       # Matches 'a' OR '9', for example

Grouping with parentheses:

    / (word digit) | \\ /  # Matches a9 or \

### re-capture

To retrieve a substring of a string that matches an Eggex, use a "capture
group" like `<capture ...>`.

Here's an eggex with a **positional** capture:

    var pat = / 'hi ' <capture d+> /  # access with _group(1)
                                      # or Match.group(1)

Captures can be **named**:

    <capture d+ as month>       # access with _group('month')
                                # or Match.group('month')

Captures can also have a type **conversion func**:

    <capture d+ : int>          # _group(1) returns Int

    <capture d+ as month: int>  # _group('month') returns Int

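Putting these together, the sketch below matches a string and then reads
the capture by position and by name.  The input string and the group name
`num` are illustrative.

```ysh
var s = 'hi 42'

if (s ~ / 'hi ' <capture d+ as num> /) {
  echo $[_group(1)]        # => 42
  echo $[_group('num')]    # => 42
}

# With a conversion func, the group comes back as an Int
if (s ~ / 'hi ' <capture d+ as num: int> /) {
  echo $[_group('num') + 1]   # => 43
}
```
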
Related docs and help topics:

- [YSH Regex API](../ysh-regex-api.html)
- [`_group()`](chap-builtin-func.html#_group)
- [`Match.group()`](chap-type-method.html#group)

### re-splice

To build an eggex out of smaller expressions, you can **splice** eggexes
together:

    var D = / [0-9][0-9] /
    var time = / @D ':' @D /  # [0-9][0-9]:[0-9][0-9]

If the variable begins with a capital letter, you can omit `@`:

    var ip = / D ':' D /

You can also splice a string:

    var greeting = 'hi'
    var pat = / @greeting ' world' /  # hi world

Splicing is **not** string concatenation; it works on eggex subtrees.

### re-flags

Valid ERE flags, which are passed to libc's `regcomp()`:

- `reg_icase` aka `i` - ignore case
- `reg_newline` - 4 matching changes related to newlines

See `man regcomp`.

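A brief sketch of the `i` flag, assuming the short alias listed above.  The
pattern and input are made up for illustration.

```ysh
var pat = / 'hello' ; i /

if ('HELLO world' ~ pat) {
  echo 'matched, ignoring case'
}
```
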
### re-multiline

Multi-line eggexes aren't yet implemented.  Splicing makes it less necessary:

    var Name  = / <capture [a-z]+ as name> /
    var Num   = / <capture d+ as num> /
    var Space = / <capture s+ as space> /

    # For variables named like CapWords, splicing @Name doesn't require @
    var lexer = / Name | Num | Space /