---
title: YSH Expression Language (Oils Reference)
all_docs_url: ..
body_css_class: width40
default_highlighter: oils-sh
preserve_anchor_case: yes
---

<div class="doc-ref-header">

[Oils Reference](index.html) &mdash;
Chapter **YSH Expression Language**

</div>

This chapter describes the YSH expression language, which includes [Egg
Expressions]($xref:eggex).

<div id="dense-toc">
</div>

## Assignment

### assign

The `=` operator is used with assignment keywords:

    var x = 42
    setvar x = 43

    const y = 'k'

    setglobal z = 'g'

### aug-assign

The augmented assignment operators are:

    += -= *= /= **= //= %=
    &= |= ^= <<= >>=

They are used with `setvar` and `setglobal`. For example:

    setvar x += 2

is the same as:

    setvar x = x + 2

Likewise, these are the same:

    setglobal a[i] -= 1

    setglobal a[i] = a[i] - 1

## Literals

### atom-literal

YSH uses JavaScript-like spellings for these three "atoms":

    null           # type Null
    true   false   # type Bool

Note: to signify "no value", you may sometimes use an empty string `''`,
instead of `null`.

- Related: [Null][] type, [Bool][] type

[Null]: chap-type-method.html#Null
[Bool]: chap-type-method.html#Bool

### int-literal

There are several ways to write integers. Examples:

    var decimal = 42
    var big = 42_000

    var hex = 0x0010_ffff

    var octal = 0o755

    var binary = 0b0001_0000

- Related: [Int][] type

[Int]: chap-type-method.html#Int

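To check what these literals denote, note that Python happens to accept the same spellings (underscore separators and the `0x`, `0o`, `0b` prefixes). The following is a sketch in Python, not YSH; the variable names only mirror the example above.

```python
# Python sketch: the same integer spellings, evaluated in Python.
decimal = 42
big = 42_000
hex_value = 0x0010_ffff
octal = 0o755
binary = 0b0001_0000

print(decimal, big, hex_value, octal, binary)
# prints: 42 42000 1114111 493 16
```
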
### float-literal

Floating point numbers look like C, Python, or JavaScript:

    var myfloat = 3.14

    var f2 = -1.5e-100

- Related: [Float][] type

[Float]: chap-type-method.html#Float

### char-literal

The expression language has 3 kinds of backslash escapes, denoting bytes or
UTF-8:

    var backslash = \\
    var quotes = \' ++ \"       # same as u'\'' ++ '"'

    var mu = \u{3bc}            # same as u'\u{3bc}'

    var nul = \y00              # same as b'\y00'

Notice that this is the same syntax that's available within quoted J8 strings.
That is, the expression `\\` denotes the same thing as `u'\\'`.

- Related: [Str][] type

[Str]: chap-type-method.html#Str

### ysh-string

YSH has single and double-quoted strings borrowed from Bourne shell, and
C-style strings borrowed from J8 Notation.

Double quoted strings respect `$` interpolation:

    var dq = "hello $world and $(hostname)"

You can add a `$` before the left quote to be explicit: `$"x is $x"` rather
than `"x is $x"`.

Single quoted strings may be raw:

    var s = r'line\n'      # raw string means \n is literal, NOT a newline

Or *J8 strings* with backslash escapes:

    var s = u'line\n \u{3bc}'        # unicode string means \n is a newline
    var s = b'line\n \u{3bc} \yff'   # same thing, but also allows bytes

Both `u''` and `b''` strings evaluate to the single `Str` type. The difference
is that `b''` strings allow the `\yff` byte escape.

#### Notes

There's no way to express a single quote in raw strings. Use one of the other
forms instead:

    var sq = "single quote: ' "
    var sq = u'single quote: \' '

Sometimes you can omit the `r`, e.g. where there are no backslashes and thus no
ambiguity:

    echo 'foo'
    echo r'foo'  # same thing

The `u''` and `b''` strings are called *J8 strings* because the syntax in YSH
**code** matches JSON-like **data**.

    var strU = u'mu = \u{3bc}'  # J8 string with escapes
    var strB = b'bytes \yff'    # J8 string that can express byte strings

More examples:

    var myRaw = r'[a-z]\n'      # raw strings can be used for regexes (not
                                # eggexes)

### triple-quoted

Triple-quoted string literals have leading whitespace stripped on each line.
They come in the same variants:

    var dq = """
        hello $world and $(hostname)
        no leading whitespace
        """

    var myRaw = r'''
        raw string
        no leading whitespace
        '''

    var strU = u'''
        string that happens to be unicode \u{3bc}
        no leading whitespace
        '''

    var strB = b'''
        string that happens to be bytes \u{3bc} \yff
        no leading whitespace
        '''

Again, you can omit the `r` prefix if there's no backslash, because it's not
ambiguous:

    var myRaw = '''
        raw string
        no leading whitespace
        '''

[Expr]: chap-type-method.html#Expr

### list-literal

Lists have a Python-like syntax:

    var mylist = ['one', 'two', [42, 43]]

And a shell-like syntax:

    var list2 = :| one two |

The shell-like syntax accepts the same syntax as a simple command:

    ls $mystr @ARGV *.py {foo,bar}@example.com

    # Rather than executing ls, evaluate words into a List
    var cmd = :| ls $mystr @ARGV *.py {foo,bar}@example.com |

- Related: [List][] type

[List]: chap-type-method.html#List

### dict-literal

Dicts look like JavaScript.

    var d = {
      key1: 'value',  # key can be unquoted if it looks like a var name
      'key2': 42,     # or quote it

      ['key2' ++ suffix]: 43,   # bracketed expression
    }

Omitting a value means that the corresponding key takes the value of a var of
the same name:

    ysh$ var x = 42
    ysh$ var y = 43

    ysh$ var d = {x, y}  # values omitted
    ysh$ = d
    (Dict)  {x: 42, y: 43}

- Related: [Dict][] type

[Dict]: chap-type-method.html#Dict

### range

A Range is a sequence of numbers that can be iterated over. The `..<` operator
constructs half-open ranges.

    for i in (0 ..< 3) {
      echo $i
    }
    => 0
    => 1
    => 2

The `..=` operator constructs closed ranges:

    for i in (0 ..= 3) {
      echo $i
    }
    => 0
    => 1
    => 2
    => 3

- Related: [Range][] type

[Range]: chap-type-method.html#Range

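For comparison, Python's built-in `range()` is half-open like `..<`, so a closed range needs the end bumped by one. The sketch below models both operators in Python; the helper names `half_open` and `closed` are hypothetical and exist only for this illustration.

```python
def half_open(start, end):
    """Models `start ..< end`: includes start, excludes end."""
    return list(range(start, end))

def closed(start, end):
    """Models `start ..= end`: includes both endpoints."""
    return list(range(start, end + 1))

print(half_open(0, 3))  # [0, 1, 2]
print(closed(0, 3))     # [0, 1, 2, 3]
```
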
### block-expr

In YSH expressions, we use `^()` to create a [Command][] object:

    var myblock = ^(echo $PWD; ls *.txt)

It's more common for [Command][] objects to be created with block arguments,
which are not expressions:

    cd /tmp {
      echo $PWD
      ls *.txt
    }

[Command]: chap-type-method.html#Command

### expr-literal

An expression literal is an object that holds an unevaluated expression:

    var myexpr = ^[1 + 2*3]

- Related: [Expr][] type

[Expr]: chap-type-method.html#Expr

### str-template

String templates use the same syntax as double-quoted strings:

    var mytemplate = ^"name = $name, age = $age"

Related topics:

- The type of a template is [Expr][].
- [Str.replace](chap-type-method.html#replace)
- [ysh-string](#ysh-string)

## Operators

### op-precedence

YSH operator precedence is identical to Python's operator precedence.

New operators:

- `++` has the same precedence as `+`
- `->` and `=>` have the same precedence as `.`

<!-- TODO: show grammar -->


<h3 id="concat">concat <code>++</code></h3>

The concatenation operator works on `Str` objects:

    ysh$ var s = 'hello'
    ysh$ var t = s ++ ' world'

    ysh$ = t
    (Str)   "hello world"

and `List` objects:

    ysh$ var L = ['one', 'two']
    ysh$ var M = L ++ ['three', '4']

    ysh$ = M
    (List)   ["one", "two", "three", "4"]

String interpolation can be nicer than `++`:

    var t2 = "${s} world"  # same as t

Likewise, splicing lists can be nicer:

    var M2 = :| @L three 4 |  # same as M

### ysh-equals

YSH has strict equality:

    a === b       # Python-like, without type conversion
    a !== b       # negated

And type converting equality:

    '3' ~== 3     # True, type conversion

The `~==` operator expects a string as the left operand.

---

Note that:

- `3 === 3.0` is false because integers and floats are different types, and
  there is no type conversion.
- `3 ~== 3.0` is an error, because the left operand isn't a string.

You may want to use explicit `int()` and `float()` to convert numbers, and then
compare them.

---

Compare objects for identity with `is`:

    ysh$ var d = {}
    ysh$ var e = d

    ysh$ = d is d
    (Bool)   true

    ysh$ = d is {other: 'dict'}
    (Bool)   false

To negate `is`, use `is not` (like Python):

    ysh$ = d is not {other: 'dict'}
    (Bool)   true

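The difference between the two equality operators can be modeled roughly in Python. This is only a sketch of the rules stated above, not the YSH implementation: `strict_equal` and `convert_equal` are hypothetical names, the sketch handles only an integer right operand, and treating an unparseable string as unequal is an assumption.

```python
def strict_equal(a, b):
    """Models `a === b`: values are equal only if their types match too."""
    return type(a) is type(b) and a == b

def convert_equal(left, right):
    """Models `left ~== right` for a Str on the left and an Int on the right."""
    if not isinstance(left, str):
        raise TypeError('~== expects a Str as the left operand')
    try:
        return int(left) == right  # assumption: the string is parsed as an integer
    except ValueError:
        return False               # assumption: unparseable strings compare unequal

print(strict_equal(3, 3.0))    # False: Int and Float are different types
print(convert_equal('3', 3))   # True: the string is converted first
```

Calling `convert_equal(3, 3.0)` raises `TypeError`, mirroring the rule that `3 ~== 3.0` is an error.
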
### ysh-in

The `in` operator tests if a key is in a dictionary:

    var d = {k: 42}
    if ('k' in d) {
      echo yes
    }  # => yes

Unlike Python, `in` doesn't work on `Str` and `List` instances. This is because
those operations take linear time rather than constant time (O(n) rather than
O(1)).

TODO: Use `includes() / contains()` methods instead.

### ysh-compare

The comparison operators apply to integers or floats:

    4 < 4     # => false
    4 <= 4    # => true

    5.0 > 5.0   # => false
    5.0 >= 5.0  # => true

Example in context:

    if (x < 0) {
      echo 'x is negative'
    }

### ysh-logical

The logical operators take boolean operands, and are spelled like Python:

    not
    and  or

Note that they are distinct from `! && ||`, which are part of the [command
language](chap-cmd-lang.html).

### ysh-arith

YSH supports most of the arithmetic operators from Python. Notably, `//` and `%`
differ from Python as [they round toward zero, not negative
infinity](https://www.oilshell.org/blog/2024/03/release-0.21.0.html#integers-dont-do-whatever-python-or-c-does).

Use `+ - *` for `Int` or `Float` addition, subtraction and multiplication. If
any of the operands are `Float`s, then the output will also be a `Float`.

Use `/` and `//` for `Float` division and `Int` division, respectively. `/`
will _always_ result in a `Float`, meanwhile `//` will _always_ result in an
`Int`.

    = 1 / 2  # => (Float) 0.5
    = 1 // 2 # => (Int) 0

Use `%` to compute the _remainder_ of integer division. The left operand must
be an `Int` and the right a _positive_ `Int`.

    = 1 % 2  # => (Int) 1
    = -4 % 2 # => (Int) 0

Use `**` for exponentiation. The left operand must be an `Int` and the right a
_positive_ `Int`.

All arithmetic operators may coerce either of their operands from strings to a
number, provided those strings are formatted as numbers.

    = 10 + '1' # => (Int) 11

Operators like `+ - * /` will coerce strings to _either_ an `Int` or `Float`.
However, operators like `// ** %` and bit shifts will coerce strings _only_ to
an `Int`.

    = '1.14' + '2'  # => (Float) 3.14
    = '1.14' % '2'  # Type Error: Left operand is a Str

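The rounding difference from Python only shows up when an operand of integer division or remainder is negative. The Python sketch below contrasts Python's own floor-based operators with rounding toward zero; `trunc_div` and `trunc_rem` are hypothetical helpers that model the rounding rule described above, not the YSH implementation.

```python
def trunc_div(a, b):
    """Integer division that rounds toward zero."""
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q

def trunc_rem(a, b):
    """Remainder consistent with trunc_div: a == b * trunc_div(a, b) + trunc_rem(a, b)."""
    return a - b * trunc_div(a, b)

# Python's // and % round toward negative infinity.
print(-7 // 2, trunc_div(-7, 2))  # -4 -3
print(-7 % 2, trunc_rem(-7, 2))   # 1 -1
```
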
### ysh-bitwise

Bitwise operators are like Python and C:

    ~        # unary complement

    & | ^    # binary and, or, xor

    >> <<    # bit shift

### ysh-ternary

The ternary operator is borrowed from Python:

    var display = 'yes' if len(s) else 'empty'

### ysh-index

`Str` objects can be indexed by byte:

    ysh$ var mystr = 'cat'
    ysh$ = mystr[1]
    (Str)   'a'

    ysh$ = mystr[-1]  # index from the end
    (Str)   't'

`List` objects are indexed by integer:

    ysh$ var mylist = [1, 2, 3]
    ysh$ = mylist[2]
    (Int)   3

`Dict` objects are indexed by string key:

    ysh$ var mydict = {'key': 42}
    ysh$ = mydict['key']
    (Int)   42

### ysh-attr

The `.` operator looks up values on either `Dict` or `Obj` instances.

On dicts, it looks for the value associated with a key. That is, the
expression `mydict.key` is short for `mydict['key']` (like JavaScript, but
unlike Python).

---

On objects, the expression `obj.x` looks for attributes, with a special rule
for bound methods. The rules are:

1. Search the properties of `obj` for a field named `x`.
   - If it exists, return the value literally. (It can be of any type: `Func`, `Int`,
     `Str`, ...)
2. Search up the prototype chain for a field named `x`.
   - If it exists, and is **not** a `Func`, return the value literally.
   - If it **is** a `Func`, return a **bound method**, which is an (object,
     function) pair.

Later, when the bound method is called, the object is passed as the first
argument to the function (`self`), making it a method call. This is how a
method has access to the object's properties.

Example of first rule:

    func Free(i) {
      return (i + 1)
    }
    var module = Object(null, {Free})
    echo $[module.Free(42)]  # => 43

Example of second rule:

    func method(self, i) {
      return (self.n + i)
    }
    var methods = Object(null, {method})
    var obj = Object(methods, {n: 10})
    echo $[obj.method(42)]  # => 52

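The two lookup rules can be modeled in a few lines of Python. This is a sketch under stated assumptions, not how YSH is implemented: the `Obj` class and `get_attr` function are hypothetical stand-ins, and `functools.partial` plays the role of the (object, function) pair.

```python
import functools

class Obj:
    """Hypothetical stand-in for a YSH Obj: own properties plus a prototype."""
    def __init__(self, proto, props):
        self.proto = proto
        self.props = props

def get_attr(obj, name):
    # Rule 1: own properties are returned literally, even functions.
    if name in obj.props:
        return obj.props[name]
    # Rule 2: walk up the prototype chain.
    proto = obj.proto
    while proto is not None:
        if name in proto.props:
            value = proto.props[name]
            if callable(value):
                # Bound method: the original object becomes the first argument.
                return functools.partial(value, obj)
            return value
        proto = proto.proto
    raise AttributeError(name)

def free(i):
    return i + 1

def method(self, i):
    return get_attr(self, 'n') + i

module = Obj(None, {'Free': free})
print(get_attr(module, 'Free')(42))   # 43, no self is passed

methods = Obj(None, {'method': method})
obj = Obj(methods, {'n': 10})
print(get_attr(obj, 'method')(42))    # 52, obj is passed as self
```
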
### ysh-slice

Slicing gives you a subsequence of a `Str` or `List`, as in Python.

Negative indices are relative to the end.

String example:

    $ var s = 'spam eggs'
    $ pp (s[1:-1])
    (Str)   "pam egg"

    $ echo "x $[s[2:]]"
    x am eggs

List example:

    $ var foods = ['ale', 'bean', 'corn']
    $ pp (foods[-2:])
    (List)   ["bean","corn"]

    $ write -- @[foods[:2]]
    ale
    bean

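Since the slice syntax follows Python, the examples above can be cross-checked there. The sketch below is Python, not YSH, and sticks to ASCII strings; Python indexes strings by code point, so results for non-ASCII text may not carry over.

```python
s = 'spam eggs'
print(s[1:-1])     # pam egg
print(s[2:])       # am eggs

foods = ['ale', 'bean', 'corn']
print(foods[-2:])  # ['bean', 'corn']
print(foods[:2])   # ['ale', 'bean']
```
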
### ysh-func-call

A function call expression looks like Python:

    ysh$ = f('s', 't', named=42)

A semicolon `;` can be used after positional args and before named args, but
isn't always required:

    ysh$ = f('s', 't'; named=42)

In these cases, the `;` is necessary:

    ysh$ = f(...args; ...kwargs)

    ysh$ = f(42, 43; ...kwargs)

### thin-arrow

The thin arrow is for mutating methods:

    var mylist = ['bar']
    call mylist->pop()

    var mydict = {name: 'foo'}
    call mydict->erase('name')

On `Obj` instances, `obj->mymethod` looks up the prototype chain for a function
named `M/mymethod`. The `M/` prefix signals mutation.

Example:

    func inc(self, n) {
      setvar self.i += n
    }
    var Counter_methods = Object(null, {'M/inc': inc})
    var c = Object(Counter_methods, {i: 0})

    call c->inc(5)
    echo $[c.i]  # => 5

Note that `->` does **not** look in the properties of an object.

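The `M/` lookup on `Obj` instances can be sketched in Python as follows. This is a model of the rule described above, not the YSH implementation; `Obj` and `get_mutating` are hypothetical names.

```python
class Obj:
    """Hypothetical stand-in for a YSH Obj: own properties plus a prototype."""
    def __init__(self, proto, props):
        self.proto = proto
        self.props = props

def get_mutating(obj, name):
    """Models obj->name: search only the prototype chain for 'M/' + name."""
    key = 'M/' + name
    proto = obj.proto  # the object's own properties are skipped
    while proto is not None:
        if key in proto.props:
            func = proto.props[key]
            return lambda *args: func(obj, *args)
        proto = proto.proto
    raise AttributeError(name)

def inc(self, n):
    self.props['i'] += n

counter_methods = Obj(None, {'M/inc': inc})
c = Obj(counter_methods, {'i': 0})

get_mutating(c, 'inc')(5)
print(c.props['i'])  # 5
```
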
### fat-arrow

The fat arrow is for transforming methods:

    if (s => startsWith('prefix')) {
      echo 'yes'
    }

If the method lookup on `s` fails, it looks for free functions. This means it
can be used for "chaining" transformations:

    var x = myFunc() => list() => join()

### match-ops

YSH has four pattern matching operators: `~ !~ ~~ !~~`.

Does a string match an **eggex**?

    var filename = 'x42.py'
    if (filename ~ / d+ /) {
      echo 'number'
    }

Does a string match a POSIX regular expression (ERE syntax)?

    if (filename ~ '[[:digit:]]+') {
      echo 'number'
    }

Negate the result with the `!~` operator:

    if (filename !~ /space/ ) {
      echo 'no space'
    }

    if (filename !~ '[[:space:]]' ) {
      echo 'no space'
    }

Does a string match a **glob**?

    if (filename ~~ '*.py') {
      echo 'Python'
    }

    if (filename !~~ '*.py') {
      echo 'not Python'
    }

Take care not to confuse glob patterns and regular expressions.

- Related doc: [YSH Regex API](../ysh-regex-api.html)

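To see why globs and regular expressions are easy to confuse, the Python sketch below applies the same pattern text both ways. It uses Python's `re` and `fnmatch` modules as stand-ins for `~` and `~~`; Python's `re` does not support POSIX classes such as `[[:digit:]]`, so `[0-9]` is used instead.

```python
import fnmatch
import re

filename = 'x42.py'

# Regex: the pattern may match anywhere in the string.
print(bool(re.search(r'[0-9]+', filename)))    # True

# Glob: the pattern must match the whole string.
print(fnmatch.fnmatchcase(filename, '*.py'))   # True

# The same text means different things in each pattern language.
pattern = 'x?.py'
print(bool(re.search(pattern, 'happy')))       # True: optional x, any char, then py
print(fnmatch.fnmatchcase('happy', pattern))   # False: needs x, one char, then .py
```
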
## Eggex

### re-literal

An eggex literal looks like this:

    / expression ; flags ; translation preference /

The flags and translation preference are both optional.

Examples:

    var pat = / d+ /  # => [[:digit:]]+

You can specify flags passed to libc `regcomp()`:

    var pat = / d+ ; reg_icase reg_newline /

You can specify a translation preference after a second semicolon:

    var pat = / d+ ; ; ERE /

Right now the translation preference does nothing. It could be used to
translate eggex to PCRE or Python syntax.

- Related doc: [Egg Expressions](../eggex.html)

### re-primitive

There are two kinds of eggex primitives.

"Zero-width assertions" match a position rather than a character:

    %start           # translates to ^
    %end             # translates to $

Literal characters appear within **single** quotes:

    'oh *really*'    # translates to regex-escaped string

Double-quoted strings are **not** eggex primitives. Instead, you can use
splicing of strings:

    var dq = "hi $name"
    var eggex = / @dq /

### class-literal

An eggex character class literal specifies a set. It can have individual
characters and ranges:

    [ 'x' 'y' 'z' a-f A-F 0-9 ]  # 3 chars, 3 ranges

Omit quotes on ASCII characters:

    [ x y z ]  # avoid typing 'x' 'y' 'z'

Sets of characters can be written as strings:

    [ 'xyz' ]  # any of 3 chars, not a sequence of 3 chars

Backslash escapes are respected:

    [ \\ \' \" \0 ]
    [ \xFF \u{3bc} ]

(Note that we don't use `\yFF`, as in J8 strings.)

Splicing:

    [ @str_var ]

Negation always uses `!`:

    ![ a-f A-F 'xyz' @str_var ]

### named-class

Perl-like shortcuts for sets of characters:

    [ dot ]    # => .
    [ digit ]  # => [[:digit:]]
    [ space ]  # => [[:space:]]
    [ word ]   # => [[:alpha:]][[:digit:]]_

Abbreviations:

    [ d s w ]  # Same as [ digit space word ]

Valid POSIX classes:

    alnum   cntrl   lower   space
    alpha   digit   print   upper
    blank   graph   punct   xdigit

Negated:

    !digit   !space   !word
    !d   !s   !w
    !alnum  # etc.

### re-repeat

Eggex repetition looks like POSIX syntax:

    / 'a'? /      # zero or one
    / 'a'* /      # zero or more
    / 'a'+ /      # one or more

Counted repetitions:

    / 'a'{3} /    # exactly 3 repetitions
    / 'a'{2,4} /  # between 2 and 4 repetitions

### re-compound

Sequence expressions with a space:

    / word digit digit /   # Matches 3 characters in sequence
                           # Examples: a42, b51

(Compare `/ [ word digit ] /`, which is a set matching 1 character.)

Alternation with `|`:

    / word | digit /       # Matches 'a' OR '9', for example

Grouping with parentheses:

    / (word digit) | \\ /  # Matches a9 or \

### re-capture

To retrieve a substring of a string that matches an Eggex, use a "capture
group" like `<capture ...>`.

Here's an eggex with a **positional** capture:

    var pat = / 'hi ' <capture d+> /  # access with _group(1)
                                      # or Match => group(1)

Captures can be **named**:

    <capture d+ as month>       # access with _group('month')
                                # or Match => group('month')

Captures can also have a type **conversion func**:

    <capture d+ : int>          # _group(1) returns Int

    <capture d+ as month: int>  # _group('month') returns Int

Related docs and help topics:

- [YSH Regex API](../ysh-regex-api.html)
- [`_group()`](chap-builtin-func.html#_group)
- [`Match => group()`](chap-type-method.html#group)

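For readers who know Python's `re` module, positional and named captures work along similar lines there. The sketch below is a Python analogue, not eggex: the group syntax `(?P<month>...)` is Python's, and the `int()` conversion has to be applied by hand rather than being attached to the capture.

```python
import re

# Analogue of / 'hi ' <capture d+ as month> /
pat = re.compile(r'hi (?P<month>[0-9]+)')
m = pat.search('hi 12')

print(m.group(1))              # 12 (positional access, a string)
print(m.group('month'))        # 12 (named access, a string)
print(int(m.group('month')))   # 12 (converted explicitly to an int)
```
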
### re-splice

To build an eggex out of smaller expressions, you can **splice** eggexes
together:

    var D = / [0-9][0-9] /
    var time = / @D ':' @D /  # [0-9][0-9]:[0-9][0-9]

If the variable begins with a capital letter, you can omit `@`:

    var ip = / D ':' D /

You can also splice a string:

    var greeting = 'hi'
    var pat = / @greeting ' world' /  # hi world

Splicing is **not** string concatenation; it works on eggex subtrees.

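The practical difference between textual concatenation and subtree splicing can be illustrated with plain regex strings in Python. This sketch shows the pitfall that subtree splicing avoids; it does not show how eggex translates patterns, which this chapter doesn't specify.

```python
import re

alt = 'a|b'  # a sub-pattern containing alternation

naive = alt + 'c'                    # textually 'a|bc', which means: a OR bc
grouped = '(?:' + alt + ')' + 'c'    # the sub-pattern kept as a unit: (a OR b) then c

print(bool(re.fullmatch(naive, 'a')))     # True, probably unintended
print(bool(re.fullmatch(grouped, 'a')))   # False
print(bool(re.fullmatch(grouped, 'ac')))  # True
```
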
### re-flags

Valid ERE flags, which are passed to libc's `regcomp()`:

- `reg_icase` aka `i` - ignore case
- `reg_newline` - 4 matching changes related to newlines

See `man regcomp`.

### re-multiline

Multi-line eggexes aren't yet implemented. Splicing makes it less necessary:

    var Name  = / <capture [a-z]+ as name> /
    var Num   = / <capture d+ as num> /
    var Space = / <capture s+ as space> /

    # For variables named like CapWords, splicing @Name doesn't require @
    var lexer = / Name | Num | Space /