1---
2default_highlighter: oils-sh
3---
4
5A Tour of YSH
6=============
7
8<!-- author's note about example names
9
10- people: alice, bob
11- nouns: ale, bean
12 - peanut, coconut
13- 42 for integers
14-->
15
16This doc describes the [YSH]($xref) language from a **clean slate**
17perspective. We don't assume you know Unix shell, or the compatible
18[OSH]($xref). But shell users will see the similarity, with simplifications
19and upgrades.
20
21Remember, YSH is for Python and JavaScript users who avoid shell! See the
22[project FAQ][FAQ] for more color on that.
23
24[FAQ]: https://www.oilshell.org/blog/2021/01/why-a-new-shell.html
25
26This document is **long** because it demonstrates nearly every feature of the
27language. You may want to read it in multiple sittings, or read [The Simplest
28Explanation of
29Oil](https://www.oilshell.org/blog/2020/01/simplest-explanation.html) first.
30(Until 2023, YSH was called the "Oil language".)
31
32
33Here's a summary of what follows:
34
351. YSH has interleaved *word*, *command*, and *expression* languages.
36 - The command language has Ruby-like *blocks*, and the expression language
37 has Python-like *data types*.
382. YSH has both builtin *commands* like `cd /tmp`, and builtin *functions* like
39 `join()`.
403. Languages for *data*, like [JSON][], are complementary to YSH code.
414. OSH and YSH share both an *interpreter data model* and a *process model*
42 (provided by the Unix kernel). Understanding these common models will make
43 you both a better shell user and YSH user.
44
45Keep these points in mind as you read the details below.
46
47[JSON]: https://json.org
48
49<div id="toc">
50</div>
51
52## Preliminaries
53
54Start YSH just like you start bash or Python:
55
56<!-- oils-sh below skips code block extraction, since it doesn't run -->
57
58```sh-prompt
59bash$ ysh # assuming it's installed
60
61ysh$ echo 'hello world' # command typed into YSH
62hello world
63```
64
65In the sections below, we'll save space by showing output **in comments**, with
66`=>`:
67
68 echo 'hello world' # => hello world
69
70Multi-line output is shown like this:
71
72 echo one
73 echo two
74 # =>
75 # one
76 # two
77
78## Examples
79
80### Hello World Script
81
82You can also type commands into a file like `hello.ysh`. This is a complete
83YSH program, which is identical to a shell program:
84
85 echo 'hello world' # => hello world
86
87### A Taste of YSH
88
89Unlike shell, YSH has `var` and `const` keywords:
90
91 const name = 'world' # const is rarer, used at the top level
92 echo "hello $name" # => hello world
93
94They take rich Python-like expressions on the right:
95
96 var x = 0 # an integer, not a string
97 setvar x = x * 2 + 1 # mutate with the 'setvar' keyword
98
99 setvar x += 5 # Increment by 5
100 echo $x # => 6
101
102 var mylist = [x, 7] # two integers [6, 7]
103
104Expressions are often surrounded by `()`:
105
106 if (x > 0) {
107 echo 'positive'
108 } # => positive
109
110 for i, item in (mylist) { # 'mylist' is a variable, not a string
111 echo "[$i] item $item"
112 }
113 # =>
114 # [0] item 6
115 # [1] item 7
116
117YSH has Ruby-like blocks:
118
119 cd /tmp {
120 echo hi > greeting.txt # file created inside /tmp
121 echo $PWD # => /tmp
122 }
123 echo $PWD # prints the original directory
124
125And utilities to read and write JSON:
126
127 var person = {name: 'bob', age: 42}
128 json write (person)
129 # =>
130 # {
131 # "name": "bob",
132 # "age": 42
133 # }
134
135 echo '["str", 42]' | json read # sets '_reply' variable by default
136
137The `=` keyword evaluates and prints an expression:
138
139 = _reply
140 # => (List) ["str", 42]
141
142(Think of it like `var x = _reply`, without the `var`.)
143
144## Word Language: Expressions for Strings (and Arrays)
145
146Let's describe the word language first, and then talk about commands and
147expressions. Words are a rich language because **strings** are a central
148concept in shell.
149
150### Unquoted Words
151
152Words denote strings, but you often don't need to quote them:
153
154 echo hi # => hi
155
156Quotes are useful when a string has spaces, or punctuation characters like `( )
157;`.
158
159### Three Kinds of String Literals
160
161You can choose the style that's most convenient to write a given string.
162
163#### Double-Quoted, Single-Quoted, and J8 strings (like JSON)
164
165Double-quoted strings allow **interpolation**, with `$`:
166
167 var person = 'alice'
168 echo "hi $person, $(echo bye)" # => hi alice, bye
169
170Write operators by escaping them with `\`:
171
172 echo "\$ \" \\ " # => $ " \
173
174In single-quoted strings, all characters are **literal** (except `'`, which
175can't be expressed):
176
177 echo 'c:\Program Files\' # => c:\Program Files\
178
179If you want C-style backslash **character escapes**, use a J8 string, which is
180like JSON, but with single quotes:
181
182 echo u' A is \u{41} \n line two, with backslash \\'
183 # =>
184 # A is A
185 # line two, with backslash \
186
187The `u''` strings are guaranteed to be valid Unicode (unlike JSON). You can
188also use `b''` strings:
189
190 echo b'byte \yff' # Byte that's not valid unicode, like \xff in C.
191 # Don't confuse it with \u{ff}.
192
193#### Multi-line Strings
194
195Multi-line strings are surrounded with triple quotes. They come in the same
196three varieties, and leading whitespace is stripped in a convenient way.
197
198 sort <<< """
199 var sub: $x
200 command sub: $(echo hi)
201 expression sub: $[x + 3]
202 """
203 # =>
204 # command sub: hi
205 # expression sub: 9
206 # var sub: 6
207
208 sort <<< '''
209 $2.00 # literal $, no interpolation
210 $1.99
211 '''
212 # =>
213 # $1.99
214 # $2.00
215
216 sort <<< u'''
217 C\tD
218 A\tB
219 ''' # b''' strings also supported
220 # =>
221 # A B
222 # C D
223
224(Use multiline strings instead of shell's [here docs]($xref:here-doc).)
225
226### Three Kinds of Substitution
227
228YSH has syntax for 3 types of substitution, all of which start with `$`. That
229is, you can convert any of these things to a **string**:
230
2311. Variables
2322. The output of commands
2333. The value of expressions
234
235#### Variable Sub
236
237The syntax `$a` or `${a}` converts a variable to a string:
238
239 var a = 'ale'
240 echo $a # => ale
241 echo _${a}_ # => _ale_
242 echo "_ $a _" # => _ ale _
243
244The shell operator `:-` is occasionally useful in YSH:
245
246 echo ${not_defined:-'default'} # => default
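
As a small sketch of how `:-` behaves, compare a variable that is defined with one that isn't. The name `greeting` is made up for illustration:

```ysh
var greeting = 'hi'
echo ${greeting:-'default'}      # => hi
echo ${not_defined:-'default'}   # => default
```

The default is only used when the variable has no value, so it's a handy way to supply a fallback.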
247
248#### Command Sub
249
250The `$(echo hi)` syntax runs a command and captures its `stdout`:
251
252 echo $(hostname) # => example.com
253 echo "_ $(hostname) _" # => _ example.com _
254
255#### Expression Sub
256
257The `$[myexpr]` syntax evaluates an expression and converts it to a string:
258
259 echo $[a] # => ale
260 echo $[1 + 2 * 3] # => 7
261 echo "_ $[1 + 2 * 3] _" # => _ 7 _
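
All three substitutions can appear in the same double-quoted string. Here's a minimal sketch; the variable `count` is just an illustrative name:

```ysh
var count = 3
echo "count=$count next=$[count + 1] cmd=$(echo ok)"
# => count=3 next=4 cmd=ok
```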
262
263<!-- TODO: safe substitution with "$[a]"html -->
264
265### Arrays of Strings: Globs, Brace Expansion, Splicing, and Splitting
266
267There are four constructs that evaluate to a **list of strings**, rather than a
268single string.
269
270#### Globs
271
272Globs like `*.py` evaluate to a list of files.
273
274 touch foo.py bar.py # create the files
275 write *.py
276 # =>
277 # bar.py
278 # foo.py
279
280If no files match, it evaluates to an empty list (`[]`).
281
282#### Brace Expansion
283
284The brace expansion mini-language lets you write strings without duplication:
285
286 write {alice,bob}@example.com
287 # =>
288 # alice@example.com
289 # bob@example.com
290
291#### Splicing
292
293The `@` operator splices an array into a command:
294
295 var myarray = :| ale bean |
296 write S @myarray E
297 # =>
298 # S
299 # ale
300 # bean
301 # E
302
303You also have `@[]` to splice an expression that evaluates to a list:
304
305 write -- @[split('ale bean')]
306 # =>
307 # ale
308 # bean
309
310Each item will be converted to a string.
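
Splicing is useful for passing flags you've collected in a variable. This is a sketch that assumes a standard `sort` utility is installed; the `flags` array is hypothetical:

```ysh
var flags = :| -r |                    # flags collected ahead of time
write -- ale bean corn | sort @flags   # same as: sort -r
# =>
# corn
# bean
# ale
```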
311
312#### Split Command Sub / Split Builtin Sub
313
314There's also a variant of *command sub* that decodes J8 lines into a sequence
315of strings:
316
317 write @(seq 3) # write is passed 3 args
318 # =>
319 # 1
320 # 2
321 # 3
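
Because the result is a sequence of words, it also works as the word list of a `for` loop. A minimal sketch, assuming the external `seq` command is available:

```ysh
for n in @(seq 3) {
  echo "n = $n"
}
# =>
# n = 1
# n = 2
# n = 3
```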
322
323## Command Language: I/O, Control Flow, Abstraction
324
325### Simple Commands
326
327A simple command is a space-separated list of words. YSH looks up the first
328word to determine if it's a builtin command, or a user-defined `proc`.
329
330 echo 'hello world' # The shell builtin 'echo'
331
332 proc greet (name) { # A proc is like a procedure or process
333 echo "hello $name"
334 }
335
336 # The first word now resolves to the proc you defined
337 greet alice # => hello alice
338
339If it's neither, then it's assumed to be an external command:
340
341 ls -l /tmp # The external 'ls' command
342
343Commands accept traditional string arguments, as well as typed arguments in
344parentheses:
345
346 # 'write' is a string arg; 'x' is a typed expression arg
347 json write (x)
348
349<!--
350Block args are a special kind of typed arg:
351
352 cd /tmp {
353 echo $PWD
354 }
355-->
356
357### Redirects
358
359You can **redirect** `stdin` and `stdout` of simple commands:
360
361 echo hi > tmp.txt # write to a file
362 sort < tmp.txt
363
364Here are the most common idioms for using `stderr` (identical to shell):
365
366 ls /tmp 2>errors.txt
367 echo 'fatal error' >&2
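
The two idioms combine naturally. In this sketch, the proc `warn` and the file `warnings.txt` are made-up names: the proc writes to `stderr`, and the caller decides where `stderr` goes.

```ysh
proc warn (msg) {
  echo "warning: $msg" >&2                # diagnostics go to stderr
}

warn 'disk almost full' 2> warnings.txt   # capture stderr in a file
cat warnings.txt                          # => warning: disk almost full
```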
368
369### ARGV and ENV
370
371The `ARGV` list holds the arguments passed to the shell:
372
373 var num_args = len(ARGV)
374 ls /tmp @ARGV # pass shell's arguments through
375
376---
377
378You can add to the environment of a new process with a *prefix binding*:
379
380 PYTHONPATH=vendor ./demo.py
381
382The `ENV` object reflects the current environment:
383
384 echo $[ENV.PYTHONPATH] # => vendor
385
386### Pipelines
387
388Pipelines are a powerful method for manipulating data streams:
389
390 ls | wc -l # count files in this directory
391 find /bin -type f | xargs wc -l # count lines in each file of a subtree
392
393The stream may contain (lines of) text, binary data, JSON, TSV, and more.
394Details below.
395
396### Multi-line Commands
397
398The `...` prefix lets you write long commands, pipelines, and `&&` chains
399without `\` line continuations.
400
401 ... find /bin # traverse this directory and
402 -type f -a -executable # print executable files
403 | sort -r # reverse sort
404 | head -n 30 # limit to 30 files
405 ;
406
407When this mode is active:
408
409- A single newline behaves like a space
410- A blank line (two newlines in a row) is illegal, but a line that has only a
411 comment is allowed. This prevents confusion if you forget the `;`
412 terminator.
413
414### `var`, `setvar`, `const` to Declare and Mutate
415
416Constants can't be modified:
417
418 const myconst = 'mystr'
419 # setvar myconst = 'foo' would be an error
420
421Modify variables with the `setvar` keyword:
422
423 var num_beans = 12
424 setvar num_beans = 13
425
426A more complex example:
427
428 var d = {name: 'bob', age: 42} # dict literal
429 setvar d.name = 'alice' # d.name is a synonym for d['name']
430 echo $[d.name] # => alice
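
The same keyword mutates elements of a list or entries of a dict. Here's a short sketch; `beans` and `stock` are illustrative names:

```ysh
var beans = ['pinto', 'black']
setvar beans[1] = 'kidney'             # replace a list element by index

var stock = {count: 1}
setvar stock.count = stock.count + 1   # same as stock['count']

echo "$[beans[1]] $[stock.count]"      # => kidney 2
```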
431
432That's most of what you need to know about assignments. Advanced users may
433want to use `setglobal` or `call myplace->setValue(42)` in certain situations.
434
435<!--
436 var g = 1
437 var h = 2
438 proc demo(:out) {
439 setglobal g = 42
440 setref out = 43
441 }
442 demo :h # pass a reference to h
443 echo "$g $h" # => 42 43
444-->
445
446More info: [Variable Declaration and Mutation](variables.html).
447
448### `for` Loop
449
450#### Words
451
452Shell-style for loops iterate over **words**:
453
454 for word in 'oils' $num_beans {pea,coco}nut {
455 echo $word
456 }
457 # =>
458 # oils
459 # 13
460 # peanut
461 # coconut
462
463You can also request the loop index:
464
465 for i, word in README.md *.py {
466 echo "$i - $word"
467 }
468 # =>
469 # 0 - README.md
470 # 1 - __init__.py
471
472#### Typed Data
473
474To iterate over typed data, use parentheses around an **expression**. The
475expression should evaluate to an integer `Range`, `List`, `Dict`, or `Stdin`.
476
477Range:
478
479 for i in (3 ..< 5) { # range operator ..<
480 echo "i = $i"
481 }
482 # =>
483 # i = 3
484 # i = 4
485
486List:
487
488 var foods = ['ale', 'bean']
489 for item in (foods) {
490 echo $item
491 }
492 # =>
493 # ale
494 # bean
495
496Again, you can request the index with `for i, item in ...`.
497
498---
499
500Here's the most general form of the loop over `Dict`:
501
502 var mydict = {pea: 42, nut: 10}
503 for i, k, v in (mydict) {
504 echo "$i - $k - $v"
505 }
506 # =>
507 # 0 - pea - 42
508 # 1 - nut - 10
509
510There are two simpler forms:
511
512- One variable gives you the key: `for k in (mydict)`
513- Two variables gives you the key and value: `for k, v in (mydict)`
514
515(One way to think of it: `for` loops in YSH have the functionality of Python's
516`enumerate()`, `items()`, `keys()`, and `values()`.)
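
For instance, the two-variable form can be used to add up the values of a dict. This is a sketch with made-up data in `prices`:

```ysh
var prices = {ale: 5, bean: 2}
var total = 0
for name, price in (prices) {
  echo "$name costs $price"
  setvar total += price
}
echo "total = $total"
# =>
# ale costs 5
# bean costs 2
# total = 7
```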
517
518---
519
520The `io.stdin` object iterates over lines:
521
522 for line in (io.stdin) {
523 echo $line
524 }
525 # lines are buffered, so it's much faster than `while read --rawline`
526
527<!--
528TODO: Str loop should give you the (UTF-8 offset, rune)
529Or maybe just UTF-8 offset? Decoding errors could be exceptions, or Unicode
530replacement.
531-->
532
533### `while` Loop
534
535While loops can use a **command** as the termination condition:
536
537 while test --file lock {
538 sleep 1
539 }
540
541Or an **expression**, which is surrounded in `()`:
542
543 var i = 3
544 while (i < 6) {
545 echo "i = $i"
546 setvar i += 1
547 }
548 # =>
549 # i = 3
550 # i = 4
551 # i = 5
552
553### `if elif` Conditional
554
555If statements test the exit code of a command, and have optional `elif` and
556`else` clauses:
557
558 if test --file foo {
559 echo 'foo is a file'
560 rm --verbose foo # delete it
561 } elif test --dir foo {
562 echo 'foo is a directory'
563 } else {
564 echo 'neither'
565 }
566
567Invert the exit code with `!`:
568
569 if ! grep alice /etc/passwd {
570 echo 'alice is not a user'
571 }
572
573As with `while` loops, the condition can also be an **expression** wrapped in
574`()`:
575
576 if (num_beans > 0) {
577 echo 'so many beans'
578 }
579
580 var done = false
581 if (not done) { # negate with 'not' operator (contrast with !)
582 echo "we aren't done"
583 }
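
Expression conditions work in `elif` clauses too. A minimal sketch, with `n` as an illustrative variable:

```ysh
var n = 0
if (n > 0) {
  echo 'positive'
} elif (n === 0) {
  echo 'zero'
} else {
  echo 'negative'
}
# => zero
```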
584
585### `case` Conditional
586
587The case statement is a series of conditionals and executable blocks. The
588condition can be either an unquoted glob pattern like `*.py`, an eggex pattern
589like `/d+/`, or a typed expression like `(42)`:
590
591 var s = 'README.md'
592 case (s) {
593 *.py { echo 'Python' }
594 *.cc | *.h { echo 'C++' }
595 * { echo 'Other' }
596 }
597 # => Other
598
599 case (s) {
600 / dot* '.md' / { echo 'Markdown' }
601 (30 + 12) { echo 'the integer 42' }
602 (else) { echo 'neither' }
603 }
604 # => Markdown
605
606
607<!--
608(Shell style like `if foo; then ... fi` and `case $x in ... esac` is also
609legal, but discouraged in YSH code.)
610-->
611
612### Error Handling
613
614If statements are also used for **error handling**. Builtins and external
615commands use this style:
616
617 if ! test -d /bin {
618 echo 'not a directory'
619 }
620
621 if ! cp foo /tmp {
622 echo 'error copying' # any non-zero status
623 }
624
625Procs use this style (because of shell's *disabled `errexit` quirk*):
626
627 try {
628 myproc
629 }
630 if failed {
631 echo 'failed'
632 }
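
Here's a sketch of the second style with a made-up proc, `check-file`. It assumes the default YSH behavior where a failing command aborts the proc, and it reads the status from the `_error` register that `try` sets:

```ysh
proc check-file (path) {
  test -f $path          # a non-zero status here makes the proc fail
  echo "found $path"
}

try {
  check-file /no/such/file    # path assumed not to exist
}
if failed {
  echo "check-file failed with status $[_error.code]"
}
```

Since the proc fails at the `test` line, the `echo "found ..."` line is never reached.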
633
634For a complete list of examples, see [YSH Error
635Handling](ysh-error.html). For design goals and a reference, see [YSH
636Fixes Shell's Error Handling](error-handling.html).
637
638#### exit, break, continue, return
639
640The `exit` **keyword** exits a process. (It's not a shell builtin.)
641
642The other 3 control flow keywords behave like they do in Python and JavaScript.
643
644### Ruby-like Block Arguments
645
646Here's a builtin command that takes a literal block argument:
647
648 shopt --unset errexit { # ignore errors
649 cp ale /tmp
650 cp bean /bin
651 }
652
653A block is a value of type `Command`.
654
655### Shell-like `proc`
656
657You can define units of code with the `proc` keyword.
658
659 proc mycopy (src, dest) {
660 ### Copy verbosely
661
662 mkdir -p $dest
663 cp --verbose $src $dest
664 }
665
666The `###` line is a "doc comment". Simple procs like this are invoked like a
667shell command:
668
669 touch log.txt
670 mycopy log.txt /tmp # first word 'mycopy' is a proc
671
672Procs have many features, including **four** kinds of arguments:
673
6741. Word args (which are always strings)
6751. Typed, positional args (aka positional args)
6761. Typed, named args (aka named args)
6771. A final block argument, which may be written with `{ }`.
678
679At the call site, they can look like any of these forms:
680
681 ls /tmp # word arg
682
683 json write (d) # word arg, then positional arg
684
685 try {
686 error 'failed' (status=9) # word arg, then named arg
687 }
688
689 cd /tmp { echo $PWD } # word arg, then block arg
690
691 pp value ([1, 2]) # positional, typed arg
692
693<!-- TODO: lazy arg list: ls8 | where [age > 10] -->
694
695At the definition site, the kinds of parameters are separated with `;`, similar
696to the Julia language:
697
698 proc p2 (word1, word2; pos1, pos2, ...rest_pos) {
699 echo "$word1 $word2 $[pos1 + pos2]"
700 json write (rest_pos)
701 }
702
703 proc p3 (w ; ; named1, named2, ...rest_named; block) {
704 echo "$w $[named1 + named2]"
705 eval (block)
706 json write (rest_named)
707 }
708
709 proc p4 (; ; ; block) {
710 eval (block)
711 }
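
To connect the definition site with the call site, here's a small sketch of a proc that takes one word param and one typed positional param. The name `repeat-greeting` is hypothetical:

```ysh
proc repeat-greeting (name; times) {
  ### Greet NAME, TIMES times
  for i in (0 ..< times) {
    echo "hello $name"
  }
}

repeat-greeting alice (2)     # word arg, then typed arg in parens
# =>
# hello alice
# hello alice
```

The word arg `alice` arrives as a string, while `(2)` arrives as an integer, so it can be used directly in the range.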
712
713YSH also has Python-like functions defined with `func`. These are part of the
714expression language, which we'll see later.
715
716For more info, see the [Guide to Procs and Funcs](proc-func.html).
717
718#### Builtin Commands
719
720**Shell builtins** like `cd` and `read` are the "standard library" of the
721command language. Each one takes various flags:
722
723 cd -L . # follow symlinks
724
725 echo foo | read --all # read all of stdin
726
727Here are some categories of builtin:
728
729- I/O: `echo write read`
730- File system: `cd test`
731- Processes: `fork wait forkwait exec`
732- Interpreter settings: `shopt shvar`
733- Meta: `command builtin runproc type eval`
734
735<!-- TODO: Link to a comprehensive list of builtins -->
736
737## Expression Language: Python-like Types
738
739YSH expressions look and behave more like Python or JavaScript than shell. For
740example, we write `if (x < y)` instead of `if [ $x -lt $y ]`. Expressions are
741usually surrounded by `( )`.
742
743At runtime, variables like `x` and `y` are bound to **typed data**, like
744integers, floats, strings, lists, and dicts.
745
746<!--
747[Command vs. Expression Mode](command-vs-expression-mode.html) may help you
748understand how YSH is parsed.
749-->
750
751### Python-like `func`
752
753At the end of the *Command Language*, we saw that procs are shell-like units of
754code. YSH also has Python-like **functions**, which are different than
755`procs`:
756
757- They're defined with the `func` keyword.
758- They're called in expressions, not in commands.
759- They're **pure**, and live in the **interior** of a process.
760 - In contrast, procs usually perform I/O, and have **exterior** boundaries.
761
762The simplest function is:
763
764 func identity(x) {
765 return (x) # parens required for typed return
766 }
767
768A more complex pure function:
769
770 func myRepeat(s, n; special=false) { # positional; named params
771 var parts = []
772 for i in (0 ..< n) {
773 append $s (parts)
774 }
775 var result = join(parts)
776
777 if (special) {
778 return ("$result !!")
779 } else {
780 return (result)
781 }
782 }
783
784 echo $[myRepeat('z', 3)] # => zzz
785
786 echo $[myRepeat('z', 3, special=true)] # => zzz !!
787
788A function that mutates its argument:
789
790 func popTwice(mylist) {
791 call mylist->pop()
792 call mylist->pop()
793 }
794
795 var mylist = [3, 4]
796
797 # The call keyword is an "adapter" between commands and expressions,
798 # like the = keyword.
799 call popTwice(mylist)
800
801
802Funcs are named using `camelCase`, while procs use `kebab-case`. See the
803[Style Guide](style-guide.html) for more conventions.
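
As one more sketch, here's a pure function that takes a list and returns an integer. The name `sumList` is made up, following the `camelCase` convention:

```ysh
func sumList(nums) {
  var total = 0
  for n in (nums) {
    setvar total += n
  }
  return (total)
}

echo $[sumList([1, 2, 3])]   # => 6
```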
804
805#### Builtin Functions
806
807In addition to builtin commands, YSH has Python-like builtin **functions**.
808These are like the "standard library" for the expression language. Examples:
809
810- Functions that take multiple types: `len() type()`
811- Conversions: `bool() int() float() str() list() ...`
812- Explicit word evaluation: `split() join() glob() maybe()`
813
814<!-- TODO: Make a comprehensive list of func builtins. -->
815
816
817### Data Types: `Int`, `Str`, `List`, `Dict`, `Obj`, ...
818
819YSH has data types, each with an expression syntax and associated methods.
820
821### Methods
822
823YSH adds mutable data structures to shell, so we have a special syntax for
824mutating methods. They are looked up with a thin arrow `->`:
825
826 var foods = ['ale', 'bean']
827 var last = foods->pop() # bean
828 write @foods # => ale
829
830You can ignore the return value with the `call` keyword:
831
832 call foods->pop()
833
834Regular methods are looked up with the `.` operator:
835
836 var line = ' ale bean '
837 var caps = line.trim().upper() # 'ALE BEAN'
838
839---
840
841You can also chain functions with a fat arrow `=>`:
842
843 var trimmed = line.trim() => upper() # 'ALE BEAN'
844
845The `=>` operator allows functions to appear in a natural left-to-right order,
846like methods.
847
848 # list() is a free function taking one arg
849 # join() is a free function taking two args
850 var x = {k1: 42, k2: 43} => list() => join('/') # 'k1/k2'
851
852---
853
854Now let's go through the data types in YSH. We'll show the syntax for
855literals, and what **methods** they have.
856
857#### Null and Bool
858
859YSH uses JavaScript-like spellings for these three "atoms":
860
861 var x = null
862
863 var b1, b2 = true, false
864
865 if (b1) {
866 echo 'yes'
867 } # => yes
868
869
870#### Int
871
872There are many ways to write integers:
873
874 var small, big = 42, 65_536
875 echo "$small $big" # => 42 65536
876
877 var hex, octal, binary = 0x0001_0000, 0o755, 0b0001_0101
878 echo "$hex $octal $binary" # => 65536 493 21
879
880<!--
881"Runes" are integers that represent Unicode code points. They're not common in
882YSH code, but can make certain string algorithms more readable.
883
884 # Pound rune literals are similar to ord('A')
885 const a = #'A'
886
887 # Backslash rune literals can appear outside of quotes
888 const newline = \n # Remember this is an integer
889 const backslash = \\ # ditto
890
891 # Unicode rune literal is syntactic sugar for 0x3bc
892 const mu = \u{3bc}
893
894 echo "chars $a $newline $backslash $mu" # => chars 65 10 92 956
895-->
896
897#### Float
898
899Floats are written with a decimal point:
900
901 var big = 3.14
902
903You can use scientific notation, as in Python:
904
905 var small = 1.5e-10
906
907#### Str
908
909See the section above on *Three Kinds of String Literals*. It described
910`'single quoted'`, `"double ${quoted}"`, and `u'J8-style\n'` strings; as well
911as their multiline variants.
912
913Strings are UTF-8 encoded in memory, like strings in the [Go
914language](https://golang.org). There isn't a separate string and unicode type,
915as in Python.
916
917Strings are **immutable**, as in Python and JavaScript. This means they only
918have **transforming** methods:
919
920 var x = s.trim()
921
922Other methods:
923
924- `trimLeft() trimRight()`
925- `trimPrefix() trimSuffix()`
926- `upper() lower()`
927- `search() leftMatch()` - pattern matching
928- `replace() split()`
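
A quick sketch of what "transforming" means: each method returns a new string and leaves the original alone. The variable names here are illustrative:

```ysh
var raw = '  Ale Bean  '
var clean = raw.trim()        # a new string; 'raw' is unchanged
echo "<$[clean.lower()]>"     # => <ale bean>
echo "<$raw>"                 # => <  Ale Bean  >
```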
929
930#### List (and Arrays)
931
932All lists can be expressed with Python-like literals:
933
934 var foods = ['ale', 'bean', 'corn']
935 var recursive = [1, [2, 3]]
936
937As a special case, lists of strings are called **arrays**. It's often more
938convenient to write them with shell-like literals:
939
940 # No quotes or commas
941 var foods = :| ale bean corn |
942
943 # You can use the word language here
944 var other = :| foo $s *.py {alice,bob}@example.com |
945
946Lists are **mutable**, as in Python and JavaScript. So they mainly have
947mutating methods:
948
949 call foods->reverse()
950 write -- @foods
951 # =>
952 # corn
953 # bean
954 # ale
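
Here's another small sketch that grows a list in place and then inspects it with builtin functions. The list `snacks` is made up so it doesn't interfere with `foods` above:

```ysh
var snacks = :| ale bean |
call snacks->append('corn')     # mutates the list in place
echo $[len(snacks)]             # => 3
echo $[join(snacks, ', ')]      # => ale, bean, corn
```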
955
956#### Dict
957
958Dicts use syntax that's like JavaScript. Here's a dict literal:
959
960 var d = {
961 name: 'bob', # unquoted keys are allowed
962 age: 42,
963 'key with spaces': 'val'
964 }
965
966You can use either `[]` or `.` to retrieve a value, given a key:
967
968 var v1 = d['name']
969 var v2 = d.name # shorthand for the above
970 var v3 = d['key with spaces'] # no shorthand for this
971
972(If the key doesn't exist, an error is raised.)
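
To avoid that error, you can test for a key with the `in` operator first. A minimal sketch, using a made-up dict `person`:

```ysh
var person = {name: 'bob'}
if ('age' in person) {
  echo "age is $[person.age]"
} else {
  echo 'no age recorded'
}
# => no age recorded
```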
973
974You can change Dict values with the same 2 syntaxes:
975
976 setvar d['name'] = 'other'
977 setvar d.name = 'fun'
978
979---
980
981If you want to compute a key name, use an expression inside `[]`:
982
983 var key = 'alice'
984 var d2 = {[key ++ '_z']: 'ZZZ'} # Computed key name
985 echo $[d2.alice_z] # => ZZZ
986
987If you omit the value, it's taken from a variable of the same name:
988
989 var d3 = {key} # like {key: key}
990 echo "name is $[d3.key]" # => name is alice
991
992More examples:
993
994 var empty = {}
995 echo $[len(empty)] # => 0
996
997The `keys()` and `values()` functions return new `List` objects:
998
999 var keys = keys(d2) # => alice_z
1000 var vals = values(d3) # => alice
1001
1002### `Place` type / "out params"
1003
1004The `read` builtin can either set an implicit variable `_reply`:
1005
1006 whoami | read --all # sets _reply
1007
1008Or you can pass a `value.Place`, created with `&`:
1009
1010 var x # implicitly initialized to null
1011 whoami | read --all (&x) # mutate this "place"
1012 echo who=$x # => who=andy
1013
1014<!--
1015#### Quotation Types: value.Command (Block) and value.Expr
1016
1017These types are for reflection on YSH code. Most YSH programs won't use them
1018directly.
1019
1020- `Command`: an unevaluated code block.
1021 - rarely-used literal: `^(ls | wc -l)`
1022- `Expr`: an unevaluated expression.
1023 - rarely-used literal: `^[42 + a[i]]`
1024-->
1025
1026### Operators
1027
1028YSH operators are generally the same as in Python:
1029
1030 if (10 <= num_beans and num_beans < 20) {
1031 echo 'enough'
1032 } # => enough
1033
1034YSH has a few operators that aren't in Python. Equality can be approximate or
1035exact:
1036
1037 var n = ' 42 '
1038 if (n ~== 42) {
1039 echo 'equal after stripping whitespace and type conversion'
1040 } # => equal after stripping whitespace and type conversion
1041
1042 if (n === 42) {
1043 echo "not reached because strings and ints aren't equal"
1044 }
1045
1046<!-- TODO: is n === 42 a type error? -->
1047
1048Pattern matching can be done with globs (`~~` and `!~~`)
1049
1050 const filename = 'foo.py'
1051 if (filename ~~ '*.py') {
1052 echo 'Python'
1053 } # => Python
1054
1055 if (filename !~~ '*.sh') {
1056 echo 'not shell'
1057 } # => not shell
1058
1059or regular expressions (`~` and `!~`). See the Eggex section below for an
1060example of the latter.
1061
1062Concatenation is `++` rather than `+` because it avoids confusion in the
1063presence of type conversion:
1064
1065 var n = '42' + 1 # string plus int does implicit conversion
1066 echo $n # => 43
1067
1068 var y = 'ale ' ++ "bean $n" # concatenation
1069 echo $y # => ale bean 43
1070
1071<!--
1072TODO: change example above
1073 var n = '42' + 1 # string plus int does implicit conversion
1074-->
1075
1076<!--
1077
1078#### Summary of Operators
1079
1080- Arithmetic: `+ - * / // %` and `**` for exponentatiation
1081 - `/` always yields a float, and `//` is integer division
1082- Bitwise: `& | ^ ~`
1083- Logical: `and or not`
1084- Comparison: `== < > <= >= in 'not in'`
1085 - Approximate equality: `~==`
1086 - Eggex and glob match: `~ !~ ~~ !~~`
1087- Ternary: `1 if x else 0`
1088- Index and slice: `mylist[3]` and `mylist[1:3]`
1089 - `mydict->key` is a shortcut for `mydict['key']`
1090- Function calls
1091 - free: `f(x, y)`
1092 - transformations and chaining: `s => startWith('prefix')`
1093 - mutating methods: `mylist->pop()`
1094- String and List: `++` for concatenation
1095 - This is a separate operator because the addition operator `+` does
1096 string-to-int conversion
1097
1098TODO: What about list comprehensions?
1099-->
1100
1101### Egg Expressions (YSH Regexes)
1102
1103An *Eggex* is a YSH expression that denotes a regular expression. Eggexes
1104translate to POSIX ERE syntax, for use with tools like `egrep`, `awk`, and `sed
1105--regexp-extended` (GNU only).
1106
1107They're designed to be readable and composable. Example:
1108
1109 var D = / digit{1,3} /
1110 var ip_pattern = / D '.' D '.' D '.' D /
1111
1112 var z = '192.168.0.1'
1113 if (z ~ ip_pattern) { # Use the ~ operator to match
1114 echo "$z looks like an IP address"
1115 } # => 192.168.0.1 looks like an IP address
1116
1117 if (z !~ / '.255' %end /) {
1118 echo "doesn't end with .255"
1119 } # => doesn't end with .255
1120
1121See the [Egg Expressions doc](eggex.html) for details.
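
Composability means you can name small patterns and build bigger ones from them. Here's a sketch that anchors a date-like pattern at both ends; the names `Year`, `TwoDigits`, and `date_pattern` are made up for illustration:

```ysh
var Year = / digit{4} /
var TwoDigits = / digit{2} /
var date_pattern = / %start Year '-' TwoDigits '-' TwoDigits %end /

if ('2024-01-15' ~ date_pattern) {
  echo 'looks like a date'
}   # => looks like a date
```

Note that this only checks the shape of the string. It doesn't validate that the month or day is in range.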
1122
1123## Interlude
1124
1125Let's review what we've seen before moving on to other YSH features.
1126
1127### Three Interleaved Languages
1128
1129Here are the languages we saw in the last 3 sections:
1130
11311. **Words** evaluate to a string, or list of strings. This includes:
1132 - literals like `'mystr'`
1133 - substitutions like `${x}` and `$(hostname)`
1134 - globs like `*.sh`
11352. **Commands** are used for
1136 - I/O: pipelines, builtins like `read`
1137 - control flow: `if`, `for`
1138 - abstraction: `proc`
11393. **Expressions** on typed data are borrowed from Python, with influence from
1140 JavaScript:
1141 - Lists: `['ale', 'bean']` or `:| ale bean |`
1142 - Dicts: `{name: 'bob', age: 42}`
1143 - Functions: `split('ale bean')` and `join(['pea', 'nut'])`
1144
1145### How Do They Work Together?
1146
1147Here are two examples:
1148
1149(1) In this *command*, there are **four** *words*. The fourth word is an
1150*expression sub* `$[]`.
1151
1152 write hello $name $[d['age'] + 1]
1153 # =>
1154 # hello
1155 # world
1156 # 43
1157
1158(2) In this assignment, the *expression* on the right hand side of `=`
1159concatenates two strings. The first string is a literal, and the second is a
1160*command sub*.
1161
1162 var food = 'ale ' ++ $(echo bean | tr a-z A-Z)
1163 write $food # => ale BEAN
1164
1165So words, commands, and expressions are **mutually recursive**. If you're a
1166conceptual person, skimming [Syntactic Concepts](syntactic-concepts.html) may
1167help you understand this on a deeper level.
1168
1169<!--
1170One way to think about these sublanguages is to note that the `|` character
1171means something different in each context:
1172
1173- In the command language, it's the pipeline operator, as in `ls | wc -l`
1174- In the word language, it's only valid in a literal string like `'|'`, `"|"`,
1175 or `\|`. (It's also used in `${x|html}`, which formats a string.)
1176- In the expression language, it's the bitwise OR operator, as in Python and
1177 JavaScript.
1178-->
1179
## Advanced YSH Features

Unlike shell, YSH is powerful enough to write reusable **libraries**. It also
has reflective features, to allow creating reusable **languages**!

The following sections give you a taste of some advanced features.

### Closures

Block arguments capture the frame they're defined in, which means they have
*lexical scope*.

For example, this proc accepts a block, and runs it:

    proc do-it (; ; ; block) {
      call io->eval(block)
    }

When you pass a block to it, the enclosing stack frame is captured:

    var x = 42
    do-it {
      echo "x = $x"  # outer x is visible LATER, when the block is run
    }

- [Feature Index: Closures](ref/feature-index.html#Closures)

### Objects

YSH has an `Obj` type that bundles **code** and **data**. (In contrast, JSON
messages are pure data, not objects.)

The main purpose of objects is **polymorphism**:

    var obj = makeMyObject(42)  # I don't know what it looks like inside

    echo $[obj.myMethod()]      # But I can perform abstract operations

    call obj->mutatingMethod()  # Mutation is considered special, with ->

YSH objects are similar to Lua and JavaScript objects: they have a `Dict` of
properties, and a recursive "prototype chain" that is also an `Obj`.

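As a rough sketch of the properties-plus-prototype idea, the snippet below
builds one object by hand. It assumes the `Obj.new(props, proto)` constructor
listed in the reference, and that a func found on the prototype receives the
object as its first argument. The names `Rect_area`, `methods`, and `rect` are
hypothetical, chosen only for illustration.

```
# Hypothetical example: a "rectangle" object built by hand

func Rect_area(self) {
  # self is the object the method was looked up on
  return (self.w * self.h)
}

# The prototype holds code; it has no prototype of its own (null)
var methods = Obj.new({area: Rect_area}, null)

# The instance holds data, and points at the prototype
var rect = Obj.new({w: 3, h: 4}, methods)

echo $[rect.area()]  # 'area' is not in rect's own properties, so the
                     # lookup continues up the prototype chain
```
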
- [Feature Index: Objects](ref/feature-index.html#Objects)

### Modules

A module is a **file** of source code, like `lib/myargs.ysh`.

The `use` builtin turns it into an `Obj` that can be invoked and inspected:

    use myargs.ysh
    myargs proc1 --flag val     # module name becomes a prefix, via __invoke__
    var alias = myargs.proc1    # module has attributes

You can import specific names with the `--pick` flag:

    use myargs.ysh --pick p2 p3
    p2
    p3

<!--
TODO: not mentioning __provide__, since it should be optional in the most basic usage?
-->

- [Feature Index: Modules](ref/feature-index.html#Modules)

### Reflecting on the Interpreter

YSH is a language for creating other languages. You can reflect on the
interpreter with APIs like `io->eval()` and `vm.getFrame()`.

- [Feature Index: Reflection](ref/feature-index.html#Reflection)

(Ruby, Tcl, and Racket also have this flavor.)

---

These advanced features all live **inside** the Oils interpreter. But a shell
naturally deals with textual data from the **outside**, so let's switch gears.

## Data Notation / Interchange Formats

YSH reads and writes **data notation**, like [JSON]($xref).

I think of them as languages for data, rather than code. Instead of being
executed, they're parsed as data structures.

<!-- TODO: Link to slogans, fallacies, and concepts -->

### UTF-8

UTF-8 is the foundation of our textual data languages.

It's the most common Unicode encoding, and represents all code points
consistently and efficiently.

<!-- TODO: there's a runes() iterator which gives integer offsets, usable for
slicing -->

<!-- TODO: write about J8 notation -->

### Lines of Text (traditional), and JSON/J8 Strings

Traditional Unix tools like `grep` and `awk` operate on streams of lines. YSH
supports this style, like any other shell.

But YSH also has [J8 Notation][], a data format based on [JSON][]. It's a 100%
compatible upgrade that fixes some warts in JSON, and makes Unix text and JSON
work together more smoothly.

---

[J8 Notation]: j8-notation.html

Let's talk about simple strings and lines first. Here is YSH code for making a
string with 2 lines:

    var mystr = u'pea\n' ++ u'42\n'

Now we can **encode** it into a message, which will fit on a single line.

    json write (mystr) > message.txt

Now we can compress `message.txt`, encrypt it, and send it to another computer.

And then we can **decode** it, i.e. read it back into a variable:

    json read (&x) < message.txt
    = x  # => "pea\n42\n"

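The same round trip can probably be done in memory, without a file. This is a
minimal sketch that assumes the builtin functions `toJson()` and `fromJson()`
are available. The variable names `encoded` and `decoded` are only for
illustration.

```
var mystr = u'pea\n' ++ u'42\n'

var encoded = toJson(mystr)      # encode: a JSON string, on a single line
var decoded = fromJson(encoded)  # decode: back to a string with 2 lines

if (decoded === mystr) {
  echo 'round trip preserved the string'
}
```
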
<!--
This can also be done with functions like `toJson()` and `fromJson()`

    write $[toJson(mystr)]  # => "pea\n42\n"

    # JSON8 is the same, but it's not lossy for binary data
    write $[toJson8(mystr)]  # => "pea\n42\n"

-->

### Structured: JSON8, TSV8

In addition to strings and lines, you can write and read **tree-shaped** data
as [JSON][]:

    var d = {key: 'value'}
    json write (d)                 # dump variable d as JSON
    # =>
    # {
    #   "key": "value"
    # }

    echo '["ale", 42]' > example.json

    json read (&d2) < example.json  # parse JSON into var d2
    pp (d2)                         # pretty print it
    # => (List)  ['ale', 42]

[JSON][] will lose information when strings have binary data, but the slight
[JSON8]($xref) upgrade won't:

    var b = {binary: $'\xff'}
    json8 write (b)
    # =>
    # {
    #   "binary": b'\yff'
    # }

[JSON]: $xref

**Table-shaped** data can be read and written as [TSV8]($xref). (TODO: not yet
implemented.)

<!-- Figure out the API. Does it work like JSON?

Or I think we just implement
- rows: 'where' or 'filter' (dplyr)
- cols: 'select' conflicts with shell builtin; call it 'cols'?
- sort: 'sort-by' or 'arrange' (dplyr)
- TSV8 <=> sqlite conversion. Are these drivers or what?
  - and then let you pipe output?

Do we also need TSV8 space2tab or something? For writing TSV8 inline.

More later:
- MessagePack (e.g. for shared library extension modules)
  - msgpack read, write? I think user-defined function could be like this?
- SASH: Simple and Strict HTML? For easy processing
-->

## The Runtime Shared by OSH and YSH

Although we describe OSH and YSH as different languages, they use the **same**
interpreter under the hood. This interpreter has various `shopt` flags that
are flipped for different behavior, e.g. with `shopt --set ysh:all`.

Understanding this interpreter and its interface to the Unix kernel will help
you understand **both** languages!

### Interpreter Data Model

The [Interpreter State](interpreter-state.html) doc is **under construction**.
It will cover:

- Two separate namespaces (like Lisp 1 vs. 2):
  - **proc** namespace for procs as the first word
  - **variable** namespace
- The variable namespace has a **call stack**, for the local variables of a
  proc.
  - Each **stack frame** is a `{name -> cell}` mapping.
  - A **cell** has one of the above data types: `Bool`, `Int`, `Str`, etc.
  - A cell has `readonly`, `export`, and `nameref` **flags**.
- Boolean shell options with `shopt`: `parse_paren`, `simple_word_eval`, etc.
- String shell options with `shvar`: `IFS`, `PATH`
- **Registers** that are silently modified by the interpreter
  - `$?` and `_error`
  - `$!` for the last PID
  - `_this_dir`
  - `_reply`

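To make the two kinds of options in the list above a bit more concrete, here
is a small sketch. It assumes that both `shopt` and `shvar` accept a block, and
that the setting is restored when the block ends.

```
# Boolean option: turn off errexit, but only inside the block
shopt --unset errexit {
  false               # a failing command doesn't abort the script here
  echo "status = $?"  # $? is one of the "registers" listed above
}

# String option: set IFS, but only inside the block
shvar IFS=/ {
  echo "ifs is $IFS"
}
```
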
### Process Model (the kernel)

The [Process Model](process-model.html) doc is **under construction**. It will cover:

- Simple Commands, `exec`
- Pipelines. #[shell-the-good-parts](#blog-tag)
- `fork`, `forkwait`
- Command and process substitution.
- Related links:
  - [Tracing execution in Oils](xtrace.html) (xtrace), which divides
    process-based concurrency into **synchronous** and **async** constructs.
  - [Three Comics For Understanding Unix
    Shell](http://www.oilshell.org/blog/2020/04/comics.html) (blog)

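Until that doc is written, here is a hedged sketch of the `fork` and
`forkwait` builtins from the list above. It assumes `fork` behaves like
shell's `cmd &` and `forkwait` behaves like a `( subshell )`, with the block
running in a child process in both cases.

```
# Asynchronous: the block runs in a background process
fork {
  sleep 0.1
  echo 'from a background job'
}
wait  # block until background jobs finish

# Synchronous: the block runs in a child process, and we wait for it
forkwait {
  cd /tmp
  pwd  # the cd only affects the child process
}
pwd    # the parent's working directory is unchanged
```
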

<!--
Process model additions: Capers, Headless shell

some optimizations: See YSH starts fewer processes than other shells.
-->

## Summary

YSH is a large language that evolved from Unix shell. It has shell-like
commands, Python-like expressions on typed data, and Ruby-like command blocks.

Even though it's large, you can "forget" the bad parts of shell like `[ $x -lt
$y ]`.

These concepts are central to YSH:

1. Interleaved *word*, *command*, and *expression* languages.
2. A standard library of *shell builtins*, as well as *builtin functions*
3. Languages for *data*: J8 Notation, including JSON8 and TSV8
4. A *runtime* shared by OSH and YSH

## Related Docs

- [YSH vs. Shell Idioms](idioms.html) - YSH side-by-side with shell.
- [YSH Language Influences](language-influences.html) - In addition to shell,
  Python, and JavaScript, YSH is influenced by Ruby, Perl, Awk, PHP, and more.
- [A Feel For YSH Syntax](syntax-feelings.html) - Some thoughts that may help
  you remember the syntax.
- [YSH Language Warts](warts.html) documents syntax that may be surprising.

## Appendix: Features Not Shown

### Advanced

These shell features are part of YSH, but aren't shown for brevity.

- The `fork` and `forkwait` builtins, for concurrent execution and subshells.
- Process Substitution: `diff <(sort left.txt) <(sort right.txt)`

### Deprecated Shell Constructs

The shared interpreter supports many shell constructs that are deprecated:

- YSH code uses shell's `||` and `&&` in limited circumstances, since `errexit`
  is on by default.
- Assignment builtins like `local` and `declare`. Use YSH keywords.
- Boolean expressions like `[[ x =~ $pat ]]`. Use YSH expressions.
- Shell arithmetic like `$(( x + 1 ))` and `(( y = x ))`. Use YSH expressions.
- The `until` loop can always be replaced with a `while` loop.
- Most of what's in `${}` can be written in other ways. For example,
  `${s#/tmp}` could be `s => removePrefix('/tmp')` (TODO).

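Here is a small sketch of what some of these replacements might look like in
practice. The variable names are only for illustration, and the shell
equivalents in the comments are approximate.

```
var x = 41                 # instead of: local x=41
var y = x + 1              # instead of: y=$(( x + 1 ))

if (x < y) {               # instead of: [ $x -lt $y ]
  echo "$x is less than $y"
}

var pat = / d+ /           # an eggex, instead of a regex in a string
if ('42' ~ pat) {          # instead of: [[ 42 =~ $pat ]]
  echo 'looks like a number'
}

var i = 0
while (i < 3) {            # instead of: until [ $i -ge 3 ]
  setvar i += 1
}
```
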
### Not Yet Implemented

This document mentions a few constructs that aren't yet implemented. Here's a
summary:

```none
# Unimplemented syntax:

echo ${x|html}               # formatters

echo ${x %.2f}               # statically-parsed printf

var x = "<p>$x</p>"html
echo "<p>$x</p>"html         # tagged string

var x = 15 Mi                # units suffix
```

<!--
- To implement: Capers: stateless coprocesses
-->

## Appendix: Example of an YSH Module

YSH can be used to write simple "shell scripts" or longer programs. It has
*procs* and *modules* to help with the latter.

A module is just a file, like this:

```
#!/usr/bin/env ysh
### Deploy script

use $_this_dir/lib/util.ysh --pick log

const DEST = '/tmp/ysh-tour'

proc my-sync(...files) {
  ### Sync files and show which ones

  cp --verbose @files $DEST
}

proc main {
  mkdir -p $DEST

  touch {foo,bar}.py {build,test}.sh

  log "Copying source files"
  my-sync *.py *.sh

  if test --dir /tmp/logs {
    cd /tmp/logs

    log "Copying logs"
    my-sync *.log
  }
}

if is-main {                  # The only top-level statement
  main @ARGV
}
```

<!--
TODO:
- Also show flags parsing?
- Show longer examples where it isn't boilerplate
-->

You wouldn't bother with the boilerplate for something this small. But this
example illustrates the basic idea: the top level often contains these words:
`use`, `const`, `proc`, and `func`.
