OILS / doc / ysh-tour.md

1402 lines, 970 significant
1---
2default_highlighter: oils-sh
3---
4
5A Tour of YSH
6=============
7
8<!-- author's note about example names
9
10- people: alice, bob
11- nouns: ale, bean
12 - peanut, coconut
13- 42 for integers
14-->
15
16This doc describes the [YSH]($xref) language from a **clean slate**
17perspective. We don't assume you know Unix shell, or the compatible
18[OSH]($xref). But shell users will see the similarity, with simplifications
19and upgrades.
20
21Remember, YSH is for Python and JavaScript users who avoid shell! See the
22[project FAQ][FAQ] for more color on that.
23
24[FAQ]: https://www.oilshell.org/blog/2021/01/why-a-new-shell.html
25[path dependence]: https://en.wikipedia.org/wiki/Path_dependence
26
27This document is **long** because it demonstrates nearly every feature of the
28language. You may want to read it in multiple sittings, or read [The Simplest
29Explanation of
30Oil](https://www.oilshell.org/blog/2020/01/simplest-explanation.html) first.
31(Until 2023, YSH was called the "Oil language".)
32
33
34Here's a summary of what follows:
35
361. YSH has interleaved *word*, *command*, and *expression* languages.
37 - The command language has Ruby-like *blocks*, and the expression language
38 has Python-like *data types*.
392. YSH has both builtin *commands* like `cd /tmp`, and builtin *functions* like
40 `join()`.
413. Languages for *data*, like [JSON][], are complementary to YSH code.
424. OSH and YSH share both an *interpreter data model* and a *process model*
43 (provided by the Unix kernel). Understanding these common models will make
44 you both a better shell user and YSH user.
45
46Keep these points in mind as you read the details below.
47
48[JSON]: https://json.org
49
50<div id="toc">
51</div>
52
53## Preliminaries
54
55Start YSH just like you start bash or Python:
56
57<!-- oils-sh below skips code block extraction, since it doesn't run -->
58
59```sh-prompt
60bash$ ysh # assuming it's installed
61
62ysh$ echo 'hello world' # command typed into YSH
63hello world
64```
65
66In the sections below, we'll save space by showing output **in comments**, with
67`=>`:
68
69 echo 'hello world' # => hello world
70
71Multi-line output is shown like this:
72
73 echo one
74 echo two
75 # =>
76 # one
77 # two
78
79## Examples
80
81### Hello World Script
82
83You can also type commands into a file like `hello.ysh`. This is a complete
84YSH program, which is identical to a shell program:
85
86 echo 'hello world' # => hello world
87
88### A Taste of YSH
89
90Unlike shell, YSH has `var` and `const` keywords:
91
92 const name = 'world' # const is rarer, used at the top level
93 echo "hello $name" # => hello world
94
95They take rich Python-like expressions on the right:
96
97 var x = 0 # an integer, not a string
98 setvar x = x * 2 + 1 # mutate with the 'setvar' keyword
99
100 setvar x += 5 # Increment by 5
101 echo $x # => 6
102
103 var mylist = [x, 7] # two integers [6, 7]
104
105Expressions are often surrounded by `()`:
106
107 if (x > 0) {
108 echo 'positive'
109 } # => positive
110
111 for i, item in (mylist) { # 'mylist' is a variable, not a string
112 echo "[$i] item $item"
113 }
114 # =>
115 # [0] item 6
116 # [1] item 7
117
118YSH has Ruby-like blocks:
119
120 cd /tmp {
121 echo hi > greeting.txt # file created inside /tmp
122 echo $PWD # => /tmp
123 }
124 echo $PWD # prints the original directory
125
126And utilities to read and write JSON:
127
128 var person = {name: 'bob', age: 42}
129 json write (person)
130 # =>
131 # {
132 # "name": "bob",
133 # "age": 42
134 # }
135
136 echo '["str", 42]' | json read # sets '_reply' variable by default
137
138The `=` keyword evaluates and prints an expression:
139
140 = _reply
141 # => (List) ["str", 42]
142
143(Think of it like `var x = _reply`, without the `var`.)
144
145## Word Language: Expressions for Strings (and Arrays)
146
147Let's describe the word language first, and then talk about commands and
148expressions. Words are a rich language because **strings** are a central
149concept in shell.
150
151### Three Kinds of String Literals
152
153You can choose the quoting style that's most convenient to write a given
154string.
155
156#### Double-Quoted, Single-Quoted, and J8 strings (like JSON)
157
158Double-quoted strings allow **interpolation with `$`**:
159
160 var person = 'alice'
161 echo "hi $person, $(echo bye)" # => hi alice, bye
162
163Write operators by escaping them with `\`:
164
165 echo "\$ \" \\ " # => $ " \
166
167In single-quoted strings, all characters are **literal** (except `'`, which
168can't be expressed):
169
170 echo 'c:\Program Files\' # => c:\Program Files\
171
172If you want C-style backslash **character escapes**, use a J8 string, which is
173like JSON, but with single quotes:
174
175 echo u' A is \u{41} \n line two, with backslash \\'
176 # =>
177 # A is A
178 # line two, with backslash \
179
180The `u''` strings are guaranteed to be valid Unicode (unlike JSON), but you can
181also use `b''` strings:
182
183 echo b'byte \yff' # byte that's not valid unicode, like \xff in other languages
184 # do not confuse with \u{ff}
185
186#### Multi-line Strings
187
188Multi-line strings are surrounded with triple quotes. They come in the same
189three varieties, and leading whitespace is stripped in a convenient way.
190
191 sort <<< """
192 var sub: $x
193 command sub: $(echo hi)
194 expression sub: $[x + 3]
195 """
196 # =>
197 # command sub: hi
198 # expression sub: 9
199 # var sub: 6
200
201 sort <<< '''
202 $2.00 # literal $, no interpolation
203 $1.99
204 '''
205 # =>
206 # $1.99
207 # $2.00
208
209 sort <<< u'''
210 C\tD
211 A\tB
212 ''' # b''' strings also supported
213 # =>
214 # A B
215 # C D
216
217(Use multiline strings instead of shell's [here docs]($xref:here-doc).)
218
219### Three Kinds of Substitution
220
221YSH has syntax for 3 types of substitution, all of which start with `$`. These
222things can all be converted to a **string**:
223
2241. Variables
2252. The output of commands
2263. The value of expressions
227
228#### Variable Sub
229
230The syntax `$a` or `${a}` converts a variable to a string:
231
232 var a = 'ale'
233 echo $a # => ale
234 echo _${a}_ # => _ale_
235 echo "_ $a _" # => _ ale _
236
237The shell operator `:-` is occasionally useful in YSH:
238
239 echo ${not_defined:-'default'} # => default
240
241#### Command Sub
242
243The `$(echo hi)` syntax runs a command and captures its `stdout`:
244
245 echo $(hostname) # => example.com
246 echo "_ $(hostname) _" # => _ example.com _
247
248#### Expression Sub
249
250The `$[myexpr]` syntax evaluates an expression and converts it to a string:
251
252 echo $[a] # => ale
253 echo $[1 + 2 * 3] # => 7
254 echo "_ $[1 + 2 * 3] _" # => _ 7 _
255
256<!-- TODO: safe substitution with "$[a]"html -->
257
258### Arrays of Strings: Globs, Brace Expansion, Splicing, and Splitting
259
260There are four constructs that evaluate to a **list of strings**, rather than
261a single string.
262
263#### Globs
264
265Globs like `*.py` evaluate to a list of files.
266
267 touch foo.py bar.py # create the files
268 write *.py
269 # =>
270 # foo.py
271 # bar.py
272
273If no files match, it evaluates to an empty list (`[]`).
274
275#### Brace Expansion
276
277The brace expansion mini-language lets you write strings without duplication:
278
279 write {alice,bob}@example.com
280 # =>
281 # alice@example.com
282 # bob@example.com
283
284#### Splicing
285
286The `@` operator splices an array into a command:
287
288 var myarray = :| ale bean |
289 write S @myarray E
290 # =>
291 # S
292 # ale
293 # bean
294 # E
295
296You also have `@[]` to splice an expression that evaluates to a list:
297
298 write -- @[split('ale bean')]
299 # =>
300 # ale
301 # bean
302
303Each item will be converted to a string.
304
305#### Split Command Sub / Split Builtin Sub
306
307There's also a variant of *command sub* that splits first:
308
309 write @(seq 3) # write gets 3 arguments
310 # =>
311 # 1
312 # 2
313 # 3
314
315<!-- TODO: This should decode J8 notation, which includes "" j"" and b"" -->
316
317## Command Language: I/O, Control Flow, Abstraction
318
319### Simple Commands and Redirects
320
321A simple command is a space-separated list of words, which are often unquoted.
322YSH looks up the first word to determine if it's a `proc` or shell builtin.
323
324 echo 'hello world' # The shell builtin 'echo'
325
326 proc greet (name) { # A proc is like a procedure or process
327 echo "hello $name"
328 }
329
330 # Now the first word will resolve to the proc
331 greet alice # => hello alice
332
333If it's neither, then it's assumed to be an external command:
334
335 ls -l /tmp # The external 'ls' command
336
337Commands accept traditional string arguments, as well as typed arguments in
338parentheses:
339
340 # 'write' is a string arg; 'x' is a typed expression arg
341 json write (x)
342
343You can **redirect** `stdin` and `stdout` of simple commands:
344
345 echo hi > tmp.txt # write to a file
346 sort < tmp.txt
347
348Idioms for using stderr (identical to shell):
349
350 ls /tmp 2>errors.txt
351 echo 'fatal error' 1>&2
352
353"Simple" commands in YSH can also have typed `()` and block `{}` args, which
354we'll see in the section on "procs".
355
356### Pipelines
357
358Pipelines are a powerful method of manipulating data streams:
359
360 ls | wc -l # count files in this directory
361 find /bin -type f | xargs wc -l # count lines in each file of a subtree
362
363The stream may contain (lines of) text, binary data, JSON, TSV, and more.
364Details below.
365
366### Multi-line Commands
367
368The YSH `...` prefix lets you write long commands, pipelines, and `&&` chains
369without `\` line continuations.
370
371 ... find /bin # traverse this directory and
372 -type f -a -executable # print executable files
373 | sort -r # reverse sort
374 | head -n 30 # limit to 30 files
375 ;
376
377When this mode is active:
378
379- A single newline behaves like a space
380- A blank line (two newlines in a row) is illegal, but a line that has only a
381 comment is allowed. This prevents confusion if you forget the `;`
382 terminator.
383
384### `var`, `setvar`, `const` to Declare and Mutate
385
386Constants can't be modified:
387
388 const myconst = 'mystr'
389 # setvar myconst = 'foo' would be an error
390
391Modify variables with the `setvar` keyword:
392
393 var num_beans = 12
394 setvar num_beans = 13
395
396A more complex example:
397
398 var d = {name: 'bob', age: 42} # dict literal
399 setvar d.name = 'alice' # d.name is a synonym for d['name']
400 echo $[d.name] # => alice
401
402That's most of what you need to know about assignments. Advanced users may
403want to use `setglobal` or `call myplace->setValue(42)` in certain situations.
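
Here is a minimal sketch of those two advanced forms. The names `counter`, `increment`, `set-answer`, and `answer` are made up for illustration, and the behavior is assumed to match the YSH version this tour describes. `setglobal` mutates a global from inside a proc, and `setValue()` writes through a place created with `&`:

```ysh
var counter = 0

proc increment {
  setglobal counter += 1     # mutate the global, not a local
}

increment
increment
echo $counter                # => 2

proc set-answer (; out) {    # 'out' is a typed positional param
  call out->setValue(42)     # write through the place
}

var answer                   # implicitly initialized to null
set-answer (&answer)         # &answer creates a place that refers to 'answer'
echo $answer                 # => 42
```

Passing a place keeps the mutation visible at the call site: the reader of `set-answer (&answer)` can see which variable may change.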
404
405<!--
406 var g = 1
407 var h = 2
408 proc demo(:out) {
409 setglobal g = 42
410 setref out = 43
411 }
412 demo :h # pass a reference to h
413 echo "$g $h" # => 42 43
414-->
415
416More details: [Variable Declaration and Mutation](variables.html).
417
418### `for` Loop
419
420Shell-style for loops iterate over **words**:
421
422 for word in 'oils' $num_beans {pea,coco}nut {
423 echo $word
424 }
425 # =>
426 # oils
427 # 13
428 # peanut
429 # coconut
430
431You can also request the loop index:
432
433 for i, word in README.md *.py {
434 echo "$i - $word"
435 }
436 # =>
437 # 0 - README.md
438 # 1 - __init__.py
439
440To iterate over lines of `stdin`, use:
441
442 for line in (io.stdin) {
443 echo $line
444 }
445 # lines are buffered, so it's much faster than `while read --rawline`
446
447Ask for the loop index:
448
449 for i, line in (io.stdin) {
450 echo "$i $line"
451 }
452
453To iterate over typed data, use parentheses around an **expression**. The
454expression should evaluate to an integer `Range`, `List`, or `Dict`:
455
456 for i in (3 .. 5) { # range operator ..
457 echo "i = $i"
458 }
459 # =>
460 # i = 3
461 # i = 4
462
463List:
464
465 var foods = ['ale', 'bean']
466 for item in (foods) {
467 echo $item
468 }
469 # =>
470 # ale
471 # bean
472
473Again, you can request the index with `for i, item in ...`.
474
475Here's the most general form of the loop over `Dict`:
476
477 var mydict = {pea: 42, nut: 10}
478 for i, k, v in (mydict) {
479 echo "$i - $k - $v"
480 }
481 # =>
482 # 0 - pea - 42
483 # 1 - nut - 10
484
485There are two simpler forms:
486
487- One variable gives you the key: `for k in (mydict)`
488- Two variables gives you the key and value: `for k, v in (mydict)`
489
490(One way to think of it: `for` loops in YSH have the functionality of Python's
491`enumerate()`, `items()`, `keys()`, and `values()`.)
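
The two simpler `Dict` forms can be sketched like this. The dict `prices` and the loop variable names are hypothetical, chosen only for this example:

```ysh
var prices = {ale: 5, bean: 3}

for name in (prices) {           # one variable: keys only
  echo $name
}
# =>
# ale
# bean

for name, price in (prices) {    # two variables: key and value
  echo "$name costs $price"
}
# =>
# ale costs 5
# bean costs 3
```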
492
493<!--
494TODO: Str loop should give you the (UTF-8 offset, rune)
495Or maybe just UTF-8 offset? Decoding errors could be exceptions, or Unicode
496replacement.
497-->
498
499### `while` Loop
500
501While loops can use a **command** as the termination condition:
502
503 while test --file lock {
504 sleep 1
505 }
506
507Or an **expression**, which is surrounded in `()`:
508
509 var i = 3
510 while (i < 6) {
511 echo "i = $i"
512 setvar i += 1
513 }
514 # =>
515 # i = 3
516 # i = 4
517 # i = 5
518
519### `if elif` Conditional
520
521If statements test the exit code of a command, and have optional `elif` and
522`else` clauses:
523
524 if test --file foo {
525 echo 'foo is a file'
526 rm --verbose foo # delete it
527 } elif test --dir foo {
528 echo 'foo is a directory'
529 } else {
530 echo 'neither'
531 }
532
533Invert the exit code with `!`:
534
535 if ! grep alice /etc/passwd {
536 echo 'alice is not a user'
537 }
538
539As with `while` loops, the condition can also be an **expression** wrapped in
540`()`:
541
542 if (num_beans > 0) {
543 echo 'so many beans'
544 }
545
546 var done = false
547 if (not done) { # negate with 'not' operator (contrast with !)
548 echo "we aren't done"
549 }
550
551### `case` Conditional
552
553The case statement is a series of conditionals and executable blocks. The
554condition can be either an unquoted glob pattern like `*.py`, an eggex pattern
555like `/d+/`, or a typed expression like `(42)`:
556
557 var s = 'README.md'
558 case (s) {
559 *.py { echo 'Python' }
560 *.cc | *.h { echo 'C++' }
561 * { echo 'Other' }
562 }
563 # => Other
564
565 case (s) {
566 / dot* '.md' / { echo 'Markdown' }
567 (30 + 12) { echo 'the integer 42' }
568 (else) { echo 'neither' }
569 }
570 # => Markdown
571
572<!-- TODO: document case on typed data -->
573
574(Shell style like `if foo; then ... fi` and `case $x in ... esac` is also legal,
575but discouraged in YSH code.)
576
577### Error Handling
578
579If statements are also used for **error handling**. Builtins and external
580commands use this style:
581
582 if ! test -d /bin {
583 echo 'not a directory'
584 }
585
586 if ! cp foo /tmp {
587 echo 'error copying' # any non-zero status
588 }
589
590Procs use this style (because of shell's *disabled `errexit` quirk*):
591
592 try {
593 myproc
594 }
595 if failed {
596 echo 'failed'
597 }
598
599For a complete list of examples, see [YSH Error
600Handling](ysh-error.html). For design goals and a reference, see [YSH
601Fixes Shell's Error Handling](error-handling.html).
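
As a rough sketch of the `try` / `failed` style applied to a proc: `check-config` and the path below are hypothetical, and `_error.code` is not introduced in this tour, so treat it as an assumption taken from the error handling docs linked above.

```ysh
proc check-config (path) {    # hypothetical proc
  test -f $path               # a failing command makes the whole proc fail
  echo "found $path"
}

try {
  check-config /no/such/file
}
if failed {
  echo "check-config failed, status $[_error.code]"
}
# => check-config failed, status 1
```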
602
603#### `break`, `continue`, `return`, `exit`
604
605The `exit` **keyword** exits a process (it's not a shell builtin). The other 3
606control flow keywords behave like they do in Python and JavaScript.
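
A small sketch of `continue` and `break` inside a `for` loop, using the same `..` range syntax as the loops above. The list name `kept` is made up for this example:

```ysh
var kept = []
for i in (0 .. 6) {
  if (i === 1) {
    continue               # skip the rest of this iteration
  }
  if (i === 4) {
    break                  # leave the loop entirely
  }
  call kept->append(i)
}

write -- @[kept]
# =>
# 0
# 2
# 3
```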
607
608### Ruby-like Blocks
609
610Here's a builtin command that takes a literal block argument:
611
612 shopt --unset errexit { # ignore errors
613 cp ale /tmp
614 cp bean /bin
615 }
616
617Blocks are a special kind of typed argument passed to commands like `shopt`.
618Their type is `value.Command`.
619
620### Shell-like `proc`
621
622You can define units of code with the `proc` keyword.
623
624 proc mycopy (src, dest) {
625 ### Copy verbosely
626
627 mkdir -p $dest
628 cp --verbose $src $dest
629 }
630
631The `###` line is a "doc comment", and can be retrieved with `pp proc`. Simple
632procs like this are invoked like a shell command:
633
634 touch log.txt
635 mycopy log.txt /tmp # first word 'mycopy' is a proc
636
637Procs have more features, including **four** kinds of arguments:
638
6391. Word args (which are always strings)
6401. Typed, positional args (aka positional args)
6411. Typed, named args (aka named args)
6421. A final block argument, which may be written with `{ }`.
643
644At the call site, they can look like any of these forms:
645
646 cd /tmp # word arg
647
648 json write (d) # word arg, then positional arg
649
650 # error 'failed' (status=9) # word arg, then named arg
651
652 cd /tmp { echo $PWD } # word arg, then block arg
653
654 var mycmd = ^(echo hi) # expression for a value.Command
655 eval (mycmd) # positional arg
656
657<!-- TODO: lazy arg list: ls8 | where [age > 10] -->
658
659At the definition site, the kinds of parameters are separated with `;`, similar
660to the Julia language:
661
662 proc p2 (word1, word2; pos1, pos2, ...rest_pos) {
663 echo "$word1 $word2 $[pos1 + pos2]"
664 json write (rest_pos)
665 }
666
667 proc p3 (w ; ; named1, named2, ...rest_named; block) {
668 echo "$w $[named1 + named2]"
669 eval (block)
670 json write (rest_named)
671 }
672
673 proc p4 (; ; ; block) {
674 eval (block)
675 }
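
To connect definitions like these to call sites, here is a hedged sketch with two made-up procs, `greet-all` and `twice`. It assumes the `eval (block)` form used above is available in your YSH version:

```ysh
proc greet-all (greeting; ...names) {   # one word param, then the rest of the typed positional args
  for name in (names) {
    echo "$greeting $name"
  }
}

greet-all hello ('alice', 'bob')        # word arg, then two positional args
# =>
# hello alice
# hello bob

proc twice (; ; ; block) {              # only a block param
  eval (block)
  eval (block)
}

twice { echo hi }                       # block arg written with { }
# =>
# hi
# hi
```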
676
677YSH also has Python-like functions defined with `func`. These are part of the
678expression language, which we'll see later.
679
680For more info, see the [Informal Guide to Procs and Funcs](proc-func.html)
681(under construction).
682
683#### Builtin Commands
684
685**Shell builtins** like `cd` and `read` are the "standard library" of the
686command language. Each one takes various flags:
687
688 cd -L . # follow symlinks
689
690 echo foo | read --all # read all of stdin
691
692Here are some categories of builtin:
693
694- I/O: `echo write read`
695- File system: `cd test`
696- Processes: `fork wait forkwait exec`
697- Interpreter settings: `shopt shvar`
698- Meta: `command builtin runproc type eval`
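
As an illustration of the process builtins, here is a sketch under the assumption that `forkwait` and `fork` each take a block, as in other Oils docs. The variable `start_dir` is hypothetical:

```ysh
var start_dir = $PWD

forkwait {                   # run the block in a child process, and wait for it
  cd /tmp
  echo "child is in $PWD"    # => child is in /tmp
}
echo "parent is still in $PWD"   # the parent's directory didn't change

fork { sleep 0.1 }           # run in the background, like 'sleep 0.1 &' in shell
wait                         # block until background jobs finish
```

Because the `cd` happens in a child process, it can't affect the parent, which is the same isolation a shell subshell gives you.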
699
700<!-- TODO: Link to a comprehensive list of builtins -->
701
702## Expression Language: Python-like Types
703
704YSH expressions look and behave more like Python or JavaScript than shell. For
705example, we write `if (x < y)` instead of `if [ $x -lt $y ]`. Expressions are
706usually surrounded by `( )`.
707
708At runtime, variables like `x` and `y` are bound to **typed data**, like
709integers, floats, strings, lists, and dicts.
710
711<!--
712[Command vs. Expression Mode](command-vs-expression-mode.html) may help you
713understand how YSH is parsed.
714-->
715
716### Python-like `func`
717
718At the end of the *Command Language*, we saw that procs are shell-like units of
719code. Now let's talk about Python-like **functions** in YSH, which are
720different than `procs`:
721
722- They're defined with the `func` keyword.
723- They're called in expressions, not in commands.
724- They're **pure**, and live in the **interior** of a process.
725 - In contrast, procs usually perform I/O, and have **exterior** boundaries.
726
727Here's a function that mutates its argument:
728
729 func popTwice(mylist) {
730 call mylist->pop()
731 call mylist->pop()
732 }
733
734 var mylist = [3, 4]
735
736 # The call keyword is an "adapter" between commands and expressions,
737 # like the = keyword.
738 call popTwice(mylist)
739
740Here's a pure function:
741
742 func myRepeat(s, n; special=false) { # positional; named params
743 var parts = []
744 for i in (0 .. n) {
745 append $s (parts)
746 }
747 var result = join(parts)
748
749 if (special) {
750 return ("$result !!") # parens required for typed return
751 } else {
752 return (result)
753 }
754 }
755
756 echo $[myRepeat('z', 3)] # => zzz
757
758 echo $[myRepeat('z', 3, special=true)] # => zzz !!
759
760Funcs are named using `camelCase`, while procs use `kebab-case`. See the
761[Style Guide](style-guide.html) for more conventions.
762
763#### Builtin Functions
764
765In addition to builtin commands, YSH has Python-like builtin **functions**.
766These are like the "standard library" for the expression language. Examples:
767
768- Functions that take multiple types: `len() type()`
769- Conversions: `bool() int() float() str() list() ...`
770- Explicit word evaluation: `split() join() glob() maybe()`
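
A few of these builtin functions in action, as a minimal sketch. The variable names `word_list` and `num` are made up:

```ysh
var word_list = split('ale bean corn')   # Str -> List of Str
echo $[len(word_list)]                   # => 3
echo $[join(word_list, '-')]             # => ale-bean-corn

var num = int('42')                      # Str -> Int
echo $[num + 1]                          # => 43
echo $[len(str(num))]                    # => 2
```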
771
772<!-- TODO: Make a comprehensive list of func builtins. -->
773
774
775### Data Types: `Int`, `Str`, `List`, `Dict`, `Obj`, ...
776
777YSH has data types, each with an expression syntax and associated methods.
778
779### Methods
780
781YSH adds mutable data structures to shell, so we have a special syntax for
782mutating methods. They are looked up with a thin arrow `->`:
783
784 var foods = ['ale', 'bean']
785 var last = foods->pop() # bean
786 write @foods # => ale
787
788You can ignore the return value with the `call` keyword:
789
790 call foods->pop()
791
792Regular methods are looked up with the `.` operator:
793
794 var line = ' ale bean '
795 var caps = line.trim().upper() # 'ALE BEAN'
796
797You can also use the "chaining" style, with a fat arrow `=>`:
798
799 var trimmed = line => trim() => upper() # 'ALE BEAN'
800
801The `=>` operator lets you mix methods and free functions. If it doesn't find
802a method with the given name, it looks for a `Func`:
803
804 # list() is a free function taking one arg
805 # join() is a free function taking two args
806 var x = {k1: 42, k2: 43} => list() => join('/') # 'k1/k2'
807
808This allows a left-to-right "method chaining" style.
809
810---
811
812Now let's go through the data types in YSH. We'll show the syntax for
813literals, and what **methods** they have.
814
815#### Null and Bool
816
817YSH uses JavaScript-like spellings for these three "atoms":
818
819 var x = null
820
821 var b1, b2 = true, false
822
823 if (b1) {
824 echo 'yes'
825 } # => yes
826
827
828#### Int
829
830There are many ways to write integers:
831
832 var small, big = 42, 65_536
833 echo "$small $big" # => 42 65536
834
835 var hex, octal, binary = 0x0001_0000, 0o755, 0b0001_0101
836 echo "$hex $octal $binary" # => 65536 493 21
837
838<!--
839"Runes" are integers that represent Unicode code points. They're not common in
840YSH code, but can make certain string algorithms more readable.
841
842 # Pound rune literals are similar to ord('A')
843 const a = #'A'
844
845 # Backslash rune literals can appear outside of quotes
846 const newline = \n # Remember this is an integer
847 const backslash = \\ # ditto
848
849 # Unicode rune literal is syntactic sugar for 0x3bc
850 const mu = \u{3bc}
851
852 echo "chars $a $newline $backslash $mu" # => chars 65 10 92 956
853-->
854
855#### Float
856
857Floats are written like you'd expect:
858
859 var small = 1.5e-10
860 var big = 3.14
861
862#### Str
863
864See the section above called *Three Kinds of String Literals*. It described
865`'single quoted'`, `"double ${quoted}"`, and `u'J8-style\n'` strings; as well
866as their multiline variants.
867
868Strings are UTF-8 encoded in memory, like strings in the [Go
869language](https://golang.org). There isn't a separate string and unicode type,
870as in Python.
871
872Strings are **immutable**, as in Python and JavaScript. This means they only
873have **transforming** methods:
874
875 var x = s => trim()
876
877Other methods:
878
879- `trimLeft() trimRight()`
880- `trimPrefix() trimSuffix()`
881- `upper() lower()` (not implemented)
882
883<!--
884The syntax `:symbol` could be an interned string.
885-->
886
887#### List (and Arrays)
888
889All lists can be expressed with Python-like literals:
890
891 var foods = ['ale', 'bean', 'corn']
892 var recursive = [1, [2, 3]]
893
894As a special case, lists of strings are called **arrays**. It's often more
895convenient to write them with shell-like literals:
896
897 # No quotes or commas
898 var foods = :| ale bean corn |
899
900 # You can use the word language here
901 var other = :| foo $s *.py {alice,bob}@example.com |
902
903Lists are **mutable**, as in Python and JavaScript. So they mainly have
904mutating methods:
905
906 call foods->reverse()
907 write -- @foods
908 # =>
909 # corn
910 # bean
911 # ale
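
A few more mutating methods, as a sketch. The names `nums` and `tail` are hypothetical, and `append()` and `extend()` are assumed to behave like their Python counterparts:

```ysh
var nums = [1, 2]
call nums->append(3)         # nums is now [1, 2, 3]
call nums->extend([4, 5])    # nums is now [1, 2, 3, 4, 5]

var tail = nums->pop()       # removes and returns the last item
echo "$tail $[len(nums)]"    # => 5 4
echo $[nums[0]]              # => 1
```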
912
913#### Dict
914
915Dicts use syntax that's more like JavaScript than Python. Here's a dict
916literal:
917
918 var d = {
919 name: 'bob', # unquoted keys are allowed
920 age: 42,
921 'key with spaces': 'val'
922 }
923
924There are two syntaxes for key lookup. If the key doesn't exist, it's a fatal
925error.
926
927 var v1 = d['name']
928 var v2 = d.name # shorthand for the above
929 var v3 = d['key with spaces'] # no shorthand for this
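
Since a missing key is fatal, you may want to check first or supply a default. This is a hedged sketch: `person` is a made-up dict, and `get()` is assumed to be reachable through `=>`, either as a method or as a free function.

```ysh
var person = {name: 'bob'}

if ('age' in person) {
  echo "age is $[person.age]"
} else {
  echo 'no age recorded'
}
# => no age recorded

echo $[person => get('age', 0)]    # => 0
```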
930
931Key names can be computed with expressions in `[]`:
932
933 var key = 'alice'
934 var d2 = {[key ++ '_z']: 'ZZZ'} # Computed key name
935 echo $[d2.alice_z] # => ZZZ # Reminder: expression sub
936
937Omitting the value causes it to be taken from a variable of the same name:
938
939 var d3 = {key} # value is taken from the environment
940 echo "name is $[d3.key]" # => name is alice
941
942More:
943
944 var empty = {}
945 echo $[len(empty)] # => 0
946
947Dicts are **mutable**, as in Python and JavaScript. But the `keys()` and `values()`
948methods return new `List` objects:
949
950 var keys = d2 => keys() # => alice_z
951 # var vals = d3 => values() # => alice
952
953### `Place` type / "out params"
954
955The `read` builtin can either set an implicit variable `_reply`:
956
957 whoami | read --all # sets _reply
958
959Or you can pass a `value.Place`, created with `&`:
960
961 var x # implicitly initialized to null
962 whoami | read --all (&x) # mutate this "place"
963 echo who=$x # => who=andy
964
965#### Quotation Types: value.Command (Block) and value.Expr
966
967These types are for reflection on YSH code. Most YSH programs won't use them
968directly.
969
970- `Command`: an unevaluated code block.
971 - rarely-used literal: `^(ls | wc -l)`
972- `Expr`: an unevaluated expression.
973 - rarely-used literal: `^[42 + a[i]]`
974
975<!-- TODO: implement Block, Expr, ArgList types (variants of value) -->
976
977### Operators
978
979Operators are generally the same as in Python:
980
981 if (10 <= num_beans and num_beans < 20) {
982 echo 'enough'
983 } # => enough
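
Arithmetic follows Python 3 as well. This sketch assumes that `/` always yields a float while `//` is integer division, and that the Python-style ternary is available:

```ysh
echo $[7 // 2]      # => 3     (integer division)
echo $[7 % 2]       # => 1     (remainder)
echo $[2 ** 10]     # => 1024  (exponentiation)
echo $[7 / 2]       # => 3.5   ('/' yields a Float)

echo $['even' if (10 % 2 === 0) else 'odd']   # => even
```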
984
985YSH has a few operators that aren't in Python. Equality can be approximate or
986exact:
987
988 var n = ' 42 '
989 if (n ~== 42) {
990 echo 'equal after stripping whitespace and type conversion'
991 } # => equal after stripping whitespace and type conversion
992
993 if (n === 42) {
994 echo "not reached because strings and ints aren't equal"
995 }
996
997<!-- TODO: is n === 42 a type error? -->
998
999Pattern matching can be done with globs (`~~` and `!~~`)
1000
1001 const filename = 'foo.py'
1002 if (filename ~~ '*.py') {
1003 echo 'Python'
1004 } # => Python
1005
1006 if (filename !~~ '*.sh') {
1007 echo 'not shell'
1008 } # => not shell
1009
1010or regular expressions (`~` and `!~`). See the Eggex section below for an
1011example of the latter.
1012
1013Concatenation is `++` rather than `+` because it avoids confusion in the
1014presence of type conversion:
1015
1016 var n = '42' + 1 # string plus int does implicit conversion
1017 echo $n # => 43
1018
1019 var y = 'ale ' ++ "bean $n" # concatenation
1020 echo $y # => ale bean 43
1021
1022<!--
1023TODO: change example above
1024 var n = '42' + 1 # string plus int does implicit conversion
1025-->
1026
1027<!--
1028
1029#### Summary of Operators
1030
1031- Arithmetic: `+ - * / // %` and `**` for exponentatiation
1032 - `/` always yields a float, and `//` is integer division
1033- Bitwise: `& | ^ ~`
1034- Logical: `and or not`
1035- Comparison: `== < > <= >= in 'not in'`
1036 - Approximate equality: `~==`
1037 - Eggex and glob match: `~ !~ ~~ !~~`
1038- Ternary: `1 if x else 0`
1039- Index and slice: `mylist[3]` and `mylist[1:3]`
1040 - `mydict->key` is a shortcut for `mydict['key']`
1041- Function calls
1042 - free: `f(x, y)`
1043 - transformations and chaining: `s => startWith('prefix')`
1044 - mutating methods: `mylist->pop()`
1045- String and List: `++` for concatenation
1046 - This is a separate operator because the addition operator `+` does
1047 string-to-int conversion
1048
1049TODO: What about list comprehensions?
1050-->
1051
1052### Egg Expressions (YSH Regexes)
1053
1054An *Eggex* is a type of YSH expression that denotes regular expressions. They
1055translate to POSIX ERE syntax, for use with tools like `egrep`, `awk`, and `sed
1056--regexp-extended` (GNU only).
1057
1058They're designed to be readable and composable. Example:
1059
1060 var D = / digit{1,3} /
1061 var ip_pattern = / D '.' D '.' D '.' D /
1062
1063 var z = '192.168.0.1'
1064 if (z ~ ip_pattern) { # Use the ~ operator to match
1065 echo "$z looks like an IP address"
1066 } # => 192.168.0.1 looks like an IP address
1067
1068 if (z !~ / '.255' %end /) {
1069 echo "doesn't end with .255"
1070 } # => doesn't end with .255
1071
1072See the [Egg Expressions doc](eggex.html) for details.
1073
1074## Interlude
1075
1076Let's review what we've seen before moving on to other YSH features.
1077
1078### Three Interleaved Languages
1079
1080Here are the languages we saw in the last 3 sections:
1081
10821. **Words** evaluate to a string, or list of strings. This includes:
1083 - literals like `'mystr'`
1084 - substitutions like `${x}` and `$(hostname)`
1085 - globs like `*.sh`
10862. **Commands** are used for
1087 - I/O: pipelines, builtins like `read`
1088 - control flow: `if`, `for`
1089 - abstraction: `proc`
10903. **Expressions** on typed data are borrowed from Python, with some JavaScript
1091 influence.
1092 - Lists: `['ale', 'bean']` or `:| ale bean |`
1093 - Dicts: `{name: 'bob', age: 42}`
1094 - Functions: `split('ale bean')` and `join(['pea', 'nut'])`
1095
1096### How Do They Work Together?
1097
1098Here are two examples:
1099
1100(1) In this *command*, there are **four** *words*. The fourth word is an
1101*expression sub* `$[]`.
1102
1103 write hello $name $[d['age'] + 1]
1104 # =>
1105 # hello
1106 # world
1107 # 43
1108
1109(2) In this assignment, the *expression* on the right hand side of `=`
1110concatenates two strings. The first string is a literal, and the second is a
1111*command sub*.
1112
1113 var food = 'ale ' ++ $(echo bean | tr a-z A-Z)
1114 write $food # => ale BEAN
1115
1116So words, commands, and expressions are **mutually recursive**. If you're a
1117conceptual person, skimming [Syntactic Concepts](syntactic-concepts.html) may
1118help you understand this on a deeper level.
1119
1120<!--
1121One way to think about these sublanguages is to note that the `|` character
1122means something different in each context:
1123
1124- In the command language, it's the pipeline operator, as in `ls | wc -l`
1125- In the word language, it's only valid in a literal string like `'|'`, `"|"`,
1126 or `\|`. (It's also used in `${x|html}`, which formats a string.)
1127- In the expression language, it's the bitwise OR operator, as in Python and
1128 JavaScript.
1129-->
1130
## Languages for Data (Interchange Formats)

In addition to languages for **code**, YSH also deals with languages for
**data**. [JSON]($xref) is a prominent example of the latter.

<!-- TODO: Link to slogans, fallacies, and concepts -->

### UTF-8

UTF-8 is the foundation of our textual data languages.

<!-- TODO: there's a runes() iterator which gives integer offsets, usable for
slicing -->

<!-- TODO: write about J8 notation -->

### Lines of Text (traditional), and JSON/J8 Strings

Traditional Unix tools like `grep` and `awk` operate on streams of lines. YSH
supports this style, just like any other shell.

But YSH also has [J8 Notation][], a data format based on [JSON][].

[J8 Notation]: j8-notation.html

It lets you encode arbitrary byte strings into a single (readable) line,
including those with newlines and terminal escape sequences.

Example:

    # A line with a tab char in the middle
    var mystr = u'pea\t' ++ u'42\n'

    # Print it as JSON
    write $[toJson(mystr)]  # => "pea\t42\n"

    # JSON8 is the same, but it's not lossy for binary data
    write $[toJson8(mystr)]  # => "pea\t42\n"

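To see why a single-line encoding matters, consider what a line-oriented
consumer does with one value that contains a newline. The sketch below is
plain POSIX shell (which OSH also runs), not YSH, and `count_lines` is a
hypothetical helper made up for this illustration.

```sh
# One value that happens to contain a newline.
value=$(printf 'pea\n42')

# Hypothetical helper: how many lines does a line-based tool see for ONE value?
count_lines() {
  # $(( ... )) strips the padding that some wc implementations add.
  echo $(( $(printf '%s\n' "$1" | wc -l) ))
}

count_lines "$value"   # prints 2, although there is only one value
count_lines 'pea 42'   # prints 1
```

A reader that splits on newlines can't tell the first case apart from two
separate records, which is the gap that an escaped, one-line encoding closes.
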
### Structured: JSON8, TSV8

You can write and read **tree-shaped** data as [JSON][]:

    var d = {key: 'value'}
    json write (d)                  # dump variable d as JSON
    # =>
    # {
    #   "key": "value"
    # }

    echo '["ale", 42]' > example.json

    json read (&d2) < example.json  # parse JSON into var d2
    pp (d2)                         # pretty print it
    # => (List) ['ale', 42]

[JSON][] will lose information when strings have binary data, but the slight
[JSON8]($xref) upgrade won't:

    var b = {binary: $'\xff'}
    json8 write (b)
    # =>
    # {
    #   "binary": b'\yff'
    # }

[JSON]: $xref

**Table-shaped** data can be read and written as [TSV8]($xref). (TODO: not yet
implemented.)

<!-- Figure out the API. Does it work like JSON?

Or I think we just implement
- rows: 'where' or 'filter' (dplyr)
- cols: 'select' conflicts with shell builtin; call it 'cols'?
- sort: 'sort-by' or 'arrange' (dplyr)
- TSV8 <=> sqlite conversion. Are these drivers or what?
  - and then let you pipe output?

Do we also need TSV8 space2tab or something? For writing TSV8 inline.

More later:
- MessagePack (e.g. for shared library extension modules)
  - msgpack read, write? I think user-defined function could be like this?
- SASH: Simple and Strict HTML? For easy processing
-->

## The Runtime Shared by OSH and YSH

Although we describe OSH and YSH as different languages, they use the **same**
interpreter under the hood. This interpreter has various `shopt` flags that
are flipped for different behavior, e.g. with `shopt --set ysh:all`.

Understanding this interpreter and its interface to the Unix kernel will help
you understand **both** languages!

### Interpreter Data Model

The [Interpreter State](interpreter-state.html) doc is **under construction**.
It will cover:

- Two separate namespaces (like Lisp 1 vs. 2):
  - **proc** namespace for procs as the first word
  - **variable** namespace
- The variable namespace has a **call stack**, for the local variables of a
  proc.
  - Each **stack frame** is a `{name -> cell}` mapping.
  - A **cell** has one of the above data types: `Bool`, `Int`, `Str`, etc.
  - A cell has `readonly`, `export`, and `nameref` **flags**.
- Boolean shell options with `shopt`: `parse_paren`, `simple_word_eval`, etc.
- String shell options with `shvar`: `IFS`, `PATH`
- **Registers** that are silently modified by the interpreter
  - `$?` and `_error`
  - `$!` for the last PID
  - `_this_dir`
  - `_reply`

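Two of these points can be observed from plain shell code. The sketch below
is POSIX shell rather than YSH, so a shell function stands in for a proc, and
the helper names `show_namespaces` and `show_registers` are made up for
illustration. It only touches the `$?` and `$!` registers, not the
YSH-specific ones.

```sh
# 1. Separate namespaces: a function and a variable can share a name.
bean() { echo 'bean the function'; }
bean='bean the variable'

show_namespaces() {
  bean            # first word of a command: looked up as a function
  echo "$bean"    # substitution: looked up as a variable
}

# 2. Registers: the interpreter updates $? and $! behind your back.
show_registers() {
  false || echo "status=$?"   # $? holds the status of 'false'

  sleep 0 &                   # start a background job
  pid=$!                      # $! holds its PID
  wait "$pid"
  case $pid in
    *[!0-9]*) echo 'pid is not numeric' ;;
    *)        echo 'pid is numeric' ;;
  esac
}

show_namespaces
show_registers
```
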
### Process Model (the kernel)

The [Process Model](process-model.html) doc is **under construction**. It will
cover:

- Simple Commands, `exec`
- Pipelines. #[shell-the-good-parts](#blog-tag)
- `fork`, `forkwait`
- Command and process substitution.
- Related links:
  - [Tracing execution in Oils](xtrace.html) (xtrace), which divides
    process-based concurrency into **synchronous** and **async** constructs.
  - [Three Comics For Understanding Unix
    Shell](http://www.oilshell.org/blog/2020/04/comics.html) (blog)

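As a rough illustration of the first items in that list, here is a minimal
sketch in POSIX shell (not YSH). The function names `shout` and `exec_demo`
are hypothetical. It shows a command substitution that captures the output of
a two-process pipeline, and `exec` replacing the current process.

```sh
# Command substitution: run a pipeline of two processes and capture its stdout.
shout() {
  upper=$(echo "$1" | tr a-z A-Z)
  echo "$upper"
}

# exec: the subshell process is replaced by 'echo', so the second command
# in the subshell never runs.
exec_demo() {
  ( exec echo 'replaced by echo'; echo 'never reached' )
}

shout bean    # prints BEAN
exec_demo     # prints: replaced by echo
```
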
<!--
Process model additions: Capers, Headless shell

some optimizations: See YSH starts fewer processes than other shells.
-->

## Summary

YSH is a large language that evolved from Unix shell. It has shell-like
commands, Python-like expressions on typed data, and Ruby-like command blocks.

Even though it's large, you can "forget" the bad parts of shell like `[ $x -lt
$y ]`.

These concepts are central to YSH:

1. Interleaved *word*, *command*, and *expression* languages.
2. A standard library of *shell builtins*, as well as *builtin functions*.
3. Languages for *data*: J8 Notation, including JSON8 and TSV8.
4. A *runtime* shared by OSH and YSH.

## Related Docs

- [YSH vs. Shell Idioms](idioms.html) - YSH side-by-side with shell.
- [YSH Language Influences](language-influences.html) - In addition to shell,
  Python, and JavaScript, YSH is influenced by Ruby, Perl, Awk, PHP, and more.
- [A Feel For YSH Syntax](syntax-feelings.html) - Some thoughts that may help
  you remember the syntax.
- [YSH Language Warts](warts.html) - Documents syntax that may be surprising.

## Appendix: Features Not Shown

### Advanced

These shell features are part of YSH, but aren't shown for brevity.

- The `fork` and `forkwait` builtins, for concurrent execution and subshells.
- Process Substitution: `diff <(sort left.txt) <(sort right.txt)`

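For readers who haven't met these concepts, the sketch below shows concurrent
execution and subshells in traditional POSIX shell, using `&` and `( ... )`.
It does not use the `fork` and `forkwait` builtins themselves; treating them
as the YSH spelling of these two constructs is my reading of the bullet
above. The helper names are hypothetical.

```sh
# Subshell: the child works on a copy of the state, so its changes
# (variables, current directory) are discarded when it exits.
subshell_demo() {
  food=ale
  ( food=bean; cd / )
  echo "food=$food"
}

# Concurrent execution: two background jobs run at once, and 'wait'
# blocks until both have exited.
concurrent_demo() {
  tmp=$(mktemp -d) || return 1
  ( echo ale  > "$tmp/a" ) &
  ( echo bean > "$tmp/b" ) &
  wait
  cat "$tmp/a" "$tmp/b"
  rm -r "$tmp"
}

subshell_demo     # prints food=ale
concurrent_demo   # prints ale, then bean
```
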
### Deprecated Shell Constructs

The shared interpreter supports many shell constructs that are deprecated:

- YSH code uses shell's `||` and `&&` in limited circumstances, since `errexit`
  is on by default.
- Assignment builtins like `local` and `declare`. Use YSH keywords.
- Boolean expressions like `[[ x =~ $pat ]]`. Use YSH expressions.
- Shell arithmetic like `$(( x + 1 ))` and `(( y = x ))`. Use YSH expressions.
- The `until` loop can always be replaced with a `while` loop.
- Most of what's in `${}` can be written in other ways. For example,
  `${s#/tmp}` could be `s => removePrefix('/tmp')` (TODO).

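Two of these claims can be checked in plain shell. The sketch below is legacy
POSIX shell, not YSH, and the function names are hypothetical. It shows an
`until` loop next to the equivalent `while` loop with a negated condition,
and what the `${s#/tmp}` prefix removal does.

```sh
# 'until COND' is the same as 'while ! COND'.
count_until() {
  i=0
  until [ "$i" -ge 3 ]; do i=$((i + 1)); done
  echo "$i"
}

count_while() {
  i=0
  while ! [ "$i" -ge 3 ]; do i=$((i + 1)); done
  echo "$i"
}

# ${s#PREFIX} removes a leading PREFIX from the value of s.
strip_tmp() {
  s=$1
  echo "${s#/tmp}"
}

count_until              # prints 3
count_while              # prints 3
strip_tmp /tmp/ale.txt   # prints /ale.txt
```
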
### Not Yet Implemented

This document mentions a few constructs that aren't yet implemented. Here's a
summary:

```none
# Unimplemented syntax:

echo ${x|html}               # formatters

echo ${x %.2f}               # statically-parsed printf

var x = j"line\n"
echo j"line\n"               # JSON-style string literal

var x = "<p>$x</p>"html
echo "<p>$x</p>"html         # tagged string

var x = 15 Mi                # units suffix
```

Important builtins that aren't implemented:

- `describe` for testing
- `parseArgs()` to parse flags
- Builtins for [TSV8]($xref) - selection, projection, sorting

<!--

- To document: Method calls
- To implement: Capers: stateless coprocesses
-->

## Appendix: Example of an YSH Module

YSH can be used to write simple "shell scripts" or longer programs. It has
*procs* and *modules* to help with the latter.

A module is just a file, like this:

```
#!/usr/bin/env ysh
### Deploy script

source-guard main || return 0   # declaration, "include guard"

source $_this_dir/lib/util.ysh  # defines 'log' helper

const DEST = '/tmp/ysh-tour'

proc my-sync(...files) {
  ### Sync files and show which ones

  cp --verbose @files $DEST
}

proc main {
  mkdir -p $DEST

  touch {foo,bar}.py {build,test}.sh

  log "Copying source files"
  my-sync *.py *.sh

  if test --dir /tmp/logs {
    cd /tmp/logs

    log "Copying logs"
    my-sync *.log
  }
}

if is-main {                    # The only top-level statement
  main @ARGV
}
```

<!--
TODO:
- Also show flags parsing?
- Show longer examples where it isn't boilerplate
-->

You wouldn't bother with the boilerplate for something this small. But this
example illustrates the idea, which is that the top level often contains these
words: `proc`, `const`, `module`, `source`, and `use`.
