1---
2default_highlighter: oils-sh
3---
4
5A Tour of YSH
6=============
7
8<!-- author's note about example names
9
10- people: alice, bob
11- nouns: ale, bean
12 - peanut, coconut
13- 42 for integers
14-->
15
This doc describes the [YSH]($xref) language from a **clean slate**
perspective. We don't assume you know Unix shell, or the compatible
[OSH]($xref). But shell users will see the similarity, with simplifications
and upgrades.
20
21Remember, YSH is for Python and JavaScript users who avoid shell! See the
22[project FAQ][FAQ] for more color on that.
23
24[FAQ]: https://www.oilshell.org/blog/2021/01/why-a-new-shell.html
25
26This document is **long** because it demonstrates nearly every feature of the
27language. You may want to read it in multiple sittings, or read [The Simplest
28Explanation of
29Oil](https://www.oilshell.org/blog/2020/01/simplest-explanation.html) first.
30(Until 2023, YSH was called the "Oil language".)
31
32
33Here's a summary of what follows:
34
351. YSH has interleaved *word*, *command*, and *expression* languages.
36 - The command language has Ruby-like *blocks*, and the expression language
37 has Python-like *data types*.
382. YSH has both builtin *commands* like `cd /tmp`, and builtin *functions* like
39 `join()`.
403. Languages for *data*, like [JSON][], are complementary to YSH code.
414. OSH and YSH share both an *interpreter data model* and a *process model*
42 (provided by the Unix kernel). Understanding these common models will make
43 you both a better shell user and YSH user.
44
45Keep these points in mind as you read the details below.
46
47[JSON]: https://json.org
48
49<div id="toc">
50</div>
51
52## Preliminaries
53
54Start YSH just like you start bash or Python:
55
56<!-- oils-sh below skips code block extraction, since it doesn't run -->
57
58```sh-prompt
59bash$ ysh # assuming it's installed
60
61ysh$ echo 'hello world' # command typed into YSH
62hello world
63```
64
65In the sections below, we'll save space by showing output **in comments**, with
66`=>`:
67
68 echo 'hello world' # => hello world
69
70Multi-line output is shown like this:
71
72 echo one
73 echo two
74 # =>
75 # one
76 # two
77
78## Examples
79
80### Hello World Script
81
82You can also type commands into a file like `hello.ysh`. This is a complete
83YSH program, which is identical to a shell program:
84
85 echo 'hello world' # => hello world
86
87### A Taste of YSH
88
89Unlike shell, YSH has `var` and `const` keywords:
90
    const name = 'world'       # const is rarer, used at the top level
    echo "hello $name"         # => hello world
93
94They take rich Python-like expressions on the right:
95
    var x = 0                  # an integer, not a string
    setvar x = x * 2 + 1       # mutate with the 'setvar' keyword

    setvar x += 5              # increment by 5
    echo $x                    # => 6

    var mylist = [x, 7]        # two integers [6, 7]
103
104Expressions are often surrounded by `()`:
105
106 if (x > 0) {
107 echo 'positive'
108 } # => positive
109
110 for i, item in (mylist) { # 'mylist' is a variable, not a string
111 echo "[$i] item $item"
112 }
113 # =>
114 # [0] item 6
115 # [1] item 7
116
117YSH has Ruby-like blocks:
118
119 cd /tmp {
120 echo hi > greeting.txt # file created inside /tmp
121 echo $PWD # => /tmp
122 }
123 echo $PWD # prints the original directory
124
125And utilities to read and write JSON:
126
    var person = {name: 'bob', age: 42}
    json write (person)
    # =>
    # {
    #   "name": "bob",
    #   "age": 42
    # }
134
135 echo '["str", 42]' | json read # sets '_reply' variable by default
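
For readers coming from Python, `json write` and `json read` correspond roughly to `json.dumps()` and `json.loads()`. The sketch below is only an analogy under that assumption; the exact whitespace YSH emits may differ from Python's.

```python
import json

person = {"name": "bob", "age": 42}

# Rough analogue of `json write (person)`: serialize typed data to text.
text = json.dumps(person, indent=2)
print(text)

# Rough analogue of `json read`: parse text back into typed data.
reply = json.loads('["str", 42]')
print(reply)
```

In both languages the round trip preserves types: `42` stays an integer rather than becoming the string `'42'`.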
136
137### Tip: Use the `=` operator interactively
138
139The `=` keyword evaluates and prints an expression:
140
141 = _reply
142 # => (List) ["str", 42]
143
144(Think of it like `var x = _reply`, without the `var`.)
145
146The **best way** to learn YSH is to type these examples and see what happens!
147
148## Word Language: Expressions for Strings (and Arrays)
149
150Let's describe the word language first, and then talk about commands and
151expressions. Words are a rich language because **strings** are a central
152concept in shell.
153
154### Unquoted Words
155
156Words denote strings, but you often don't need to quote them:
157
158 echo hi # => hi
159
160Quotes are useful when a string has spaces, or punctuation characters like `( )
161;`.
162
163### Three Kinds of String Literals
164
165You can choose the style that's most convenient to write a given string.
166
167#### Double-Quoted, Single-Quoted, and J8 strings (like JSON)
168
169Double-quoted strings allow **interpolation**, with `$`:
170
171 var person = 'alice'
172 echo "hi $person, $(echo bye)" # => hi alice, bye
173
To write the special characters `$`, `"`, and `\` literally, escape them with `\`:
175
176 echo "\$ \" \\ " # => $ " \
177
178In single-quoted strings, all characters are **literal** (except `'`, which
179can't be expressed):
180
181 echo 'c:\Program Files\' # => c:\Program Files\
182
183If you want C-style backslash **character escapes**, use a J8 string, which is
184like JSON, but with single quotes:
185
186 echo u' A is \u{41} \n line two, with backslash \\'
187 # =>
188 # A is A
189 # line two, with backslash \
190
191The `u''` strings are guaranteed to be valid Unicode (unlike JSON). You can
192also use `b''` strings:
193
194 echo b'byte \yff' # Byte that's not valid unicode, like \xff in C.
195 # Don't confuse it with \u{ff}.
196
197#### Multi-line Strings
198
199Multi-line strings are surrounded with triple quotes. They come in the same
200three varieties, and leading whitespace is stripped in a convenient way.
201
202 sort <<< """
203 var sub: $x
204 command sub: $(echo hi)
205 expression sub: $[x + 3]
206 """
207 # =>
208 # command sub: hi
209 # expression sub: 9
210 # var sub: 6
211
212 sort <<< '''
213 $2.00 # literal $, no interpolation
214 $1.99
215 '''
216 # =>
217 # $1.99
218 # $2.00
219
220 sort <<< u'''
221 C\tD
222 A\tB
223 ''' # b''' strings also supported
224 # =>
225 # A B
226 # C D
227
228(Use multiline strings instead of shell's [here docs]($xref:here-doc).)
229
230### Three Kinds of Substitution
231
232YSH has syntax for 3 types of substitution, all of which start with `$`. That
233is, you can convert any of these things to a **string**:
234
2351. Variables
2362. The output of commands
2373. The value of expressions
238
239#### Variable Sub
240
241The syntax `$a` or `${a}` converts a variable to a string:
242
243 var a = 'ale'
244 echo $a # => ale
245 echo _${a}_ # => _ale_
246 echo "_ $a _" # => _ ale _
247
248The shell operator `:-` is occasionally useful in YSH:
249
250 echo ${not_defined:-'default'} # => default
251
252#### Command Sub
253
254The `$(echo hi)` syntax runs a command and captures its `stdout`:
255
256 echo $(hostname) # => example.com
257 echo "_ $(hostname) _" # => _ example.com _
258
259#### Expression Sub
260
261The `$[myexpr]` syntax evaluates an expression and converts it to a string:
262
263 echo $[a] # => ale
264 echo $[1 + 2 * 3] # => 7
265 echo "_ $[1 + 2 * 3] _" # => _ 7 _
266
267<!-- TODO: safe substitution with "$[a]"html -->
268
269### Arrays of Strings: Globs, Brace Expansion, Splicing, and Splitting
270
271There are four constructs that evaluate to a **list of strings**, rather than a
272single string.
273
274#### Globs
275
276Globs like `*.py` evaluate to a list of files.
277
    touch foo.py bar.py  # create the files
    write *.py
    # =>
    # bar.py
    # foo.py
283
284If no files match, it evaluates to an empty list (`[]`).
285
286#### Brace Expansion
287
288The brace expansion mini-language lets you write strings without duplication:
289
290 write {alice,bob}@example.com
291 # =>
292 # alice@example.com
293 # bob@example.com
294
295#### Splicing
296
297The `@` operator splices an array into a command:
298
299 var myarray = :| ale bean |
300 write S @myarray E
301 # =>
302 # S
303 # ale
304 # bean
305 # E
306
307You also have `@[]` to splice an expression that evaluates to a list:
308
309 write -- @[split('ale bean')]
310 # =>
311 # ale
312 # bean
313
314Each item will be converted to a string.
315
316#### Split Command Sub / Split Builtin Sub
317
318There's also a variant of *command sub* that decodes J8 lines into a sequence
319of strings:
320
321 write @(seq 3) # write is passed 3 args
322 # =>
323 # 1
324 # 2
325 # 3
326
327## Command Language: I/O, Control Flow, Abstraction
328
329### Simple Commands
330
331A simple command is a space-separated list of words. YSH looks up the first
332word to determine if it's a builtin command, or a user-defined `proc`.
333
334 echo 'hello world' # The shell builtin 'echo'
335
336 proc greet (name) { # Define a unit of code
337 echo "hello $name"
338 }
339
340 # The first word now resolves to the proc you defined
341 greet alice # => hello alice
342
343If it's neither, then it's assumed to be an external command:
344
345 ls -l /tmp # The external 'ls' command
346
347Commands accept traditional string arguments, as well as typed arguments in
348parentheses:
349
350 # 'write' is a string arg; 'x' is a typed expression arg
351 json write (x)
352
353<!--
354Block args are a special kind of typed arg:
355
356 cd /tmp {
357 echo $PWD
358 }
359-->
360
361### Redirects
362
363You can **redirect** `stdin` and `stdout` of simple commands:
364
365 echo hi > tmp.txt # write to a file
366 sort < tmp.txt
367
368Here are the most common idioms for using `stderr` (identical to shell):
369
370 ls /tmp 2>errors.txt
371 echo 'fatal error' >&2
372
373### ARGV and ENV
374
375At the top level, the `ARGV` list holds the arguments passed to the shell:
376
377 var num_args = len(ARGV)
378 ls /tmp @ARGV # pass shell's arguments through
379
380Inside a `proc` without declared parameters, `ARGV` holds the arguments passed
381to the `proc`. (Procs are explained below.)
382
383---
384
385You can add to the environment of a new process with a *prefix binding*:
386
387 PYTHONPATH=vendor ./demo.py # os.environ will have {'PYTHONPATH': 'vendor'}
388
389Under the hood, the prefix binding temporarily augments the `ENV` object, which
390is the current environment.
391
392You can also mutate the `ENV` object:
393
394 setglobal ENV.PYTHONPATH = '.'
395 ./demo.py # all future invocations have a different PYTHONPATH
396 ./demo.py
397
398And get its attributes:
399
400 echo $[ENV.PYTHONPATH] # => .
401
402### Pipelines
403
Pipelines are a powerful method of manipulating data streams:

    ls | wc -l                       # count files in this directory
    find /bin -type f | xargs wc -l  # count lines in the files of a subtree
408
409The stream may contain (lines of) text, binary data, JSON, TSV, and more.
410Details below.
411
412### Multi-line Commands
413
414The `...` prefix lets you write long commands, pipelines, and `&&` chains
415without `\` line continuations.
416
417 ... find /bin # traverse this directory and
418 -type f -a -executable # print executable files
419 | sort -r # reverse sort
420 | head -n 30 # limit to 30 files
421 ;
422
423When this mode is active:
424
425- A single newline behaves like a space
426- A blank line (two newlines in a row) is illegal, but a line that has only a
427 comment is allowed. This prevents confusion if you forget the `;`
428 terminator.
429
430### `var`, `setvar`, `const` to Declare and Mutate
431
432Constants can't be modified:
433
434 const myconst = 'mystr'
435 # setvar myconst = 'foo' would be an error
436
437Modify variables with the `setvar` keyword:
438
439 var num_beans = 12
440 setvar num_beans = 13
441
442A more complex example:
443
444 var d = {name: 'bob', age: 42} # dict literal
445 setvar d.name = 'alice' # d.name is a synonym for d['name']
446 echo $[d.name] # => alice
447
448That's most of what you need to know about assignments. Advanced users may
449want to use `setglobal` or `call myplace->setValue(42)` in certain situations.
450
451<!--
452 var g = 1
453 var h = 2
454 proc demo(:out) {
455 setglobal g = 42
456 setref out = 43
457 }
458 demo :h # pass a reference to h
459 echo "$g $h" # => 42 43
460-->
461
462More info: [Variable Declaration and Mutation](variables.html).
463
464### `for` Loop
465
466#### Words
467
468Shell-style for loops iterate over **words**:
469
470 for word in 'oils' $num_beans {pea,coco}nut {
471 echo $word
472 }
473 # =>
474 # oils
475 # 13
476 # peanut
477 # coconut
478
479You can ask for the loop index with `i,`:
480
481 for i, word in README.md *.py {
482 echo "$i - $word"
483 }
484 # =>
485 # 0 - README.md
486 # 1 - __init__.py
487
488#### Typed Data
489
To iterate over typed data, use parentheses around an **expression**. The
expression should evaluate to an integer `Range`, `List`, `Dict`, or `io.stdin`.
492
493Range:
494
495 for i in (3 ..< 5) { # range operator ..<
496 echo "i = $i"
497 }
498 # =>
499 # i = 3
500 # i = 4
501
502List:
503
504 var foods = ['ale', 'bean']
505 for item in (foods) {
506 echo $item
507 }
508 # =>
509 # ale
510 # bean
511
512Again, you can request the index with `for i, item in ...`.
513
514---
515
516There are **three** ways of iterating over a `Dict`:
517
518 var mydict = {pea: 42, nut: 10}
519 for key in (mydict) {
520 echo $key
521 }
522 # =>
523 # pea
524 # nut
525
    for key, value in (mydict) {
      echo "$key - $value"
    }
    # =>
    # pea - 42
    # nut - 10

    for i, key, value in (mydict) {
      echo "$i - $key - $value"
    }
    # =>
    # 0 - pea - 42
    # 1 - nut - 10
539
540That is, if you ask for two things, you'll get the key and value. If you ask
541for three, you'll also get the index.
542
(One way to think of it: `for` loops in YSH have the functionality of Python's
`enumerate()`, `items()`, `keys()`, and `values()`.)
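
To make that comparison concrete, here is a Python sketch of the three `Dict` loop forms shown above. It illustrates the analogy only, and the variable names are chosen for this example.

```python
mydict = {"pea": 42, "nut": 10}

# One loop variable: keys only, like `for key in (mydict)`.
keys = [key for key in mydict]

# Two loop variables: key and value, like `for key, value in (mydict)`.
pairs = [(key, value) for key, value in mydict.items()]

# Three loop variables: index, key, and value.
triples = [(i, key, value) for i, (key, value) in enumerate(mydict.items())]

print(keys)
print(pairs)
print(triples)
```

Python needs a different helper for each shape, while YSH picks the shape from the number of loop variables.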
545
546---
547
548The `io.stdin` object iterates over lines:
549
550 for line in (io.stdin) {
551 echo $line
552 }
553 # lines are buffered, so it's much faster than `while read --raw-line`
554
555<!--
556TODO: Str loop should give you the (UTF-8 offset, rune)
557Or maybe just UTF-8 offset? Decoding errors could be exceptions, or Unicode
558replacement.
559-->
560
561### `while` Loop
562
563While loops can use a **command** as the termination condition:
564
565 while test --file lock {
566 sleep 1
567 }
568
Or an **expression**, which is surrounded by `()`:
570
571 var i = 3
572 while (i < 6) {
573 echo "i = $i"
574 setvar i += 1
575 }
576 # =>
577 # i = 3
578 # i = 4
579 # i = 5
580
581### Conditionals
582
583#### `if elif`
584
585If statements test the exit code of a command, and have optional `elif` and
586`else` clauses:
587
588 if test --file foo {
589 echo 'foo is a file'
590 rm --verbose foo # delete it
591 } elif test --dir foo {
592 echo 'foo is a directory'
593 } else {
594 echo 'neither'
595 }
596
597Invert the exit code with `!`:
598
599 if ! grep alice /etc/passwd {
600 echo 'alice is not a user'
601 }
602
603As with `while` loops, the condition can also be an **expression** wrapped in
604`()`:
605
606 if (num_beans > 0) {
607 echo 'so many beans'
608 }
609
610 var done = false
611 if (not done) { # negate with 'not' operator (contrast with !)
612 echo "we aren't done"
613 }
614
615#### `case`
616
617The case statement is a series of conditionals and executable blocks. The
618condition can be either an unquoted glob pattern like `*.py`, an eggex pattern
619like `/d+/`, or a typed expression like `(42)`:
620
621 var s = 'README.md'
622 case (s) {
623 *.py { echo 'Python' }
624 *.cc | *.h { echo 'C++' }
625 * { echo 'Other' }
626 }
627 # => Other
628
629 case (s) {
630 / dot* '.md' / { echo 'Markdown' }
631 (30 + 12) { echo 'the integer 42' }
632 (else) { echo 'neither' }
633 }
634 # => Markdown
635
636
637<!--
638(Shell style like `if foo; then ... fi` and `case $x in ... esac` is also
639legal, but discouraged in YSH code.)
640-->
641
642### Error Handling
643
644If statements are also used for **error handling**. Builtins and external
645commands use this style:
646
647 if ! test -d /bin {
648 echo 'not a directory'
649 }
650
651 if ! cp foo /tmp {
652 echo 'error copying' # any non-zero status
653 }
654
655Procs use this style (because of shell's *disabled `errexit` quirk*):
656
657 try {
658 myproc
659 }
660 if failed {
661 echo 'failed'
662 }
663
664For a complete list of examples, see [YSH Error
665Handling](ysh-error.html). For design goals and a reference, see [YSH
666Fixes Shell's Error Handling](error-handling.html).
667
668#### exit, break, continue, return
669
670The `exit` **keyword** exits a process. (It's not a shell builtin.)
671
672The other 3 control flow keywords behave like they do in Python and JavaScript.
673
674### Shell-like `proc`
675
676You can define units of code with the `proc` keyword. A `proc` is like a
677*procedure* or *process*.
678
679 proc my-ls {
680 ls -a -l @ARGV # pass args through
681 }
682
683Simple procs like this are invoked like a shell command:
684
685 my-ls /dev/null /etc/passwd
686
687You can name the parameters, and add a doc comment with `###`:
688
689 proc mycopy (src, dest) {
690 ### Copy verbosely
691
692 mkdir -p $dest
693 cp --verbose $src $dest
694 }
695 touch log.txt
696 mycopy log.txt /tmp # first word 'mycopy' is a proc
697
698Procs have many features, including **four** kinds of arguments:
699
7001. Word args (which are always strings)
7011. Typed, positional args
7021. Typed, named args
7031. A final block argument, which may be written with `{ }`.
704
705At the call site, they can look like any of these forms:
706
707 ls /tmp # word arg
708
709 json write (d) # word arg, then positional arg
710
711 try {
712 error 'failed' (status=9) # word arg, then named arg
713 }
714
715 cd /tmp { echo $PWD } # word arg, then block arg
716
717 pp value ([1, 2]) # positional, typed arg
718
719<!-- TODO: lazy arg list: ls8 | where [age > 10] -->
720
721At the definition site, the kinds of parameters are separated with `;`, similar
722to the Julia language:
723
724 proc p2 (word1, word2; pos1, pos2, ...rest_pos) {
725 echo "$word1 $word2 $[pos1 + pos2]"
726 json write (rest_pos)
727 }
728
729 proc p3 (w ; ; named1, named2, ...rest_named; block) {
730 echo "$w $[named1 + named2]"
731 call io->eval(block)
732 json write (rest_named)
733 }
734
735 proc p4 (; ; ; block) {
736 call io->eval(block)
737 }
738
739YSH also has Python-like functions defined with `func`. These are part of the
740expression language, which we'll see later.
741
742For more info, see the [Guide to Procs and Funcs](proc-func.html).
743
744### Ruby-like Block Arguments
745
746A block is a value of type `Command`. For example, `shopt` is a builtin
747command that takes a block argument:
748
749 shopt --unset errexit { # ignore errors
750 cp ale /tmp
751 cp bean /bin
752 }
753
754In this case, the block doesn't form a new scope.
755
756#### Block Scope / Closures
757
758However, by default, block arguments capture the frame they're defined in.
759This means they obey *lexical scope*.
760
761Consider this proc, which accepts a block, and runs it:
762
763 proc do-it (; ; ; block) {
764 call io->eval(block)
765 }
766
767When the block arg is passed, the enclosing stack frame is captured. This
768means that code inside the block can use variables in the captured frame:
769
770 var x = 42
771 do-it {
772 echo "x = $x" # outer x is visible LATER, when the block is run
773 }
774
775- [Feature Index: Closures](ref/feature-index.html#Closures)
776
777### Builtin Commands
778
779**Shell builtins** like `cd` and `read` are the "standard library" of the
780command language. Each one takes various flags:
781
782 cd -L . # follow symlinks
783
784 echo foo | read --all # read all of stdin
785
786Here are some categories of builtin:
787
788- I/O: `echo write read`
789- File system: `cd test`
790- Processes: `fork wait forkwait exec`
791- Interpreter settings: `shopt shvar`
792- Meta: `command builtin runproc type eval`
793
794<!-- TODO: Link to a comprehensive list of builtins -->
795
796## Expression Language: Python-like Types
797
798YSH expressions look and behave more like Python or JavaScript than shell. For
799example, we write `if (x < y)` instead of `if [ $x -lt $y ]`. Expressions are
800usually surrounded by `( )`.
801
At runtime, variables like `x` and `y` are bound to **typed data**, like
integers, floats, strings, lists, and dicts.
804
805<!--
806[Command vs. Expression Mode](command-vs-expression-mode.html) may help you
807understand how YSH is parsed.
808-->
809
810### Python-like `func`
811
812At the end of the *Command Language*, we saw that procs are shell-like units of
813code. YSH also has Python-like **functions**, which are different than
814`procs`:
815
816- They're defined with the `func` keyword.
817- They're called in expressions, not in commands.
818- They're **pure**, and live in the **interior** of a process.
819 - In contrast, procs usually perform I/O, and have **exterior** boundaries.
820
821The simplest function is:
822
823 func identity(x) {
824 return (x) # parens required for typed return
825 }
826
827A more complex pure function:
828
829 func myRepeat(s, n; special=false) { # positional; named params
830 var parts = []
831 for i in (0 ..< n) {
832 append $s (parts)
833 }
834 var result = join(parts)
835
836 if (special) {
837 return ("$result !!")
838 } else {
839 return (result)
840 }
841 }
842
843 echo $[myRepeat('z', 3)] # => zzz
844
845 echo $[myRepeat('z', 3, special=true)] # => zzz !!
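
For comparison, here is a rough Python counterpart of `myRepeat`. It assumes that the `;` in the YSH signature plays a role similar to the bare `*` in a Python signature, which separates positional parameters from keyword-only ones. The name `my_repeat` simply follows Python naming conventions.

```python
def my_repeat(s, n, *, special=False):
    # Parameters before `*` are positional; `special` must be passed by name.
    parts = []
    for _ in range(n):
        parts.append(s)
    result = "".join(parts)

    if special:
        return f"{result} !!"
    return result


print(my_repeat("z", 3))
print(my_repeat("z", 3, special=True))
```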
846
847A function that mutates its argument:
848
849 func popTwice(mylist) {
850 call mylist->pop()
851 call mylist->pop()
852 }
853
854 var mylist = [3, 4]
855
856 # The call keyword is an "adapter" between commands and expressions,
857 # like the = keyword.
858 call popTwice(mylist)
859
860
861Funcs are named using `camelCase`, while procs use `kebab-case`. See the
862[Style Guide](style-guide.html) for more conventions.
863
864#### Builtin Functions
865
866In addition, to builtin commands, YSH has Python-like builtin **functions**.
867These are like the "standard library" for the expression language. Examples:
868
869- Functions that take multiple types: `len() type()`
870- Conversions: `bool() int() float() str() list() ...`
871- Explicit word evaluation: `split() join() glob() maybe()`
872
873<!-- TODO: Make a comprehensive list of func builtins. -->
874
875
876### Data Types: `Int`, `Str`, `List`, `Dict`, `Obj`, ...
877
878YSH has data types, each with an expression syntax and associated methods.
879
880### Methods
881
882Non-mutating methods are looked up with the `.` operator:
883
884 var line = ' ale bean '
885 var caps = line.trim().upper() # 'ALE BEAN'
886
887Mutating methods are looked up with a thin arrow `->`:
888
889 var foods = ['ale', 'bean']
890 var last = foods->pop() # bean
891 write @foods # => ale
892
893You can ignore the return value with the `call` keyword:
894
895 call foods->pop()
896
897That is, YSH adds mutable data structures to shell, so we have a special syntax
898for mutation.
899
900---
901
902You can also chain functions with a fat arrow `=>`:
903
904 var trimmed = line.trim() => upper() # 'ALE BEAN'
905
906The `=>` operator allows functions to appear in a natural left-to-right order,
907like methods.
908
    # list() is a free function taking one arg
    # join() is a free function taking two args
    var x = {k1: 42, k2: 43} => list() => join('/')  # 'k1/k2'
912
913---
914
915Now let's go through the data types in YSH. We'll show the syntax for
916literals, and what **methods** they have.
917
918#### Null and Bool
919
YSH uses JavaScript-like spellings for these three "atoms":
921
922 var x = null
923
924 var b1, b2 = true, false
925
926 if (b1) {
927 echo 'yes'
928 } # => yes
929
930
931#### Int
932
933There are many ways to write integers:
934
935 var small, big = 42, 65_536
936 echo "$small $big" # => 42 65536
937
938 var hex, octal, binary = 0x0001_0000, 0o755, 0b0001_0101
939 echo "$hex $octal $binary" # => 65536 493 21
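
These literal forms happen to be valid Python as well, so a Python session is a quick way to double-check the values above. The snippet is only a cross-check; `hex_` has a trailing underscore because `hex` is a Python builtin.

```python
small, big = 42, 65_536
hex_, octal, binary = 0x0001_0000, 0o755, 0b0001_0101

print(small, big)
print(hex_, octal, binary)
```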
940
941<!--
942"Runes" are integers that represent Unicode code points. They're not common in
943YSH code, but can make certain string algorithms more readable.
944
945 # Pound rune literals are similar to ord('A')
946 const a = #'A'
947
948 # Backslash rune literals can appear outside of quotes
949 const newline = \n # Remember this is an integer
950 const backslash = \\ # ditto
951
952 # Unicode rune literal is syntactic sugar for 0x3bc
953 const mu = \u{3bc}
954
955 echo "chars $a $newline $backslash $mu" # => chars 65 10 92 956
956-->
957
958#### Float
959
960Floats are written with a decimal point:
961
962 var big = 3.14
963
964You can use scientific notation, as in Python:
965
966 var small = 1.5e-10
967
968#### Str
969
970See the section above on *Three Kinds of String Literals*. It described
971`'single quoted'`, `"double ${quoted}"`, and `u'J8-style\n'` strings; as well
972as their multiline variants.
973
974Strings are UTF-8 encoded in memory, like strings in the [Go
975language](https://golang.org). There isn't a separate string and unicode type,
976as in Python.
977
978Strings are **immutable**, as in Python and JavaScript. This means they only
979have **transforming** methods:
980
981 var x = s.trim()
982
983Other methods:
984
985- `trimLeft() trimRight()`
986- `trimPrefix() trimSuffix()`
987- `upper() lower()`
988- `search() leftMatch()` - pattern matching
989- `replace() split()`
990
991#### List (and Arrays)
992
993All lists can be expressed with Python-like literals:
994
995 var foods = ['ale', 'bean', 'corn']
996 var recursive = [1, [2, 3]]
997
As a special case, lists of strings are called **arrays**. It's often more
convenient to write them with shell-like literals:
1000
1001 # No quotes or commas
1002 var foods = :| ale bean corn |
1003
1004 # You can use the word language here
1005 var other = :| foo $s *.py {alice,bob}@example.com |
1006
1007Lists are **mutable**, as in Python and JavaScript. So they mainly have
1008mutating methods:
1009
1010 call foods->reverse()
1011 write -- @foods
1012 # =>
1013 # corn
1014 # bean
1015 # ale
1016
1017#### Dict
1018
1019Dicts use syntax that's like JavaScript. Here's a dict literal:
1020
1021 var d = {
1022 name: 'bob', # unquoted keys are allowed
1023 age: 42,
1024 'key with spaces': 'val'
1025 }
1026
1027You can use either `[]` or `.` to retrieve a value, given a key:
1028
1029 var v1 = d['name']
1030 var v2 = d.name # shorthand for the above
1031 var v3 = d['key with spaces'] # no shorthand for this
1032
1033(If the key doesn't exist, an error is raised.)
1034
1035You can change Dict values with the same 2 syntaxes:
1036
    setvar d['name'] = 'other'
    setvar d.name = 'fun'
1039
1040---
1041
1042If you want to compute a key name, use an expression inside `[]`:
1043
1044 var key = 'alice'
1045 var d2 = {[key ++ '_z']: 'ZZZ'} # Computed key name
1046 echo $[d2.alice_z] # => ZZZ
1047
If you omit the value, it's taken from a variable of the same name:
1049
1050 var d3 = {key} # like {key: key}
1051 echo "name is $[d3.key]" # => name is alice
1052
1053More examples:
1054
1055 var empty = {}
1056 echo $[len(empty)] # => 0
1057
The `keys()` and `values()` functions return new `List` objects:
1059
1060 var keys = keys(d2) # => alice_z
1061 var vals = values(d3) # => alice
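
Python readers can compare these dict features with the sketch below. It is an analogy only. Note one difference: Python has no `{key}` shorthand, and the same spelling would build a set there.

```python
key = "alice"

# Computed key: Python evaluates the key expression directly.
d2 = {key + "_z": "ZZZ"}

# YSH's `{key}` means {key: key}; in Python it must be spelled out.
d3 = {"key": key}

# Wrapping in list() gives new lists, independent of the dicts.
keys = list(d2.keys())
vals = list(d3.values())

print(keys)
print(vals)
```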
1062
1063#### Obj
1064
1065YSH has an `Obj` type that bundles **code** and **data**. (In contrast, JSON
1066messages are pure data, not objects.)
1067
1068The main purpose of objects is **polymorphism**:
1069
1070 var obj = makeMyObject(42) # I don't know what it looks like inside
1071
1072 echo $[obj.myMethod()] # But I can perform abstract operations
1073
1074 call obj->mutatingMethod() # Mutation is considered special, with ->
1075
1076YSH objects are similar to Lua and JavaScript objects. They can be thought of
1077as a linked list of `Dict` instances.
1078
1079Or you can say they have a `Dict` of properties, and a recursive "prototype
1080chain" that is also an `Obj`.
1081
1082- [Feature Index: Objects](ref/feature-index.html#Objects)
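
One way to picture the "linked list of `Dict` instances" is a lookup that tries each dict in turn. The Python sketch below uses `collections.ChainMap` purely as an analogy. It is not how YSH implements `Obj`, and `prototype`, `props`, and `greet` are made-up names.

```python
from collections import ChainMap

# Shared entries live in the prototype; per-instance data lives in props.
prototype = {"greet": lambda name: f"hello {name}"}
props = {"name": "bob"}

# Lookup tries `props` first, then falls back to `prototype`.
obj = ChainMap(props, prototype)

print(obj["name"])
print(obj["greet"](obj["name"]))
```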
1083
1084### `Place` type / "out params"
1085
1086The `read` builtin can set an implicit variable `_reply`:
1087
1088 whoami | read --all # sets _reply
1089
1090Or you can pass a `value.Place`, created with `&`
1091
1092 var x # implicitly initialized to null
1093 whoami | read --all (&x) # mutate this "place"
1094 echo who=$x # => who=andy
1095
1096<!--
1097#### Quotation Types: value.Command (Block) and value.Expr
1098
1099These types are for reflection on YSH code. Most YSH programs won't use them
1100directly.
1101
1102- `Command`: an unevaluated code block.
1103 - rarely-used literal: `^(ls | wc -l)`
1104- `Expr`: an unevaluated expression.
1105 - rarely-used literal: `^[42 + a[i]]`
1106-->
1107
1108### Operators
1109
1110YSH operators are generally the same as in Python:
1111
1112 if (10 <= num_beans and num_beans < 20) {
1113 echo 'enough'
1114 } # => enough
1115
1116YSH has a few operators that aren't in Python. Equality can be approximate or
1117exact:
1118
    var n = ' 42 '
    if (n ~== 42) {
      echo 'equal after stripping whitespace and type conversion'
    }  # => equal after stripping whitespace and type conversion

    if (n === 42) {
      echo "not reached because strings and ints aren't equal"
    }
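
As a mental model, approximate equality can be sketched in Python as "strip, convert, then compare". The helper `approx_equal` is hypothetical and covers only the string-versus-integer case from the example; the precise rules of `~==` are not spelled out here.

```python
def approx_equal(left: str, right: int) -> bool:
    # Hypothetical helper: strip whitespace, convert to int, then compare.
    try:
        return int(left.strip()) == right
    except ValueError:
        # Text that isn't an integer is simply not equal.
        return False


n = " 42 "
print(approx_equal(n, 42))   # approximate comparison
print(n == 42)               # exact comparison: a str never equals an int
```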
1127
1128<!-- TODO: is n === 42 a type error? -->
1129
1130Pattern matching can be done with globs (`~~` and `!~~`)
1131
1132 const filename = 'foo.py'
1133 if (filename ~~ '*.py') {
1134 echo 'Python'
1135 } # => Python
1136
1137 if (filename !~~ '*.sh') {
1138 echo 'not shell'
1139 } # => not shell
1140
1141or regular expressions (`~` and `!~`). See the Eggex section below for an
1142example of the latter.
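
The glob operators have a close relative in Python's standard library. This sketch assumes that `fnmatch.fnmatchcase()` is a fair stand-in for `~~` on simple patterns like the ones above; corner cases may differ.

```python
from fnmatch import fnmatchcase

filename = "foo.py"

is_python = fnmatchcase(filename, "*.py")   # like: filename ~~ '*.py'
is_shell = fnmatchcase(filename, "*.sh")    # negate it for: filename !~~ '*.sh'

print(is_python, is_shell)
```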
1143
1144Concatenation is `++` rather than `+` because it avoids confusion in the
1145presence of type conversion:
1146
1147 var n = 42 + 1 # string plus int does implicit conversion
1148 echo $n # => 43
1149
1150 var y = 'ale ' ++ "bean $n" # concatenation
1151 echo $y # => ale bean 43
1152
1153<!--
1154TODO: change example above
1155 var n = '42' + 1 # string plus int does implicit conversion
1156-->
1157
1158<!--
1159
1160#### Summary of Operators
1161
1162- Arithmetic: `+ - * / // %` and `**` for exponentatiation
1163 - `/` always yields a float, and `//` is integer division
1164- Bitwise: `& | ^ ~`
1165- Logical: `and or not`
1166- Comparison: `== < > <= >= in 'not in'`
1167 - Approximate equality: `~==`
1168 - Eggex and glob match: `~ !~ ~~ !~~`
1169- Ternary: `1 if x else 0`
1170- Index and slice: `mylist[3]` and `mylist[1:3]`
1171 - `mydict->key` is a shortcut for `mydict['key']`
1172- Function calls
1173 - free: `f(x, y)`
1174 - transformations and chaining: `s => startWith('prefix')`
1175 - mutating methods: `mylist->pop()`
1176- String and List: `++` for concatenation
1177 - This is a separate operator because the addition operator `+` does
1178 string-to-int conversion
1179
1180TODO: What about list comprehensions?
1181-->
1182
1183### Egg Expressions (YSH Regexes)
1184
1185An *Eggex* is a YSH expression that denotes a regular expression. Eggexes
1186translate to POSIX ERE syntax, for use with tools like `egrep`, `awk`, and `sed
1187--regexp-extended` (GNU only).
1188
1189They're designed to be readable and composable. Example:
1190
    var D = / digit{1,3} /
    var ip_pattern = / D '.' D '.' D '.' D /

    var z = '192.168.0.1'
    if (z ~ ip_pattern) {           # Use the ~ operator to match
      echo "$z looks like an IP address"
    }  # => 192.168.0.1 looks like an IP address

    if (z !~ / '.255' %end /) {
      echo "doesn't end with .255"
    }  # => doesn't end with .255
1202
1203See the [Egg Expressions doc](eggex.html) for details.
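
To see what composability buys you, compare the same pattern built by hand with Python's `re` module. This is a sketch for comparison only: Python's regex syntax is not POSIX ERE, and the string-interpolation approach is an assumption of this example, not something YSH does.

```python
import re

D = r"[0-9]{1,3}"                      # plays the role of / digit{1,3} /
ip_pattern = rf"{D}\.{D}\.{D}\.{D}"     # literal dots must be escaped by hand

z = "192.168.0.1"

looks_like_ip = re.search(ip_pattern, z) is not None
ends_with_255 = re.search(r"\.255$", z) is not None

print(looks_like_ip, ends_with_255)
```

With plain regex strings, every literal `.` needs a backslash. In an eggex, quoted strings like `'.'` are always literal.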
1204
## Interlude

Before moving on to other YSH features, let's review what we've seen.

### Three Interleaved Languages

Here are the languages we saw in the last 3 sections:

1. **Words** evaluate to a string, or list of strings. This includes:
   - literals like `'mystr'`
   - substitutions like `${x}` and `$(hostname)`
   - globs like `*.sh`
2. **Commands** are used for
   - I/O: pipelines, builtins like `read`
   - control flow: `if`, `for`
   - abstraction: `proc`
3. **Expressions** on typed data are borrowed from Python, with influence from
   JavaScript:
   - Lists: `['ale', 'bean']` or `:| ale bean |`
   - Dicts: `{name: 'bob', age: 42}`
   - Functions: `split('ale bean')` and `join(['pea', 'nut'])`

### How Do They Work Together?

Here are two examples:

(1) In this *command*, there are **four** *words*. The fourth word is an
*expression sub* `$[]`.

    write hello $name $[d['age'] + 1]
    # =>
    # hello
    # world
    # 43

(2) In this assignment, the *expression* on the right hand side of `=`
concatenates two strings. The first string is a literal, and the second is a
*command sub*.

    var food = 'ale ' ++ $(echo bean | tr a-z A-Z)
    write $food  # => ale BEAN

So words, commands, and expressions are **mutually recursive**. If you're a
conceptual person, skimming [Syntactic Concepts](syntactic-concepts.html) may
help you understand this on a deeper level.
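
The two examples rely on variables defined earlier in the tour. Here is a self-contained sketch, assuming `name` and `d` hold the values implied by the output shown. The last statement, with the made-up variable `count`, nests all three languages in one line:

```ysh
var name = 'world'                 # assumed value, matching the output shown
var d = {name: 'bob', age: 42}     # assumed value

# A command with four words; the last word holds an expression
write hello $name $[d['age'] + 1]

# An expression containing a word (command sub), which contains a command
var food = 'ale ' ++ $(echo bean | tr a-z A-Z)
write $food

# expression -> command sub -> command -> word -> expression sub
var count = $(write $[len(split(food))])
write $count
```

Reading the last assignment from the inside out: `split(food)` is an expression, `$[...]` turns its length into a word, `write` is a command that prints that word, and `$(...)` captures the output as a string for the outer expression.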
1250
<!--
One way to think about these sublanguages is to note that the `|` character
means something different in each context:

- In the command language, it's the pipeline operator, as in `ls | wc -l`
- In the word language, it's only valid in a literal string like `'|'`, `"|"`,
  or `\|`. (It's also used in `${x|html}`, which formats a string.)
- In the expression language, it's the bitwise OR operator, as in Python and
  JavaScript.
-->

---

Let's move on from talking about **code**, and talk about **data**.
1265
## Data Notation / Interchange Formats

In YSH, you can read and write data languages based on [JSON]($xref). This is
a primary way to exchange messages between Unix processes.

Instead of being **executed**, like our command/word/expression languages,
these languages are **parsed** as data structures.

<!-- TODO: Link to slogans, fallacies, and concepts -->

### UTF-8

UTF-8 is the foundation of our data notation. It's the most common Unicode
encoding, and the most consistent:

    var x = u'hello \u{1f642}'  # store a UTF-8 string in memory
    echo $x                     # send UTF-8 to stdout

hello &#x1f642;
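
Because the string is stored as UTF-8, its length is a count of bytes rather than code points. A small sketch, assuming `len()` on a `Str` returns the byte count; the variable `y` is made up for comparison:

```ysh
var x = u'hello \u{1f642}'

# 'hello ' is 6 ASCII bytes, and U+1F642 takes 4 bytes in UTF-8
echo $[len(x)]   # => 10

# The same string written as explicit bytes
var y = b'hello \yf0\y9f\y99\y82'
if (x === y) {
  echo 'same bytes'
}
```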
1285
<!-- TODO: there's a runes() iterator which gives integer offsets, usable for
slicing -->

### JSON

JSON messages are UTF-8 text. You can encode and decode JSON with functions
(`func` style):

    var message = toJson({x: 42})       # => (Str)   '{"x": 42}'
    var mydict  = fromJson('{"x": 42}') # => (Dict)  {x: 42}

Or with commands (`proc` style):

    json write ({x: 42}) > foo.json     # writes '{"x": 42}'

    json read (&mydict) < foo.json      # create var
    = mydict                            # => (Dict)  {x: 42}
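
To make the round trip concrete, here is a sketch with a nested structure. The variable names, the sample data, and the file name `record.json` are illustrative only:

```ysh
var record = {name: 'bob', age: 42, drinks: ['ale', 'bean']}

var text = toJson(record)      # Dict -> Str
var copy = fromJson(text)      # Str  -> Dict

# The decoded copy is structured data again, not text
echo $[copy['name']]
echo $[copy['age'] + 1]
echo $[len(copy['drinks'])]

# The proc style does the same thing through a file
json write (record) > record.json
json read (&copy2) < record.json
echo $[copy2['drinks'][0]]
```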
1303
### J8 Notation

But JSON isn't quite enough for a principled shell.

- Traditional Unix tools like `grep` and `awk` operate on streams of **lines**.
  In YSH, to avoid data-dependent bugs, we want a reliable way of **quoting**
  lines.
- In YSH, we also want to represent **binary** data, not just text. When you
  read a Unix file, it may or may not be text.

So we borrow JSON-style strings, and create [J8 Notation][]. Slogans:

- *Deconstructing and Augmenting JSON*
- *Fixing the JSON-Unix Mismatch*

[J8 Notation]: $xref:j8-notation

#### J8 Lines

*J8 Lines* are a building block of J8 Notation. If you have a file
`lines.txt`:

<pre>
  doc/hello.md
 "doc/with spaces.md"
b'doc/with byte \yff.md'
</pre>

Then you can decode it with *split command sub* (mentioned above):

    var decoded = @(cat lines.txt)

This file has:

1. An unquoted string
2. A JSON string with `"double quotes"`
3. A J8-style string: `u'unicode'` or `b'bytes'`
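
Here is a sketch of that decoding step, using only the first two kinds of line so it is easy to try. It recreates a small `lines.txt` in the current directory and shows that the quotes are removed while the embedded space survives:

```ysh
# Recreate a two-line version of lines.txt
echo 'doc/hello.md' > lines.txt
echo '"doc/with spaces.md"' >> lines.txt

var decoded = @(cat lines.txt)    # each J8 line becomes one list item

echo $[len(decoded)]              # number of decoded strings
for path in (decoded) {
  write -- $path                  # one path per line, without the quotes
}
```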
1341
<!--
TODO: fromJ8Line() toJ8Line()
-->

#### JSON8 is Tree-Shaped

JSON8 is just like JSON, but it allows J8-style strings:

<pre>
{ "foo":  "hi \uD83D\uDE42"}  # valid JSON, and valid JSON8
{u'foo': u'hi \u{1F642}'   }  # valid JSON8, with J8-style strings
</pre>
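
J8-style strings matter because a byte string that isn't valid UTF-8 can still round-trip through JSON8. This sketch assumes the `json8 write` builtin and the `toJson8()` and `fromJson8()` functions behave as described in the Oils reference; the variable names are made up:

```ysh
var b = {binary: b'\yff'}       # a single byte that is not valid UTF-8

json8 write (b)                 # prints the dict, with a b'' string for the value

var text = toJson8(b)           # assumed counterpart of toJson()
var back = fromJson8(text)      # assumed counterpart of fromJson()
if (back['binary'] === b'\yff') {
  echo 'byte preserved'
}
```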
1354
<!--
In addition to strings and lines, you can write and read **tree-shaped** data
as [JSON][]:

    var d = {key: 'value'}
    json write (d)                 # dump variable d as JSON
    # =>
    # {
    #   "key": "value"
    # }

    echo '["ale", 42]' > example.json

    json read (&d2) < example.json  # parse JSON into var d2
    pp (d2)                         # pretty print it
    # => (List)  ['ale', 42]

[JSON][] will lose information when strings have binary data, but the slight
[JSON8]($xref) upgrade won't:

    var b = {binary: $'\xff'}
    json8 write (b)
    # =>
    # {
    #   "binary": b'\yff'
    # }
-->

[JSON]: $xref

#### TSV8 is Table-Shaped

(TODO: not yet implemented.)

YSH is meant to support data notation for tables:

1. Plain TSV files, which are untyped. Every column has string data.
   - Cells with tabs, newlines, and binary data are a problem.
2. Our extension [TSV8]($xref), which supports typed data.
   - It uses JSON notation for booleans, integers, and floats.
   - It uses J8 strings, which can represent any string.

<!-- Figure out the API. Does it work like JSON?

Or I think we just implement
- rows: 'where' or 'filter' (dplyr)
- cols: 'select' conflicts with shell builtin; call it 'cols'?
- sort: 'sort-by' or 'arrange' (dplyr)
- TSV8 <=> sqlite conversion. Are these drivers or what?
  - and then let you pipe output?

Do we also need TSV8 space2tab or something? For writing TSV8 inline.

More later:
- MessagePack (e.g. for shared library extension modules)
  - msgpack read, write? I think user-defined function could be like this?
- SASH: Simple and Strict HTML? For easy processing
-->
1413
## YSH Modules are Files

A module is a **file** of source code, like `lib/myargs.ysh`. The `use`
builtin turns it into an `Obj` that can be invoked and inspected:

    use myargs.ysh

    myargs proc1 --flag val   # module name becomes a prefix, via __invoke__
    var alias = myargs.proc1  # module has attributes

You can import specific names with the `--pick` flag:

    use myargs.ysh --pick p2 p3

    p2
    p3

- [Feature Index: Modules](ref/feature-index.html#Modules)

## The Runtime Shared by OSH and YSH

Although we describe OSH and YSH as different languages, they use the **same**
interpreter under the hood.

This interpreter has many `shopt` booleans to control behavior, like `shopt
--set parse_paren`. The group `shopt --set ysh:all` flips all booleans to make
`bin/osh` behave like `bin/ysh`.

Understanding this common runtime, and its interface to the Unix kernel, will
help you understand **both** languages!
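
For instance, a script started with `bin/osh` can opt into YSH behavior at the top of the file. This is a minimal sketch, assuming the option group takes effect for the commands that follow it:

```ysh
#!/usr/bin/env osh
shopt --set ysh:all      # from here on, parse and evaluate like bin/ysh

var x = 42
if (x > 0) {             # relies on options such as parse_paren
  echo 'positive'
}
```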
1444
### Interpreter Data Model

The [Interpreter State](interpreter-state.html) doc is under construction. It
will cover:

- The **call stack** for OSH and YSH
  - Each *stack frame* is a `{name -> cell}` mapping.
- Each cell has a **value**, with boolean flags
  - OSH has types `Str BashArray BashAssoc`, and flags `readonly export
    nameref`.
  - YSH has types `Bool Int Float Str List Dict Obj ...`, and the `readonly`
    flag.
- YSH **namespaces**
  - Modules with `use`
  - Builtin functions and commands
  - ENV
- Shell **options**
  - Boolean options with `shopt`: `parse_paren`, `simple_word_eval`, etc.
  - String options with `shvar`: `IFS`, `PATH`
- **Registers** that store interpreter state
  - `$?` and `_error`
  - `$!` for the last PID
  - `_this_dir`
  - `_reply`

### Process Model (the kernel)

The [Process Model](process-model.html) doc is **under construction**. It will
cover:

- Simple Commands, `exec`
- Pipelines. #[shell-the-good-parts](#blog-tag)
- `fork`, `forkwait`
- Command and process substitution
- Related:
  - [Tracing execution in Oils](xtrace.html) (xtrace), which divides
    process-based concurrency into **synchronous** and **async** constructs.
  - [Three Comics For Understanding Unix
    Shell](http://www.oilshell.org/blog/2020/04/comics.html) (blog)

<!--
Process model additions: Capers, Headless shell

some optimizations: See YSH starts fewer processes than other shells.
-->

### Advanced: Reflecting on the Interpreter

You can reflect on the interpreter with APIs like `io->eval()` and
`vm.getFrame()`.

- [Feature Index: Reflection](ref/feature-index.html#Reflection)

This allows YSH to be a language for creating other languages. (Ruby, Tcl, and
Racket also have this flavor.)

<!--

TODO: Hay and Awk examples
-->
1504
## Summary

What have we described in this tour?

YSH is a programming language that evolved from Unix shell. But you can
"forget" the bad parts of shell like `[ $x -lt $y ]`.

<!--
Instead, we've shown you shell-like commands, Python-like expressions on typed
data, and Ruby-like command blocks.
-->

Instead, focus on these central concepts:

1. Interleaved *word*, *command*, and *expression* languages.
2. A standard library of *builtin commands*, as well as *builtin functions*.
3. Languages for *data*: J8 Notation, including JSON8 and TSV8.
4. A *runtime* shared by OSH and YSH.

## Appendix

### Related Docs

- [YSH vs. Shell Idioms](idioms.html) - YSH side-by-side with shell.
- [YSH Language Influences](language-influences.html) - In addition to shell,
  Python, and JavaScript, YSH is influenced by Ruby, Perl, Awk, PHP, and more.
- [A Feel For YSH Syntax](syntax-feelings.html) - Some thoughts that may help
  you remember the syntax.
- [YSH Language Warts](warts.html) documents syntax that may be surprising.
1534
1535
### YSH Script Template

YSH can be used to write simple "shell scripts" or longer programs. It has
*procs* and *modules* to help with the latter.

A module is just a file, like this:

```
#!/usr/bin/env ysh
### Deploy script

use $_this_dir/lib/util.ysh --pick log

const DEST = '/tmp/ysh-tour'

proc my-sync(...files) {
  ### Sync files and show which ones

  cp --verbose @files $DEST
}

proc main {
  mkdir -p $DEST

  touch {foo,bar}.py {build,test}.sh

  log "Copying source files"
  my-sync *.py *.sh

  if test --dir /tmp/logs {
    cd /tmp/logs

    log "Copying logs"
    my-sync *.log
  }
}

if is-main {                    # The only top-level statement
  main @ARGV
}
```

<!--
TODO:
- Also show flags parsing?
- Show longer examples where it isn't boilerplate
-->

You wouldn't bother with the boilerplate for something this small. But this
example illustrates the basic idea: the top level often contains these words:
`use`, `const`, `proc`, and `func`.


<!--
TODO: not mentioning __provide__, since it should be optional in the most basic usage?
-->
1592
### YSH Features Not Shown

#### Advanced

These shell features are part of YSH, but aren't shown above:

- The `fork` and `forkwait` builtins, for concurrent execution and subshells.
- Process Substitution: `diff <(sort left.txt) <(sort right.txt)`

#### Deprecated Shell Constructs

The shared interpreter supports many shell constructs that are deprecated:

- YSH code uses shell's `||` and `&&` in limited circumstances, since `errexit`
  is on by default.
- Assignment builtins like `local` and `declare`. Use YSH keywords.
- Boolean expressions like `[[ x =~ $pat ]]`. Use YSH expressions.
- Shell arithmetic like `$(( x + 1 ))` and `(( y = x ))`. Use YSH expressions.
- The `until` loop can always be replaced with a `while` loop.
- Most of what's in `${}` can be written in other ways. For example
  `${s#/tmp}` could be `s => removePrefix('/tmp')` (TODO).
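
Here is a sketch of what some of these replacements look like. The shell forms appear only in comments, and `demo` and the variable names are hypothetical:

```ysh
proc demo {
  var x = 1                     # instead of: local x=1
  setvar x = x + 1              # instead of: (( x = x + 1 ))
  echo $[x + 1]                 # instead of: echo $(( x + 1 ))

  var s = 'ale-42'
  if (s ~ / digit+ %end /) {    # instead of: [[ $s =~ [0-9]+$ ]]
    echo 'ends with digits'
  }

  var i = 0
  while (i < 3) {               # instead of: until (( i >= 3 )); do ... done
    setvar i += 1
  }
  echo $i
}

demo
```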
1614
#### Not Yet Implemented

This document mentions a few constructs that aren't yet implemented. Here's a
summary:

```none
# Unimplemented syntax:

echo ${x|html}               # formatters

echo ${x %.2f}               # statically-parsed printf

var x = "<p>$x</p>"html
echo "<p>$x</p>"html         # tagged string

var x = 15 Mi                # units suffix
```

<!--
- To implement: Capers: stateless coprocesses
-->
1636