---
default_highlighter: oils-sh
---

A Tour of YSH
=============

8<!-- author's note about example names
9
10- people: alice, bob
11- nouns: ale, bean
12 - peanut, coconut
13- 42 for integers
14-->
15
This doc describes the [YSH]($xref) language from a **clean slate**
perspective. We don't assume you know Unix shell, or the compatible
[OSH]($xref). But shell users will see the similarity, with simplifications
and upgrades.
20
21Remember, YSH is for Python and JavaScript users who avoid shell! See the
22[project FAQ][FAQ] for more color on that.
23
24[FAQ]: https://www.oilshell.org/blog/2021/01/why-a-new-shell.html
25
26This document is **long** because it demonstrates nearly every feature of the
27language. You may want to read it in multiple sittings, or read [The Simplest
28Explanation of
29Oil](https://www.oilshell.org/blog/2020/01/simplest-explanation.html) first.
30(Until 2023, YSH was called the "Oil language".)
31
32
33Here's a summary of what follows:
34
351. YSH has interleaved *word*, *command*, and *expression* languages.
36 - The command language has Ruby-like *blocks*, and the expression language
37 has Python-like *data types*.
382. YSH has both builtin *commands* like `cd /tmp`, and builtin *functions* like
39 `join()`.
403. Languages for *data*, like [JSON][], are complementary to YSH code.
414. OSH and YSH share both an *interpreter data model* and a *process model*
42 (provided by the Unix kernel). Understanding these common models will make
43 you both a better shell user and YSH user.
44
45Keep these points in mind as you read the details below.
46
47[JSON]: https://json.org
48
49<div id="toc">
50</div>
51
52## Preliminaries
53
54Start YSH just like you start bash or Python:
55
56<!-- oils-sh below skips code block extraction, since it doesn't run -->
57
58```sh-prompt
59bash$ ysh # assuming it's installed
60
61ysh$ echo 'hello world' # command typed into YSH
62hello world
63```
64
65In the sections below, we'll save space by showing output **in comments**, with
66`=>`:
67
68 echo 'hello world' # => hello world
69
70Multi-line output is shown like this:
71
72 echo one
73 echo two
74 # =>
75 # one
76 # two
77
78## Examples
79
80### Hello World Script
81
82You can also type commands into a file like `hello.ysh`. This is a complete
83YSH program, which is identical to a shell program:
84
85 echo 'hello world' # => hello world
86
87### A Taste of YSH
88
89Unlike shell, YSH has `var` and `const` keywords:
90
    const name = 'world'       # const is rarer, used at the top level
    echo "hello $name"         # => hello world
93
94They take rich Python-like expressions on the right:
95
    var x = 0                  # an integer, not a string
    setvar x = x * 2 + 1       # mutate with the 'setvar' keyword

    setvar x += 5              # Increment by 5
    echo $x                    # => 6

    var mylist = [x, 7]        # two integers [6, 7]
103
104Expressions are often surrounded by `()`:
105
106 if (x > 0) {
107 echo 'positive'
108 } # => positive
109
110 for i, item in (mylist) { # 'mylist' is a variable, not a string
111 echo "[$i] item $item"
112 }
113 # =>
114 # [0] item 6
115 # [1] item 7
116
117YSH has Ruby-like blocks:
118
119 cd /tmp {
120 echo hi > greeting.txt # file created inside /tmp
121 echo $PWD # => /tmp
122 }
123 echo $PWD # prints the original directory
124
125And utilities to read and write JSON:
126
    var person = {name: 'bob', age: 42}
    json write (person)
    # =>
    # {
    #   "name": "bob",
    #   "age": 42
    # }
134
135 echo '["str", 42]' | json read # sets '_reply' variable by default
136
137The `=` keyword evaluates and prints an expression:
138
139 = _reply
140 # => (List) ["str", 42]
141
142(Think of it like `var x = _reply`, without the `var`.)
143
144## Word Language: Expressions for Strings (and Arrays)
145
146Let's describe the word language first, and then talk about commands and
147expressions. Words are a rich language because **strings** are a central
148concept in shell.
149
150### Unquoted Words
151
152Words denote strings, but you often don't need to quote them:
153
154 echo hi # => hi
155
156Quotes are useful when a string has spaces, or punctuation characters like `( )
157;`.
158
159### Three Kinds of String Literals
160
161You can choose the style that's most convenient to write a given string.
162
163#### Double-Quoted, Single-Quoted, and J8 strings (like JSON)
164
165Double-quoted strings allow **interpolation**, with `$`:
166
167 var person = 'alice'
168 echo "hi $person, $(echo bye)" # => hi alice, bye
169
Write special characters like `$` and `"` literally by escaping them with `\`:
171
172 echo "\$ \" \\ " # => $ " \
173
174In single-quoted strings, all characters are **literal** (except `'`, which
175can't be expressed):
176
177 echo 'c:\Program Files\' # => c:\Program Files\
178
179If you want C-style backslash **character escapes**, use a J8 string, which is
180like JSON, but with single quotes:
181
182 echo u' A is \u{41} \n line two, with backslash \\'
183 # =>
184 # A is A
185 # line two, with backslash \
186
187The `u''` strings are guaranteed to be valid Unicode (unlike JSON). You can
188also use `b''` strings:
189
190 echo b'byte \yff' # Byte that's not valid unicode, like \xff in C.
191 # Don't confuse it with \u{ff}.
192
193#### Multi-line Strings
194
195Multi-line strings are surrounded with triple quotes. They come in the same
196three varieties, and leading whitespace is stripped in a convenient way.
197
198 sort <<< """
199 var sub: $x
200 command sub: $(echo hi)
201 expression sub: $[x + 3]
202 """
203 # =>
204 # command sub: hi
205 # expression sub: 9
206 # var sub: 6
207
208 sort <<< '''
209 $2.00 # literal $, no interpolation
210 $1.99
211 '''
212 # =>
213 # $1.99
214 # $2.00
215
216 sort <<< u'''
217 C\tD
218 A\tB
219 ''' # b''' strings also supported
220 # =>
221 # A B
222 # C D
223
224(Use multiline strings instead of shell's [here docs]($xref:here-doc).)
225
226### Three Kinds of Substitution
227
228YSH has syntax for 3 types of substitution, all of which start with `$`. That
229is, you can convert any of these things to a **string**:
230
2311. Variables
2322. The output of commands
2333. The value of expressions
234
235#### Variable Sub
236
237The syntax `$a` or `${a}` converts a variable to a string:
238
239 var a = 'ale'
240 echo $a # => ale
241 echo _${a}_ # => _ale_
242 echo "_ $a _" # => _ ale _
243
244The shell operator `:-` is occasionally useful in YSH:
245
246 echo ${not_defined:-'default'} # => default
247
248#### Command Sub
249
250The `$(echo hi)` syntax runs a command and captures its `stdout`:
251
252 echo $(hostname) # => example.com
253 echo "_ $(hostname) _" # => _ example.com _
254
255#### Expression Sub
256
257The `$[myexpr]` syntax evaluates an expression and converts it to a string:
258
259 echo $[a] # => ale
260 echo $[1 + 2 * 3] # => 7
261 echo "_ $[1 + 2 * 3] _" # => _ 7 _
262
263<!-- TODO: safe substitution with "$[a]"html -->
264
265### Arrays of Strings: Globs, Brace Expansion, Splicing, and Splitting
266
267There are four constructs that evaluate to a **list of strings**, rather than a
268single string.
269
270#### Globs
271
272Globs like `*.py` evaluate to a list of files.
273
    touch foo.py bar.py  # create the files
    write *.py           # matches are sorted by name
    # =>
    # bar.py
    # foo.py
279
280If no files match, it evaluates to an empty list (`[]`).
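
If you know Python, `glob.glob()` is a useful comparison: it also returns an
empty list when nothing matches. (Traditional shells, by default, pass an
unmatched pattern through as a literal string instead.) The sketch below is
Python, not YSH, and the temporary directory and file names are made up for
illustration.

```python
import glob
import os
import tempfile

# Work in a throwaway directory so the result doesn't depend on your files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ('foo.py', 'bar.py'):
        open(os.path.join(tmp, name), 'w').close()

    # glob.glob() makes no ordering promise, so sort the names explicitly.
    matches = sorted(
        os.path.basename(p) for p in glob.glob(os.path.join(tmp, '*.py'))
    )
    no_matches = glob.glob(os.path.join(tmp, '*.nomatch'))

print(matches)     # ['bar.py', 'foo.py']
print(no_matches)  # []
```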
281
282#### Brace Expansion
283
284The brace expansion mini-language lets you write strings without duplication:
285
286 write {alice,bob}@example.com
287 # =>
288 # alice@example.com
289 # bob@example.com
290
291#### Splicing
292
293The `@` operator splices an array into a command:
294
295 var myarray = :| ale bean |
296 write S @myarray E
297 # =>
298 # S
299 # ale
300 # bean
301 # E
302
303You also have `@[]` to splice an expression that evaluates to a list:
304
305 write -- @[split('ale bean')]
306 # =>
307 # ale
308 # bean
309
310Each item will be converted to a string.
311
312#### Split Command Sub / Split Builtin Sub
313
314There's also a variant of *command sub* that decodes J8 lines into a sequence
315of strings:
316
317 write @(seq 3) # write is passed 3 args
318 # =>
319 # 1
320 # 2
321 # 3
322
323## Command Language: I/O, Control Flow, Abstraction
324
325### Simple Commands
326
327A simple command is a space-separated list of words. YSH looks up the first
328word to determine if it's a builtin command, or a user-defined `proc`.
329
330 echo 'hello world' # The shell builtin 'echo'
331
332 proc greet (name) { # Define a unit of code
333 echo "hello $name"
334 }
335
336 # The first word now resolves to the proc you defined
337 greet alice # => hello alice
338
339If it's neither, then it's assumed to be an external command:
340
341 ls -l /tmp # The external 'ls' command
342
343Commands accept traditional string arguments, as well as typed arguments in
344parentheses:
345
346 # 'write' is a string arg; 'x' is a typed expression arg
347 json write (x)
348
349<!--
350Block args are a special kind of typed arg:
351
352 cd /tmp {
353 echo $PWD
354 }
355-->
356
357### Redirects
358
359You can **redirect** `stdin` and `stdout` of simple commands:
360
361 echo hi > tmp.txt # write to a file
362 sort < tmp.txt
363
364Here are the most common idioms for using `stderr` (identical to shell):
365
366 ls /tmp 2>errors.txt
367 echo 'fatal error' >&2
368
369### ARGV and ENV
370
371The `ARGV` list holds the arguments passed to the shell:
372
373 var num_args = len(ARGV)
374 ls /tmp @ARGV # pass shell's arguments through
375
376---
377
378You can add to the environment of a new process with a *prefix binding*:
379
380 PYTHONPATH=vendor ./demo.py
381
382The `ENV` object reflects the current environment:
383
384 echo $[ENV.PYTHONPATH] # => vendor
385
386### Pipelines
387
Pipelines are a powerful method of manipulating data streams:

    ls | wc -l                       # count files in this directory
    find /bin -type f | xargs wc -l  # count lines in the files of a subtree
392
393The stream may contain (lines of) text, binary data, JSON, TSV, and more.
394Details below.
395
396### Multi-line Commands
397
398The `...` prefix lets you write long commands, pipelines, and `&&` chains
399without `\` line continuations.
400
401 ... find /bin # traverse this directory and
402 -type f -a -executable # print executable files
403 | sort -r # reverse sort
404 | head -n 30 # limit to 30 files
405 ;
406
407When this mode is active:
408
409- A single newline behaves like a space
410- A blank line (two newlines in a row) is illegal, but a line that has only a
411 comment is allowed. This prevents confusion if you forget the `;`
412 terminator.
413
414### `var`, `setvar`, `const` to Declare and Mutate
415
416Constants can't be modified:
417
418 const myconst = 'mystr'
419 # setvar myconst = 'foo' would be an error
420
421Modify variables with the `setvar` keyword:
422
423 var num_beans = 12
424 setvar num_beans = 13
425
426A more complex example:
427
428 var d = {name: 'bob', age: 42} # dict literal
429 setvar d.name = 'alice' # d.name is a synonym for d['name']
430 echo $[d.name] # => alice
431
432That's most of what you need to know about assignments. Advanced users may
433want to use `setglobal` or `call myplace->setValue(42)` in certain situations.
434
435<!--
436 var g = 1
437 var h = 2
438 proc demo(:out) {
439 setglobal g = 42
440 setref out = 43
441 }
442 demo :h # pass a reference to h
443 echo "$g $h" # => 42 43
444-->
445
446More info: [Variable Declaration and Mutation](variables.html).
447
448### `for` Loop
449
450#### Words
451
452Shell-style for loops iterate over **words**:
453
454 for word in 'oils' $num_beans {pea,coco}nut {
455 echo $word
456 }
457 # =>
458 # oils
459 # 13
460 # peanut
461 # coconut
462
463You can ask for the loop index with `i,`:
464
465 for i, word in README.md *.py {
466 echo "$i - $word"
467 }
468 # =>
469 # 0 - README.md
470 # 1 - __init__.py
471
472#### Typed Data
473
To iterate over typed data, use parentheses around an **expression**. The
expression should evaluate to an integer `Range`, `List`, `Dict`, or `io.stdin`.
476
477Range:
478
479 for i in (3 ..< 5) { # range operator ..<
480 echo "i = $i"
481 }
482 # =>
483 # i = 3
484 # i = 4
485
486List:
487
488 var foods = ['ale', 'bean']
489 for item in (foods) {
490 echo $item
491 }
492 # =>
493 # ale
494 # bean
495
496Again, you can request the index with `for i, item in ...`.
497
498---
499
500There are **three** ways of iterating over a `Dict`:
501
    var mydict = {pea: 42, nut: 10}
    for key in (mydict) {
      echo $key
    }
    # =>
    # pea
    # nut

    for key, value in (mydict) {
      echo "$key - $value"
    }
    # =>
    # pea - 42
    # nut - 10

    for i, key, value in (mydict) {
      echo "$i - $key - $value"
    }
    # =>
    # 0 - pea - 42
    # 1 - nut - 10
523
524That is, if you ask for two things, you'll get the key and value. If you ask
525for three, you'll also get the index.
526
(One way to think of it: `for` loops in YSH have the functionality of Python's
`enumerate()`, `items()`, `keys()`, and `values()`.)
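
For readers coming from Python, here is roughly the same trio of loops. This
is a Python sketch for comparison, not YSH code; it reuses the `mydict` values
from above.

```python
mydict = {'pea': 42, 'nut': 10}

# One name: keys only, like `for key in (mydict)`
keys = [key for key in mydict]

# Two names: key and value, like `for key, value in (mydict)`
pairs = [(key, value) for key, value in mydict.items()]

# Three names: index, key, and value, like `for i, key, value in (mydict)`
triples = [(i, key, value) for i, (key, value) in enumerate(mydict.items())]

print(keys)     # ['pea', 'nut']
print(pairs)    # [('pea', 42), ('nut', 10)]
print(triples)  # [(0, 'pea', 42), (1, 'nut', 10)]
```

In Python you pick the helper function; in YSH the number of loop variables
makes the choice for you.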
529
530---
531
532The `io.stdin` object iterates over lines:
533
534 for line in (io.stdin) {
535 echo $line
536 }
537 # lines are buffered, so it's much faster than `while read --raw-line`
538
539<!--
540TODO: Str loop should give you the (UTF-8 offset, rune)
541Or maybe just UTF-8 offset? Decoding errors could be exceptions, or Unicode
542replacement.
543-->
544
545### `while` Loop
546
547While loops can use a **command** as the termination condition:
548
549 while test --file lock {
550 sleep 1
551 }
552
Or an **expression**, which is surrounded by `()`:
554
555 var i = 3
556 while (i < 6) {
557 echo "i = $i"
558 setvar i += 1
559 }
560 # =>
561 # i = 3
562 # i = 4
563 # i = 5
564
565### Conditionals
566
567#### `if elif`
568
569If statements test the exit code of a command, and have optional `elif` and
570`else` clauses:
571
572 if test --file foo {
573 echo 'foo is a file'
574 rm --verbose foo # delete it
575 } elif test --dir foo {
576 echo 'foo is a directory'
577 } else {
578 echo 'neither'
579 }
580
581Invert the exit code with `!`:
582
583 if ! grep alice /etc/passwd {
584 echo 'alice is not a user'
585 }
586
587As with `while` loops, the condition can also be an **expression** wrapped in
588`()`:
589
590 if (num_beans > 0) {
591 echo 'so many beans'
592 }
593
594 var done = false
595 if (not done) { # negate with 'not' operator (contrast with !)
596 echo "we aren't done"
597 }
598
599#### `case`
600
601The case statement is a series of conditionals and executable blocks. The
602condition can be either an unquoted glob pattern like `*.py`, an eggex pattern
603like `/d+/`, or a typed expression like `(42)`:
604
605 var s = 'README.md'
606 case (s) {
607 *.py { echo 'Python' }
608 *.cc | *.h { echo 'C++' }
609 * { echo 'Other' }
610 }
611 # => Other
612
613 case (s) {
614 / dot* '.md' / { echo 'Markdown' }
615 (30 + 12) { echo 'the integer 42' }
616 (else) { echo 'neither' }
617 }
618 # => Markdown
619
620
621<!--
622(Shell style like `if foo; then ... fi` and `case $x in ... esac` is also
623legal, but discouraged in YSH code.)
624-->
625
626### Error Handling
627
628If statements are also used for **error handling**. Builtins and external
629commands use this style:
630
631 if ! test -d /bin {
632 echo 'not a directory'
633 }
634
635 if ! cp foo /tmp {
636 echo 'error copying' # any non-zero status
637 }
638
639Procs use this style (because of shell's *disabled `errexit` quirk*):
640
641 try {
642 myproc
643 }
644 if failed {
645 echo 'failed'
646 }
647
648For a complete list of examples, see [YSH Error
649Handling](ysh-error.html). For design goals and a reference, see [YSH
650Fixes Shell's Error Handling](error-handling.html).
651
652#### exit, break, continue, return
653
654The `exit` **keyword** exits a process. (It's not a shell builtin.)
655
656The other 3 control flow keywords behave like they do in Python and JavaScript.
657
658### Shell-like `proc`
659
660You can define units of code with the `proc` keyword. A `proc` is like a
661*procedure* or *process*.
662
663 proc mycopy (src, dest) {
664 ### Copy verbosely
665
666 mkdir -p $dest
667 cp --verbose $src $dest
668 }
669
670The `###` line is a "doc comment". Simple procs like this are invoked like a
671shell command:
672
673 touch log.txt
674 mycopy log.txt /tmp # first word 'mycopy' is a proc
675
676Procs have many features, including **four** kinds of arguments:
677
6781. Word args (which are always strings)
6791. Typed, positional args
6801. Typed, named args
6811. A final block argument, which may be written with `{ }`.
682
683At the call site, they can look like any of these forms:
684
685 ls /tmp # word arg
686
687 json write (d) # word arg, then positional arg
688
689 try {
690 error 'failed' (status=9) # word arg, then named arg
691 }
692
693 cd /tmp { echo $PWD } # word arg, then block arg
694
695 pp value ([1, 2]) # positional, typed arg
696
697<!-- TODO: lazy arg list: ls8 | where [age > 10] -->
698
699At the definition site, the kinds of parameters are separated with `;`, similar
700to the Julia language:
701
702 proc p2 (word1, word2; pos1, pos2, ...rest_pos) {
703 echo "$word1 $word2 $[pos1 + pos2]"
704 json write (rest_pos)
705 }
706
707 proc p3 (w ; ; named1, named2, ...rest_named; block) {
708 echo "$w $[named1 + named2]"
709 call io->eval(block)
710 json write (rest_named)
711 }
712
713 proc p4 (; ; ; block) {
714 call io->eval(block)
715 }
716
717YSH also has Python-like functions defined with `func`. These are part of the
718expression language, which we'll see later.
719
720For more info, see the [Guide to Procs and Funcs](proc-func.html).
721
722### Ruby-like Block Arguments
723
724A block is a value of type `Command`. For example, `shopt` is a builtin
725command that takes a block argument:
726
727 shopt --unset errexit { # ignore errors
728 cp ale /tmp
729 cp bean /bin
730 }
731
732In this case, the block doesn't form a new scope.
733
734#### Block Scope / Closures
735
736However, by default, block arguments capture the frame they're defined in.
737This means they obey *lexical scope*.
738
739Consider this proc, which accepts a block, and runs it:
740
741 proc do-it (; ; ; block) {
742 call io->eval(block)
743 }
744
745When the block arg is passed, the enclosing stack frame is captured. This
746means that code inside the block can use variables in the captured frame:
747
748 var x = 42
749 do-it {
750 echo "x = $x" # outer x is visible LATER, when the block is run
751 }
752
753- [Feature Index: Closures](ref/feature-index.html#Closures)
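
Python closures behave in a comparable way, which may help if the idea of
capturing a frame is unfamiliar. In this Python sketch (not YSH), `do_it` is a
stand-in for the `do-it` proc, and a `lambda` plays the role of the block.

```python
def do_it(block):
    # Call the "block" later, from inside a different function
    return block()

x = 42
message = do_it(lambda: f"x = {x}")  # outer x is looked up when the lambda runs
print(message)  # x = 42
```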
754
755### Builtin Commands
756
757**Shell builtins** like `cd` and `read` are the "standard library" of the
758command language. Each one takes various flags:
759
760 cd -L . # follow symlinks
761
762 echo foo | read --all # read all of stdin
763
764Here are some categories of builtin:
765
766- I/O: `echo write read`
767- File system: `cd test`
768- Processes: `fork wait forkwait exec`
769- Interpreter settings: `shopt shvar`
770- Meta: `command builtin runproc type eval`
771
772<!-- TODO: Link to a comprehensive list of builtins -->
773
774## Expression Language: Python-like Types
775
776YSH expressions look and behave more like Python or JavaScript than shell. For
777example, we write `if (x < y)` instead of `if [ $x -lt $y ]`. Expressions are
778usually surrounded by `( )`.
779
At runtime, variables like `x` and `y` are bound to **typed data**, like
integers, floats, strings, lists, and dicts.
782
783<!--
784[Command vs. Expression Mode](command-vs-expression-mode.html) may help you
785understand how YSH is parsed.
786-->
787
788### Python-like `func`
789
790At the end of the *Command Language*, we saw that procs are shell-like units of
791code. YSH also has Python-like **functions**, which are different than
792`procs`:
793
794- They're defined with the `func` keyword.
795- They're called in expressions, not in commands.
796- They're **pure**, and live in the **interior** of a process.
797 - In contrast, procs usually perform I/O, and have **exterior** boundaries.
798
799The simplest function is:
800
801 func identity(x) {
802 return (x) # parens required for typed return
803 }
804
805A more complex pure function:
806
807 func myRepeat(s, n; special=false) { # positional; named params
808 var parts = []
809 for i in (0 ..< n) {
810 append $s (parts)
811 }
812 var result = join(parts)
813
814 if (special) {
815 return ("$result !!")
816 } else {
817 return (result)
818 }
819 }
820
821 echo $[myRepeat('z', 3)] # => zzz
822
823 echo $[myRepeat('z', 3, special=true)] # => zzz !!
824
825A function that mutates its argument:
826
827 func popTwice(mylist) {
828 call mylist->pop()
829 call mylist->pop()
830 }
831
832 var mylist = [3, 4]
833
834 # The call keyword is an "adapter" between commands and expressions,
835 # like the = keyword.
836 call popTwice(mylist)
837
838
839Funcs are named using `camelCase`, while procs use `kebab-case`. See the
840[Style Guide](style-guide.html) for more conventions.
841
842#### Builtin Functions
843
In addition to builtin commands, YSH has Python-like builtin **functions**.
These are like the "standard library" for the expression language. Examples:
846
847- Functions that take multiple types: `len() type()`
848- Conversions: `bool() int() float() str() list() ...`
849- Explicit word evaluation: `split() join() glob() maybe()`
850
851<!-- TODO: Make a comprehensive list of func builtins. -->
852
853
854### Data Types: `Int`, `Str`, `List`, `Dict`, `Obj`, ...
855
856YSH has data types, each with an expression syntax and associated methods.
857
858### Methods
859
860Non-mutating methods are looked up with the `.` operator:
861
862 var line = ' ale bean '
863 var caps = line.trim().upper() # 'ALE BEAN'
864
865Mutating methods are looked up with a thin arrow `->`:
866
867 var foods = ['ale', 'bean']
868 var last = foods->pop() # bean
869 write @foods # => ale
870
871You can ignore the return value with the `call` keyword:
872
873 call foods->pop()
874
875That is, YSH adds mutable data structures to shell, so we have a special syntax
876for mutation.
877
878---
879
880You can also chain functions with a fat arrow `=>`:
881
882 var trimmed = line.trim() => upper() # 'ALE BEAN'
883
884The `=>` operator allows functions to appear in a natural left-to-right order,
885like methods.
886
    # list() is a free function taking one arg
    # join() is a free function taking two args
    var x = {k1: 42, k2: 43} => list() => join('/')  # 'k1/k2'
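
For comparison, the equivalent Python reads inside-out: `list()` is written in
the middle, even though it runs first. This is a Python sketch, not YSH. In
Python, `list()` applied to a dict yields its keys.

```python
d = {'k1': 42, 'k2': 43}

# list(d) runs first and yields the keys; join() then runs on that result
x = '/'.join(list(d))
print(x)  # k1/k2
```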
890
891---
892
893Now let's go through the data types in YSH. We'll show the syntax for
894literals, and what **methods** they have.
895
896#### Null and Bool
897
YSH uses JavaScript-like spellings for these three "atoms":
899
900 var x = null
901
902 var b1, b2 = true, false
903
904 if (b1) {
905 echo 'yes'
906 } # => yes
907
908
909#### Int
910
911There are many ways to write integers:
912
913 var small, big = 42, 65_536
914 echo "$small $big" # => 42 65536
915
916 var hex, octal, binary = 0x0001_0000, 0o755, 0b0001_0101
917 echo "$hex $octal $binary" # => 65536 493 21
918
919<!--
920"Runes" are integers that represent Unicode code points. They're not common in
921YSH code, but can make certain string algorithms more readable.
922
923 # Pound rune literals are similar to ord('A')
924 const a = #'A'
925
926 # Backslash rune literals can appear outside of quotes
927 const newline = \n # Remember this is an integer
928 const backslash = \\ # ditto
929
930 # Unicode rune literal is syntactic sugar for 0x3bc
931 const mu = \u{3bc}
932
933 echo "chars $a $newline $backslash $mu" # => chars 65 10 92 956
934-->
935
936#### Float
937
938Floats are written with a decimal point:
939
940 var big = 3.14
941
942You can use scientific notation, as in Python:
943
944 var small = 1.5e-10
945
946#### Str
947
948See the section above on *Three Kinds of String Literals*. It described
949`'single quoted'`, `"double ${quoted}"`, and `u'J8-style\n'` strings; as well
950as their multiline variants.
951
Strings are UTF-8 encoded in memory, like strings in the [Go
language](https://golang.org). There isn't a separate string and unicode type,
as there was in Python 2.
955
956Strings are **immutable**, as in Python and JavaScript. This means they only
957have **transforming** methods:
958
959 var x = s.trim()
960
961Other methods:
962
963- `trimLeft() trimRight()`
964- `trimPrefix() trimSuffix()`
965- `upper() lower()`
966- `search() leftMatch()` - pattern matching
967- `replace() split()`
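
Several of these have close relatives among Python's `str` methods. The
pairing below is an informal comparison in Python, not YSH; the YSH methods may
differ in details such as optional arguments, so check the reference before
relying on it.

```python
line = '  ale bean  '

trimmed = line.strip()                         # compare trim()
left = line.lstrip()                           # compare trimLeft()
right = line.rstrip()                          # compare trimRight()
no_prefix = 'ale-bean'.removeprefix('ale-')    # compare trimPrefix(); Python 3.9+
no_suffix = 'ale-bean'.removesuffix('-bean')   # compare trimSuffix(); Python 3.9+
shout = trimmed.upper()                        # compare upper()
swapped = trimmed.replace('ale', 'pea')        # compare replace()
words = trimmed.split(' ')                     # compare split()

# Every call returned a new value; the original string is unchanged.
print(repr(line))  # '  ale bean  '
```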
968
969#### List (and Arrays)
970
971All lists can be expressed with Python-like literals:
972
973 var foods = ['ale', 'bean', 'corn']
974 var recursive = [1, [2, 3]]
975
As a special case, lists of strings are called **arrays**. It's often more
convenient to write them with shell-like literals:
978
979 # No quotes or commas
980 var foods = :| ale bean corn |
981
982 # You can use the word language here
983 var other = :| foo $s *.py {alice,bob}@example.com |
984
985Lists are **mutable**, as in Python and JavaScript. So they mainly have
986mutating methods:
987
988 call foods->reverse()
989 write -- @foods
990 # =>
991 # corn
992 # bean
993 # ale
994
995#### Dict
996
997Dicts use syntax that's like JavaScript. Here's a dict literal:
998
999 var d = {
1000 name: 'bob', # unquoted keys are allowed
1001 age: 42,
1002 'key with spaces': 'val'
1003 }
1004
1005You can use either `[]` or `.` to retrieve a value, given a key:
1006
1007 var v1 = d['name']
1008 var v2 = d.name # shorthand for the above
1009 var v3 = d['key with spaces'] # no shorthand for this
1010
1011(If the key doesn't exist, an error is raised.)
1012
You can change Dict values with the same 2 syntaxes:

    setvar d['name'] = 'other'
    setvar d.name = 'fun'
1017
1018---
1019
1020If you want to compute a key name, use an expression inside `[]`:
1021
1022 var key = 'alice'
1023 var d2 = {[key ++ '_z']: 'ZZZ'} # Computed key name
1024 echo $[d2.alice_z] # => ZZZ
1025
If you omit the value, it's taken from a variable of the same name:
1027
1028 var d3 = {key} # like {key: key}
1029 echo "name is $[d3.key]" # => name is alice
1030
1031More examples:
1032
1033 var empty = {}
1034 echo $[len(empty)] # => 0
1035
The `keys()` and `values()` functions return new `List` objects:

    var keys = keys(d2)      # ['alice_z']
    var vals = values(d3)    # ['alice']
1040
1041#### Obj
1042
1043YSH has an `Obj` type that bundles **code** and **data**. (In contrast, JSON
1044messages are pure data, not objects.)
1045
1046The main purpose of objects is **polymorphism**:
1047
1048 var obj = makeMyObject(42) # I don't know what it looks like inside
1049
1050 echo $[obj.myMethod()] # But I can perform abstract operations
1051
1052 call obj->mutatingMethod() # Mutation is considered special, with ->
1053
1054YSH objects are similar to Lua and JavaScript objects. They can be thought of
1055as a linked list of `Dict` instances.
1056
1057Or you can say they have a `Dict` of properties, and a recursive "prototype
1058chain" that is also an `Obj`.
1059
1060- [Feature Index: Objects](ref/feature-index.html#Objects)
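
To make the "linked list of `Dict` instances" picture concrete, here is a
small Python model of prototype-style lookup. It's a sketch of the idea only:
the class `Obj`, its `lookup` method, and the `describe` property are
hypothetical names, and YSH's real implementation and API may differ.

```python
class Obj:
    """Hypothetical model: a dict of properties plus an optional prototype."""

    def __init__(self, props, prototype=None):
        self.props = props
        self.prototype = prototype

    def lookup(self, name):
        # Walk the chain like a linked list: own properties first, then each
        # prototype in turn.
        node = self
        while node is not None:
            if name in node.props:
                return node.props[name]
            node = node.prototype
        raise KeyError(name)


# Shared "methods" live on the prototype; per-instance data lives on the object.
methods = Obj({'describe': lambda self: f"value is {self.lookup('x')}"})
instance = Obj({'x': 42}, prototype=methods)

description = instance.lookup('describe')(instance)
print(description)  # value is 42
```

The instance holds only data (`x`), while the code (`describe`) is found one
link further along the chain.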
1061
1062### `Place` type / "out params"
1063
1064The `read` builtin can set an implicit variable `_reply`:
1065
1066 whoami | read --all # sets _reply
1067
Or you can pass a `value.Place`, created with `&`:
1069
1070 var x # implicitly initialized to null
1071 whoami | read --all (&x) # mutate this "place"
1072 echo who=$x # => who=andy
1073
1074<!--
1075#### Quotation Types: value.Command (Block) and value.Expr
1076
1077These types are for reflection on YSH code. Most YSH programs won't use them
1078directly.
1079
1080- `Command`: an unevaluated code block.
1081 - rarely-used literal: `^(ls | wc -l)`
1082- `Expr`: an unevaluated expression.
1083 - rarely-used literal: `^[42 + a[i]]`
1084-->
1085
1086### Operators
1087
1088YSH operators are generally the same as in Python:
1089
1090 if (10 <= num_beans and num_beans < 20) {
1091 echo 'enough'
1092 } # => enough
1093
1094YSH has a few operators that aren't in Python. Equality can be approximate or
1095exact:
1096
    var n = ' 42 '
    if (n ~== 42) {
      echo 'equal after stripping whitespace and type conversion'
    }  # => equal after stripping whitespace and type conversion

    if (n === 42) {
      echo "not reached because strings and ints aren't equal"
    }
1105
1106<!-- TODO: is n === 42 a type error? -->
1107
1108Pattern matching can be done with globs (`~~` and `!~~`)
1109
1110 const filename = 'foo.py'
1111 if (filename ~~ '*.py') {
1112 echo 'Python'
1113 } # => Python
1114
1115 if (filename !~~ '*.sh') {
1116 echo 'not shell'
1117 } # => not shell
1118
1119or regular expressions (`~` and `!~`). See the Eggex section below for an
1120example of the latter.
1121
1122Concatenation is `++` rather than `+` because it avoids confusion in the
1123presence of type conversion:
1124
    var n = '42' + 1             # string plus int does implicit conversion
    echo $n                      # => 43

    var y = 'ale ' ++ "bean $n"  # concatenation
    echo $y                      # => ale bean 43
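
To see the confusion that a single overloaded operator would invite, consider
two strings that both look like numbers. The Python sketch below (not YSH)
spells out the two plausible results; having `+` for arithmetic and `++` for
concatenation means the reader never has to guess which one is meant.

```python
a, b = '42', '1'

as_numbers = int(a) + int(b)   # the arithmetic reading
as_strings = a + b             # the concatenation reading

print(as_numbers)  # 43
print(as_strings)  # 421
```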
1130
1131<!--
1132TODO: change example above
1133 var n = '42' + 1 # string plus int does implicit conversion
1134-->
1135
1136<!--
1137
1138#### Summary of Operators
1139
1140- Arithmetic: `+ - * / // %` and `**` for exponentatiation
1141 - `/` always yields a float, and `//` is integer division
1142- Bitwise: `& | ^ ~`
1143- Logical: `and or not`
1144- Comparison: `== < > <= >= in 'not in'`
1145 - Approximate equality: `~==`
1146 - Eggex and glob match: `~ !~ ~~ !~~`
1147- Ternary: `1 if x else 0`
1148- Index and slice: `mylist[3]` and `mylist[1:3]`
1149 - `mydict->key` is a shortcut for `mydict['key']`
1150- Function calls
1151 - free: `f(x, y)`
1152 - transformations and chaining: `s => startWith('prefix')`
1153 - mutating methods: `mylist->pop()`
1154- String and List: `++` for concatenation
1155 - This is a separate operator because the addition operator `+` does
1156 string-to-int conversion
1157
1158TODO: What about list comprehensions?
1159-->
1160
1161### Egg Expressions (YSH Regexes)
1162
1163An *Eggex* is a YSH expression that denotes a regular expression. Eggexes
1164translate to POSIX ERE syntax, for use with tools like `egrep`, `awk`, and `sed
1165--regexp-extended` (GNU only).
1166
1167They're designed to be readable and composable. Example:
1168
    var D = / digit{1,3} /
    var ip_pattern = / D '.' D '.' D '.' D /

    var z = '192.168.0.1'
    if (z ~ ip_pattern) {           # Use the ~ operator to match
      echo "$z looks like an IP address"
    }  # => 192.168.0.1 looks like an IP address

    if (z !~ / '.255' %end /) {
      echo "doesn't end with .255"
    }  # => doesn't end with .255
1180
1181See the [Egg Expressions doc](eggex.html) for details.
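
If eggex syntax is new to you, it may help to see the same pattern assembled
from plain regex strings. This Python sketch (not YSH) assumes `digit` can be
approximated by `[0-9]`; the exact ERE text that YSH generates may differ.

```python
import re

# A reusable piece, like the eggex variable D
D = r'[0-9]{1,3}'

# Four copies of D separated by literal (escaped) dots
ip_pattern = r'\.'.join([D] * 4)

z = '192.168.0.1'
looks_like_ip = re.search(ip_pattern, z) is not None
ends_with_255 = re.search(r'\.255$', z) is not None

print(looks_like_ip)  # True
print(ends_with_255)  # False
```

With string-based composition you have to remember to escape the dot yourself.
In an eggex, the quoted `'.'` is always a literal.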
1182
1183## Interlude
1184
Before moving on to other YSH features, let's review what we've seen.
1186
1187### Three Interleaved Languages
1188
1189Here are the languages we saw in the last 3 sections:
1190
11911. **Words** evaluate to a string, or list of strings. This includes:
1192 - literals like `'mystr'`
1193 - substitutions like `${x}` and `$(hostname)`
1194 - globs like `*.sh`
11952. **Commands** are used for
1196 - I/O: pipelines, builtins like `read`
1197 - control flow: `if`, `for`
1198 - abstraction: `proc`
11993. **Expressions** on typed data are borrowed from Python, with influence from
1200 JavaScript:
1201 - Lists: `['ale', 'bean']` or `:| ale bean |`
1202 - Dicts: `{name: 'bob', age: 42}`
1203 - Functions: `split('ale bean')` and `join(['pea', 'nut'])`
1204
1205### How Do They Work Together?
1206
1207Here are two examples:
1208
(1) In this *command*, there are **four** *words*. The fourth word is an
*expression sub* `$[]`.
1211
1212 write hello $name $[d['age'] + 1]
1213 # =>
1214 # hello
1215 # world
1216 # 43
1217
1218(2) In this assignment, the *expression* on the right hand side of `=`
1219concatenates two strings. The first string is a literal, and the second is a
1220*command sub*.
1221
1222 var food = 'ale ' ++ $(echo bean | tr a-z A-Z)
1223 write $food # => ale BEAN
1224
1225So words, commands, and expressions are **mutually recursive**. If you're a
1226conceptual person, skimming [Syntactic Concepts](syntactic-concepts.html) may
1227help you understand this on a deeper level.

<!--
One way to think about these sublanguages is to note that the `|` character
means something different in each context:

- In the command language, it's the pipeline operator, as in `ls | wc -l`
- In the word language, it's only valid in a literal string like `'|'`, `"|"`,
  or `\|`. (It's also used in `${x|html}`, which formats a string.)
- In the expression language, it's the bitwise OR operator, as in Python and
  JavaScript.
-->

---

Let's move on from talking about **code**, and talk about **data**.

## Data Notation / Interchange Formats

In YSH, you can read and write data languages based on [JSON]($xref). This is
a primary way to exchange messages between Unix processes.

Instead of being **executed**, like our command/word/expression languages,
these languages are **parsed** as data structures.

<!-- TODO: Link to slogans, fallacies, and concepts -->

### UTF-8

UTF-8 is the foundation of our data notation. It's the most common Unicode
encoding, and the most consistent:

    var x = u'hello \u{1f642}'  # store a UTF-8 string in memory
    echo $x                     # send UTF-8 to stdout

hello &#x1f642;

<!-- TODO: there's a runes() iterator which gives integer offsets, usable for
slicing -->

### JSON

JSON messages are UTF-8 text. You can encode and decode JSON with functions
(`func` style):

    var message = toJson({x: 42})        # => (Str) '{"x": 42}'
    var mydict = fromJson('{"x": 42}')   # => (Dict) {x: 42}

Or with commands (`proc` style):

    json write ({x: 42}) > foo.json      # writes '{"x": 42}'

    json read (&mydict) < foo.json       # create var
    = mydict                             # => (Dict) {x: 42}

### J8 Notation

But JSON isn't quite enough for a principled shell.

- Traditional Unix tools like `grep` and `awk` operate on streams of **lines**.
  In YSH, to avoid data-dependent bugs, we want a reliable way of **quoting**
  lines.
- In YSH, we also want to represent **binary** data, not just text. When you
  read a Unix file, it may or may not be text.

So we borrow JSON-style strings, and create [J8 Notation][]. Slogans:

- *Deconstructing and Augmenting JSON*
- *Fixing the JSON-Unix Mismatch*

[J8 Notation]: $xref:j8-notation

#### J8 Lines

*J8 Lines* are a building block of J8 Notation. If you have a file
`lines.txt`:

<pre>
  doc/hello.md
 "doc/with spaces.md"
b'doc/with byte \yff.md'
</pre>

Then you can decode it with *split command sub* (mentioned above):

    var decoded = @(cat lines.txt)

This file has:

1. An unquoted string
1. A JSON string with `"double quotes"`
1. A J8-style string: `u'unicode'` or `b'bytes'`

<!--
TODO: fromJ8Line() toJ8Line()
-->

#### JSON8 is Tree-Shaped

JSON8 is just like JSON, but it allows J8-style strings:

<pre>
{ "foo":  "hi \uD83D\uDE42"}  # valid JSON, and valid JSON8
{u'foo': u'hi \u{1F642}'   }  # valid JSON8, with J8-style strings
</pre>

<!--
In addition to strings and lines, you can write and read **tree-shaped** data
as [JSON][]:

    var d = {key: 'value'}
    json write (d)                  # dump variable d as JSON
    # =>
    # {
    #   "key": "value"
    # }

    echo '["ale", 42]' > example.json

    json read (&d2) < example.json  # parse JSON into var d2
    pp (d2)                         # pretty print it
    # => (List) ['ale', 42]

[JSON][] will lose information when strings have binary data, but the slight
[JSON8]($xref) upgrade won't:

    var b = {binary: $'\xff'}
    json8 write (b)
    # =>
    # {
    #   "binary": b'\yff'
    # }
-->

[JSON]: $xref

#### TSV8 is Table-Shaped

(TODO: not yet implemented.)

YSH supports data notation for tables:

1. Plain TSV files, which are untyped. Every column has string data.
   - Cells with tabs, newlines, and binary data are a problem.
2. Our extension [TSV8]($xref), which supports typed data.
   - It uses JSON notation for booleans, integers, and floats.
   - It uses J8 strings, which can represent any string.

<!-- Figure out the API. Does it work like JSON?

Or I think we just implement
- rows: 'where' or 'filter' (dplyr)
- cols: 'select' conflicts with shell builtin; call it 'cols'?
- sort: 'sort-by' or 'arrange' (dplyr)
- TSV8 <=> sqlite conversion. Are these drivers or what?
  - and then let you pipe output?

Do we also need TSV8 space2tab or something? For writing TSV8 inline.

More later:
- MessagePack (e.g. for shared library extension modules)
  - msgpack read, write? I think user-defined function could be like this?
- SASH: Simple and Strict HTML? For easy processing
-->

## YSH Modules are Files

A module is a **file** of source code, like `lib/myargs.ysh`. The `use`
builtin turns it into an `Obj` that can be invoked and inspected:

    use myargs.ysh

    myargs proc1 --flag val   # module name becomes a prefix, via __invoke__
    var alias = myargs.proc1  # module has attributes

You can import specific names with the `--pick` flag:

    use myargs.ysh --pick p2 p3

    p2
    p3

- [Feature Index: Modules](ref/feature-index.html#Modules)

## The Runtime Shared by OSH and YSH

Although we describe OSH and YSH as different languages, they use the **same**
interpreter under the hood.

This interpreter has many `shopt` booleans to control behavior, like `shopt
--set parse_paren`. The group `shopt --set ysh:all` flips all booleans to make
`bin/osh` behave like `bin/ysh`.

Understanding this common runtime, and its interface to the Unix kernel, will
help you understand **both** languages!

### Interpreter Data Model

The [Interpreter State](interpreter-state.html) doc is under construction. It
will cover:

- The **call stack** for OSH and YSH
  - Each *stack frame* is a `{name -> cell}` mapping.
- Each cell has a **value**, with boolean flags
  - OSH has types `Str BashArray BashAssoc`, and flags `readonly export
    nameref`.
  - YSH has types `Bool Int Float Str List Dict Obj ...`, and the `readonly`
    flag.
- YSH **namespaces**
  - Modules with `use`
  - Builtin functions and commands
  - ENV
- Shell **options**
  - Boolean options with `shopt`: `parse_paren`, `simple_word_eval`, etc.
  - String options with `shvar`: `IFS`, `PATH`
- **Registers** that store interpreter state
  - `$?` and `_error`
  - `$!` for the last PID
  - `_this_dir`
  - `_reply`

### Process Model (the kernel)

The [Process Model](process-model.html) doc is **under construction**. It
will cover:

- Simple Commands, `exec`
- Pipelines. #[shell-the-good-parts](#blog-tag)
- `fork`, `forkwait`
- Command and process substitution
- Related:
  - [Tracing execution in Oils](xtrace.html) (xtrace), which divides
    process-based concurrency into **synchronous** and **async** constructs.
  - [Three Comics For Understanding Unix
    Shell](http://www.oilshell.org/blog/2020/04/comics.html) (blog)

<!--
Process model additions: Capers, Headless shell

some optimizations: See YSH starts fewer processes than other shells.
-->
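
As a small illustration of two items from the list above (pipelines and command
substitution), here is a sketch in plain shell syntax, which should behave the
same in bash, OSH, and YSH. The commands `echo`, `tr`, and `cut` are arbitrary
choices for the example.

```sh
# Pipeline: the parts run concurrently, connected by pipes
echo bean | tr a-z A-Z | cut -c 1-2        # => BE

# Command substitution: the shell captures the pipeline's stdout
# and uses it as part of a word
echo "upper: $(echo bean | tr a-z A-Z)"    # => upper: BEAN
```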

### Advanced: Reflecting on the Interpreter

You can reflect on the interpreter with APIs like `io->eval()` and
`vm.getFrame()`.

- [Feature Index: Reflection](ref/feature-index.html#Reflection)

This allows YSH to be a language for creating other languages. (Ruby, Tcl, and
Racket also have this flavor.)

<!--

TODO: Hay and Awk examples
-->

## Summary

What have we described in this tour?

YSH is a programming language that evolved from Unix shell. But you can
"forget" the bad parts of shell like `[ $x -lt $y ]`.

<!--
Instead, we've shown you shell-like commands, Python-like expressions on typed
data, and Ruby-like command blocks.
-->

Instead, focus on these central concepts:

1. Interleaved *word*, *command*, and *expression* languages.
2. A standard library of *builtin commands*, as well as *builtin functions*
3. Languages for *data*: J8 Notation, including JSON8 and TSV8
4. A *runtime* shared by OSH and YSH

## Appendix

### Related Docs

- [YSH vs. Shell Idioms](idioms.html) - YSH side-by-side with shell.
- [YSH Language Influences](language-influences.html) - In addition to shell,
  Python, and JavaScript, YSH is influenced by Ruby, Perl, Awk, PHP, and more.
- [A Feel For YSH Syntax](syntax-feelings.html) - Some thoughts that may help
  you remember the syntax.
- [YSH Language Warts](warts.html) documents syntax that may be surprising.

### YSH Script Template

YSH can be used to write simple "shell scripts" or longer programs. It has
*procs* and *modules* to help with the latter.

A module is just a file, like this:

```
#!/usr/bin/env ysh
### Deploy script

use $_this_dir/lib/util.ysh --pick log

const DEST = '/tmp/ysh-tour'

proc my-sync(...files) {
  ### Sync files and show which ones

  cp --verbose @files $DEST
}

proc main {
  mkdir -p $DEST

  touch {foo,bar}.py {build,test}.sh

  log "Copying source files"
  my-sync *.py *.sh

  if test --dir /tmp/logs {
    cd /tmp/logs

    log "Copying logs"
    my-sync *.log
  }
}

if is-main {                  # The only top-level statement
  main @ARGV
}
```

<!--
TODO:
- Also show flags parsing?
- Show longer examples where it isn't boilerplate
-->

You wouldn't bother with the boilerplate for something this small. But this
example illustrates the basic idea: the top level often contains these words:
`use`, `const`, `proc`, and `func`.

<!--
TODO: not mentioning __provide__, since it should be optional in the most basic usage?
-->

### YSH Features Not Shown

#### Advanced

These shell features are part of YSH, but aren't shown above:

- The `fork` and `forkwait` builtins, for concurrent execution and subshells.
- Process Substitution: `diff <(sort left.txt) <(sort right.txt)`
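
For example, process substitution lets `diff` compare the output of two
commands without temporary files for the sorted data. This is a minimal
sketch: the file names and contents are made up for illustration, and `<( )`
needs bash, OSH, or YSH rather than a plain POSIX `sh`.

```sh
# Two files with the same lines in a different order (hypothetical data)
echo 'bean' > left.txt
echo 'ale' >> left.txt
echo 'ale' > right.txt
echo 'bean' >> right.txt

# Each <( ) runs a command and passes its output to diff as a file name.
# diff exits 0 and prints nothing when the sorted outputs are identical.
diff <(sort left.txt) <(sort right.txt) && echo 'same lines'   # => same lines
```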

#### Deprecated Shell Constructs

The shared interpreter supports many shell constructs that are deprecated:

- YSH code uses shell's `||` and `&&` in limited circumstances, since `errexit`
  is on by default.
- Assignment builtins like `local` and `declare`. Use YSH keywords.
- Boolean expressions like `[[ x =~ $pat ]]`. Use YSH expressions.
- Shell arithmetic like `$(( x + 1 ))` and `(( y = x ))`. Use YSH expressions.
- The `until` loop can always be replaced with a `while` loop.
- Most of what's in `${}` can be written in other ways. For example
  `${s#/tmp}` could be `s => removePrefix('/tmp')` (TODO).

#### Not Yet Implemented

This document mentions a few constructs that aren't yet implemented. Here's a
summary:

```none
# Unimplemented syntax:

echo ${x|html}               # formatters

echo ${x %.2f}               # statically-parsed printf

var x = "<p>$x</p>"html
echo "<p>$x</p>"html         # tagged string

var x = 15 Mi                # units suffix
```

<!--
- To implement: Capers: stateless coprocesses
-->