1 | ---
|
2 | default_highlighter: oils-sh
|
3 | ---
|
4 |
|
5 | A Tour of YSH
|
6 | =============
|
7 |
|
8 | <!-- author's note about example names
|
9 |
|
10 | - people: alice, bob
|
11 | - nouns: ale, bean
|
12 | - peanut, coconut
|
13 | - 42 for integers
|
14 | -->
|
15 |
|
16 | This doc describes the [YSH]($xref) language from a **clean slate**
|
17 | perspective. We don't assume you know Unix shell, or the compatible
|
18 | [OSH]($xref). But shell users will see the similarity, with simplifications
|
19 | and upgrades.
|
20 |
|
21 | Remember, YSH is for Python and JavaScript users who avoid shell! See the
|
22 | [project FAQ][FAQ] for more color on that.
|
23 |
|
24 | [FAQ]: https://www.oilshell.org/blog/2021/01/why-a-new-shell.html
|
25 |
|
26 | This document is **long** because it demonstrates nearly every feature of the
|
27 | language. You may want to read it in multiple sittings, or read [The Simplest
|
28 | Explanation of
|
29 | Oil](https://www.oilshell.org/blog/2020/01/simplest-explanation.html) first.
|
30 | (Until 2023, YSH was called the "Oil language".)
|
31 |
|
32 |
|
33 | Here's a summary of what follows:
|
34 |
|
35 | 1. YSH has interleaved *word*, *command*, and *expression* languages.
|
36 | - The command language has Ruby-like *blocks*, and the expression language
|
37 | has Python-like *data types*.
|
38 | 2. YSH has both builtin *commands* like `cd /tmp`, and builtin *functions* like
|
39 | `join()`.
|
40 | 3. Languages for *data*, like [JSON][], are complementary to YSH code.
|
41 | 4. OSH and YSH share both an *interpreter data model* and a *process model*
|
42 | (provided by the Unix kernel). Understanding these common models will make
|
43 | you both a better shell user and YSH user.
|
44 |
|
45 | Keep these points in mind as you read the details below.
|
46 |
|
47 | [JSON]: https://json.org
|
48 |
|
49 | <div id="toc">
|
50 | </div>
|
51 |
|
52 | ## Preliminaries
|
53 |
|
54 | Start YSH just like you start bash or Python:
|
55 |
|
56 | <!-- oils-sh below skips code block extraction, since it doesn't run -->
|
57 |
|
58 | ```sh-prompt
|
59 | bash$ ysh # assuming it's installed
|
60 |
|
61 | ysh$ echo 'hello world' # command typed into YSH
|
62 | hello world
|
63 | ```
|
64 |
|
65 | In the sections below, we'll save space by showing output **in comments**, with
|
66 | `=>`:
|
67 |
|
68 | echo 'hello world' # => hello world
|
69 |
|
70 | Multi-line output is shown like this:
|
71 |
|
72 | echo one
|
73 | echo two
|
74 | # =>
|
75 | # one
|
76 | # two
|
77 |
|
78 | ## Examples
|
79 |
|
80 | ### Hello World Script
|
81 |
|
82 | You can also type commands into a file like `hello.ysh`. This is a complete
|
83 | YSH program, which is identical to a shell program:
|
84 |
|
85 | echo 'hello world' # => hello world
|
86 |
|
87 | ### A Taste of YSH
|
88 |
|
89 | Unlike shell, YSH has `var` and `const` keywords:
|
90 |
|
91 | const name = 'world' # const is rarer, used at the top level
|
92 | echo "hello $name" # => hello world
|
93 |
|
94 | They take rich Python-like expressions on the right:
|
95 |
|
96 | var x = 42 # an integer, not a string
|
97 | setvar x = x - 41 # mutate with the 'setvar' keyword
|
98 |
|
99 | setvar x += 5 # Increment by 5
|
100 | echo $x # => 6
|
101 |
|
102 | var mylist = [x, 7] # two integers [6, 7]
|
103 |
|
104 | Expressions are often surrounded by `()`:
|
105 |
|
106 | if (x > 0) {
|
107 | echo 'positive'
|
108 | } # => positive
|
109 |
|
110 | for i, item in (mylist) { # 'mylist' is a variable, not a string
|
111 | echo "[$i] item $item"
|
112 | }
|
113 | # =>
|
114 | # [0] item 6
|
115 | # [1] item 7
|
116 |
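For readers coming from Python, the indexed loop above behaves much like `enumerate()`. The sketch below is only an analogy written in Python, not YSH code; the variable names simply mirror the example.

```python
# Python analogy (not YSH): an indexed loop over a list of integers.
x = 6
mylist = [x, 7]

# enumerate() yields (index, item) pairs, like 'for i, item in (mylist)'.
lines = [f"[{i}] item {item}" for i, item in enumerate(mylist)]
print("\n".join(lines))
```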
|
117 | YSH has Ruby-like blocks:
|
118 |
|
119 | cd /tmp {
|
120 | echo hi > greeting.txt # file created inside /tmp
|
121 | echo $PWD # => /tmp
|
122 | }
|
123 | echo $PWD # prints the original directory
|
124 |
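In Python terms, a block passed to `cd` resembles a context manager that restores the previous directory on exit. Here is a minimal sketch of that analogy, assuming a hypothetical helper named `cd` (it is not part of YSH or of this document):

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def cd(path):
    """Hypothetical helper: change directory inside a with-block, then restore it."""
    old = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        os.chdir(old)  # runs even if the block raises


original = os.getcwd()
target = tempfile.gettempdir()

with cd(target):
    inside = os.getcwd()   # the temporary directory
after = os.getcwd()        # back in the original directory
```

The `finally` clause plays the role of the closing brace: the working directory is restored whether or not the block fails.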
|
125 | And utilities to read and write JSON:
|
126 |
|
127 | var person = {name: 'bob', age: 42}
|
128 | json write (person)
|
129 | # =>
|
130 | # {
|
131 | # "name": "bob",
|
132 | # "age": 42
|
133 | # }
|
134 |
|
135 | echo '["str", 42]' | json read # sets '_reply' variable by default
|
136 |
|
137 | The `=` keyword evaluates and prints an expression:
|
138 |
|
139 | = _reply
|
140 | # => (List) ["str", 42]
|
141 |
|
142 | (Think of it like `var x = _reply`, without the `var x`.)
|
143 |
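If you know Python's `json` module, `json write` and `json read` play roughly the roles of `json.dumps()` and `json.loads()`. The sketch below is a Python analogy, not YSH; the name `reply` stands in for the `_reply` variable.

```python
import json

person = {"name": "bob", "age": 42}

# Roughly like: json write (person)
text = json.dumps(person, indent=2)
print(text)

# Roughly like: echo '["str", 42]' | json read
reply = json.loads('["str", 42]')
print(type(reply).__name__, reply)
```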
|
144 | ## Word Language: Expressions for Strings (and Arrays)
|
145 |
|
146 | Let's describe the word language first, and then talk about commands and
|
147 | expressions. Words are a rich language because **strings** are a central
|
148 | concept in shell.
|
149 |
|
150 | ### Unquoted Words
|
151 |
|
152 | Words denote strings, but you often don't need to quote them:
|
153 |
|
154 | echo hi # => hi
|
155 |
|
156 | Quotes are useful when a string has spaces, or punctuation characters like `( )
|
157 | ;`.
|
158 |
|
159 | ### Three Kinds of String Literals
|
160 |
|
161 | You can choose the style that's most convenient to write a given string.
|
162 |
|
163 | #### Double-Quoted, Single-Quoted, and J8 strings (like JSON)
|
164 |
|
165 | Double-quoted strings allow **interpolation**, with `$`:
|
166 |
|
167 | var person = 'alice'
|
168 | echo "hi $person, $(echo bye)" # => hi alice, bye
|
169 |
|
170 | Write operators by escaping them with `\`:
|
171 |
|
172 | echo "\$ \" \\ " # => $ " \
|
173 |
|
174 | In single-quoted strings, all characters are **literal** (except `'`, which
|
175 | can't be expressed):
|
176 |
|
177 | echo 'c:\Program Files\' # => c:\Program Files\
|
178 |
|
179 | If you want C-style backslash **character escapes**, use a J8 string, which is
|
180 | like JSON, but with single quotes:
|
181 |
|
182 | echo u' A is \u{41} \n line two, with backslash \\'
|
183 | # =>
|
184 | # A is A
|
185 | # line two, with backslash \
|
186 |
|
187 | The `u''` strings are guaranteed to be valid Unicode (unlike JSON). You can
|
188 | also use `b''` strings:
|
189 |
|
190 | echo b'byte \yff' # Byte that's not valid unicode, like \xff in C.
|
191 | # Don't confuse it with \u{ff}.
|
192 |
|
193 | #### Multi-line Strings
|
194 |
|
195 | Multi-line strings are surrounded with triple quotes. They come in the same
|
196 | three varieties, and leading whitespace is stripped in a convenient way.
|
197 |
|
198 | sort <<< """
|
199 | var sub: $x
|
200 | command sub: $(echo hi)
|
201 | expression sub: $[x + 3]
|
202 | """
|
203 | # =>
|
204 | # command sub: hi
|
205 | # expression sub: 9
|
206 | # var sub: 6
|
207 |
|
208 | sort <<< '''
|
209 | $2.00 # literal $, no interpolation
|
210 | $1.99
|
211 | '''
|
212 | # =>
|
213 | # $1.99
|
214 | # $2.00
|
215 |
|
216 | sort <<< u'''
|
217 | C\tD
|
218 | A\tB
|
219 | ''' # b''' strings also supported
|
220 | # =>
|
221 | # A B
|
222 | # C D
|
223 |
|
224 | (Use multiline strings instead of shell's [here docs]($xref:here-doc).)
|
225 |
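Python's `textwrap.dedent()` is a rough analogue of the whitespace stripping described above. The sketch is an analogy only; YSH's exact stripping rules may differ in detail.

```python
import textwrap

# The literal is indented to match the surrounding code; dedent() removes
# the common leading whitespace from every line.
text = textwrap.dedent("""\
    $2.00
    $1.99
    """)

# Sorting the lines mimics piping the string to 'sort'.
sorted_lines = sorted(text.splitlines())
print("\n".join(sorted_lines))
```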
|
226 | ### Three Kinds of Substitution
|
227 |
|
228 | YSH has syntax for 3 types of substitution, all of which start with `$`. That
|
229 | is, you can convert any of these things to a **string**:
|
230 |
|
231 | 1. Variables
|
232 | 2. The output of commands
|
233 | 3. The value of expressions
|
234 |
|
235 | #### Variable Sub
|
236 |
|
237 | The syntax `$a` or `${a}` converts a variable to a string:
|
238 |
|
239 | var a = 'ale'
|
240 | echo $a # => ale
|
241 | echo _${a}_ # => _ale_
|
242 | echo "_ $a _" # => _ ale _
|
243 |
|
244 | The shell operator `:-` is occasionally useful in YSH:
|
245 |
|
246 | echo ${not_defined:-'default'} # => default
|
247 |
|
248 | #### Command Sub
|
249 |
|
250 | The `$(echo hi)` syntax runs a command and captures its `stdout`:
|
251 |
|
252 | echo $(hostname) # => example.com
|
253 | echo "_ $(hostname) _" # => _ example.com _
|
254 |
|
255 | #### Expression Sub
|
256 |
|
257 | The `$[myexpr]` syntax evaluates an expression and converts it to a string:
|
258 |
|
259 | echo $[a] # => ale
|
260 | echo $[1 + 2 * 3] # => 7
|
261 | echo "_ $[1 + 2 * 3] _" # => _ 7 _
|
262 |
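The three substitutions map loosely onto Python f-strings, as sketched below. This is an analogy, not YSH: the child process is a stand-in for a command such as `echo hi`, and its trailing newline is stripped by hand here.

```python
import subprocess
import sys

a = "ale"

# Command sub: run a child process and capture its stdout.
out = subprocess.run(
    [sys.executable, "-c", "print('hi')"],
    capture_output=True, text=True, check=True,
).stdout.rstrip("\n")

var_sub = f"_{a}_"              # like _${a}_
cmd_sub = f"_ {out} _"          # like "_ $(echo hi) _"
expr_sub = f"_ {1 + 2 * 3} _"   # like "_ $[1 + 2 * 3] _"

print(var_sub, cmd_sub, expr_sub, sep="\n")
```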
|
263 | <!-- TODO: safe substitution with "$[a]"html -->
|
264 |
|
265 | ### Arrays of Strings: Globs, Brace Expansion, Splicing, and Splitting
|
266 |
|
267 | There are four constructs that evaluate to a **list of strings**, rather than a
|
268 | single string.
|
269 |
|
270 | #### Globs
|
271 |
|
272 | Globs like `*.py` evaluate to a list of files.
|
273 |
|
274 | touch foo.py bar.py # create the files
|
275 | write *.py
|
276 | # =>
|
277 | # bar.py
|
278 | # foo.py
|
279 |
|
280 | If no files match, it evaluates to an empty list (`[]`).
|
281 |
|
282 | #### Brace Expansion
|
283 |
|
284 | The brace expansion mini-language lets you write strings without duplication:
|
285 |
|
286 | write {alice,bob}@example.com
|
287 | # =>
|
288 | # alice@example.com
|
289 | # bob@example.com
|
290 |
|
291 | #### Splicing
|
292 |
|
293 | The `@` operator splices an array into a command:
|
294 |
|
295 | var myarray = :| ale bean |
|
296 | write S @myarray E
|
297 | # =>
|
298 | # S
|
299 | # ale
|
300 | # bean
|
301 | # E
|
302 |
|
303 | You also have `@[]` to splice an expression that evaluates to a list:
|
304 |
|
305 | write -- @[split('ale bean')]
|
306 | # =>
|
307 | # ale
|
308 | # bean
|
309 |
|
310 | Each item will be converted to a string.
|
311 |
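Splicing is similar in spirit to `*` unpacking in a Python list literal. The sketch below is a Python analogy, not YSH; it builds the argument lists that the two `write` examples above would receive.

```python
myarray = ["ale", "bean"]

# Like: write S @myarray E
argv = ["write", "S", *myarray, "E"]

# Like: write -- @[split('ale bean')], with each item converted to a string
spliced = ["write", "--", *[str(item) for item in "ale bean".split()]]

print(argv)
print(spliced)
```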
|
312 | #### Split Command Sub / Split Builtin Sub
|
313 |
|
314 | There's also a variant of *command sub* that decodes J8 lines into a sequence
|
315 | of strings:
|
316 |
|
317 | write @(seq 3) # write is passed 3 args
|
318 | # =>
|
319 | # 1
|
320 | # 2
|
321 | # 3
|
322 |
|
323 | ## Command Language: I/O, Control Flow, Abstraction
|
324 |
|
325 | ### Simple Commands
|
326 |
|
327 | A simple command is a space-separated list of words. YSH looks up the first
|
328 | word to determine if it's a builtin command, or a user-defined `proc`.
|
329 |
|
330 | echo 'hello world' # The shell builtin 'echo'
|
331 |
|
332 | proc greet (name) { # A proc is like a procedure or process
|
333 | echo "hello $name"
|
334 | }
|
335 |
|
336 | # The first word now resolves to the proc you defined
|
337 | greet alice # => hello alice
|
338 |
|
339 | If it's neither, then it's assumed to be an external command:
|
340 |
|
341 | ls -l /tmp # The external 'ls' command
|
342 |
|
343 | Commands accept traditional string arguments, as well as typed arguments in
|
344 | parentheses:
|
345 |
|
346 | # 'write' is a string arg; 'x' is a typed expression arg
|
347 | json write (x)
|
348 |
|
349 | <!--
|
350 | Block args are a special kind of typed arg:
|
351 |
|
352 | cd /tmp {
|
353 | echo $PWD
|
354 | }
|
355 | -->
|
356 |
|
357 | ### Redirects
|
358 |
|
359 | You can **redirect** `stdin` and `stdout` of simple commands:
|
360 |
|
361 | echo hi > tmp.txt # write to a file
|
362 | sort < tmp.txt
|
363 |
|
364 | Here are the most common idioms for using `stderr` (identical to shell):
|
365 |
|
366 | ls /tmp 2>errors.txt
|
367 | echo 'fatal error' >&2
|
368 |
|
369 | ### ARGV and ENV
|
370 |
|
371 | The `ARGV` list holds the arguments passed to the shell:
|
372 |
|
373 | var num_args = len(ARGV)
|
374 | ls /tmp @ARGV # pass shell's arguments through
|
375 |
|
376 | ---
|
377 |
|
378 | You can add to the environment of a new process with a *prefix binding*:
|
379 |
|
380 | PYTHONPATH=vendor ./demo.py
|
381 |
|
382 | The `ENV` object reflects the current environment:
|
383 |
|
384 | echo $[ENV.PYTHONPATH] # => vendor
|
385 |
|
386 | ### Pipelines
|
387 |
|
388 | Pipelines are a powerful method of manipulating data streams:
|
389 |
|
390 | ls | wc -l # count files in this directory
|
391 | find /bin -type f | xargs wc -l # count lines in each file of a subtree
|
392 |
|
393 | The stream may contain (lines of) text, binary data, JSON, TSV, and more.
|
394 | Details below.
|
395 |
|
396 | ### Multi-line Commands
|
397 |
|
398 | The `...` prefix lets you write long commands, pipelines, and `&&` chains
|
399 | without `\` line continuations.
|
400 |
|
401 | ... find /bin # traverse this directory and
|
402 | -type f -a -executable # print executable files
|
403 | | sort -r # reverse sort
|
404 | | head -n 30 # limit to 30 files
|
405 | ;
|
406 |
|
407 | When this mode is active:
|
408 |
|
409 | - A single newline behaves like a space
|
410 | - A blank line (two newlines in a row) is illegal, but a line that has only a
|
411 | comment is allowed. This prevents confusion if you forget the `;`
|
412 | terminator.
|
413 |
|
414 | ### `var`, `setvar`, `const` to Declare and Mutate
|
415 |
|
416 | Constants can't be modified:
|
417 |
|
418 | const myconst = 'mystr'
|
419 | # setvar myconst = 'foo' would be an error
|
420 |
|
421 | Modify variables with the `setvar` keyword:
|
422 |
|
423 | var num_beans = 12
|
424 | setvar num_beans = 13
|
425 |
|
426 | A more complex example:
|
427 |
|
428 | var d = {name: 'bob', age: 42} # dict literal
|
429 | setvar d.name = 'alice' # d.name is a synonym for d['name']
|
430 | echo $[d.name] # => alice
|
431 |
|
432 | That's most of what you need to know about assignments. Advanced users may
|
433 | want to use `setglobal` or `call myplace->setValue(42)` in certain situations.
|
434 |
|
435 | <!--
|
436 | var g = 1
|
437 | var h = 2
|
438 | proc demo(:out) {
|
439 | setglobal g = 42
|
440 | setref out = 43
|
441 | }
|
442 | demo :h # pass a reference to h
|
443 | echo "$g $h" # => 42 43
|
444 | -->
|
445 |
|
446 | More info: [Variable Declaration and Mutation](variables.html).
|
447 |
|
448 | ### `for` Loop
|
449 |
|
450 | #### Words
|
451 |
|
452 | Shell-style for loops iterate over **words**:
|
453 |
|
454 | for word in 'oils' $num_beans {pea,coco}nut {
|
455 | echo $word
|
456 | }
|
457 | # =>
|
458 | # oils
|
459 | # 13
|
460 | # peanut
|
461 | # coconut
|
462 |
|
463 | You can also request the loop index:
|
464 |
|
465 | for i, word in README.md *.py {
|
466 | echo "$i - $word"
|
467 | }
|
468 | # =>
|
469 | # 0 - README.md
|
470 | # 1 - __init__.py
|
471 |
|
472 | #### Typed Data
|
473 |
|
474 | To iterate over typed data, use parentheses around an **expression**. The
|
475 | expression should evaluate to an integer `Range`, `List`, `Dict`, or `Stdin`.
|
476 |
|
477 | Range:
|
478 |
|
479 | for i in (3 ..< 5) { # range operator ..<
|
480 | echo "i = $i"
|
481 | }
|
482 | # =>
|
483 | # i = 3
|
484 | # i = 4
|
485 |
|
486 | List:
|
487 |
|
488 | var foods = ['ale', 'bean']
|
489 | for item in (foods) {
|
490 | echo $item
|
491 | }
|
492 | # =>
|
493 | # ale
|
494 | # bean
|
495 |
|
496 | Again, you can request the index with `for i, item in ...`.
|
497 |
|
498 | ---
|
499 |
|
500 | Here's the most general form of the loop over `Dict`:
|
501 |
|
502 | var mydict = {pea: 42, nut: 10}
|
503 | for i, k, v in (mydict) {
|
504 | echo "$i - $k - $v"
|
505 | }
|
506 | # =>
|
507 | # 0 - pea - 42
|
508 | # 1 - nut - 10
|
509 |
|
510 | There are two simpler forms:
|
511 |
|
512 | - One variable gives you the key: `for k in (mydict)`
|
513 | - Two variables gives you the key and value: `for k, v in (mydict)`
|
514 |
|
515 | (One way to think of it: `for` loops in YSH have the functionality of Python's
|
516 | `enumerate()`, `items()`, `keys()`, and `values()`.)
|
517 |
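The Python counterparts of the three `Dict` loop forms can be sketched as follows. This is Python, not YSH, and is meant only to make the comparison concrete.

```python
mydict = {"pea": 42, "nut": 10}

# Three variables: index, key, value
general = [f"{i} - {k} - {v}" for i, (k, v) in enumerate(mydict.items())]

# One variable: the key
keys_only = [k for k in mydict]

# Two variables: key and value
pairs = [(k, v) for k, v in mydict.items()]

print("\n".join(general))
```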
|
518 | ---
|
519 |
|
520 | The `io.stdin` object iterates over lines:
|
521 |
|
522 | for line in (io.stdin) {
|
523 | echo $line
|
524 | }
|
525 | # lines are buffered, so it's much faster than `while read --rawline`
|
526 |
|
527 | <!--
|
528 | TODO: Str loop should give you the (UTF-8 offset, rune)
|
529 | Or maybe just UTF-8 offset? Decoding errors could be exceptions, or Unicode
|
530 | replacement.
|
531 | -->
|
532 |
|
533 | ### `while` Loop
|
534 |
|
535 | While loops can use a **command** as the termination condition:
|
536 |
|
537 | while test --file lock {
|
538 | sleep 1
|
539 | }
|
540 |
|
541 | Or an **expression**, which is surrounded in `()`:
|
542 |
|
543 | var i = 3
|
544 | while (i < 6) {
|
545 | echo "i = $i"
|
546 | setvar i += 1
|
547 | }
|
548 | # =>
|
549 | # i = 3
|
550 | # i = 4
|
551 | # i = 5
|
552 |
|
553 | ### `if elif` Conditional
|
554 |
|
555 | If statements test the exit code of a command, and have optional `elif` and
|
556 | `else` clauses:
|
557 |
|
558 | if test --file foo {
|
559 | echo 'foo is a file'
|
560 | rm --verbose foo # delete it
|
561 | } elif test --dir foo {
|
562 | echo 'foo is a directory'
|
563 | } else {
|
564 | echo 'neither'
|
565 | }
|
566 |
|
567 | Invert the exit code with `!`:
|
568 |
|
569 | if ! grep alice /etc/passwd {
|
570 | echo 'alice is not a user'
|
571 | }
|
572 |
|
573 | As with `while` loops, the condition can also be an **expression** wrapped in
|
574 | `()`:
|
575 |
|
576 | if (num_beans > 0) {
|
577 | echo 'so many beans'
|
578 | }
|
579 |
|
580 | var done = false
|
581 | if (not done) { # negate with 'not' operator (contrast with !)
|
582 | echo "we aren't done"
|
583 | }
|
584 |
|
585 | ### `case` Conditional
|
586 |
|
587 | The case statement is a series of conditionals and executable blocks. The
|
588 | condition can be either an unquoted glob pattern like `*.py`, an eggex pattern
|
589 | like `/d+/`, or a typed expression like `(42)`:
|
590 |
|
591 | var s = 'README.md'
|
592 | case (s) {
|
593 | *.py { echo 'Python' }
|
594 | *.cc | *.h { echo 'C++' }
|
595 | * { echo 'Other' }
|
596 | }
|
597 | # => Other
|
598 |
|
599 | case (s) {
|
600 | / dot* '.md' / { echo 'Markdown' }
|
601 | (30 + 12) { echo 'the integer 42' }
|
602 | (else) { echo 'neither' }
|
603 | }
|
604 | # => Markdown
|
605 |
|
606 |
|
607 | <!--
|
608 | (Shell style like `if foo; then ... fi` and `case $x in ... esac` is also
|
609 | legal, but discouraged in YSH code.)
|
610 | -->
|
611 |
|
612 | ### Error Handling
|
613 |
|
614 | If statements are also used for **error handling**. Builtins and external
|
615 | commands use this style:
|
616 |
|
617 | if ! test -d /bin {
|
618 | echo 'not a directory'
|
619 | }
|
620 |
|
621 | if ! cp foo /tmp {
|
622 | echo 'error copying' # any non-zero status
|
623 | }
|
624 |
|
625 | Procs use this style (because of shell's *disabled `errexit` quirk*):
|
626 |
|
627 | try {
|
628 | myproc
|
629 | }
|
630 | if failed {
|
631 | echo 'failed'
|
632 | }
|
633 |
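Branching on an exit status can be sketched in Python with `subprocess`. This is an analogy only: YSH's `try` and `failed` have their own semantics, and the child command here is a stand-in that simply exits with status 3.

```python
import subprocess
import sys

# Stand-in for a failing command: a child Python process that exits with status 3.
result = subprocess.run([sys.executable, "-c", "import sys; sys.exit(3)"])

failed = result.returncode != 0   # any non-zero status counts as failure
if failed:
    print("failed")
```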
|
634 | For a complete list of examples, see [YSH Error
|
635 | Handling](ysh-error.html). For design goals and a reference, see [YSH
|
636 | Fixes Shell's Error Handling](error-handling.html).
|
637 |
|
638 | #### exit, break, continue, return
|
639 |
|
640 | The `exit` **keyword** exits a process. (It's not a shell builtin.)
|
641 |
|
642 | The other 3 control flow keywords behave like they do in Python and JavaScript.
|
643 |
|
644 | ### Ruby-like Block Arguments
|
645 |
|
646 | Here's a builtin command that takes a literal block argument:
|
647 |
|
648 | shopt --unset errexit { # ignore errors
|
649 | cp ale /tmp
|
650 | cp bean /bin
|
651 | }
|
652 |
|
653 | A block is a value of type `Command`.
|
654 |
|
655 | ### Shell-like `proc`
|
656 |
|
657 | You can define units of code with the `proc` keyword.
|
658 |
|
659 | proc mycopy (src, dest) {
|
660 | ### Copy verbosely
|
661 |
|
662 | mkdir -p $dest
|
663 | cp --verbose $src $dest
|
664 | }
|
665 |
|
666 | The `###` line is a "doc comment". Simple procs like this are invoked like a
|
667 | shell command:
|
668 |
|
669 | touch log.txt
|
670 | mycopy log.txt /tmp # first word 'mycopy' is a proc
|
671 |
|
672 | Procs have many features, including **four** kinds of arguments:
|
673 |
|
674 | 1. Word args (which are always strings)
|
675 | 1. Typed, positional args (aka positional args)
|
676 | 1. Typed, named args (aka named args)
|
677 | 1. A final block argument, which may be written with `{ }`.
|
678 |
|
679 | At the call site, they can look like any of these forms:
|
680 |
|
681 | ls /tmp # word arg
|
682 |
|
683 | json write (d) # word arg, then positional arg
|
684 |
|
685 | try {
|
686 | error 'failed' (status=9) # word arg, then named arg
|
687 | }
|
688 |
|
689 | cd /tmp { echo $PWD } # word arg, then block arg
|
690 |
|
691 | pp value ([1, 2]) # positional, typed arg
|
692 |
|
693 | <!-- TODO: lazy arg list: ls8 | where [age > 10] -->
|
694 |
|
695 | At the definition site, the kinds of parameters are separated with `;`, similar
|
696 | to the Julia language:
|
697 |
|
698 | proc p2 (word1, word2; pos1, pos2, ...rest_pos) {
|
699 | echo "$word1 $word2 $[pos1 + pos2]"
|
700 | json write (rest_pos)
|
701 | }
|
702 |
|
703 | proc p3 (w ; ; named1, named2, ...rest_named; block) {
|
704 | echo "$w $[named1 + named2]"
|
705 | eval (block)
|
706 | json write (rest_named)
|
707 | }
|
708 |
|
709 | proc p4 (; ; ; block) {
|
710 | eval (block)
|
711 | }
|
712 |
|
713 | YSH also has Python-like functions defined with `func`. These are part of the
|
714 | expression language, which we'll see later.
|
715 |
|
716 | For more info, see the [Guide to Procs and Funcs](proc-func.html).
|
717 |
|
718 | #### Builtin Commands
|
719 |
|
720 | **Shell builtins** like `cd` and `read` are the "standard library" of the
|
721 | command language. Each one takes various flags:
|
722 |
|
723 | cd -L . # follow symlinks
|
724 |
|
725 | echo foo | read --all # read all of stdin
|
726 |
|
727 | Here are some categories of builtin:
|
728 |
|
729 | - I/O: `echo write read`
|
730 | - File system: `cd test`
|
731 | - Processes: `fork wait forkwait exec`
|
732 | - Interpreter settings: `shopt shvar`
|
733 | - Meta: `command builtin runproc type eval`
|
734 |
|
735 | <!-- TODO: Link to a comprehensive list of builtins -->
|
736 |
|
737 | ## Expression Language: Python-like Types
|
738 |
|
739 | YSH expressions look and behave more like Python or JavaScript than shell. For
|
740 | example, we write `if (x < y)` instead of `if [ $x -lt $y ]`. Expressions are
|
741 | usually surrounded by `( )`.
|
742 |
|
743 | At runtime, variables like `x` and `y` are bound to **typed data**, like
|
744 | integers, floats, strings, lists, and dicts.
|
745 |
|
746 | <!--
|
747 | [Command vs. Expression Mode](command-vs-expression-mode.html) may help you
|
748 | understand how YSH is parsed.
|
749 | -->
|
750 |
|
751 | ### Python-like `func`
|
752 |
|
753 | At the end of the *Command Language*, we saw that procs are shell-like units of
|
754 | code. YSH also has Python-like **functions**, which are different than
|
755 | `procs`:
|
756 |
|
757 | - They're defined with the `func` keyword.
|
758 | - They're called in expressions, not in commands.
|
759 | - They're **pure**, and live in the **interior** of a process.
|
760 | - In contrast, procs usually perform I/O, and have **exterior** boundaries.
|
761 |
|
762 | The simplest function is:
|
763 |
|
764 | func identity(x) {
|
765 | return (x) # parens required for typed return
|
766 | }
|
767 |
|
768 | A more complex pure function:
|
769 |
|
770 | func myRepeat(s, n; special=false) { # positional; named params
|
771 | var parts = []
|
772 | for i in (0 ..< n) {
|
773 | append $s (parts)
|
774 | }
|
775 | var result = join(parts)
|
776 |
|
777 | if (special) {
|
778 | return ("$result !!")
|
779 | } else {
|
780 | return (result)
|
781 | }
|
782 | }
|
783 |
|
784 | echo $[myRepeat('z', 3)] # => zzz
|
785 |
|
786 | echo $[myRepeat('z', 3, special=true)] # => zzz !!
|
787 |
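For comparison, here is how the same function might look in Python, with the named parameter expressed as a keyword-only argument. The name `my_repeat` is a Python-style rendering chosen for this sketch.

```python
def my_repeat(s, n, *, special=False):
    """Repeat s n times; optionally append ' !!' to the result."""
    parts = []
    for _ in range(n):
        parts.append(s)
    result = "".join(parts)

    if special:
        return f"{result} !!"
    return result


plain = my_repeat("z", 3)
fancy = my_repeat("z", 3, special=True)
print(plain)
print(fancy)
```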
|
788 | A function that mutates its argument:
|
789 |
|
790 | func popTwice(mylist) {
|
791 | call mylist->pop()
|
792 | call mylist->pop()
|
793 | }
|
794 |
|
795 | var mylist = [3, 4]
|
796 |
|
797 | # The call keyword is an "adapter" between commands and expressions,
|
798 | # like the = keyword.
|
799 | call popTwice(mylist)
|
800 |
|
801 |
|
802 | Funcs are named using `camelCase`, while procs use `kebab-case`. See the
|
803 | [Style Guide](style-guide.html) for more conventions.
|
804 |
|
805 | #### Builtin Functions
|
806 |
|
807 | In addition to builtin commands, YSH has Python-like builtin **functions**.
|
808 | These are like the "standard library" for the expression language. Examples:
|
809 |
|
810 | - Functions that take multiple types: `len() type()`
|
811 | - Conversions: `bool() int() float() str() list() ...`
|
812 | - Explicit word evaluation: `split() join() glob() maybe()`
|
813 |
|
814 | <!-- TODO: Make a comprehensive list of func builtins. -->
|
815 |
|
816 |
|
817 | ### Data Types: `Int`, `Str`, `List`, `Dict`, `Obj`, ...
|
818 |
|
819 | YSH has data types, each with an expression syntax and associated methods.
|
820 |
|
821 | ### Methods
|
822 |
|
823 | YSH adds mutable data structures to shell, so we have a special syntax for
|
824 | mutating methods. They are looked up with a thin arrow `->`:
|
825 |
|
826 | var foods = ['ale', 'bean']
|
827 | var last = foods->pop() # bean
|
828 | write @foods # => ale
|
829 |
|
830 | You can ignore the return value with the `call` keyword:
|
831 |
|
832 | call foods->pop()
|
833 |
|
834 | Regular methods are looked up with the `.` operator:
|
835 |
|
836 | var line = ' ale bean '
|
837 | var caps = line.trim().upper() # 'ALE BEAN'
|
838 |
|
839 | ---
|
840 |
|
841 | You can also chain functions with a fat arrow `=>`:
|
842 |
|
843 | var trimmed = line.trim() => upper() # 'ALE BEAN'
|
844 |
|
845 | The `=>` operator allows functions to appear in a natural left-to-right order,
|
846 | like methods.
|
847 |
|
848 | # list() is a free function taking one arg
|
849 | # join() is a free function taking two args
|
850 | var x = {k1: 42, k2: 43} => list() => join('/') # 'k1/k2'
|
851 |
|
852 | ---
|
853 |
|
854 | Now let's go through the data types in YSH. We'll show the syntax for
|
855 | literals, and what **methods** they have.
|
856 |
|
857 | #### Null and Bool
|
858 |
|
859 | YSH uses JavaScript-like spellings for these three "atoms":
|
860 |
|
861 | var x = null
|
862 |
|
863 | var b1, b2 = true, false
|
864 |
|
865 | if (b1) {
|
866 | echo 'yes'
|
867 | } # => yes
|
868 |
|
869 |
|
870 | #### Int
|
871 |
|
872 | There are many ways to write integers:
|
873 |
|
874 | var small, big = 42, 65_536
|
875 | echo "$small $big" # => 42 65536
|
876 |
|
877 | var hex, octal, binary = 0x0001_0000, 0o755, 0b0001_0101
|
878 | echo "$hex $octal $binary" # => 65536 493 21
|
879 |
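Python accepts the same literal spellings, which makes the values above easy to check. A quick sketch in Python, not YSH (`hex_` has a trailing underscore only to avoid shadowing Python's builtin `hex`):

```python
small, big = 42, 65_536
hex_, octal, binary = 0x0001_0000, 0o755, 0b0001_0101

print(small, big)           # 42 65536
print(hex_, octal, binary)  # 65536 493 21
```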
|
880 | <!--
|
881 | "Runes" are integers that represent Unicode code points. They're not common in
|
882 | YSH code, but can make certain string algorithms more readable.
|
883 |
|
884 | # Pound rune literals are similar to ord('A')
|
885 | const a = #'A'
|
886 |
|
887 | # Backslash rune literals can appear outside of quotes
|
888 | const newline = \n # Remember this is an integer
|
889 | const backslash = \\ # ditto
|
890 |
|
891 | # Unicode rune literal is syntactic sugar for 0x3bc
|
892 | const mu = \u{3bc}
|
893 |
|
894 | echo "chars $a $newline $backslash $mu" # => chars 65 10 92 956
|
895 | -->
|
896 |
|
897 | #### Float
|
898 |
|
899 | Floats are written with a decimal point:
|
900 |
|
901 | var big = 3.14
|
902 |
|
903 | You can use scientific notation, as in Python:
|
904 |
|
905 | var small = 1.5e-10
|
906 |
|
907 | #### Str
|
908 |
|
909 | See the section above on *Three Kinds of String Literals*. It described
|
910 | `'single quoted'`, `"double ${quoted}"`, and `u'J8-style\n'` strings; as well
|
911 | as their multiline variants.
|
912 |
|
913 | Strings are UTF-8 encoded in memory, like strings in the [Go
|
914 | language](https://golang.org). There isn't a separate string and unicode type,
|
915 | as in Python.
|
916 |
|
917 | Strings are **immutable**, as in Python and JavaScript. This means they only
|
918 | have **transforming** methods:
|
919 |
|
920 | var x = s.trim()
|
921 |
|
922 | Other methods:
|
923 |
|
924 | - `trimLeft() trimRight()`
|
925 | - `trimPrefix() trimSuffix()`
|
926 | - `upper() lower()`
|
927 | - `search() leftMatch()` - pattern matching
|
928 | - `replace() split()`
|
929 |
|
930 | #### List (and Arrays)
|
931 |
|
932 | All lists can be expressed with Python-like literals:
|
933 |
|
934 | var foods = ['ale', 'bean', 'corn']
|
935 | var recursive = [1, [2, 3]]
|
936 |
|
937 | As a special case, lists of strings are called **arrays**. It's often more
|
938 | convenient to write them with shell-like literals:
|
939 |
|
940 | # No quotes or commas
|
941 | var foods = :| ale bean corn |
|
942 |
|
943 | # You can use the word language here
|
944 | var other = :| foo $s *.py {alice,bob}@example.com |
|
945 |
|
946 | Lists are **mutable**, as in Python and JavaScript. So they mainly have
|
947 | mutating methods:
|
948 |
|
949 | call foods->reverse()
|
950 | write -- @foods
|
951 | # =>
|
952 | # corn
|
953 | # bean
|
954 | # ale
|
955 |
|
956 | #### Dict
|
957 |
|
958 | Dicts use syntax that's like JavaScript. Here's a dict literal:
|
959 |
|
960 | var d = {
|
961 | name: 'bob', # unquoted keys are allowed
|
962 | age: 42,
|
963 | 'key with spaces': 'val'
|
964 | }
|
965 |
|
966 | You can use either `[]` or `.` to retrieve a value, given a key:
|
967 |
|
968 | var v1 = d['name']
|
969 | var v2 = d.name # shorthand for the above
|
970 | var v3 = d['key with spaces'] # no shorthand for this
|
971 |
|
972 | (If the key doesn't exist, an error is raised.)
|
973 |
|
974 | You can change Dict values with the same 2 syntaxes:
|
975 |
|
976 | setvar d['name'] = 'other'
|
977 | setvar d.name = 'fun'
|
978 |
|
979 | ---
|
980 |
|
981 | If you want to compute a key name, use an expression inside `[]`:
|
982 |
|
983 | var key = 'alice'
|
984 | var d2 = {[key ++ '_z']: 'ZZZ'} # Computed key name
|
985 | echo $[d2.alice_z] # => ZZZ
|
986 |
|
987 | If you omit the value, it's taken from a variable of the same name:
|
988 |
|
989 | var d3 = {key} # like {key: key}
|
990 | echo "name is $[d3.key]" # => name is alice
|
991 |
|
992 | More examples:
|
993 |
|
994 | var empty = {}
|
995 | echo $[len(empty)] # => 0
|
996 |
|
997 | The `keys()` and `values()` functions return new `List` objects:
|
998 |
|
999 | var keys = keys(d2) # => alice_z
|
1000 | var vals = values(d3) # => alice
|
1001 |
|
1002 | ### `Place` type / "out params"
|
1003 |
|
1004 | The `read` builtin can either set an implicit variable `_reply`:
|
1005 |
|
1006 | whoami | read --all # sets _reply
|
1007 |
|
1008 | Or you can pass a `value.Place`, created with `&`:
|
1009 |
|
1010 | var x # implicitly initialized to null
|
1011 | whoami | read --all (&x) # mutate this "place"
|
1012 | echo who=$x # => who=andy
|
1013 |
|
1014 | <!--
|
1015 | #### Quotation Types: value.Command (Block) and value.Expr
|
1016 |
|
1017 | These types are for reflection on YSH code. Most YSH programs won't use them
|
1018 | directly.
|
1019 |
|
1020 | - `Command`: an unevaluated code block.
|
1021 | - rarely-used literal: `^(ls | wc -l)`
|
1022 | - `Expr`: an unevaluated expression.
|
1023 | - rarely-used literal: `^[42 + a[i]]`
|
1024 | -->
|
1025 |
|
1026 | ### Operators
|
1027 |
|
1028 | YSH operators are generally the same as in Python:
|
1029 |
|
1030 | if (10 <= num_beans and num_beans < 20) {
|
1031 | echo 'enough'
|
1032 | } # => enough
|
1033 |
|
1034 | YSH has a few operators that aren't in Python. Equality can be approximate or
|
1035 | exact:
|
1036 |
|
1037 | var n = ' 42 '
|
1038 | if (n ~== 42) {
|
1039 | echo 'equal after stripping whitespace and type conversion'
|
1040 | } # => equal after stripping whitespace and type conversion
|
1041 |
|
1042 | if (n === 42) {
|
1043 | echo "not reached because strings and ints aren't equal"
|
1044 | }
|
1045 |
|
1046 | <!-- TODO: is n === 42 a type error? -->
|
1047 |
|
1048 | Pattern matching can be done with globs (`~~` and `!~~`):
|
1049 |
|
1050 | const filename = 'foo.py'
|
1051 | if (filename ~~ '*.py') {
|
1052 | echo 'Python'
|
1053 | } # => Python
|
1054 |
|
1055 | if (filename !~~ '*.sh') {
|
1056 | echo 'not shell'
|
1057 | } # => not shell
|
1058 |
|
1059 | or regular expressions (`~` and `!~`). See the Eggex section below for an
|
1060 | example of the latter.
|
1061 |
|
1062 | Concatenation is `++` rather than `+` because it avoids confusion in the
|
1063 | presence of type conversion:
|
1064 |
|
1065 | var n = '42' + 1 # string plus int does implicit conversion
|
1066 | echo $n # => 43
|
1067 |
|
1068 | var y = 'ale ' ++ "bean $n" # concatenation
|
1069 | echo $y # => ale bean 43
|
1070 |
|
1071 | <!--
|
1072 | TODO: change example above
|
1073 | var n = '42' + 1 # string plus int does implicit conversion
|
1074 | -->
|
1075 |
|
1076 | <!--
|
1077 |
|
1078 | #### Summary of Operators
|
1079 |
|
1080 | - Arithmetic: `+ - * / // %` and `**` for exponentiation
|
1081 | - `/` always yields a float, and `//` is integer division
|
1082 | - Bitwise: `& | ^ ~`
|
1083 | - Logical: `and or not`
|
1084 | - Comparison: `== < > <= >= in 'not in'`
|
1085 | - Approximate equality: `~==`
|
1086 | - Eggex and glob match: `~ !~ ~~ !~~`
|
1087 | - Ternary: `1 if x else 0`
|
1088 | - Index and slice: `mylist[3]` and `mylist[1:3]`
|
1089 | - `mydict->key` is a shortcut for `mydict['key']`
|
1090 | - Function calls
|
1091 | - free: `f(x, y)`
|
1092 | - transformations and chaining: `s => startsWith('prefix')`
|
1093 | - mutating methods: `mylist->pop()`
|
1094 | - String and List: `++` for concatenation
|
1095 | - This is a separate operator because the addition operator `+` does
|
1096 | string-to-int conversion
|
1097 |
|
1098 | TODO: What about list comprehensions?
|
1099 | -->
|
1100 |
|
1101 | ### Egg Expressions (YSH Regexes)
|
1102 |
|
1103 | An *Eggex* is a YSH expression that denotes a regular expression. Eggexes
|
1104 | translate to POSIX ERE syntax, for use with tools like `egrep`, `awk`, and `sed
|
1105 | --regexp-extended` (GNU only).
|
1106 |
|
1107 | They're designed to be readable and composable. Example:
|
1108 |
|
1109 | var D = / digit{1,3} /
|
1110 | var ip_pattern = / D '.' D '.' D '.' D /
|
1111 |
|
1112 | var z = '192.168.0.1'
|
1113 | if (z ~ ip_pattern) { # Use the ~ operator to match
|
1114 | echo "$z looks like an IP address"
|
1115 | } # => 192.168.0.1 looks like an IP address
|
1116 |
|
1117 | if (z !~ / '.255' %end /) {
|
1118 | echo "doesn't end with .255"
|
1119 | } # => doesn't end with .255
|
1120 |
|
1121 | See the [Egg Expressions doc](eggex.html) for details.
|
1122 |
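To see the translation to POSIX ERE mentioned above, you can turn an Eggex into a string and hand it to an external tool. This is a minimal sketch: the sample input lines are made up, and `grep -E` stands in for `egrep`.

```ysh
var pat = / digit+ /

echo $pat   # => [[:digit:]]+

# The same pattern, used by an external tool
write 'ale 42' 'bean' | grep -E $pat   # => ale 42
```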
|
1123 | ## Interlude
|
1124 |
|
1125 | Let's review what we've seen before moving on to other YSH features.
|
1126 |
|
1127 | ### Three Interleaved Languages
|
1128 |
|
1129 | Here are the languages we saw in the last 3 sections:
|
1130 |
|
1131 | 1. **Words** evaluate to a string, or list of strings. This includes:
|
1132 | - literals like `'mystr'`
|
1133 | - substitutions like `${x}` and `$(hostname)`
|
1134 | - globs like `*.sh`
|
1135 | 2. **Commands** are used for
|
1136 | - I/O: pipelines, builtins like `read`
|
1137 | - control flow: `if`, `for`
|
1138 | - abstraction: `proc`
|
1139 | 3. **Expressions** on typed data are borrowed from Python, with influence from
|
1140 | JavaScript:
|
1141 | - Lists: `['ale', 'bean']` or `:| ale bean |`
|
1142 | - Dicts: `{name: 'bob', age: 42}`
|
1143 | - Functions: `split('ale bean')` and `join(['pea', 'nut'])`
|
1144 |
|
1145 | ### How Do They Work Together?
|
1146 |
|
1147 | Here are two examples:
|
1148 |
|
1149 | (1) In this *command*, there are **four** *words*. The fourth word is an
|
1150 | *expression sub* `$[]`.
|
1151 |
|
1152 | write hello $name $[d['age'] + 1]
|
1153 | # =>
|
1154 | # hello
|
1155 | # world
|
1156 | # 43
|
1157 |
|
1158 | (2) In this assignment, the *expression* on the right hand side of `=`
|
1159 | concatenates two strings. The first string is a literal, and the second is a
|
1160 | *command sub*.
|
1161 |
|
1162 | var food = 'ale ' ++ $(echo bean | tr a-z A-Z)
|
1163 | write $food # => ale BEAN
|
1164 |
|
1165 | So words, commands, and expressions are **mutually recursive**. If you're a
|
1166 | conceptual person, skimming [Syntactic Concepts](syntactic-concepts.html) may
|
1167 | help you understand this on a deeper level.
|
1168 |
|
1169 | <!--
|
1170 | One way to think about these sublanguages is to note that the `|` character
|
1171 | means something different in each context:
|
1172 |
|
1173 | - In the command language, it's the pipeline operator, as in `ls | wc -l`
|
1174 | - In the word language, it's only valid in a literal string like `'|'`, `"|"`,
|
1175 | or `\|`. (It's also used in `${x|html}`, which formats a string.)
|
1176 | - In the expression language, it's the bitwise OR operator, as in Python and
|
1177 | JavaScript.
|
1178 | -->
|
1179 |
|
1180 | ## Advanced YSH Features
|
1181 |
|
1182 | Unlike shell, YSH is powerful enough to write reusable **libraries**. It also
|
1183 | has reflective features, to allow creating reusable **languages**!
|
1184 |
|
1185 | The following sections give you a taste of some advanced features.
|
1186 |
|
1187 | ### Closures
|
1188 |
|
1189 | Block arguments capture the frame they're defined in, which means they have
|
1190 | *lexical scope*.
|
1191 |
|
1192 | For example, this proc accepts a block, and runs it:
|
1193 |
|
1194 | proc do-it (; ; ; block) {
|
1195 | call io->eval(block)
|
1196 | }
|
1197 |
|
1198 | When you pass a block to it, the enclosing stack frame is captured:
|
1199 |
|
1200 | var x = 42
|
1201 | do-it {
|
1202 | echo "x = $x" # outer x is visible LATER, when the block is run
|
1203 | }
|
1204 |
|
1205 | - [Feature Index: Closures](ref/feature-index.html#Closures)
|
1206 |
|
1207 | ### Objects
|
1208 |
|
1209 | YSH has an `Obj` type that bundles **code** and **data**. (In contrast, JSON
|
1210 | messages are pure data, not objects.)
|
1211 |
|
1212 | The main purpose of objects is **polymorphism**:
|
1213 |
|
1214 | var obj = makeMyObject(42) # I don't know what it looks like inside
|
1215 |
|
1216 | echo $[obj.myMethod()] # But I can perform abstract operations
|
1217 |
|
1218 | call obj->mutatingMethod() # Mutation is considered special, with ->
|
1219 |
|
1220 | YSH objects are similar to Lua and JavaScript objects: they have a `Dict` of
|
1221 | properties, and a recursive "prototype chain" that is also an `Obj`.
|
1222 |
|
1223 | - [Feature Index: Objects](ref/feature-index.html#Objects)
|
1224 |
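To make the prototype chain less abstract, here is a sketch of building a small object by hand. It assumes the `Object(prototype, properties)` constructor and that a method found on the prototype receives the object as its first argument; the names `Rect`, `Rect_area`, and `rect` are hypothetical. Check the reference linked above before relying on the details.

```ysh
# A method is a func that takes the object as its first argument
func Rect_area(this) {
  return (this.x * this.y)
}

# The prototype holds code; it has no parent, so pass null
var Rect = Object(null, {area: Rect_area})

# The instance holds data, and points at the prototype
var rect = Object(Rect, {x: 3, y: 4})

echo $[rect.area()]   # => 12
```

The caller only needs to know that `rect` has an `area()` method, not how it's stored. That is the polymorphism described above.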
|
1225 | ### Modules
|
1226 |
|
1227 | A module is a **file** of source code, like `lib/myargs.ysh`.
|
1228 |
|
1229 | The `use` builtin turns it into an `Obj` that can be invoked and inspected:
|
1230 |
|
1231 | use myargs.ysh
|
1232 | myargs proc1 --flag val # module name becomes a prefix, via __invoke__
|
1233 | var alias = myargs.proc1 # module has attributes
|
1234 |
|
1235 | You can import specific names with the `--pick` flag:
|
1236 |
|
1237 | use myargs.ysh --pick p2 p3
|
1238 | p2
|
1239 | p3
|
1240 |
|
1241 | <!--
|
1242 | TODO: not mentioning __provide__, since it should be optional in the most basic usage?
|
1243 | -->
|
1244 |
|
1245 | - [Feature Index: Modules](ref/feature-index.html#Modules)
|
1246 |
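For context, here is a sketch of what a file like `lib/myargs.ysh` might contain so that the `use` commands above work. The proc bodies are invented for illustration. It also assumes the module lists its exported names in `__provide__`, which the examples above don't show.

```ysh
# lib/myargs.ysh (hypothetical contents)

const __provide__ = :| proc1 p2 p3 |   # names visible to importers

proc proc1 (...args) {
  echo "proc1 got $[len(args)] args"
}

proc p2 {
  echo 'p2'
}

proc p3 {
  echo 'p3'
}
```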
|
1247 | ### Reflecting on the Interpreter
|
1248 |
|
1249 | YSH is a language for creating other languages. You can reflect on the
|
1250 | interpreter with APIs like `io->eval()` and `vm.getFrame()`.
|
1251 |
|
1252 | - [Feature Index: Reflection](ref/feature-index.html#Reflection)
|
1253 |
|
1254 | (Ruby, Tcl, and Racket also have this flavor.)
|
1255 |
|
1256 | ---
|
1257 |
|
1258 | These advanced features all live **inside** the Oils interpreter. But a shell
|
1259 | naturally deals with textual data from the **outside**, so let's switch gears.
|
1260 |
|
1261 | ## Data Notation / Interchange Formats
|
1262 |
|
1263 | YSH reads and writes **data notation**, like [JSON]($xref).
|
1264 |
|
1265 | I think of these notations as languages for data, rather than code. Instead of being
|
1266 | executed, they're parsed as data structures.
|
1267 |
|
1268 | <!-- TODO: Link to slogans, fallacies, and concepts -->
|
1269 |
|
1270 | ### UTF-8
|
1271 |
|
1272 | UTF-8 is the foundation of our textual data languages.
|
1273 |
|
1274 | It's the most common Unicode encoding, and represents all code points
|
1275 | consistently and efficiently.
|
1276 |
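Here is a small sketch of what this means for a YSH string. It assumes that `len()` on a string counts bytes rather than code points; the variable name `mu` is made up.

```ysh
var mu = u'\u{3bc}'        # the code point U+03BC (Greek small letter mu)

echo $[len(mu)]            # => 2, since UTF-8 encodes it as two bytes

if (mu === b'\yce\ybc') {  # the same two bytes, written with byte escapes
  echo 'same string'
}  # => same string
```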
|
1277 | <!-- TODO: there's a runes() iterator which gives integer offsets, usable for
|
1278 | slicing -->
|
1279 |
|
1280 | <!-- TODO: write about J8 notation -->
|
1281 |
|
1282 | ### Lines of Text (traditional), and JSON/J8 Strings
|
1283 |
|
1284 | Traditional Unix tools like `grep` and `awk` operate on streams of lines. YSH
|
1285 | supports this style, like any other shell.
|
1286 |
|
1287 | But YSH also has [J8 Notation][], a data format based on [JSON][]. It's a 100%
|
1288 | compatible upgrade that fixes some warts in JSON, and makes Unix text and JSON
|
1289 | work together more smoothly.
|
1290 |
|
1291 | ---
|
1292 |
|
1293 | [J8 Notation]: j8-notation.html
|
1294 |
|
1295 | Let's talk about simple strings and lines first. Here is YSH code for making a
|
1296 | string with 2 lines:
|
1297 |
|
1298 | var mystr = u'pea\n' ++ u'42\n'
|
1299 |
|
1300 | Now we can **encode** it into a message, which will fit on a single line.
|
1301 |
|
1302 | json write (mystr) > message.txt
|
1303 |
|
1304 | Now we can compress `message.txt`, encrypt it, and send it to another computer.
|
1305 |
|
1306 | And then we can **decode** it, i.e. read it back into a variable:
|
1307 |
|
1308 | json read (&x) < message.txt
|
1309 | = x # => "pea\n42\n"
|
1310 |
|
1311 | <!--
|
1312 | This can also be done with functions like `toJson()` and `fromJson()`
|
1313 |
|
1314 | write $[toJson(mystr)] # => "pea\n42\n"
|
1315 |
|
1316 | # JSON8 is the same, but it's not lossy for binary data
|
1317 | write $[toJson8(mystr)] # => "pea\n42\n"
|
1318 |
|
1319 | -->
|
1320 |
|
1321 | ### Structured: JSON8, TSV8
|
1322 |
|
1323 | In addition to strings and lines, you can write and read **tree-shaped** data
|
1324 | as [JSON][]:
|
1325 |
|
1326 | var d = {key: 'value'}
|
1327 | json write (d) # dump variable d as JSON
|
1328 | # =>
|
1329 | # {
|
1330 | # "key": "value"
|
1331 | # }
|
1332 |
|
1333 | echo '["ale", 42]' > example.json
|
1334 |
|
1335 | json read (&d2) < example.json # parse JSON into var d2
|
1336 | pp (d2) # pretty print it
|
1337 | # => (List) ['ale', 42]
|
1338 |
|
1339 | [JSON][] will lose information when strings have binary data, but the slight
|
1340 | [JSON8]($xref) upgrade won't:
|
1341 |
|
1342 | var b = {binary: $'\xff'}
|
1343 | json8 write (b)
|
1344 | # =>
|
1345 | # {
|
1346 | # "binary": b'\yff'
|
1347 | # }
|
1348 |
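Once JSON is decoded, it's an ordinary YSH data structure, so you can index into it with the expression language. A minimal sketch, where the file name `person.json` and its contents are made up:

```ysh
echo '{"name": "bob", "age": 42}' > person.json

json read (&person) < person.json   # decode into a Dict

write $[person['name']] $[person['age'] + 1]
# =>
# bob
# 43
```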
|
1349 | [JSON]: $xref
|
1350 |
|
1351 | **Table-shaped** data can be read and written as [TSV8]($xref). (TODO: not yet
|
1352 | implemented.)
|
1353 |
|
1354 | <!-- Figure out the API. Does it work like JSON?
|
1355 |
|
1356 | Or I think we just implement
|
1357 | - rows: 'where' or 'filter' (dplyr)
|
1358 | - cols: 'select' conflicts with shell builtin; call it 'cols'?
|
1359 | - sort: 'sort-by' or 'arrange' (dplyr)
|
1360 | - TSV8 <=> sqlite conversion. Are these drivers or what?
|
1361 | - and then let you pipe output?
|
1362 |
|
1363 | Do we also need TSV8 space2tab or something? For writing TSV8 inline.
|
1364 |
|
1365 | More later:
|
1366 | - MessagePack (e.g. for shared library extension modules)
|
1367 | - msgpack read, write? I think user-defined function could be like this?
|
1368 | - SASH: Simple and Strict HTML? For easy processing
|
1369 | -->
|
1370 |
|
1371 | ## The Runtime Shared by OSH and YSH
|
1372 |
|
1373 | Although we describe OSH and YSH as different languages, they use the **same**
|
1374 | interpreter under the hood. This interpreter has various `shopt` flags that
|
1375 | are flipped for different behavior, e.g. with `shopt --set ysh:all`.
|
1376 |
|
1377 | Understanding this interpreter and its interface to the Unix kernel will help
|
1378 | you understand **both** languages!
|
1379 |
|
1380 | ### Interpreter Data Model
|
1381 |
|
1382 | The [Interpreter State](interpreter-state.html) doc is **under construction**.
|
1383 | It will cover:
|
1384 |
|
1385 | - Two separate namespaces (like Lisp 1 vs. 2):
|
1386 | - **proc** namespace for procs as the first word
|
1387 | - **variable** namespace
|
1388 | - The variable namespace has a **call stack**, for the local variables of a
|
1389 | proc.
|
1390 | - Each **stack frame** is a `{name -> cell}` mapping.
|
1391 | - A **cell** has one of the above data types: `Bool`, `Int`, `Str`, etc.
|
1392 | - A cell has `readonly`, `export`, and `nameref` **flags**.
|
1393 | - Boolean shell options with `shopt`: `parse_paren`, `simple_word_eval`, etc.
|
1394 | - String shell options with `shvar`: `IFS`, `PATH`
|
1395 | - **Registers** that are silently modified by the interpreter
|
1396 | - `$?` and `_error`
|
1397 | - `$!` for the last PID
|
1398 | - `_this_dir`
|
1399 | - `_reply`
|
1400 |
|
1401 | ### Process Model (the kernel)
|
1402 |
|
1403 | The [Process Model](process-model.html) doc is **under construction**. It will cover:
|
1404 |
|
1405 | - Simple Commands, `exec`
|
1406 | - Pipelines. #[shell-the-good-parts](#blog-tag)
|
1407 | - `fork`, `forkwait`
|
1408 | - Command and process substitution.
|
1409 | - Related links:
|
1410 | - [Tracing execution in Oils](xtrace.html) (xtrace), which divides
|
1411 | process-based concurrency into **synchronous** and **async** constructs.
|
1412 | - [Three Comics For Understanding Unix
|
1413 | Shell](http://www.oilshell.org/blog/2020/04/comics.html) (blog)
|
1414 |
|
1415 |
|
1416 | <!--
|
1417 | Process model additions: Capers, Headless shell
|
1418 |
|
1419 | some optimizations: See YSH starts fewer processes than other shells.
|
1420 | -->
|
1421 |
|
1422 | ## Summary
|
1423 |
|
1424 | YSH is a large language that evolved from Unix shell. It has shell-like
|
1425 | commands, Python-like expressions on typed data, and Ruby-like command blocks.
|
1426 |
|
1427 | Even though it's large, you can "forget" the bad parts of shell like `[ $x -lt
|
1428 | $y ]`.
|
1429 |
|
1430 | These concepts are central to YSH:
|
1431 |
|
1432 | 1. Interleaved *word*, *command*, and *expression* languages.
|
1433 | 2. A standard library of *shell builtins*, as well as *builtin functions*
|
1434 | 3. Languages for *data*: J8 Notation, including JSON8 and TSV8
|
1435 | 4. A *runtime* shared by OSH and YSH
|
1436 |
|
1437 | ## Related Docs
|
1438 |
|
1439 | - [YSH vs. Shell Idioms](idioms.html) - YSH side-by-side with shell.
|
1440 | - [YSH Language Influences](language-influences.html) - In addition to shell,
|
1441 | Python, and JavaScript, YSH is influenced by Ruby, Perl, Awk, PHP, and more.
|
1442 | - [A Feel For YSH Syntax](syntax-feelings.html) - Some thoughts that may help
|
1443 | you remember the syntax.
|
1444 | - [YSH Language Warts](warts.html) documents syntax that may be surprising.
|
1445 |
|
1446 | ## Appendix: Features Not Shown
|
1447 |
|
1448 | ### Advanced
|
1449 |
|
1450 | These shell features are part of YSH, but aren't shown for brevity.
|
1451 |
|
1452 | - The `fork` and `forkwait` builtins, for concurrent execution and subshells.
|
1453 | - Process Substitution: `diff <(sort left.txt) <(sort right.txt)`
|
1454 |
|
1455 | ### Deprecated Shell Constructs
|
1456 |
|
1457 | The shared interpreter supports many shell constructs that are deprecated:
|
1458 |
|
1459 | - YSH code uses shell's `||` and `&&` in limited circumstances, since `errexit`
|
1460 | is on by default.
|
1461 | - Assignment builtins like `local` and `declare`. Use YSH keywords.
|
1462 | - Boolean expressions like `[[ x =~ $pat ]]`. Use YSH expressions.
|
1463 | - Shell arithmetic like `$(( x + 1 ))` and `(( y = x ))`. Use YSH expressions.
|
1464 | - The `until` loop can always be replaced with a `while` loop.
|
1465 | - Most of what's in `${}` can be written in other ways. For example
|
1466 | `${s#/tmp}` could be `s => removePrefix('/tmp')` (TODO).
|
1467 |
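As a rough guide, here is how a few of those constructs might be written in YSH instead. This is a sketch with made-up variable names, not an exhaustive mapping.

```ysh
var x = 42                # instead of: local x=42
setvar x = x + 1          # instead of: (( x = x + 1 ))
echo $[x + 1]             # instead of: echo $(( x + 1 ))   => 44

var s = 'ale-42'
if (s ~ / digit+ /) {     # instead of: [[ $s =~ [0-9]+ ]]
  echo 'has digits'
}  # => has digits

var i = 0
while (i < 3) {           # instead of: until (( i >= 3 )); do ...; done
  setvar i += 1
}
echo $i                   # => 3
```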
|
1468 | ### Not Yet Implemented
|
1469 |
|
1470 | This document mentions a few constructs that aren't yet implemented. Here's a
|
1471 | summary:
|
1472 |
|
1473 | ```none
|
1474 | # Unimplemented syntax:
|
1475 |
|
1476 | echo ${x|html} # formatters
|
1477 |
|
1478 | echo ${x %.2f} # statically-parsed printf
|
1479 |
|
1480 | var x = "<p>$x</p>"html
|
1481 | echo "<p>$x</p>"html # tagged string
|
1482 |
|
1483 | var x = 15 Mi # units suffix
|
1484 | ```
|
1485 |
|
1486 | <!--
|
1487 | - To implement: Capers: stateless coprocesses
|
1488 | -->
|
1489 |
|
1490 | ## Appendix: Example of a YSH Module
|
1491 |
|
1492 | YSH can be used to write simple "shell scripts" or longer programs. It has
|
1493 | *procs* and *modules* to help with the latter.
|
1494 |
|
1495 | A module is just a file, like this:
|
1496 |
|
1497 | ```
|
1498 | #!/usr/bin/env ysh
|
1499 | ### Deploy script
|
1500 |
|
1501 | use $_this_dir/lib/util.ysh --pick log
|
1502 |
|
1503 | const DEST = '/tmp/ysh-tour'
|
1504 |
|
1505 | proc my-sync(...files) {
|
1506 | ### Sync files and show which ones
|
1507 |
|
1508 | cp --verbose @files $DEST
|
1509 | }
|
1510 |
|
1511 | proc main {
|
1512 | mkdir -p $DEST
|
1513 |
|
1514 | touch {foo,bar}.py {build,test}.sh
|
1515 |
|
1516 | log "Copying source files"
|
1517 | my-sync *.py *.sh
|
1518 |
|
1519 | if test --dir /tmp/logs {
|
1520 | cd /tmp/logs
|
1521 |
|
1522 | log "Copying logs"
|
1523 | my-sync *.log
|
1524 | }
|
1525 | }
|
1526 |
|
1527 | if is-main { # The only top-level statement
|
1528 | main @ARGV
|
1529 | }
|
1530 | ```
|
1531 |
|
1532 | <!--
|
1533 | TODO:
|
1534 | - Also show flags parsing?
|
1535 | - Show longer examples where it isn't boilerplate
|
1536 | -->
|
1537 |
|
1538 | You wouldn't bother with the boilerplate for something this small. But this
|
1539 | example illustrates the basic idea: the top level often contains these words:
|
1540 | `use`, `const`, `proc`, and `func`.
|
1541 |
|