1---
2default_highlighter: oils-sh
3---
4
5YSH vs. Shell Idioms
6====================
7
8This is an informal, lightly-organized list of recommended idioms for the
9[YSH]($xref) language. Each section has snippets labeled *No* and *Yes*.
10
11- Use the *Yes* style when you want to write in YSH, and don't care about
12 compatibility with other shells.
13- The *No* style is discouraged in new code, but YSH will run it. The [OSH
14 language]($xref:osh-language) is compatible with
15 [POSIX]($xref:posix-shell-spec) and [bash]($xref).
16
17[J8 Notation]: j8-notation.html
18
19<!-- cmark.py expands this -->
20<div id="toc">
21</div>
22
23## Use [Simple Word Evaluation](simple-word-eval.html) to Avoid "Quoting Hell"
24
25### Substitute Variables
26
27No:
28
29 local x='my song.mp3'
30 ls "$x" # quotes required to avoid mangling
31
32Yes:
33
34 var x = 'my song.mp3'
35 ls $x # no quotes needed
36
37### Splice Arrays
38
39No:
40
41 local myflags=( --all --long )
42 ls "${myflags[@]}" "$@"
43
44Yes:
45
46 var myflags = :| --all --long |
47 ls @myflags @ARGV
48
49### Explicitly Split, Glob, and Omit Empty Args
50
51YSH doesn't split arguments after variable expansion.
52
53No:
54
55 local packages='python-dev gawk'
56 apt install $packages
57
58Yes:
59
60 var packages = 'python-dev gawk'
61 apt install @[split(packages)]
62
63Even better:
64
65 var packages = :| python-dev gawk | # array literal
66 apt install @packages # splice array
67
68---
69
70YSH doesn't glob after variable expansion.
71
72No:
73
74 local pat='*.py'
75 echo $pat
76
77
78Yes:
79
80 var pat = '*.py'
81 echo @[glob(pat)] # explicit call
82
83---
84
85YSH doesn't omit unquoted words that evaluate to the empty string.
86
87No:
88
89 local e=''
90 cp $e other $dest # cp gets 2 args, not 3, in sh
91
92Yes:
93
94 var e = ''
95 cp @[maybe(e)] other $dest # explicit call
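
The shell behaviors that YSH drops here are easy to observe in any POSIX shell. This is a minimal sketch; `count_args` is a hypothetical helper, not something from YSH or this doc, that prints how many arguments it received:

```shell
# count_args is a hypothetical helper: it prints the number of arguments it got.
count_args() {
  echo "$#"
}

packages='python-dev gawk'
count_args $packages          # unquoted: split on whitespace, 2 args
count_args "$packages"        # quoted: 1 arg

e=''
count_args $e other dest      # unquoted empty string is dropped, 2 args
count_args "$e" other dest    # quoted empty string is kept, 3 args
```

In YSH, `$packages` and `$e` behave like the quoted forms, so splitting and omitting become explicit calls to `split()` and `maybe()`.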
96
97### Iterate a Number of Times (Split Command Sub)
98
99No:
100
101 local n=3
102 for x in $(seq $n); do # No implicit splitting of unquoted words in YSH
103 echo $x
104 done
105
106OK:
107
108 var n = 3
109 for x in @(seq $n) { # Explicit splitting
110 echo $x
111 }
112
Better:
114
115 var n = 3
116 for x in (1 .. n+1) { # Range, avoids external program
117 echo $x
118 }
119
120Note that `{1..3}` works in bash and YSH, but the numbers must be constant.
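
In bash, the restriction comes from the order of expansions: brace expansion runs before variables are substituted. A quick check, assuming `bash` is installed and on your `PATH`:

```shell
# The braces are examined before $n is substituted, so no range is produced.
bash -c 'n=3; echo {1..$n}'    # prints the literal {1..3}

# With constant bounds, the range expands.
bash -c 'echo {1..3}'          # prints 1 2 3
```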
121
122## Strings and Quoting
123
124### Use J8 strings for C-style escapes, like `\n`
125
126Bash adds C-style strings to shell with the `$''` syntax. YSH looks more like
127Python or Rust:
128
129No:
130
131 echo $'hi \n two' # two lines
132
133 echo $'mu \u03bc -- smiley \U0001f642' # \u and \U escapes
134
135 echo $'raw byte \xff'
136
137Yes:
138
139 echo u'hi \n two' # two lines
140
141 echo u'mu \u{3bc} -- smiley \u{1f642}' # one consistent syntax \u{}
142
143 echo b'raw byte \yff' # byte escapes with \yff
144
145Note that the syntax of YSH **code** matches our syntax for **data**, which is
146based on JSON. See [Data Languages](ref/toc-data.html) > [J8
147Notation](ref/chap-j8.html).
148
149### Use Multi-line Strings instead of Here Docs
150
151YSH has multi-line strings with `'''` and `"""`. Combined with the `<<<`
152operator, they replace here docs.
153
154No:
155
156 cat <<EOF
157 hello $name
158 bye
159 EOF
160
161Yes:
162
163 cat <<< """
164 hello $name
165 bye
166 """
167
168No:
169
170 cat <<'EOF'
171 prize is $4.99
172 EOF
173
174Yes:
175
176 cat <<< '''
177 prize is $4.99
178 '''
179
180If there's whitespace before the closing quote, the same whitespace is removed
181from every line. Compare with shell:
182
183No:
184
185 # special <<- operator removes leading tabs
186 cat <<-EOF
187 hello $name
188 EOF
189 # ^^^^^^ tabs only, not spaces
190
191Yes:
192
193 cat <<< """
194 hello $name
195 """
196 # ^^ indent can consist of spaces, making code look nicer
197
198## Avoid Ad Hoc Parsing and Splitting
199
200In other words, avoid *groveling through backslashes and spaces* in shell.
201
202Instead, emit and consume [J8 Notation]($xref:j8-notation):
203
204- J8 strings are [JSON]($xref) strings, with an upgrade for byte string
205 literals
206- [JSON8]($xref) is [JSON]($xref), with this same upgrade
207- [TSV8]($xref) is TSV with this upgrade (not yet implemented)
208
209Custom parsing and serializing should be limited to "the edges" of your YSH
210programs.
211
212### More Strategies For Structured Data
213
214- **Wrap** and Adapt External Tools. Parse their output, and emit [J8 Notation][].
215 - These can be one-off, "bespoke" wrappers in your program, or maintained
216 programs. Use the `proc` construct and `flagspec`!
217 - Example: [uxy](https://github.com/sustrik/uxy) wrappers.
218 - TODO: Examples written in YSH and in other languages.
219- **Patch** Existing Tools.
220 - Enhance GNU grep, etc. to emit [J8 Notation][]. Add a
221 `--j8` flag.
222- **Write Your Own** Structured Versions.
223 - For example, you can write a structured subset of `ls` in Python with
224 little effort.
225
226<!--
227 ls -q and -Q already exist, but --j8 or --tsv8 is probably fine
228-->
229
230## The `write` Builtin Is Simpler Than `printf` and `echo`
231
232### Write an Arbitrary Line
233
234No:
235
236 printf '%s\n' "$mystr"
237
238Yes:
239
240 write -- $mystr
241
242The `write` builtin accepts `--` so it doesn't confuse flags and args.
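
The confusion between flags and args is easy to reproduce in shell. In this sketch, `mystr` happens to hold a value that the `echo` builtin in bash and dash treats as an option:

```shell
mystr='-n'

# echo takes the value as its own -n flag and prints nothing at all.
echo "$mystr"

# printf with an explicit format prints the value literally.
printf '%s\n' "$mystr"
```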
243
244### Write Without a Newline
245
246No:
247
248 echo -n "$mystr" # breaks if mystr is -e
249
250Yes:
251
252 write --end '' -- $mystr
253 write -n -- $mystr # -n is an alias for --end ''
254
255### Write an Array of Lines
256
257 var myarray = :| one two three |
258 write -- @myarray
259
260## New Long Flags on the `read` builtin
261
262### Read a Line
263
264No:
265
266 read line # Mangles your backslashes!
267
268Better:
269
270 read -r line # Still messes with leading and trailing whitespace
271
272 IFS= read -r line # OK, but doesn't work in YSH
273
274Yes:
275
276 read --raw-line # Gives you the line, without trailing \n
277
278(Note that `read --raw-line` is still an unbuffered read, which means it slowly
279reads a byte at a time. We plan to add buffered reads as well.)
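
The comments on the shell variants above can be checked with a short experiment. This is a sketch for POSIX-style shells; the three `read_*` functions are hypothetical helpers that read one line and print it in brackets:

```shell
# Two leading spaces, a literal backslash followed by t, two trailing spaces.
input='  a\tb  '

# Hypothetical helpers: each reads one line from stdin and prints it in brackets.
read_plain() { read line;         printf '[%s]\n' "$line"; }
read_raw()   { read -r line;      printf '[%s]\n' "$line"; }
read_ifs()   { IFS= read -r line; printf '[%s]\n' "$line"; }

printf '%s\n' "$input" | read_plain   # [atb]       backslash eaten, whitespace stripped
printf '%s\n' "$input" | read_raw     # [a\tb]      backslash kept, whitespace stripped
printf '%s\n' "$input" | read_ifs     # [  a\tb  ]  line preserved
```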
280
281### Read a Whole File
282
283No:
284
285 read -d '' # harder to read, easy to forget -r
286
287Yes:
288
289 read --all # sets $_reply
290 read --all (&myvar) # sets $myvar
291
292### Read Lines of a File
293
294No:
295
296 # The IFS= idiom doesn't work in YSH, because of dynamic scope!
297 while IFS= read -r line; do
298 echo $line
299 done
300
301Yes:
302
303 while read --raw-line {
304 echo $_reply
305 }
306 # this reads a byte at a time, unbuffered, like shell
307
308Yes:
309
310 for line in (io.stdin) {
311 echo $line
312 }
313 # this reads buffered lines, which is much faster
314
315### Read a Number of Bytes
316
317No:
318
319 read -n 3 # slow because it respects -d delim
320 # also strips whitespace
321
322Better:
323
324 read -N 3 # good behavior, but easily confused with -n
325
326Yes:
327
328 read --num-bytes 3 # sets $_reply
329 read --num-bytes 3 (&myvar) # sets $myvar
330
331
332### Read Until `\0` (consume `find -print0`)
333
334No:
335
336 # Obscure syntax that bash accepts, but not other shells
337 read -r -d '' myvar
338
339Yes:
340
341 read -0 (&myvar)
342
343## YSH Enhancements to Builtins
344
345### Use `shopt` Instead of `set`
346
347Using a single builtin for all options makes scripts easier to read:
348
349Discouraged:
350
351 set -o errexit
352 shopt -s dotglob
353
354Idiomatic:
355
356 shopt --set errexit
357 shopt --set dotglob
358
359(As always, `set` can be used when you care about compatibility with other
360shells.)
361
362### Use `&` When Mentioning Variable Names
363
364YSH uses [places](variables.html#return-by-mutating-a-place-advanced) to make
365out-parameters of procs more explicit.
366
367No:
368
369 read -0 record < file.bin
370 echo $record
371
372Yes:
373
    read -0 (&record) < file.bin
    echo $record
376
377
378### Consider Using `--long-flags`
379
380Easier to write:
381
382 test -d /tmp
383 test -d / && test -f /vmlinuz
384
385 shopt -u extglob
386
387Easier to read:
388
389 test --dir /tmp
390 test --dir / && test --file /vmlinuz
391
392 shopt --unset extglob
393
394## Use Blocks to Save and Restore Context
395
396### Do Something In Another Directory
397
398No:
399
400 ( cd /tmp; echo $PWD ) # subshell is unnecessary (and limited)
401
402No:
403
404 pushd /tmp
405 echo $PWD
406 popd
407
408Yes:
409
410 cd /tmp {
411 echo $PWD
412 }
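
One way a subshell is limited (this is an assumption about what the comment above refers to) is that nothing it changes is visible afterwards. A minimal sketch in plain shell:

```shell
found=no

# The subshell works on its own copy of the shell state.
( cd /tmp; found=yes )

# Both the assignment and the directory change are gone here.
echo "$found"    # prints no
```

The `cd /tmp { ... }` form restores the directory without creating a process.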
413
414### Batch I/O
415
416No:
417
418 echo 1 > out.txt
419 echo 2 >> out.txt # appending is less efficient
420 # because open() and close()
421
422No:
423
424 { echo 1
425 echo 2
426 } > out.txt
427
428Yes:
429
430 redir > out.txt {
431 echo 1
432 echo 2
433 }
434
435The `redir` builtin is syntactic sugar -- it lets you see redirects before the
436code that uses them.
437
438### Temporarily Set Shell Options
439
440No:
441
442 set +o errexit
443 myfunc # without error checking
444 set -o errexit
445
446Yes:
447
448 shopt --unset errexit {
449 myfunc
450 }
451
452### Use the `forkwait` builtin for Subshells, not `()`
453
454No:
455
456 ( cd /tmp; rm *.sh )
457
458Yes:
459
460 forkwait {
461 cd /tmp
462 rm *.sh
463 }
464
465Better:
466
467 cd /tmp { # no process created
468 rm *.sh
469 }
470
471### Use the `fork` builtin for async, not `&`
472
473No:
474
475 myfunc &
476
477 { sleep 1; echo one; sleep 2; } &
478
479Yes:
480
481 fork { myfunc }
482
483 fork { sleep 1; echo one; sleep 2 }
484
485## Use Procs (Better Shell Functions)
486
487### Use Named Parameters Instead of `$1`, `$2`, ...
488
489No:
490
491 f() {
492 local src=$1
493 local dest=${2:-/tmp}
494
495 cp "$src" "$dest"
496 }
497
498Yes:
499
500 proc f(src, dest='/tmp') { # Python-like default values
501 cp $src $dest
502 }
503
504### Use Named Varargs Instead of `"$@"`
505
506No:
507
508 f() {
509 local first=$1
510 shift
511
512 echo $first
513 echo "$@"
514 }
515
516Yes:
517
518 proc f(first, @rest) { # @ means "the rest of the arguments"
519 write -- $first
520 write -- @rest # @ means "splice this array"
521 }
522
523You can also use the implicit `ARGV` variable:
524
525 proc p {
526 cp -- @ARGV /tmp
527 }
528
529### Use "Out Params" instead of `declare -n`
530
531Out params are one way to "return" values from a `proc`.
532
533No:
534
535 f() {
536 local in=$1
537 local -n out=$2
538
539 out=PREFIX-$in
540 }
541
542 myvar='init'
543 f zzz myvar # assigns myvar to 'PREFIX-zzz'
544
545
546Yes:
547
548 proc f(in, :out) { # : is an out param, i.e. a string "reference"
549 setref out = "PREFIX-$in"
550 }
551
552 var myvar = 'init'
553 f zzz :myvar # assigns myvar to 'PREFIX-zzz'.
554 # colon is required
555
556### Note: Procs Don't Mess With Their Callers
557
558That is, [dynamic scope]($xref:dynamic-scope) is turned off when procs are
559invoked.
560
561Here's an example of shell functions reading variables in their caller:
562
563 bar() {
564 echo $foo_var # looks up the stack
565 }
566
567 foo() {
568 foo_var=x
569 bar
570 }
571
572 foo
573
574In YSH, you have to pass params explicitly:
575
576 proc bar {
577 echo $foo_var # error, not defined
578 }
579
580Shell functions can also **mutate** variables in their caller! But procs can't
581do this, which makes code easier to reason about.
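
Here's a sketch of that mutation in shell. It relies on `local`, which isn't in POSIX but is supported by bash and dash:

```shell
# bar changes a variable that belongs to its caller.
bar() {
  foo_var='changed by bar'
}

foo() {
  local foo_var='set by foo'   # local to foo, yet visible to everything foo calls
  bar
  echo "$foo_var"
}

foo    # prints: changed by bar
```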
582
583## Use Modules
584
585YSH has a few lightweight features that make it easier to organize code into
586files. It doesn't have "namespaces".
587
588### Relative Imports
589
590Suppose we are running `bin/mytool`, and we want `BASE_DIR` to be the root of
591the repository so we can do a relative import of `lib/foo.sh`.
592
593No:
594
    # All of these are common idioms, with caveats
    BASE_DIR=$(dirname $0)/..

    BASE_DIR=$(dirname ${BASH_SOURCE[0]})/..

    BASE_DIR=$(cd $(dirname $0)/.. && pwd)

    BASE_DIR=$(dirname $(dirname $(readlink -f $0)))

    source $BASE_DIR/lib/foo.sh
605
606Yes:
607
    const BASE_DIR = "$_this_dir/.."

    source $BASE_DIR/lib/foo.sh

    # Or simply:
    source $_this_dir/../lib/foo.sh
614
615The value of `_this_dir` is the directory that contains the currently executing
616file.
617
618### Include Guards
619
620No:
621
622 # libfoo.sh
623 if test -z "$__LIBFOO_SH"; then
624 return
625 fi
626 __LIBFOO_SH=1
627
628Yes:
629
630 # libfoo.sh
631 module libfoo.sh || return 0
632
633### Taskfile Pattern
634
635No:
636
637 deploy() {
638 echo ...
639 }
640 "$@"
641
Yes:
643
644 proc deploy() {
645 echo ...
646 }
647 runproc @ARGV # gives better error messages
648
649## Error Handling
650
651[YSH Fixes Shell's Error Handling (`errexit`)](error-handling.html) once and
652for all! Here's a comprehensive list of error handling idioms.
653
654### Don't Use `&&` Outside of `if` / `while`
655
Chaining commands with `&&` is unnecessary: `errexit` is on in YSH, so a failing command already stops the script.
657
658No:
659
660 mkdir /tmp/dest && cp foo /tmp/dest
661
662Yes:
663
664 mkdir /tmp/dest
665 cp foo /tmp/dest
666
667It also avoids the *Trailing `&&` Pitfall* mentioned at the end of the [error
668handling](error-handling.html) doc.
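
If you haven't seen the pitfall, here is a sketch of the behavior as commonly observed in POSIX-style shells; the linked doc has the details. Each case runs in a child `sh` so the failure doesn't take down the surrounding script:

```shell
# At the top level, a failing left-hand side of && does not trigger errexit.
sh -c 'set -e
test -d /nonexistent-dir && echo exists
echo "top level: still running"'

# As the last command of a function, the same line makes the function return 1,
# and that failure does trigger errexit in the caller.
sh -c 'set -e
f() { test -d /nonexistent-dir && echo exists; }
f
echo "after f: never printed"' || echo "script aborted with status $?"
```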
669
670### Ignore an Error
671
672No:
673
674 ls /bad || true # OK because ls is external
675 myfunc || true # suffers from the "Disabled errexit Quirk"
676
677Yes:
678
679 try { ls /bad }
680 try { myfunc }
681
682### Retrieve A Command's Status When `errexit` is On
683
684No:
685
686 # set -e is enabled earlier
687
688 set +e
689 mycommand # this ignores errors when mycommand is a function
690 status=$? # save it before it changes
691 set -e
692
693 echo $status
694
695Yes:
696
697 try {
698 mycommand
699 }
700 echo $[_error.code]
701
702### Does a Builtin Or External Command Succeed?
703
704These idioms are OK in both shell and YSH:
705
706 if ! cp foo /tmp {
707 echo 'error copying' # any non-zero status
708 }
709
710 if ! test -d /bin {
711 echo 'not a directory'
712 }
713
714To be consistent with the idioms below, you can also write them like this:
715
716 try {
717 cp foo /tmp
718 }
719 if failed { # shortcut for (_error.code !== 0)
720 echo 'error copying'
721 }
722
723### Does a Function Succeed?
724
725When the command is a shell function, you shouldn't use `if myfunc` directly.
726This is because shell has the *Disabled `errexit` Quirk*, which is detected by
727YSH `strict_errexit`.
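
The quirk itself can be reproduced in a few lines of plain shell. In this sketch the script runs in a child `sh`, and `myfunc` is just a stand-in for any function whose body contains a failing command:

```shell
sh -c 'set -e
myfunc() {
  false                      # would normally abort the script under errexit
  echo "myfunc kept going"
}
if myfunc; then              # errexit is silently disabled inside myfunc
  echo "success"
fi'
```

Both lines are printed: the failure of `false` is ignored, and `myfunc` reports the status of its last command.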
728
729**No**:
730
731 if myfunc; then # errors not checked in body of myfunc
732 echo 'success'
733 fi
734
735**Yes**. The *`$0` Dispatch Pattern* is a workaround that works in all shells.
736
737 if $0 myfunc; then # invoke a new shell
738 echo 'success'
739 fi
740
741 "$@" # Run the function $1 with args $2, $3, ...
742
743**Yes**. The YSH `try` builtin sets the special `_error` variable and returns
744`0`.
745
    try {
      myfunc  # doesn't abort
    }
    if failed {
      echo 'failed'
    }
752
753### Does a Pipeline Succeed?
754
755No:
756
757 if ps | grep python; then
758 echo 'found'
759 fi
760
761This is technically correct when `pipefail` is on, but it's impossible for
762YSH `strict_errexit` to distinguish it from `if myfunc | grep python` ahead
763of time (the ["meta" pitfall](error-handling.html#the-meta-pitfall)). If you
764know what you're doing, you can disable `strict_errexit`.
765
766Yes:
767
    try {
      ps | grep python
    }
    if failed {
      echo 'not found'
    }

    # You can also examine the status of each part of the pipeline
    if (_pipeline_status[0] !== 0) {
      echo 'ps failed'
    }
779
780### Does a Command With Process Subs Succeed?
781
782Similar to the pipeline example above:
783
784No:
785
786 if ! comm <(sort left.txt) <(sort right.txt); then
787 echo 'error'
788 fi
789
790Yes:
791
792 try {
793 comm <(sort left.txt) <(sort right.txt)
794 }
795 if failed {
796 echo 'error'
797 }
798
799 # You can also examine the status of each process sub
800 if (_process_sub_status[0] !== 0) {
801 echo 'first process sub failed'
802 }
803
804(I used `comm` in this example because it doesn't have a true / false / error
805status like `diff`.)
806
807### Handle Errors in YSH Expressions
808
809 try {
810 var x = 42 / 0
811 echo "result is $[42 / 0]"
812 }
813 if failed {
814 echo 'divide by zero'
815 }
816
817### Test Boolean Statuses, like `grep`, `diff`, `test`
818
819The YSH `boolstatus` builtin distinguishes **error** from **false**.
820
821**No**, this is subtly wrong. `grep` has 3 different return values.
822
823 if grep 'class' *.py {
824 echo 'found' # status 0 means found
825 } else {
826 echo 'not found OR ERROR' # any non-zero status
827 }
828
**Yes**. `boolstatus` aborts the program if `grep` doesn't return 0 or 1.
830
831 if boolstatus grep 'class' *.py { # may abort
832 echo 'found' # status 0 means found
833 } else {
834 echo 'not found' # status 1 means not found
835 }
836
837More flexible style:
838
839 try {
840 grep 'class' *.py
841 }
842 case (_error.code) {
843 (0) { echo 'found' }
844 (1) { echo 'not found' }
845 (else) { echo 'fatal' }
846 }
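
The three statuses of `grep` that this section relies on can be observed in plain shell. This is a sketch; `grep_status` is a hypothetical helper that prints the exit status, and the temporary file is made up for the demonstration:

```shell
tmp=$(mktemp -d)
printf 'class Foo: pass\n' > "$tmp/a.py"

# grep_status is a hypothetical helper: it prints grep's exit status.
grep_status() {
  if grep -q "$1" "$2" 2>/dev/null; then
    echo 0
  else
    echo "$?"
  fi
}

grep_status 'class'   "$tmp/a.py"         # 0: found
grep_status 'nomatch' "$tmp/a.py"         # 1: not found
grep_status 'class'   "$tmp/missing.py"   # greater than 1: an error (2 with GNU grep)

rm -r "$tmp"
```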
847
848## Use YSH Expressions, Initializations, and Assignments (var, setvar)
849
850### Set an Environment Variable Globally
851
852No:
853
854 export PYTHONPATH=. # export is disabled in YSH
855
856Yes:
857
858 setglobal ENV.PYTHONPATH = '.'
859
That is, environment variables live in the [ENV][] object/namespace, rather than
being global variables.
862
863[ENV]: ref/chap-special-var.html#ENV
864
865Note: the idiom for setting an env var locally is unchanged:
866
867 PYTHONPATH=. myscript.py
868
869### Initialize and Assign Strings and Integers
870
871No:
872
873 local mystr=foo
874 mystr='new value'
875
876 local myint=42 # still a string in shell
877
878Yes:
879
880 var mystr = 'foo'
881 setvar mystr = 'new value'
882
883 var myint = 42 # a real integer
884
885### Expressions on Integers
886
887No:
888
889 x=$(( 1 + 2*3 ))
890 (( x = 1 + 2*3 ))
891
892Yes:
893
894 setvar x = 1 + 2*3
895
896### Mutate Integers
897
898No:
899
900 (( i++ )) # interacts poorly with errexit
901 i=$(( i+1 ))
902
903Yes:
904
905 setvar i += 1 # like Python, with a keyword
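
The poor interaction with `errexit` looks like this in bash. A sketch, assuming `bash` is installed; each case runs in a child process so the abort is contained:

```shell
# (( i++ )) evaluates to the old value, 0, which bash reports as status 1.
bash -c 'set -e; i=0; (( i++ )); echo "i=$i"' || echo "aborted with status $?"

# The assignment form always succeeds.
bash -c 'set -e; i=0; i=$(( i+1 )); echo "i=$i"'    # prints i=1
```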
906
907### Initialize and Assign Arrays
908
909Arrays in YSH look like `:| my array |` and `['my', 'array']`.
910
911No:
912
913 local -a myarray=(one two three)
914 myarray[3]='THREE'
915
916Yes:
917
918 var myarray = :| one two three |
919 setvar myarray[3] = 'THREE'
920
921 var same = ['one', 'two', 'three']
922 var typed = [1, 2, true, false, null]
923
924
925### Initialize and Assign Dicts
926
927Dicts in YSH look like `{key: 'value'}`.
928
929No:
930
931 local -A myassoc=(['key']=value ['k2']=v2)
932 myassoc['key']=V
933
934
935Yes:
936
937 # keys don't need to be quoted
938 var myassoc = {key: 'value', k2: 'v2'}
939 setvar myassoc['key'] = 'V'
940
941### Get Values From Arrays and Dicts
942
943No:
944
945 local x=${a[i-1]}
946 x=${a[i]}
947
948 local y=${A['key']}
949
950Yes:
951
952 var x = a[i-1]
953 setvar x = a[i]
954
955 var y = A['key']
956
957### Conditions and Comparisons
958
959No:
960
961 if (( x > 0 )); then
962 echo 'positive'
963 fi
964
965Yes:
966
967 if (x > 0) {
968 echo 'positive'
969 }
970
971### Substituting Expressions in Words
972
973No:
974
975 echo flag=$((1 + a[i] * 3)) # C-like arithmetic
976
977Yes:
978
979 echo flag=$[1 + a[i] * 3] # Arbitrary YSH expressions
980
981 # Possible, but a local var might be more readable
982 echo flag=$['1' if x else '0']
983
984## Use [Egg Expressions](eggex.html) instead of Regexes
985
986### Test for a Match
987
988No:
989
990 local pat='[[:digit:]]+'
991 if [[ $x =~ $pat ]]; then
992 echo 'number'
993 fi
994
995Yes:
996
997 if (x ~ /digit+/) {
998 echo 'number'
999 }
1000
1001Or extract the pattern:
1002
1003 var pat = / digit+ /
1004 if (x ~ pat) {
1005 echo 'number'
1006 }
1007
1008### Extract Submatches
1009
1010No:
1011
    if [[ $x =~ foo-([[:digit:]]+) ]]; then
      echo "${BASH_REMATCH[1]}"  # first submatch
    fi
1015
1016Yes:
1017
1018 if (x ~ / 'foo-' <capture d+> /) { # <> is capture
1019 echo $[_group(1)] # first submatch
1020 }
1021
1022## Glob Matching
1023
1024No:
1025
1026 if [[ $x == *.py ]]; then
1027 echo 'Python'
1028 fi
1029
1030Yes:
1031
1032 if (x ~~ '*.py') {
1033 echo 'Python'
1034 }
1035
1036
1037No:
1038
1039 case $x in
1040 *.py)
1041 echo Python
1042 ;;
1043 *.sh)
1044 echo Shell
1045 ;;
1046 esac
1047
1048Yes (purely a style preference):
1049
1050 case $x { # curly braces
1051 (*.py) # balanced parens
1052 echo 'Python'
1053 ;;
1054 (*.sh)
1055 echo 'Shell'
1056 ;;
1057 }
1058
1059## TODO
1060
1061### Distinguish Between Variables and Functions
1062
1063- `$RANDOM` vs. `random()`
1064- `LANG=C` vs. `shopt --setattr LANG=C`
1065
1066## Related Documents
1067
1068- [Shell Language Idioms](shell-idioms.html). This advice applies to shells
1069 other than YSH.
1070- [What Breaks When You Upgrade to YSH](upgrade-breakage.html). Shell constructs that YSH
1071 users should avoid.
1072- [YSH Fixes Shell's Error Handling (`errexit`)](error-handling.html). YSH fixes the
1073 flaky error handling in POSIX shell and bash.
1074- TODO: Go through more of the [Pure Bash
1075 Bible](https://github.com/dylanaraps/pure-bash-bible). YSH provides
1076 alternatives for such quirky syntax.
1077