---
default_highlighter: oils-sh
---

Guide to Procs and Funcs
========================

YSH has two major units of code: shell-like `proc`, and Python-like `func`.

- Roughly speaking, procs are for commands and **I/O**, while funcs are for
  pure **computation**.
- Procs are often **big**, and may call **small** funcs. On the other hand,
  it's possible, but rarer, for funcs to call procs.
- You can write shell scripts **mostly** with procs, and perhaps a few funcs.

This doc compares the two mechanisms, and gives rough guidelines.

<!--
See the blog for more conceptual background: [Oils is
Exterior-First](https://www.oilshell.org/blog/2023/06/ysh-design.html).
-->

<div id="toc">
</div>

## Tip: Start Simple

Before going into detail, here's a quick reminder that you don't have to use
**either** procs or funcs. YSH is a language that scales both down and up.

You can start with just a list of plain commands:

    mkdir -p /tmp/dest
    cp --verbose *.txt /tmp/dest

Then copy those into procs as the script gets bigger:

    proc build-app {
      ninja --verbose
    }

    proc deploy {
      mkdir -p /tmp/dest
      cp --verbose *.txt /tmp/dest
    }

    build-app
    deploy

Then add funcs if you need pure computation:

    func isTestFile(name) {
      return (name => endsWith('_test.py'))
    }

    if (isTestFile('my_test.py')) {
      echo 'yes'
    }

## At a Glance

### Procs vs. Funcs

This table summarizes the difference between procs and funcs. The rest of the
doc will elaborate on these issues.

<style>
  thead {
    background-color: #eee;
    font-weight: bold;
  }
  table {
    font-family: sans-serif;
    border-collapse: collapse;
  }

  tr {
    border-bottom: solid 1px;
    border-color: #ddd;
  }

  td {
    padding: 8px;  /* override default of 5px */
  }
</style>

<table>

- thead
  - <!-- empty -->
  - Proc
  - Func
- tr
  - Design Influence
  - Shell-like.
  - Python- and JavaScript-like, but **pure**.
- tr
  - Shape
  - Procs are shaped like Unix processes: with `argv`, an integer return code,
    and `stdin` / `stdout` streams.

    They're a generalization of Bourne shell "functions".
  - Funcs are shaped like mathematical functions.
- tr
  - Architectural Role ([Oils is Exterior First](https://www.oilshell.org/blog/2023/06/ysh-design.html))
  - **Exterior**: processes and files.
  - **Interior**: functions and garbage-collected data structures.
- tr
  - I/O
  - Procs may start external processes and pipelines. Can perform I/O
    anywhere.
  - Funcs need an explicit `io` param to perform I/O.
- tr
  - Example Definition
  - ```
    proc print-max (; x, y) {
      echo $[x if x > y else y]
    }
    ```
  - ```
    func computeMax(x, y) {
      return (x if x > y else y)
    }
    ```
- tr
  - Example Call
  - ```
    print-max (3, 4)
    ```

    Procs can be put in pipelines:

    ```
    print-max (3, 4) | tee out.txt
    ```
  - ```
    var m = computeMax(3, 4)
    ```

    Or throw away the return value, which is useful for functions that mutate:

    ```
    call computeMax(3, 4)
    ```
- tr
  - Naming Convention
  - `kebab-case`
  - `camelCase`
- tr
  - [Syntax Mode](command-vs-expression-mode.html) of call site
  - Command Mode
  - Expression Mode
- tr
  - Kinds of Parameters / Arguments
  - <!-- empty -->
    1. Word aka string
    1. Typed and Positional
    1. Typed and Named
    1. Block

    Examples shown below.
  - <!-- empty -->
    1. Positional
    1. Named

    (both typed)
- tr
  - Return Value
  - Integer status 0-255
  - Any type of value, e.g.

    ```
    return ([42, {name: 'bob'}])
    ```
- tr
  - Can it be a method on an object?
  - No
  - Yes, funcs may be bound to objects:

    ```
    var x = obj.myMethod()
    call obj->myMutatingMethod()
    ```
- tr
  - Interface Evolution
  - **Slower**: Procs exposed to the outside world may need to evolve in a compatible or "versionless" way.
  - **Faster**: Funcs may be refactored internally.
- tr
  - Parallelism?
  - Procs can be parallel with:
    - shell constructs: pipelines, `&` aka `fork`
    - external tools and the [$0 Dispatch
      Pattern](https://www.oilshell.org/blog/2021/08/xargs.html): xargs, make,
      Ninja, etc.
  - Funcs are inherently **serial**, unless wrapped in a proc.
- tr
  - More `proc` Features ...
    <cell-attrs colspan=3 style="text-align: center; padding: 3em" />
- tr
  - Kinds of Signature
  - Open `proc p {` or <br/>
    Closed `proc p () {`
  - <!-- dash --> -
- tr
  - Lazy Args
  - ```
    assert [42 === x]
    ```
  - <!-- dash --> -

</table>

### Func Calls and Defs

Now that we've compared procs and funcs, let's look more closely at funcs.
They're inherently **simpler**: they have 2 types of args and params, rather
than 4.

YSH argument binding is based on Julia, which has all the power of Python, but
without the "evolved warts" (e.g. `/` and `*`).

In general, with all the bells and whistles, func definitions look like:

    # pos args and named args separated with ;
    func f(p1, p2, ...rest_pos; n1=42, n2='foo', ...rest_named) {
      return (len(rest_pos) + len(rest_named))
    }

Func calls look like:

    # spread operator ... at call site
    var pos_args = [3, 4]
    var named_args = {foo: 'bar'}
    var x = f(1, 2, ...pos_args; n1=43, ...named_args)

Note that positional args/params and named args/params can be thought of as two
"separate worlds".

This table shows simpler, more common cases.

<table>
  <thead>
    <tr>
      <td>Args / Params</td>
      <td>Call Site</td>
      <td>Definition</td>
    </tr>
  </thead>

  <tr>
    <td>Positional Args</td>
<td>

    var x = myMax(3, 4)

</td>
<td>

    func myMax(x, y) {
      return (x if x > y else y)
    }

</td>
  </tr>

  <tr>
    <td>Spread Pos Args</td>
<td>

    var args = [3, 4]
    var x = myMax(...args)

</td>
<td>

(as above)

</td>
  </tr>

  <tr>
    <td>Rest Pos Params</td>
<td>

    var x = myPrintf("%s is %d", 'bob', 30)

</td>
<td>

    func myPrintf(fmt, ...args) {
      # ...
    }

</td>
  </tr>

  <tr>
    <td colspan=3 style="text-align: center; padding: 3em">...</td>
  </tr>

  <tr>
    <td>Named Args</td>
<td>

    var x = mySum(3, 4, start=5)

</td>
<td>

    func mySum(x, y; start=0) {
      return (x + y + start)
    }

</td>
  </tr>

  <tr>
    <td>Spread Named Args</td>
<td>

    var opts = {start: 5}
    var x = mySum(3, 4; ...opts)

</td>
<td>

(as above)

</td>
  </tr>

  <tr>
    <td>Rest Named Params</td>
<td>

    var x = f(start=5, end=7)

</td>
<td>

    func f(; ...opts) {
      if ('start' not in opts) {
        setvar opts.start = 0
      }
      # ...
    }

</td>
  </tr>

</table>

### Proc Calls and Defs

Like funcs, procs have 2 kinds of typed args/params: positional and named.

But they may also have **string aka word** args/params, and a **block**
arg/param.

In general, a proc signature has 4 sections, like this:

    proc p (
        w1, w2, ...rest_word;     # word params
        p1, p2, ...rest_pos;      # pos params
        n1, n2, ...rest_named;    # named params
        block                     # block param
    ) {
      echo 'body'
    }

In general, a proc call looks like this:

    var pos_args = [3, 4]
    var named_args = {foo: 'bar'}

    p /bin /tmp (1, 2, ...pos_args; n1=43, ...named_args) {
      echo 'block'
    }

The block can also be passed as an expression after a second semicolon:

    p /bin /tmp (1, 2, ...pos_args; n1=43, ...named_args; block)

<!--
- Block is really last positional arg: `cd /tmp { echo $PWD }`
-->

Some simpler examples:

<table>
  <thead>
    <tr>
      <td>Args / Params</td>
      <td>Call Site</td>
      <td>Definition</td>
    </tr>
  </thead>

  <tr>
    <td>Word args</td>
<td>

    my-cd /tmp

</td>
<td>

    proc my-cd (dest) {
      cd $dest
    }

</td>
  </tr>

  <tr>
    <td>Rest Word Params</td>
<td>

    my-cd -L /tmp

</td>
<td>

    proc my-cd (...flags) {
      cd @flags
    }

</td>
  </tr>

  <tr>
    <td>Spread Word Args</td>
<td>

    var flags = :| -L /tmp |
    my-cd @flags

</td>
<td>

(as above)

</td>
  </tr>

  <tr>
    <td colspan=3 style="text-align: center; padding: 3em">...</td>
  </tr>

  <tr>
    <td>Typed Pos Arg</td>
<td>

    print-max (3, 4)

</td>
<td>

    proc print-max ( ; x, y) {
      echo $[x if x > y else y]
    }

</td>
  </tr>

  <tr>
    <td>Typed Named Arg</td>
<td>

    print-max (3, 4, start=5)

</td>
<td>

    proc print-max ( ; x, y; start=0) {
      # ...
    }

</td>
  </tr>

  <tr>
    <td colspan=3 style="text-align: center; padding: 3em">...</td>
  </tr>

  <tr>
    <td>Block Argument</td>
<td>

    my-cd /tmp {
      echo $PWD
      echo hi
    }

</td>
<td>

    proc my-cd (dest; ; ; block) {
      cd $dest (; ; block)
    }

</td>
  </tr>

  <tr>
    <td>All Four Kinds</td>
<td>

    p 'word' (42, verbose=true) {
      echo $PWD
      echo hi
    }

</td>
<td>

    proc p (w; myint; verbose=false; block) {
      = w
      = myint
      = verbose
      = block
    }

</td>
  </tr>

</table>

## Common Features

Let's recap the common features of procs and funcs.

### Spread Args, Rest Params

- Spread arg list `...` at call site
- Rest params `...` at definition

543
544The `error` builtin is idiomatic in both funcs and procs:
545
546 func f(x) {
547 if (x <= 0) {
548 error 'Should be positive' (status=99)
549 }
550 }
551
552Tip: reserve such errors for **exceptional** situations. For example, an input
553string being invalid may not be uncommon, while a disk full I/O error is more
554exceptional.
555
556(The `error` builtin is implemented with C++ exceptions, which are slow in the
557error case.)
558
### Out Params: `&myvar` is of type `value.Place`

Out params are more common in procs, because they don't have a typed return
value.

    proc p ( ; out) {
      call out->setValue(42)
    }
    var x
    p (&x)
    echo "x set to $x"  # => x set to 42

But they can also be used in funcs:

    func f (out) {
      call out->setValue(42)
    }
    var x
    call f(&x)
    echo "x set to $x"  # => x set to 42

Observation: procs can do everything funcs can. But you may want the purity
and familiar syntax of a `func`.

---

Design note: out params are a nicer way of doing what bash does with `declare
-n` aka `nameref` variables. They don't rely on [dynamic
scope]($xref:dynamic-scope).

## Proc-Only Features

Procs have some features that funcs don't have.

### Lazy Arg Lists `where [x > 10]`

A lazy arg list is implemented with `shopt --set parse_bracket`, and is
syntactic sugar for an unevaluated `value.Expr`.

Longhand:

    var my_expr = ^[42 === x]  # value of type Expr
    assert (my_expr)

Shorthand:

    assert [42 === x]  # equivalent to the above

### Open Proc Signatures bind `argv`

TODO: Implement new `ARGV` semantics.

When a proc signature omits `()`, it's called **"open"** because the caller can
pass "extra" arguments:

    proc my-open {
      write 'args are' @ARGV
    }
    # All valid:
    my-open
    my-open 1
    my-open 1 2

Stricter closed procs:

    proc my-closed (x) {
      write 'arg is' $x
    }
    my-closed      # runtime error: missing argument
    my-closed 1    # valid
    my-closed 1 2  # runtime error: too many arguments

An "open" proc is nearly identical to a shell function:

    shfunc() {
      write 'args are' @ARGV
    }

## Methods are Funcs Bound to Objects

Values of type `Obj` have an ordered set of name-value bindings, as well as a
prototype chain of more `Obj` instances ("parents"). They support these
operators:

- dot (`.`) looks for attributes or methods with a given name.
  - Reference: [ysh-attr](ref/chap-expr-lang.html#ysh-attr)
  - Attributes may be in the object, or up the chain. They are returned
    literally.
  - Methods live up the chain. They are returned as `BoundFunc`, so that the
    first `self` argument of a method call is the object itself.
- Thin arrow (`->`) looks for mutating methods, which have an `M/` prefix.
  - Reference: [thin-arrow](ref/chap-expr-lang.html#thin-arrow)

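Here is a minimal sketch of the dot operator, assuming the
`Object(prototype, props)` constructor that this doc uses in the `__invoke__`
section. The names `greeting`, `methods`, and `person` are made up for
illustration.

```
# A method is a func whose first param is 'self'
func greeting(self) {
  return ("hello $[self.name]")
}

# Methods live up the prototype chain
var methods = Object(null, {greeting: greeting})

# Attributes live on the object itself
var person = Object(methods, {name: 'bob'})

echo $[person.name]        # attribute lookup, returned literally
echo $[person.greeting()]  # bound func: 'self' is person
```
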
## The `__invoke__` method makes an Object "Proc-like"

First, define a proc, with the first typed arg named `self`:

    proc myInvoke (word_param; self, int_param) {
      echo "sum = $[self.x + self.y + int_param]"
    }

Make it the `__invoke__` method of an `Obj`:

    var methods = Object(null, {__invoke__: myInvoke})
    var invokable_obj = Object(methods, {x: 1, y: 2})

Then invoke it like a proc:

    invokable_obj myword (3)
    # => sum = 6

## Usage Notes

### 3 Ways to Return a Value

Let's review the recommended ways to "return" a value:

1. `return (x)` in a `func`.
   - The parentheses are required because expressions like `(x + 1)` should
     look different than words.
1. Pass a `value.Place` instance to a proc or func.
   - That is, out param `&out`.
1. Print to stdout in a `proc`
   - Capture it with command sub: `$(myproc)`
   - Or with `read`: `myproc | read --all; echo $_reply`

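The three recommended ways can be sketched side by side. The names `double`,
`set-answer`, and `print-answer` are hypothetical, chosen only for
illustration.

```
# 1. Typed return value from a func
func double(x) {
  return (x * 2)
}
var a = double(21)

# 2. Out param: the proc writes to a value.Place
proc set-answer (; out) {
  call out->setValue(42)
}
var b
set-answer (&b)

# 3. Print to stdout, and capture it with command sub
proc print-answer {
  echo 42
}
var c = $(print-answer)

echo "$a $b $c"
```

Note that `a` and `b` hold integers, while `c` holds the string `'42'`,
because stdout is untyped.
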
Obsolete ways of "returning":

1. Using `declare -n` aka `nameref` variables in bash.
1. Relying on [dynamic scope]($xref:dynamic-scope) in POSIX shell.

### Procs Compose in Pipelines / "Bernstein Chaining"

Some YSH users may tend toward funcs because they're more familiar. But shell
composition with procs is very powerful!

They have at least two kinds of composition that funcs don't have.

See #[shell-the-good-parts]($blog-tag):

1. [Shell Has a Forth-Like
   Quality](https://www.oilshell.org/blog/2017/01/13.html) - Bernstein
   chaining.
1. [Pipelines Support Vectorized, Point-Free, and Imperative
   Style](https://www.oilshell.org/blog/2017/01/15.html) - the shell can
   transparently run procs as elements of pipelines.

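A small sketch of the second kind of composition follows; the proc names
`list-names` and `shout` are hypothetical. Because procs read stdin and write
stdout, they can be mixed with external tools like `tr` and `sort` in a single
pipeline.

```
proc list-names {
  echo bob
  echo alice
}

# Reads stdin and writes stdout, like any Unix filter
proc shout {
  tr a-z A-Z
}

# Two procs and one external tool in the same pipeline
list-names | shout | sort  # prints ALICE, then BOB
```
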
<!--

In summary:

* func signatures look like JavaScript, Julia, and Go.
  * named and positional are separated with `;` in the signature.
  * The prefix `...` "spread" operator takes the place of Python's `*args` and `**kwargs`.
  * There are optional type annotations
* procs are like shell functions
  * but they also allow you to name parameters, and throw errors if the arity
    is wrong.
  * and they take blocks.

-->

## Summary

YSH is influenced by both shell and Python, so it has both procs and funcs.

Many programmers will gravitate towards funcs because they're familiar, but
procs are more powerful and shell-like.

Make your YSH programs more powerful by learning to use procs!

## Appendix

### Implementation Details

Procs and funcs both have these concerns:

1. Evaluation of default args at definition time.
1. Evaluation of actual args at the call site.
1. Arg-Param binding for builtin functions, e.g. with `typed_args.Reader`.
1. Arg-Param binding for user-defined functions.

So the implementation can be thought of as a **2 &times; 4 matrix**, with some
code shared. This code is mostly in [ysh/func_proc.py]($oils-src).

### Related

- [Variable Declaration, Mutation, and Scope](variables.html) - in particular,
  procs don't have [dynamic scope]($xref:dynamic-scope).
- [Block Literals](block-literals.html) (in progress)

<!--
TODO: any reference topics?
-->

<!--
OK we're getting close here -- #**language-design>Unifying Proc and Func Params**

I think we need to write a quick guide first, not a reference

It might have some **tables**

It might mention concrete use cases like the **flag parser** -- #**oil-dev>Progress on argparse**

### Diff-based explanation

- why not Python -- because of `/` and `*` special cases
- Julia influence
- lazy args for procs `where` filters and `awk`
- out Ref parameters are for "returning" without printing to stdout

#**language-design>N ways to "return" a value**

- What does shell have?
  - it has blocks, e.g. with redirects
  - it has functions without params -- only named params

- Ruby influence -- rich DSLs

So I think you can say we're a mix of

- shell
- Python
- Julia (mostly subsumes Python?)
- Ruby

### Implementation-based explanation

- ASDL schemas -- #**oil-dev>Good Proc/Func refactoring**

### Big Idea: procs are for I/O, funcs are for computation

We may want to go full in on this idea with #**language-design>func evaluator without redirects and $?**

### Very Basic Advice, Up Front

Done with #**language-design>value.Place, & operator, read builtin**

Place works with both func and proc

### Bump

I think this might go in the backlog - #**blog-ideas**

#**language-design>Simplify proc param passing?**

-->

<!-- vim sw=2 -->