---
default_highlighter: oils-sh
---

Guide to Procs and Funcs
========================

YSH has two major units of code: shell-like `proc`, and Python-like `func`.

- Roughly speaking, procs are for commands and **I/O**, while funcs are for
  pure **computation**.
- Procs are often **big**, and may call **small** funcs. On the other hand,
  it's possible, but rarer, for funcs to call procs.
- You can write shell scripts **mostly** with procs, and perhaps a few funcs.

This doc compares the two mechanisms, and gives rough guidelines.

<!--
See the blog for more conceptual background: [Oils is
Exterior-First](https://www.oilshell.org/blog/2023/06/ysh-design.html).
-->

<div id="toc">
</div>

## Tip: Start Simple

Before going into detail, here's a quick reminder that you don't have to use
**either** procs or funcs. YSH is a language that scales both down and up.

You can start with just a list of plain commands:

    mkdir -p /tmp/dest
    cp --verbose *.txt /tmp/dest

Then copy those into procs as the script gets bigger:

    proc build-app {
      ninja --verbose
    }

    proc deploy {
      mkdir -p /tmp/dest
      cp --verbose *.txt /tmp/dest
    }

    build-app
    deploy

Then add funcs if you need pure computation:

    func isTestFile(name) {
      return (name => endsWith('_test.py'))
    }

    if (isTestFile('my_test.py')) {
      echo 'yes'
    }

## At a Glance

### Procs vs. Funcs

This table summarizes the difference between procs and funcs. The rest of the
doc will elaborate on these issues.

<style>
  thead {
    background-color: #eee;
    font-weight: bold;
  }
  table {
    font-family: sans-serif;
    border-collapse: collapse;
  }

  tr {
    border-bottom: solid 1px;
    border-color: #ddd;
  }

  td {
    padding: 8px;  /* override default of 5px */
  }
</style>


<table>

- thead
  - <!-- empty -->
  - Proc
  - Func
- tr
  - Design Influence
  - Shell-like.
  - Python- and JavaScript-like, but **pure**.
- tr
  - Shape
  - Procs are shaped like Unix processes: with `argv`, an integer return code,
    and `stdin` / `stdout` streams.

    They're a generalization of Bourne shell "functions".
  - Funcs are shaped like mathematical functions.
- tr
  - Architectural Role ([Oils is Exterior First](https://www.oilshell.org/blog/2023/06/ysh-design.html))
  - **Exterior**: processes and files.
  - **Interior**: functions and garbage-collected data structures.
- tr
  - I/O
  - Procs may start external processes and pipelines. Can perform I/O
    anywhere.
  - Funcs need an explicit `io` param to perform I/O.
- tr
  - Example Definition
  - ```
    proc print-max (; x, y) {
      echo $[x if x > y else y]
    }
    ```
  - ```
    func computeMax(x, y) {
      return (x if x > y else y)
    }
    ```
- tr
  - Example Call
  - ```
    print-max (3, 4)
    ```

    Procs can be put in pipelines:

    ```
    print-max (3, 4) | tee out.txt
    ```
  - ```
    var m = computeMax(3, 4)
    ```

    Or throw away the return value, which is useful for functions that mutate:

    ```
    call computeMax(3, 4)
    ```
- tr
  - Naming Convention
  - `kebab-case`
  - `camelCase`
- tr
  - [Syntax Mode](command-vs-expression-mode.html) of call site
  - Command Mode
  - Expression Mode
- tr
  - Kinds of Parameters / Arguments
  - <!-- empty -->
    1. Word aka string
    1. Typed and Positional
    1. Typed and Named
    1. Block

    Examples shown below.
  - <!-- empty -->
    1. Positional
    1. Named

    (both typed)
- tr
  - Return Value
  - Integer status 0-255
  - Any type of value, e.g.

    ```
    return ([42, {name: 'bob'}])
    ```
- tr
  - Can it be a method on an object?
  - No
  - Yes, funcs may be bound to objects:

    ```
    var x = obj.myMethod()
    call obj->myMutatingMethod()
    ```
- tr
  - Interface Evolution
  - **Slower**: Procs exposed to the outside world may need to evolve in a compatible or "versionless" way.
  - **Faster**: Funcs may be refactored internally.
- tr
  - Parallelism?
  - Procs can be parallel with:
    - shell constructs: pipelines, `&` aka `fork`
    - external tools and the [$0 Dispatch
      Pattern](https://www.oilshell.org/blog/2021/08/xargs.html): xargs, make,
      Ninja, etc.
  - Funcs are inherently **serial**, unless wrapped in a proc.
- tr
  - More `proc` Features ...
    <cell-attrs colspan=3 style="text-align: center; padding: 3em" />
- tr
  - Kinds of Signature
  - Open `proc p {` or <br/>
    Closed `proc p () {`
  - <!-- dash --> -
- tr
  - Lazy Args
  - ```
    assert [42 === x]
    ```
  - <!-- dash --> -

</table>

### Func Calls and Defs

Now that we've compared procs and funcs, let's look more closely at funcs.
They're inherently **simpler**: they have 2 types of args and params, rather
than 4.

YSH argument binding is based on Julia, which has all the power of Python, but
without the "evolved warts" (e.g. `/` and `*`).

In general, with all the bells and whistles, func definitions look like:

    # pos args and named args separated with ;
    func f(p1, p2, ...rest_pos; n1=42, n2='foo', ...rest_named) {
      return (len(rest_pos) + len(rest_named))
    }

Func calls look like:

    # spread operator ... at call site
    var pos_args = [3, 4]
    var named_args = {foo: 'bar'}
    var x = f(1, 2, ...pos_args; n1=43, ...named_args)

Note that positional args/params and named args/params can be thought of as two
"separate worlds".

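To make the "separate worlds" idea concrete, here is a sketch that puts the general definition and call together and traces the binding by hand. The result in the comment is derived from the binding rules described in this section, so treat it as illustrative rather than as captured output.

```
func f(p1, p2, ...rest_pos; n1=42, n2='foo', ...rest_named) {
  return (len(rest_pos) + len(rest_named))
}

var pos_args = [3, 4]
var named_args = {foo: 'bar'}

# Positional world: p1=1, p2=2, rest_pos=[3, 4]
# Named world:      n1=43, n2='foo' (default), rest_named={foo: 'bar'}
var x = f(1, 2, ...pos_args; n1=43, ...named_args)

echo $x  # => 3, i.e. 2 leftover positional args + 1 leftover named arg
```

Nothing before the `;` can end up in `rest_named`, and nothing after it can end up in `rest_pos`.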
This table shows simpler, more common cases.


<table>
  <thead>
  <tr>
    <td>Args / Params</td>
    <td>Call Site</td>
    <td>Definition</td>
  </tr>
  </thead>

  <tr>
    <td>Positional Args</td>
<td>

    var x = myMax(3, 4)

</td>
<td>

    func myMax(x, y) {
      return (x if x > y else y)
    }

</td>
  </tr>

  <tr>
    <td>Spread Pos Args</td>
<td>

    var args = [3, 4]
    var x = myMax(...args)

</td>
<td>

(as above)

</td>
  </tr>

  <tr>
    <td>Rest Pos Params</td>
<td>

    var x = myPrintf("%s is %d", 'bob', 30)

</td>
<td>

    func myPrintf(fmt, ...args) {
      # ...
    }

</td>
  </tr>

  <tr>
    <td colspan=3 style="text-align: center; padding: 3em">...</td>
  </tr>

  <tr>
    <td>Named Args</td>
<td>

    var x = mySum(3, 4, start=5)

</td>
<td>

    func mySum(x, y; start=0) {
      return (x + y + start)
    }

</td>
  </tr>

  <tr>
    <td>Spread Named Args</td>
<td>

    var opts = {start: 5}
    var x = mySum(3, 4; ...opts)

</td>
<td>

(as above)

</td>
  </tr>

  <tr>
    <td>Rest Named Params</td>
<td>

    var x = f(start=5, end=7)

</td>
<td>

    func f(; ...opts) {
      if ('start' not in opts) {
        setvar opts.start = 0
      }
      # ...
    }

</td>
  </tr>

</table>

### Proc Calls and Defs

Like funcs, procs have 2 kinds of typed args/params: positional and named.

But they may also have **string aka word** args/params, and a **block**
arg/param.

In general, a proc signature has 4 sections, like this:

    proc p (
        w1, w2, ...rest_word;     # word params
        p1, p2, ...rest_pos;      # pos params
        n1, n2, ...rest_named;    # named params
        block                     # block param
    ) {
      echo 'body'
    }

In general, a proc call looks like this:

    var pos_args = [3, 4]
    var named_args = {foo: 'bar'}

    p /bin /tmp (1, 2, ...pos_args; n1=43, ...named_args) {
      echo 'block'
    }

The block can also be passed as an expression after a second semicolon:

    p /bin /tmp (1, 2, ...pos_args; n1=43, ...named_args; block)

<!--
- Block is really last positional arg: `cd /tmp { echo $PWD }`
-->

Some simpler examples:

<table>
  <thead>
  <tr>
    <td>Args / Params</td>
    <td>Call Site</td>
    <td>Definition</td>
  </tr>
  </thead>

  <tr>
    <td>Word args</td>
<td>

    my-cd /tmp

</td>
<td>

    proc my-cd (dest) {
      cd $dest
    }

</td>
  </tr>

  <tr>
    <td>Rest Word Params</td>
<td>

    my-cd -L /tmp

</td>
<td>

    proc my-cd (...flags) {
      cd @flags
    }

</td>
  </tr>

  <tr>
    <td>Spread Word Args</td>
<td>

    var flags = :| -L /tmp |
    my-cd @flags

</td>
<td>

(as above)

</td>
  </tr>

  <tr>
    <td colspan=3 style="text-align: center; padding: 3em">...</td>
  </tr>

  <tr>
    <td>Typed Pos Arg</td>
<td>

    print-max (3, 4)

</td>
<td>

    proc print-max ( ; x, y) {
      echo $[x if x > y else y]
    }

</td>
  </tr>

  <tr>
    <td>Typed Named Arg</td>
<td>

    print-max (3, 4, start=5)

</td>
<td>

    proc print-max ( ; x, y; start=0) {
      # ...
    }

</td>
  </tr>

  <tr>
    <td colspan=3 style="text-align: center; padding: 3em">...</td>
  </tr>



  <tr>
    <td>Block Argument</td>
<td>

    my-cd /tmp {
      echo $PWD
      echo hi
    }

</td>
<td>

    proc my-cd (dest; ; ; block) {
      cd $dest (; ; block)
    }

</td>
  </tr>

  <tr>
    <td>All Four Kinds</td>
<td>

    p 'word' (42, verbose=true) {
      echo $PWD
      echo hi
    }

</td>
<td>

    proc p (w; myint; verbose=false; block) {
      = w
      = myint
      = verbose
      = block
    }

</td>
  </tr>

</table>

## Common Features

Let's recap the common features of procs and funcs.

### Spread Args, Rest Params

- Spread arg list `...` at call site
- Rest params `...` at definition

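As a small sketch of both halves, the snippet below collects typed args with a rest param in a func, then does the same for word args in a proc. The names `mySum` (redefined here as a variadic variant) and `show-words` are made up for illustration. Note that word args are spliced with `@`, as in the `my-cd @flags` example above, while typed args use `...`.

```
# Rest param: extra typed args are collected into a List
func mySum(...nums) {
  var total = 0
  for n in (nums) {
    setvar total += n
  }
  return (total)
}

# Spread: a List is expanded into positional args at the call site
var args = [3, 4, 5]
echo $[mySum(...args)]  # => 12

# Rest word params collect the remaining word args
proc show-words (...words) {
  write -- @words
}

var flags = :| -L /tmp |
show-words @flags  # prints -L and /tmp on separate lines
```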
### The `error` builtin raises exceptions

The `error` builtin is idiomatic in both funcs and procs:

    func f(x) {
      if (x <= 0) {
        error 'Should be positive' (code=99)
      }
    }

Tip: reserve such errors for **exceptional** situations. For example, an
invalid input string is often an expected case, while a disk-full I/O error is
more exceptional.

(The `error` builtin is implemented with C++ exceptions, which are slow in the
error case.)

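For the exceptional cases, a caller can catch the error. The sketch below assumes the `try` builtin, the `failed` builtin, and the `_error` register (with a `message` field) behave as described in the YSH error handling docs; none of these are shown elsewhere in this guide, and `checkPositive` is a hypothetical name.

```
func checkPositive(x) {
  if (x <= 0) {
    error 'Should be positive'
  }
  return (x)
}

# try runs the block and records any failure in the _error register
try {
  call checkPositive(-1)
}
if failed {
  echo "caught: $[_error.message]"
}
```

Without the surrounding `try`, the same call would abort the script with a non-zero status.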
### Out Params: `&myvar` is of type `value.Place`

Out params are more common in procs, because procs don't have a typed return
value.

    proc p ( ; out) {
      call out->setValue(42)
    }
    var x
    p (&x)
    echo "x set to $x"  # => x set to 42

But they can also be used in funcs:

    func f (out) {
      call out->setValue(42)
    }
    var x
    call f(&x)
    echo "x set to $x"  # => x set to 42

Observation: procs can do everything funcs can. But you may want the purity
and familiar syntax of a `func`.

---

Design note: out params are a nicer way of doing what bash does with `declare
-n` aka `nameref` variables. They don't rely on [dynamic
scope]($xref:dynamic-scope).

## Proc-Only Features

Procs have some features that funcs don't have.

### Lazy Arg Lists `where [x > 10]`

A lazy arg list is implemented with `shopt --set parse_bracket`, and is
syntactic sugar for an unevaluated `value.Expr`.

Longhand:

    var my_expr = ^[42 === x]  # value of type Expr
    assert (my_expr)

Shorthand:

    assert [42 === x]  # equivalent to the above

### Open Proc Signatures bind `argv`

TODO: Implement new `ARGV` semantics.

When a proc signature omits `()`, it's called **"open"** because the caller can
pass "extra" arguments:

    proc my-open {
      write 'args are' @ARGV
    }
    # All valid:
    my-open
    my-open 1
    my-open 1 2

Stricter closed procs:

    proc my-closed (x) {
      write 'arg is' $x
    }
    my-closed      # runtime error: missing argument
    my-closed 1    # valid
    my-closed 1 2  # runtime error: too many arguments


An "open" proc is nearly identical to a shell function:

    shfunc() {
      write 'args are' @ARGV
    }

## Methods are Funcs Bound to Objects

Values of type `Obj` have an ordered set of name-value bindings, as well as a
prototype chain of more `Obj` instances ("parents"). They support these
operators:

- dot (`.`) looks for attributes or methods with a given name.
  - Reference: [ysh-attr](ref/chap-expr-lang.html#ysh-attr)
  - Attributes may be in the object, or up the chain. They are returned
    literally.
  - Methods live up the chain. They are returned as `BoundFunc`, so that the
    first `self` argument of a method call is the object itself.
- Thin arrow (`->`) looks for mutating methods, which have an `M/` prefix.
  - Reference: [thin-arrow](ref/chap-expr-lang.html#thin-arrow)

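Here is a minimal sketch of the dot operator, assuming the `Object(parent, props)` constructor used in the `__invoke__` section of this guide, and that a func can be stored as a dict value by name. `Rect_area`, `Rect`, and `r` are hypothetical names.

```
# A method is an ordinary func whose first param is self
func Rect_area(self) {
  return (self.w * self.h)
}

# Methods live in the parent object
var Rect = Object(null, {area: Rect_area})

# Attributes live in the instance, whose parent is Rect
var r = Object(Rect, {w: 3, h: 4})

echo $[r.w]       # => 3, attribute found on the object itself
echo $[r.area()]  # => 12, method found up the chain and bound to r
```

The call site passes no arguments to `area()`; the `self` argument comes from the binding.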
## The `__invoke__` method makes an Object "Proc-like"

First, define a proc, with the first typed arg named `self`:

    proc myInvoke (word_param; self, int_param) {
      echo "sum = $[self.x + self.y + int_param]"
    }

Make it the `__invoke__` method of an `Obj`:

    var methods = Object(null, {__invoke__: myInvoke})
    var invokable_obj = Object(methods, {x: 1, y: 2})

Then invoke it like a proc:

    invokable_obj myword (3)
    # => sum = 6

## Usage Notes

### 3 Ways to Return a Value

Let's review the recommended ways to "return" a value:

1. `return (x)` in a `func`.
   - The parentheses are required because expressions like `(x + 1)` should
     look different than words.
1. Pass a `value.Place` instance to a proc or func.
   - That is, out param `&out`.
1. Print to stdout in a `proc`.
   - Capture it with command sub: `$(myproc)`
   - Or with `read`: `myproc | read --all; echo $_reply`

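The three recommended styles can be compared side by side. This is a sketch with made-up names (`double`, `double-out`, `double-print`), built only from constructs shown earlier in this guide.

```
# 1. Return a typed value from a func
func double(x) {
  return (x * 2)
}
var a = double(21)

# 2. Write through an out param, which is a value.Place
proc double-out (; x, out) {
  call out->setValue(x * 2)
}
var b
double-out (21, &b)

# 3. Print to stdout, and capture it with command sub
proc double-print (; x) {
  echo $[x * 2]
}
var c = $(double-print (21))

echo "$a $b $c"  # => 42 42 42
```

Styles 1 and 2 preserve the type, so `a` and `b` are integers. Style 3 goes through a byte stream, so `c` is the string `'42'`.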
Obsolete ways of "returning":

1. Using `declare -n` aka `nameref` variables in bash.
1. Relying on [dynamic scope]($xref:dynamic-scope) in POSIX shell.

### Procs Compose in Pipelines / "Bernstein Chaining"

Some YSH users may tend toward funcs because they're more familiar. But shell
composition with procs is very powerful!

Procs have at least two kinds of composition that funcs don't have.

See #[shell-the-good-parts]($blog-tag):

1. [Shell Has a Forth-Like
   Quality](https://www.oilshell.org/blog/2017/01/13.html) - Bernstein
   chaining.
1. [Pipelines Support Vectorized, Point-Free, and Imperative
   Style](https://www.oilshell.org/blog/2017/01/15.html) - the shell can
   transparently run procs as elements of pipelines.

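As a small sketch of the second kind of composition, the proc below is used as a pipeline element, just like an external command. The name `only-tests` and the file names are made up for illustration.

```
# An open proc with no params: it reads stdin and writes stdout
proc only-tests {
  grep -F '_test.py'
}

# The shell runs the proc as one element of the pipeline
write -- foo.py foo_test.py | only-tests  # => foo_test.py
```

A func can't be dropped into a pipeline this way, since it has no `stdin` or `stdout`; it would have to be wrapped in a proc first.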
<!--

In summary:

* func signatures look like JavaScript, Julia, and Go.
  * named and positional are separated with `;` in the signature.
  * The prefix `...` "spread" operator takes the place of Python's `*args` and `**kwargs`.
  * There are optional type annotations
* procs are like shell functions
  * but they also allow you to name parameters, and throw errors if the arity
is wrong.
  * and they take blocks.

-->

## Summary

YSH is influenced by both shell and Python, so it has both procs and funcs.

Many programmers will gravitate towards funcs because they're familiar, but
procs are more powerful and shell-like.

Make your YSH programs more powerful by learning to use procs!

## Appendix

### Implementation Details

Procs and funcs both have these concerns:

1. Evaluation of default args at definition time.
1. Evaluation of actual args at the call site.
1. Arg-Param binding for builtin functions, e.g. with `typed_args.Reader`.
1. Arg-Param binding for user-defined functions.

So the implementation can be thought of as a **2 &times; 4 matrix**, with some
code shared. This code is mostly in [ysh/func_proc.py]($oils-src).

### Related

- [Variable Declaration, Mutation, and Scope](variables.html) - in particular,
  procs don't have [dynamic scope]($xref:dynamic-scope).
- [Block Literals](block-literals.html) (in progress)

<!--
TODO: any reference topics?
-->

<!--
OK we're getting close here -- #**language-design>Unifying Proc and Func Params**

I think we need to write a quick guide first, not a reference


It might have some **tables**

It might mention concrete use cases like the **flag parser** -- #**oil-dev>Progress on argparse**


### Diff-based explanation

- why not Python -- because of `/` and `*` special cases
- Julia influence
- lazy args for procs `where` filters and `awk`
- out Ref parameters are for "returning" without printing to stdout

#**language-design>N ways to "return" a value**


- What does shell have?
  - it has blocks, e.g. with redirects
  - it has functions without params -- only named params


- Ruby influence -- rich DSLs


So I think you can say we're a mix of

- shell
- Python
- Julia (mostly subsumes Python?)
- Ruby


### Implemented-based explanation

- ASDL schemas -- #**oil-dev>Good Proc/Func refactoring**


### Big Idea: procs are for I/O, funcs are for computation

We may want to go full in on this idea with #**language-design>func evaluator without redirects and $?**


### Very Basic Advice, Up Front


Done with #**language-design>value.Place, & operator, read builtin**

Place works with both func and proc


### Bump

I think this might go in the backlog - #**blog-ideas**


#**language-design>Simplify proc param passing?**

-->

<!-- vim sw=2 -->