YSH Input/Output

This doc describes how YSH improves upon I/O in shell.

Table of Contents
Summary
Details on Problems with Shell
Shell Pitfall: the Exit Code of read
Tested Invariants
Set Up Test Data
File -> String -> File
File -> Array of Lines -> File (fast)
File -> Array of Lines -> File (slow)
NUL File -> Array of Lines -> NUL File (fast)
NUL File -> Array of Lines -> NUL File (slow)
J8 File -> Array of Lines -> J8 File
Array -> File of J8 Lines -> Array
Reference
Three Types of I/O
Related Docs
Help Topics

Summary

These YSH constructs make string processing more orthogonal to I/O:

Details on Problems with Shell

Examples:

hostname | read --all (&x)
write -- $x
echo $x

Shell Pitfall: the Exit Code of read

Suppose you have lines without a trailing \n:

$ printf 'a\nb'
a
b  # no trailing newline

Then this loop doesn't print the last line, because read returns a non-zero exit status when it reaches the end of input without seeing the newline delimiter.

$ printf 'a\nb' | while read -r; do echo $REPLY; done
a

In contrast, a loop with YSH read --raw-line prints all lines:

$ printf 'a\nb' | while read --raw-line { echo $_reply  }
a
b
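
For comparison, the usual workaround in plain POSIX shell is to also check whether read left a partial line in the variable when it returned non-zero. This is a minimal sketch, not taken from the YSH docs; the helper name read_all_lines is made up for illustration.

```sh
# Hypothetical helper (the name is ours): print every line of stdin,
# including a final line that has no trailing newline.
read_all_lines() {
  # read returns non-zero at end of input, but it still stores the partial
  # last line in $line, so the extra test runs the body one more time.
  while IFS= read -r line || [ -n "$line" ]; do
    printf '%s\n' "$line"
  done
}

printf 'a\nb' | read_all_lines
```

The extra [ -n "$line" ] test is easy to forget, which is the kind of pitfall read --raw-line is meant to avoid.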

Tested Invariants

These examples show that YSH I/O is orthogonal and composable. You can round trip data between YSH data structures and the OS.

Set Up Test Data

First, let's create files with funny names:

mkdir -p mydir
touch   'mydir/file with spaces'
touch  b'mydir/newline \n file'

And let's list these files in 3 different formats:

# Line-based: one file spans multiple lines
find . > lines.txt

# NUL-terminated
find . -print0 > 0.bin

# J8 lines
redir >j8-lines.txt {
  for path in mydir/* {
    write -- $[toJson8(path)]
  }
}

head lines.txt j8-lines.txt
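
To make the ambiguity of the line-based format concrete, the following POSIX shell sketch recreates the two files in a scratch directory and compares the number of lines find prints with the number of files it visits. It is an illustration under our own assumptions (a temporary directory from mktemp, variable names line_count and file_count), not part of the tested examples.

```sh
# Work in a scratch directory so nothing else is touched.
tmp=$(mktemp -d) && cd "$tmp" || exit 1

mkdir -p mydir
touch 'mydir/file with spaces'
# POSIX sh has no b'' strings, so build the name with printf instead.
touch "$(printf 'mydir/newline \n file')"

# Counting newline-terminated records over-counts the files ...
line_count=$(find mydir -type f | wc -l | tr -d ' ')
# ... while emitting one fixed line per file does not.
file_count=$(find mydir -type f -exec echo x \; | wc -l | tr -d ' ')

echo "lines: $line_count, files: $file_count"   # lines: 3, files: 2
```

A consumer that splits lines.txt on newlines therefore sees three "paths" for two files, which is why the NUL-terminated and J8 formats are listed alongside it.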

Now let's test the invariants.

File -> String -> File

Start with a file, slurp it into a string, and write it back to an equivalent file.

cat lines.txt | read --all

= _reply  # (Str)

# suppress trailing newline
write --end '' -- $_reply > out.txt

# files are equal
diff lines.txt out.txt

File -> Array of Lines -> File (fast)

Start with a file, read it into an array of lines, and write it back to an equivalent file.

# newlines removed on reading
var lines = []
cat lines.txt | for line in (io.stdin) {
  call lines->append(line)
}

= lines  # (List)

# newlines added
write -- @lines > out.txt

# files are equal, even though one path is split across lines
diff lines.txt out.txt

File -> Array of Lines -> File (slow)

This idiom can be slow, since read --raw-line reads one byte at a time:

# newlines removed on reading
var paths = []
cat lines.txt | while read --raw-line (&path) {
  call paths->append(path)
}

= paths   # (List)

# newlines added
write -- @paths > out.txt

# files are equal, even though one path is split across lines
diff lines.txt out.txt

NUL File -> Array of Lines -> NUL File (fast)

Start with a file, slurp it into a string, split it into an array, and write it back to an equivalent file.

read --all < 0.bin
var paths = _reply.split( \y00 )  # split by NUL

# last \y00 is terminator, not separator
# TODO: could improve this
call paths->pop()

= paths

# Use NUL separator and terminator
write --sep b'\y00' --end b'\y00' -- @paths > out0.bin

diff 0.bin out0.bin

NUL File -> Array of Lines -> NUL File (slow)

This idiom can be slow, since read -0 reads one byte at a time:

var paths = []
cat 0.bin | while read -0 path {
  call paths->append(path)
}

= paths

# Use NUL separator and terminator
write --sep b'\y00' --end b'\y00' -- @paths > out0.bin

diff 0.bin out0.bin

J8 File -> Array of Lines -> J8 File

Start with a file, slurp it into an array of lines, and write it back to an equivalent file.

var paths = @(cat j8-lines.txt)

= paths

redir >j8-out.txt {
  for path in (paths) {
    write -- $[toJson8(path)]
  }
}

diff j8-lines.txt j8-out.txt

Array -> File of J8 Lines -> Array

Start with an array, write it to a file, and slurp it back into an array.

var strs = :| 'with space' b'with \n newline' |
redir >j8-tmp.txt {
  for s in (strs) {
    write -- $[toJson8(s)]
  }
}

cat j8-tmp.txt

# round-tripped
assert [strs === @(cat j8-tmp.txt)]

Reference

Three Types of I/O

This table characterizes the performance of different ways to read input:

Each performance category is followed by the shell constructs that fall into it:

Buffered, and therefore fast

  • YSH for loop over io.stdin: for line in (io.stdin)

Unbuffered and fast (large chunks)

  • ysh-read: read --all and --num-bytes
  • Shell $(command sub)
  • YSH @(command splice)

Unbuffered and slow (one byte at a time)

  • The POSIX shell read builtin: either without flags, or with short flags like -r -d
  • The bash mapfile builtin
  • ysh-read:
    • YSH read --raw-line (replaces the idiom IFS= read -r)
    • YSH read -0 (replaces the idiom read -r -d '')
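
One way to see why the last category works a byte at a time (this is our reading, not something the table states): the read builtin is expected to consume input only up to the delimiter, so that whatever shares the same stdin next still sees the rest. On a pipe it cannot seek back after reading too much. The POSIX shell sketch below shows the observable behavior; the variable names are ours.

```sh
# read takes exactly one line from the pipe, so cat still sees the rest.
out=$(printf 'first\nsecond\nthird\n' | {
  IFS= read -r line
  echo "read got: $line"
  echo "left for cat:"
  cat
})
printf '%s\n' "$out"
```

If read pulled a large buffered chunk from the pipe instead, cat would find the pipe already drained. The buffered constructs avoid this cost by owning the whole input stream.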

Related Docs

Help Topics

Generated on Sun, 23 Nov 2025 01:57:38 +0000