This doc describes how YSH improves upon I/O in shell.
read --all
read --num-bytes
find -print0 | xargs -0
These YSH constructs make string processing more orthogonal to I/O:
${x %.2f} as a static version of the printf builtin (TODO)
${x|html} and html"<p>$x</p>" for safe escaping (TODO)

echo $x is a bug, because $x could be -n.
read interprets \ escapes, unless -r is passed. These \ escapes create a mini-language that isn't understood by other line-based tools like grep and awk. The set of escapes isn't consistent between shells.
$() removes trailing newlines. YSH has read --all, which preserves the data exactly.
echo hi | read; echo $REPLY doesn't work in bash because the last part of a pipeline (read) runs in a child process. That is, the data is indeed read, but it's lost to the rest of the program.
In YSH, echo hi | read works because the last part of a pipeline runs in the shell process. (This is what bash calls shopt -s lastpipe, mentioned in Known Differences.)

Examples:
hostname | read --all (&x)
write -- $x
echo $x
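The shell pitfalls listed above can be reproduced in a few lines. The following is a minimal sketch in POSIX-style shell, assuming bash or dash with default settings. Other shells may differ: zsh and ksh run the last part of a pipeline in the current process, and the handling of -n by echo is implementation-defined. The variable names are illustrative only.

```shell
# Pitfall 1: echo may treat the expanded value as a flag.
x='-n'
echo_out=$(echo $x)               # bash and dash see "echo -n" and print nothing
printf_out=$(printf '%s\n' "$x")  # printf '%s\n' prints the value literally

# Pitfall 2: read without -r consumes backslashes.
cooked=$(printf '%s\n' 'a\tb' | { read line; printf '%s' "$line"; })
raw=$(printf '%s\n' 'a\tb' | { read -r line; printf '%s' "$line"; })

# Pitfall 3: the last part of a pipeline runs in a child process,
# so the assignment made by read is lost.
y='before'
echo hi | read -r y

printf 'echo: [%s] printf: [%s]\n' "$echo_out" "$printf_out"
printf 'cooked: [%s] raw: [%s]\n' "$cooked" "$raw"
printf 'y: [%s]\n' "$y"
```

In each case the shell silently changes or drops data, which is the behavior the YSH write and read builtins are meant to avoid.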
read
Suppose you have lines without a trailing \n:
$ printf 'a\nb'
a
b # no trailing newline
Then this loop doesn't print the last line, because read fails if it doesn't
see the newline delimiter.
$ printf 'a\nb' | while read -r; do echo $REPLY; done
a
In contrast, a loop with YSH read --raw-line prints all lines:
$ printf 'a\nb' | while read --raw-line { echo $_reply }
a
b
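The shell behavior described here can be checked in bash or dash: read returns a non-zero status when it reaches end-of-file before a newline, but it still assigns the partial line to the variable. A common workaround, which is not part of the original text and is shown only as a sketch, is to also test whether the variable is non-empty. The temporary file and variable names are arbitrary.

```shell
tmp=$(mktemp)
printf 'a\nb' > "$tmp"      # the last line has no trailing newline

# Plain loop: the final read fails, so 'b' is never processed.
plain=''
while read -r line; do
  plain="$plain[$line]"
done < "$tmp"

# Workaround: read fails at EOF but still sets $line to 'b'.
patched=''
while read -r line || [ -n "$line" ]; do
  patched="$patched[$line]"
done < "$tmp"

rm -f "$tmp"
printf 'plain: %s\n' "$plain"
printf 'patched: %s\n' "$patched"
```

The workaround has to be remembered at every call site, whereas the YSH loop with read --raw-line handles the last line without extra conditions.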
These examples show that YSH I/O is orthogonal and composable. You can round trip data between YSH data structures and the OS.
First, let's create files with funny names:
mkdir -p mydir
touch 'mydir/file with spaces'
touch b'mydir/newline \n file'
And let's list these files in 3 different formats:
# Line-based: one file spans multiple lines
find . > lines.txt
# NUL-terminated
find . -print0 > 0.bin
# J8 lines
redir >j8-lines.txt {
  for path in mydir/* {
    write -- $[toJson8(path)]
  }
}
head lines.txt j8-lines.txt
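To see why the line-based format is ambiguous, one can count the records in each format. This sketch uses a temporary directory as a stand-in for mydir, plus the standard find, tr, and wc tools. It assumes a find that supports -print0, as GNU and BSD find do.

```shell
dir=$(mktemp -d)
touch "$dir/file with spaces"
# A file name containing a newline, written as a literal line break
touch "$dir/newline
 file"

# Line-based: the name with a newline is counted as two lines
line_count=$(find "$dir" -type f | wc -l)

# NUL-terminated: keep only the NUL bytes and count them, one per file
nul_count=$(find "$dir" -type f -print0 | tr -cd '\000' | wc -c)

rm -rf "$dir"
echo "lines: $line_count"
echo "NULs: $nul_count"
```

Two files produce three lines but exactly two NUL terminators, so only the NUL-terminated listing can be split back into the original paths.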
Now let's test the invariants.
Start with a file, slurp it into a string, and write it back to an equivalent file.
cat lines.txt | read --all
= _reply # (Str)
# suppress trailing newline
write --end '' -- $_reply > out.txt
# files are equal
diff lines.txt out.txt
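For comparison, an exact slurp is awkward in POSIX shell because $() strips trailing newlines. The sketch below, assuming bash or dash, shows the naive version losing data and a commonly used sentinel-byte workaround. The file names and the sentinel character x are arbitrary choices, not part of the original text.

```shell
tmp=$(mktemp -d)
printf 'a\nb\n\n' > "$tmp/in.txt"   # ends with two newlines

# Naive slurp: $() drops both trailing newlines.
naive=$(cat "$tmp/in.txt")
printf '%s' "$naive" > "$tmp/naive.txt"

# Sentinel idiom: append a byte so the newlines are no longer trailing,
# then strip that byte from the variable.
exact=$(cat "$tmp/in.txt"; printf x)
exact=${exact%x}
printf '%s' "$exact" > "$tmp/exact.txt"

# cmp -s is silent and only sets the exit status.
cmp -s "$tmp/in.txt" "$tmp/naive.txt" && naive_same=yes || naive_same=no
cmp -s "$tmp/in.txt" "$tmp/exact.txt" && exact_same=yes || exact_same=no

rm -rf "$tmp"
echo "naive round trip equal: $naive_same"
echo "sentinel round trip equal: $exact_same"
```

The YSH version needs no such trick, because read --all stores the input unchanged and write --end '' adds nothing on output.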
Start with a file, read it into an array of lines, and write it back to an equivalent file.
# newlines removed on reading
var lines = []
cat lines.txt | for line in (io.stdin) {
  call lines->append(line)
}
= lines # (List)
# newlines added
write -- @lines > out.txt
# files are equal, even though one path is split across lines
diff lines.txt out.txt
This idiom can be slow, since read --raw-line reads one byte at a time:
# newlines removed on reading
var paths = []
cat lines.txt | while read --raw-line (&path) {
  call paths->append(path)
}
= paths # (List)
# newlines added
write -- @paths > out.txt
# files are equal, even though one path is split across lines
diff lines.txt out.txt
Start with a file, slurp it into a string, split it into an array, and write it back to an equivalent file.
read --all < 0.bin
var paths = _reply.split( \y00 ) # split by NUL
# last \y00 is terminator, not separator
# TODO: could improve this
call paths->pop()
= paths
# Use NUL separator and terminator
write --sep b'\y00' --end b'\y00' -- @paths > out0.bin
diff 0.bin out0.bin
This idiom can be slow, since read -0 reads one byte at a time:
var paths = []
cat 0.bin | while read -0 path {
  call paths->append(path)
}
= paths
# Use NUL separator and terminator
write --sep b'\y00' --end b'\y00' -- @paths > out0.bin
diff 0.bin out0.bin
Start with a file, slurp it into an array of lines, and write it back to an equivalent file.
var paths = @(cat j8-lines.txt)
= paths
redir >j8-out.txt {
  for path in (paths) {
    write -- $[toJson8(path)]
  }
}
diff j8-lines.txt j8-out.txt
Start with an array, write it to a file, and slurp it back into an array.
var strs = :| 'with space' b'with \n newline' |
redir >j8-tmp.txt {
  for s in (strs) {
    write -- $[toJson8(s)]
  }
}
cat j8-tmp.txt
# round-tripped
assert [strs === @(cat j8-tmp.txt)]
This table characterizes the performance of different ways to read input:
| Performance | Shell Constructs |
|---|---|
| Buffered, and therefore fast | |
| Unbuffered and fast | |
| Unbuffered and slow | read --raw-line and read -0 (one byte at a time) |