Errors (Oils Reference)

Overlong encoding. In UTF-8, each code point should be represented with the fewest possible bytes.
- Overlong encodings are the equivalent of writing the integer 42 as 042, 0042, 00042, etc. This is not allowed.
Surrogate code point. The sequence decodes to a code point in the surrogate range, which is used only for the UTF-16 encoding, not for string data.
Exceeds max code point. The sequence decodes to an integer that's larger than the maximum code point.
Bad encoding. A byte is not encoded like a UTF-8 start byte or a continuation byte.
Incomplete sequence. Too few continuation bytes appeared after the start byte.
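
The five categories above can be illustrated with a small hand-written classifier. This is a minimal sketch, not the Oils decoder: the function name `classify_utf8_error` is hypothetical, and reporting a stray continuation byte as "bad encoding" is an assumption made here for simplicity.

```python
from typing import Optional


def classify_utf8_error(data: bytes) -> Optional[str]:
    """Return the name of the first UTF-8 error in data, or None if valid."""
    i = 0
    n = len(data)
    while i < n:
        b = data[i]
        if b < 0x80:  # ASCII, a one-byte sequence
            i += 1
            continue
        # The start byte fixes the number of continuation bytes and the
        # smallest code point that legitimately needs that many bytes.
        if 0xC0 <= b <= 0xDF:
            need, cp, min_cp = 1, b & 0x1F, 0x80
        elif 0xE0 <= b <= 0xEF:
            need, cp, min_cp = 2, b & 0x0F, 0x800
        elif 0xF0 <= b <= 0xF7:
            need, cp, min_cp = 3, b & 0x07, 0x10000
        else:
            # 0xF8-0xFF, or a continuation byte where a start byte was
            # expected (the latter classification is an assumption)
            return "bad encoding"
        for k in range(1, need + 1):
            if i + k >= n or (data[i + k] & 0xC0) != 0x80:
                return "incomplete sequence"
            cp = (cp << 6) | (data[i + k] & 0x3F)
        if cp < min_cp:
            return "overlong encoding"
        if 0xD800 <= cp <= 0xDFFF:
            return "surrogate code point"
        if cp > 0x10FFFF:
            return "exceeds max code point"
        i += need + 1
    return None
```

For example, `classify_utf8_error(b"\xc0\x80")` reports an overlong encoding, because the two bytes decode to code point 0, which fits in a single byte.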

J8 String

J8 strings extend JSON strings, and are a primary building block of J8 Notation.

err-j8-str-encode

J8 strings can represent any string, whether bytes or Unicode, so there are no encoding errors.

err-j8-str-decode

Escape sequence like \u{dc00} should not be in the surrogate range.
- This means it doesn't represent a real character. Byte escapes like \yff should be used instead.
Escape sequence like \u{110000} is greater than the maximum Unicode code point.
Byte escapes like \yff should not appear in a u'' string.
- By design, they're only valid in b'' strings.
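
The three checks above could be sketched roughly as follows. This is an illustration under stated assumptions, not the Oils lexer: `check_j8_escapes` is a hypothetical helper, it receives the text between the quotes plus the `u` or `b` prefix, and it does not handle an escaped backslash in front of an escape-like sequence.

```python
import re
from typing import Optional

# Simplified patterns; a real lexer would tokenize all escapes in order.
_U_ESCAPE = re.compile(r"\\u\{([0-9a-fA-F]+)\}")
_Y_ESCAPE = re.compile(r"\\y[0-9a-fA-F]{2}")


def check_j8_escapes(body: str, prefix: str) -> Optional[str]:
    """Return a description of the first problem found, or None."""
    for m in _U_ESCAPE.finditer(body):
        cp = int(m.group(1), 16)
        if 0xD800 <= cp <= 0xDFFF:
            return "surrogate range"
        if cp > 0x10FFFF:
            return "exceeds max code point"
    # Byte escapes are only valid in b'' strings
    if prefix == "u" and _Y_ESCAPE.search(body):
        return "byte escape in u'' string"
    return None
```

Note that the sketch checks all \u{...} escapes before looking for \yff, so it does not necessarily report the leftmost error in the string.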

Implementation-defined limit:

Max string length (NYI)
- e.g. more than 4 billion bytes could overflow a length field, in some implementations

J8 Lines

Roughly speaking, J8 Lines are an encoding for a stream of J8 strings. In YSH, they're used by @(split command sub).

err-j8-lines-encode

Like J8 strings, J8 Lines have no encoding errors by design.

err-j8-lines-decode

Any error in a J8 quoted string.
- e.g. no closing quote, invalid UTF-8, invalid backslash escape, ...
A line with a quoted string has extra text after it.
- e.g. "mystr" extra.
An unquoted line is not valid UTF-8.
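
As a rough illustration of these three cases, the sketch below decodes one line at a time. It is deliberately simplified: it only understands JSON-style double-quoted strings (via Python's `json.JSONDecoder.raw_decode`), not u'' or b'' strings, and stripping surrounding whitespace is an assumption. The name `decode_line` is hypothetical.

```python
import json


def decode_line(raw: bytes) -> str:
    """Decode one line; raise ValueError on any of the errors above."""
    try:
        line = raw.decode("utf-8")
    except UnicodeDecodeError:
        raise ValueError("line is not valid UTF-8")
    stripped = line.strip()  # assumption: surrounding whitespace is ignored
    if not stripped.startswith('"'):
        return stripped  # unquoted line, taken literally
    try:
        # raw_decode returns the value and the index where parsing stopped
        value, end = json.JSONDecoder().raw_decode(stripped)
    except json.JSONDecodeError as e:
        raise ValueError("error in quoted string") from e
    if stripped[end:].strip():
        raise ValueError("extra text after quoted string")
    return value
```

With this sketch, `decode_line(b'"mystr" extra')` raises `ValueError`, while `decode_line(b'"mystr"')` returns `mystr`.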

JSON

err-json-encode

JSON encoding has these errors:

Object of this type can't be serialized.
- For example, Oils objects of type Str, List, and Dict can be serialized, but Eggex, Func, and Range can't.
Circular reference.
- e.g. a Dict that points to itself, a List that points to itself, and other permutations
Float values of NaN, Inf, and -Inf have no JSON representation.
- (Oils encodes these as null, following JavaScript.)

Note that invalid UTF-8 bytes like 0xfe produce a Unicode replacement character, not a hard error.
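
For comparison, the same categories show up in other JSON encoders. The sketch below uses Python's standard `json` module, not Oils, so the details differ: with `allow_nan=False` Python raises an error for NaN rather than emitting null. The helper name `try_encode` is hypothetical.

```python
import json


def try_encode(obj):
    """Return the JSON text, or the name of the exception raised."""
    try:
        return json.dumps(obj, allow_nan=False)
    except (TypeError, ValueError) as e:
        return type(e).__name__


# A type that can't be serialized
unserializable = try_encode(range(3))

# A circular reference: a dict that points to itself
d = {}
d["self"] = d
circular = try_encode(d)

# NaN has no JSON representation
nan_result = try_encode(float("nan"))

# Invalid UTF-8 can be replaced instead of rejected
replaced = b"\xfe".decode("utf-8", errors="replace")  # U+FFFD
```

Here `unserializable` is `"TypeError"`, and both `circular` and `nan_result` are `"ValueError"`.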

err-json-decode

The encoded message itself is not valid UTF-8.
- (Typically, you need to check the unescaped bytes in string literals like "abc\n".)
Lexical error, like
- the message +
- an invalid escape "\z" or a truncated escape "\u1"
- a single-quoted string like u''
Grammatical error
- like the message }{
Unexpected trailing input
- like the message 42] or {}]
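
The example messages above are also rejected by other strict JSON decoders. A minimal sketch with Python's standard `json` module (not the Oils decoder; `json_decode_error` is a hypothetical helper name) shows each one failing:

```python
import json


def json_decode_error(message):
    """Return the exception name if message fails to decode, else None."""
    try:
        json.loads(message)
    except (json.JSONDecodeError, UnicodeDecodeError) as e:
        return type(e).__name__
    return None


bad_messages = [
    b'"\xff"',  # not valid UTF-8
    "+",        # lexical error
    '"\\z"',    # invalid escape
    '"\\u1"',   # truncated escape
    "u''",      # single-quoted string
    "}{",       # grammatical error
    "42]",      # unexpected trailing input
    "{}]",
]
results = {repr(m): json_decode_error(m) for m in bad_messages}
```

Every entry in `results` holds an exception name, while a well-formed message such as `{"a": [1, 2]}` yields `None`. Python does not distinguish lexical from grammatical errors in its exception type; that distinction is specific to how a given decoder is structured.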

Implementation-defined limits, i.e. outside the grammar:

Integer too big
- implementations may decode to a 64-bit integer
Floats that are too big
- may decode to Inf
Max array length (NYI)
- e.g. more than 4 billion objects in an array could overflow a length field, in some implementations
Max object length (NYI)
Max depth for arrays and objects (NYI)
- to avoid a recursive parser blowing the stack
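
These limits depend on the implementation, which a short experiment can make concrete. The sketch below uses Python's `json` module purely as an example of one implementation: its integers are arbitrary-precision, so the 64-bit check is written by hand, and `fits_int64` is a hypothetical helper.

```python
import json
import math

INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1


def fits_int64(n: int) -> bool:
    """True if n is representable as a signed 64-bit integer."""
    return INT64_MIN <= n <= INT64_MAX


# A float literal beyond the double range decodes to infinity in Python
too_big_float = json.loads("1e999")

# One more than the largest signed 64-bit integer
too_big_int = json.loads("9223372036854775808")
```

Here `math.isinf(too_big_float)` is true and `fits_int64(too_big_int)` is false, so a decoder that stores integers in 64 bits would have to reject or otherwise handle the second message.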

JSON8

err-json8-encode

JSON8 has the same encoding errors as JSON.

However, the encoding is lossless by design. Instead of invalid UTF-8 being turned into a Unicode replacement character, it can use J8 strings with byte escapes like b'byte \yfe\yff'.
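
The difference between the lossy and lossless approaches can be sketched as follows. This is an illustration only, not the Oils encoder: `j8_byte_string` is a hypothetical name, the sketch always emits a b'' string, and the only other escapes it handles are backslash and single quote (an assumption; control characters are left as-is).

```python
def j8_byte_string(data: bytes) -> str:
    """Encode bytes as a b'' string, using \\yXX for invalid UTF-8 bytes."""
    # surrogateescape maps each undecodable byte 0xNN to U+DCNN
    text = data.decode("utf-8", errors="surrogateescape")
    out = []
    for ch in text:
        cp = ord(ch)
        if 0xDC80 <= cp <= 0xDCFF:
            out.append("\\y%02x" % (cp - 0xDC00))  # recover the raw byte
        elif ch == "\\":
            out.append("\\\\")
        elif ch == "'":
            out.append("\\'")
        else:
            out.append(ch)
    return "b'" + "".join(out) + "'"


data = b"byte \xfe\xff"
lossy = data.decode("utf-8", errors="replace")  # replacement characters
lossless = j8_byte_string(data)
```

The `lossy` string no longer records which bytes were present, while `lossless` keeps them as the \yfe and \yff escapes.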

err-json8-decode

JSON8 has the same decoding errors as JSON, plus J8 string decoding errors.

See err-j8-str-decode.

Generated on Mon, 04 Nov 2024 13:29:41 +0000