---
in_progress: yes
default_highlighter: oils-sh
body_css_class: width50
---

Tables, Objects, and Documents - Notation, Query, Creation, Schema
=============================

<style>
  thead {
    background-color: #eee;
    font-weight: bold;
    text-align: left;
  }
  table {
    font-family: sans-serif;
    border-collapse: collapse;
  }

  tr {
    border-bottom: solid 1px;
    border-color: #ddd;
  }

  td {
    padding: 8px;  /* override default of 5px */
  }
</style>

This is part of **maximal** YSH!

<div id="toc">
</div>

## Philosophy

- Oils is Exterior-First
- Tables, Objects, Documents - CSV, JSON, HTML
  - Oils cleanup: TSV8, JSON8, HTM8

## Tables

<table>

- thead
  - Data Type
  - Notation
  - Query
  - Creation
  - Schema
- tr
  - Table
  - TSV, CSV
  - csvkit, xsv, awk-ish, etc. <br/>
    SQL, Data Frames
  - ?
  - ?
- tr
  - Object
  - JSON
  - jq <br/>
    JSONPath: MySQL/Postgres/sqlite support it?
  - jq
  - JSON Schema
- tr
  - Document
  - HTML5
  - DOM API like getElementById() <br/>
    CSS selectors <br/>
  - JSX Templates
  - ?
- tr
  - Document
  - XML
  - XPath? XQuery?
  - XSLT?
  - three:
    - DTD (document type definition, 1986)
    - RelaxNG (2001)
    - XML Schema aka XSD (2001)

<!-- TODO: ul-table should allow caption at the top -->
<caption>Existing</caption>

</table>

&nbsp;

<table>

- thead
  - Data Type
  - Notation
  - Query
  - Creation
  - Schema
  - In-Memory
- tr
  - Table
  - TSV8 (is valid TSV)
  - dplyr-like Data Frames <br/>
    Maybe some SQL-pipe subset thing?
  - `table { }`
  - ?
  - By column: dict of "arrays" <br/>
    By row: list of dicts <br/>
- tr
  - Object
  - JSON8 (superset)
  - JSONPath? <br/>
    jq as a reshaping language
  - Hay? `Package { }`
  - JSON Schema?
  - List and Dict
- tr
  - Document
  - HTM8 (subset)
  - CSS selectors
  - Markaby Style `div { }` <br/>
    "sed" style
  - ?
  - DocFrag - a span within a doc<br/>
    DocTree - an Obj representation<br/>
    ?

<caption>Oils</caption>

</table>
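
The two In-Memory shapes listed for tables hold the same data, and converting
between them is mechanical. A minimal Python sketch, assuming every row has
the same keys; the names `rows_to_columns` and `columns_to_rows` are made up
for illustration and are not Oils APIs:

```python
def rows_to_columns(rows):
    """By row (list of dicts) -> by column (dict of lists)."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def columns_to_rows(columns):
    """By column (dict of lists) -> by row (list of dicts)."""
    names = list(columns)
    # zip(*...) walks all columns in lockstep, yielding one tuple per row
    return [dict(zip(names, values)) for values in zip(*columns.values())]


by_row = [
    {"name": "alice", "age": 30},
    {"name": "bob", "age": 40},
]
by_column = rows_to_columns(by_row)
```

The by-column shape suits data-frame style operations on whole columns, while
the by-row shape is what falls out of parsing one TSV line at a time.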

## Note: SQL Databases Support all three models!

- sqlite, MySQL, and Postgres obviously have tables
- They all have JSON and JSONPath support!
  - JSONPath syntax might differ a bit?
- XML support
  - Postgres: XML type, XPath, more
  - MySQL: XML extraction functions only
  - sqlite: none
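
As a small illustration of tables and objects living in one database, here is
a sketch using Python's standard `sqlite3` module. It assumes the underlying
SQLite build includes the JSON functions (`json_extract`), which is common but
not guaranteed; the table and column names are invented for the example:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A plain table whose second column holds a JSON document as text
conn.execute("CREATE TABLE pkg (name TEXT, meta TEXT)")
conn.execute(
    "INSERT INTO pkg VALUES (?, ?)",
    ("oils", '{"lang": "ysh", "tags": ["shell", "data"]}'),
)

# A relational query (WHERE) combined with path-based access into the JSON
row = conn.execute(
    "SELECT name, json_extract(meta, '$.tags[1]') "
    "FROM pkg WHERE json_extract(meta, '$.lang') = 'ysh'"
).fetchone()
conn.close()
```

The path strings `'$.lang'` and `'$.tags[1]'` are SQLite's path syntax; as
noted above, other databases may spell paths a bit differently.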

## Design Issues

### Streaming

- jq has a line-based streaming model, by taking advantage of the fact that
  all JSON can be encoded without literal newlines
  - HTML/XML don't have this property
- Solution: Netstring-based streaming?
  - could work for both JSON8 and HTM8?
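
To make the netstring idea concrete: a netstring is `<length>:<bytes>,`, so
the framing never depends on what the payload contains. The sketch below is
plain Python working on plain JSON and HTML bytes, not an Oils API, and the
function names are hypothetical:

```python
import json


def encode_netstring(payload: bytes) -> bytes:
    """Frame a payload as <decimal length>:<payload>,"""
    return str(len(payload)).encode("ascii") + b":" + payload + b","


def decode_netstrings(data: bytes):
    """Yield each payload from a concatenation of netstrings."""
    pos = 0
    while pos < len(data):
        colon = data.index(b":", pos)
        length = int(data[pos:colon])
        start = colon + 1
        end = start + length
        if data[end:end + 1] != b",":
            raise ValueError("netstring is missing its trailing comma")
        yield data[start:end]
        pos = end + 1


# Both records contain literal newlines, which would break line-based framing
records = [
    json.dumps({"name": "oils"}, indent=2).encode("utf-8"),
    b"<p>\n  hello\n</p>",
]
stream = b"".join(encode_netstring(r) for r in records)
decoded = list(decode_netstrings(stream))
```

The cost compared with jq's line-based model is that the stream is no longer
friendly to line-oriented tools like `grep` or `head`.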

### Mutual Nesting

- JSON must be UTF-8, and JSON strings can escape any character, so JSON
  strings can contain JSON
  - ditto for JSON8, and J8 strings
- TSV cells can't contain tabs or newlines
  - so they can't contain TSV
  - if you remove the literal newlines and tabs (insignificant whitespace in
    JSON), they can contain JSON
- TSV8 cells use J8 strings, so they can contain JSON, TSV
- HTM8
  - you can escape everything, so you can put another HTM8 doc inside
  - and you can put JSON/JSON8 or TSV/TSV8
  - although are there whitespace rules?
  - all nodes can be like `<pre>` nodes, preserving whitespace, until you
    apply another function to them
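
The JSON cases in the list above can be checked with the Python standard
library. This is only a sketch over plain JSON and plain TSV; it says nothing
about how JSON8, TSV8, or HTM8 are implemented:

```python
import json

inner = {"msg": 'say "hi"\n', "cols": "a\tb"}

# JSON inside a JSON string: the encoder escapes quotes, newlines, and tabs
outer_text = json.dumps({"payload": json.dumps(inner)})
from_json = json.loads(json.loads(outer_text)["payload"])

# JSON inside a TSV cell: json.dumps() without indent= emits no literal
# tabs or newlines, so the cell cannot break the row apart
cell = json.dumps(inner)
tsv_row = "\t".join(["id-1", cell])
from_tsv = json.loads(tsv_row.split("\t")[1])
```

The reverse direction needs no special care: a TSV document is just text, so
it fits in a JSON string once its tabs and newlines are escaped.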

### HTML5 whitespace rules

- inside text context:
  - multiple whitespace chars collapsed into a single one
  - newlines converted to spaces
  - leading and trailing space is preserved
- `<pre>` and `<textarea>`
  - whitespace is preserved exactly as written
  - `<code>` on its own does not preserve whitespace; it has to sit inside
    `<pre>` for that
  - I guess HTM8 could use another function for this?
- quoted attributes
  - whitespace is untouched
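
The text-context rules above fit in one small function. This is a sketch of
the three bullets as written, not of a browser's full whitespace processing,
and `collapse_whitespace` is a made-up name:

```python
import re

# HTML's ASCII whitespace: space, tab, line feed, form feed, carriage return
_WHITESPACE_RUN = re.compile(r"[ \t\n\f\r]+")


def collapse_whitespace(text: str) -> str:
    """Replace every run of whitespace, newlines included, with one space."""
    return _WHITESPACE_RUN.sub(" ", text)


collapsed = collapse_whitespace("  a \n\n b\t c  ")
```

Leading and trailing runs survive as a single space each. A `<pre>`-like node
would simply skip this function, which matches the idea of HTM8 keeping
whitespace until a function is applied.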

## Related

- [stream-table-process.html](stream-table-process.html)
- [ysh-doc-processing.html](ysh-doc-processing.html)

## Notes

### RelaxNG, XSD, DTD

I didn't know there were these 3 schema types!

- DTD is older, associated with SGML, which dates to 1986
- XML Schema and Relax NG were both created in 2001
  - XML Schema uses XML syntax, which is bad!

### Algorithms?

- I looked at `jq`
- how do you do CSS selectors?
- how do you do JSONPath?

- XML Path
  - holistic twig joins - bounded memory
  - Hollandar Marx XPath Streaming
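
On the JSONPath question, one simple non-streaming approach is to keep a list
of "current nodes" and apply each path step to all of them. The sketch below
assumes a tiny subset of the syntax (`.key`, `[index]`, `[*]`) and is not how
jq or any particular JSONPath library works:

```python
import re

# One path step: .key, [index], or [*]
_STEP = re.compile(r"\.([A-Za-z_][A-Za-z0-9_]*)|\[(\d+)\]|\[(\*)\]")


def eval_path(path, doc):
    """Return all nodes in doc matched by a path like $.pkgs[*].name"""
    if not path.startswith("$"):
        raise ValueError("path must start with $")
    nodes = [doc]
    pos = 1
    while pos < len(path):
        m = _STEP.match(path, pos)
        if m is None:
            raise ValueError(f"unsupported syntax at offset {pos}")
        key, index, _star = m.groups()
        next_nodes = []
        for node in nodes:
            if key is not None:
                if isinstance(node, dict) and key in node:
                    next_nodes.append(node[key])
            elif index is not None:
                if isinstance(node, list) and int(index) < len(node):
                    next_nodes.append(node[int(index)])
            elif isinstance(node, list):
                next_nodes.extend(node)          # [*] over a list
            elif isinstance(node, dict):
                next_nodes.extend(node.values())  # [*] over a dict
        nodes = next_nodes
        pos = m.end()
    return nodes


doc = {"pkgs": [{"name": "a"}, {"name": "b"}]}
names = eval_path("$.pkgs[*].name", doc)
```

This holds the whole document in memory, so it does not address the
bounded-memory concern that the XPath streaming work is about.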

### Naming

- HTM8 doesn't use J8 strings
  - but TSV8 does

- Technically we could add J8 strings with
  - `j''`
  - and even templated strings with `$""`?
- hm
  - well then we would need `$[ j'' ]` and so forth

- Is `<span x=j'foo'>` identical to `<span x="j'foo'">` in HTML5?
  - it seems so (a quote inside an unquoted value is a parse error, but the
    resulting value is the same)
  - ditto for `$""`
- then we could disallow those patterns in double quotes?
  - they would have to be quoted like `&apos;` or something
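
The attribute question can be spot-checked with Python's `html.parser`. That
module is a lenient tokenizer rather than a conforming HTML5 parser, so treat
the result as supporting evidence only; `AttrCollector` and `attrs_of` are
names invented for this sketch:

```python
from html.parser import HTMLParser


class AttrCollector(HTMLParser):
    """Record the (name, value) attribute pairs of every start tag."""

    def __init__(self):
        super().__init__()
        self.attrs = []

    def handle_starttag(self, tag, attrs):
        self.attrs.extend(attrs)


def attrs_of(html_text):
    parser = AttrCollector()
    parser.feed(html_text)
    parser.close()
    return parser.attrs


unquoted = attrs_of("<span x=j'foo'>")
quoted = attrs_of('<span x="j\'foo\'">')
```

Both forms yield the attribute value `j'foo'`, so a `j''` prefix inside an
attribute could not be told apart from ordinary text by value alone.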