---
in_progress: yes
default_highlighter: oils-sh
body_css_class: width50
---

Tables, Objects, and Documents - Notation, Query, Creation, Schema
=============================

<style>
  thead {
    background-color: #eee;
    font-weight: bold;
    text-align: left;
  }
  table {
    font-family: sans-serif;
    border-collapse: collapse;
  }

  tr {
    border-bottom: solid 1px;
    border-color: #ddd;
  }

  td {
    padding: 8px;  /* override default of 5px */
  }
</style>

This is part of **maximal** YSH!

<div id="toc">
</div>

## Philosophy

- Oils is Exterior-First
- Tables, Objects, Documents - CSV, JSON, HTML
  - Oils cleanup: TSV8, JSON8, HTM8

## Tables

<table>

- thead
  - Data Type
  - Notation
  - Query
  - Creation
  - Schema
- tr
  - Table
  - TSV, CSV
  - csvkit, xsv, awk-ish, etc. <br/>
    SQL, Data Frames
  - ?
  - ?
- tr
  - Object
  - JSON
  - jq <br/>
    JSONPath: MySQL/Postgres/sqlite support it?
  - jq
  - JSON Schema
- tr
  - Document
  - HTML5
  - DOM API like getElementById() <br/>
    CSS selectors <br/>
  - JSX Templates
  - ?
- tr
  - Document
  - XML
  - XPath? XQuery?
  - XSLT?
  - three:
    - DTD (document type definition, 1986)
    - RelaxNG (2001)
    - XML Schema aka XSD (2001)

<!-- TODO: ul-table should allow caption at the top -->
<caption>Existing</caption>

</table>

<table>

- thead
  - Data Type
  - Notation
  - Query
  - Creation
  - Schema
  - In-Memory
- tr
  - Table
  - TSV8 (is valid TSV)
  - dplyr-like Data Frames <br/>
    Maybe some SQL-pipe subset thing?
  - `table { }`
  - ?
  - By column: dict of "arrays" <br/>
    By row: list of dicts <br/>
- tr
  - Object
  - JSON8 (superset)
  - JSONPath? <br/>
    jq as a reshaping language
  - Hay? `Package { }`
  - JSON Schema?
  - List and Dict
- tr
  - Document
  - HTM8 (subset)
  - CSS selectors
  - Markaby Style `div { }` <br/>
    "sed" style
  - ?
  - DocFrag - a span within a doc<br/>
    DocTree - an Obj representation<br/>
    ?

<caption>Oils</caption>

</table>

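
The two in-memory table layouts named in the last column ("by column: dict of arrays" and "by row: list of dicts") hold the same data and can be converted into each other. Here is a minimal Python sketch of that idea; the helper names `rows_to_columns` and `columns_to_rows` are hypothetical and not part of Oils.

```python
# Hypothetical helpers (not an Oils API): convert between the two layouts.

def rows_to_columns(rows):
    """List of dicts (by row) -> dict of lists (by column)."""
    if not rows:
        return {}
    names = list(rows[0])  # column order taken from the first row
    return {name: [row[name] for row in rows] for name in names}


def columns_to_rows(columns):
    """Dict of lists (by column) -> list of dicts (by row)."""
    names = list(columns)
    lengths = {len(col) for col in columns.values()}
    if len(lengths) > 1:
        raise ValueError('columns have different lengths')
    n = lengths.pop() if lengths else 0
    return [{name: columns[name][i] for name in names} for i in range(n)]


by_row = [
    {'name': 'alice', 'age': 42},
    {'name': 'bob', 'age': 25},
]
by_column = rows_to_columns(by_row)
print(by_column)
print(columns_to_rows(by_column) == by_row)
```

The column layout suits data-frame style operations on a whole column, while the row layout matches how TSV lines are read one at a time.
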
## Note: SQL Databases Support all three models!

- sqlite, MySQL, and Postgres obviously have tables
- They all have JSON and JSONPath support!
  - JSONPath syntax might differ a bit?
- XML support
  - Postgres: XML type, XPath, more
  - MySQL: XML extraction functions only
  - sqlite: none

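
As a quick check of the sqlite part of this note, here is a small sketch using Python's standard `sqlite3` module. It assumes the linked SQLite library includes the JSON functions (they are built in by default in recent SQLite releases); the table name `doc` and the sample data are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE doc (name TEXT, body TEXT)')  # a plain table
conn.execute(
    'INSERT INTO doc VALUES (?, ?)',
    ('example', '{"a": {"b": 7}, "tags": ["x", "y"]}'),  # a JSON object in a cell
)

# json_extract() takes a path expression starting with '$'
row = conn.execute(
    "SELECT name, json_extract(body, '$.a.b'), json_extract(body, '$.tags[1]') "
    "FROM doc"
).fetchone()
print(row)
conn.close()
```

The path expressions here use SQLite's own path syntax, which is one place where the "syntax might differ a bit" question shows up when comparing databases.
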
## Design Issues

### Streaming

- jq has a line-based streaming model, by taking advantage of the fact that
  all JSON can be encoded without literal newlines
  - HTML/XML don't have this property
- Solution: Netstring based streaming?
  - can do it for both JSON8 and HTM8 ?

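
To make the netstring idea concrete: a netstring is a decimal byte length, a colon, the payload, and a trailing comma, so the payload may contain newlines freely. Below is a minimal Python sketch; `write_netstring` and `read_netstrings` are hypothetical names, and nothing here describes an existing Oils feature.

```python
import io


def write_netstring(out, payload: bytes) -> None:
    """Write one record as <length>:<payload>, to a binary stream."""
    out.write(b'%d:' % len(payload))
    out.write(payload)
    out.write(b',')


def read_netstrings(inp):
    """Yield payloads from a binary stream until EOF."""
    while True:
        digits = b''
        while True:
            c = inp.read(1)
            if c == b'':
                if digits:
                    raise ValueError('truncated length prefix')
                return  # clean EOF between records
            if c == b':':
                break
            if not c.isdigit():
                raise ValueError('bad length byte %r' % c)
            digits += c
        if not digits:
            raise ValueError('empty length prefix')
        n = int(digits)
        payload = inp.read(n)
        if len(payload) != n or inp.read(1) != b',':
            raise ValueError('truncated or malformed netstring')
        yield payload


# An HTML fragment with a literal newline, and a JSON document
docs = [b'<p>line one\nline two</p>', b'{"k": "v"}\n']
buf = io.BytesIO()
for d in docs:
    write_netstring(buf, d)
buf.seek(0)
decoded = list(read_netstrings(buf))
print(decoded == docs)
```

Because the reader relies only on the length prefix, the same framing works for JSON, HTML, or any other byte payload.
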
### Mutual Nesting

- JSON must be UTF-8, so JSON strings can contain JSON
  - ditto for JSON8, and J8 strings
- TSV cells can't contain tabs or newlines
  - so they can't contain TSV
  - if you remove all the newlines, they can contain JSON
- TSV8 cells use J8 strings, so they can contain JSON, TSV
- HTM8
  - you can escape everything, so you can put another HTM8 doc inside
  - and you can put JSON/JSON8 or TSV/TSV8
  - although are there whitespace rules?
    - all nodes can be like `<pre>` nodes, preserving whitespace, until
      you apply another function to it

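
The first two nesting cases can be checked with plain JSON and TSV. This Python sketch uses only the standard `json` module; the sample data is made up, and it does not model J8 strings or TSV8.

```python
import json

inner = {'msg': 'tab\there', 'lines': 'a\nb'}
inner_json = json.dumps(inner)

# Case 1: JSON inside a JSON string. The inner text is escaped on the way in
# and recovered with two rounds of decoding.
outer_json = json.dumps({'payload': inner_json})
recovered = json.loads(json.loads(outer_json)['payload'])

# Case 2: JSON inside a TSV cell. With json.dumps() defaults (no indent),
# tabs and newlines in strings are written as \t and \n escapes, so the
# encoded text contains no literal tab or newline.
has_raw_separators = '\t' in inner_json or '\n' in inner_json
tsv_row = '\t'.join(['row1', inner_json])
cells = tsv_row.split('\t')

print(recovered == inner, has_raw_separators, len(cells))
```

A plain TSV cell cannot hold another TSV row this way, because TSV has no escape for a literal tab or newline.
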
### HTML5 whitespace rules

- inside text context:
  - multiple whitespace chars collapsed into a single one
  - newlines converted to spaces
  - leading and trailing space is preserved
- `<pre> <code> <textarea>`
  - whitespace is preserved exactly as written
  - I guess HTM8 could use another function for this?
- quoted attributes
  - whitespace is untouched

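
The text-context rule above can be sketched as a single regex substitution. This is a rough Python illustration of the collapsing step only; `collapse_text` is a hypothetical name, and a real renderer applies further rules depending on CSS and the surrounding element.

```python
import re

# Runs of tab, line feed, form feed, carriage return, or space
_WS_RUN = re.compile(r'[ \t\n\f\r]+')


def collapse_text(s: str) -> str:
    """Collapse each run of whitespace (newlines included) to one space."""
    return _WS_RUN.sub(' ', s)


collapsed = collapse_text('  a \n\n b\t c  ')
print(repr(collapsed))
```

Whitespace-preserving content would simply skip this function, which fits the idea of HTM8 keeping text as written until another function is applied to it.
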
## Related

- [stream-table-process.html](stream-table-process.html)
- [ysh-doc-processing.html](ysh-doc-processing.html)

## Notes

### RelaxNG, XSD, DTD

I didn't know there were these 3 schema types!

- DTD is older, associated with SGML created in 1986
- XML Schema and Relax NG created in 2001
  - XML Schema uses XML syntax, which is bad!

### Algorithms?

- I looked at `jq`
- how do you do CSS selectors?
- how do you do JSONPath?

- XML Path
  - holistic twig joins - bounded memory
  - Hollandar Marx XPath Streaming

### Naming

- HTM8 doesn't use J8 strings
  - but TSV8 does

- Technically we could add j8 strings with
  - j''
  - and even templated strings with $"" ?
    - hm
  - well then we would need $[ j'' ] and so forth

Is `<span x=j'foo'>` identical to `<span x="j'foo'">` in HTML5?

- it seems so
  - ditto for `$""`
- then we could disallow those patterns in double quotes?
  - they would have to be quoted like `&apos;` or something
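
One way to probe the last question is to feed both spellings to a lenient parser and compare the attribute values it reports. The sketch below uses Python's standard `html.parser`, which is not a full HTML5 parser, so treat the result as a hint rather than a ruling; `AttrCollector` and `start_tags` are hypothetical names.

```python
from html.parser import HTMLParser


class AttrCollector(HTMLParser):
    """Record (tag, attrs) for every start tag seen."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        self.seen.append((tag, attrs))


def start_tags(html):
    parser = AttrCollector()
    parser.feed(html)
    parser.close()
    return parser.seen


unquoted = start_tags("<span x=j'foo'>")
quoted = start_tags('<span x="j\'foo\'">')
print(unquoted == quoted)
```

If both spellings yield the same attribute value, a `j''` syntax in attributes could not be told apart from an ordinary value after parsing, which is why the double-quoted form would need some escape.
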