ul-table: Markdown Tables Without New Syntax
================================

`ul-table` is an HTML processor that lets you write **tables** as bulleted
**lists**, in Markdown.


<!--

- Solve the "column alignment problem" concisely.
  - HTML punts on this, as explained below.
-->

<div id="toc">
</div>

## Simple Example

To make this table:

<style>
table {
  margin: 0 auto;
}
td {
  padding-left: 1em;
  padding-right: 1em;
}
</style>

<table>

- thead
  - Shell
  - Version
- tr
  - [bash](https://www.gnu.org/software/bash/)
  - 5.2
- tr
  - [OSH](https://oils.pub/)
  - 0.25.0

</table>

You write:

<!-- TODO: Add pygments highlighting -->

```
<table>

- thead
  - Shell
  - Version
- tr
  - [bash](https://www.gnu.org/software/bash/)
  - 5.2
- tr
  - [OSH](https://oils.pub/)
  - 0.25.0

</table>
```

Any Markdown processor will produce this:

- thead
  - Shell
  - Version
- tr
  - [bash](https://www.gnu.org/software/bash/)
  - 5.2
- tr
  - [OSH](https://oils.pub/)
  - 0.25.0

And then **our** `ul-table` plugin transforms that into the table shown.

So the conversion takes **2 steps**. The intermediate form is what sourcehut
or Github will show, because they currently don't support `ul-table`.

This is good, because it means that `ul-table` degrades gracefully! You can
use it anywhere without worrying about breakage.

## About `ul-table`

### Why?

Because it's tedious to read, write, and edit `<tr>` and `<td>` and `</td>` and
`</tr>`. Aligning columns is also tedious in HTML.

<!--
This means your docs are still readable without it, e.g. on sourcehut or
Github. It degrades gracefully.
-->

Other design goals:

- Don't invent any new Markdown syntax.
- Scale to large, complex tables.
- Expose the **full** power of HTML, unlike other solutions.

### Structure

You make tables with a **two-level Markdown list**, between `<table>` tags.
The top level list contains either:

<table>

- tr
  - `thead`
  - zero or one, at the beginning
- tr
  - `tr`
  - zero or more, after `thead`

</table>

The second level contains the contents of cells, but you **don't** write `td`
or `<td>`.

(This format looks similar to [tables in
reStructuredText](https://sublime-and-sphinx-guide.readthedocs.io/en/latest/tables.html)).

### Markdown &rarr; HTML &rarr; HTML Conversion

As mentioned, it takes two steps to convert:

1. Any Markdown translator will produce a
   `<table> <ul> <li> ... </li> </ul> </table>` structure.
1. **Our** `ul-table` plugin transforms that into a
   `<table> <tr> <td> </td> </tr> </table>` structure, which is a normal HTML
   table.

So `ul-table` is an HTML processor, **not** a Markdown processor. But it's
meant to be used with Markdown.
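
The second step can be sketched with a DOM-style API. This is only an illustration under simplifying assumptions, not the code in `doctools/ul_table.py`: it assumes the output of the first step is well-formed XML, so that Python's standard `xml.etree.ElementTree` can parse it, and it ignores `<cell-attrs />` and `<row-attrs />`. The function name `ul_to_table` is made up for this example.

```python
import xml.etree.ElementTree as ET


def ul_to_table(html: str) -> str:
    """Rewrite <table><ul><li>tr<ul><li>cell</li></ul></li></ul></table>
    as <table><tr><td>cell</td></tr></table>.

    Assumption: the input is well-formed XML. Real HTML needs an HTML parser.
    """
    table = ET.fromstring(html)
    outer = table.find('ul')
    if outer is None:
        return html  # no list inside the table, so nothing to convert

    new_table = ET.Element('table', table.attrib)
    for row_li in outer.findall('li'):
        kind = (row_li.text or '').strip()  # the text 'thead' or 'tr'
        inner = row_li.find('ul')
        cells = inner.findall('li') if inner is not None else []

        tr = ET.Element('tr')
        for cell_li in cells:
            td = ET.SubElement(tr, 'td')
            td.text = cell_li.text
            td.extend(list(cell_li))  # keep inline children such as <a>

        if kind == 'thead':
            ET.SubElement(new_table, 'thead').append(tr)
        else:
            new_table.append(tr)

    return ET.tostring(new_table, encoding='unicode')
```

Markdown output is HTML rather than XML (unquoted attributes, entities like `&nbsp;`), so a production version would need an HTML parser or lexer instead.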

## Details

### Comparison: Tedious Inline HTML

Here's the equivalent in CommonMark:

    <table>
      <thead>
        <tr>
          <td>Shell</td>
          <td>Version</td>
        </tr>
      </thead>
      <tr>
        <td>

    <!-- be careful not to indent this 4 spaces! -->
    [bash](https://www.gnu.org/software/bash/)

        </td>
        <td>5.2</td>
      </tr>
      <tr>
        <td>

    [OSH](https://oils.pub/)

        </td>
        <td>0.25.0</td>
      </tr>

    </table>

It uses the rule where you can embed Markdown inside HTML inside Markdown.
With `ul-table`, you **don't** need this mutual nesting.

The `ul-table` text is also shorter!

---

Trivia: with CommonMark, you get an extra `<p>` element:

    <td>
      <p>OSH</p>
    </td>

`ul-table` can produce simpler HTML:

    <td>
      OSH
    </td>

### Stylesheet

To make the table look nice, I add a `<style>` tag, inside Markdown:

    <style>
    table {
      margin: 0 auto;
    }
    td {
      padding-left: 1em;
      padding-right: 1em;
    }
    </style>

## Adding HTML Attributes

HTML attributes like `<tr class=foo>` and `<td id=bar>` let you format and
style your table.

You can add attributes to cells, columns, and rows.

### Cells

Add cell attributes with a `cell-attrs` tag **before** the cell contents:

    - thead
      - Name
      - Age
    - tr
      - Alice
      - <cell-attrs class=num /> 42

It's important that `cell-attrs` is a **self-closing** tag:

    <cell-attrs />  # Yes
    <cell-attrs>    # No: this is an opening tag

How does this work? `ul-table` takes the attributes from `<cell-attrs />`, and
puts them on the generated `<td>`.
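
A minimal sketch of that step, using `xml.etree.ElementTree` from the Python standard library on a single well-formed `<li>`. The helper `li_to_td` is hypothetical, and the attribute is quoted (`class="num"`) only because an XML parser requires it; this is not how `doctools/ul_table.py` is written.

```python
import xml.etree.ElementTree as ET


def li_to_td(li_xml: str) -> str:
    """Convert one <li> cell to a <td>, moving attributes from a leading
    <cell-attrs /> tag onto the <td>.

    Assumption: li_xml is well-formed XML with quoted attribute values.
    """
    li = ET.fromstring(li_xml)
    td = ET.Element('td')
    td.text = li.text
    children = list(li)

    # Only honor <cell-attrs /> when it comes before the cell contents.
    leading_text = (li.text or '').strip()
    if children and children[0].tag == 'cell-attrs' and not leading_text:
        attrs = children.pop(0)
        td.attrib.update(attrs.attrib)
        td.text = attrs.tail  # the text that followed the tag

    td.extend(children)
    return ET.tostring(td, encoding='unicode')
```

For example, `li_to_td('<li><cell-attrs class="num" /> 42</li>')` yields a `<td>` that carries `class="num"`, and the `<cell-attrs />` tag itself disappears.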

### Columns

Add attributes to **every cell in a column** the same way, except in the
`thead` section:

    - thead
      - Name
      - <cell-attrs class=num /> Age
    - tr
      - Alice
      - 42  <!-- this cell gets class=num -->
    - tr
      - Bob
      - 9   <!-- this cell gets class=num -->

This is particularly useful for aligning numbers to the right:

    <style>
    .num {
      text-align: right;
    }
    </style>

Example:

<style>
.num {
  text-align: right;
}
</style>

<table>

- thead
  - Name
  - <cell-attrs class=num /> Age
- tr
  - Alice
  - 42
- tr
  - Bob
  - 9

</table>

If the same attribute appears in the `thead` and a `tr` section, the values are
**concatenated**, with a space. Example:

    <td class="from-thead from-tr">
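
The concatenation rule can be written as a small merge function. This is a sketch of the rule as described here, with a hypothetical name `merge_attrs` and plain dictionaries standing in for parsed attributes.

```python
from typing import Dict


def merge_attrs(thead_attrs: Dict[str, str],
                tr_attrs: Dict[str, str]) -> Dict[str, str]:
    """Combine column attributes (from thead) with cell attributes (from tr).

    When both define the same attribute, the values are joined with a
    space, thead value first.
    """
    merged = dict(thead_attrs)
    for name, value in tr_attrs.items():
        if name in merged:
            merged[name] = merged[name] + ' ' + value
        else:
            merged[name] = value
    return merged
```

With `{'class': 'from-thead'}` for the column and `{'class': 'from-tr'}` for the cell, the merged `class` is `from-thead from-tr`, matching the `<td>` above.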

### Rows

Add row attributes like this:

    - thead
      - Name
      - Age
    - tr
      - Alice
      - 42
    - tr <row-attrs class="special-row" />
      - Bob
      - 9

## Example: Markdown and HTML Inside Cells

Here's an example that uses more features. Source code of this table:
[doc/ul-table.md]($oils-src).

[bash]: $xref

<table id="foo">

- thead
  - Shell
  - Version
  - Example Code
- tr
  - [bash][]
  - 5.2
  - ```
    echo sh=$bash
    ls /tmp | wc -l
    echo
    ```
- tr
  - [dash]($xref)
  - 1.5
  - <em>Inline HTML</em>
- tr
  - [mksh]($xref)
  - 4.0
  - <table>
      <tr>
        <td>HTML table</td>
        <td>inside</td>
      </tr>
      <tr>
        <td>this table</td>
        <td>no way to re-enter inline markdown though?</td>
      </tr>
    </table>
- tr
  - [zsh]($xref)
  - 3.6
  - Unordered List
    - one
    - two
- tr
  - [yash]($xref)
  - 1.0
  - Ordered List
    1. one
    1. two
- tr
  - [ksh]($xref)
  - This is
    paragraph one.

    This is
    paragraph two
  - Another cell with ...

    ... multiple paragraphs.

</table>

## Markdown Quirks to Be Aware Of

Here are some quirks I ran into when creating ul-tables.

(1) CommonMark doesn't allow empty list items:

    - thead
      -
      - above is not rendered as a list item

You can work around this by using a comment, or invisible character:

    - tr
      - <!-- empty -->
      - above is OK
    - tr
      - &nbsp;
      - also OK

- [Related CommonMark thread](https://talk.commonmark.org/t/clarify-following-empty-list-items-in-0-31-2/4599)

A similar issue is that line breaks affect backtick expansion to `<code>`:

    - tr
      - <cell-attrs /> <!-- we need something on this line -->
        ... More `proc` features ...

I think this is also because `<cell-attrs />` doesn't "count" as text, so the
list item is considered empty.

(2) Likewise, a cell with a literal hyphen may need a comment or space in front of it:

    - tr
      - <!-- hyphen --> -
      - &nbsp; -

## Comparisons

### CommonMark Doesn't Have Tables

Related discussions:

- 2014: [Tables in pure Markdown](https://talk.commonmark.org/t/tables-in-pure-markdown/81)
- 2022: [Obvious Markdown syntax for Tables](https://talk.commonmark.org/t/obvious-markdown-syntax-for-tables/4143/9)

### Github Tables are Awkward

Github-flavored Markdown has a non-standard extension for tables:

- [Github: Organizing Information With Tables](https://docs.github.com/en/get-started/writing-on-github/working-with-advanced-formatting/organizing-information-with-tables)

This style is hard to read and write, especially with large tables:

```
| Command | Description |
| --- | --- |
| git status | List all new or modified files |
| git diff | Show file differences that haven't been staged |
```

Our style is less noisy, and more easily editable:

```
<table>

- thead
  - Command
  - Description
- tr
  - git status
  - List all new or modified files
- tr
  - git diff
  - Show file differences that haven't been staged

</table>
```

- Related wiki page: [Markdown Tables]($wiki)


## Conclusion

`ul-table` is a nice way of writing and maintaining HTML tables. The appendix
has links and details.

### Related Docs

- [How We Build Oils Documentation](doc-toolchain.html)
- [Examples of HTML Plugins](doc-plugins.html)

## Appendix: Implementation

- [doctools/ul_table.py]($oils-src) - about 500 lines
- [lazylex/html.py]($oils-src) - about 500 lines

### Algorithm Notes

- lazy lexing
- recursive descent parser
  - TODO: show grammar

TODO: I would like someone to produce a **DOM**-based implementation!

Our implementation is pretty low-level. It's meant to avoid the "big load
anti-pattern" (allocating too much), so it's necessarily more verbose.

A DOM-based implementation should be much less than 1000 lines.

## Appendix: Real Examples

- [Guide to Procs and Funcs]($oils-doc:proc-func.html) has a big `ul-table`.
  - Source: [doc/proc-func.md]($oils-src)

I converted the tables in these September posts to `ul-table`:

- [What Oils Looks Like in 2024](https://www.oilshell.org/blog/2024/09/project-overview.html)
- [After 8 Years, Oils Is Still Small and Flexible](https://www.oilshell.org/blog/2024/09/line-counts.html)
- [Garbage Collection Makes YSH Different](https://www.oilshell.org/blog/2024/09/gc.html)
- [A Retrospective on the Oils Project](https://www.oilshell.org/blog/2024/09/retrospective.html)

The markup was much shorter and simpler after conversion!

TODO:

- More tables to make
  - Interior/Exterior
  - Narrow Waist
- Wiki pages could use conversion
  - [Alternative Shells]($wiki)
  - [Alternative Regex Syntax]($wiki)
  - [Survey of Config Languages]($wiki)
  - [Polyglot Language Understanding]($wiki)
  - [The Biggest Shell Programs in the World]($wiki)

## HTML Quirks

- `<th>` is like `<td>`, but it belongs in `<thead><tr>`. Browsers make it
  bold and centered.
- You can't put `class=` on `<colgroup>` and `<col>` and align columns left and
  right.
  - You have to put `class=` on *every* `<td>` cell instead.
  - `ul-table` solves this with "inherited" `<cell-attrs />` in the `thead`
    section.

<!--

### FAQ

(1) Why do rows with attributes look like `tr <row-attrs />`? The first `tr`
doesn't seem necessary.

This is because of the CommonMark quirk above: a list item without **text** is
treated as **empty**. So we require the extra `tr` text.

It's also consistent with plain rows, without attributes.

-->

## Ideas for Features

We could help users edit well-formed tables with enforced column names:

    - thead
      - <cell-attrs ult-name=name /> Name
      - <cell-attrs ult-name=age /> Age
    - tr
      - <cell-attrs ult-name=name /> Hi
      - <cell-attrs ult-name=age /> 5

This is a bit verbose, but may be worth it for large tables.

Less verbose syntax idea:

    - thead
      - <ult col=NAME /> <cell-attrs class=foo /> Name
      - <ult col=AGE /> Age
    - tr
      - <ult col=NAME /> Hi
      - <ult col=AGE /> 5

Even less verbose:

    - thead
      - {NAME} Name
      - {AGE} Age
    - tr
      - {NAME} Hi
      - {AGE} 5

The obvious problem is that we might want the literal text `{NAME}` in the
header. It's unlikely, but possible.
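
None of these ideas is implemented. As a rough sketch of what enforcing the `{NAME}` form could look like, the following checks that each row repeats the header's labels in the same order, then strips them. The label pattern (uppercase letters and underscores) and the names `split_label` and `check_row` are assumptions made for this example.

```python
import re
from typing import List, Optional, Tuple

# Hypothetical label syntax: {NAME} at the very start of a cell.
LABEL_RE = re.compile(r'\{([A-Z_]+)\}\s*')


def split_label(cell: str) -> Tuple[Optional[str], str]:
    """Return (label, rest). The label is None if the cell has no prefix."""
    m = LABEL_RE.match(cell)
    if m is None:
        return None, cell
    return m.group(1), cell[m.end():]


def check_row(header: List[str], row: List[str]) -> List[str]:
    """Verify that the row's labels match the header's, in order.

    Returns the row's cells with the labels removed.
    Raises ValueError on a mismatch.
    """
    expected = [split_label(c)[0] for c in header]
    pairs = [split_label(c) for c in row]
    actual = [label for label, _ in pairs]
    if actual != expected:
        raise ValueError('expected columns %r, got %r' % (expected, actual))
    return [text for _, text in pairs]
```

A row with swapped or missing labels raises `ValueError`, which is the kind of editing mistake this feature would catch in a large table.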