ul-table: Markdown Tables Without New Syntax
================================

`ul-table` is an HTML processor that lets you write **tables** as bulleted
**lists**, in Markdown.

It's a short program I wrote because I got tired of reading and writing `<tr>`
and `<td>` and `</td>` and `</tr>`. And I got tired of aligning numbers by
writing `<td class="num">` for every cell.

<div id="toc">
</div>

## Simple Example

Let's see how it works. How do you make this table?

<style>
table {
  margin: 0 auto;
}
td {
  padding-left: 1em;
  padding-right: 1em;
}
</style>

<table>

- thead
  - Shell
  - Version
- tr
  - [bash](https://www.gnu.org/software/bash/)
  - 5.2
- tr
  - [OSH](https://oils.pub/)
  - 0.25.0

</table>

With `ul-table`, you create a **two-level** Markdown list, inside `<table>`
tags:

<!-- TODO: Add pygments highlighting -->

```
<table>

- thead
  - Shell
  - Version
- tr
  - [bash](https://www.gnu.org/software/bash/)
  - 5.2
- tr
  - [OSH](https://oils.pub/)
  - 0.25.0

</table>
```

The header and data rows are at the top level, and the cells are indented under
them.

---

The conversion takes **2 steps**: it's Markdown &rarr; HTML &rarr; HTML.

First, any Markdown processor will produce this list structure, with `<ul>` and
`<li>`:

- thead
  - Shell
  - Version
- tr
  - [bash](https://www.gnu.org/software/bash/)
  - 5.2
- tr
  - [OSH](https://oils.pub/)
  - 0.25.0

Second, **our** `ul-table` plugin parses and transforms that into a table, with
`<tr>` and `<td>`:

<table>

- thead
  - Shell
  - Version
- tr
  - [bash](https://www.gnu.org/software/bash/)
  - 5.2
- tr
  - [OSH](https://oils.pub/)
  - 0.25.0

</table>

So `ul-table` is an HTML processor, **not** a Markdown processor. But it's
meant to be used with Markdown.
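
To make the second step concrete, here is a minimal sketch in Python. It is
**not** the code in `doctools/ul_table.py`. It assumes the Markdown processor
emitted well-formed XML, it uses the standard library's
`xml.etree.ElementTree`, and the names `LIST_HTML` and `ul_to_table` are made
up for illustration. It also assumes header cells become `<th>`, and it drops
links and other inline markup to stay short.

```python
import xml.etree.ElementTree as ET

# Hypothetical output of step 1: roughly what a Markdown processor emits for
# a two-level list inside <table> tags (links omitted for brevity).
LIST_HTML = """
<table>
<ul>
<li>thead
  <ul>
    <li>Shell</li>
    <li>Version</li>
  </ul>
</li>
<li>tr
  <ul>
    <li>bash</li>
    <li>5.2</li>
  </ul>
</li>
</ul>
</table>
"""


def ul_to_table(html):
    """Step 2: rewrite a two-level <ul> inside <table> into rows and cells."""
    root = ET.fromstring(html)  # the outer <table> element
    table = ET.Element('table')
    for row_li in root.find('ul').findall('li'):
        kind = (row_li.text or '').strip()  # 'thead' or 'tr'
        if kind == 'thead':
            parent, cell_tag = ET.SubElement(table, 'thead'), 'th'
        else:
            parent, cell_tag = table, 'td'
        tr = ET.SubElement(parent, 'tr')
        for cell_li in row_li.find('ul').findall('li'):
            cell = ET.SubElement(tr, cell_tag)
            cell.text = (cell_li.text or '').strip()  # plain text only
    return ET.tostring(table, encoding='unicode')


print(ul_to_table(LIST_HTML))
```

The sketch builds a whole tree in memory, which is fine for a demo but is a
different trade-off from the low-level implementation described in the
appendix.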

## Design

### Goals

<!--
This means your docs are still readable without it, e.g. on sourcehut or
Github. It degrades gracefully.
-->

- Don't invent any new syntax.
  - It reuses your knowledge of Markdown &mdash; e.g. hyperlinks.
  - It reuses your knowledge of HTML &mdash; e.g. attributes on tags.
- Large, complex tables should be maintainable.
- The user should have the **full** power of HTML. We don't hide it under
  another language, like MediaWiki does.
- Degrade gracefully. Because it's just Markdown, you **won't break** docs by
  adding it.
  - The intermediate list form is what sourcehut or Github will show.

### Comparison

Compared to other table markup formats, `ul-table` is shorter, less noisy, and
easier to edit:

- [ul-table Comparison: Github, Wikipedia, reStructuredText, AsciiDoc](ul-table-compare.html)

## Details

### ul-table "Grammar"

Recall that a `ul-table` is a **two-level Markdown list**, between `<table>`
tags. The top level list contains either:

<table>

- tr
  - `thead`
  - zero or one, at the beginning
- tr
  - `tr`
  - zero or more, after `thead`

</table>

The second level contains the contents of cells, but you **don't** write `td`
or `<td>`.
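
The top-level rule is small enough to check in a few lines. This is a sketch
of that rule only, not the real parser. The function name `check_rows` and its
error messages are invented here, and it takes the top-level labels as a plain
list of strings.

```python
def check_rows(labels):
    """Return an error message, or None if the top-level labels are valid.

    Rule from the grammar above: an optional 'thead' first, then any
    number of 'tr' items.
    """
    for i, label in enumerate(labels):
        if label == 'thead':
            if i != 0:
                return 'thead must be the first item'
        elif label != 'tr':
            return 'expected thead or tr, got %r' % label
    return None


print(check_rows(['thead', 'tr', 'tr']))  # valid
print(check_rows(['tr', 'thead']))        # thead is not first
```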

### Stylesheet

To make the table look nice, I add a `<style>` tag, inside Markdown:

    <style>
    table {
      margin: 0 auto;
    }
    td {
      padding-left: 1em;
      padding-right: 1em;
    }
    </style>

## Adding HTML Attributes

HTML attributes like `<tr class=foo>` and `<td id=bar>` let you format and
style your table.

You can add attributes to cells, columns, and rows.

### Cells

<style>
.hi { background-color: thistle }
</style>

<table>

- thead
  - Name
  - Age
- tr
  - Alice
  - 42 <cell-attrs class=hi />
- tr
  - Bob
  - 9

</table>

Add cell attributes with a `cell-attrs` tag after the cell contents:

```
- thead
  - Name
  - Age
- tr
  - Alice
  - 42 <cell-attrs class=hi />
- tr
  - Bob
  - 9
```

You must use a **self-closing** tag:

    <cell-attrs />  # Yes
    <cell-attrs>    # No: this is an opening tag

Notice that `ul-table` takes the attributes from the `<cell-attrs />` tag, and
puts them on the generated `<td>` tag.
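
Here is a rough sketch of that attribute move, again with
`xml.etree.ElementTree` rather than the real lexer-based code. The helper name
`cell_from_li` is hypothetical. Because the sketch uses an XML parser, the
attribute value has to be quoted (`class="hi"`), even though `ul-table` input
can use the unquoted HTML form.

```python
import xml.etree.ElementTree as ET


def cell_from_li(li):
    """Build a <td> from an <li>, copying attributes from <cell-attrs />."""
    td = ET.Element('td')
    td.text = (li.text or '').strip()  # cell text before the marker tag
    marker = li.find('cell-attrs')
    if marker is not None:
        td.attrib.update(marker.attrib)  # the marker itself is not copied
    return td


li = ET.fromstring('<li>42 <cell-attrs class="hi" /></li>')
print(ET.tostring(cell_from_li(li), encoding='unicode'))
# prints: <td class="hi">42</td>
```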

### Columns

<style>
.num {
  text-align: right;
}
</style>

<table>

- thead
  - Name
  - Age <cell-attrs class=num />
- tr
  - Alice
  - 42
- tr
  - Bob
  - 9

</table>

To add attributes to **every cell in a column**, put `<cell-attrs />` in the
`thead` section:

<style>
.num {
  background-color: bisque;
  text-align: right;
}
</style>

```
- thead
  - Name
  - Age <cell-attrs class=num />
- tr
  - Alice
  - 42  <!-- this cell gets class=num -->
- tr
  - Bob
  - 9   <!-- this cell gets class=num -->
```

Then every `<td>` in the column will "inherit" those attributes. This is
useful for aligning numbers to the right:

    <style>
    .num {
      text-align: right;
    }
    </style>

If the same attribute appears in a column in both `thead` and `tr`, the values
are **concatenated**, with a space. Example:

    <td class="from-thead from-tr">
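
The concatenation rule can be written as a small merge over two attribute
dictionaries. This is a sketch of the rule as stated above, not the project's
code. The name `merge_attrs` and the dict representation are assumptions.

```python
def merge_attrs(col_attrs, cell_attrs):
    """Combine column attributes (from thead) with a cell's own attributes.

    When both define the same name, the values are joined with a space,
    column value first.
    """
    merged = dict(col_attrs)
    for name, value in cell_attrs.items():
        if name in merged:
            merged[name] = merged[name] + ' ' + value
        else:
            merged[name] = value
    return merged


print(merge_attrs({'class': 'from-thead'}, {'class': 'from-tr'}))
# prints: {'class': 'from-thead from-tr'}
```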

### Rows

<style>
.special-row {
  background-color: powderblue;
}
</style>

<table>

- thead
  - Name
  - Age
- tr
  - Alice
  - 42
- tr <row-attrs class="special-row" />
  - Bob
  - 9

</table>

To add row attributes, put `<row-attrs />` after the `- tr`:

    - thead
      - Name
      - Age
    - tr
      - Alice
      - 42
    - tr <row-attrs class="special-row" />
      - Bob
      - 9
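
A row works the same way one level up: the attributes on `<row-attrs />` end
up on the generated `<tr>`. The sketch below assumes well-formed XML input and
uses a made-up helper name, `row_from_li`. It handles plain-text cells only.

```python
import xml.etree.ElementTree as ET


def row_from_li(li):
    """Build a <tr> from a top-level <li>, copying <row-attrs /> attributes."""
    tr = ET.Element('tr')
    marker = li.find('row-attrs')
    if marker is not None:
        tr.attrib.update(marker.attrib)
    inner = li.find('ul')  # the second-level list holds the cells
    cells = inner.findall('li') if inner is not None else []
    for cell_li in cells:
        td = ET.SubElement(tr, 'td')
        td.text = (cell_li.text or '').strip()
    return tr


li = ET.fromstring(
    '<li>tr <row-attrs class="special-row" />'
    '<ul><li>Bob</li><li>9</li></ul></li>')
print(ET.tostring(row_from_li(li), encoding='unicode'))
# prints: <tr class="special-row"><td>Bob</td><td>9</td></tr>
```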

## More Complex Example

This example uses more features, like Markdown and HTML inside cells. You may
want to view the source text for this table: [doc/ul-table.md]($oils-src).

[bash]: $xref

<table id="foo">

- thead
  - Shell
  - Version
  - Example Code
- tr
  - [bash][]
  - 5.2
  - ```
    echo sh=$bash
    ls /tmp | wc -l
    echo
    ```
- tr
  - [dash]($xref)
  - 1.5
  - <em>Inline HTML</em>
- tr
  - [mksh]($xref)
  - 4.0
  - <table>
      <tr>
        <td>HTML table</td>
        <td>inside</td>
      </tr>
      <tr>
        <td>this table</td>
        <td>no way to re-enter inline markdown though?</td>
      </tr>
    </table>
- tr
  - [zsh]($xref)
  - 3.6
  - Unordered List
    - one
    - two
- tr
  - [yash]($xref)
  - 1.0
  - Ordered List
    1. one
    1. two
- tr
  - [ksh]($xref)
  - This is
    paragraph one.

    This is
    paragraph two
  - Another cell with ...

    ... multiple paragraphs.

</table>

&nbsp;

Another table:

<style>
.osh-code { color: darkred }
.ysh-code { color: darkblue }
</style>


<table>

- thead
  - OSH
  - YSH
- tr
  - ```
    my-copy() {
      cp --verbose "$@"
    }
    ```
    <cell-attrs class=osh-code />
  - ```
    proc my-copy {
      cp --verbose @ARGV
    }
    ```
    <cell-attrs class=ysh-code />
- tr
  - x
  - y

</table>


## Markdown Quirks

Here are some quirks I ran into when using `ul-table`.

(1) CommonMark doesn't allow empty list items:

    - thead
      -
      - above is not rendered as a list item

You can work around this by using a comment, or invisible character:

    - tr
      - <!-- empty -->
      - above is OK
    - tr
      - &nbsp;
      - also OK

- [Related CommonMark thread](https://talk.commonmark.org/t/clarify-following-empty-list-items-in-0-31-2/4599)

(2) Similarly, a cell with a literal hyphen may need a comment or space in
front of it:

    - tr
      - <!-- hyphen --> -
      - &nbsp; -

## Conclusion

`ul-table` is a nice way of writing and maintaining HTML tables. The appendix
has links and details.

### Related Docs

- [ul-table Comparison: Github, Wikipedia, reStructuredText, AsciiDoc](ul-table-compare.html)
- [How We Build Oils Documentation](doc-toolchain.html)
- [Examples of HTML Plugins](doc-plugins.html)

## Appendix: Implementation

- [doctools/ul_table.py]($oils-src) - about 500 lines
- [lazylex/html.py]($oils-src) - about 500 lines

### Notes on the Algorithm

- lazy lexing
- recursive descent parser
  - TODO: show grammar
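
As a toy illustration of lazy lexing in general, the generator below yields
one token at a time and never builds a tree. It is **not** the API of
`lazylex/html.py`. The token names and the regex are assumptions chosen for
this sketch, and the regex ignores comments, CDATA, and `>` inside attribute
values.

```python
import re

# Simplified: a token is either one tag, or a run of text up to the next '<'.
TOKEN_RE = re.compile(r'<[^>]*>|[^<]+')


def lex(html):
    """Yield (kind, text) pairs lazily, in document order."""
    for m in TOKEN_RE.finditer(html):
        s = m.group(0)
        if s.startswith('</'):
            yield 'EndTag', s
        elif s.startswith('<') and s.endswith('/>'):
            yield 'StartEndTag', s  # self-closing, like <cell-attrs />
        elif s.startswith('<'):
            yield 'StartTag', s
        else:
            yield 'Text', s


for kind, text in lex('<li>42 <cell-attrs class=hi /></li>'):
    print(kind, repr(text))
```

A recursive descent parser can pull tokens from such a generator as it needs
them, so only the current token has to be held in memory.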

TODO: I would like someone to produce a **DOM**-based implementation!

Our implementation is pretty low-level. It's meant to avoid the "big load
anti-pattern" (allocating too much), so it's necessarily more verbose.

A DOM-based implementation should be much less than 1000 lines.

## Appendix: Real Examples

- [Guide to Procs and Funcs]($oils-doc:proc-func.html) has a big `ul-table`.
  - Source: [doc/proc-func.md]($oils-src)

I converted the tables in these September 2024 posts to `ul-table`:

- [What Oils Looks Like in 2024](https://www.oilshell.org/blog/2024/09/project-overview.html)
- [After 8 Years, Oils Is Still Small and Flexible](https://www.oilshell.org/blog/2024/09/line-counts.html)
- [Garbage Collection Makes YSH Different](https://www.oilshell.org/blog/2024/09/gc.html)
- [A Retrospective on the Oils Project](https://www.oilshell.org/blog/2024/09/retrospective.html)

The markup was much shorter and simpler after conversion!

TODO:

- More tables to make
  - Interior/Exterior
  - Narrow Waist
- Wiki pages could use conversion
  - [Alternative Shells]($wiki)
  - [Alternative Regex Syntax]($wiki)
  - [Survey of Config Languages]($wiki)
  - [Polyglot Language Understanding]($wiki)
  - [The Biggest Shell Programs in the World]($wiki)

## HTML Quirks

- `<th>` is like `<td>`, but it belongs in `<thead><tr>`. Browsers make it
  bold and centered.
- `<colgroup>` and `<col>` often don't do what I want.
  - As mentioned above, you can't put `class=` on columns to align them to the
    right or left. You have to put `class=` on *every* `<td>` cell instead.

<!--

### FAQ

(1) Why do rows with attributes look like `tr <row-attrs />`? The first `tr`
doesn't seem necessary.

This is because of the CommonMark quirk above: a list item without **text** is
treated as **empty**. So we require the extra `tr` text.

It's also consistent with plain rows, without attributes.

-->

## Ideas for Features

- Support `tfoot`?
- Emit `tbody`?

---

We could help users edit well-formed tables with enforced column names:

    - thead
      - <cell-attrs ult-name=name /> Name
      - <cell-attrs ult-name=age /> Age
    - tr
      - <cell-attrs ult-name=name /> Hi
      - <cell-attrs ult-name=age /> 5

This is a bit verbose, but may be worth it for large tables.

Less verbose syntax idea:

    - thead
      - <ult col=NAME /> <cell-attrs class=foo /> Name
      - <ult col=AGE /> Age
    - tr
      - <ult col=NAME /> Hi
      - <ult col=AGE /> 5

Even less verbose:

    - thead
      - {NAME} Name
      - {AGE} Age
    - tr
      - {NAME} Hi
      - {AGE} 5

The obvious problem is that we might want the literal text `{NAME}` in the
header. It's unlikely, but possible.
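
To show what "enforced column names" could mean for the `{NAME}` form, here is
a speculative sketch. None of this exists in `ul-table` today. The function
`check_columns`, the uppercase-only name pattern, and the error format are all
invented for illustration.

```python
import re

# Assumed convention: a cell may start with an uppercase name in braces.
NAME_RE = re.compile(r'^\{([A-Z]+)\}\s*')


def column_names(cells):
    """Extract the {NAME} prefix of each cell, or None when it's missing."""
    names = []
    for cell in cells:
        m = NAME_RE.match(cell)
        names.append(m.group(1) if m else None)
    return names


def check_columns(header, rows):
    """Return a list of error strings for rows that don't match the header."""
    expected = column_names(header)
    errors = []
    for i, row in enumerate(rows):
        got = column_names(row)
        if got != expected:
            errors.append('row %d: expected %s, got %s' % (i, expected, got))
    return errors


header = ['{NAME} Name', '{AGE} Age']
print(check_columns(header, [['{NAME} Hi', '{AGE} 5']]))  # no errors
print(check_columns(header, [['{AGE} 5', '{NAME} Hi']]))  # cells swapped
```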


<!--

TODO: We should detect cell-attrs before the closing `</li>`, or in any
position?

<table>

- thead
  - OSH
  - YSH
- tr
  - ```
    my-copy() {
      cp --verbose "$@"
    }
    ```
    <cell-attrs class=osh-code />
  - ```
    proc my-copy {
      cp --verbose @ARGV
    }
    ```
    <cell-attrs class=ysh-code />

</table>

-->


<!--
TODO:

- change back to oilshell.org/ for publishing
- Compare to wikipedia
  - https://en.wikipedia.org/wiki/Help:Table
    - table caption - this is just <caption>
    - rowspan
-->