1 | ul-table: Markdown Tables Without New Syntax
|
2 | ================================
|
3 |
|
4 | `ul-table` is an HTML processor that lets you write **tables** as bulleted
|
5 | **lists**, in Markdown:
|
6 |
|
7 | <table>
|
8 |
|
9 | - thead
|
10 | - Shell
|
11 | - Version
|
12 | - tr
|
13 | - bash
|
14 | - 5.2
|
15 | - tr
|
16 | - OSH
|
17 | - 0.25.0
|
18 |
|
19 | </table>
|
20 |
|
21 | <table>
|
22 |
|
23 | - thead
|
24 | - Shell
|
25 | - Version
|
26 | - tr
|
27 | - bash
|
28 | - 5.2
|
29 | - tr
|
30 | - OSH
|
31 | - 0.25.0
|
32 |
|
33 | </table>
|
34 |
|
35 | I designed this format because it's tedious to read, write, and edit `<tr>` and
|
36 | `<td>` and `</td>` and `</tr>`. Aligning columns is also tedious in HTML.
|
37 |
|
38 | `ul-table` does **not** involve new Markdown syntax, only a new interpretation.
|
39 |
|
40 | This means your docs are still readable without it, e.g. on sourcehut or
|
41 | GitHub. It degrades gracefully.
|
42 |
|
43 | ---
|
44 |
|
45 | Other design goals:
|
46 |
|
47 | - It should scale to large, complex tables.
|
48 | - It should expose the **full** power of HTML, unlike other solutions.
|
49 |
|
50 | <!--
|
51 |
|
52 | - Solve the "column alignment problem" concisely.
|
53 | - HTML punts on this, as explained below.
|
54 | -->
|
55 |
|
56 | <div id="toc">
|
57 | </div>
|
58 |
|
59 | ## Simple Example
|
60 |
|
61 | Let's add hyperlinks to our example, to make it more realistic.
|
62 |
|
63 | <style>
|
64 | table {
|
65 | margin: 0 auto;
|
66 | }
|
67 | td {
|
68 | padding: 0.2em;
|
69 | }
|
70 | </style>
|
71 |
|
72 | <table>
|
73 |
|
74 | - thead
|
75 | - Shell
|
76 | - Version
|
77 | - tr
|
78 | - [bash](https://www.gnu.org/software/bash/)
|
79 | - 5.2
|
80 | - tr
|
81 | - [OSH](https://www.oilshell.org/)
|
82 | - 0.25.0
|
83 |
|
84 | </table>
|
85 |
|
86 | ### `ul-table` Syntax
|
87 |
|
88 | You can make this table with a **two-level Markdown list**, with Markdown
|
89 | hyperlink syntax:
|
90 |
|
91 | <table> <!-- don't forget this tag -->
|
92 |
|
93 | - thead
|
94 | - Shell
|
95 | - Version
|
96 | - tr
|
97 | - [bash](https://www.gnu.org/software/bash/)
|
98 | - 5.2
|
99 | - tr
|
100 | - [OSH](https://www.oilshell.org/)
|
101 | - 0.25.0
|
102 |
|
103 | </table>
|
104 |
|
105 | (This format looks similar to [tables in
|
106 | reStructuredText](https://sublime-and-sphinx-guide.readthedocs.io/en/latest/tables.html).)
|
107 |
|
108 | It takes two steps to convert:
|
109 |
|
110 | 1. Any Markdown translator will produce a
|
111 | `<table> <ul> <li> ... </li> </ul> </table>` structure.
|
112 | 1. **Our** `ul-table` plugin transforms that into a
|
113 | `<table> <tr> <td> </td> </tr> </table>` structure, which is a normal HTML
|
114 | table.
|
115 |
|
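To make step 2 concrete, here's a minimal DOM-style sketch in Python. It is **not** the real `doctools/ul_table.py`: the name `ul_to_table` is made up for illustration, it assumes the Markdown translator's output is well-formed XML, and it ignores `<cell-attrs />` and `<row-attrs />`.

```python
import xml.etree.ElementTree as ET


def ul_to_table(html):
    """Rewrite <table><ul><li>...</li></ul></table> into rows and cells.

    Hypothetical sketch: assumes well-formed XML and plain 'thead' / 'tr' rows.
    """
    table = ET.fromstring(html)
    out = ET.Element('table', table.attrib)

    for row_li in table.find('ul').findall('li'):
        kind = (row_li.text or '').strip()  # 'thead' or 'tr'
        tr = ET.Element('tr')

        cells = row_li.find('ul')
        for cell_li in (cells.findall('li') if cells is not None else []):
            td = ET.SubElement(tr, 'td')
            td.text = cell_li.text
            td.extend(list(cell_li))  # keep child elements such as <a>

        if kind == 'thead':
            ET.SubElement(out, 'thead').append(tr)
        else:
            out.append(tr)

    return ET.tostring(out, encoding='unicode')


SRC = (
    '<table><ul>'
    '<li>thead<ul><li>Shell</li><li>Version</li></ul></li>'
    '<li>tr<ul>'
    '<li><a href="https://www.oilshell.org/">OSH</a></li>'
    '<li>0.25.0</li>'
    '</ul></li>'
    '</ul></table>'
)
print(ul_to_table(SRC))
```

The outer `<li>` text picks the row kind, and each inner `<li>` becomes a `<td>` with its content carried over unchanged.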
116 | ### Comparison: Markdown Uses Tedious Inline HTML
|
117 |
|
118 | Here's the equivalent in CommonMark:
|
119 |
|
120 | <table>
|
121 | <thead>
|
122 | <tr>
|
123 | <td>Shell</td>
|
124 | <td>Version</td>
|
125 | </tr>
|
126 | </thead>
|
127 | <tr>
|
128 | <td>
|
129 |
|
130 | <!-- be careful not to indent this 4 spaces! -->
|
131 | [bash](https://www.gnu.org/software/bash/)
|
132 |
|
133 | </td>
|
134 | <td>5.2</td>
|
135 | </tr>
|
136 | <tr>
|
137 | <td>
|
138 |
|
139 | [OSH](https://www.oilshell.org/)
|
140 |
|
141 | </td>
|
142 | <td>0.25.0</td>
|
143 | </tr>
|
144 |
|
145 | </table>
|
146 |
|
147 | This relies on the CommonMark rule that lets you embed Markdown inside HTML inside Markdown.
|
148 | With `ul-table`, you **don't** need this mutual nesting.
|
149 |
|
150 | The text you have to write is also a lot shorter!
|
151 |
|
152 | ---
|
153 |
|
154 | Trivia: with CommonMark, you also get an extra `<p>` element:
|
155 |
|
156 | <td>
|
157 | <p>OSH</p>
|
158 | </td>
|
159 |
|
160 | `ul-table` can produce simpler HTML:
|
161 |
|
162 | <td>
|
163 | OSH
|
164 | </td>
|
165 |
|
166 | ### Stylesheet
|
167 |
|
168 | To make the table look nice, I add a `<style>` tag, inside Markdown:
|
169 |
|
170 | <style>
|
171 | table {
|
172 | margin: 0 auto;
|
173 | }
|
174 | td {
|
175 | padding: 0.2em;
|
176 | }
|
177 | </style>
|
178 |
|
179 | ### The Untranslated HTML
|
180 |
|
181 | If you omit the `<table>` tags, then the rendered HTML looks like this:
|
182 |
|
183 | - thead
|
184 | - Shell
|
185 | - Version
|
186 | - tr
|
187 | - [bash]($xref)
|
188 | - 5.2
|
189 | - tr
|
190 | - [OSH]($xref)
|
191 | - 0.25.0
|
192 |
|
193 | This is how your tables will appear on sourcehut or GitHub, which don't (yet)
|
194 | have `ul-table` support. Remember, `ul-table` is **not** an extension to
|
195 | Markdown syntax.
|
196 |
|
197 | ## Adding HTML Attributes
|
198 |
|
199 | HTML attributes like `<tr class=foo>` and `<td id=bar>` let you format and
|
200 | style your table.
|
201 |
|
202 | You can add attributes to cells, columns, and rows.
|
203 |
|
204 | ### Cells
|
205 |
|
206 | Add cell attributes with a `cell-attrs` tag **before** the cell contents:
|
207 |
|
208 | - thead
|
209 | - Name
|
210 | - Age
|
211 | - tr
|
212 | - Alice
|
213 | - <cell-attrs class=num /> 42
|
214 |
|
215 | It's important that `cell-attrs` is a **self-closing** tag:
|
216 |
|
217 | <cell-attrs /> # Yes
|
218 | <cell-attrs> # No: this is an opening tag
|
219 |
|
220 | How does this work? `ul-table` takes the attributes from `<cell-attrs />` and
|
221 | puts them on the generated `<td>`.
|
222 |
|
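As a rough illustration of that step, here's a regex-based sketch in Python. The real implementation uses a lexer rather than a regex, and `cell_to_td` is a made-up name; the sketch also assumes attribute values never contain `>`.

```python
import re

# Matches a leading self-closing tag such as <cell-attrs class=num />
_CELL_ATTRS = re.compile(r'\s*<cell-attrs\b([^>]*?)\s*/>\s*')


def cell_to_td(inner_html):
    """Move attributes from a leading <cell-attrs /> onto the <td> (sketch)."""
    m = _CELL_ATTRS.match(inner_html)
    if m is None:
        # No self-closing <cell-attrs /> at the start: plain cell
        return '<td>%s</td>' % inner_html

    attrs = m.group(1).strip()
    rest = inner_html[m.end():]
    if attrs:
        return '<td %s>%s</td>' % (attrs, rest)
    return '<td>%s</td>' % rest


print(cell_to_td('<cell-attrs class=num /> 42'))
print(cell_to_td('Alice'))
```

Note that the pattern only matches the self-closing form, so an opening `<cell-attrs>` tag is left in the cell untouched.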
223 | ### Columns
|
224 |
|
225 | Add attributes to **every cell in a column** in the same way, but in the
|
226 | `thead` section:
|
227 |
|
228 | - thead
|
229 | - Name
|
230 | - <cell-attrs class=num /> Age
|
231 | - tr
|
232 | - Alice
|
233 | - 42 # this cell gets class=num
|
234 | - tr
|
235 | - Bob
|
236 | - 25 # this cell gets class=num
|
237 |
|
238 | This is particularly useful for aligning numbers to the right:
|
239 |
|
240 | <style>
|
241 | .num {
|
242 | text-align: right;
|
243 | }
|
244 | </style>
|
245 |
|
246 | If the same attribute appears in the `thead` and a `tr` section, the values
|
247 | are **concatenated**, with a space. Example:
|
248 |
|
249 | <td class="from-thead from-tr">
|
250 |
|
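The concatenation rule can be sketched in a few lines of Python. This is an illustration under assumptions, not code from `ul_table.py`: `merge_attrs` and `render_td` are hypothetical names, and attributes are modeled as plain dicts.

```python
import html


def merge_attrs(col_attrs, cell_attrs):
    """Combine a column's thead attributes with a cell's own attributes.

    If both define the same attribute, the values are joined with a space,
    thead value first.
    """
    merged = dict(col_attrs)
    for name, value in cell_attrs.items():
        if name in merged:
            merged[name] = merged[name] + ' ' + value
        else:
            merged[name] = value
    return merged


def render_td(col_attrs, cell_attrs, text):
    """Render one <td> with the merged attributes (text is not escaped)."""
    attrs = merge_attrs(col_attrs, cell_attrs)
    attr_str = ''.join(
        ' %s="%s"' % (name, html.escape(value, quote=True))
        for name, value in attrs.items())
    return '<td%s>%s</td>' % (attr_str, text)


print(render_td({'class': 'from-thead'}, {'class': 'from-tr'}, '42'))
```

A cell with no attributes of its own simply inherits the column's, which is how `class=num` reaches every cell in the `Age` column.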
251 | ### Rows
|
252 |
|
253 | Add row attributes like this:
|
254 |
|
255 | - thead
|
256 | - Name
|
257 | - Age
|
258 | - tr
|
259 | - Alice
|
260 | - 42
|
261 | - tr <row-attrs class="special-row" />
|
262 | - Bob
|
263 | - 25
|
264 |
|
265 | ## Example: Markdown and HTML Inside Cells
|
266 |
|
267 | Here's an example that uses more features. Source code of this table:
|
268 | [doc/ul-table.md]($oils-src).
|
269 |
|
270 | [bash]: $xref
|
271 |
|
272 | <table id="foo">
|
273 |
|
274 | - thead
|
275 | - Shell
|
276 | - Version
|
277 | - Example Code
|
278 | - tr
|
279 | - [bash][]
|
280 | - 5.2
|
281 | - ```
|
282 | echo sh=$bash
|
283 | ls /tmp | wc -l
|
284 | echo
|
285 | ```
|
286 | - tr
|
287 | - [dash]($xref)
|
288 | - 1.5
|
289 | - <em>Inline HTML</em>
|
290 | - tr
|
291 | - [mksh]($xref)
|
292 | - 4.0
|
293 | - <table>
|
294 | <tr>
|
295 | <td>HTML table</td>
|
296 | <td>inside</td>
|
297 | </tr>
|
298 | <tr>
|
299 | <td>this table</td>
|
300 | <td>no way to re-enter inline markdown though?</td>
|
301 | </tr>
|
302 | </table>
|
303 | - tr
|
304 | - [zsh]($xref)
|
305 | - 3.6
|
306 | - Unordered List
|
307 | - one
|
308 | - two
|
309 | - tr
|
310 | - [yash]($xref)
|
311 | - 1.0
|
312 | - Ordered List
|
313 | 1. one
|
314 | 1. two
|
315 | - tr
|
316 | - [ksh]($xref)
|
317 | - This is
|
318 | paragraph one.
|
319 |
|
320 | This is
|
321 | paragraph two
|
322 | - Another cell with ...
|
323 |
|
324 | ... multiple paragraphs.
|
325 |
|
326 | </table>
|
327 |
|
328 | ## Markdown Quirks to Be Aware Of
|
329 |
|
330 | Here are some quirks I ran into when creating ul-tables.
|
331 |
|
332 | (1) CommonMark doesn't allow empty list items:
|
333 |
|
334 | - thead
|
335 | -
|
336 | - above is not rendered as a list item
|
337 |
|
338 | You can work around this by using a comment or an invisible character:
|
339 |
|
340 | - tr
|
341 | - <!-- empty -->
|
342 | - above is OK
|
343 | - tr
|
344 | -
|
345 | - also OK
|
346 |
|
347 | - [Related CommonMark thread](https://talk.commonmark.org/t/clarify-following-empty-list-items-in-0-31-2/4599)
|
348 |
|
349 | A similar issue is that line breaks affect backtick expansion to `<code>`:
|
350 |
|
351 | - tr
|
352 | - <cell-attrs /> <!-- we need something on this line -->
|
353 | ... More `proc` features ...
|
354 |
|
355 | I think this is also because `<cell-attrs />` doesn't "count" as text, so the
|
356 | list item is considered empty.
|
357 |
|
358 | (2) Likewise, a cell with a literal hyphen may need a comment or space in front of it:
|
359 |
|
360 | - tr
|
361 | - <!-- hyphen --> -
|
362 | - -
|
363 |
|
364 | ## Comparisons
|
365 |
|
366 | ### CommonMark Doesn't Have Tables
|
367 |
|
368 | Related discussions:
|
369 |
|
370 | - 2014: [Tables in pure Markdown](https://talk.commonmark.org/t/tables-in-pure-markdown/81)
|
371 | - 2022: [Obvious Markdown syntax for Tables](https://talk.commonmark.org/t/obvious-markdown-syntax-for-tables/4143/9)
|
372 |
|
373 | ### GitHub Tables Are Awkward
|
374 |
|
375 | GitHub-flavored Markdown has a non-standard extension for tables:
|
376 |
|
377 | - [GitHub: Organizing Information With Tables](https://docs.github.com/en/get-started/writing-on-github/working-with-advanced-formatting/organizing-information-with-tables)
|
378 |
|
379 | This style is hard to read and write, especially with large tables:
|
380 |
|
381 | ```
|
382 | | Command | Description |
|
383 | | --- | --- |
|
384 | | git status | List all new or modified files |
|
385 | | git diff | Show file differences that haven't been staged |
|
386 | ```
|
387 |
|
388 | Our style is less noisy, and more easily editable:
|
389 |
|
390 | ```
|
391 | - thead
|
392 | - Command
|
393 | - Description
|
394 | - tr
|
395 | - git status
|
396 | - List all new or modified files
|
397 | - tr
|
398 | - git diff
|
399 | - Show file differences that haven't been staged
|
400 | ```
|
401 |
|
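The two styles carry the same information, so converting a simple pipe table is mechanical. Here's a small Python sketch of that mapping. `gfm_to_ul_table` is a hypothetical helper, not part of `ul-table`, and it assumes a header row, one delimiter row, and no escaped `|` inside cells.

```python
def gfm_to_ul_table(gfm):
    """Convert a simple pipe table to ul-table list syntax (sketch)."""
    rows = []
    for line in gfm.strip().splitlines():
        # Drop the outer pipes, then split the remaining text into cells
        cells = [c.strip() for c in line.strip().strip('|').split('|')]
        rows.append(cells)

    header, body = rows[0], rows[2:]  # rows[1] is the | --- | --- | delimiter

    out = ['<table>', '', '- thead']
    out.extend('  - ' + c for c in header)
    for row in body:
        out.append('- tr')
        out.extend('  - ' + c for c in row)
    out.extend(['', '</table>'])
    return '\n'.join(out)


GFM = """
| Command | Description |
| --- | --- |
| git status | List all new or modified files |
| git diff | Show file differences that haven't been staged |
"""
print(gfm_to_ul_table(GFM))
```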
402 | - Related wiki page: [Markdown Tables]($wiki)
|
403 |
|
404 |
|
405 | ## Conclusion
|
406 |
|
407 | `ul-table` is a nice way of writing and maintaining HTML tables. The
|
408 | appendices have links and details.
|
409 |
|
410 | ### Related Docs
|
411 |
|
412 | - [How We Build Oils Documentation](doc-toolchain.html)
|
413 | - [Examples of HTML Plugins](doc-plugins.html)
|
414 |
|
415 | ## Appendix: Implementation
|
416 |
|
417 | - [doctools/ul_table.py]($oils-src) - about 500 lines
|
418 | - [lazylex/html.py]($oils-src) - about 500 lines
|
419 |
|
420 | TODO:
|
421 |
|
422 | - Make it run under Python 3, including unit tests
|
423 | - De-couple it from cmark.so
|
424 | - Use Unix pipes, with a demo in `doctools/ul-table.sh`
|
425 |
|
426 | ### Algorithm Notes
|
427 |
|
428 | - lazy lexing
|
429 | - recursive descent parser
|
430 | - TODO: show grammar
|
431 |
|
432 | TODO:
|
433 |
|
434 | - I would like someone to produce a **DOM** implementation!
|
435 | - Our implementation is meant to avoid the "big load anti-pattern"
|
436 | (allocating too much), so it's necessarily more verbose. A DOM
|
437 | implementation should be much less than 1000 lines.
|
438 |
|
439 |
|
440 | ## Appendix: Real Examples
|
441 |
|
442 | - [Guide to Procs and Funcs]($oils-doc:proc-func.html) has a big `ul-table`.
|
443 | - Source: [doc/proc-func.md]($oils-src)
|
444 |
|
445 | I converted the tables in these September 2024 posts to `ul-table`:
|
446 |
|
447 | - [What Oils Looks Like in 2024](https://www.oilshell.org/blog/2024/09/project-overview.html)
|
448 | - [After 8 Years, Oils Is Still Small and Flexible](https://www.oilshell.org/blog/2024/09/line-counts.html)
|
449 | - [Garbage Collection Makes YSH Different](https://www.oilshell.org/blog/2024/09/gc.html)
|
450 | - [A Retrospective on the Oils Project](https://www.oilshell.org/blog/2024/09/retrospective.html)
|
451 |
|
452 | The markup was much shorter and simpler after conversion!
|
453 |
|
454 | TODO:
|
455 |
|
456 | - Tables to Make
|
457 | - Interior/Exterior
|
458 | - Narrow Waist
|
459 |
|
460 | - Wiki pages could use conversion
|
461 | - [Alternative Shells]($wiki)
|
462 | - [Alternative Regex Syntax]($wiki)
|
463 | - [Survey of Config Languages]($wiki)
|
464 | - [Polyglot Language Understanding]($wiki)
|
465 | - [The Biggest Shell Programs in the World]($wiki)
|
466 |
|
467 |
|
468 | ## HTML Quirks
|
469 |
|
470 | - `<th>` is like `<td>`, but it belongs in `<thead><tr>`. Browsers make it
|
471 | bold and centered.
|
472 | - You can't put `class=` on `<colgroup>` or `<col>` to align columns left and
|
473 | right.
|
474 | - You have to put `class=` on *every* `<td>` cell instead.
|
475 | - `ul-table` solves this with "inherited" `<cell-attrs />` in the `thead`
|
476 | section.
|
477 |
|
478 | <!--
|
479 |
|
480 | ### FAQ
|
481 |
|
482 | (1) Why do rows with attributes look like `tr <row-attrs />`? The first `tr`
|
483 | doesn't seem necessary.
|
484 |
|
485 | This is because of the CommonMark quirk above: a list item without **text** is
|
486 | treated as **empty**. So we require the extra `tr` text.
|
487 |
|
488 | It's also consistent with plain rows, without attributes.
|
489 |
|
490 | -->
|
491 |
|
492 | ## Feature Ideas
|
493 |
|
494 | We could help users edit well-formed tables with enforced column names:
|
495 |
|
496 | - thead
|
497 | - <cell-attrs ult-name=name /> Name
|
498 | - <cell-attrs ult-name=age /> Age
|
499 | - tr
|
500 | - <cell-attrs ult-name=name /> Hi
|
501 | - <cell-attrs ult-name=age /> 5
|
502 |
|
503 | This is a bit verbose, but may be worth it for large tables.
|
504 |
|
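The check itself could be small. Here's a sketch in Python of what "enforced column names" might mean. This is only a feature idea, so `check_column_names` is hypothetical, and it assumes the `ult-name` values have already been pulled out of each row.

```python
def check_column_names(thead_names, rows):
    """Return error strings for rows whose names differ from the thead's.

    thead_names: ult-name values from the thead, in order
    rows: one list of ult-name values per tr, in order
    """
    errors = []
    expected = list(thead_names)
    for i, row_names in enumerate(rows, start=1):
        if list(row_names) != expected:
            errors.append(
                'row %d: expected %s, got %s' % (i, expected, list(row_names)))
    return errors


print(check_column_names(['name', 'age'], [['name', 'age'], ['age', 'name']]))
```

An empty list means every row lines up with the header.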
505 | Less verbose syntax idea:
|
506 |
|
507 | - thead
|
508 | - <ult col=NAME /> <cell-attrs class=foo /> Name
|
509 | - <ult col=AGE /> Age
|
510 | - tr
|
511 | - <ult col=NAME /> Hi
|
512 | - <ult col=AGE /> 5
|
513 |
|
514 | Even less verbose:
|
515 |
|
516 | - thead
|
517 | - {NAME} Name
|
518 | - {AGE} Age
|
519 | - tr
|
520 | - {NAME} Hi
|
521 | - {AGE} 5
|
522 |
|
523 | The obvious problem is that we might want the literal text `{NAME}` in the
|
524 | header. It's unlikely, but possible.
|
525 |
|