1 | ul-table: Markdown Tables Without New Syntax
|
2 | ================================
|
3 |
|
4 | `ul-table` is an HTML processor that lets you write **tables** as bulleted
|
5 | **lists**, in Markdown.
|
6 |
|
7 |
|
8 | <!--
|
9 |
|
10 | - Solve the "column alignment problem" concisely.
|
11 | - HTML punts on this, as explained below.
|
12 | -->
|
13 |
|
14 | <div id="toc">
|
15 | </div>
|
16 |
|
17 | ## Simple Example
|
18 |
|
19 | To make this table:
|
20 |
|
21 | <style>
|
22 | table {
|
23 | margin: 0 auto;
|
24 | }
|
25 | td {
|
26 | padding-left: 1em;
|
27 | padding-right: 1em;
|
28 | }
|
29 | </style>
|
30 |
|
31 | <table>
|
32 |
|
33 | - thead
|
34 | - Shell
|
35 | - Version
|
36 | - tr
|
37 | - [bash](https://www.gnu.org/software/bash/)
|
38 | - 5.2
|
39 | - tr
|
40 | - [OSH](https://oils.pub/)
|
41 | - 0.25.0
|
42 |
|
43 | </table>
|
44 |
|
45 | You write:
|
46 |
|
47 | <!-- TODO: Add pygments highlighting -->
|
48 |
|
49 | ```
|
50 | <table>
|
51 |
|
52 | - thead
|
53 | - Shell
|
54 | - Version
|
55 | - tr
|
56 | - [bash](https://www.gnu.org/software/bash/)
|
57 | - 5.2
|
58 | - tr
|
59 | - [OSH](https://oils.pub/)
|
60 | - 0.25.0
|
61 |
|
62 | </table>
|
63 | ```
|
64 |
|
65 | Any Markdown processor will produce this:
|
66 |
|
67 | - thead
|
68 | - Shell
|
69 | - Version
|
70 | - tr
|
71 | - [bash](https://www.gnu.org/software/bash/)
|
72 | - 5.2
|
73 | - tr
|
74 | - [OSH](https://oils.pub/)
|
75 | - 0.25.0
|
76 |
|
77 | And then **our** `ul-table` plugin transforms that into the table shown.
|
78 |
|
79 | So the conversion takes **2 steps**. The intermediate form is what sourcehut
|
80 | or Github will show, because they currently don't support `ul-table`.
|
81 |
|
82 | This is good, because it means that `ul-table` degrades gracefully! You can
|
83 | use it anywhere without worrying about breakage.
|
84 |
|
85 | ## About `ul-table`
|
86 |
|
87 | ### Why?
|
88 |
|
89 | Because it's tedious to read, write, and edit `<tr>` and `<td>` and `</td>` and
|
90 | `</tr>`. Aligning columns is also tedious in HTML.
|
91 |
|
92 | <!--
|
93 | This means your docs are still readable without it, e.g. on sourcehut or
|
94 | Github. It degrades gracefully.
|
95 | -->
|
96 |
|
97 | Other design goals:
|
98 |
|
99 | - Don't invent any new Markdown syntax.
|
100 | - Scale to large, complex tables.
|
101 | - Expose the **full** power of HTML, unlike other solutions.
|
102 |
|
103 | ### Structure
|
104 |
|
105 | You make tables with a **two-level Markdown list**, between `<table>` tags.
|
106 | The top level list contains either:
|
107 |
|
108 | <table>
|
109 |
|
110 | - tr
|
111 | - `thead`
|
112 | - zero or one, at the beginning
|
113 | - tr
|
114 | - `tr`
|
115 | - zero or more, after `thead`
|
116 |
|
117 | </table>
|
118 |
|
119 | The second level contains the contents of cells, but you **don't** write `td`
|
120 | or `<td>`.
|
121 |
|
122 | (This format looks similar to [tables in
|
123 | reStructuredText](https://sublime-and-sphinx-guide.readthedocs.io/en/latest/tables.html)).
|
124 |
|
125 | ### Markdown → HTML → HTML Conversion
|
126 |
|
127 | As mentioned, it takes two steps to convert:
|
128 |
|
129 | 1. Any Markdown translator will produce a
|
130 | `<table> <ul> <li> ... </li> </ul> </table>` structure.
|
131 | 1. **Our** `ul-table` plugin transforms that into a
|
132 | `<table> <tr> <td> </td> </tr> </table>` structure, which is a normal HTML
|
133 | table.
|
134 |
|
135 | So `ul-table` is an HTML processor, **not** a Markdown processor. But it's
|
136 | meant to be used with Markdown.
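To make step 2 concrete, here is a minimal DOM-style sketch in Python. It is not the real implementation, which is lexer-based (see the appendix). The function name `ul_to_table` is made up for illustration, and the sketch assumes the Markdown output is well-formed XML. It ignores `<cell-attrs />` and `<row-attrs />`, and emits a plain `<td>` for every cell.

```python
import xml.etree.ElementTree as ET


def ul_to_table(html: str) -> str:
    """Rewrite <table><ul><li>... into <table><tr><td>... (illustrative sketch)."""
    table = ET.fromstring(html)  # assumption: input parses as XML
    top = table.find("ul")
    if top is None:  # not a ul-table, so leave it alone
        return html
    table.remove(top)

    for row_li in top.findall("li"):
        kind = (row_li.text or "").strip()  # "thead" or "tr"
        tr = ET.Element("tr")
        cells = row_li.find("ul")
        for cell_li in (cells.findall("li") if cells is not None else []):
            cell_li.tag = "td"  # reuse the <li> node, keeping its children
            cell_li.tail = None
            tr.append(cell_li)
        if kind == "thead":
            ET.SubElement(table, "thead").append(tr)
        else:
            table.append(tr)

    return ET.tostring(table, encoding="unicode")


if __name__ == "__main__":
    src = (
        "<table><ul>"
        "<li>thead<ul><li>Shell</li><li>Version</li></ul></li>"
        "<li>tr<ul><li>bash</li><li>5.2</li></ul></li>"
        "</ul></table>"
    )
    print(ul_to_table(src))
```

Because each `<li>` node is renamed rather than copied, whatever the Markdown processor put inside a cell (links, code, nested lists) passes through unchanged.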
|
137 |
|
138 | ## Details
|
139 |
|
140 | ### Comparison: Tedious Inline HTML
|
141 |
|
142 | Here's the equivalent in CommonMark:
|
143 |
|
144 | <table>
|
145 | <thead>
|
146 | <tr>
|
147 | <td>Shell</td>
|
148 | <td>Version</td>
|
149 | </tr>
|
150 | </thead>
|
151 | <tr>
|
152 | <td>
|
153 |
|
154 | <!-- be careful not to indent this 4 spaces! -->
|
155 | [bash](https://www.gnu.org/software/bash/)
|
156 |
|
157 | </td>
|
158 | <td>5.2</td>
|
159 | </tr>
|
160 | <tr>
|
161 | <td>
|
162 |
|
163 | [OSH](https://oils.pub/)
|
164 |
|
165 | </td>
|
166 | <td>0.25.0</td>
|
167 | </tr>
|
168 |
|
169 | </table>
|
170 |
|
171 | It uses the rule where you can embed Markdown inside HTML inside Markdown.
|
172 | With `ul-table`, you **don't** need this mutual nesting.
|
173 |
|
174 | The `ul-table` text is also shorter!
|
175 |
|
176 | ---
|
177 |
|
178 | Trivia: with CommonMark, you get an extra `<p>` element:
|
179 |
|
180 | <td>
|
181 | <p>OSH</p>
|
182 | </td>
|
183 |
|
184 | `ul-table` can produce simpler HTML:
|
185 |
|
186 | <td>
|
187 | OSH
|
188 | </td>
|
189 |
|
190 | ### Stylesheet
|
191 |
|
192 | To make the table look nice, I add a `<style>` tag, inside Markdown:
|
193 |
|
194 | <style>
|
195 | table {
|
196 | margin: 0 auto;
|
197 | }
|
198 | td {
|
199 | padding-left: 1em;
|
200 | padding-right: 1em;
|
201 | }
|
202 | </style>
|
203 |
|
204 | ## Adding HTML Attributes
|
205 |
|
206 | HTML attributes like `<tr class=foo>` and `<td id=bar>` let you format and
|
207 | style your table.
|
208 |
|
209 | You can add attributes to cells, columns, and rows.
|
210 |
|
211 | ### Cells
|
212 |
|
213 | Add cell attributes with a `cell-attrs` tag **before** the cell contents:
|
214 |
|
215 | - thead
|
216 | - Name
|
217 | - Age
|
218 | - tr
|
219 | - Alice
|
220 | - <cell-attrs class=num /> 42
|
221 |
|
222 | It's important that `cell-attrs` is a **self-closing** tag:
|
223 |
|
224 | <cell-attrs /> # Yes
|
225 | <cell-attrs> # No: this is an opening tag
|
226 |
|
227 | How does this work? `ul-table` takes the attributes from `<cell-attrs />` and
|
228 | puts them on the generated `<td>`.
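The attribute move can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the actual code: `move_cell_attrs` is a hypothetical helper, it works on the cell's HTML as a string, and it assumes attribute values contain no `>` character.

```python
import re

# Matches a leading self-closing tag such as <cell-attrs class=num />
CELL_ATTRS_RE = re.compile(r"^\s*<cell-attrs\b([^>]*?)/>\s*")


def move_cell_attrs(cell_html: str) -> str:
    """Wrap cell contents in <td>, moving any <cell-attrs /> attributes onto it."""
    m = CELL_ATTRS_RE.match(cell_html)
    if m is None:
        return "<td>" + cell_html + "</td>"
    attrs = m.group(1).strip()
    rest = cell_html[m.end():]  # cell contents after the tag
    open_tag = "<td " + attrs + ">" if attrs else "<td>"
    return open_tag + rest + "</td>"


if __name__ == "__main__":
    print(move_cell_attrs("<cell-attrs class=num /> 42"))
    print(move_cell_attrs("Alice"))
```

An opening tag like `<cell-attrs>` does not match the pattern, so the sketch treats it as ordinary cell content, which mirrors why the tag must be self-closing.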
|
229 |
|
230 | ### Columns
|
231 |
|
232 | Add attributes to **every cell in a column** the same way, except in the
|
233 | `thead` section:
|
234 |
|
235 | - thead
|
236 | - Name
|
237 | - <cell-attrs class=num /> Age
|
238 | - tr
|
239 | - Alice
|
240 | - 42 <!-- this cell gets class=num -->
|
241 | - tr
|
242 | - Bob
|
243 | - 9 <!-- this cell gets class=num -->
|
244 |
|
245 | This is particularly useful for aligning numbers to the right:
|
246 |
|
247 | <style>
|
248 | .num {
|
249 | text-align: right;
|
250 | }
|
251 | </style>
|
252 |
|
253 | Example:
|
254 |
|
255 | <style>
|
256 | .num {
|
257 | text-align: right;
|
258 | }
|
259 | </style>
|
260 |
|
261 | <table>
|
262 |
|
263 | - thead
|
264 | - Name
|
265 | - <cell-attrs class=num /> Age
|
266 | - tr
|
267 | - Alice
|
268 | - 42
|
269 | - tr
|
270 | - Bob
|
271 | - 9
|
272 |
|
273 | </table>
|
274 |
|
275 | If the same attribute appears in the `thead` and a `tr` section, the values are
|
276 | **concatenated**, with a space. Example:
|
277 |
|
278 | <td class="from-thead from-tr">
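The concatenation rule can be written down as a small Python function. This is a sketch of the rule as described above, assuming attributes are held in plain dicts; `merge_attrs` is a hypothetical name, not a function from the real code.

```python
def merge_attrs(col_attrs: dict, cell_attrs: dict) -> dict:
    """Combine thead (column) attributes with a tr cell's own attributes.

    When both define the same attribute, the values are joined with a
    space, thead value first.
    """
    merged = dict(col_attrs)
    for name, value in cell_attrs.items():
        if name in merged:
            merged[name] = merged[name] + " " + value
        else:
            merged[name] = value
    return merged


if __name__ == "__main__":
    print(merge_attrs({"class": "from-thead"}, {"class": "from-tr", "id": "x"}))
```

Joining with a space suits `class`, where the result is simply a list of both classes.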
|
279 |
|
280 | ### Rows
|
281 |
|
282 | Add row attributes like this:
|
283 |
|
284 | - thead
|
285 | - Name
|
286 | - Age
|
287 | - tr
|
288 | - Alice
|
289 | - 42
|
290 | - tr <row-attrs class="special-row" />
|
291 | - Bob
|
292 | - 9
|
293 |
|
294 | ## Example: Markdown and HTML Inside Cells
|
295 |
|
296 | Here's an example that uses more features. Source code of this table:
|
297 | [doc/ul-table.md]($oils-src).
|
298 |
|
299 | [bash]: $xref
|
300 |
|
301 | <table id="foo">
|
302 |
|
303 | - thead
|
304 | - Shell
|
305 | - Version
|
306 | - Example Code
|
307 | - tr
|
308 | - [bash][]
|
309 | - 5.2
|
310 | - ```
|
311 | echo sh=$bash
|
312 | ls /tmp | wc -l
|
313 | echo
|
314 | ```
|
315 | - tr
|
316 | - [dash]($xref)
|
317 | - 1.5
|
318 | - <em>Inline HTML</em>
|
319 | - tr
|
320 | - [mksh]($xref)
|
321 | - 4.0
|
322 | - <table>
|
323 | <tr>
|
324 | <td>HTML table</td>
|
325 | <td>inside</td>
|
326 | </tr>
|
327 | <tr>
|
328 | <td>this table</td>
|
329 | <td>no way to re-enter inline markdown though?</td>
|
330 | </tr>
|
331 | </table>
|
332 | - tr
|
333 | - [zsh]($xref)
|
334 | - 3.6
|
335 | - Unordered List
|
336 | - one
|
337 | - two
|
338 | - tr
|
339 | - [yash]($xref)
|
340 | - 1.0
|
341 | - Ordered List
|
342 | 1. one
|
343 | 1. two
|
344 | - tr
|
345 | - [ksh]($xref)
|
346 | - This is
|
347 | paragraph one.
|
348 |
|
349 | This is
|
350 | paragraph two
|
351 | - Another cell with ...
|
352 |
|
353 | ... multiple paragraphs.
|
354 |
|
355 | </table>
|
356 |
|
357 | ## Markdown Quirks to Be Aware Of
|
358 |
|
359 | Here are some quirks I ran into when creating ul-tables.
|
360 |
|
361 | (1) CommonMark doesn't allow empty list items:
|
362 |
|
363 | - thead
|
364 | -
|
365 | - above is not rendered as a list item
|
366 |
|
367 | You can work around this by using a comment, or invisible character:
|
368 |
|
369 | - tr
|
370 | - <!-- empty -->
|
371 | - above is OK
|
372 | - tr
|
373 | -
|
374 | - also OK
|
375 |
|
376 | - [Related CommonMark thread](https://talk.commonmark.org/t/clarify-following-empty-list-items-in-0-31-2/4599)
|
377 |
|
378 | A similar issue is that line breaks affect backtick expansion to `<code>`:
|
379 |
|
380 | - tr
|
381 | - <cell-attrs /> <!-- we need something on this line -->
|
382 | ... More `proc` features ...
|
383 |
|
384 | I think this is also because `<cell-attrs />` doesn't "count" as text, so the
|
385 | list item is considered empty.
|
386 |
|
387 | (2) Likewise, a cell with a literal hyphen may need a comment or space in front of it:
|
388 |
|
389 | - tr
|
390 | - <!-- hyphen --> -
|
391 | - -
|
392 |
|
393 | ## Comparisons
|
394 |
|
395 | ### CommonMark Doesn't Have Tables
|
396 |
|
397 | Related discussions:
|
398 |
|
399 | - 2014: [Tables in pure Markdown](https://talk.commonmark.org/t/tables-in-pure-markdown/81)
|
400 | - 2022: [Obvious Markdown syntax for Tables](https://talk.commonmark.org/t/obvious-markdown-syntax-for-tables/4143/9)
|
401 |
|
402 | ### Github Tables are Awkward
|
403 |
|
404 | Github-flavored Markdown has a non-standard extension for tables:
|
405 |
|
406 | - [Github: Organizing Information With Tables](https://docs.github.com/en/get-started/writing-on-github/working-with-advanced-formatting/organizing-information-with-tables)
|
407 |
|
408 | This style is hard to read and write, especially with large tables:
|
409 |
|
410 | ```
|
411 | | Command | Description |
|
412 | | --- | --- |
|
413 | | git status | List all new or modified files |
|
414 | | git diff | Show file differences that haven't been staged |
|
415 | ```
|
416 |
|
417 | Our style is less noisy, and more easily editable:
|
418 |
|
419 | ```
|
420 | <table>
|
421 |
|
422 | - thead
|
423 | - Command
|
424 | - Description
|
425 | - tr
|
426 | - git status
|
427 | - List all new or modified files
|
428 | - tr
|
429 | - git diff
|
430 | - Show file differences that haven't been staged
|
431 |
|
432 | </table>
|
433 | ```
|
434 |
|
435 | - Related wiki page: [Markdown Tables]($wiki)
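If you have existing pipe tables, a small script can do most of the conversion. The following Python is a rough sketch, not part of `ul-table`: `pipe_table_to_ul_table` is a hypothetical helper that assumes the first row is the header, does not handle escaped `\|` inside cells, and will emit empty list items for empty cells (see the quirks above).

```python
def pipe_table_to_ul_table(text: str) -> str:
    """Convert a simple pipe table into ul-table source (illustrative sketch)."""
    rows = []
    for line in text.strip().splitlines():
        # Drop the outer pipes, then split the line into trimmed cells
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        rows.append(cells)

    # Drop the delimiter row (cells like --- or :---:) that follows the header
    if len(rows) > 1 and all(c and set(c) <= set("-:") for c in rows[1]):
        del rows[1]

    out = ["<table>", ""]
    for i, cells in enumerate(rows):
        out.append("- thead" if i == 0 else "- tr")
        out.extend("  - " + cell for cell in cells)
    out += ["", "</table>"]
    return "\n".join(out)


if __name__ == "__main__":
    src = """
| Command | Description |
| --- | --- |
| git status | List all new or modified files |
"""
    print(pipe_table_to_ul_table(src))
```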
|
436 |
|
437 |
|
438 | ## Conclusion
|
439 |
|
440 | `ul-table` is a nice way of writing and maintaining HTML tables. The appendix
|
441 | has links and details.
|
442 |
|
443 | ### Related Docs
|
444 |
|
445 | - [How We Build Oils Documentation](doc-toolchain.html)
|
446 | - [Examples of HTML Plugins](doc-plugins.html)
|
447 |
|
448 | ## Appendix: Implementation
|
449 |
|
450 | - [doctools/ul_table.py]($oils-src) - about 500 lines
|
451 | - [lazylex/html.py]($oils-src) - about 500 lines
|
452 |
|
453 | ### Algorithm Notes
|
454 |
|
455 | - lazy lexing
|
456 | - recursive descent parser
|
457 | - TODO: show grammar
|
458 |
|
459 | TODO: I would like someone to produce a **DOM**-based implementation!
|
460 |
|
461 | Our implementation is pretty low-level. It's meant to avoid the "big load
|
462 | anti-pattern" (allocating too much), so it's necessarily more verbose.
|
463 |
|
464 | A DOM-based implementation should be much less than 1000 lines.
|
465 |
|
466 | ## Appendix: Real Examples
|
467 |
|
468 | - [Guide to Procs and Funcs]($oils-doc:proc-func.html) has a big `ul-table`.
|
469 | - Source: [doc/proc-func.md]($oils-src)
|
470 |
|
471 | I converted the tables in these September posts to `ul-table`:
|
472 |
|
473 | - [What Oils Looks Like in 2024](https://www.oilshell.org/blog/2024/09/project-overview.html)
|
474 | - [After 8 Years, Oils Is Still Small and Flexible](https://www.oilshell.org/blog/2024/09/line-counts.html)
|
475 | - [Garbage Collection Makes YSH Different](https://www.oilshell.org/blog/2024/09/gc.html)
|
476 | - [A Retrospective on the Oils Project](https://www.oilshell.org/blog/2024/09/retrospective.html)
|
477 |
|
478 | The markup was much shorter and simpler after conversion!
|
479 |
|
480 | TODO:
|
481 |
|
482 | - More tables to make
|
483 | - Interior/Exterior
|
484 | - Narrow Waist
|
485 | - Wiki pages could use conversion
|
486 | - [Alternative Shells]($wiki)
|
487 | - [Alternative Regex Syntax]($wiki)
|
488 | - [Survey of Config Languages]($wiki)
|
489 | - [Polyglot Language Understanding]($wiki)
|
490 | - [The Biggest Shell Programs in the World]($wiki)
|
491 |
|
492 | ## HTML Quirks
|
493 |
|
494 | - `<th>` is like `<td>`, but it belongs in `<thead><tr>`. Browsers make it
|
495 | bold and centered.
|
496 | - You can't align columns left or right by putting `class=` on `<colgroup>` or
|
497 | `<col>`.
|
498 | - You have to put `class=` on *every* `<td>` cell instead.
|
499 | - `ul-table` solves this with "inherited" `<cell-attrs />` in the `thead`
|
500 | section.
|
501 |
|
502 | <!--
|
503 |
|
504 | ### FAQ
|
505 |
|
506 | (1) Why do rows with attributes look like `tr <row-attrs />`? The first `tr`
|
507 | doesn't seem necessary.
|
508 |
|
509 | This is because of the CommonMark quirk above: a list item without **text** is
|
510 | treated as **empty**. So we require the extra `tr` text.
|
511 |
|
512 | It's also consistent with plain rows, without attributes.
|
513 |
|
514 | -->
|
515 |
|
516 | ## Ideas for Features
|
517 |
|
518 | We could help users edit well-formed tables with enforced column names:
|
519 |
|
520 | - thead
|
521 | - <cell-attrs ult-name=name /> Name
|
522 | - <cell-attrs ult-name=age /> Age
|
523 | - tr
|
524 | - <cell-attrs ult-name=name /> Hi
|
525 | - <cell-attrs ult-name=age /> 5
|
526 |
|
527 | This is a bit verbose, but may be worth it for large tables.
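To give a feel for what the enforcement could look like, here is a hypothetical Python sketch. Nothing like this exists in `ul-table` today. It assumes the `ult-name` values have already been extracted into lists, one for the `thead` and one per `tr`, and `check_column_names` is a made-up name.

```python
def check_column_names(header: list, rows: list) -> list:
    """Return error messages for rows whose cell names don't match the header."""
    errors = []
    for row_num, row in enumerate(rows, start=1):
        if len(row) != len(header):
            errors.append(
                f"row {row_num}: expected {len(header)} cells, got {len(row)}"
            )
            continue
        for col_num, (want, got) in enumerate(zip(header, row), start=1):
            if want != got:
                errors.append(
                    f"row {row_num}, cell {col_num}: expected {want!r}, got {got!r}"
                )
    return errors


if __name__ == "__main__":
    header = ["name", "age"]
    rows = [["name", "age"], ["age", "name"], ["name"]]
    for message in check_column_names(header, rows):
        print(message)
```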
|
528 |
|
529 | Less verbose syntax idea:
|
530 |
|
531 | - thead
|
532 | - <ult col=NAME /> <cell-attrs class=foo /> Name
|
533 | - <ult col=AGE /> Age
|
534 | - tr
|
535 | - <ult col=NAME /> Hi
|
536 | - <ult col=AGE /> 5
|
537 |
|
538 | Even less verbose:
|
539 |
|
540 | - thead
|
541 | - {NAME} Name
|
542 | - {AGE} Age
|
543 | - tr
|
544 | - {NAME} Hi
|
545 | - {AGE} 5
|
546 |
|
547 | The obvious problem is that we might want the literal text `{NAME}` in the
|
548 | header. It's unlikely, but possible.
|
549 |
|