ul-table: Markdown Tables Without New Syntax
================================

`ul-table` is an HTML processor that lets you write **tables** as bulleted
**lists**, in Markdown:

<!-- TODO: Add hyperlinks here, or maybe add markdown-->

```
<table>

- thead
  - Shell
  - Version
- tr
  - bash
  - 5.2
- tr
  - OSH
  - 0.25.0

</table>
```

<table>

- thead
  - Shell
  - Version
- tr
  - bash
  - 5.2
- tr
  - OSH
  - 0.25.0

</table>

I designed this format because it's tedious to read, write, and edit `<tr>` and
`<td>` and `</td>` and `</tr>`. Aligning columns is also tedious in HTML.

`ul-table` does **not** involve new Markdown syntax, only a new interpretation.

This means your docs are still readable without it, e.g. on sourcehut or
GitHub. It degrades gracefully.

---

Other design goals:

- It should scale to large, complex tables.
- It should expose the **full** power of HTML, unlike other solutions.

<!--

- Solve the "column alignment problem" concisely.
  - HTML punts on this, as explained below.
-->

<div id="toc">
</div>

## Simple Example

Let's add hyperlinks to our example, to make it more realistic.

<style>
table {
  margin: 0 auto;
}
td {
  padding: 0.2em;
}
</style>

<table>

- thead
  - Shell
  - Version
- tr
  - [bash](https://www.gnu.org/software/bash/)
  - 5.2
- tr
  - [OSH](https://www.oilshell.org/)
  - 0.25.0

</table>

### `ul-table` Syntax

You can make this table with a **two-level Markdown list**, with Markdown
hyperlink syntax:

    <table>  <!-- don't forget this tag -->

    - thead
      - Shell
      - Version
    - tr
      - [bash](https://www.gnu.org/software/bash/)
      - 5.2
    - tr
      - [OSH](https://www.oilshell.org/)
      - 0.25.0

    </table>

(This format looks similar to [tables in
reStructuredText](https://sublime-and-sphinx-guide.readthedocs.io/en/latest/tables.html)).

It takes two steps to convert:

1. Any Markdown translator will produce a
   `<table> <ul> <li> ... </li> </ul> </table>` structure.
1. **Our** `ul-table` plugin transforms that into a
   `<table> <tr> <td> </td> </tr> </table>` structure, which is a normal HTML
   table.

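To make step 2 concrete, here's a minimal sketch in Python. It is **not** the
real plugin: the name `ul_to_table` is made up for illustration, it assumes the
Markdown output is well-formed enough to parse as XML, and it builds a tree
with the standard library's `xml.etree.ElementTree` instead of lexing lazily.
It only understands `thead` and `tr` rows, and it ignores attributes.

```python
import xml.etree.ElementTree as ET


def ul_to_table(html):
    """Rewrite <table><ul>...</ul></table> into <thead>/<tr>/<td> rows.

    Hypothetical helper for illustration; assumes well-formed XML input.
    """
    table = ET.fromstring(html)
    outer = table.find('ul')
    if outer is None:
        return html  # not a ul-table, so leave it alone

    table.remove(outer)
    table.text = None

    for row_li in outer.findall('li'):
        kind = (row_li.text or '').strip()  # 'thead' or 'tr'
        tr = ET.Element('tr')

        inner = row_li.find('ul')
        cells = inner.findall('li') if inner is not None else []
        for cell_li in cells:
            cell_li.tag = 'td'  # keep the cell's children, e.g. <a> links
            cell_li.tail = None
            tr.append(cell_li)

        if kind == 'thead':
            ET.SubElement(table, 'thead').append(tr)
        elif kind == 'tr':
            table.append(tr)
        else:
            raise ValueError('expected thead or tr, got %r' % kind)

    return ET.tostring(table, encoding='unicode')


src = (
    '<table><ul>'
    '<li>thead<ul><li>Shell</li><li>Version</li></ul></li>'
    '<li>tr<ul>'
    '<li><a href="https://www.oilshell.org/">OSH</a></li><li>0.25.0</li>'
    '</ul></li>'
    '</ul></table>'
)
print(ul_to_table(src))
```

Real Markdown output usually isn't valid XML, which is one reason a production
version needs a more forgiving HTML lexer than this sketch uses.
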
### Comparison: Markdown Uses Tedious Inline HTML

Here's the equivalent in CommonMark:

    <table>
      <thead>
        <tr>
          <td>Shell</td>
          <td>Version</td>
        </tr>
      </thead>
      <tr>
        <td>

    <!-- be careful not to indent this 4 spaces! -->
    [bash](https://www.gnu.org/software/bash/)

        </td>
        <td>5.2</td>
      </tr>
      <tr>
        <td>

    [OSH](https://www.oilshell.org/)

        </td>
        <td>0.25.0</td>
      </tr>

    </table>

This relies on the rule that lets you embed Markdown inside HTML inside Markdown.
With `ul-table`, you **don't** need this mutual nesting.

The text you have to write is also a lot shorter!

---

Trivia: with CommonMark, you also get an extra `<p>` element:

    <td>
      <p>OSH</p>
    </td>

`ul-table` can produce simpler HTML:

    <td>
      OSH
    </td>

### Stylesheet

To make the table look nice, I add a `<style>` tag, inside Markdown:

    <style>
    table {
      margin: 0 auto;
    }
    td {
      padding: 0.2em;
    }
    </style>

### The Untranslated HTML

If you omit the `<table>` tags, then the rendered HTML looks like this:

- thead
  - Shell
  - Version
- tr
  - [bash]($xref)
  - 5.2
- tr
  - [OSH]($xref)
  - 0.25.0

This is how your tables will appear on sourcehut or GitHub, which don't (yet)
have `ul-table` support. Remember, `ul-table` is **not** an extension to
Markdown syntax.

## Adding HTML Attributes

HTML attributes like `<tr class=foo>` and `<td id=bar>` let you format and
style your table.

You can add attributes to cells, columns, and rows.

### Cells

Add cell attributes with a `cell-attrs` tag **before** the cell contents:

    - thead
      - Name
      - Age
    - tr
      - Alice
      - <cell-attrs class=num /> 42

It's important that `cell-attrs` is a **self-closing** tag:

    <cell-attrs />  # Yes
    <cell-attrs>    # No: this is an opening tag

How does this work? `ul-table` takes the attributes from `<cell-attrs />`, and
puts them on the generated `<td>`.

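As a rough illustration of that last step, here's a small Python sketch. The
names `CELL_ATTRS` and `render_cell` are made up for this example, and the
regex is a simplification (it assumes attribute values never contain `>`); the
real plugin uses its own HTML lexer.

```python
import re

# Matches a leading self-closing <cell-attrs ... /> tag and captures its
# attributes.  An opening tag like <cell-attrs> deliberately does NOT match.
CELL_ATTRS = re.compile(r'^\s*<cell-attrs\b([^>]*?)/>\s*')


def render_cell(inner_html):
    """Return a <td> for one cell, moving <cell-attrs /> attributes onto it."""
    m = CELL_ATTRS.match(inner_html)
    if m is None:
        return '<td>%s</td>' % inner_html

    attrs = m.group(1).strip()
    rest = inner_html[m.end():]  # cell contents after the tag
    open_tag = '<td %s>' % attrs if attrs else '<td>'
    return open_tag + rest + '</td>'


print(render_cell('Alice'))                        # <td>Alice</td>
print(render_cell('<cell-attrs class=num /> 42'))  # <td class=num>42</td>
```
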
### Columns

Add attributes to **every cell in a column** in the same way, except in the
`thead` section:

    - thead
      - Name
      - <cell-attrs class=num /> Age
    - tr
      - Alice
      - 42     # this cell gets class=num
    - tr
      - Bob
      - 25     # this cell gets class=num

This is particularly useful for aligning numbers to the right:

    <style>
    .num {
      text-align: right;
    }
    </style>

If the same attribute appears in both the `thead` and a `tr` section, the
values are **concatenated**, with a space. Example:

    <td class="from-thead from-tr">

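The concatenation rule can be sketched in a few lines of Python. This is only
an illustration of the behavior described above: `merge_attrs` and `format_td`
are hypothetical names, and attributes are modeled as plain dicts rather than
whatever the plugin uses internally.

```python
from html import escape


def merge_attrs(thead_attrs, tr_attrs):
    """Combine column attributes (from thead) with cell attributes (from tr).

    When both define the same attribute, join the values with a space,
    thead value first.
    """
    merged = dict(thead_attrs)
    for name, value in tr_attrs.items():
        if name in merged:
            merged[name] = merged[name] + ' ' + value
        else:
            merged[name] = value
    return merged


def format_td(attrs):
    """Render an opening <td> tag from a dict of attributes."""
    parts = ['%s="%s"' % (k, escape(v, quote=True)) for k, v in attrs.items()]
    return '<td %s>' % ' '.join(parts) if parts else '<td>'


column = {'class': 'from-thead'}
cell = {'class': 'from-tr', 'id': 'bar'}
print(format_td(merge_attrs(column, cell)))
# <td class="from-thead from-tr" id="bar">
```
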
### Rows

Add row attributes like this:

    - thead
      - Name
      - Age
    - tr
      - Alice
      - 42
    - tr <row-attrs class="special-row" />
      - Bob
      - 25

## Example: Markdown and HTML Inside Cells

Here's an example that uses more features. Source code of this table:
[doc/ul-table.md]($oils-src).

[bash]: $xref

<table id="foo">

- thead
  - Shell
  - Version
  - Example Code
- tr
  - [bash][]
  - 5.2
  - ```
    echo sh=$bash
    ls /tmp | wc -l
    echo
    ```
- tr
  - [dash]($xref)
  - 1.5
  - <em>Inline HTML</em>
- tr
  - [mksh]($xref)
  - 4.0
  - <table>
      <tr>
        <td>HTML table</td>
        <td>inside</td>
      </tr>
      <tr>
        <td>this table</td>
        <td>no way to re-enter inline markdown though?</td>
      </tr>
    </table>
- tr
  - [zsh]($xref)
  - 3.6
  - Unordered List
    - one
    - two
- tr
  - [yash]($xref)
  - 1.0
  - Ordered List
    1. one
    1. two
- tr
  - [ksh]($xref)
  - This is
    paragraph one.

    This is
    paragraph two
  - Another cell with ...

    ... multiple paragraphs.

</table>

## Markdown Quirks to Be Aware Of

Here are some quirks I ran into when creating ul-tables.

(1) CommonMark doesn't allow empty list items:

    - thead
      -
      - above is not rendered as a list item

You can work around this by using a comment, or invisible character:

    - tr
      - <!-- empty -->
      - above is OK
    - tr
      - &nbsp;
      - also OK

- [Related CommonMark thread](https://talk.commonmark.org/t/clarify-following-empty-list-items-in-0-31-2/4599)

A similar issue is that line breaks affect backtick expansion to `<code>`:

    - tr
      - <cell-attrs /> <!-- we need something on this line -->
        ... More `proc` features ...

I think this is also because `<cell-attrs />` doesn't "count" as text, so the
list item is considered empty.

(2) Likewise, a cell with a literal hyphen may need a comment or space in front of it:

    - tr
      - <!-- hyphen --> -
      - &nbsp; -

## Comparisons

### CommonMark Doesn't Have Tables

Related discussions:

- 2014: [Tables in pure Markdown](https://talk.commonmark.org/t/tables-in-pure-markdown/81)
- 2022: [Obvious Markdown syntax for Tables](https://talk.commonmark.org/t/obvious-markdown-syntax-for-tables/4143/9)

### GitHub Tables Are Awkward

GitHub-flavored Markdown has a non-standard extension for tables:

- [GitHub: Organizing Information With Tables](https://docs.github.com/en/get-started/writing-on-github/working-with-advanced-formatting/organizing-information-with-tables)

This style is hard to read and write, especially with large tables:

```
| Command | Description |
| --- | --- |
| git status | List all new or modified files |
| git diff | Show file differences that haven't been staged |
```

Our style is less noisy, and more easily editable:

```
- thead
  - Command
  - Description
- tr
  - git status
  - List all new or modified files
- tr
  - git diff
  - Show file differences that haven't been staged
```

- Related wiki page: [Markdown Tables]($wiki)


## Conclusion

`ul-table` is a nice way of writing and maintaining HTML tables. The appendix
has links and details.

### Related Docs

- [How We Build Oils Documentation](doc-toolchain.html)
- [Examples of HTML Plugins](doc-plugins.html)

## Appendix: Implementation

- [doctools/ul_table.py]($oils-src) - about 500 lines
- [lazylex/html.py]($oils-src) - about 500 lines

TODO:

- Move unit tests to spec tests - `doctools/ul-table-test.ysh`
  - can run under Python 2 vs. Python 3
  - can run with cmark vs. `markdown.pl`
  - De-couple it from cmark.so
- Publish the code separately?
  - `lazylex/html.py` can go in `doctools/`
  - or maybe it's ready to go in `data_lang/htm8.py`, if we can fix the lexer
- Could revive pyannotate automation to help with the typing

### Algorithm Notes

- lazy lexing
- recursive descent parser
  - TODO: show grammar

TODO:

- I would like someone to produce a **DOM** implementation!
  - Our implementation is meant to avoid the "big load anti-pattern"
    (allocating too much), so it's necessarily more verbose. A DOM
    implementation should be much less than 1000 lines.


## Appendix: Real Examples

- [Guide to Procs and Funcs]($oils-doc:proc-func.html) has a big `ul-table`.
  - Source: [doc/proc-func.md]($oils-src)

I converted the tables in these September posts to `ul-table`:

- [What Oils Looks Like in 2024](https://www.oilshell.org/blog/2024/09/project-overview.html)
- [After 8 Years, Oils Is Still Small and Flexible](https://www.oilshell.org/blog/2024/09/line-counts.html)
- [Garbage Collection Makes YSH Different](https://www.oilshell.org/blog/2024/09/gc.html)
- [A Retrospective on the Oils Project](https://www.oilshell.org/blog/2024/09/retrospective.html)

The markup was much shorter and simpler after conversion!

TODO:

- Tables to Make
  - Interior/Exterior
  - Narrow Waist

- Wiki pages could use conversion
  - [Alternative Shells]($wiki)
  - [Alternative Regex Syntax]($wiki)
  - [Survey of Config Languages]($wiki)
  - [Polyglot Language Understanding]($wiki)
  - [The Biggest Shell Programs in the World]($wiki)


## HTML Quirks

- `<th>` is like `<td>`, but it belongs in `<thead><tr>`. Browsers make it
  bold and centered.
- You can't put `class=` on `<colgroup>` or `<col>` to align columns left and
  right.
  - You have to put `class=` on *every* `<td>` cell instead.
  - `ul-table` solves this with "inherited" `<cell-attrs />` in the `thead`
    section.

<!--

### FAQ

(1) Why do rows with attributes look like `tr <row-attrs />`? The first `tr`
doesn't seem necessary.

This is because of the CommonMark quirk above: a list item without **text** is
treated as **empty**. So we require the extra `tr` text.

It's also consistent with plain rows, without attributes.

-->

## Feature Ideas

We could help users edit well-formed tables with enforced column names:

    - thead
      - <cell-attrs ult-name=name /> Name
      - <cell-attrs ult-name=age /> Age
    - tr
      - <cell-attrs ult-name=name /> Hi
      - <cell-attrs ult-name=age /> 5

This is a bit verbose, but may be worth it for large tables.

Less verbose syntax idea:

    - thead
      - <ult col=NAME /> <cell-attrs class=foo /> Name
      - <ult col=AGE /> Age
    - tr
      - <ult col=NAME /> Hi
      - <ult col=AGE /> 5

Even less verbose:

    - thead
      - {NAME} Name
      - {AGE} Age
    - tr
      - {NAME} Hi
      - {AGE} 5

The obvious problem is that we might want the literal text `{NAME}` in the
header. It's unlikely, but possible.

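To show what "enforced" could mean for the `{NAME}` idea, here's a speculative
Python sketch. Nothing like this exists in `ul-table` today: `split_label` and
`check_row` are hypothetical names, labels are assumed to be uppercase letters
and underscores, and the sketch does nothing about the literal `{NAME}` problem
mentioned above.

```python
import re

# Assumption: a label is {UPPERCASE_LETTERS} at the very start of a cell.
LABEL = re.compile(r'^\{([A-Z_]+)\}\s*')


def split_label(cell):
    """Split '{NAME} Hi' into ('NAME', 'Hi'); return (None, cell) if unlabeled."""
    m = LABEL.match(cell)
    if m is None:
        return None, cell
    return m.group(1), cell[m.end():]


def check_row(header_cells, row_cells):
    """Return a list of error strings for a row that doesn't match the header."""
    errors = []
    if len(header_cells) != len(row_cells):
        errors.append(
            'expected %d cells, got %d' % (len(header_cells), len(row_cells)))
    for i, (h, c) in enumerate(zip(header_cells, row_cells)):
        want, _ = split_label(h)
        got, _ = split_label(c)
        if want != got:
            errors.append('column %d: expected %s, got %s' % (i, want, got))
    return errors


header = ['{NAME} Name', '{AGE} Age']
print(check_row(header, ['{NAME} Hi', '{AGE} 5']))  # []
print(check_row(header, ['{AGE} 5', '{NAME} Hi']))  # two errors: cells swapped
```

A check like this would catch cells that drift into the wrong column when a
large table is edited by hand.
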