Doctools
========

Tools we use to generate the [Oils documentation](../doc/). Some of this code
is used to build [the blog](//www.oilshell.org/blog/) as well.

See [doc/doc-toolchain.md](../doc/doc-toolchain.md) for details.

Tools shared with the blog:

- `cmark.py`: Our wrapper around CommonMark.
- `spelling.py`: Spell checker.
- `split_doc.py`: Split "front matter" from Markdown.

More tools:

- `html_head.py`: Common HTML fragments.
- `oil_doc.py`: HTML filters.
- `help_gen.py`: For `doc/ref/index-{osh,ysh}.md`.

## Micro Syntax

- `src_tree.py` is a fast and minimal source viewer.
- It uses polyglot syntax analysis called "micro syntax". See
  [micro-syntax.md](micro-syntax.md).

## TODO

Immediate:

- iPhone CSS
  - font sizes are wrong when the line wraps
  - need to debug this directly

- Experiment with shared library -- narrow waist that eliminates process overhead
  - main(argv)
  - stdin / stdout are buffers?
    - for the coprocess protocol, I wanted file descriptors. But this is a
      portable interface
  - returns status code
  - what about errors?
    - I think they can go to stderr to start

- Ninja makes it faster
- CLI syntax
  - modes: wc and cat
  - `--col` and `--omit-col`
  - `mtax cat --no-comments --no-strs` or `--empty-strs`
    - replace with spaces
    - reminds me of "garfield minus garfield"

- output: 4 columns
  - which I guess are selectable
  - should tokens be binary data though?

- Detect UTF-8
  - lexing doesn't work without UTF-8

- Analyze TSV for function names in Python parser combinators
  - add () {} -- this is all you need really
  - oh and : for Python

- C++ multi-line
  - comes up in `_gen/bin/text_files.cc`
  - this is an architecture issue, will allow Rust/Lua as well

- Max color mode for debugging?
  - detail: blank lines in re2c blocks shouldn't be significant
    - I guess you detect whitespace in re2c blocks then

- SLOC
  - add to index.html, with attrs
  - light grey monospace?
  - Subsumes these tools:
    - <https://github.com/AlDanial/cloc> - this is a 17K line Perl script!
    - <https://dwheeler.com/sloccount/> - no release since 2004 ?

- Maybe add language for `*.test.sh`
  - the `####` and `##` lines are special
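
A rough sketch of the "replace with spaces" idea from the CLI syntax item, in Python. This is not how `micro_syntax` is implemented. The regex lexer and the names `COMMENT_RE`, `blank_spans`, and `no_comments` are made up for illustration, and the lexer wrongly treats a `#` inside a string literal as a comment. The point is only that blanking spans, rather than deleting them, keeps every column and line number stable.

```python
import re

# Hypothetical, deliberately naive lexer: any '#' through end of line is
# treated as a comment, even inside a string literal.
COMMENT_RE = re.compile(r'#[^\n]*')


def blank_spans(text, spans):
    """Replace each [start, end) span with spaces, keeping newlines."""
    chars = list(text)
    for start, end in spans:
        for i in range(start, end):
            if chars[i] != '\n':
                chars[i] = ' '
    return ''.join(chars)


def no_comments(text):
    """Blank out comments so that columns and line counts are preserved."""
    spans = [m.span() for m in COMMENT_RE.finditer(text)]
    return blank_spans(text, spans)


src = 'x = 1  # one\ny = 2\n'
out = no_comments(src)
assert len(out) == len(src)                # same width, byte for byte
assert out.count('\n') == src.count('\n')  # same number of lines
```

A `--no-strs` mode could reuse `blank_spans` with string spans instead of comment spans.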

src-tree:

- should README.md be inserted in index.html ?
  - probably, sourcehut has this too
  - use cmark
  - also use our TOC plugin
- line counts in metrics/source-code.sh could link to src-tree
  - combine the CI jobs
  - should use `micro_syntax --wc` to count SLOC
  - Also `micro_syntax --print --format ansi`
    - this is the default?
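
For reference, a minimal Python sketch of the kind of count a `--wc` mode might report. The definition of "significant" here is an assumption, namely a line that is neither blank nor comment-only. The function name `count_lines` and the `comment_prefix` parameter are hypothetical, and the real tool classifies lines from its own tokens rather than a prefix check.

```python
def count_lines(text, comment_prefix='#'):
    """Return (num_lines, num_sig_lines) for a source file's text.

    Assumption: a line is significant if it is not blank and does not
    consist solely of a comment starting with comment_prefix.
    """
    num_lines = 0
    num_sig_lines = 0
    for line in text.splitlines():
        num_lines += 1
        stripped = line.strip()
        if stripped and not stripped.startswith(comment_prefix):
            num_sig_lines += 1
    return num_lines, num_sig_lines


example = '# header\n\nx = 1\n  # note\ny = 2\n'
num_lines, num_sig_lines = count_lines(example)
print('%d lines, %d significant' % (num_lines, num_sig_lines))
```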

Later:

- Recast as TSV-Netstring, which is different than TSV8
  - select one of these cols:
    - Path, HTML, `num_lines`, `num_sig_lines`, and I guess ANSI?
    - line/tokens binary? You have line number and so forth
  - 3:foo\t 5:foo\n # last cell has to end with newline ?
    - preserves wc -l when data has no newlines?
    - or is this too easily confused with TSV8 itself? You don't want it to be valid TSV8?
  - it can start out as text
    - !tsv8 Str
    - !type
    - !nets-row ?
  - another thing you can do is have a prefix of cells
    - netstring is 3:foo,
    - prefix is 5; 3:foo,
    - that's a command to read 5 net strings I guess?

- Parsing, jump to definition
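
To make the netstring notes above concrete, here is a small Python sketch. The `<length>:<bytes>,` framing is standard netstring. The row framing, a cell count followed by `; `, is only one possible reading of the "prefix of cells" note, and `encode_row` / `decode_row` are hypothetical names, not an existing format or API.

```python
def encode_netstring(data: bytes) -> bytes:
    """Frame data as a netstring: b'foo' -> b'3:foo,'."""
    return b'%d:%s,' % (len(data), data)


def decode_netstring(buf: bytes, pos: int = 0):
    """Return (payload, next_pos) for the netstring starting at buf[pos]."""
    colon = buf.index(b':', pos)
    n = int(buf[pos:colon])
    start = colon + 1
    end = start + n
    if buf[end:end + 1] != b',':
        raise ValueError('netstring at %d is missing its trailing comma' % pos)
    return buf[start:end], end + 1


def encode_row(cells):
    # Hypothetical row framing: "<count>; " followed by that many netstrings.
    return b'%d; ' % len(cells) + b''.join(encode_netstring(c) for c in cells)


def decode_row(buf: bytes):
    semi = buf.index(b';')
    count = int(buf[:semi])
    pos = semi + 2  # skip the "; " separator
    cells = []
    for _ in range(count):
        cell, pos = decode_netstring(buf, pos)
        cells.append(cell)
    return cells


# Cells may contain tabs and newlines, since the length prefix delimits them.
row = encode_row([b'foo', b'a\tb\n'])
assert decode_row(row) == [b'foo', b'a\tb\n']
```

Because each cell carries its own length, the payload never needs escaping, which is the main difference from a plain TSV cell.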