I sat in on an unsession about tree-sitter, which Atom uses for syntax highlighting, and I think I really like this idea of external Kakoune highlighter programs.
For Clojure, for example, an external highlighter could track enough state to know, based on the namespace declaration, whether a word comes from clojure.core/ or is locally overridden. It could also take multiple passes to annotate the tree before highlighting, which is sometimes required in Lisp.
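To make the multi-pass idea concrete, here is a minimal sketch in Python. It is not how tree-sitter or any existing Kakoune tool works: the names `CORE_NAMES`, `collect_overrides` and `classify` are made up for illustration, the tokenizer ignores strings and comments, and only `def`, `defn` and `:refer-clojure :exclude [...]` are treated as overrides. The first pass gathers overridden names, the second labels each symbol.

```python
import re

# Assumption: a tiny illustrative subset of clojure.core, not the real list.
CORE_NAMES = {"map", "filter", "reduce", "first", "rest", "str"}

# Parens and brackets are single tokens; anything else up to whitespace is an atom.
# Strings and comments are deliberately not handled in this sketch.
TOKEN = re.compile(r"[()\[\]]|[^\s()\[\]]+")


def tokenize(source):
    """Return (offset, text) pairs for every token in the source."""
    return [(m.start(), m.group()) for m in TOKEN.finditer(source)]


def collect_overrides(tokens):
    """Pass 1: names redefined with def/defn or listed after :exclude [...]."""
    texts = [text for _, text in tokens]
    overrides = set()
    for i, text in enumerate(texts):
        if text in ("def", "defn") and i + 1 < len(texts):
            overrides.add(texts[i + 1])
        elif text == ":exclude" and i + 1 < len(texts) and texts[i + 1] == "[":
            j = i + 2
            while j < len(texts) and texts[j] != "]":
                overrides.add(texts[j])
                j += 1
    return overrides


def classify(source):
    """Pass 2: return (start, end, label) for each symbol named like a core function."""
    tokens = tokenize(source)
    overrides = collect_overrides(tokens)
    spans = []
    for offset, text in tokens:
        if text in CORE_NAMES:
            label = "local" if text in overrides else "core"
            spans.append((offset, offset + len(text), label))
    return spans
```

With a namespace that excludes `map` and redefines it, every `map` comes back labelled `local` while `filter` stays `core`, which is the distinction a single regex pass cannot make.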
Tree-sitter has been discussed a bit on the IRC channel. Here are a few links about tree-sitter for people discovering this project:
I’ve thought of it too, but I suppose this involves doing the highlighting asynchronously in a separate thread (which is not gonna happen, I suppose), and keeping the results in sync may be difficult. I hope this feature becomes available though, because semantic-aware highlighting is nice, but good old regex-based highlighting should not go away either.
After years of working with them, I am not sure I agree anymore. They are misleading unacceptably often on non-trivial code and take an insane effort to really get right around the edge cases. Maybe it is time to let them die for general code highlighting and just keep them in users’ hands for ad hoc stuff.
The only benefit is that a regexp highlighter depends only on the regexp engine within the editor, while not every external highlighter is available for every platform. It’s handy for quickly adding basic support for new or arcane languages. I also like to extend highlighting to my liking.
I want external highlighters too, but my point is that we can’t just throw away a working solution because better ones exist. It’s not just about highlighting; it’s everywhere. Look at bash, for example: we can’t just throw it away, even though it is a really bad shell. It is even still compatible with POSIX sh, and POSIX is a dumb, incomplete standard that is still in use.
In terms of externalizing highlighters, if Kakoune had a good interface for external highlighting, then we could also build and ship
kak-regex, which uses this interface with regular expression specifications. This would not be “throwing away” regular expressions, but relying more heavily on POSIX. Of course, we could have other programs use other highlighting specs, such as tree-sitter, Sublime’s syntax definitions, or TextMate’s grammar files.
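As a rough illustration of what such a regex-driven external program might look like, here is a small Python sketch. Everything in it is an assumption: `RULES`, the face names and the `highlight` function are hypothetical, and the `line.column,line.column|face` output is only modelled on the shape of Kakoune's range-specs, with columns counted in characters rather than bytes.

```python
import re

# Hypothetical rule table: (compiled pattern, face name). Face names are illustrative.
RULES = [
    (re.compile(r"#.*"), "comment"),
    (re.compile(r"\b(?:if|else|while|return)\b"), "keyword"),
    (re.compile(r"\b\d+\b"), "value"),
]


def highlight(text):
    """Return 'line.col,line.col|face' strings, 1-based with inclusive end columns.

    Rules are applied independently, so spans from different rules may overlap
    (for example a keyword inside a comment); resolving that is left out here.
    """
    specs = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for pattern, face in RULES:
            for m in pattern.finditer(line):
                if m.end() > m.start():  # skip empty matches
                    specs.append(
                        f"{lineno}.{m.start() + 1},{lineno}.{m.end()}|{face}"
                    )
    return specs
```

A wrapper would read the buffer on stdin, call `highlight`, and print the specs for the editor to consume. A tree-sitter or TextMate-grammar program could emit the same kind of output from a different spec.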
(And we probably already have a good interface for this, since
<a-|> no longer deadlocks if the command we send to ends in