Kakscript for the first time is hard, documenting my journey

Hey,

I have been using Kakoune for a few years already. But I recently realized that I had not written much kakscript. I just adapted a few configurations that I found, and I kept my kakrc very minimalist, trying to learn what the default experience could bring me. And this has been very valuable.

However, recently, I decided that I would like to take inspiration from the Plan9 Acme editor (which I really like) and bring some of its ideas into Kakoune. Maybe not everything, but small bits that I could incorporate and that would turn my experience with Kakoune into a more personal story. But before doing that, I had to overcome something that I did not think would be much of a challenge: kakscript.

Effectively, even though I have dug into many parts of Kakoune extensively, going so far as to read a lot of its internal source code, surprisingly, I never took much time to write kakscript. I mostly read what was inside the internal rc (which is super valuable already). I am rather familiar with POSIX shell (if you are not, I recommend reading POSIX Shell Tutorial). But trying to write kakscript was a different story. I think this is a hard task for someone who is doing it for the first time.

This is hard mostly because there is no material to learn it from. You just read the docs, pick some bits from rc and start adapting and writing yours. And you start hitting some walls. And I realized those walls were mostly related to a lack of understanding of how kakscript is parsed and evaluated. So a few weeks ago, I decided to pause the implementation of my kak plugin and start writing a blog post that would explain what I understood from learning the parsing and evaluation of kakscript.

I have been helped a lot by @alexherbo2 , so thank you a lot for the time you spent with me.

And now I would like to share my post here. I am aware that it may say things that are wrong, and I would love to have feedback so that I can improve it. I haven’t seen any guide like that, so I really hope that it can help beginners to get up to speed rapidly. When those concepts are understood, kakscript becomes fun to write!

Have a good read and thank you in advance for any feedback that you can give me, including corrections, reorganisations or just your feelings about whether such a guide can be useful or not.

Here is the link below. Note that I don’t have a personal blog. I don’t write that much and I like the convenience of gist for my writing. The style might not be super pretty, but it is effective…

Thank you!

This was an excellent read and really helped me wrap my head around some aspects of kak scripting I’ve been a bit overwhelmed trying to understand! Thank you for putting the time into writing this. The quiz examples in particular really helped to clarify everything you’d discussed, especially the way the registers quizzes built on each other!

In my opinion, this would be an excellent addition to the kakoune wiki!

Thank you, so timely! Just yesterday I resolved to have another more serious attempt at configuring Kakoune, after letting it slide for a while precisely because of the lack of clear documentation. I’ll let you know how I fare with your article.

This is really well-written and comprehensive! Good work! I do have some notes, however.

Under “How Kakoune loads configuration”, the text kind of implies without stating outright that Kakoune will load the standard library then load user configuration. This is generally true, but a bit misleading. If you look at the standard kakrc file the process runs like this:

  • Load everything from the user’s autoload directory
  • If and only if that doesn’t exist, load everything from the system autoload directory
  • Load the per-system kakrc.local file
  • Load the user’s kakrc

One of the most common questions we get from new users is “I installed a plugin and now my syntax highlighting stopped working”, because they created the autoload directory to put the plugin into, and that disabled loading of the standard library. I think it’s worth calling out this quirk for prospective new users.
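
The autoload rule in the list above can be modelled in a few lines of shell. This is only a sketch of the decision as described, not Kakoune's actual startup code (which is written in kakscript); the helper name pick_autoload and the directory layout are hypothetical and exist only for this illustration.

```shell
#!/bin/sh
# Sketch only: models the "user autoload replaces system autoload" rule.
# pick_autoload is a hypothetical helper, not part of Kakoune.
pick_autoload() {
    user_dir=$1      # stand-in for the user's autoload directory
    system_dir=$2    # stand-in for the system autoload directory
    if [ -d "$user_dir" ]; then
        # The user directory exists, so the system one is skipped entirely,
        # even when the user directory holds a single plugin.
        printf '%s\n' "$user_dir"
    else
        printf '%s\n' "$system_dir"
    fi
}

# Demonstration with throwaway directories.
tmp=$(mktemp -d)
mkdir -p "$tmp/system/autoload"
before=$(pick_autoload "$tmp/user/autoload" "$tmp/system/autoload")
mkdir -p "$tmp/user/autoload"      # what installing a plugin often does
after=$(pick_autoload "$tmp/user/autoload" "$tmp/system/autoload")
printf 'before: %s\nafter:  %s\n' "$before" "$after"
rm -rf "$tmp"
```

Before the user directory is created, the system directory is picked; afterwards only the user directory is. A commonly suggested remedy is to symlink the system autoload directory into the user one, so that the standard library is loaded along with the plugin.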

In “A Better Way: Store State First”, the example uses $kak_client_list to get a list from Kakoune into the shell, and later $kak_quoted_reg_c to get another list from Kakoune into the shell, but you don’t describe the difference between those two, or how to distinguish one from the other. It might be worth having an earlier section about communication between Kakoune and the shell, to cover *_quoted_* expansions, and Kakoune-quoting in shell.

In “Use Kakoune Quotes in Shell Blocks When Possible”, it suggests that Kakoune always provides handy environment variables like $kak_client or $kak_opt_filetype to shell blocks. This is not true! Kakoune only exports environment variables mentioned in the text of the block or (in recent versions) in command-line arguments.

So:

echo %sh{ env | grep kak_ }

…will print nothing, but:

echo %sh{ env | grep kak_ # kak_client }

…will print the name of the current client, and so will:

def -params 1 checkvars %{ echo %sh{ env | grep kak_ } }
checkvars kak_client

There’s scope for a follow-up post with advanced topics like $kak_command_fifo, but since this is just focussing on the layers of evaluation and expansion, that’s fine — you address those topics well.

Thank you @Screwtapello for that detailed review!

I plan to iterate on modifications to the article, taking into account the feedback I received. I have a few topics to address, but this morning, I rewrote the “How Kakoune loads configuration” section.

Initially, I wanted that section to be a simplified explanation of what was happening behind the scenes. I just wanted to document the default experience. However, you convinced me, alongside the confusion we see on the Discord, that people often mess up their autoload, and this breaks Kakoune.

So I decided to go into more detail about what effectively happens, starting with the runtime directory, as advised by @alexherbo2. That way, the full behavior is documented, and the pitfalls are now highlighted. So when people struggle with that, we can guide them toward that section.

I will comment back here when I address other issues with the article.

I am continuing the improvements to the article. Here, I added one section just after having explained all the different kinds of strings.

This is to demystify what happens when unquoted expansions are juxtaposed before or after text, as the behavior is far from intuitive in Kakoune.

echo %val{bufname}after
echo before%val{bufname}

Another update. In the section “Resist Shell Expansions”, I added another example for ideas around branching without a shell.

This is directly gathered from the topic: Branching on a boolean option -- without calling the shell

Thanks @ftonneau for those ideas. I think they are super valuable to share!

Hey, I found some time today to make the last corrections prompted by your comment @Screwtapello.

I decided to add a big section named “Diving Into Shell Expansions”. In there, I explain in more detail what shell expansions are, what the quirks are, and things to know about them, including all the comments given in your review.

I hope I managed to convey them correctly in the article.

Now, this article is around 1400 lines of markdown, meaning it is super extensive. The length alone could discourage a reader. So I suppose I will stop here in terms of ideas, and if there are other important bits, they could go into a separate blog post. However, I am willing to make corrections if things are not accurate enough.

I believe this piece of writing is a precise guide to get someone up to speed in kakscript. I learned a lot by writing it, and it makes kakscript way more enjoyable to me.

Thanks again for the reviews and support, and let me know if you have any comments.

Well done, that’s well-written and complete!

There’s just one shell-quoting quirk I want to point out, though. This works perfectly:

$ items="'first item' 'second item'"
$ eval set -- $items
$ printf '%s\n' "$@"
first item
second item

…but this does not:

$ items="'first item' 'second        item'"
$ eval set -- $items
$ printf '%s\n' "$@"
first item
second item

Collapsing the whitespace (or rather, $IFS characters) within each item might not be a problem, but then again it might. To be actually safe, you need to quote the expansion in the eval, too:

$ items="'first item' 'second        item'"
$ eval set -- "$items"
$ printf '%s\n' "$@"
first item
second        item

Sometimes in the Kakoune standard library you’ll see it written eval "set -- $items" which is the same thing - as long as the variable expansion is inside double quotes, it should be fine.
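
To check that the two quoted spellings really behave the same, here is a small sketch that runs all three forms on the same input and keeps the second argument from each. The variable names unquoted, quoted_arg and quoted_cmd are made up for this comparison.

```shell
#!/bin/sh
# Compare three ways of turning a quoted list back into positional arguments.
items="'first item' 'second        item'"

eval set -- $items          # unquoted: field splitting collapses the run of spaces
unquoted=$2

eval set -- "$items"        # expansion quoted: eval receives the text untouched
quoted_arg=$2

eval "set -- $items"        # whole command quoted: the same text reaches eval
quoted_cmd=$2

# Brackets make the preserved or collapsed whitespace visible.
printf '[%s]\n' "$unquoted" "$quoted_arg" "$quoted_cmd"
```

Only the first line of output has the spaces collapsed to one; the other two are identical and keep the original spacing.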

This is amazing content! Thank you so much for the tremendous effort here!

It’s really cool to understand that there is nothing special about command bodies.

I wanted to check that I could define a command body using a shell expansion that only executes at the time when Kakoune sources the file!

define-command echo-blah %sh{ 
    printf "echo %s" "blah blah" 
}

I always had the mistaken mental model that the command body passed to define-command was something special…when in fact it’s just a string like anything else.

I’m still working my way through, but one tiny nit I found…

Under “Expansions” you mention:

  • Their content is quoted using a balanced punctuation character like {, (, [, or <

I’m seeing expansions working just fine using non-nested punctuation delimiters as well:

echo %val|buffile|

Hey @Screwtapello ,

Thanks for pointing this out. I should admit that I struggle to understand exactly why those double quotes are needed. But it effectively solves that problem. I have added the quotes in the article, even if it is still not fully clear in my head.

@schickm ,

Thank you, you made a really important point. Effectively, echo %val|buffile| seems totally possible. I re-read my article and understood that expansions were still not clear in my head for some details, such as this one.

In fact, expansions are not a sort of quoting. This is the mistake I was making. They are just a special form of %-strings, which happen to have a type. But they can be represented either in a balanced format (%val{runtime}) or in a quoted format (%val|runtime|).

This made me rethink and rewrite a big part of the article as this was badly explained. I introduced the concept of quoting “shapes” as an intermediary step to the actual 4 kinds of quoting in Kakoune. I believe it is simpler like this. For sure, it is more accurate because now, I talk about nesting in parsing more than expansions.

In fact, double-quoted strings are parsed in a nested way, and this has nothing to do with expansions.

echo "hey %{you}"

Here, we have a regular double-quoted string which contains a balanced string. A balanced string can be an expansion or not. This depends on whether it has a type. Here it does not. However, during the parsing of the parent double-quoted string, nesting happens! So Kakoune will also parse the balanced string, and the result will be hey you.

Expansion, on the other hand, is a step that happens after parsing: the content of the string (produced by parsing the expansion's quoting kind, either balanced or quoted) is replaced by something else at that point. What it is replaced by depends on the type of the expansion.

I hope it makes more sense written like this. At least it does for me.

Let’s say we have:

kak_quoted_buflist="'file a' 'file   b'"

If we try to use it unquoted:

eval set -- $kak_quoted_buflist

…then the shell goes through the following steps:

  1. it expands the variable:

    eval set -- 'file a' 'file   b'
    
  2. it splits the text according to IFS:

    "eval" "set" "--" "'file" "a'" "'file" "b'"
    
  3. it invokes the resulting command, eval, with all the following words as arguments

  4. eval takes its arguments and joins them with spaces:

    set -- 'file a' 'file b'
    
  5. eval executes that command, and now file b has lost its extra space characters.

The splitting step does not apply to variable expansions that occur within double quotes, so if you start with:

eval set -- "$kak_quoted_buflist"

…then the shell goes through the following steps:

  1. it expands the variable:

    eval set -- "'file a' 'file   b'"
    
  2. it splits according to IFS, but it leaves the quoted text alone:

    "eval" "set" "--" "'file a' 'file   b'"
    
  3. it invokes the resulting command, eval, with all the following words as arguments

  4. eval takes its arguments and joins them with spaces:

    "set -- 'file a' 'file   b'"
    
  5. eval executes that command, and extra spaces are preserved.
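
The splitting step in both walkthroughs can be observed directly by putting a helper where eval would be and printing the words it receives. This is a sketch: show_args is a hypothetical helper, and kak_quoted_buflist is set by hand here rather than exported by Kakoune.

```shell
#!/bin/sh
# show_args is a hypothetical helper: it prints each argument it receives
# between angle brackets, so word boundaries become visible.
show_args() {
    printf '<%s>' "$@"
    printf '\n'
}

# Stand-in for the value Kakoune would export; here it is set by hand.
kak_quoted_buflist="'file a' 'file   b'"

# Put show_args where eval would be, to see the words eval would receive.
unquoted=$(show_args set -- $kak_quoted_buflist)
quoted=$(show_args set -- "$kak_quoted_buflist")

printf '%s\n%s\n' "$unquoted" "$quoted"
# <set><--><'file><a'><'file><b'>
# <set><--><'file a' 'file   b'>
```

In the unquoted case the extra spaces are already gone before eval sees anything, which is why eval cannot bring them back when it joins its arguments.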

Thanks a lot @Screwtapello !

This makes a lot of sense. I took the time to rephrase and incorporate those explanations inside the article. It keeps growing, but I have the feeling that it becomes more and more accurate.

I would appreciate if you could review the new section that I rewrote, which explains in my own words what you are telling me. I also took the time to more broadly give insights about shell parsing, quoting and field splitting. I hope it is precise enough.

I like it! It is getting long, but I think it’s still an easy read, and explains things pretty well.

I do have one minor nitpick, though. In the “Loop Through Positional Arguments” section, you describe "$@" like this:

In other words, even when enclosed in double quotes, $@ undergoes field splitting. However, this process is not governed by IFS; instead, it is split at each argument while respecting whitespace, if present.

“field splitting” generally refers to the way that the shell breaks up the result of variable expansion. $foo undergoes field splitting, "$foo" does not undergo field splitting. The spec doesn’t have a technical term for what happens to $@, but the closest matching plain English term I can think of is “splicing”.

The values in $@ are spliced into the commandline as separate words. Then, those words are processed as separate variable expansions - they undergo field splitting unless they were in double quotes, just like regular variable expansions.

For example, let’s make some variables, and put them in the argument list.

a="first item"
b="second item"
c="third item"
set -- "$a" "$b" "$c"

Now, we can put those values into a command individually:

$ printf '%s\n' ...$a $b $c...
...first
item
second
item
third
item...
$ printf '%s\n' "...$a" "$b" "$c..."
...first item
second item
third item...

…or using $@:

$ printf '%s\n' ...$@...
...first
item
second
item
third
item...
$ printf '%s\n' "...$@..."
...first item
second item
third item...

Whether we use $@ to splice values into the command line, or we splice them in manually ourselves, we get the same behaviour - values inside double-quotes are used as-is, values outside double-quotes undergo field splitting.
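
The same rule applies when looping over the arguments. As a small sketch (the counter names are made up for this illustration), counting loop iterations shows how many words each form produces:

```shell
#!/bin/sh
# Three arguments, each containing one space.
set -- "first item" "second item" "third item"

quoted_count=0
for arg in "$@"; do            # spliced as three words, each used as-is
    quoted_count=$((quoted_count + 1))
done

unquoted_count=0
for arg in $@; do              # spliced, then each word is field-split
    unquoted_count=$((unquoted_count + 1))
done

printf 'quoted: %s, unquoted: %s\n' "$quoted_count" "$unquoted_count"
# quoted: 3, unquoted: 6
```

With the default IFS, the quoted loop runs once per argument, while the unquoted loop runs once per whitespace-separated word.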