Array/iterator and looping behaviour in kak script

TL;DR: In a kak script, how can I store a list of selections and then send it to an external script?

Hi,

I’m trying to make a script similar to occivink/kakoune-expand. The script is a lazy <a-i> which tries to find the closest delimiter to the cursor position.

The approach is to store the selections from all the different “inner object” commands (b, q, etc.) and send them to an external script. The external script tries to find the closest match to the cursor position.

I have trouble iterating through all the different <a-i>X commands and storing the resulting selections.

Kakoune script doesn’t have a looping feature, and it uses the dash shell, which doesn’t have arrays. So my current solution is to iterate through all the <a-i> commands in a %sh expansion and store the results in one giant string.
Then I send the string to my external script.

But this approach brings me into an unreadable hell of shell escapes, and I’m pretty sure it’s going to break if my selections contain special characters.

Any pointers or tips on the right way to do it?

POSIX shells don’t have explicit arrays, but perhaps you could use positional parameters with

eval set -- "$kak_quoted_selections"

or something along these lines?
(My 2 cents.)

Could you elaborate? I’m not familiar with the ‘eval set’ command.

You say “dash shell doesn’t have arrays”, but every shell has one array variable, the argument list ($1, $2, $3, etc.)

You can’t normally assign to $1, but any non-flag arguments supplied to the set builtin replace the previous contents of the argument list. For example, if you create a shell-script like this:

#!/bin/sh
printf "Got argument: %s\n" "$@"
set -- x y z
printf "Got argument: %s\n" "$@"

…and then run it with arguments:

$ argument-test.sh a b c d
Got argument: a
Got argument: b
Got argument: c
Got argument: d
Got argument: x
Got argument: y
Got argument: z

The first printf prints the arguments given on the command-line, the set replaces them with a new argument list, so the second printf prints that new list.

The idiom listed in the Kakoune docs is:

eval set -- "$kak_quoted_whatever"

…in order to reliably handle special characters. The quoted_ part of the variable name causes Kakoune to shell-quote the value before setting it in the environment. The eval undoes the shell-quoting, guaranteeing that the resulting argument list is split into items in exactly the way Kakoune intended.
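As a minimal sketch of that idiom outside Kakoune: the value below is a hand-written stand-in for what Kakoune would export in $kak_quoted_selections (three made-up selections, one containing a single quote), and print_selections is a hypothetical helper name.

```sh
#!/bin/sh
# Hand-written stand-in for $kak_quoted_selections (assumption: inside a real
# %sh{} block Kakoune sets this variable for you).
kak_quoted_selections=$(cat <<'EOF'
'foo bar' 'it'\''s' '$HOME * "x"'
EOF
)

# Hypothetical helper: load the quoted list into the argument list, then
# print how many items there are and each item on its own line.
print_selections() {
    eval set -- "$1"
    printf '%d selections\n' "$#"
    for sel in "$@"; do
        printf '<%s>\n' "$sel"
    done
}

print_selections "$kak_quoted_selections"
```

Each item arrives intact as $1, $2, $3, including the spaces, the quote and the characters the shell would otherwise expand.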

Thanks for the explanation, I didn’t know about it.

It’s a shame that such a simple thing as having an array of strings is so complicated in kak script.

I thought I was good at shell until I started using Kakoune. This plugin has put me through an enormous amount of pain (and all the “it works in bash but not in dash” bugs are part of it). I can’t imagine how it must feel for someone who doesn’t have any shell scripting experience.

I believe I could use the set -- solution for my array problem, but it doesn’t feel good at all. I have never used it, and I’m pretty sure I’m going to run into other painful bugs. Plus, next time I read my script I’ll have no idea what it means.

Don’t get me wrong, I love Kakoune, but damn, I hate all this shady shell scripting.

It would be easier for me to just send the whole text of the buffer and the cursor position to an external script and implement <a-i> myself in a decent programming language. This doesn’t make any sense.

I think you are calling POSIX shell kak script, but Kakoune is not responsible for the good or bad things about the shell. You can even run Kakoune with a different shell if you’d prefer that, but a lot of scripts won’t work then.

Just add a comment then.

I’d say you can easily do that. When you feel the complexity is too big, it is reasonable to use another programming language. Just invoke it from the shell.

Agreed, and at first I loved this idea of being fully POSIX compliant, but it’s only nice in theory. Pragmatically, now that I have started writing plugins, I would rather have an editor with an efficient scripting language than an editor that is more “unix kosher”. POSIX’s responsibility was to find good standards, and they did a terrible job at that. Kakoune’s responsibility is to create a good editor, and if POSIX uses a bad language, maybe Kakoune should not follow their path.

That shows that POSIX shell is a bad language.

This works well in theory when you have a clean separation between the internal kak script and the external script. But the more this separation blurs and the two overlap, the less it is worth invoking anything external. Then you have to write the whole plugin in dash, and unless you are a dash guru (which is rare) you will have a painful time.

Shell is the glue code between different programs. You can even run another program in daemon mode and just use shell to invoke a command that passes a message from Kakoune.

I think it is a general belief among Kakoune users that it is not necessary to develop a whole language to build an editor. However, we may be wrong. Emacs follows that philosophy closer.

Personally, I’m happy that Kakoune uses posix shell, and it’s a language I have no problems practicing because it’s actually useful.

I wouldn’t say it’s not bad (for some reasons), but I don’t think that not having arrays and having a set expression is one of them. Indeed, it was a design goal to keep it simple. Thanks to that, dash starts in 2 ms and runs incredibly fast even though it’s an interpreted language. But, of course, it has some bad things which could have been handled better. And it is not a comfortable tool for some tasks.

Yes, and that’s a beautiful concept on paper; it works well most of the time. But in some cases you end up with a big mess of glue, duct tape and staples where a single clean screw could have done the job.

I think the Kakoune devs feel the same as you, but we should consider that not everybody feels the same about POSIX shell. It would be sad if Kakoune were an elitist editor, for unix super-users only.

Anyway, I just found out about the amazing luar from @gustavo-hms; I think this is the “clean screw” I was looking for. It also helps Kakoune be more inclusive toward non unix super-users (aka normal people :slight_smile:). I think Lua is a good middle ground between the two worlds. I hope Lua will be used more in the Kakoune community (mostly by plugin makers) and that one day it will be part of Kakoune in a more official way.

While writing a plugin in pure sh is a plus, that is definitely not the only way to write plugins. At the end of the day, you want your plugin to be

  • easy to write
  • easy to run for other people

Pure sh satisfies the second trivially, since it is guaranteed to work in every environment you can run Kakoune in; however, it is definitely not easy to write.

The nice thing with Kakoune is that there is nothing preventing you from using any language you want to trade-off between the above two. Anything that is exposed to the %sh scope you can also expose to the other language you are writing with: Usually through command-line arguments or directly accessing the environment variables that Kakoune sets. luar makes it easier and more convenient, but it’s not the only way to write non-sh plugins.

Given the above, you can easily gain on the “easy to write” dimension without losing much on the “easy to run” dimension. For instance, here are a few alternative languages:

  • perl: pretty much available on every *nix distro by default. AFAIK a few of @andreyorst’s plugins such as kaktree utilize it, along with kakoune-gdb
  • python: python3 is very widely available and pretty much guaranteed to be 3.5+ in modern distros. Plugins like kak-spell and others use it
  • compiled binaries: As long as it doesn’t have onerous runtime dependencies, it can be a good choice for performance-critical applications. kak-lsp is written in rust and is distributed as a single binary without runtime dependencies

luar is definitely nice and I think lua makes a good language for writing plugins, but it isn’t as widely available as the options above. If you are writing plugins to just use for yourself though, the “easy to run” criterion doesn’t even apply :slight_smile:

I would also add:

  • easy to read (by your future self and other people)
  • easy to hack: I can read and/or write a bash script, but I wouldn’t dare modify a single char in a script

On the software side I agree. And the “use shell as glue” principle works very well with a simple workflow like:
grab state in kak script -> send to a better language to handle complex tasks -> execute command in kak script
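A rough sketch of that round trip, written as the body of a hypothetical evaluate-commands %sh{ ... } block: Kakoune exports state as environment variables, the shell hands it to whatever does the real work, and whatever is printed on standard output is evaluated by Kakoune as commands. Here describe_selection is a made-up stand-in for the “better language” step, and the fallback value only exists so the snippet also runs outside Kakoune.

```sh
#!/bin/sh
# Stand-in for the external program: takes the selection text as its first
# argument and prints a Kakoune command (here, `echo`) on standard output.
describe_selection() {
    text=$1
    # ${#text} counts bytes in some shells (dash) and characters in others.
    printf "echo 'selection length: %d'\n" "${#text}"
}

# $kak_selection is set by Kakoune inside %sh{}; "example" is only a fallback
# for running this outside the editor.
describe_selection "${kak_selection:-example}"
```

In a real plugin the function body would be replaced by a call to a Python, Lua or compiled helper that receives the same argument.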

But on the human/UX side I disagree.

If I write a plugin that is a bit more complex (one that requires a lot of back and forth between kak script and an external script, for example), then it gets very messy and hacky.

And what is stopping me is not that I have reached the limit of Kakoune’s extensibility, but rather my own limits: I’m having a hard time, the code is ugly, I have no clue where the bug is coming from, it’s painful, etc.

This could be improved with better integration of other languages. I see two improvements:

  • the use of language-specific expansions: %python{ }
  • the creation of a library to communicate with Kakoune from other scripts, like so:
#!/usr/bin/env python
import kak  # hypothetical library
sel = kak.get_val('selection')
kak.execute_keys('i' + sel[::-1])

Perl is available, but no one can read, write or hack it if they have never done Perl. That is not the case for all languages: I have never done Lua, but I can easily read a Lua function and understand what is going on.

I think Python is a great option as it’s very widely available and easy to read, write and hack. But it is a slow language, which can be problematic when writing plugins.

True, but it’s so lightweight and easy to install. It has great performance. It is also popular and used by a lot of other projects (e.g. nvim), so I don’t feel bad installing it.

Hi, @scr!

I remember the experience I had when I learned Lua many years ago. Back then, I was mainly a C programmer with some skills in Perl and Python scripting. But none of that prepared me for Lua: it was the first time I was seeing lambdas, closures, coroutines and modules being first class values I could store in a variable and manipulate as I wished (much like OCaml modules, but even more flexible). It took me a long time to fully understand all of those concepts.

Then I started learning purely functional languages (Haskell first, then Elm some years later). It was another breakthrough. At first I was fascinated but also very frustrated because I couldn’t manage to do even simple things like putting a printf to inspect some variable.

These two experiences were very important to me because they taught me to be a better person (not a better programmer but a better person). And they did so because I learned something valuable: programming is all about community, and when you enter a new community you may have to face a new culture, and whenever you face a new culture, you must leave behind whatever you think you know about the world, otherwise you won’t understand the new culture and you will never be able to embrace it.

Different cultures have different views of the world, have different cosmogonies, take different paths, and we shouldn’t expect them to solve problems the same way we are used to.

Now you are entering the Kakoune community. Welcome! But be aware that to enjoy the experience to the full, you must leave behind the old you and start to learn anew. Otherwise, you will experience too much pain doing even simple things, like the old me trying to learn new programming languages.

Writing scripts in Kakoune doesn’t need to be that hard, you just need to learn how to do it differently, from the perspective of another culture. For example, it’s true that Kakoune doesn’t have loops, but functional languages also don’t have them, and yet they are very powerful and elegant.

It’s also true that Kakoune doesn’t have a built-in programming language, but on the other hand it has a very expressive text editing language, on par with the one provided by the Vis editor, but way more powerful than what Vim and Emacs provide, let alone Atom, VSCode and Sublime Text. Together with its commands and expansions, it’s capable of doing a lot of sophisticated things without too much burden. But it takes time to fully appreciate it, because it demands a cultural change, and every cultural change demands a cognitive change, and that takes time. And it’s fine, since we are humans, not machines.

So let me present to you some of the capabilities of such a sophisticated text editing language. And, since I just talked about functional languages, let’s see what we can learn from them that can be used here. You don’t need to know the concepts I’m going to mention here though, especially because the concepts we are going to see are already present in many imperative languages that took inspiration from the functional world, like SmallTalk, Pharo, Rust and Ruby.

As we are about to see, more often than not loops are an unnecessary complication.

Selections are functors

Sometimes, it’s hard to conceive a way to iterate over every element of a data structure to transform it. But what if I simply don’t need to? What if I could just find a way to operate on a single element and the system magically used this information to transform all of them? That’s what a functor is all about. Functors let me focus on the transformation of a single element extracted from the data structure, and their machinery applies this transformation to all of the elements inside the data structure for me, without the need for loops. It’s an operation called mapping.

Kakoune selections are functors. Let’s see how.

Mapping using keys

Suppose I want to surround every word construir with quotation marks in the following verse from João Cabral de Melo Neto:

A arquitetura como construir portas,
de abrir; ou como construir o aberto;
construir, não como ilhar e prender,
nem construir como fechar secretos;
construir portas abertas, em portas;
casas exclusivamente portas e tecto.

Well, the first thing I need to do is select (executing %sconstruir) all of these words:

A arquitetura como [construir] portas,
de abrir; ou como [construir] o aberto;
[construir], não como ilhar e prender,
nem [construir] como fechar secretos;
[construir] portas abertas, em portas;
casas exclusivamente portas e tecto.

From now on I’ll use the notation [something] to mean that something is selected. Now comes the question: how do I iterate over all of these selections to quote every one of them? You know, Kakoune doesn’t have loops. Does it mean I must resort to %sh{} and use a shell loop over the $kak_selections expansion? Fortunately not. I can just forget for a moment that I have a lot of selections and pretend I have just one. And I know how to surround a single selection with quotes: insert a quote (i") at the beginning of the selection, leave insert mode (<esc>), then append a quote (a") at its end. And, suddenly, every selection is magically surrounded by quotes:

A arquitetura como "[construir"] portas,
de abrir; ou como "[construir"] o aberto;
"[construir"], não como ilhar e prender,
nem "[construir"] como fechar secretos;
"[construir"] portas abertas, em portas;
casas exclusivamente portas e tecto.

No loops required.

Mapping using an external tool

Now something a bit trickier. Suppose I have a list of numbers I would like to increment:

[12], [16], [18], [22], [28], [30] 

Kakoune doesn’t know how to count. How can I do that? The secret is the | key. It allows me to pipe each selection to an external program and write the result in its place. So let’s try executing |xargs echo "1 + " | bc. Here, the idea is to append each selection to the expression 1 + and then send this string to the bc calculator. Does it work?

[13], [17], [19], [23], [29], [31] 

Wow! Prime numbers!

Note that the external tool didn’t need to iterate over the values of the selections. From bc’s point of view, there’s only one expression, not a list of them. Kakoune managed to extract one item at a time and send it to the pipeline.

Let’s try the same thing using Lua this time: |lua -e 'print(tonumber(io.read()) + 1)'. Same thing: it just works. No loops.
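To make the “one item at a time” point concrete, here is a small sketch that mimics it outside Kakoune. The increment filter reads a single number on standard input, just as a command given to | would, and map_each is a hypothetical stand-in for what Kakoune does on our behalf; shell arithmetic replaces bc so the snippet has no dependencies.

```sh
#!/bin/sh
# The filter only ever sees one selection on standard input.
increment() {
    read -r n
    printf '%s\n' "$((n + 1))"
}

# Stand-in for Kakoune: run the filter once per item. This loop is the part
# the editor takes care of, so the filter itself never needs one.
map_each() {
    filter=$1
    shift
    for item in "$@"; do
        printf '%s\n' "$item" | "$filter"
    done
}

map_each increment 12 16 18 22 28 30
```

The loop lives entirely in map_each; the filter stays a plain one-value-in, one-value-out program.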

Functors are nice.

As an aside, judging only by the text in your first post in this topic, this is exactly the functionality you were looking for.

Selections can be filtered

Suppose I have the following code with all lines selected:

[zip : Tree node leaf -> Zipper node leaf]
[zip tree =]
[    ( tree, [] )]
[]
[]
[unzip : Zipper node leaf -> Tree node leaf]
[unzip =]
[    goToRoot >> subtree]
[]
[]
[subtree : Zipper node leaf -> Tree node leaf]
[subtree ( tree, _ ) =]
[    tree]

I want to keep only the selections with a type annotation. How do I do that?

In the imperative way of thinking, I would need to iterate over each selection, matching its text against the regex ^\w+ : and keeping the ones that match. Those with a functional background know that I can use a function like filter to do the job. The equivalent in Kakoune is the <a-k> key. Let’s try it by executing <a-k>^\w+ ::

[zip : Tree node leaf -> Zipper node leaf]
zip tree =
    ( tree, [] )


[unzip : Zipper node leaf -> Tree node leaf]
unzip =
    goToRoot >> subtree


[subtree : Zipper node leaf -> Tree node leaf]
subtree ( tree, _ ) =
    tree

Success!

And there’s also <a-K>, that clears all the selections that match the provided regex. It’s the complement of <a-k>.

Filtering using an external tool

But what if I have the following text from Guimarães Rosa:

[Só] [outro] [silêncio]. [O] [senhor] [sabe] [o] [que] [o] [silêncio] [é]? [É] [a] [gente] [mesmo], [demais].

and want to keep selected only the words with at least six characters? Complicated… Not only can Kakoune not count, it’s even worse: it can’t decide, meaning it can’t use ifs and elses to make decisions and take branches. What now?

Let me present to you the excellent $ key. With it, I can pipe each selection to an external tool and Kakoune will keep those selections for which the shell returned 0. Let us try: $lua -e 'if #io.read() >= 6 then os.exit(0) else os.exit(1) end'.

Só outro [silêncio]. O [senhor] sabe o que o [silêncio] é? É a gente mesmo, [demais].
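The same predicate can be sketched in plain shell. The at_least_six function is a hypothetical name; it reads one selection from standard input and communicates only through its exit status, which is all the $ key looks at. Note that ${#word} may count bytes rather than characters depending on the shell, so accented words can measure longer than they look (the Lua # operator has the same caveat).

```sh
#!/bin/sh
# Succeed (exit status 0) when the line read from stdin has 6 or more units.
at_least_six() {
    read -r word
    [ "${#word}" -ge 6 ]
}

# Try the predicate on a few unaccented words from the sentence.
for w in outro senhor gente demais; do
    if printf '%s\n' "$w" | at_least_six; then
        printf 'keep %s\n' "$w"
    else
        printf 'drop %s\n' "$w"
    fi
done
```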

Selections are monads (sort of)

We have just seen how mapping automatically applies, to all selections, operations that take some text (inside a selection) and give some text back. But what if my operation, instead of returning some text, returns even more selections? Monads to the rescue!

Let’s first see an example. Consider the following verses from Fernando Pessoa:

[O poeta é um fingidor.]
[Finge tão completamente]
[Que chega a fingir que é dor]
[A dor que deveras sente.]

Now, if I do S with a single space as the pattern, I’m telling Kakoune that I want to split each selection on its spaces. Kakoune then goes over each selection for me and applies this operation as if I were operating on a single selection. But the result of such an operation is itself a collection of selections. So, I’d expect the result to be something like this:

[[O] [poeta] [é] [um] [fingidor.]]
[[Finge] [tão] [completamente]]
[[Que] [chega] [a] [fingir] [que] [é] [dor]]
[[A] [dor] [que] [deveras] [sente.]]

But Kakoune knows how to deal with that, and instead removes these nested structures, giving back a flat collection of selections. This allows us to select inside other selections, and that’s an incredibly powerful concept!

Let me illustrate how powerful this concept is. Consider I have the following scenario:

  • a Markdown file containing many sections;
  • at each section, I have many code snippets in Elm and Lua;
  • inside a section named “Meus algoritmos maravilhosos”, I want to select all the comments from Lua snippets, but not from Elm snippets.

It’s tricky, because both Lua and Elm comments start with -- . Here are some steps to do it using selections inside selections:

  • select the whole buffer: %;
  • select the right heading inside the whole buffer: s^# Meus algoritmos maravilhosos;
  • extend the selection until the next heading: ?^# ;
  • select the beginning of all Lua snippets inside this selection: s^```lua ;
  • extend the selections until the closing of the snippet: ?^``` ;
  • split the selections on line boundaries: <a-s>;
  • keep those starting with a Lua comment mark: <a-k>--

Selections are monoids

Selections can be merged together in several ways. There’s the <a-_> key and all the ways marks can be combined. Any data structure that has a notion of merging is called a Monoid. A monoid is useful for many things, but let’s try to find an interesting and simple example.

Suppose I have the following Lua code:

local x = 17

if something then
    x = 19
end

local t = { key = x }

if y < 21 then
    z = something or 3
    t.other_key = y + z
end

Now I want to select these two if blocks. Using marks, I can do the following:

  • select the whole buffer (%);
  • select every if keyword (sif );
  • mark these selections (Z);
  • again, select the whole buffer (%);
  • select every end keyword (send);
  • keep a union of the current selections with the ones previously marked (<a-z>u).

Done!

Selections can be manipulated like lists

Lists are ubiquitous in functional languages. It’s because they have some nice properties, among which is the fact that they can be easily split into a head (the first element) and a tail (the remaining elements), something that plays nicely with recursive algorithms.

The interesting thing is that selections also have a head (the main selection) and a tail (the remaining selections) and so it should be possible to implement recursive algorithms with them.

Let’s try to build one, just for fun!

I want to define a command called revert, that reverses all the lines of a buffer (I mean: the first line becomes the last one and so on). How hard can it be?

This is my attempt:

define-command revert -docstring "Os últimos serão os primeiros!" %{
    execute-keys <percent><a-s> # Select the whole buffer and split on line boundaries
    revert-the-lines # The actual recursive command
}

define-command -hidden revert-the-lines %{
    # Use a `try` to avoid raising an error when there are no more selections left
    try %{
        # - run `execute-keys` in a draft context to avoid losing the current selections
        # - reduce selections to just the main one (`<space>`)
        # - cut its content (`d`)
        # - go to the end of the buffer (`gj`)
        # - paste the line there (`p`)
        execute-keys -draft <space>dgjp

        # remove the head (the main selection)
        execute-keys <a-space>

        # recurse, this time with only the remaining selections
        revert-the-lines
    }
}

For a great explanation of why the -draft switch was needed, let @alexherbo2 enlighten us.

Can this recursive command possibly work? Well, let’s try it: :revert

}
    }
        revert-the-lines
        # recurse, this time with only the remaining selections

        execute-keys <a-space>
        # remove the head (the main selection)

        execute-keys -draft <space>dgjp
        # - paste the line there (`p`)
        # - go to the end of the buffer (`gj`)
        # - cut its content (`d`)
        # - reduce selections to just the main one (`<space>`)
        # - run `execute-keys` in a draft context to avoid losing the current selections
    try %{
    # Use a `try` to avoid raising an error when there are no more selections left
define-command -hidden revert-the-lines %{

}
    revert-the-lines # The actual recursive command
    execute-keys <percent><a-s> # Select the whole buffer and split on line boundaries
define-command revert -docstring "Os últimos serão os primeiros!" %{

Q.E.D.

That’s… amazing!! It really works!!

Keep your mind open

First of all, I must emphasize that everything we achieved here was done without any %sh{} or lua %{} blocks. And we achieved a lot. Kakoune’s text editing language is so powerful we simply didn’t need a programming language for the tasks we’ve presented here.

Then, a very personal comment: I don’t see %sh{} as Kakoune’s programming language. I like to see it as a message. With it, Kakoune is saying: “I desire to communicate with my surroundings”. Kakoune is saying it doesn’t want to be a selfish black box and instead wants to be a part of something bigger. The %sh{} expansion is a bridge to the outside world. And that’s beautiful.

From within Vim we can use external tools to manipulate text, but no editor I’m aware of has a mechanism as flexible as Kakoune’s for integrating with its surroundings. I wouldn’t be able to write luar for another editor in fewer than 100 lines of code, precisely because they lack something like %sh{}.

I know you don’t like POSIX shell scripting. Me neither. But that’s OK, because we can use tools like luar. I would never write the logic of a plugin using shell scripting, but just because people do things in a way I’d never do doesn’t mean they are ugly, barbarians and evil. We must keep our mind open to be able to embrace new cultures. That’s the beauty of diversity.

Practical considerations

I’m not an experienced plugin developer and can’t offer good advice in that regard. But I like to observe and listen, and I’d say a good cultural practice inside the Kakoune community is relying on the powerful features of the Kakoune API, resorting to a programming language only in the special cases where the API is not enough. Take for instance this plugin from @mawww and see how little %sh{} usage it needs. Maybe this way you can avoid much of the burden you are currently carrying.

Thanks a lot for this wonderful post and the parallels you drew between the selection list and functional programming. (It definitely deserves a lot more exposure! Please consider turning it into a proper blog article if you have the will and the time to do so.)

I have often thought about these mapping / filtering capabilities. Ultimately, these two mechanisms are special cases of what some languages call reduction (as in JavaScript’s Array.prototype.reduce) or folding (as in Haskell’s fold functions). Unfortunately, my reflection never went as far as working out how this notion could be implemented easily within the Kakoune mindset.

Let me reuse one of your examples to illustrate a more concrete scenario. The one involving numbers:

[12], [16], [18], [22], [28], [30] 

We have 6 selections, each with a number. What if I want to reduce this list of numbers to a single one, their total (a fold +)?

[126]

There’s this hidden notion of “accumulated value”. Should this value, which represents the intermediate sums leading to the construction of the final value, be added to the Kakoune buffer at each step of the recursive walk-through? Or is it better to consider this as some kind of temporary artifact that should rather be handled through something like a Kakoune register?

Also, what are your thoughts about reduction/folding in Kakoune in general? Do we already have all the right editing primitives / keys we need at our disposal, or do we miss some core ones that would simplify the job? Thanks!

In this kind of situation, I’ve noticed that Kakoune’s multi-selection yank and paste works pretty well.

So here you could select all the numbers, yank them and open a new, temporary buffer. Then, in that buffer, you can process the numbers and “fold +” them via keys:
<a-p><a-space>a+<esc>x|bc<ret>

Keep in mind that Kakoune buffers are also one of the primitives that can be easily created and discarded.
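For comparison, the same “fold +” can be sketched on the shell side, as it might appear in the body of a %sh{} block. The sum_list name is made up, and the sample string is a hand-written stand-in for $kak_quoted_selections.

```sh
#!/bin/sh
# Sum a shell-quoted list of integers passed as a single string.
sum_list() {
    eval set -- "$1"      # unpack the quoted list into $1, $2, ...
    total=0
    for n in "$@"; do
        total=$((total + n))
    done
    printf '%s\n' "$total"
}

sum_list "'12' '16' '18' '22' '28' '30'"
```

Here the accumulated value lives in a shell variable instead of a buffer or a register.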

edit -scratch was added exactly for this use case: It auto generates a name and gives you a temporary buffer to work in.

Wow, I never noticed -scratch auto generated a buffer name. It is very nice.

Hi @gustavo-hms,

You say programming is all about community, and I’m so happy you are part of Kakoune’s. I really enjoyed reading this article with my morning coffee and I learned a bunch of new stuff. With @Screwtapello’s “punk-rock” article, we are very spoiled these days :slight_smile:

But unfortunately it didn’t help me solve the problem. I see the confusion when I read my initial post. I should have been more precise about the root of the issue, which is clearer now:
Kakoune merges selections automatically when they overlap.

In other words, Kakoune will have beautifully functional-like behaviour with:

[foo] [bar] [wiz]

But it gets tricky if we want:

[foo [bar] wiz]

This cannot exist in Kakoune; it will be merged into one single selection:

[foo bar wiz]

When you make a plugin like occivink/kakoune-expand, you need to compare all the surrounding selections to choose the appropriate one (usually the shortest, but it’s trickier with quotes; think of "foo" _ "bar").
You would need the following selections:

[{   [(   ["foo"]  )]  }]

Since I can’t have those 3 selections together and pipe them to an external program, I have to store them one by one in some kind of array. But dash doesn’t have this kind of structure.
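For what it’s worth, here is one way this could be sketched with the argument list standing in for the array: each candidate is appended with set -- "$@" ..., and a helper then picks the shortest. The three candidate strings are typed in by hand; in a real plugin each would presumably come from a separate <a-i> attempt, and shortest_of is a hypothetical name.

```sh
#!/bin/sh
# Print the shortest of the arguments (the first one wins on a tie).
shortest_of() {
    best=$1
    shift
    for candidate in "$@"; do
        if [ "${#candidate}" -lt "${#best}" ]; then
            best=$candidate
        fi
    done
    printf '%s\n' "$best"
}

# Start from an empty list and append one candidate at a time.
set --
set -- "$@" '{ ( "foo" ) }'
set -- "$@" '( "foo" )'
set -- "$@" '"foo"'

shortest_of "$@"
```

Because every element stays a separate argument, quotes and spaces inside a candidate need no extra escaping once it is in the list.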

At this point, I felt I was arriving at a dead end, so I posted on the forum. Not that I could not see a solution; I just got discouraged. I was so angry that such a simple thing as storing a bunch of strings in an array would require so much effort.

If you look at occivink/kakoune-expand, you will see that it fails to send the selections to an external tool and is entirely written in dash. Dash is supposed to be just the glue. I don’t like it when plugins are 100% dash: it’s unreadable and unhackable.
That plugin also only works with brace-like delimiters. It doesn’t manage quote-like delimiters for the same reason: it fails to send the selections to another tool. Handling quotes requires complex logic, and complex logic is hard to do in dash, so it can’t handle quotes. Also, working with quote-like characters inside dash is just a string-escaping nightmare.

Anyhoo… this is my anecdote about trying to add quotes support to a plugin. I’m happy it inspired @gustavo-hms to write his great article :slight_smile:. I hope I made myself clearer and that I pinpointed a case where Kakoune doesn’t deliver the simplicity we are used to. Maybe plugins like luar can help solve this kind of challenge.

Please consider turning it into a proper blog article if you have the will and the time to do so.

The truth is that I don’t like to spend too much time in front of a computer. I don’t even have a website :sweat_smile: But I’ll take your suggestion into consideration.

Do we already have all the right editing primitives / keys we need at our disposal, or do we miss some core ones that would simplify the job?

I’d really appreciate it if Kakoune could treat object selections as first class citizens. Currently, there are some issues I dislike. First, <a-a> and <a-i> are unergonomic. When I started using Kakoune, I simply couldn’t do some editing as fast as in Vim just because my brain had to stop for a moment to think about the key combinations to select a word or a paragraph. I tried hard to get used to them, but without luck. That lasted until I remapped these keys to something else. It had a big positive impact on my productivity.

Then, it would be very useful (for me, at least) if we could use object selections in combination with s, S, <a-k> and <a-K>. If I want to select all the words of a line, why do I need to do xs\w+ if Kakoune already knows what a word is? And sometimes Kakoune knows it even better than me, because it can read the extra_word_chars option. This is especially true if I have to deal with nested blocks, since selecting them can’t be done easily and generally with regular expressions. Why can’t I simply execute s<a-a>{ to select a brackets block?

Now to the folds!

Folding in Kakoune

There’s this hidden notion of “accumulated value”.

This notion of accumulated value is precisely what a monoid expresses. For instance, we can fold a list of integers because integers act as monoids with respect to the addition and multiplication operations. In the same vein, booleans are monoids with respect to the and and or operations, and strings are monoids with respect to the concatenation operation.

The general idea is: if we know how to merge two values of the same type, we can merge all of them.
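As a rough illustration of that idea outside Kakoune, here is a minimal POSIX shell sketch. The names fold1, add and concat are made up for this example; fold1 simply merges its arguments pairwise with whatever two-argument function it is given.

```shell
#!/bin/sh
# fold1 OP ARG...: merge all ARGs pairwise with the two-argument function OP.
# (Hypothetical helper names, for illustration only.)
fold1() {
    op=$1; shift
    acc=$1; shift            # the first value is the starting accumulator
    for x in "$@"; do
        acc=$("$op" "$acc" "$x")
    done
    printf '%s\n' "$acc"
}

add()    { echo $(( $1 + $2 )); }          # integers under addition
concat() { printf '%s%s\n' "$1" "$2"; }    # strings under concatenation

fold1 add 1 2 3 4        # prints 10
fold1 concat ab cd ef    # prints abcdef
```

The same fold1 works for both cases because it only needs to know how to merge two values of the same kind.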

We already know that selections are monoids with respect to the operations on marks, so let’s see what we can do with them. First, consider the following code:

declare-user-mode reduction
map global reduction u ': reduce u<ret>' -docstring 'take the union'
map global reduction < ': reduce <lt><ret>' -docstring 'take the first'
map global reduction > ': reduce <gt><ret>' -docstring 'take the last'
map global reduction + ': reduce +<ret>' -docstring 'take the longest'
map global reduction <minus> ': reduce -<ret>' -docstring 'take the shortest'
# Define a `<a-r>` key to apply reductions
map global normal <a-r> ': enter-user-mode reduction<ret>' -docstring 'reduce selections'

# The `reduce` command is the equivalent of Haskell's `foldr1`. It takes as
# argument the operation to be used as the monoid operation.
define-command -hidden -params 1 reduce %{
    # Save the head to the mark register. It's going to be the base case of the recursion
    execute-keys -draft -save-regs /"|@ <space>Z
    # Get the tail of the list
    execute-keys <a-space>
    # Call the recursive function applied to the tail
    reduce-with-base-case %arg{1}
    # Restore the accumulated selection from the marks register
    execute-keys z
}

# The `reduce-with-base-case` is the equivalent of Haskell's `foldr`
define-command -hidden -params 1 reduce-with-base-case %{
    try %{
        # Merge the head with the value accumulated in the marks register
        # using the operation provided as argument.
        execute-keys -draft -save-regs /"|@ %sh{ printf '<space><a-z>%sZ' "$1" }
        # Get the tail of the list
        execute-keys <a-space>
        # Recurse
        reduce-with-base-case %arg{1}
    }
}

Let’s see the effects of using this new <a-r> key. Consider the following scenario, a quote from Mia Couto:

O [ar] é [uma] [pele], [feita] de [poros] [por] onde [escoa] [a] luz, [gota] [por] [gota], como um suor [solar].

If I then press <a-r>u, the selections are reduced to the union of them all:

O [ar é uma pele, feita de poros por onde escoa a luz, gota por gota, como um suor solar].

This is a generalisation of the <a-_> key that also works on non-contiguous selections.

But, if I choose <a-r>+, the selections are reduced to the longest of them:

O ar é uma pele, feita de poros por onde escoa a luz, gota por gota, como um suor [solar].

Since reduce is implemented as a right fold (that is, from right to left), it takes the rightmost one from the longest selections.
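To make the tie-breaking concrete, here is a small POSIX shell sketch of a right fold that keeps the longer of two strings. It only models the behaviour described above on plain words: longest_r is a made-up name, and ${#x} may count bytes rather than characters in some shells, so treat it as reliable for ASCII words only.

```shell
#!/bin/sh
# longest_r WORD...: right fold keeping the longest word.
# The result for the tail is computed first, and the head only wins when it
# is strictly longer, so the rightmost of the longest words survives.
longest_r() {
    if [ "$#" -eq 1 ]; then
        printf '%s\n' "$1"           # base case: a single word
        return
    fi
    head=$1; shift
    rest=$(longest_r "$@")           # recursion runs in a subshell
    if [ "${#head}" -gt "${#rest}" ]; then
        printf '%s\n' "$head"
    else
        printf '%s\n' "$rest"
    fi
}

longest_r ar uma pele feita poros por escoa a gota por gota solar   # prints solar
```

Here feita, poros, escoa and solar all have five letters, and the strict comparison is what lets solar win.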

I can also choose to reduce to the shortest one (<a-r>-):

O ar é uma pele, feita de poros por onde escoa [a] luz, gota por gota, como um suor solar.

As you can see, we already can do some interesting folding operations, but we still can’t sum a list of numbers like in your example. So, let’s see what is still missing.

Folding to a scalar

Selections wrap text but are not text themselves. Currently, marks only work at the selection level: they can’t unwrap the text that’s inside a selection. It’s like having a list of Maybe and only being able to operate on the Maybes but not on the values they hold, as in the following Elm code:

[Just 3, Just 21, Just 4]
    |> foldr (\maybe accumulator -> if maybe == Nothing then Nothing else accumulator) (Just 17) 

The above code reduces the list to a Just 17. If I can’t inspect what is wrapped inside the Maybe monad, that’s pretty much what I can get. I can’t, for example, sum the integers the Maybes hold.

It turns out we need to define another operation for marks to be able to express more things with it. By doing so, we extend the ways by which selections can be monoids.

So, let’s imagine we have a pipe (|) operation for marks. The exact API should be thought through more carefully, but for the moment suppose this operation allows us to call an external tool, passing the value inside the ^ register as $accumulated and the value inside the selection as $selection on the command line; then, the union (as in <a-z>u) of the mark and the selection is replaced (c) by whatever this external tool prints to stdout. That is, if I have the text:

O ar é uma [pele]

and the ^ register contains the selection [ar], then after executing <a-z>| external-tool $accumulated $selection, the text

ar é uma pele

will be replaced by whatever the external-tool prints to the stdout. Say it prints mar, quando quebra na praia, é bonito. Then, we end up with:

O [mar, quando quebra na praia, é bonito]

Now, it’s enough to declare:

map global reduction | ': reduce |' -docstring 'reduce using an external tool'

To keep the example of the lists of numbers we were working with in this topic, having:

[12], [16], [18], [22], [28], [30]

If we execute <a-r>| echo "$accumulated + $selection" | bc, we then get:

[126]

As we wish.
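Since this pipe operation doesn’t exist yet, the following POSIX shell sketch only models the idea outside Kakoune. reduce_pipe is a hypothetical name; it runs a command with $accumulated and $selection in its environment and replaces the accumulator with the command’s output. It walks the list from left to right and uses shell arithmetic instead of bc so that it has no external dependency.

```shell
#!/bin/sh
# reduce_pipe CMD ARG...: hypothetical model of the proposed `|` reduction.
reduce_pipe() {
    cmd=$1; shift
    accumulated=$1; shift            # base case: the first value
    for selection in "$@"; do
        # Run CMD with $accumulated and $selection in its environment and
        # replace the accumulator with whatever it prints to stdout.
        accumulated=$(accumulated=$accumulated selection=$selection sh -c "$cmd")
    done
    printf '%s\n' "$accumulated"
}

reduce_pipe 'echo $(( $accumulated + $selection ))' 12 16 18 22 28 30   # prints 126
```

Where bc is installed, passing 'echo "$accumulated + $selection" | bc' as the command should behave the same way.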

Remember that our reduce command goes from right to left. For this kind of operation, it’s probably more intuitive to accumulate from left to right. Fortunately, the change is straightforward:

define-command -hidden -params 1 reduce %{
    execute-keys ) # That's all we need to convert our `foldr` to a `foldl`
    execute-keys -draft -save-regs /"|@ <space>Z
    execute-keys <a-space>
    reduce-with-base-case %arg{1}
    execute-keys z
}
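For readers less used to the foldl/foldr vocabulary, this POSIX shell sketch (with made-up names pair, foldl1 and foldr1) shows how the two directions group the same values. It is unrelated to Kakoune’s internals and only prints the nesting.

```shell
#!/bin/sh
# pair A B: a merge operation that just records how values were grouped.
pair() { printf '(%s %s)\n' "$1" "$2"; }

# foldl1: accumulate from left to right.
foldl1() {
    acc=$1; shift
    for x in "$@"; do
        acc=$(pair "$acc" "$x")
    done
    printf '%s\n' "$acc"
}

# foldr1: accumulate from right to left (the tail is reduced first).
foldr1() {
    if [ "$#" -eq 1 ]; then
        printf '%s\n' "$1"
        return
    fi
    head=$1; shift
    pair "$head" "$(foldr1 "$@")"
}

foldl1 12 16 18 22   # prints (((12 16) 18) 22)
foldr1 12 16 18 22   # prints (12 (16 (18 22)))
```

For addition both groupings give the same sum; the direction only matters for operations where order or tie-breaking is observable.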

This new key could also be used for operations other than on numbers. Say I have a list of words I want to check against a blacklist provided by an external tool and keep only those allowed, deleting the rest. I could select them all and execute <a-r>| external-tool $accumulated $selection, provided the external tool prints the concatenated list of good words to the stdout.

Let’s take a moment to appreciate the elegance of Kakoune. A small addition to a seemingly unrelated feature (operations on marks) and suddenly we have all we were asking for!

Folding to another list

The fold is indeed a powerful function. As you said, many functions can be expressed in terms of folds, like map, filter, length, all… But it can do it only because the accumulated value can be anything (even other data structures), not just scalars. Here’s how we can reverse a list in Haskell:

reverse = foldl (\accumulated x -> x : accumulated) []

Note that the accumulated value is a list here, not a scalar. We can say reduce and fold are not good words to express this kind of operation anymore (fold is not a good word in any case, by the way). So let’s call it aggregate for now, just to make it clear we are working on a slight variation of the preceding operation.

How can we get at least some of this power in Kakoune? How can we make folds that produce another list of selections as the accumulated value? It’s hard to say. But maybe I can give some insight into how to build a somewhat useful aggregate command.

To do so, now we must consider that the accumulated value stored in the marks register is a list of selections, not just a single one. What we need is an aggregate command that, at each selection:

  • takes the head of the list stored in the marks register and the current selection and calls an external tool passing these two values as arguments: external-tool $head-of-marks $selection;
  • this external tool is expected to print to the stdout one of the following values: u, a, <, >, + or -, each one corresponding to an operation on marks (union, append, take leftmost, take rightmost, take longest, take shortest);
  • the command then merges the head of the marks and the current selection according to the operation indicated by the output of the external tool;
  • the result of such a merge then replaces the head of the marks.

It’s better to see it in action. Consider the following scenario, quoted from Machado de Assis:

Palavra [puxa] [palavra], uma [ideia] traz outra, e [assim] se faz um [livro], um [governo], ou uma [revolução].

That is, I have the following list of selections: [puxa], [palavra], [ideia], [assim], [livro], [governo], [revolução].

  • our aggregate command starts by putting the head of the selections in the marks register:
    • marks: [puxa];
    • selections: [palavra], [ideia], [assim], [livro], [governo], [revolução];
  • it then calls external-tool passing [puxa] and [palavra] as arguments: external-tool puxa palavra;
  • say it returns a, meaning it wants the selection to be appended to the marks;
  • we now have:
    • marks: [puxa], [palavra];
    • selections: [ideia], [assim], [livro], [governo], [revolução];
  • next turn: external-tool palavra ideia;
  • it gives us u;
  • now we have:
    • marks: [puxa], [palavra, uma ideia];
    • selections: [assim], [livro], [governo], [revolução];
  • then: external-tool "palavra, uma ideia" assim;
  • we receive -;
  • which results in:
    • marks: [puxa], [assim];
    • selections: [livro], [governo], [revolução].

And so on.

We could end up with something like this:

Palavra [puxa] palavra, uma ideia traz outra, e [assim] se faz um [livro, um governo, ou uma revolução].
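The walk-through above can be approximated outside Kakoune with a POSIX shell sketch. Everything in it is an assumption made for illustration: selections are modelled as START:END offsets that are sorted and non-overlapping, aggregate, len and near are invented names, near stands in for the external tool, and ties for + and - keep the existing mark.

```shell
#!/bin/sh
# len RANGE: length of a START:END range.
len() { echo $(( ${1##*:} - ${1%%:*} )); }

# near MARK SEL: stand-in for the external tool. Prints `u` when the two
# ranges are at most 2 characters apart, `a` otherwise.
near() {
    gap=$(( ${2%%:*} - ${1##*:} ))
    if [ "$gap" -le 2 ]; then echo u; else echo a; fi
}

# aggregate DECIDE RANGE...: prints the resulting marks, space-separated.
aggregate() {
    decide=$1; shift
    marks=$1; shift                         # base case: the first selection
    for sel in "$@"; do
        last=${marks##* }                   # the most recently added mark
        case $marks in
            *' '*) init="${marks% *} " ;;   # every mark but the last one
            *)     init= ;;
        esac
        case $("$decide" "$last" "$sel") in
            a)   merged="$last $sel" ;;                 # append
            u)   merged="${last%%:*}:${sel##*:}" ;;     # union
            '<') merged=$last ;;                        # take leftmost
            '>') merged=$sel ;;                         # take rightmost
            +)   if [ "$(len "$sel")" -gt "$(len "$last")" ]; then merged=$sel; else merged=$last; fi ;;
            -)   if [ "$(len "$sel")" -lt "$(len "$last")" ]; then merged=$sel; else merged=$last; fi ;;
            *)   merged=$last ;;                        # unknown answer: keep the mark
        esac
        marks="$init$merged"                # the merge replaces the last mark
    done
    printf '%s\n' "$marks"
}

aggregate near 0:3 5:8 20:24 26:30   # prints 0:8 20:30
```

In this run the first two ranges are joined, the third is appended because it is far away, and the fourth is joined to the third, which mirrors how the marks list grows and shrinks in the example.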

Conclusions

what are your thoughts about reduction/folding in Kakoune in general

Some examples here seem too complicated to be used when editing. This makes me wonder whether it would be worth investing in such a concept. On the other hand, we must take into account that the same language used for live text editing in Kakoune is also used to script it, and perhaps for scripting it could allow some interesting extra powers.

In any case, I think it was a fun conceptual exercise to do :smile:


That’s true. But, on the other hand, you can have a list of marks overlap a list of selections. You can even take the intersection of them if you want, you can take the shortest or the longest ones and so on… Have you already tried to use marks to solve your problem? I mean, the plain marks, not the concepts I sketched in the reply to @Delapouite.

Hi, @bravekarma!

I’ve seen this argument a few times and, while it’s true a Lua interpreter isn’t shipped by default on most machines the way Python or Perl interpreters are, I do think this problem is a bit overstated.

Take for instance some of our most popular plugins: kak-lsp, parinfer and kak-tree depend on Rust, connect.kak has an optional dependency on Crystal, and kakboard depends on external tools for communicating with the clipboard that are usually not pre-installed. And nobody looks at these dependencies as a problem.

A standalone static Lua binary with everything included is just 300 kB in size. It’s way easier to install than Rust, Cargo and Crystal. Actually, bootstrapping it is way easier than any other language I’m aware of.

So, personally, I don’t see a dependency on Lua as a problem for writing a plugin :slightly_smiling_face:
