You have stumbled on a weird legacy quirk added to the original versions of Unix.
In the beginning was the typewriter, which had a knob to feed paper through the device, and a big metal lever to return the “carriage” back to the left-hand edge of the paper, so the user could resume typing. As technology advanced, somebody figured out how to put the keyboard and the print head at opposite ends of a telephone connection, creating the “tele-typewriter”, or “teletype” or just “TTY” for short.
To keep things mechanically simple, because there were two separate mechanisms involved (move the carriage horizontally, feed paper vertically) the TTY protocol had two separate control codes to trigger them: “carriage return” and “line feed” respectively, usually abbreviated to CR and LF. Early-model TTYs, like the ASR-33, had separate CR and LF keys.
Once computers came along, we needed some way to communicate with them, and since there were so many TTYs lying around, it was natural to plug one in and teach the computer how to use it. However, early computing pioneers were annoyed by having to use both CR and LF to signal end-of-line. What if you just get CR without LF? What if you get LF without CR? What if you get them the wrong way around? What if you get a CR and then nothing, how long do you wait?
Using a single control code for “end of line” would make things much simpler. It would make your computer incompatible with all the existing TTYs, but luckily computers are smart, and you can teach them to automatically translate the computer’s internal scheme when talking with a TTY. Unix’s creators decided that on their system, a bare LF would represent the end of a line (which is why LF on Unix systems is often called “newline” or “NL”), and taught their operating system to do the conversion for every TTY. Thus, even on my modern Linux system:
$ stty -a
speed 38400 baud; rows 62; columns 239; line = 0;
intr = ^C; quit = ^\; erase = ^?; kill = ^U; eof = ^D; eol = <undef>;
eol2 = <undef>; swtch = <undef>; start = ^Q; stop = ^S; susp = ^Z;
rprnt = ^R; werase = ^W; lnext = ^V; discard = ^O; min = 1; time = 0;
-parenb -parodd -cmspar cs8 -hupcl -cstopb cread -clocal -crtscts
-ignbrk -brkint -ignpar -parmrk -inpck -istrip -inlcr -igncr icrnl
ixon -ixoff -iuclc -ixany -imaxbel iutf8
opost -olcuc -ocrnl onlcr -onocr -onlret -ofill -ofdel nl0 cr0 tab0 bs0
vt0 ff0
isig icanon iexten echo echoe echok -echonl -noflsh -xcase -tostop
-echoprt echoctl echoke -flusho -extproc
stty -a prints all the TTY settings for a terminal window I happened to have open. In particular, “speed 38400 baud” means it thinks it’s limited to ~3.8KB/s, “icrnl” at the end of the sixth line means that incoming CR characters are automatically translated to LF, and “onlcr” in the middle of the eighth line means that outgoing LF characters are automatically translated to CRLF.
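To make those two flags concrete, here is a rough Python sketch of the rewriting they ask the kernel to do. The function names kernel_input and kernel_output are made up for illustration, and this is a simplification: the real TTY layer also honours related flags such as igncr and inlcr, which I ignore here.

```python
import termios


def kernel_input(data: bytes, iflag: int) -> bytes:
    """Approximate what happens to bytes arriving from the terminal."""
    if iflag & termios.ICRNL:
        # icrnl: every incoming CR becomes LF
        data = data.replace(b"\r", b"\n")
    return data


def kernel_output(data: bytes, oflag: int) -> bytes:
    """Approximate what happens to bytes a program writes to the terminal."""
    # onlcr only applies while output post-processing (opost) is enabled
    if oflag & termios.OPOST and oflag & termios.ONLCR:
        # onlcr: every outgoing LF becomes CR LF
        data = data.replace(b"\n", b"\r\n")
    return data


if __name__ == "__main__":
    # The Enter key sends CR; with icrnl set, the program reads LF instead.
    print(kernel_input(b"ls\r", termios.ICRNL))
    # The program writes LF; with opost and onlcr set, the terminal gets CR LF.
    print(kernel_output(b"hi\n", termios.OPOST | termios.ONLCR))
```

On a real terminal you would get the flag values from termios.tcgetattr(fd), which returns a list whose first two entries are the input and output flag words.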
You might be thinking “yes, that’s it, that’s why <c-j> is always received as <ret>!” but unfortunately things are a bit more complex than that.
“intr = ^C” means that when the terminal sends a Ctrl-C, the kernel eats it and sends SIGINT to the foreground process. “quit = ^\” means that the kernel converts Ctrl-Backslash into SIGQUIT. “susp = ^Z” means that the kernel converts Ctrl-Z into SIGTSTP, and so forth. All these keys are very useful and important for command-line tools like grep and cat, but they get in the way for full-screen, interactive tools like Kakoune. Therefore, the kernel provides “raw” mode, where all these conversions are disabled and every key the terminal sends is received directly by the application.
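As a sketch of what “raw” means in terms of the settings above, the following Python clears the same groups of flags that the cfmakeraw(3) manual page describes. The helper name make_raw is my own invention, and I am not claiming this is how Kakoune itself sets up the terminal; it only illustrates which switches get turned off.

```python
import termios

# Indices into the list returned by termios.tcgetattr()
IFLAG, OFLAG, CFLAG, LFLAG, CC = 0, 1, 2, 3, 6


def make_raw(attrs):
    """Return a copy of tcgetattr()-style attrs with kernel processing off."""
    raw = list(attrs)
    raw[CC] = list(attrs[CC])  # copy so the caller's list is left untouched
    # Input: no CR/LF rewriting (icrnl etc.), no Ctrl-S/Ctrl-Q flow control
    raw[IFLAG] &= ~(termios.IGNBRK | termios.BRKINT | termios.PARMRK
                    | termios.ISTRIP | termios.INLCR | termios.IGNCR
                    | termios.ICRNL | termios.IXON)
    # Output: no post-processing, which also disables onlcr
    raw[OFLAG] &= ~termios.OPOST
    # Local: no echo, no line buffering, no Ctrl-C/Ctrl-Z/Ctrl-\ signals
    raw[LFLAG] &= ~(termios.ECHO | termios.ECHONL | termios.ICANON
                    | termios.ISIG | termios.IEXTEN)
    # Eight-bit characters, no parity
    raw[CFLAG] &= ~(termios.CSIZE | termios.PARENB)
    raw[CFLAG] |= termios.CS8
    # read() returns as soon as a single byte is available
    raw[CC][termios.VMIN] = 1
    raw[CC][termios.VTIME] = 0
    return raw
```

To apply it to a real terminal you would pass the result to termios.tcsetattr(fd, termios.TCSADRAIN, raw) and restore the saved attributes on exit. Python's standard library also provides tty.setraw(fd), which does a similar job in one call.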
In raw mode, <c-j> sends LF, and Enter sends CR, so they can definitely be distinguished. However, typing on the keyboard is not the only way to send text through the terminal: if you paste text with Ctrl-Shift-V or Shift-Insert, and that text includes a line-break, the line-break will still be an LF regardless of whether the terminal happens to be in raw mode or not. Therefore, Kakoune includes code that treats both LF and CR as <ret>: