Pipe-map, a tool to map out a shell pipeline

Screwtapello · December 14, 2020, 11:46am

The other day I was working on a Kakoune plugin that tried to do something clever with shell pipelines and FIFO buffers, but it just silently hung and I couldn’t figure out why. I suspected I’d somehow set up the pipeline incorrectly, but I didn’t have a way to inspect it and see what I got wrong.

So I threw together a hacky Python script to scrape the Linux /proc directory for processes and their open file descriptors, pick out a process of interest, and map out the pipeline it’s part of. Now I can run a command like:

$ (cat | tr a-z A-Z | sort) 2>/dev/null

…and automatically generate a map of the pipeline like:

I don’t know if this might be useful to anybody else, or even if I’ll ever use it again myself, but I figured I’d toss it into a git repo just in case:
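For readers curious about the /proc scraping idea described above, here is a minimal sketch. It is not the pipe-map script itself; the function names pipe_fds and pipes_by_inode are made up for illustration, and it assumes a Linux system where pipe descriptors show up as symlinks of the form pipe:[inode] under /proc/<pid>/fd.

```python
import os
import re
from collections import defaultdict

# On Linux, a pipe descriptor's symlink target looks like "pipe:[12345]".
PIPE_RE = re.compile(r"^pipe:\[(\d+)\]$")


def pipe_fds(pid):
    """Return {fd_number: pipe_inode} for the pipe descriptors of one process."""
    result = {}
    fd_dir = f"/proc/{pid}/fd"
    try:
        names = os.listdir(fd_dir)
    except OSError:
        return result  # process exited, or we lack permission to look
    for name in names:
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue  # descriptor was closed while we were scanning
        match = PIPE_RE.match(target)
        if match:
            result[int(name)] = int(match.group(1))
    return result


def pipes_by_inode():
    """Map each pipe inode to the (pid, fd) pairs that currently have it open."""
    table = defaultdict(list)
    for entry in os.listdir("/proc"):
        if entry.isdigit():  # numeric directory names are process IDs
            pid = int(entry)
            for fd, inode in pipe_fds(pid).items():
                table[inode].append((pid, fd))
    return dict(table)
```

Two processes that list the same inode are connected by the same pipe, which is the raw material a pipeline map could be drawn from.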

scr · December 14, 2020, 4:58pm

Wow! That’s super cool!
Could you specify exactly which Python version you use? It didn’t work for me with Python 3.6 (__future__ error), nor with 3.7:

Traceback (most recent call last):
  File "./pipe-map.py", line 139, in <module>
    graph = Graph.from_proc_tree(pathlib.Path("/proc"))
  File "./pipe-map.py", line 84, in from_proc_tree
    proc = Process.from_path(each)
  File "./pipe-map.py", line 48, in from_path
    file = File.from_path(each.readlink())
AttributeError: 'PosixPath' object has no attribute 'readlink'

Also, if I understand correctly, one cannot render the graph of a finished pipeline; the pipeline has to be still running. So is it necessary to use, for example:
( cat | my | pipeline )
instead of:
( cat my_input | my | pipeline )?

Screwtapello · December 15, 2020, 1:16am

I happen to be using Python 3.9. I avoided using the dataclasses standard library module, which I know was added only recently, but I guess there were other dependencies I missed.

The __future__ import might be a bit of a pain to remove, but if it works on Python 3.7 that’s probably Close Enough.

I am very surprised that .readlink() was only added to pathlib in Python 3.9. If you add import os at the top of the file (with the rest of the imports), you should be able to change that line to:

file = File.from_path(pathlib.Path(os.readlink(each)))

PRs welcome!
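If the script should run unchanged on both older and newer Pythons, the workaround above could be wrapped in a small helper. This is only a sketch; read_link is a hypothetical name, not something that exists in pipe-map.

```python
import os
import pathlib


def read_link(path):
    """Return a symlink's target as a pathlib.Path on old and new Pythons."""
    if hasattr(path, "readlink"):
        # pathlib.Path.readlink() is available from Python 3.9 onwards.
        return path.readlink()
    # Fallback for earlier versions; os.readlink accepts path objects on Unix.
    return pathlib.Path(os.readlink(path))
```

The failing line would then become file = File.from_path(read_link(each)).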

You can’t render the graph of a finished pipeline, because when the pipeline finishes all the information about it gets cleaned away. Using cat to “hold it open” definitely works, but hitting <c-z> to pause the pipeline while it’s running, sending SIGSTOP to one of the processes, or just getting lucky with the timing all work too.
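As a rough illustration of the SIGSTOP approach, the sketch below starts a child process with a pipe on its standard input and stops it, so that its /proc entry stays around for inspection. The helper names start_paused and resume_and_finish are invented for this example, and it assumes Linux.

```python
import os
import signal
import subprocess


def start_paused(argv):
    """Start a child with a pipe on stdin, then stop it so it can be inspected."""
    child = subprocess.Popen(argv, stdin=subprocess.PIPE)
    # SIGSTOP cannot be caught or ignored, so the child freezes where it is.
    os.kill(child.pid, signal.SIGSTOP)
    return child


def resume_and_finish(child):
    """Let the child continue, close its input, and return its exit status."""
    os.kill(child.pid, signal.SIGCONT)
    child.stdin.close()  # the child sees end-of-file on its stdin
    return child.wait()


if __name__ == "__main__":
    child = start_paused(["cat"])
    # While the child is stopped, its descriptors remain visible in /proc.
    print(os.readlink(f"/proc/{child.pid}/fd/0"))
    resume_and_finish(child)
```

Once the child has exited and been waited on, its /proc/<pid> directory disappears, which is why a finished pipeline cannot be mapped.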

scr · December 15, 2020, 2:43am

file = File.from_path(pathlib.Path(os.readlink(each)))

That worked on Python 3.7. Thanks, I could not have found it myself.