Beyond Vim and Emacs: A Scalable UI Paradigm

Sid Kasivajhula

A practiced dexterity with the arcane incantations known as keybindings is the true mark of the veteran Emacs user. Yet, it takes years to get there, and if you tried to explain what you were doing there, nobody would understand, least of all those Vim users who would say that the whole enterprise was foolhardy to begin with. They don't get it, those fools. Let them flounder about in their "normal mode." Normal isn't good enough for me! I want exceptional, IDEAL, I want… glorious mode, that's what I want. And the only thing that'll cut it is if I do it … my way. Why, with my precious .emacs.d, I'm invincible! Well… just between you and me, there are times when learning new keybindings every time someone makes a new toy gets to be a bit of a drag, and some days I can't keep my C-c's and my C-c C-c's straight if I'm being honest with you, but you'll never catch me admitting it! I do wonder if there's a better way to get to glorious mode, even though my .emacs.d is already perfect (of course).

If this secretly sounds like you, then rejoice, there just might be a new way, a better way! And you could potentially get there in days instead of years, so that even your script kiddie coworker with their "VSCode" (groan) may at last come around to your way of looking at things, and, maybe, just maybe, even those Vim users (hiss!)!

"Epistemic" Emacs is a user interface paradigm based on treating aspects of the user interface as conceptual entities that can be reasoned about in terms of a standard language. Essentially, instead of learning keybindings for each specific action, you learn keybindings for general, conceptual habits, kind of like Vim, except that instead of reasoning only about text, you reason about any aspect of your interaction with the machine, whether it's windows or buffers or even those interactions themselves. The promise of this approach is that you just learn a simple language once, and you can then apply it to vastly different aspects of your user interface, with the same keybindings doing different things in different contexts, in sensible and predictable ways. And in principle, whenever that new toy technology comes around, anyone could extend the UI language to apply to it in a matter of minutes, and you'd already know how to use it.

Actual start and end time (EST)

  • Start: 2020-11-28T11.00.47
  • Q&A: 2020-11-28T11.18.12
  • End: 2020-11-28T11.24.51


Can minor-modes in Emacs be integrated via chimera as a "mode"?

Good question. If it is already a "modal"-like minor mode, then we could potentially do it this way. But in general, it could make sense to couple minor modes to rigpa "modes", towers (sets of modes), or complexes (sets of towers), so that entering those modes/towers would enable those minor modes, and likewise disable the minor modes upon exiting. E.g. for Lisp editing, we might want to enable the symex / paredit minor mode in Lisp tower, and disable it upon swapping to Vim / Emacs tower.
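
The coupling described in this answer can be sketched as entry and exit hooks on a tower. This is only an illustration in Python rather than Emacs Lisp; the class names, the `switch_to` method, and the choice of `paredit` as the coupled minor mode are assumptions made for the example, not rigpa's actual API.

```python
class Tower:
    """Hypothetical tower: an ordered set of modes plus coupled minor modes."""

    def __init__(self, name, modes, minor_modes=()):
        self.name = name
        self.modes = list(modes)
        self.minor_modes = set(minor_modes)


class Session:
    """Tracks the active tower and which minor modes are enabled."""

    def __init__(self):
        self.tower = None
        self.enabled = set()

    def switch_to(self, tower):
        # Exit hook: disable the minor modes coupled to the tower we leave.
        if self.tower is not None:
            self.enabled -= self.tower.minor_modes
        # Entry hook: enable the minor modes coupled to the tower we enter.
        self.enabled |= tower.minor_modes
        self.tower = tower


lisp = Tower("lisp", ["insert", "symex", "normal"], minor_modes={"paredit"})
vim = Tower("vim", ["insert", "normal"])

session = Session()
session.switch_to(lisp)  # paredit is now enabled
session.switch_to(vim)   # paredit is disabled again
```

Note that this simple version would also switch off a minor mode the user had enabled independently; a fuller design would need to remember who enabled what.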

Do you think it would be hard for people to remember all the modes and bindings?

  • Bindings, no - it would be easier than currently because the bindings generally stay the same across modes (e.g. hjkl always means left down up right, and there are other conventions).
  • Modes, if the tower is 2-3 tall, then it's not a problem at all. Totally intuitive. For > 3 it might be hard, so I think in practice you would alternate across more small towers rather than have fewer big towers.
  • Also, most modes are always available via a "direct access" keybinding (e.g. s-w = window mode), so you can jump to one at any time, and it'll return you to your original position in the tower when you exit. Modes don't need to be in the current tower in order for you to use them. But if you're using them frequently you might want to add them or temporarily switch to a tower that has them – whatever feels the most natural for the specific case.
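
The "direct access" behavior from the last point can be modeled as a detour that leaves the position in the tower untouched. The sketch below is in Python with hypothetical names (`TowerPosition`, `jump_to`, `exit_jump`); it shows the idea, not how rigpa implements it.

```python
class TowerPosition:
    """Hypothetical position in a tower, with an optional direct-access detour."""

    def __init__(self, modes, level=0):
        self.modes = list(modes)  # index 0 is the ground level
        self.level = level
        self.detour = None        # mode entered via direct access, if any

    @property
    def current(self):
        return self.detour if self.detour is not None else self.modes[self.level]

    def jump_to(self, mode):
        # Direct access: the mode need not be part of this tower.
        self.detour = mode

    def exit_jump(self):
        # Leaving the detour restores the original position in the tower.
        self.detour = None


pos = TowerPosition(["insert", "normal"], level=1)
pos.jump_to("window")
during_jump = pos.current  # "window"
pos.exit_jump()
after_jump = pos.current   # "normal"
```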

Are you familiar with ? And other earlier implementations. A short comparison would be nice.

Not familiar with this, but it looks very interesting. From a quick look, I can say that versors is partially related to rigpa, in that its "cursors" roughly correspond to noun modes. Rigpa isn't limited to noun modes, though. For instance Vim's normal mode contains many nouns and a special command language. On the other hand, Emacs's usual editing behavior doesn't think in terms of nouns at all and has a myriad of ad hoc keybindings. Yet, both are rigpa modes, along with modes like window-mode and buffer-mode which each correspond to individual nouns (like versor). Rigpa is less about the nature of the modes (about which it is relatively unopinionated, although noun-specific modes may be a common choice) than it is about the relationship between modes, the ability to structure them and interrelate them and configure them on the fly.

What package is used?

Why is the package called rigpa?

A reference: (knowledge of the ground).

How to deal with Dvorak (et al.) layouts? This has always bugged me. Is there a "XModmap Mode"…?

  • Vim users don't remap their keys. The home row is not a big deal, actually.
  • Hm… I've always found it a bit of an obstacle but haven't tried hard! hjkl → jk makes sense but hl, not so much.
    • The day you want to do this, you'll absolutely be able to do it and have it become natural. Just gotta want it :)

I mostly use the default model provided by vanilla Emacs and work in Org mode for text editing. Can you give some examples, e.g. how can the user use the concept of "mode of mode" to do some interesting editing?

  • The more modes you have, the shorter the individual keystrokes become.
    • ^ Not to be a pain but my comment about Dvorak is related :-)
  • There are many bindings in Org mode (e.g. agenda manipulation, manipulating headings and subheadings, promoting/demoting) that would be a natural fit for a dedicated modal interface. At the moment you probably use only a subset of all of the available options because of the constraints of conveniently (1) knowing about, (2) remembering and (3) using the bindings. With a dedicated mode, you could edit Org buffers using a Vim-like modal interface where all of the options are easy to remember and use.
  • Mode mode / tower mode could be useful if you are doing literate programming or "multi-modal" Org buffers where you have many different languages embedded within the Org file. In this case, you could modify your tower using mode mode, or swap between different towers, to quickly have the right modes for different parts of the file.

How do new modes come into existence?

  • Modes from any modal interface provider are supported via a modal interface abstraction layer ("chimera").
  • You can define new modes as a hydra or as an evil state, and then they just need to be "registered" with the framework via a function call for them to be incorporated.

Is this built on top of Hydra?

  • Any modal interface provider is in principle supported. There is an abstraction layer called "chimera" that allows any provider to be used as long as it implements an interface (e.g. including indicating entry and exit hooks for each mode).
  • Some of the modes are evil modes (e.g. normal, insert).
  • Others are hydras (window, buffer, etc.). (Including Symex? Yes, Symex too.)
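
One way to picture such an abstraction layer is a small registry that wraps each provider's mode in a common interface with entry and exit hooks. The Python sketch below is an assumption-laden illustration: `ModeAdapter`, `register_mode`, and `switch_mode` are made-up names, and the hooks merely log events where a real provider would activate an evil state or a hydra.

```python
class ModeAdapter:
    """Hypothetical common interface that each provider's mode is wrapped in."""

    def __init__(self, name, provider, on_enter, on_exit):
        self.name = name
        self.provider = provider  # e.g. "evil" or "hydra"
        self.on_enter = on_enter  # entry hook
        self.on_exit = on_exit    # exit hook


REGISTRY = {}


def register_mode(adapter):
    """Make a mode known to the framework, whatever its provider."""
    REGISTRY[adapter.name] = adapter


def switch_mode(old, new, log):
    """Run the exit hook of the old mode, then the entry hook of the new one."""
    if old is not None:
        REGISTRY[old].on_exit(log)
    REGISTRY[new].on_enter(log)
    return new


def make_hooks(name):
    # Stand-ins for provider-specific activation and deactivation code.
    return (lambda log: log.append(f"enter {name}"),
            lambda log: log.append(f"exit {name}"))


for name, provider in [("normal", "evil"), ("insert", "evil"), ("window", "hydra")]:
    enter, exit_ = make_hooks(name)
    register_mode(ModeAdapter(name, provider, enter, exit_))

events = []
current = switch_mode(None, "normal", events)
current = switch_mode(current, "window", events)
```

Because the framework only ever calls the hooks, it does not need to know whether a given mode is backed by evil, hydra, or something else.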

Which retro theme are you using?

green phosphor.

Will this involve defining more epistemic-modes for non-editable buffers like Dired? How do you deal with the explosion of the number of modes?

  • This is a great question, so here is a long answer:
    • I am keen to keep this extension lightweight so that it plays well with existing Emacs tools without needing a custom ecosystem. The modal interface abstraction layer "chimera" would be a big part of this, enabling existing modal-like interfaces to be recognized in the framework out of the box, meaning that they would be automatically "wired into" the broader framework via the standard exits (e.g. Escape and Enter).
    • I'm not sure what the best way to handle dired would be, but if it could be handled in this way, then that would be the way to do it.
    • The "complex" of towers initially available is tied to the major mode, which takes away some of the complexity right off the bat. E.g. when you open a Lisp file it gives you a Lisp-related + general-purpose complex of towers.
    • The idea is to support the "explosion" of modes, but make it scale well by (1) having them be structured, and (2) the structure being the same at every level.
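
As a rough illustration of tying the initial complex to the major mode, the Python sketch below looks up mode-specific towers and appends general-purpose ones. The tower contents and the `initial_complex` function are hypothetical, chosen only to mirror the Lisp example above.

```python
# General-purpose towers offered in every buffer (illustrative contents).
GENERAL_TOWERS = [["insert", "normal"], ["emacs"]]

# Hypothetical per-major-mode towers; the entries are assumptions.
MODE_SPECIFIC_TOWERS = {
    "emacs-lisp-mode": [["insert", "symex", "normal"]],
}


def initial_complex(major_mode):
    """Return the towers initially available in a buffer with this major mode."""
    return MODE_SPECIFIC_TOWERS.get(major_mode, []) + GENERAL_TOWERS


lisp_complex = initial_complex("emacs-lisp-mode")
text_complex = initial_complex("text-mode")
```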

How do you deal with the mental overhead of keeping a stack of modes and your position in it? While this simplifies the actual editing process by defining them as a single set of keybindings, the complexity is transferred to navigating modes.

  • While the complexity is transferred, the nature of that complexity is different. In the case of keybindings, the complexity is unstructured and ad hoc, whereas in the case of mode navigation, it's a matter of "going to the right place" for your keys to have the right meaning.
  • In practice you would only have towers of size 2-3 I would guess, with every other mode jump always being available via an ad hoc jump (e.g. even in Vim tower, you can always jump to Window mode and it would return you to the original mode you were in upon exit).
  • And the main paradigm would be swapping between small towers.


  • Indra's Net:
  • "We are at a higher level looking down at the text, we can describe this text…".
  • "There is a way to go down to ground level, and a way to escape from that to the referential level".
  • "All of the nouns of the world of text are available".
  • …. Or you could have a dedicated mode for every noun — Nouns as modes.
  • Character, Word, Line mode; Window mode! All with the same basic keystrokes.
  • "Rumpelstiltskin Principle" from CS — if you can name something you have power over it.
  • modes of modes → "Mode mode" (the modes that are present in the buffer).
    • Such a refreshing point of view.
  • Tower mode → ?? "There are many towers available for use in different buffers".
  • Demos "Strange Loop".
  • Two directions: sideways changes perspective (normal, word, line) all different perspectives; up or down (takes you through meta levels).
  • Unknown meta level → same basic interactions.
  • formerly called epistemic-mode, now called rigpa (concept in Tibetan Buddhism, in Dzogchen teaching, or the great completion).
  • Similar idea from


[00:00:02.960] "Far away in the heavenly abode of the great god Indra, there is a wonderful net which has been hung by some cunning artificer in such a manner that it stretches out infinitely in all directions. In accordance with the extravagant tastes of deities, the artificer has hung a single glittering jewel in each eye of the net, and since the net itself is infinite, the jewels are infinite in number. There hang the jewels, glittering like stars in the first magnitude, a wonderful sight to behold. Were we to select one of these jewels for inspection, we would discover that in its polished surface there are reflected all the other jewels in the net, infinite in number. If we look still more closely, we would see that each of the jewels reflected in this one jewel reflects all the others." This is the metaphor of Indra's Net, which is told in some schools of philosophy. Let's keep this metaphor in mind, because it'll help us understand the Emacs extension that we're about to discuss.

[00:01:06.960] In editing text, there's two main paradigms: one is editing at the ground level, where the characters that we type actually appear on the screen, the changes we make actually occur. The other editing paradigm is where we escape to a higher level and now the characters that we type are not... They don't actually appear on the screen because we're not at the ground level with the text, we are at a higher level looking down at the text and regarding the text, referring to this world of text in terms of a language.

[00:01:56.159] For instance, we could describe this world as having words and paragraphs and sentences and lines and so on. We could reason about this text in terms of these textual entities and this textual language. This is the second paradigm of text editing. When we're in the second paradigm, there is a way to go down to ground level. You hit Enter now--or we'll hit Enter to go down to the ground level, and you can hit Escape to go back out to the referential level. Enter to go down to ground level and Escape to go up to the referential level.

[00:02:40.160] Now, in Vim, the nouns in this world of text all share the same referential plane which we call normal mode. So in normal mode, all of the nouns of the world of text are available, whether it's words or sentences or paragraphs, and they all share this same referential plane. They compete for space on the keyboard.

[00:03:12.720] An alternative way to structure these modes is instead of having a single mode where all the nouns coexist, peacefully or otherwise, you instead have a dedicated mode for every noun. In that case, what happens is because your modal spaces are now much smaller, you're just talking about words or paragraphs or lines or something, the keys that you use can be much more targeted. You can use the same keystrokes in all of your modes and they would have the same ideas behind them, but they would have different effects depending on which context you're using. It's the same keystrokes, different contexts. The advantage of that is it's often easier to change context than it is to learn new key bindings.
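
The "same keystrokes, different contexts" idea can be sketched in a few lines of Python. Here each noun mode only defines how text is split into its units; the operations themselves are written once. The mode table and function names are assumptions for illustration, and the word mode in this toy version normalizes whitespace when rejoining.

```python
# Each noun mode says how to split text into its units and join them back.
NOUN_MODES = {
    "character": (list, "".join),
    "word": (str.split, " ".join),
    "line": (str.splitlines, "\n".join),
}


def delete_unit(text, mode, index):
    """Apply one 'delete' keystroke to the unit at `index` in the given mode."""
    split, join = NOUN_MODES[mode]
    units = split(text)
    del units[index]
    return join(units)


def swap_with_next(text, mode, index):
    """Apply one 'move forward' keystroke: swap the unit with its successor."""
    split, join = NOUN_MODES[mode]
    units = split(text)
    units[index], units[index + 1] = units[index + 1], units[index]
    return join(units)


by_char = delete_unit("abc", "character", 1)
by_word = delete_unit("one two three", "word", 1)
by_line = delete_unit("a\nb\nc", "line", 1)
```

Adding a new noun means adding one entry to the table; the keystroke functions stay unchanged.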

[00:04:07.888] So let's see an example of how that works. We go into character mode, and if you look at the mode line at the bottom of the screen there, you'll see that we're in character mode. Now, when we move up, down, left, and right, we're moving by character. We can also transform the text, and the transformations occur in terms of character. You can also go into word mode. In word mode, the transformations that you do are on words, and you try... Your movement is also in terms of words. So that's the level of granularity that you have. You could also go to line mode. When you're in line mode, you go up and down by line, and you can move lines up and down left and right and so on. The transformations you do are in terms of lines. You could also go to window mode, where now the objects that you're referring to are windows. You can move spatially amongst the windows or do transformations on the windows using the same keystrokes. So let's go to...

[00:05:28.720] Right. One of the things, the principles at play here is something called the Rumpelstiltskin principle, which is something that's known in computer science. If you can name something, then you have power over it. This is kind of an adaptation of that principle which says that if you can name something and if you can talk about it, then it's a noun in your editing language. If it's a noun, then it has... It's a mode. So if we can talk about it, it's a noun. If it's a noun, then it's a mode.

[00:06:04.818] One of the things we've been talking a lot about is modes. In fact, by this principle, modes also should be a mode. You should have a mode that can reason in terms of modes as objects, just like you have modes where you can reason in terms of words or lines as objects.

[00:06:26.560] So let's do that. Let's go to mode mode. When you go to mode mode, you see that the objects that are depicted here are the modes that are present in the buffer, which we knew about because the style of editing that we had in this buffer was the Vim style of editing where there's an insert mode at the ground level and a normal mode that you can escape to. You insert, enter the ground level. Enter to the insert mode and escape to normal mode. When you look at the mode mode representation, you see that in fact that is the structure that's depicted. But in different situations, you might find that these modes are not the ones that you want. You want something more tailored for the specific application. For instance, if you're editing Lisp code (or code in general, but Lisp code is a particular example), you might want to take advantage of the structure of the code. For Lisp code in particular, we have a mode called symex-mode which is able to reason about your code in terms of its tree structure. So you can use the same keystrokes: hjkl goes left, right, up, and down, but you also have other keystrokes that are more specialized to the application. You can run the code. We'll see that happen here in a minute. You can make changes to it really quickly and see the effects of those changes. You're doing this all in a mode that's convenient for this particular application, which is editing Lisp code, and that is, in this case, symex-mode.

[00:08:28.960] Typically, when you're editing code like this, you'd want to be in insert mode actually typing out the code, and then you'd want to escape to symex mode rather than normal mode, and then you could escape again and you'd end up in normal mode. So this, if we go to mode mode, we see is depicted as this tower where insert is at the bottom and normal is at the top, but symex-mode is in between the two. You could also change that if you like. If you don't want symex-mode to be there, you could just move it to the top. Now you find symex is at the top and you enter down to normal. You can see it on the status bar at the bottom there. Enter to insert, escape to normal, escape to symex. In fact, you can even add more modes if you don't like the existing ones. Now we have an additional mode here. We have window mode. It goes down to symex, it goes down to normal. Enter to insert, escape to normal, escape to symex, escape to window.
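
A tower as demonstrated here behaves like an ordered list with a cursor: Escape moves up, Enter moves down, and the list itself can be rearranged. The following Python sketch models that under stated assumptions; the `Tower` class and its method names are hypothetical, and clamping at the top and bottom is a simplification.

```python
class Tower:
    """Hypothetical tower: modes ordered from ground level (index 0) upward."""

    def __init__(self, modes):
        self.modes = list(modes)
        self.level = 0

    @property
    def current(self):
        return self.modes[self.level]

    def escape(self):
        # Escape moves one level up; at the top it has no effect.
        self.level = min(self.level + 1, len(self.modes) - 1)
        return self.current

    def enter(self):
        # Enter moves one level down; at the ground level it has no effect.
        self.level = max(self.level - 1, 0)
        return self.current

    def move_to_top(self, mode):
        # Editing the tower itself, as one would in "mode mode".
        self.modes.remove(mode)
        self.modes.append(mode)


tower = Tower(["insert", "symex", "normal"])
path_up = [tower.escape(), tower.escape(), tower.escape()]

reordered = Tower(["insert", "symex", "normal"])
reordered.move_to_top("symex")
path_after_edit = [reordered.escape(), reordered.escape(), reordered.enter()]
```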

[00:09:33.600] So we've talked... Okay, so another thing actually to note here is that in editing modes, if you look at the mode line at the bottom of the screen, you'll see that we are currently, in this buffer, we are currently in line mode. I'm going to hit Enter now and you'll see that when I hit Enter, nothing is happening. It's still in line mode. If you hit Escape, it's still in line mode. You can find out the reason for that by taking another meta jump out of this. You'll see that, in fact, the reason is that we're currently in line mode, and line mode is the only one available in this tower for editing the modes that are in operation in your ground level. In fact, line mode is all you need here, because this is just the nature of how these modes are laid out is in rows. So line mode is the most appropriate thing here. But you could change it to something else if you like.

[00:10:40.959] Now we've seen two towers. We've seen the Vim tower and we've seen also the symex tower, the Lisp tower. It turns out that, because we've been talking about towers now, by the Rumpelstiltskin principle, towers also can be talked about, and therefore they also are a mode. So how do we go to tower mode? The way we go to tower mode is we go in a slightly different direction, and we find that we are now in tower mode. We see that there are many towers available. We're now... We're seeing several possible towers that we have written to be available and for use in different buffers. You can edit them on the fly. For instance, let's enter this tower. Now you see that in the bottom of the... In the mode line, you see that we're going across all of these different modes that were in the tower. You could escape and you could even move things around. You could put window mode all the way at the bottom, right above insert mode. Let's see that happen. There it is, window is right above insert, and so on. The tower always reflects your current position, so if you're in buffer mode here and you go down to line mode, when you go back to mode mode, you see that we are in line mode. But in practice, you wouldn't have a tower this elaborate because you'd rather have several smaller towers you enter, that you alternate between.

[00:12:33.360] Okay. So one other thing of interest here is that when you're in tower mode, if you look at the status line at the bottom there, we are currently in buffer mode while we are in tower mode. Tower mode actually isn't a mode really. Neither is mode mode. They're really referential planes or meta planes. In any case, you can see that we're in buffer mode. We can take a meta jump out of this to confirm that buffer mode is the only mode available when we're editing towers because that's the one we need, given that our towers are represented in individual buffers.

[00:13:23.200] Right. So let's see where we're at. Rumpelstiltskin principle... We talked about mode mode. We talked about the strange loop application of ground level modes in meta levels. We saw the different towers, and in fact, we're currently in Vim tower, where you can go to Emacs tower. Now, with a single keystroke, you can alternate between Emacs and Vim, which are represented-- which are modeled as towers.

[00:14:13.360] So there's... One thing that we've sort of alluded to is that there are two directions that you can travel in when you're going through this framework. One direction is--and we'll visualize it like so... There's two directions you can travel, and you can either go sideways or you can go up and down. If you go sideways, you're changing your perspective. So normal mode, word mode, line mode, window mode, and so on are all different perspectives on your ground editing experience. The other direction you can travel in is up or down, which takes you through meta levels. So you go from the ground level editing experience, up to mode mode, and then up to the tower plane, and so on, and so on.

[00:15:07.040] So this all sounds very complex, but the truth is it's not really that complicated, even though it feels that way. The reason it isn't that complicated is because no matter how many levels up or down you go and no matter where you are, whether you're in at the ground level editing the actual text or whether you're at a meta level, some unknown meta level and you don't know where you are, no matter where you are, the way in which you interact with it is the same at every level. That is the great power of this approach: that all of the different levels are the same. In fact, the complexity of the whole is exactly identical to the complexity of each part, so if you know how to edit words in the ground level buffer and you know how to move lines around using line mode, then you know how to edit any aspect of your editing experience at any level.
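
To make the "same at every level" claim concrete, here is a small Python sketch in which one operation, written once, rearranges lines in a buffer, modes in a tower, and towers in a complex. The data and the `move_down` name are purely illustrative assumptions.

```python
def move_down(items, index):
    """Swap the item at `index` with the one after it; return a new list."""
    items = list(items)
    items[index], items[index + 1] = items[index + 1], items[index]
    return items


lines = ["first line", "second line", "third line"]  # ground level: text
modes = ["insert", "symex", "normal"]                # meta level: a tower's modes
towers = ["vim", "lisp", "emacs"]                    # next meta level: towers

new_lines = move_down(lines, 0)
new_modes = move_down(modes, 1)
new_towers = move_down(towers, 0)
```

The function never needs to know which level it is operating on, which is the sense in which knowing one level means knowing them all.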

[00:16:30.079] So this is a pre-release demo. This doesn't exist on MELPA yet, but you can follow updates at this repo on GitHub. If you can also be a beta tester or something like that, if you like, that would be very helpful. You can learn more about this at, which is where I house the research that I work on. In particular, the research on epistemic levels is what inspired this particular Emacs extension. You can also learn about dialectical inheritance attribution, which is the basis of a new economic system that could be fair and could lead to a prosperous and happy world. You can follow me on Twitter at @countvajhula. That's it! Thank you.

Saturday, Nov 28 2020, ~11:01 AM - 11:21 AM EST
Saturday, Nov 28 2020, ~ 8:01 AM - 8:21 AM PST
Saturday, Nov 28 2020, ~ 4:01 PM - 4:21 PM UTC
Saturday, Nov 28 2020, ~ 5:01 PM - 5:21 PM CET
Sunday, Nov 29 2020, ~12:01 AM - 12:21 AM +08
