Back to the talks Previous by track: Gypsum: my clone of Emacs and ELisp written in Scheme Next by track: p-search: a local search engine in Emacs Track: Development

An Experimental Emacs Core in Rust

Troy Hinckley - https://coredumped.dev, troy@troyhinckley.com

Format: 21-min talk ; Q&A: BigBlueButton conference room
Status: TO_CAPTION_QA

Talk

00:00.000 Rune 00:17.082 The Emacs core 00:57.168 Why create this? 01:55.865 How does this compare to other projects? 03:01.315 Multi-threading 03:32.441 Multi-threading elisp 03:47.648 No-GIL method 04:32.638 Actors 04:51.252 Multi-threading elisp (functions) 05:34.680 Caveats 05:57.090 Multi-threading elisp (data) 06:38.249 Copy values to other threads on demands 06:57.884 Multi-threading elisp (buffers) 08:11.903 Would this actually be useful? 08:46.919 Precise garbage collection 09:16.537 How Emacs used to deal with roots 10:38.713 Conservative stack scanning 11:00.157 Movable objects 12:38.829 How Rust makes precise GC easy 14:13.227 Other Rust niceties: proc macro 15:14.560 sum types 16:01.041 Regex 16:16.052 Parsers 16:27.210 Other changes: GUI first, terminal second 16:58.919 Off-screen cursor 17:16.305 Image flow 17:24.440 Testing 18:36.345 Status 19:07.247 Next directions 19:22.739 How to get involved

Duration: 20:06 minutes

Q&A

00:08.500 Q: Have you considered using CRDTs to share buffers between threads and merge any concurrent edits automatically? 01:05.874 Q: Why hosted on GitHub? GitHub is nonfree. Is it possible to report bugs/send patches without using GitHub? 01:22.960 Q: Do you think it's possible to achieve 100% compatibility with current Emacs code? 02:11.913 Q: so you're re-implementing elisp in rust? have you considered using a more modern lisp, such as scheme? [11:03] 04:01.400 Q: Do you have specific features from the Rust compiler that are missing (or are nightly-only) that you would take advantage of? 05:26.880 Q: Can remacs be reused? 07:23.600 Q: What are you thoughts on the GUI layer. Any plans on how to reimplement it? 08:21.240 Q: If money could fix the problem, how much would it cost to ship this with feature parity before 2026? 09:28.350 GObject implementation 09:56.600 Q: elisp is implemented in c, so if you're not implementing elisp in rust, are you using/keeping the c implementation of elisp? 10:42.680 Clarifying rewriting Elisp in Rust 12:57.908 Q: Will your Rust implementation also be able to run Emacs bytecode? Or are you implementing it at the Lisp level? 14:20.100 Q: Is it possible to bootstrap with just the bytecode interpreter? 17:03.960 What would it take to bootstrap Guile in Rune?

Listen to just the audio:
Duration: 19:15 minutes

Description

An overview and discussion and early prototype of a new Emacs core written in Rust. The talk covers some of the interesting design choices in the GNU Emacs C core, as well as some of the trade-offs made in the Rust core. https://github.com/CeleritasCelery/rune

  • What is the Emacs core?
  • How has the core evolved?
  • Design trade-offs
    • multi-threading
    • Precise GC
  • Being bug compatible with GNU Emacs
  • Comparison

About the speaker:

Hardware Engineer with interest in low-level programming and the hardware-software boundary.

Discussion

Questions and answers

  • Q: Have you considered using CRDTs to share buffers between threads and merge any concurrent edits automatically?
    • A: While mathematically correct, CRDT does not always produce a useful answer. With different packages, this will the issue: not a useful result. https://www.moment.dev/blog/lies-i-was-told-pt-1
      • This question answer about CRDT's is by itself showing a lot of problem awareness
  • Q: Why hosted on GitHub? GitHub is nonfree. Is it possible to report bugs/send patches without using GitHub?
    • A: Email patches are possible. Github is what the speaker knows.
  • Q: Do you think it's possible to achieve 100% compatibility with current emacs code? 
    • A: Most should be compatible (since elisp package is the biggest emacs strength) but there might be differences on purpose.
  • Q: troyhinckley: so you're re-implementing elisp in rust?  have you considered using a more modern lisp, such as scheme? [11:03]
    • A: No actually. Only trying to reimplementing the C part of emacs, replacing it by rust. There are two other talks in the conference to use something else (guile and scheme).
  • Q:can remacs be reused?
    • A: Some of the code and some of the ideas (documentation, ideas and approach to problems) were taken. But different model: in remacs everything is an external type. Here, instead trying, to use the builtin type reimplementating the objects instead.
  • Q: hello, great talk, fascinating topic! I am a contributor of the compiler team of the Rust prog. language (though I don't delve in the code myself). Do you have specific features from the Rust compiler that are missing (or are nightly-only) that you would take advantage of?  10:05:38 
  • Q: What are you thoughts on the GUI layer. Any plans on how to reimplement it?
    • A: Either GTK+ or direct GUI in Rust, but no concrete plan so far.
  • Q: (not a question) Re. the GUI layer, the Gtk project has automated bindings with a framework called "GObject Introspection," which is what I am using for my "Gypsum" project. Probably Rust has a GObject Implementation which you could use.
    • A: Problem with Rust GUIs very new and mostly demo without accessibility and test in all environements.
  • Q: If money could fix the problem, how much would it cost to ship this with feature parity before 2026?
    • A: Probably needs a couple people and more than one year of work. Lots of testing required to fix all bugs.
  • Q: troyhinckley: elisp is implemented in c, so if you're not implementing elisp in rust, are you using/keeping the c implementation of elisp?
    • A: Idea of the project is to keep the Elisp layer and changing the C layer underneath that is currently called, replacing it by rust. Elisp does not change. Plan is to be even be bug compatible to emacs: Elisp should be perfectly compatible with both C and rust. Elisp is fixed in this approach.
  • Q: sorry if you already discussed this, but will your Rust implementation also be able to run Emacs bytecode? Or are you implementing it at the Lisp level?
    • A: Bytecode interpreter already exists (actually bytecode compiler and interpreter was bootstrapped first as it is implemented in Elisp).
    • A: the bytecode is missing some OP-code that have not been encountered so far.
  • Q: Cool, so will you also provide bytecode JIT compilation via "libgcc" the way Emacs currently does?
    • A: Eventually maybe, but a proper JIT would be better as it includes type information to better optimize the code.
  •  Q: Is it possible to bootstrap without the interprerter?
    • A: having only bytecode does not work, as some part are expected to be interpreted with macros containing functions that are not yet defined. You really need an interpreter for the lazy macro expansion.
  • Q: How would you do the native module system?  What would be different?  Sounded like part of an anwer about Rust Emacs modules got cut off.
    • A: There is an FFI, so it should be possible to have the same thing. Could possibley implement FFI in elisp.
  • Q:  That was me who was discussing with you about bootstrapping Scheme! (Not Guile, R7RS Scheme) Yes, it would be a whole other thing than what you have done so far. But it would be cool, because then you would be able to run the "Gypsum" editor on top of your Emacs engine.
    • A: I still think that would be really cool to get working! we will have to collaborate on that more.
  • Q: elisp is implemented in c, so if you're not implementing elisp in rust, are you using/keeping the c implementation of elisp?
    • IRC: of course they are implementing elisp in rust (?), what else could they mean?
    • I think they misunderstood your question
    • (not yet answered)?

Notes

  • Getting good vibes from this Rust work
  • (you can do the offscreen-cursor thing in current emacs with enough effort. see scroll-in-place.el.)
  • Oooooh flow images.  My last expectation was Rust making Dslide image animations slick
  • troyhinckley: Very cool talk! 
  • I recommend https://codeberg.org/ (as a freedom-respecting github replacement)
  • one big problem with using gtk -- there is a *two-decade-old* unfixed bug in gtk where it keeps permanent references to the x server etc (IIRC, its wayland support has similar problems). this means a gtk emacs --daemon crashes if you connect to an x server and then close the x server down. this is.... unfortunate
  • I had not thought about the fact that we could just mutate functions on the fly just like data. Neat!
  • really good talk!
  • Getting good vibes from this Rust work
  • I'm spending half this talk going OOH and wishing we had this in current emacs
  • Very cool talk!
  • So looking forward to a faster multi-threaded Emacs :D
  • Well, it does sound like we need some scale to turn this from a multi-year idea to a next-year idea
  • ruby shows one advantage here -- they're reimplementing bits of the ruby C core in ruby, and speeding it up because their new jitter then sees the insides of it and can optimize the hell out of it. ruby is, of course, famously slow... but not as slow as elisp :)
    • one wonders if emacs could benefit similarly...
  • one big problem with using gtk -- there is a two-decade-old unfixed bug in gtk where it keeps permanent references to the x server etc (IIRC, its wayland support has similar problems). this means a gtk emacs --daemon crashes if you connect to an x server and then close the x server down. this is.... unfortunate
  • yeah "of course" all new emacs efforts should be able to run elisp code; but it would be awesome if new lispy code could be written in scheme
  • YouTube comment: Really interesting approach. I like the humility in the beginning, the likes of "this is to try new approaches".

Transcript

Hello, EmacsConf. My name is Troy Hinckley, and this is my talk on Rune, a Rust implementation in Emacs. We strive to be bug compatible with Emacs, so you can use the same Elisp. It's still a fairly early stage experimental project, and we have some basic things implemented.
[00:00:17.082] The Emacs core
Before I get started, I want to talk a bit more about what the core is. So the Emacs core, it includes the runtime, the interpreter, garbage collector, everything used to run the code. It includes the GUI. It includes all the data structures. If you look underneath all the Elisp data structures, there's C code underneath there, as well as the auxiliary functions of which there's about 1500. In making this talk, I don't want to give the impression that I'm saying the core is outdated or that needs to be replaced or that it can't be evolved on its own, because clearly it has continued to evolve. If we look in just the last few years, we can see that we've added native compilation, we've added tree-sitter support, we've added color emoji, and there's work right now to add a new garbage collector to Emacs as well.
[00:00:57.168] Why create this?
Why create this project? Emacs has a long history. It has a lot of users. It needs to support a big community. Because of that, it has to be very conservative about what things it can allow into the project. Forks like this create an opportunity to experiment and try new approaches. This is particularly a good use case for Rust because the C core, it's pretty well tested. It's been around for a long time. A lot of the bugs have been ironed out, but when you're doing a new greenfield project, it's very easy to introduce new undefined behavior and memory unsafety and stuff like that. Rust protects us from most of that, but it also gives us the ability to be fast and has a strong ecosystem behind it. Rust is also really good at multi-threading. Their phrase in the community is fearless concurrency. They should be able to write concurrent programs without having to worry about data races. It's also really high performance. It has a really good regex engine. It's known for its non-copy I/O as well.
[00:01:55.865] How does this compare to other projects?
How does this compare to other Rust and Emacs projects, whether there's been a couple? The first is Remacs. This project was the first. It took an outside-in approach. Basically you could take a C function and replace it with a Rust function and build it together as one executable. This is pretty easy to do because they could both talk over the C ABI. You could swap out functions once at a time. They made really good progress at first, but eventually they ran into the problem that as you get down to the really core parts of it, you can't just replace one function at a time anymore, because some of that functionality is connected to other things. Like for example, you can't replace the garbage collector without replacing the entire garbage collection system. So the progress really kind of slowed down. Another issue with it was, is that they were doing a one-to-one rewrite, so they weren't adding any new features or functionality, just taking the same code and replacing it in Rust, which doesn't add any advantages in and of itself. This spawned Emacs-NG, which was kind of the spiritual successor to Remacs, where they decided to add new functionality, the biggest one being a JavaScript runtime, as well as some new renderers to Emacs. This is no longer actively developed though.
[00:03:01.315] Multi-threading
In this project, one of the big focuses we have is on multi-threading. The C core itself is, everything is designed around being single-threaded, all the data structures and everything like that. Rust has a great concurrency story. In Rust, everything is intended to be multi-threaded. That doesn't mean that everything has to run on multiple threads, but you can't write something that is limited to only running in a single-threaded environment. So this makes it really easy to use all the existing packages and build something that is concurrency safe. which is what we've done here, and that was relatively easy to do.
[00:03:32.441] Multi-threading elisp
But adding it to Elisp is the hard part, because we've got to come up with a good model for Lisp, and Elisp is just a giant ball of mutable state. We need to find some way to tame that so we can make workable concurrency out of it. There's really two ways you can do this.
[00:03:47.648] No-GIL method
One is what I call the no-GIL method. This is what Python is doing, where you take all of your data structures, you make them concurrency safe, and then you just leave it up to the programmer to decide what they're going to do with it. They've got to build safe abstractions on top of that. One of the big downsides with this is that it comes with a pretty high cost. The last benchmarks I've seen is that by making everything concurrency safe in Python, single-threaded code is about 20% slower in some benchmarks. Since most code is single-threaded, this has a big performance impact for most code that isn't taking advantage of the multi-threading. The other thing is this introduces a lot of nasty concurrency bugs because you can have anything mutating any part of the data from any thread, even if you can't have memory unsafety per se. The other option is actors, which are a really known way to approach this, where you trade some of that flexibility that you get with fully concurrent for more control and. Code and functions are shared between all the different threads, but data has to be passed along channels between different actors.
[00:04:51.252] Multi-threading elisp (functions)
We want the functions to be shared, and this should be pretty easy because we don't mutate functions like we do data, except when we do. In Lisp, functions are just lists like anything else. So you can mutate them just like lists. Even if you're not talking about interpreted code, like if you have a native compiled function, you can still mutate the constants inside the function. For example, here we have a function returns a string. We take that string out, we mutate that string, and now the function returns a different string. In Rune, we enforce that all functions, their constants are immutable. You can't mutate the insides of a function. You can still swap out functions and redefine them, but you can't mutate the inside of a function. This enables them to be safely shared across threads. However, there are some caveats to this. For example, some functions actually do need to mutate their own data. The example that we run into is cl-generic. It uses a method cache. So it has to be able to update that cache. In this case, we just made a special case for this particular situation, but we don't know what more of these we're gonna run into the future where this is needed behavior to be able to mutate a function.
[00:05:57.090] Multi-threading elisp (data)
Okay, so functions are pretty easy. They just can be shared between threads, but data can't be immutable, at least not into the model that Emacs currently has. We have two different ways to handle this. One is we require whenever you're calling some other code in a different thread, you have to send all the variables that it's going to need over to that thread. This is how you traditionally do inside actors. Any data that needs to go to a different actor needs to be sent over a channel. It's relatively easy implementation, but this is difficult in the Emacs case because everything is going to be accessing different variables. That means when you call something, you have to know ahead of time, all the different variables that are gonna be accessed inside that other thread and put those in when you call it.
[00:06:38.249] Copy values to other threads on demands
The other option we're using is we're copying values to the other threads on demand. If you're running a thread, it tries to look up a variable. It doesn't have any value for that variable. It will go back and ask the main thread and it will copy that value into that thread and it can continue execution. This is nice because you can just launch some code and it'll take care of handling all the data transfer for you.
[00:06:57.884] Multi-threading elisp (buffers)
But we don't want to be copying around is buffers, because they can be really large. In this case, we have a mutex. Each thread could only have one current buffer that it has an exclusive lock to. This comes with some trade-offs, big one being that if the user tries to access some buffer, they want to type something, and a background thread is holding onto that buffer, what do we do in that situation? And we still need to hold an exclusive lock, even if we're only going to read a buffer. If you have multiple readers, they each still need to take turns because we can't determine if at some point a thread is going to try and mutate the buffer. It has to be an exclusive lock. The other issue is buffer-locals. This is less of a implementation issue as much as it is a technical issue. Because you think about when we switch to a buffer, it has some buffer-local data and we have some thread-local data. As we go through, we're mutating everything. Those can get intertwined and pointing to each other. Then we switch away from that buffer. We need some quick way to be able to separate those out. The buffer-locals can go with the buffer-locals and the thread data can stay with thread data and make copies of anything that was pointing to the other side. But we don't have a good method to determine how to separate those two, like what data belongs to this and what data belongs to this, so that we can do that quickly. We haven't found a good solution to that yet, but it's something we're still working on.
[00:08:11.903] Would this actually be useful?
The question is, would this actually be useful for doing real work inside Emacs? I would say, yes, there's a lot of things you can do with this. You could handle process output in the background. You can do syntax highlighting. You can do buffer search in parallel. You can do LSP. You can do fetching your mail in the background. You can have a window manager that doesn't block your window manager when Emacs is blocked. You could do something like a file system watcher that keeps up on files without blocking Emacs. This wouldn't be so great for building concurrent data structures or operating on shared data or building your own abstractions, because of the trade-offs that we've made here. Okay. That's talking about multi-threading.
[00:08:46.919] Precise garbage collection
The other thing we're going to talk about is precise garbage collection. In Rune, we have a safe, precise garbage collection because of the Rust type system. Let's look at what the problem is with garbage collection in the first place. Really, the tricky part about garbage collection is rooting. How do we find out what the roots are? These are all the values that are on the stack or inside the registers. In this example here, we allocate an object. We call garbage_collect, that object's collected, and then we try and return it. It's no longer valid.
[00:09:16.537] How Emacs used to deal with roots
Let's look at how Emacs used to deal with this problem way back in the day. There was a system called gcpro or GC Protect, which is basically designed that every time a value needed to survive past a garbage collection point, you had to try and protect it. In order to do this, you had to declare a struct, you had to put a macro around it to root the object, and then you had to unroot it when you were done-- past the garbage collection. Now the value is safe. You can see down here, I pulled these eight rules out from a really old version of the Emacs manual about all the things you had to keep track of when you were trying to use this system. All right, so there was a special handling for nested GC protects. You had to make sure the memory was initialized. You had to make sure that traps couldn't occur between allocating and when GC protect would happen. It can be tricky because you don't always know when a function that's getting called could potentially call garbage collection. So if you got something wrong, you also might not catch it for a long time because garbage collection may only get called one out of 99 times. The other 99 times is just fine. That one time it happens and you can't reproduce the issue. When you do get this wrong and some, something doesn't get rooted and it gets overwritten, it generally doesn't show up right where the problem is. It gets showed up way later when you actually try and access the value and the value is invalid. You've got to track it back to where that thing did not get properly rooted. It's a huge source of bugs and very hard to maintain.
[00:10:38.713] Conservative stack scanning
Emacs decided to go with a different path, which we call conservative stack scanning. Basically, the garbage collector just looks at the stack and all the registers and any data inside there that looks like it could be a pointer, it treats it as a pointer. This is nice because you get really easy root tracking, but it also comes with some trade-offs, mostly that your objects are no longer movable.
[00:11:00.157] Movable objects
Why would we want movable objects in Emacs? There's a couple of different reasons. One is compaction. You can take all your heap, you can pack that on down because you can coalesce all your objects together. Another is that it's easy to implement generational garbage collection. You can just copy everything out of your minor heap into your older heap. Really, Emacs is kind of uniquely ideal for generational collection, because the typical way we interact with Emacs is as a series of commands. You execute some command, you'd execute the next command, you execute a command. It could be happening every key press, it could be happening with M-x. However long that command is, that is the ideal length for the minor collection generation, the first generation. Because once you're done with that generation, anything that's still existing is going to be around for a very long time. So that works out really well for Emacs. We want to make this a generational collector. The other thing is with object layout. We use a lot of lists inside Emacs Lisp. Every time you go to the cdr, you've got to be chasing a pointer around the heap and following that. That can potentially result in cache misses and all sorts of other things like that. So it can take a long time. It can be quite slow. But if you have the ability to move objects, you can just relocate an entire list and lay it out in an array right next to each other inside memory. So iterating over it is just as fast as iterating over an array. But you can only do that if you have movable objects. I'll point out here too, that with conservative stack scanning, it's not that all objects are immovable. It's only ones that are pointed to from the stack or from the registers that have to become immovable.
[00:12:38.829] How Rust makes precise GC easy
Let's look at how Rust makes precise garbage collection easy. Here I have some Rust code to show kind of how the lifetime system works and what we call XOR mutability, where we can only have one mutable reference or multiple immutable references to the same thing. Here we declare a vector, we take a reference to the first element of the vector, and then we mutate the vector. Now this could potentially resize the vector and move it to a different location in memory, so that reference is no longer valid. The nice thing is, Rust catches this for us. It says, hey, this is no longer valid. This reference can't survive past when you mutated it. Okay? That's exactly what we want for a garbage collector. You can see here, we take this in a garbage collection context, we create a new context object, we add an object, we call garbage_collect, then we try and access that object. It's no longer accessible, and Rust will prevent us from trying to access that variable. So, how do we solve this? We have a root macro. We declared this root macro, it lets us take the object and let it live past garbage collection, and everything works out. The nice thing is, this root macro will get dropped when it's out of scope, so we don't have to worry about the un-gc-protect step of this. Statically, Rust will verify and tell us any object that needs to be rooted. If we try and access it, it'll tell us it's invalid. We have this root macro and then we can access it. So in that way, we have safe, precise garbage collection without any chance of introducing undefined behavior, which is really, really powerful. It's really easy because the type system will catch it all for us.
[00:14:13.227] Other Rust niceties: proc macro
There's some other Rust niceties I want to kind of talk through that are useful, but are not, you know, star features. One is proc macros. You can see up on the top, you can see how you declare a function inside the C core. All right. You have to use the macro. You have to put the list type, the function type, the struct type, the different types of arguments or different number of arguments, the doc string, and then you can put your argument listing down inside there. On the Rust side, we just write this like we would any other Rust function. And then we put the defun proc macro on there and it takes care of everything for us behind the scenes. A couple of cool additional things we can do with this is that we don't have to make everything just an object. We can actually make things more specific types. Here we have symbols. As well as you can see subfeature, it's an optional parameter, and we just make it an option inside Rust and it automatically make it an optional inside Elisp. This makes them really easy to write. I can't take credit for this is because this is something that I saw inside Remacs and I stole from them, but it makes the functions really easy to call from each other and really easy to write as well.
[00:15:14.560] sum types
Another thing that's really nice is sum types. In the C core, if I wanted to get a string out of an object, I would first need to check that it's a string and then dereference it as a string. But if it's not a string, I may introduce undefined behavior. So in complicated code, I have to make sure that I have always checked what type it is before I try and dereference that type. We don't have to worry about any of that inside Rust because we can untag a value and we can use their some types, basically create an enum and we can match on what the different values can be. Then we only get out the types that are viable or are actually there. So we never accidentally get something out of an object that we didn't mean to, or dereference it as something that doesn't really exist. We can just match on it and we can get out the values that we need, which is really, really powerful. So there's some other Rust niceties as well working with here. One is the regex engine inside Rust is really fast, high performance. We are using that for the Elixir regex engine to give it high performance and worst-case guarantees. The other is that Rust has a lot of really good parsers for things like JSON that are no copy parsers that are high performance. We can use those inside Rune as well.
[00:16:27.210] Other changes: GUI first, terminal second
There's a handful of other changes we're working on that are not Rust-specific, but we'd like to see. The first is being GUI first, terminal second. This means two things. First is that we have all of our key bindings. Right now inside Emacs, C-i and TAB are bound to the same key binding by default, because that's how it works inside the terminal. In the GUI, you shouldn't have that limitation. The second is that the GUI should not block when Lisp is blocked. It should be independent of that. Your GUI can still continue to operate when Lisp is running.
[00:16:58.919] Off-screen cursor
The other is the ability to have an off-screen cursor so that you can be typing on something, you can scroll up and down and the point doesn't have to follow you where you lose your place where you were before. You don't have to intentionally set a mark. You can just scroll and then start typing and it'll go back up to where it was before, like it works in most applications. And this can be optional.
[00:17:16.305] Image flow
The other is image flow. We want it so that you can easily flow images and you can have large images and scroll past them without jumping and you can flow text around images. How are we testing this project? Because there's a lot of things that you could get wrong here. One thing we're doing is we're using ERT. Emacs ships with over 7,000 built-in tests--Elisp tests. We are using this test suite to test our project as well. We can kind of use this as a dashboard of saying how close are we to getting to parity with GNU Emacs. The other thing that we have is a tool called elprop, which is an external utility that basically tests for correctness. Because really, the correctness of Rune is whatever Emacs does, because there's no official spec on how things should behave. To do this, we can go look at the Rust function signature. We know what the arguments are, we know how many they are, and we know what types they should be. Given that information, we can generate a whole bunch of random functions feeding those types in. And then we send a copy over to Emacs, we send a copy over to Rune. They each evaluate it and they return the result and we make sure the results are the same. Then you do that for thousands of different implementations of the function. And it helps us find corner cases really easy without having to handwrite a whole bunch of different cases to test things and say, where are these two functions different? So the current status: we already have a multi-threaded Elixir interpreter and bytecode engine inside there. There's no actual text editor in there yet, but the primitives are there. Like you can insert text, move point around, delete text, do different things like that. But we don't have a GUI hooked up to different key bindings to actually type on. There's just a REPL to operate in. We have about 250 of the 1500 built-in functions already implemented inside there. There's a lot of low-hanging fruit inside this area to still be implemented.
[00:19:07.247] Next directions
The next directions we're working on is we're optimizing the GC. We want to make it generational. Like I said, right now, it's just a simple semi-spaced copying GC. We want to add a proper GUI. We need to implement text properties, overlays, process and job control, all that goodness right there.
[00:19:22.739] How to get involved
How can you get involved? This is hosted on GitHub. You can come on over. If you have any ideas about how to implement something or something you'd like to see done, go ahead and just open an issue so we can have a discussion about it. We've had lots of interesting discussions with different people coming in to the GitHub repo. If you're interested in contributing, the easiest way is probably to run elprop, pick some function, run elprop on it. I promise it won't take long to find some issues, some discrepancy between Emacs and Rune, and that lets you dive into the Rust code and figure out, and the C code, and figure out what the difference is between the two. or come along and help implement your favorite functionality. This has been a really interesting project so far, and we've had a handful of different contributors on it who just kind of want to learn Rust or get more into systems-level programming. Thank you.

Captioner: sachac

Q&A transcript (unedited)

Okay, so I'm going to look at some of the questions showing up in the etherpad we got here.
[00:00:08.500] Q: Have you considered using CRDTs to share buffers between threads and merge any concurrent edits automatically?
It says, have you considered using a CRDT to share buffers between threads and merge any concurrent edits automatically? So I have looked at that. And the problem with CRDTs is that even though they give you a mathematically correct answer when you're trying to merge two conflicts, it's not always a useful answer. Like, it's not coherent. If you have two things trying to edit the same thing, there's no good way to resolve that. And so they really work well when you have two people working live, both editing the same document, because they can fix any particular issues like that, like you would with Google Docs. But you have different packages that aren't aware of each other, and you're going to run into problems. And so this is something, if you read from the Xi editor, which was one of the first ones to use CRDTs, in the retrospective, he talks about how they had this problem, where the CRDTs They give you an answer, but it's not always an answer that's useful. And so I feel like locks at least are going to make it. It's not going to be as efficient if you have a whole bunch of packages, but I don't imagine there's going to be a ton of those. It can actually, I think it'll be more useful in practice.
[00:01:05.874] Q: Why hosted on GitHub? GitHub is nonfree. Is it possible to report bugs/send patches without using GitHub?
I host on GitHub because that's what I know. If there's a way to host it on somewhere else, I'd be interested in doing that. If you're interested in setting patches without using GitHub, you could always send an email. I'm more than happy to accept email patches.
[00:01:22.960] Q: Do you think it's possible to achieve 100% compatibility with current Emacs code?
Do you think it's possible to achieve with the current Emacs code? I do. I think, I think you can do that. Um, like I said, there's a couple things inside there that are intentional breaks with existing Emacs code. And some of those being like functions are immutable. As well as having data shared switch between different threads, which means there's going to be some copying going on. So there's going to be subtle things that are going to be different. And we've really got to think about those intentionally, but I'm really going for bug compatibility with GNU Emacs so that you can take an existing Elisp package and just run it and it just works, 'cause I think that's one of the big strengths of the Emacs ecosystem is the millions of lines of Elisp that people have written. Um, So I'm not, okay...
[00:02:11.913] Q: so you're re-implementing elisp in rust? have you considered using a more modern lisp, such as scheme? [11:03]
So since you're re-implementing Elisp and Rust, have you considered using more modern Lisp such as Scheme? So I'm not re-implementing Elisp and Rust. I'm re-implementing the C in Rust. In fact, I would like to make more of the core that's written in C in Elisp instead of C or Rust, because then it's actually introspectible. There's a talk by Tom Tromney that he gave a while ago about Emacs should be Emacs Lisp. I kind of like that philosophy, that as much of it as should be Elisp as possible, and we should only have C or Rust or some systems level language for really low-level stuff. Using a more modern LISP such as Scheme. I know there's, I mean, there's two talks, I think, in this conference about using Scheme inside Emacs. And I looked at this at one point about what if you wrote it inside Common LISP, because that's also has some really low level details. And then you could go from Elisp down to Common Lisp. But the problem is, is under the hood, you still need a systems language. You can't, you still need either C or Rust or something underneath the Common Lisp to implement the primitives. And so it's not going to give you just two languages, you know, you'll have three. You'll have the elisp, common lisp, and C under the hood. And so in this case we just have the two. We have the Elisp and the rest. All right that's all the questions I see there. Let me go look at... Okay, so I see into the chat.
[00:04:01.400] Q: Do you have specific features from the Rust compiler that are missing (or are nightly-only) that you would take advantage of?
Does it features from the Rust compiler that are missing that way you would take advantage of? Oh, that is a great question. Um, there's a handful of them. Uh, I should've written down a list of these. One of them is Polonius, which is the new borrow checker because we're trying to be used lifetime to track our objects. We often run into situations where we've kind of got a hack around things because the limitations with the borrow checker. And so I have a whole bunch of like notes inside there about where. A better Polonius would help inside there and help ease some of the issues. Another thing is enum variances types, because right now we have an object which is defined as a big enum that had all the possible objects, but if we want to have a subset of those objects or just pass in one of those objects, we've got to define a new struct. And so we have a whole bunch of boilerplate code to define that all out. And if we had variances types, that would make the code a lot easier. Another one is the allocator API. Right now we're kind of working around it, but ultimately we're going to need our own allocator. And the allocator API is still nightly only. So there's a couple more. I'll look at that more, but that's a great question. Let's see.
[00:05:26.880] Q: Can remacs be reused?
Okay. I see a question you might not have noticed just asking about reuse of Remacs. Oh, yes, so I have reused some of Remacs code, and some taken, like I mentioned, taken some of their ideas, but ultimately we're using a different model because under the hood in Remacs, everything is just defined as an opaque external type that's defined inside Emacs and so it just pulls those in interacts with those and passes them back into C. We're trying to see what we can do if we say okay we're not going to take the same... So they're bound to the implementation details of Emacs, and we don't want to do that. We've re-implemented all the core types ourselves. So that means that we can't just take the Remacs code one for one and use it in our project, but we can take a lot of their ideas. I've spent some time reading through their documentation, different things about how they approached strings and GC and different stuff like that. Looks like all the questions.
[00:07:23.600] Q: What are you thoughts on the GUI layer. Any plans on how to reimplement it?
Okay, so another question. What are your thoughts on the GUI layer? Any plans on how to reimplement it? This is something I've thought a lot about, but I still don't have a solid plan for. I'm not really a GUI person. I mostly work with low-level. And so my two thoughts is you can go the GTK route. There's Rust bindings for that. That's well understood. It's got a good support. But there's also some interesting projects to try and do GUI in Rust, native Rust, and have it use Rust idioms and stuff like that. And so those are things like Druid and there's eGUI and a bunch of those that are that way. And I've never used one of them, but I'd be interested to try that out first and see how well does this work and how well supported is this doing a Rust-first GUI.
[00:08:21.240] Q: If money could fix the problem, how much would it cost to ship this with feature parity before 2026?
If money could fix the problem, how much would it cost to ship this with feature parity before 2026? Ooh, in a year. Uh, that's a good question. Even if we had the money, it would take more than just me, even if I was working on it full time. Um, I don't know. That's a good question. I would think it would take a team of at least a handful of people to get this shipping within one year. Because there's still a lot of work to do. And even once you have everything implemented, there's a lot of bug finding and smoothing it out so that it runs as well as Emacs, which has been, you know, battle tested for a long time. Um, okay, so this might be a good moment for me to break in and just say that we've got about 10 minutes left before we resume new talks on on both tracks. Of course, we're happy to keep this. This chat open and keep the recording going here, which will share after the conference as long as as long as there's discussion here. Thank you.
[00:09:28.350] GObject implementation
GTK project has automatic binding with a framework called GObject introspection, which is what I'm using for gypsum project. Probably Rust has a G object implementation, which you can use. Yeah, I know it has some GTK bindings. Um, I'm not sure if it's specifically GObject, but that's a good path because I feel like the problem with the Rust GUIs is that they're all very new. And so, you know, everything works in a demo, but you need something that can work across all different devices and all different platforms and have really good support and good accessibility and stuff like that.
[00:09:56.600] Q: elisp is implemented in c, so if you're not implementing elisp in rust, are you using/keeping the c implementation of elisp?
All right, Elisp is implemented in C. So if you're not implementing Elisp in Rust, we're keeping the C implementation of Elisp. So let me see if I can do a better job of explaining this. So inside Emacs, you have about a million lines of Elisp. And underneath that, you have the C, which is the primitives everything's implemented yet. And so we're keeping all of the Elisp, and we're just taking that C layer and replacing it with Rust. And so when you call a built-in function, it's calling into a Rust function instead of a C function under the hood. So all the Elisp stays Elisp, but the C becomes Rust.
[00:10:42.680] Clarifying rewriting Elisp in Rust
So looking at the IRC chat, it feels to me like maybe there's a little bit of confusion around what do we mean when we say rewriting Elisp in Rust, right? I think there are some people that are like, A, we're reinventing ELISP, and there's other people that are like, no, we're trying to be byte-for-byte compatible with Rust. So some people are questioning your no answer on that. Aren't you really, maybe, is what I'm reading in there. Do you want to respond to that? Yeah, I'm trying to think about how I can make this clearer. So the Elisp stays Elisp. We're not changing the Elisp at all, or at least very minimally. We want to be able to take, like I said, bug compatible. So whatever works inside existing Emacs, you can take all the Lisp and you can run it in Rune and it works the same. So the Elisp stays the same. It's just the under the hood core that is getting replaced. And this in turn adds some new features such as multi-threading. So it's not exactly compatible, but you should be able to use your existing code and the Elisp will stay Elisp. So the idea is that anything that I've written as an Emacs user, my config, my custom packages, whatever it may be, that's all going to be valid code. If you take, you know, the Elisp implementation as being the C code and the parts of Elisp written in Elisp that represent opportunity space thinking about Rust as an implementation language... Okay fine. You know, you can make a semantic argument, okay we're re-implementing, we're creating an alternate implementation of Elisp but what Elisp is isn't the problem space here. That's a fixed, a given, if you will. Is that all right? That's a good way of saying it. Okay. Yeah, what you said makes sense. I was kind of responding to some comments, like I'm not sure it connected for everybody. Makes a lot of sense. Yeah, I wasn't sure how much I needed to expand on that and explain that, but I appreciate you jumping in. Um, okay. So if I were just going to, but...
[00:12:57.908] Q: Will your Rust implementation also be able to run Emacs bytecode? Or are you implementing it at the Lisp level?
Will your Rust implementation also be able to run Emacs bytecode or the implemented at the Lisp level? So I already have a bytecode interpreter inside there that runs the existing Elisp bytecode. And so that was one of the first things I did was bootstrap the interpreter and then bootstrap the bytecode engine. And so we compile, we use the... because the bytecode compiler is written in Emacs Lisp. So you bootstrap that and it gives you the Emacs bytecode. I have a bytecode engine that runs the bytecode. So that's already done. And you can potentially, on top of that, do something like the data compilation or a JIT. But we have both an interpreter and a bytecode compiler. And I'll just break in one more time to say with about five minutes left in our live time with this Q&A session, which we're happy to keep going as long as there are questions. Coming up in five minutes, we'll have a talk on color on the gen track. And then right here, we'll have the p-search talk. Thank you.
[00:14:20.100] Q: Is it possible to bootstrap with just the bytecode interpreter?
Is it possible to bootstrap with just the bytecode interpreter? So I'll have to put in a link to one of my blog posts. So that was my original idea was to say, I don't want to have an interpreter, a bytecode interpreter and a native compiler. I want to just have just one. So I'm only going to have the bytecode. And so that's what I did initially. The problem with that is, is that a bunch of the early bootstrap Emacs code is written with the assumption that it's going to be interpreted. This is especially true with macros, where you'll have a function defined, you'll evaluate part of the function. The other half of the function has macros in it that are not defined yet, but it doesn't matter because they don't get used. But with the bytecode interpreter, it expands all macros when it gets the function definition. And so those weren't macros when the function was expanded, and therefore they got instantiated as functions, but they're not functions, they're macros. And so I initially spent a bunch of time trying to work around this, trying to move code around, work stuff around, refactor the code to try and get it to work with only bytecode interpreter. And eventually I just gave up. I said, you know what, I'm just going to write an actual interpreter to handle this because trying to handle all these lazy macros is too much work. And everything in the bootstrap is built with the assumption that you have lazy macro expansion. I'm guessing the Emacs bytecode interpreter isn't complete. So it's mostly complete. There's a handful of opcodes that aren't implemented that are pretty easy to add that I haven't run into. And there's some of them that are deprecated that aren't implemented, but it's essentially complete. We also provide a bytecode JIT compilation via libgcc the way Emacs currently does it. Eventually I would like to... I'm more inclined to have a proper runtime JIT than an ahead-of-time compiler like libgcc, like the current Emacs native compilation, because it allows you to take type information and actually apply that to the code, which can let you do more aggressive optimizations to it.
[00:17:03.960] What would it take to bootstrap Guile in Rune?
He said, we may either get a new Emacs with an ancient C core with a modern Lisp, or an Emacs with modern core, but stuck with ancient Elisp. So there was another project I was talking to, one of the Guile implementations, about how we could potentially, he was like, what would it take to bootstrap Guile in Rune, where you have both, you could have Elisp and Guile running inside this project. And so we've started that discussion, which I think would be, which would be interesting. But it's tricky too, because fundamentally Elisp and Guile are two different languages. They have different semantics. They have different ways of handling things. You've really got to consider both of those when you're trying to make them work together. How would you do the native module system? What would be different? We can do the same thing. We have an FFI. So I haven't looked into it a ton, but I feel like it could be similar. And I'm actually interested, there's a couple projects on GitHub right now to have an FFI written in Elisp. So you don't even need to create a separate C or Rust module, because you can actually write native modules in Rust or C. And so you can just have direct bindings to a C FFI written in Elisp. You don't need any C code inside there. And I think that would be an interesting approach to look at as well. Oh, Ramin. Yeah, that's right. We were talking about that, about bootstrapping Scheme. And at this point, we have broken away from, uh, from this talk, but we're continuing to record and this will all be published. I'd say, go ahead and keep going as long as you'd like in here. And, um, thanks once again for the awesome discussion. Thank you so much. So I'll just pay attention to the ether pad and the chat and see. That sounds good. I'll keep an eye on IRC. And if there are more questions here, I'll bounce people toward the ether pad or this, uh, chat room. All right, thank you.

Questions or comments? Please e-mail troy@troyhinckley.com

Back to the talks Previous by track: Gypsum: my clone of Emacs and ELisp written in Scheme Next by track: p-search: a local search engine in Emacs Track: Development