Back to the schedule
Previous: Extending the "model" of Emacs to other applications
Next: Old McCarthy Had a Form

Emacs Lisp native compiler, current status and future developments

Andrea Corallo

Q&A: live
Duration: 39:08

If you have questions and the speaker has not indicated public contact information on this page, please feel free to e-mail us at emacsconf-submit@gnu.org and we'll forward your question to the speaker.

Talk

Download video, 39:08, 89M

Download .odp (3.6MB)
Download --compressed56.webm (40.8MB)
Download --main.vtt
View on Toobnix

Q&A

Download video, 75:02, 141.8M
Download --answers--chapters.vtt

00:00 Thanks 01:16 Why is Elisp not a general-purpose programming language, at least not completely? 02:05 Is this activity related to the garbage collector? 02:37 Is the idea to eventually develop Emacs itself in Elisp? 03:42 How did you work on this? 04:27 Does this compilation pipeline introduce vulnerabilities? 05:39 What code, if any, will still benefit significantly from being written in C? 07:28 What's the risk of (setq native-comp-speed 3)? 08:49 Are there any limits introduced by native comp with respect to runtime introspectability, changeability/redefinability, etc? 09:15 Is there a benefit in setting native-comp-compiler-options to "-mtune=native -march="? 10:11 You mentioned native-comp coming in emacs 28. Will this be the default at build time, or will distros have to set this themselves? 11:54 Could we avoid libgccjit.so? Or consider using another jit lib (e.g. dynasm used by luajit) et al to gain better optimization? 14:22 How much of Emacs's C code base could be translated to emacs-lisp? What is the minimum C code base necessary? 16:23 Could we statically type elisp code (via macros?) to provide more optimization hints to compiler? 17:27 Elisp and Python all are dynamically typed langauge, but benchmark shows that Elisp runs slower than Python. Could we learn some best practices from the Python community? 18:55 Did you try to optimize with Rust too? What are your thoughts on Rust for this particular optimization and security? 21:35 Does the native compilation interface with the Emacs profiling tools? 22:59 Where did funding for your work come from? 27:04 What kind of application do I envision native comp enabling to work well in Emacs in the next few years, and which one would not be possible? 28:36 Is this the first real-world practical use of libgccjit? 29:47 Is there any task you need help with? 33:49 What's a good way to proceed? 38:37 What kind of packages do you think could now be practical with native comp? 40:46 Why not implement Emacs Lisp in Guile and use Guile's compiler? 46:29 What are some other hobbies/interests of yours besides Emacs? 48:27 Will you be presenting at ELS or anywhere else in the next year? 51:04 How to make Emacs more popular? 59:46 Do you have 'wish list' features, things you long for Emacs to be able to do? 01:02:04 From BBB chat: dickmao has a patch that makes Gnus async.... 01:05:33 Advice for anyone who wants to bring something into Emacs core 01:10:20 Do you have any advice on how to approach the upstream development community?

Description

Emacs Lisp (Elisp) is the Lisp dialect used by the Emacs text editor family. GNU Emacs is traditionally capable of executing Elisp code either interpreted or byte-interpreted after it has been compiled to byte-code.

In this talk I'll discuss the Emacs Lisp native compiler. This feature recently merged into the main Emacs development line allow for automatically compiling and executing Elisp as native code.

During the presentation I'll touch on:

design goals
compiler and runtime design and implementation
performance implications
upstream process
area of improvements and future developments

Format: 40 minutes

Discussion

Pad:

Q1: Why do you say that Elisp is nearly a general purpose programming lang? What's missing? (and btw, huge thanks for your work!)
- A:
Q2: Is this the "rudiments" that the garbage collector talk was discussing yesterday? Feel free to ignore this n00b question.
- A:
Q3:Is the idea to enventually develop Emacs itself in ELisp (c.f. Smalltalk VM developed in Smalltalk)?
- A:
Q4: How did you work on this? Did you use Org Mode to keep track of your progress? Did you use pictures to keep track of your compiler transformations or you made only for the presentation? Asking because it seems a complex project and I am not sure how you kept that all in your mind! For example, make sure to pick stuff that FSF was okay with while also deciding how to implement the optimization. Great job anyway!
- A:
Q5:Is this pipeline a possible source of new security vulnerabilities, or a new category of vulnerabilities? Is it something you are worried about or have had to deal with?
- A:
Q6: What code, if any, would still benefit significantly from being written in C? Could/should some of the existing C code be converted without significant performance loss?
- A:
Q7: What's the risk of (setq native-comp-speed 3)?
- A: Not sigificant risks. Some side effects might include: needing to recompile a whole file or compilation unit when redefining a function, otherwise the old function definition could be used.
Q8: Are there any limits introduced by native comp with respect to runtime introspectability, changeability/redefinability, etc?
- A:
Q9: Is there a benefit in setting native-comp-compiler-options to "-mtune=native -march="?
- A: Not at the moment. Maybe in the future if, e.g. libgccjit is enhanced further.
Q10: You mentioned native-comp coming in emacs 28, will this be the default at build time, or will distros have to set this themselves?
- A: It will not be enabled by default. Distros would need to enable it themselves.(Thanks!)
Q11: Could we avoid libgccjit.so? Or consider using another jit lib (e.g. dynasm used by luajit) et al to gain better optimization
- A: libgccjit is more for AoT compilation, more in-depth optimization, which JITters don't typically do, so they aren't really equivalent.
Q12: How much of emacs C code base could be translated to emacs-lisp? What is the minimum C code base necessary? (seems duplicate of Q6)
- A: Very hard questions to answer. Not generally feasible/worth to convert most of it.
Q13: could we statically type elisp code (via macros?) to provide more optimization hints to compiler?
- A: Hope to extend existing Elisp variable-type annotations to arguments and use that for optimization.
Q14: Elisp and Python all are dynamically typed langauge, but benchmark shows that Elisp runs slower than Python. Could we learn some best practices from the Python community? As you mentioned. make parameter type annotated is a promising point.
- A: Not sure if Elisp is really generally slower than Python. The Elisp bytecode VM is similar in design to the Python VM. Some native-compiled Elisp may already be faster than Python, e.g. for certain math code.
Q15: Did you try to optimize with Rust too? What are your thoughts on Rust for this particular optimization and security?
- A: Optimize what? There is no Rust here. Rust is interesting, though. There may be some possibilities, e.g. with regard to some similarities between Rust and some CL implementations.
Q16: Why not implement Emacs Lisp in Guile and use Guile's compiler?
- A: (not Andrea answering) This has already been tried and done, lookup Guilemacs, e.g. on EmacsWiki.
  - A: I think they meant to implement Elisp in Guile, and not to replace Elisp with Scheme
    - Yes, that's already been done. Guile can already run some subset of Elisp. Look it up.

BBB:

Where did funding for your work came from? Will you be able to maintain this in the foreseeable future?
akrk: What kinds of applications do you envision native-comp enabling to work well in Emacs in the next few years, that wouldn't otherwise be possible?
Is this the first real-world practical use of libgccjit?
Is there any easy tasks you need help with?
yes, your updates and communication with the community have been great
I believe, going by the aliases, there's at least a couple of GNU compiler hackers in this BBB room now
you mentioned that these improvements are orthogonal to the garbage collector. Do you know of any work on that area?
hmmm XEmacs got a new GC, much better -- and it turned out slower than the original. This stuff is hard...
- NullNix: Really? Yesterday Stefan mentioned XEmacs's GC and said that it could be used in GNU Emacs
- Alas, quite noticeably slower cache locality, probably. It may be possible to make it faster with more work, though, while the existing Emacs one is probably at its implementation limits.
- IIRC, the XEmacs one was a generational GC. It was a long time ago though, ICBW
- The JDK and Go both have interesting GC designs in this area...
some level of "naive" optimism is necessary to start working on these big projects
what kind of packages do you think could be now practical with native comp?
ok, but what are the limits, in your opinion? How fast can elisp go?
relevant link: https://www.emacswiki.org/emacs/GuileEmacs (with caveats. lots of caveats). note: guile has native support for elisp -- but the lack of buffers, the different rep with strings, etc, was hard. the guile compiler changed a lot in v3. v1.8 -> 2.0 -> 2.2 -> 3.0 were all big changes in the guile comp world
indeed
What are some other hobbies/interests of yours besides Emacs?
- Questioner: cool, reminds me of Thierry Volpiatto who's a professional climber, and there are several Emacsers who do DJing; that would be fun; "Emacs Plumbers Conf";
will you be presenting there or anywhere else in the next year?
Emacs keeps creeping toward being the Lisp Machine of the 21st century
- not only that, but with LSP and things like Doom, there is an increasing appeal for Emacs as an editor for the general public; native compilation will help with that
(Probably a live question)
- yes, myths about emacs-devel are hard to dispel
- Eli is so patient with the people on Reddit
- agreed, we owe him a lot
I've always wanted to do Emacs trading cards; like features, devs, etc :P
- that's a great idea
- I designed some for an old gaming clan years ago, it was a lot of fun
- Not American?~; in the US we have a tradition of baseball, and other sport playing cards, each with a picture and some stats on the back
- we have that with football (real football) in Europe
(Probably live question
- there might be even more from RMS that weren't accounted for as separate commits when the repo was converted to git
Where is the pre-git repository?
- I thought it was from bzr.
- I don't remember the details, but ESR's written about it on his blog IIRC
do you have 'wish list' features, things you long for Emacs to be able to do?
look for posts about reposurgeon
dickmao has a patch that makes Gnus async.... 8k lines from what I've heard
- I recall that dickmao posted it to the Emacs tracker but it was somewhat monolithic and wasn't well accepted
- i think he was asked to break it down to more reviewable units
One thing I wondered if you wanted to talk more about andrea was, to the prior point about "myths" about what emacs-devel is "like", there's also "make the culture your own". do you have advice on how to approach the community you haven't gotten to yet?

BBB feedback:

Extremely impressive work, Andrea. Also, very needed for the long term sustainability of the platform. Thanks a lot for the hard work!
- Yes, if there were an Emacs Hall of Fame, Andrea would deserve a prominent place in it.
- we really do need an Emacs Hall of Fame so we can remember those who laid the foundations
I think I can safely say that we are very grateful for your work, because there aren't many people in the world that could do it and who also have enough interest in Emacs to do so
I have to go, thanks for replying to our questions, for the chat and for the incredible work, Andrea
Thanks very much, and for your work on gccmacs
thanks for the presentation and answering all of the questions. thanks to you andrea for your awesome talk, and for the extended q&a!

IRC nick: akrl - What's the risk of (setq native-comp-speed 3)? will it melt my cpu? - The same as the risk of using -O3 with gcc. It gives itself the latitude to aggressively optimise away code elements which it believes are unnecessary, up to and including violating the language semantics of lisp (which gcc doesn't inherently care about). -O3 is something which is fine for specific cases where you test afterwards, but you would never, for example, set -O3 globally in Gentoo. - so we can get type annotations in Elisp?! - Is there a benefit in setting native-comp-compiler-options to "-mtune=native -march="? - Would Eli agree on replacing the C parts of Emacs with Elisp? - Why not implement Emacs Lisp in Guile and use Guile's compiler? - https://www.emacswiki.org/emacs/GuileEmacs - guile elisp is a very-long-term project, but so far has never been good enough: the problem seems to be, alas, time overhead involved in bidirectional conversion of strings and things like that: and unfortunately Emacs is all about strings... guile can run elisp (or something very like it) but that's not anywhere near replacing the elisp interpreter...

Feedback:

what a great talk. this will rise the hype for emacs 28
This work is really amazing. Congratulations on the effort and the deep insights that made this possible.
excellent presentation and work that will be greatly appreciate by all Emacs users.
It is very humbling to see this depth of knowledge and how it positively impacts my day to day computing experience.
this is a very interesting update on his talk at last year's GNU Tools Track at LPC
The worse thing about native comp is that you get used to it after a couple of days and you don't appreciate it anymore! which is not fair...

Transcript

Hi everybody, my name is Andrea Corallo, and this presentation is about the Emacs Lisp Native Compiler -- having GNU Emacs able to compile and run Emacs Lisp as native code. This project has been my hobby project for the last about two years and a half And we will see a little bit where this project is coming from, where we are, and where we want to go. So essentially everything you need to know about the Emacs Lisp Native Compiler, and probably a little more. Just a little bit of context on Emacs Lisp. Well, Emacs Lisp is a programming language, it's indeed a Lisp, and one year ago I looked for some statistics, and I was kind of pleased to see -- surprised to see actually -- that it's still kind of popular as a programming language. It doesn't rank that bad against other programming languages. Also, the other important fact about Emacs Lisp is that there is a lot of Emacs Lisp out there, and this will have an impact on this project. It's a programming language that is capable of almost any task, so it's almost a general-purpose programming language, and this reflects on Emacs itself, that it's capable of almost any task. Indeed this &quot ;almost&quot ; is something we want to fix, because we want to do everything, we want to do all of our computing with Emacs. Also, an interesting aspect for me is that it's not specified by any standard. This implies it can evolve in a more agile way without having to change the standard, etc. And, in fact, it's kind of improving, I believe relatively fast. A little bit about Lisp in general. First, it's the best programming language ever, we all know it. It has a lot of very nice properties, like it's dynamic, it's homoiconic, etc. But the interesting thing for implementors, is that, in theory, it can be implemented with very few primitives. You build very few primitives that are like magic, and on top of this, you can implement the whole language. This sounds very nice, and very appealing for implementors meaning to implement a new Lisp implementation, or improving or modifying an existing one. So the question is: how many primitives do we have to implement if we want to change the GNU Emacs Lisp implementation? Unfortunately, not really as few as we would like. In GNU Emacs, we have about 1500 primitives, and the main reason for that is performance. Actually, GNU Emacs was written when performance was a big issue, and nowadays certain parts are still performance-critical. We have a lot of C code; 30% of the GNU Emacs codebase is C code, and we have to maintain this. But not only that, this is the main barrier for people that tried in the past to change the Emacs Lisp implementation. Because not only do you have to replace these primitives, all of them or part of them, but sometimes they also share (these primitives), internal data structures. For instance, it's very difficult to say: Now I want to go from C to a different programming language for implementing these primitives. So this has been, effectively, the main barrier for doing this work. Another interesting aspect about the GNU Emacs implementation is that Lisp can run interpreted or byte-compiled for performance, and the byte compiler itself is written in Emacs Lisp. This implies that GNU Emacs has to go through a bootstrap procedure in order to be built. But it's kind of interesting for something that started as a text editor, or something like it. The byte-code that is Emacs Lisp when it's been byte compiled, it's running on a stack-based virtual machine that is implemented in C. OK, so I've listed a bunch of areas where Emacs Lisp could improve: Namespace, Extensibility, Performance, and Debuggability, and Diagnostic, we could say. This activity, this project in particular, is affecting primarily Performance and the performance area the Execution Engine. That said, I think it has an impact also on Extensibility, and I hope it will have an impact also on programming diagnostics, so giving better warnings, unknown. So which are the benefits of increasing the Emacs Lisp performance? Indeed, we will have, if we do that, programs that run faster. But the main implication of that is that we could write less C; we could maintain and debug less C. That is kind of a time-consuming task. And we could also allow for writing performance-critical extensions directly in Lisp without having to use systems like I think there's a bunch. These are very consistent benefits. OK, Project Goals, but I think the title of this slide maybe should be Project Requirements. So when I started this activity, I set some requirements for the project, with the main goal in mind: to go upstream. So I wanted to create something, a modified implementation of GNU Emacs, that was compatible as close to 100% as possible to the current implementation. And when I say &quot ;current implementation,&quot ; I don't refer to what the Emacs Lisp Programming Manual specifies as expected behavior of the implementation, but really the implementation itself. This is because there are a lot of corner cases that are not specified by the manual, but programs do rely on that, and given there is a ton of Emacs Lisp already around, compatibility was definitely a major requirement. I wanted to produce something that had reduced impact on the Emacs codebase, at least as much as possible. So I didn't want to rewrite all of GNU Emacs. Indeed, because it would have been too much work for one single person, but also still thinking to an upstream outcome all of this time. Another requirement was to have no, or very reduced, impact on the user, so I didn't want to change the way Emacs is used by you. And last but not least, introducing new dependencies, those dependencies had to be Free Software, possibly GPL, and also an important requirement is that these dependencies had to be some kind of trusted software that we know is going to be maintained in the future. Given Emacs has been around since forever, and will be around forever and ever, this was another very important point. A little bit of history of this project/ a quick timeline. 2019, in May, I did my first commit, I think it was when I tried to write my first primitive function ever in C, in GNU Emacs. And this was an attempt to try to compile a function that, once executed, returning ? That was it. Six months after (about), I had something that was kind of working, So I was able to start up a semi-standard Emacs and then to compile and load, and replacing most of the functions that I had defined floating in my Lisp universe. Those functions are the functions that are essentially composing Emacs, at least the Lisp side of Emacs. A lot of features were missing, like I had no Garbage Collector support, no bootstrap, I was not optimizing these functions Because optimization would be broken. No image dump support, etc. But I think this proved the design could work. So I sent to email to emacs-devel. I said I have this stuff I'm working on, and I wanted some feedback from the upstream to see if there was some interest. I believe the outcome of this was positive because about one month after, I pushed my branch within the Emacs git as a feature branch, and shortly after, we started to use the bug tracker to track bugs. So essentially we moved the development on the upstream infrastructure. I believe two years after the first commit, the project was merged after literally hundreds of bugs solved, and improvements, suggestions unknown and this was about six months ago. Before discussing how the native compiler works, I think it's worth looking at how Lisp is implemented in GNU Emacs. We have Lisp_Objects floating around our Lisp universe, and they are internally represented in this way. We have what is called a tagged pointer, that is just a regular pointer that is pointing to the area of memory where we hold the real data of the object. But within this tagged pointer, we reserve a few bits to indicate the type of object we are pointing to. This is important because each time we access an object, we have to typically check those bits to check that the object we are manipulating is of the right kind, remove those bits, and, if we are happy, access the object, otherwise unknown. All the objects are like this, except for typically Fixnums, that are small integers that we manage to fit directly within the pointer. Also for manipulating Fixnums, we have to check the tag bits each time. Whenever we are not sure of the type of object we are manipulating (read: almost every time), we have to check those bits and remove those bits before doing any manipulation on the Fixnum. How Emacs Lisp is byte-compiled and executed in, let's call it, &quot ;Vanilla&quot ;. If we have a Lisp expression of this kind: We take the variable 'a' we do plus 2, and then we multiply the result by 3, the byte compiler will produce this LAP code. LAP code is essentially the assembly for the byte-code, so it's the &quot ;intermediate representation&quot ; that will assembled into byte-code. (.elc files) How is this program executed? As I mentioned, it's executed in a virtual machine that is stack-based, but we start with an execution stack that's empty, and a stack pointer pointing to its bottom. And we execute the first instruction, that is pushing in the stack the value of 'a', in this case, 100. Then we push the constant 2. Then we do the summation, and we have the result in the stack. Same: we push the constant 3, we do the multiplication, and we will be able to return. Now, what's good and what's bad about this? A good thing is that it's very simple to start from Lisp and compile this kind of LAP output. At least it's reasonably simple. The compiler is not that complex. The bad thing is that all this machinery -- push and pop, etc. -- it's very different from how a modern CPU works. Because modern CPUs, they are not stack-based anymore but they have instead a fixed number of registers, and they work with assignment and operation within these registers that are generally called &quot ;general-purpose.&quot ; So to execute this LAP program, there is another program, that is the implementation of the VM itself that is doing conversion during runtime. So it's interpreting the LAP program and it's converting it into instructions that we can execute on the CPU. This conversion is done each time we will run some byte-code. And it's something that we want to avoid. Instead of this live conversion, we want to convert once: our Lisp program into native code, that is, a binary program that can be executed directly by our CPU. We want to save all this unnecessary conversion that we do each time we are running a program while we are running it. And we want to do this process just once, when we are compiling. That's the main goal of this activity. How is the byte compiler implemented? As any compiler it's a pipeline of transformations. We go through macro expansion, closure conversion, we have a bunch of source level optimization. Then we go into LAP, that's the transformation we are interested in, and after a few optimizations on LAP, LAP is assembled into byte-code. So if we list it in terms of intermediate representations, we can simplify this pipeline like this. We start with Lisp, and at a certain point we are manipulating the program in LAP form, and then at the end we produce the byte-code that is the .elc file that you run What I wanted to realize was something like this. I wanted to start from LAP, do something, and jump into GCC using libgccjit and in particular the libgccjit Intermediate Representation that we will discuss. Now, why I wanted to do something like this? Essentially, writing a compiler from scratch for Emacs Lisp would have been a very big task. So I wanted to rely on, as much as I could, the Emacs Lisp byte compiler, because I had to produce something that was as compatible as possible to the current implementation. So this was (I believe) a very good idea to save an enormous quantity of work and to produce something that was compatible in terms of semantics with the original implementation. Also, I didn't want to implement a code generator for each architecture we were targeting, nor wanted to implement all these optimizations that are already in GCC, so I thought it was a good idea to rely on an existing compiler like GCC ?. Let's talk about libgccjit. It was added by David Malcolm in GCC 5 It allows you to describe a C-ish semantic to GCC and have it compile. It's good for reading Jitters or AoT compilers. And if we talk about GCC: it's a compiler, it's a very good one, it has support for a remarkable number of target architectures. It's also very good at generating fast code, and it's been around for a long time; I believe it's like 1 year younger than Emacs. It's still very well maintained, and we can assume it will be maintained for quite a while. And, as I mentioned, this was a very important point. Also, it's GPL; it's Free Software, and it's developed under the GNU umbrella, so I thought it was a very good option. So we can imagine a simple translation that goes from LAP to this subset of C that we can describe to libgccjit. This simple translation we can see here, it's actually pretty trivial. Instead of doing operations within the execution stack we have seen before, we have just an array that is replacing it, and it's called 'local' in this case, and we have assignments within this array, so that they are done in place of the original push and pop activity of the stack. The nice thing is that, when you have done this translation, GCC will be able to optimize this, and remove all the unnecessary operations, and generate code for the specific CPU you are targeting, which will be running your code. This sounds great; it sounds like a very simple and effective translation, and in fact the first iteration of my compiler was doing just this. It was essentially a big C function that was taking LAP and doing this conversion describing the output to libgccjit. Unfortunately, if you do this, you will discover that you have a performance upper bound limit of about 3x. So it was an option, but I thought it was a good occasion for trying to do something more. And doing something more means implementing a smarter compiler that is doing some advanced analysis on the code, and will be able to perform Lisp-specific optimizations -- optimizations that take advantage of the specific Lisp semantics, something that GCC is not aware of. And while I was thinking about that, I thought that having a smarter compiler had also other advantages, like a smarter compiler that understands the semantics of the programming language being compiled would be also capable of giving feedback to the programmers, like better warnings and errors. So I was really fascinated about this idea, and I wanted to change my implementation because I was not really happy about it. I had a lot of C code in terms of lines that were not doing any smart job. And I wanted to write all the interesting logic in Lisp. So optimizing outside GCC before jumping into GCC, as I mentioned, has two main targets: Either optimize the code before going into GCC, or present to GCC some code that we know GCC can optimize effectively. And also, this will give, as I mentioned, better options for the compiler to provide warnings, errors -- better diagnostics. So this is pretty much what the native compiler looks like nowadays, in terms of passes. We have a list of passes, each of which is taking an input and producing an output. So it's doing either analysis on the program that's being passed, or it's performing a transformation. All of these passes are implemented in Lisp, and only the last pass is implemented in C. That is the one that is talking to libgccjit. To do that, I have introduced a new intermediate representation that I call LIMPLE, as a tribute to GCC GIMPLE, that is the main internal representation of GCC, at least one of the main ones. Introducing a new intermediate representation -- a new way of representing my program -- solved a bunch of problems. First, it allowed me to implement non-trivial analysis and transformations, the ones I needed in my compiler pipeline. But also, it solved the problem of what was the boundary between what I had to implement in Lisp, and what in C. Because once I had my intermediate representation defined, essentially the boundary between Lisp and C is just a function, that is, the one that is implementing the final pass. That is taking, as an input, all of my programs in LIMPLE representation and it's doing his bit. So I was convinced this design at least had sense. When we go through some of these passes, just to give you an idea of what these are doing: the first pass is just responsible for spilling the LAP from the byte compiler that effectively here we are using as a front end for our compiler pipeline. The second pass, called 'limplify', will be in charge of converting LAP into LIMPLE. LIMPLE is an intermediate representation that is Control Flow Graph based, and it's capable of SSA. So we can have a look to what this means. Let's assume we have our LAP program, as any program, that's a simple list of instructions that we will execute one after the other. Some of these instructions are special instructions that we call conditional branches, where we check for a condition, and if this is verified, we jump to a different address within the program. (Addresses that here we are calling 'labels'.) So we can split our program in chunks, and those chunks we execute without interruption, so we always enter from the top of those, and we exit from the bottom. We can name those, and split them apart, and these are what we call basic blocks. And now we have a bunch of these basic blocks that are floating, and they are not any more sorted. This is what is called a Control Flow Graph based representation. Now we can get into the SSA topic. That stands for Static Single Assignment. I don't want to get into the details, but just give you a feeling. I added into our basic blocks in our Control Flow Graph a few assignments. We will transform this into SSA just for the variable 'x', just for the sake of demonstrating it. This is done through a number of phases that are essentially some analysis, mainly renaming. But the outcome, the one we see here, looks quite similar to the original one, but we can see that the variable 'x' has been renamed. And now we don't have anymore just one, but a number of these variables. The interesting property is that each of these variables is assigned just once. And this allows for the compiler to do prediction of the value of that variable, depending on the position within the Control Flow Graph. This is very important. For instance, a very simple case is 'x1' that we see is assigned once by definition, in particular here at the beginning. Here it's very simple to understand that x1 will have the value 3. While, for instance, it's more difficult to prove what is going to be the value of x5, because it's calling a function, or we don't know at the moment what x4 is. So the compiler will gain the capability to do prediction on all the variables, and the more we get information on one variable, the more we can prove about the others. Coming back to our passes, the next one is forward propagation. This pass is responsible for doing what I briefly mentioned just before: doing proof over all the different variables in different positions of the Control Flow Graph, about the values, types, or ranges. This pass is also responsible for executing functions when we know that the function has no side effect and the pass managed to prove all the values of its argument. So the function is then executed at compile time and it doesn't even exist anymore in the produced code. Then we have another pass, this is an example of a pass that is very specific: it's trying to remove the call to funcall when those are not necessary. There are a couple situations where this is very useful. And not only is this beneficial because we are generating better code, but when we manage to do that, we allow GCC better analysis over the code, because GCC knows nothing about funcall. So if we are calling, from 'foo', directly, 'bar', for GCC it's way easier to do its analysis on top of this code. Another interesting pass we can mention is 'tco'. This is performing Tail Recursion Elimination. It allows a more functional programming style, if you want. We can jump to the last pass that is called 'final', and as I mentioned, this one is responsible for taking our program in LIMPLE representation and describing it to libgccjit in the gccjit IR. That's the main task. It's also defining inline functions for accessing fundamental data types, and so on. This pass is also responsible for using some of the predictions done by previous passes to generate better code. Things we had to add to have all of this machinery work and to be controllable: The first one is an opt called 'native-comp-speed' and it's equivalent to Common Lisp's 'speed'. It represents the optimization level. The default is 2 and is the maximum optimization level that is meant to reflect all the original semantics of Emacs Lisp. So it's the one that should be used by default. The second one is 'compilation unit' and it's a kind of new object that has been added to Emacs. Let's have a look to how the Garbage Collector works in this case. The GNU Emacs Garbage Collector is a simple mark-and-sweep garbage collector. It does a tree walk through all the objects and follows references from one object to another. All the objects reachable during the mark phase will be kept in our Lisp universe. All the other ones will be freed. In this case we have a bunch of functions, 'foo1', 'foo2', 'bar1', etc., that are defined. When a function is defined, it's accessible through its symbol, so we have the symbol referring to the function. The function, in this case a native-compiled one, is referring to the compilation unit. The compilation unit is essentially the ELF file that has been compiled, and contains all those functions that came from the original .el file, and that we have loaded into memory. If, for instance, 'bar1 and 'bar2 are undefined, functions 3 and 4 will be no longer reachable, and we will be able to free them and unload the compilation unit. We discussed quite a lot about Control Flow Graph, SSA, and a lot of boring stuff, and I promised you that we are doing a lot of interesting proofs over variables, So let's have some examples of them. Let's jump into a quick demo to see what all of this abstract theory and this esoteric propagation engine can do for us and how the user can interact with it. I've defined a bunch of functions, and I will native-compile and load it. Alright, Emacs Lisp native compiled and loaded. At this point, I can disassemble 'foo1' to make sure it's native code and I'm not lying. These are the instructions that will be executed directly by my CPU when I call this function. Alright, very cool. Now, Lisp: (symbol-function #'foo1) Interestingly, this is returning a subroutine, as it would be a primitive function. Because this is native code, even if it's written in Lisp, has been converted to native code as if it's a primitive function. But we can do also a new thing: asking for the type of the subroutine. Alright, very cool. It says this is a function, it's taking one argument of type 't' (that means anything because we don't have any information), and is returning a type 't', so also there we don't have much information. OK, very cool, but not very useful. Let's see #'foo2. #'foo2 is slightly different, it doesn't take any argument, but it's returning an integer included between 3 and 3. Wow, amazing! Let's get into something a little more complex:

'foo3 takes one argument we know nothing about,

but it's returning a number. And why it's returning a number? Essentially because 1+ is returning a number, and in all the other cases, it would signal an error if it's not happy about its input argument. Let's have a look to #'foo4.

'foo4 is a little bit more complex.

It will return nil if the 'when' condition is not satisfied, so it's type 'null' here. It can return a floating point; we don't do propagation of floating point so far, or it can return any integer between 4 and 9. Wow. Let's go on with #'foo5.

'foo5 is even more complex

because other than having to satisfy this condition, we can see that the result of the propagation of this complex condition is propagated also across the 'plus'. So this foo5 can return nil, a floating point we know nothing about, or an integer included between 12 and 24. Let's go on with #'foo6.

'foo6 is returning anything but an integer.

I think it should be pretty obvious why, because if it's not an integer we return it, otherwise we signal an error. Let's finish with #'foo7 very quickly.

'foo7 has another very complex condition,

at least for me, but it's also interesting to see that we are also propagating values for symbols. So we can return the symbol 'big, the symbol 'small, or an integer included between -100 and 100. Now, the question is: why all of this is useful other than having Andrea very happy when he's playing with this all day? Well, we have to come back one second to how Lisp_Objects are represented within Emacs. Lisp_Objects are represented as machine words, where we reserve a few bits to indicate the type. And every time we access the object, when this is a Fixnum, or a regular object where this is a pointer, we always have to extract these bits, make sure that they satisfy a condition, so make sure that we are going to manipulate the object of the type we expect, and then we can extract the object and do the manipulation. If the compiler managed to prove that the contained object is of the right type, it will be able to not emit the type check, and save a lot of instructions. This is a very powerful optimization. Let's discuss some potential future development in this area. First I think it would be extremely nice to extend the propagation to types that are not built in. There are a lot of cases where we could optimize effectively. For instance when we do a lot of accesses to structures -- lots of stuff like that -- where we keep on checking and checking the same object for the type where it's obvious, where it should be trivial to prove that it's the right type after the first access or things like that. So I believe this is a low-hanging fruit in terms of performance. Also I think it would be really nice to extend the declare mechanism to allow the user to declare argument types. (Optionally. Indeed, optionally.) Doing that would give the compiler a lot of information to propagate value types within the function. But those will allow the compiler to give the user really good diagnostics during compile time. Like the compiler could say: Hey, here you are calling a function that is expecting this argument of this type, but I'm proving that you are calling it with an argument that is of THIS type, and is not a subtype of the expected one. So you are doing something not coherent. This kind of interprocedural logic. And I think the compiler should also take advantage (under certain circumstances) of this interprocedural analysis in order to optimize even more, when possible. Also I think we should work on our ? to improve the code generation depending on the prediction that the compiler is doing. We already take advantage of those predictions, but I think we could do better. A quick look at some performance results. These are from the elisp-benchmarks package within GNU ELPA. This is the performance uplift, and we can identify about 4 &quot ;classes&quot ; of results. The first one there is no performance uplift, because there is not much we can do, and the time is probably not spent within the execution engine. And the ones around 3x are the ones Where probably we are not triggering manually specific optimizations, but just the fact that we are converting into native code is giving us this performance uplift. Then there is a bunch of other benchmarks where the Lisp optimizations are triggering, and the uplift is way bigger, and then we have 3 benchmarks that at the time are completely optimized out. That means the compiler became &quot ;so smart&quot ; that it was able to compute the result in the compile time and just put the result in the generated binary. Let's discuss a little bit the compilation model. This is an Hybrid one; it's both JIT-like and Ahead-of-Time-like. Emacs is composed of what we call an Emacs image, essentially the Emacs binary that we start. It's including all the C code, plus all the Lisp code that we preload. Then we have the rest of the Emacs Lisp codebase that can be loaded just if it's required. Same for the external packages, if we have any. If we build an Emacs Lisp with native compilation enabled, by default, only the Emacs image will be native compiled. All the other code will be compiled on the fly when it's loaded and executed the first time, if it's necessary. Same for the packages, in a transparent way and asynchronous way. Also worth noting that the result of this compilation will be stored into a cache directory within the home directory of the user. So it will be reused in the following sessions if the same file is loaded, without having to recompile multiple times the same file in different sessions. It works a little bit like this: When we load the byte-code for the first time, we spawn a native compilation process. Meanwhile we keep using the byte-code available. When the native compilation is finished, we hot-swap the definition of the functions that are contained in the file and start using the native code transparently. We do this asynchronously, and for more compilation units at the same time, so it looks a little bit like this. Let's try a quick demo of all of this machinery. I've started a fresh Emacs with native compilation support, and at this moment nothing is going on. We will test the machinery with Tetris, because I can't imagine anything better to test this. What I do expect is that when I launch Tetris it will be loaded, it will immediately start execution of the byte-compiled version, so we won't see any delay in the user experience, and in the meanwhile, a parallel process will start to native-compile Tetris itself. When the native compilation will be finished, the functions of all Tetris will be hot-swapped. So we will not see any interruption. So Tetris started, and it's running, we have seen no delay, and in the meanwhile, the native compilation probably already finished, we can have a look. In this I see the native compilation log buffer. So we see that Tetris has been native compiled, and all of its dependencies. Now Tetris is still running, but I can do &quot ;C-h f tetris&quot ; and we can see that 'tetris' is an interactive native compiled Lisp function, so it has been native-compiled. I can even disassemble if I want. OK, so very cool. I guess we can say this mechanism is working. Also worth noting that if I go back to the Async-native-compile-log buffer, we see we have compiled another bunch of files. I think these are because of my 'C-h f', this help function command and disassemble, and so on. The first time you run Emacs, you will have, from time to time, these processes spawned. Emacs is &quot ;compiling itself&quot ;, and it's replacing the byte-code definition with the native one. But after a few sessions, you will not see this anymore, because the output of this compilation, as I mentioned, are stored in the user directory. To conclude: Emacs with native compilation support is coming up in Emacs 28, that is gonna be the next major stable release that will be released. So we ought to celebrate with a big party, I believe. But before going to the party, I'd like to list a few points that I think have been success factors in upstreaming this work. It has been extremely important to get in touch with upstream as soon as possible, as soon as I had a proof of concept. It's been extremely important to involve the community as much as possible, and this included keeping a development blog, and posts about that on emacs-devel, and also producing material, participating in conferences, and giving presentations like the one I'm doing, to explain what I was doing and how it works. It has been extremely important, also, to be able to rely on the upstream infrastructure. So, to develop the software as a feature branch in the official git, but even more, I would say, to use the official bug tracker for solving bugs of this branch. This gave the opportunity to stay really in close touch with maintainers, and senior developers of Emacs. That helped me a lot. And at the same time they were informed about what I was doing and what was the status of this feature branch. Extremely important. And also I think it played a major role to try to design this enormous patch in a way that the impact on the current codebase was minimized (at least as much as possible). And also minimizing the impact on the user operation of the software, in this case Emacs. So yes, mandatory Special Thanks: Emacs developers, and especially maintainers and senior developers like Stefan Monnier, that helped me a lot across this long journey. And, well, all the community that really got so excited about this project and gave me the energy to go through all of this time and development and bugs and solving, etc. etc. So yes, it was a really exciting time, and I think we have to look forward and start thinking about how to improve all this for the following years. And that's it. I think I should be online for questions. Thank you very much. captions by John Cummings

Back to the schedule
Previous: Extending the "model" of Emacs to other applications
Next: Old McCarthy Had a Form