Literate Documentation with Emacs and Org Mode

Mike Hamrick

Actually a general-audience talk; just on the development track for scheduling purposes

Format: 43-min talk; Q&A: BigBlueButton conference room
Status: Q&A to be extracted from the room recordings

Talk

00:00.000 Introduction
00:57.760 Org Babel and literate programming
02:14.080 This presentation
04:53.480 Getting started
06:55.780 README
07:23.500 Writing a code block
08:10.460 :results none
08:40.320 Confirmation
10:36.960 Running blocks automatically
13:53.000 Export options
16:05.700 Substituting constants
17:25.740 Getting the properties
20:03.060 Macros
21:05.240 Properties in practice
22:09.020 Using a prefix
23:42.010 Switching distributions
27:14.150 A tour
30:16.200 TeX and LaTeX
31:09.250 Other prerequisites
32:00.060 Caching
36:20.610 Looking at the PDF
39:29.440 Errors
42:31.990 Final thoughts

Duration: 42:45 minutes

Q&A

Duration: 11:00 minutes

Description

When writing about programming or other technical subjects, you’re often weaving blocks of source code, program output, and raw data in with your prose. These supplementary materials are usually copied and pasted into your document from other sources, which can be difficult and tedious to keep up-to-date as things change. Inconsistencies and errors can easily creep in when you “hard-code” dynamic information like program output into your writing.

Wouldn’t it be great if the tool you used for writing knew how to run code in a variety of programming languages, collect and format output, and let you refer symbolically to all this dynamically generated content in your prose? In this talk I’ll demonstrate how to use GNU Emacs’ Org mode to create technical documents that do just that. We’ll explore the features of Babel, Org mode’s literate programming add-on, that make it convenient to edit, evaluate, and manage embedded code, output, and data all from inside GNU Emacs.

We'll also show how these literate documents can be exported to LaTeX and ultimately PDF format to create professional-looking output that looks stunning when printed or viewed.

Also shared at SeaGL 2023

Discussion

Questions and answers

  • Q: Did you develop a variant of your document for CentOS?
    • A:
  • Q: Great presentation. The preparation is outstanding. For someone like me who has never touched the Org mode side of Emacs, what do you feel is the most complex part to tackle? You made it seem simple, but the complexity is there... woof
    • A:
  • Q: How do you normally debug, e.g. view the logs or see failed statuses, when the commands in the src blocks fail? Especially if they output lots and lots of logs, and you need to see the full history of the build.
    • A:
  • Q: Do you find yourself doing plain-text exports? I saw you doing that as an example for a bit. How do you like to format them so they come out looking nice?
    • A:
  • Q: IIUC if you commit that eval line to your config then theoretically you could open an Org file prepared by someone else and it would automatically run the code in a "startup" block that might be malicious, right?
    • A: For sure. If you agree to have a block run when you load the document, you could get burned if it changes into something evil.

Notes and discussion

  • Seems like we could use some kind of extension that would hash a source block and allow you to automatically run ones you've marked safe
  • Property inheritance I still don't completely understand, heh.
  • seeing section on Org MACRO, recall having trouble a while back invoking a MACRO from inside a MACRO; is this a limitation or was I holding it wrong?
    • AFAIR, macros do support recursion
      • actually my issue was passing TITLE to a MACRO https://paste.rs/LZunR
      • yeah. "eval" macro arguments in particular are not expanded. you may raise it on the mailing list - looks like something worth considering
  • I almost wanted to pre-process my org mode files with a more advanced macro system like m4. But then I came to my senses.
    • When discussing Org mode as a replacement for Texinfo, this has been raised (Texinfo uses m4) (https://www.gnu.org/software/texinfo/manual/texinfo/texinfo.html#External-Macro-Processors), but why do you need m4 when there is Elisp? You can just put in a code block that will do all the work and eval on export
      • A: True. You can write elisp to do all the macro replacement, but you end up editing the buffer when you do that, which has its own disadvantages.
        • during export, it is a throwaway buffer
          • A: oh, I didn't think of that. Ultimately though org macros have a ways to go before they're truly useful in all contexts you might want to use them.
            • org-export-before-processing-hook runs before macro expansion but around the same time (we really need to document the export process step by step)
  • Thanks for the awesome presentation, I can't wait to add some of this stuff to my documents
  • I was pretty terrified to see that ChatGPT could write elisp
  • Also, loved the presentation — great walk-through of the thought process & how to improve. Was happy when Macros made their way in
  • Yeah. tramp would have been cool, but can be dangerous if you start doing sudo apt on the wrong machine
  • I tried cross-compiling Emacs for Serenity. Emacs uses some intermediate binaries (like make-docfile) during its build process, which causes issues with cross-compiling that I couldn't quite figure out.
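The first note in this list, about hashing a source block and automatically running the ones marked safe, could be approached through org-confirm-babel-evaluate, which (as the talk explains) accepts a function that receives the language and the block body. The sketch below is only an illustration of that idea: my/trusted-block-hashes and my/confirm-babel-evaluate are made-up names, not part of Org, and the allowlist stays empty until you fill it.

```org
#+begin_src emacs-lisp :results none
  ;; Hypothetical allowlist: SHA-1 hashes of block bodies you have reviewed.
  (defvar my/trusted-block-hashes '()
    "SHA-1 hashes of source block bodies that may run without confirmation.")

  ;; Hypothetical helper: Org calls it with the language and the block body.
  (defun my/confirm-babel-evaluate (_lang body)
    "Return nil (skip the prompt) only when BODY's hash is in the allowlist."
    (not (member (secure-hash 'sha1 body) my/trusted-block-hashes)))

  (setq org-confirm-babel-evaluate #'my/confirm-babel-evaluate)
#+end_src
```

The body string Org hands to this function may not be byte-for-byte what you see in the buffer (for example after variable or noweb expansion), so treat this as a starting point and not a finished safety mechanism.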

Transcript

[00:00:00.000] Introduction
Hello, everyone. This talk is on literate documentation with Emacs and org-mode. I'm going to take just a moment here to unpack what I just said. Emacs, as most of us probably already know, is a powerful text editor and Lisp programming environment from the 1970s. Chances are, if you're attending this talk, you already know a bit about Emacs. org-mode is an Emacs major mode and authoring tool that helps you write documents in a plain text markup language called Org. These Org documents can be exported to a number of different document formats, like HTML, PDF, ODT, Markdown, and more. org-mode has a lot of features. It can be an outliner, a to-do list manager, an agenda, organizer, and much more.
[00:00:57.760] Org Babel and literate programming
Today, we're going to be demonstrating what I consider to be org-mode's killer feature called Org Babel. Babel allows you to take human language prose, computer language source code blocks, and their outputs and weave them together seamlessly to form a cohesive document. It is seriously cool. Literate documentation is a play on the term literate programming, popularized by Donald Knuth in the early 1980s. Knuth's literate programming idea was that computer programs could be expressed in a natural language and be human-readable documents rather than written exclusively for machines to read. In a traditional program, you might have a bunch of machine-readable source code and a handful of human-readable comments, which attempt to describe what the program is doing. Literate programming flips this on its head. A literate program is a document that describes how the program works with machine-readable source code blocks inside of it. These source code blocks are later tangled out of the document and submitted to the machine either to be compiled or interpreted and ultimately run.
[00:02:14.080] This presentation
Throughout this presentation, you'll see my browser window here on the left side of the screen. And on the right side, I've got a terminal session running tmux. This allows us to have a virtual terminal window connected to two separate Linux machines, one running Ubuntu Server 22.04 and another running Fedora Server 38. I've specifically chosen these two distributions for my demo because they are representative of the two dominant flavors of GNU/Linux, Debian and RedHat. In both cases, these are bare-bones server editions with the stock packages installed. I've manually installed a few packages like Git, emacs-nox to get the terminal version of emacs, and tmux. But otherwise, these Linux installs are what you'd get right out of the box. For this demo, I've created a literate org-mode document that describes how to build GNU Emacs from its source code on both Debian and RedHat-based systems. While both operating systems are very similar, they differ substantially on which packages are installed out of the box, how optional packages are named, searched, and installed, and of course, the distributions have different names, like Ubuntu or Fedora. I chose building Emacs from source as a topic for this demonstration because while the process is largely the same on both RedHat and Debian, there are a lot of minor little differences that need to be accounted for, which really prohibits you from hard coding names of packages and package management tools and distributions into your document. I suppose you could create two versions of the same document, one specifically for RedHat and one specifically for Debian, but that would be really tedious to maintain. Like if, for example, you updated some prose in one document, you'd have to remember to do it in the other one too. And if you weren't careful, the two documents could drift out of sync.
In this demo, I'll show you techniques for creating dynamic, literate documents that can change based on parameters and constants embedded into the non-exported regions of the document. I'll show how with a single org-mode source document, you can press a couple of keys to configure it to export a RedHat-specific version of my building Emacs from source essay or a Debian-specific version.
[00:04:53.480] Getting started
All right, let's get started. We'll begin by firing up a new terminal Emacs session on my Ubuntu machine. Now, I installed Emacs on this machine using apt-get. And doing that, you get version 27.1, which is, hey, only two major versions behind the current version of Emacs. This is another reason why I thought writing a guide on how to build Emacs from source code might be a good idea. You can get a much newer version of Emacs on Ubuntu if you install it via Snap, but, uh, Snaps. Don't get me started. Now, I wanted to use a completely vanilla terminal mode install of Emacs for this demonstration because my personal Emacs config has a ton of packages installed and is heavily modified. I want folks to be able to follow along with a bog-standard, out-of-the-box Emacs config. The Emacs config on this Ubuntu machine has just two settings. I require org-tempo because my fingers are hardwired to use some of the handy shortcuts that it provides. And I also turn off the menu bar because I just can't stand to look at it. Let's begin by opening a file called buildemacs.org, which will be the source code for our literate org-mode document. Now, in preparation for this talk, I've already written this document, and we'll take a look at the finished product here in a bit, but let's first take a look at how we might approach this task. We'll start at the top of the document by filling out some export keywords. These keywords are something that every backend exporter, be it LaTeX or plain text or ODT or whatever, understands, and they're essentially document metadata. As you can see, I'm typing #+ followed by a couple characters and then M-TAB to auto-complete. If you hit #+ by itself and then M-TAB, you can see all the possible completions. And as you can see, there's a lot. The next thing we're gonna do is make a README section at the top of this document. This section is intended for folks who are looking at the org-mode document, trying to figure out what it's for. 
We don't want to actually export the section heading, so we're gonna tag it with the :noexport: tag. And then here, we just write something quick to let folks know that this document can potentially execute code and just a little something about what the document is for.
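As a rough sketch of what the top of such a document could look like at this point: the title, date, email address, and README wording below are placeholders chosen for illustration, not the speaker's actual text.

```org
#+TITLE: Building Emacs from Source
#+AUTHOR: Mike Hamrick
#+EMAIL: user@example.com
#+DATE: 2023

* README                                                          :noexport:
This Org document contains code blocks that can execute on your machine
when you evaluate or export them. Read each block before you run it.
```

The :noexport: tag on the README heading keeps the whole subtree out of every exported format while leaving it visible to anyone who opens the .org file.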
[00:07:23.500] Writing a code block
Okay, so now that we've written some text, let's try our hand at writing a code block. I'm getting pretty sick of looking at the default Emacs theme. All that blue and purple in the document makes it look bruised. Let's make an Emacs Lisp code block that switches the theme to one of my favorite built-in themes, Leuven. Leuven was created by my man, Fabrice Niessen, from whom I personally have learned a ton about org-mode just by studying his work. Now, if we cruise back up to the code block, we should be able to hit C-c C-c, and have it execute. And there you have it, a high-contrast color theme that was designed to look great in org-mode. So that's great and all, but there are a couple of things I don't like.
[00:08:10.460] :results none
First of all, we don't need to see a #+RESULTS block here, and that's because we're not really interested in what the Emacs Lisp function load-theme returns. I mean, it's great it returned t and all to indicate success, we just don't need to see it. We can slap a :results none header arg on the code block to keep things nice and clean. There are a lot of different header args, and I often confuse and misremember them. So I'll always refer back to the org-mode manual when working with them.
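A block along these lines would do what is described here. It is a minimal sketch and may differ from what was typed on screen; the second argument to load-theme simply skips the theme-safety prompt.

```org
#+begin_src emacs-lisp :results none
  ;; Load the built-in Leuven theme; with :results none, the returned t
  ;; is not written back into the document as a #+RESULTS block.
  (load-theme 'leuven t)
#+end_src
```

With point anywhere inside the block, C-c C-c evaluates it.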
[00:08:40.320] Confirmation
The second thing I don't like is that when we hit C-c C-c to execute the block, Emacs prompted us if we really wanted to run the block. Emacs Lisp is Emacs' mother tongue, and I don't wanna be hassled when speaking my native language. There's a variable that controls this called org-confirm-babel-evaluate. And this can be either set to t or nil to either always confirm or never confirm. If however, you provided a lambda, an anonymous function, Org will call your function with the name of the language and the source block that it's about to run. And your function can make the decision about if Emacs should ask you for confirmation or not. What I'm doing here is setting org-confirm-babel-evaluate as a "file local variable". This means whenever the file is opened by Emacs, it'll set this variable to be a lambda that returns nil, meaning don't confirm, on Elisp code blocks. As you can see, the variable is currently set to its default value of t, meaning always confirm. Now if we save the buffer, exit Emacs, and pop back in again, org-confirm-babel-evaluate should be set how we like it. We were however prompted for confirmation on setting the file-local variable, which controls if we're prompted for Elisp source code block evaluation. I feel like there's a Yo Dawg joke here somewhere. When we were prompted, we hit the exclamation mark, which automatically marks this variable as being safe. So you won't be bothered the next time you open this file. This variable is called safe-local-variable-values and if we pop over to our .emacs file, you can see that Emacs' customize tooling helpfully updated this variable in our config file for us.
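The file-local variable described above might look roughly like the following block at the end of the Org file. This is a sketch, not the speaker's exact lambda; it assumes the blocks are tagged emacs-lisp or elisp.

```org
# Local Variables:
# org-confirm-babel-evaluate: (lambda (lang body) (not (member lang '("emacs-lisp" "elisp"))))
# End:
```

The function returns nil (do not ask) for Emacs Lisp blocks and non-nil (ask) for every other language, so shell blocks and the like still prompt before running.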
[00:10:36.960] Running blocks automatically
Now that's great and all, but I really don't like having to hit C-c C-c on that source block every time I open this document just to bring up the Leuven theme. Let's have this source block run automatically every time the document is opened. Now I know what you're thinking. Shouldn't you just put all of this configuration stuff in your .emacs file and keep it out of the document? Well, that's what I've done with my personal Emacs config, but we want this document to be able to be used by folks with a completely vanilla Emacs setup, or even a completely tricked out Emacs setup, so we can't assume anything. The idea is if the Emacs user who opens the document agrees to setting all of the variables and running all of the code within, they'll be able to export the document as well as run all of the code blocks inside of it just as we intended. And the differences in base Emacs configuration will be completely minimized. Now it's worth pointing out that the file-local variables we're setting here are local, in this case, buffer-local. The configuration we use in this document won't override someone's carefully constructed org-mode setup. The first thing we're gonna wanna do in order to make this block execute when the document is loaded is to give it a name. It's always a good idea to give every source block you create in your document a unique name, even if you don't refer to it elsewhere. I do this because when I'm debugging my documents, Emacs will prompt me about running a block. If the block has a name, Emacs mentions it, and I know there's a problem with the result caching or something with the "foo" block. But if the block doesn't have a name, it can be really hard to figure out which block Emacs is complaining about. So I always name my blocks. Now we're gonna add another file local variable, but this one is special. If your "variable" just happens to be named "eval", it means that Emacs should evaluate the Lisp expression that follows. 
Here we'll use the progn function to sequentially run two elisp functions and return the value of the last one executed. The first function is org-babel-goto-named-src-block, which jumps us to the startup block. The second one is org-babel-execute-src-block, which executes the current source block. That should get the job done. Now all we have to do is save the document, exit Emacs, jump back in, and once we've confirmed that we're willing to run the new "eval" line in our file local variables, we're good to go. Now if we want to add new configuration stuff to the document, we can just add it to the startup block and not have to muck about with confirmations or adding new file-local variables or whatever. And just like before, we'll let Emacs' customize system save this decision to our .emacs file. Now that all that business with confirmations, file-local variables, and the startup block are out of the way, we can get on with writing our introduction. We'll create a new top level headline called introduction and explain to the reader of the exported document what this is all about.
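Put together, the named startup block and the eval line could be sketched like this. The block name startup comes from the talk; the exact layout and the confirmation lambda are assumptions.

```org
#+name: startup
#+begin_src emacs-lisp :results none
  ;; Per-document setup that should run whenever the file is opened.
  (load-theme 'leuven t)
#+end_src

# Local Variables:
# org-confirm-babel-evaluate: (lambda (lang body) (not (member lang '("emacs-lisp" "elisp"))))
# eval: (progn (org-babel-goto-named-src-block "startup") (org-babel-execute-src-block))
# End:
```

Emacs asks before honoring an eval line, so anyone opening the file still gets a chance to decline.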
[00:13:53.000] Export options
Now as you can see, we've actually hard-coded the name of the Linux distro in our prose. I promised you a single document that could be for either RedHat or Debian distros, so we can't have this. Astute members in the audience have probably been uneasy ever since I hard coded the name "Debian" in the README section above. One way of solving this problem is by using exclude tags. Let's add the #+EXCLUDE_TAGS export keyword to our document. This keyword tells the exporter, "Hey, if you see a headline tagged with any of these tags, don't export it." By default, the tag :noexport: is excluded. And if you'll notice, we tagged our README section with that tag, so it doesn't show up in the exported document. We'll keep this tag in the list, but we'll also add the tag :redhat: as a tag to exclude. Now it's just a matter of creating two introduction sections, one for Debian, one for RedHat. And if you want the RedHat version of the document, you can just modify the #+EXCLUDE_TAGS line at the top of the document. Awesome, right? Right? OK, this is not that great. Well, it does work. And you can see if we export the document, we'll get something that only references Debian, and the :noexport: and :redhat: tagged headlines are omitted. This strategy would work great when the RedHat- and Debian-specific sections are substantially different, but that's not the case with the introduction. We definitely don't want to have to maintain two distinct introductions. I also noticed that the export tags are included in the exported document. That's a terrible default. We'll fix that, and we'll also ensure that my email address appears at the top of the document. Let's also take this opportunity to get rid of the table of contents. We don't need it. These are all export option settings and can be modified using the options keyword at the top of the doc. The manual is really your friend here, as there are a ton of export options. 
Now when we export the document again, it should look a lot better.
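The exclude-tag approach and the export options mentioned in this section could be sketched as follows. The headline text is illustrative; tags:nil, email:t, and toc:nil are the standard #+OPTIONS switches for hiding tags, showing the email address, and dropping the table of contents.

```org
#+EXCLUDE_TAGS: noexport redhat
#+OPTIONS: tags:nil email:t toc:nil

* Introduction                                                      :debian:
This guide explains how to build Emacs from source on Debian.

* Introduction                                                      :redhat:
This guide explains how to build Emacs from source on RedHat.
```

Swapping redhat for debian on the #+EXCLUDE_TAGS line flips which introduction is exported, which is exactly the duplication the rest of the talk works to avoid.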
[00:16:05.700] Substituting constants
Now that we've cleaned up the look of the exported document, we'll take a look at a better way of solving the problem with the introduction. Thinking like a programmer for a moment, what I really want here is a way of specifying a constant. Rather than hard-coding the name "Debian" or "RedHat" or whatever into my document, I want to substitute that text with a symbolic constant, named something like "distro", that can dynamically change to "Debian" or "RedHat" or "Slackware" or whatever, depending on how the document is configured. In the past, I've come up with some pretty cumbersome ways of doing this, but eventually I stumbled upon the idea of using Org-mode properties as a way of storing these constants. Like it says in the docs, properties are key-value pairs that are associated with an entry and they live in a collapsible properties drawer. Let's do a bit of cleanup on our document and we'll put things into sections. We'll also add a section for document constants. And that's where we'll put the properties drawer with the "distro" property.
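A constants section with a properties drawer might look like this sketch. The heading name and the :noexport: tag are assumptions; the distro property is the one named in the talk.

```org
* Document constants                                              :noexport:
:PROPERTIES:
:distro: Debian
:END:
```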
[00:17:25.740] Getting the properties
Now the question is, how do we reference these properties in the document? It turns out there's an Elisp function called org-property-values, which does what we want. If we run it and give it the name of our property, it returns a list with the string "Debian" in it. It's worth noting that this function is named org-property-values with values being plural. In org-mode, there could be a property named "foo" that has different values depending on which heading level you're at in the document, which is why the function returns a list. For our purposes though, we can just pull off the first value in the list with car and we're good to go. Now we'll make an Emacs Lisp function called get_prop that does just that. This function takes one argument called prop, which is the property to look up and we'll give it a default value of "distro". So we can hit C-c C-c on the block to verify that it works. Now we just have to make an inline call to our get_prop function within the prose of the introduction section. And that should get us much closer to not hard coding distro names into our document. But before we do that, I need to clean up something that's been bothering me. By default, Emacs' fill-column variable is set to 70 characters, which may have been appropriate for 1970, but it's not great for 2023. We'll just cruise up to our startup block and set the variable there. We'll hit C-c C-c, and now our document will wrap at 100 columns, which for our purposes, I think is much more reasonable. The org-mode syntax for making an inline function call within the prose of your document is call_, followed by the name of the function, some optional header arguments, and then the function arguments. Now, when we export the document, we see that it's replaced our previously hard coded "Debian" with the value from the property. Huzzah! Now this is close to, but not exactly what we want.
You can see that "Debian" is surrounded by a backtick and a single quote, which is the plain text exporter's way of showing you verbatim text. In more sophisticated document backends, verbatim text is rendered in monospace. We can fix that by adding a ":results raw" header argument to the inline call. Now, when we export the document, it looks like what we'd expect. Now this is getting better, but it's still not great. The call_ syntax is pretty cumbersome, and it's a lot to type every time we want to reference a constant and not have it be marked up as verbatim. This is where org-mode macros come to our rescue. If we head to the top of the document, we can create a couple of macros using the #+MACRO: export keyword. We'll define two macros with short names. One named "p" for "property", and the other one named "pr" for "property raw". Org-mode macros are expanded when the document is exported, and any positional arguments provided are referenced by their number. Now in the introduction, we can use the macro replacement syntax, which is three curly braces, followed by the macro name and any arguments, and then three ending curly braces. You see why I kept the macro name short. That's six curly braces in total we're typing, which still takes up a fair amount of space.
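The pieces described in this section (the get_prop block, the inline call, and the two macros) could fit together roughly as below. The names get_prop, p, and pr come from the talk; the placement of the :results raw header argument follows the general inline-call form in the Org manual and is an assumption about what was shown on screen.

```org
#+MACRO: p call_get_prop(prop="$1")
#+MACRO: pr call_get_prop(prop="$1")[:results raw]

#+name: get_prop
#+begin_src emacs-lisp :var prop="distro"
  ;; org-property-values returns a list of every value the property
  ;; has in the buffer; take the first one.
  (car (org-property-values prop))
#+end_src

* Introduction
This guide explains how to build Emacs from source on {{{pr(distro)}}}.
```

On export the macro expands to the inline call, which is then evaluated and replaced by the property value.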
[00:21:05.240] Properties in practice
Now let's take a look at how we might use these properties in practice. Debian and RedHat distros differ on how they install packages. So we're gonna want an "install" property, where in Debian we use sudo apt-get install -qq, and on RedHat we'll use something like sudo dnf install -y. Now development packages also have a different naming convention. For example, the ncurses library on Debian is called libncurses-dev, where on RedHat it's called ncurses-devel. There are likely going to be many more little differences like this that we'll need to solve with properties. Now I already don't like where this is going. Switching between the Debian and RedHat versions of the document is gonna mean commenting and uncommenting out a bunch of different properties, which is pretty janky.
[00:22:09.020] Using a prefix
Luckily we can solve this problem with a little bit of Emacs Lisp. We'll start by modifying our properties, so their property names are prefixed with either deb_ or rh_ to signify which distro the property applies to. We'll also create a single property called "prefix", which will be prepended to the property name by the get_prop function if the requested property is not found. This way, when we want to switch between the Debian and RedHat versions of the document, we just need to change the prefix property. So now we'll change the Elisp code. So we'll use a let expression with two bound variables. The first one is called ret, which determines if the initial call to org-property-values succeeds. The second variable is called prefix, which is the prefix property. If the first call to org-property-values succeeds, we return it as normal. If not, we concatenate the property value that was passed into the function onto the prefix and try again. Now when we call the get_prop function with "distro" as the prop argument, it won't be found. So the code will slap our prefix tag on the front, making it something like rh_distro, and it will be found and returned. Let's see that in action. All right, now we're talking.
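A sketch of the prefixed properties and the revised get_prop follows. The variable names ret and prefix and the deb_/rh_ prefixes are from the talk; the ncurses property name is a made-up example, and the body is one plausible way to write the fallback, not necessarily the speaker's code.

```org
* Document constants                                              :noexport:
:PROPERTIES:
:prefix: deb_
:deb_distro: Debian
:rh_distro: RedHat
:deb_install: sudo apt-get install -qq
:rh_install: sudo dnf install -y
:deb_ncurses: libncurses-dev
:rh_ncurses: ncurses-devel
:END:

#+name: get_prop
#+begin_src emacs-lisp :var prop="distro"
  (let ((ret (car (org-property-values prop)))           ; direct lookup
        (prefix (car (org-property-values "prefix"))))   ; e.g. "deb_"
    ;; Return the direct hit if there is one; otherwise retry with the
    ;; prefix prepended, e.g. "distro" becomes "deb_distro".
    (or ret
        (car (org-property-values (concat prefix prop)))))
#+end_src
```

Changing the single prefix value from deb_ to rh_ is then enough to redirect every lookup.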
[00:23:42.010] Switching distributions
This setup is starting to look pretty good, but there are just a few things that I want to add before we move on. First of all, I think the document should have a subtitle, something that tells you if you're looking at the RedHat or the Debian version of the document. I also think it would be great if the file name of the exported document reflected the distribution as well. I also want to add a quick Debian only section to the document that explains how it got its name. Now let's see what happens when we export the document. This did not work out as we wanted. As you can see, the macro we used in the subtitles didn't expand properly, and as a result, our subtitle didn't render right. Sadly, you can't use macros or inline function calls everywhere. And one place where they don't work is inside of certain export keywords. So we're gonna have to hard code them here. Another mistake that we made is we forgot to update the #+EXCLUDE_TAGS export keyword, because with the RedHat version of the document, we want to exclude the Debian tag. Now when we export the document, everything should be correct. The word RedHat should appear in the subtitle, and the Debian fun fact section should not be present. Now we just need to add a section to the README that explains the steps you need to take in order to switch the document from RedHat to Debian. Okay, let's see here. We have to change #+SUBTITLE, change the #+EXCLUDE_TAGS, change the #+EXPORT_FILE_NAME, and change the prefix property. This is OK, but it's not great. Emacs Lisp can once again come to our rescue. What we'll do is make an Elisp code block that we'll invite the user to hit C-c C-c on. And the code block will essentially make all these changes in the document for them. This code block, which we'll call switch_distro, takes one argument called os, which by default is set to "Debian". It starts out with a let expression that defines three bound variables.
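A block in the spirit of switch_distro might be sketched as follows. The block name, the os argument, and the three variables (debian, noexport, prefix) come from the talk; the replacement strings, the exported file name, and the use of let* (needed here because the later bindings depend on debian) are assumptions made for illustration.

```org
#+name: switch_distro
#+begin_src emacs-lisp :var os="Debian" :results none
  (let* ((debian (string= os "Debian"))           ; t when switching to Debian
         (noexport (if debian "redhat" "debian")) ; tag to exclude on export
         (prefix (if debian "deb_" "rh_")))       ; property name prefix
    (save-excursion
      ;; Each entry pairs a line-matching regexp with its replacement line.
      (dolist (edit (list
                     (cons "^#\\+SUBTITLE:.*$"
                           (concat "#+SUBTITLE: " os " edition"))
                     (cons "^#\\+EXCLUDE_TAGS:.*$"
                           (concat "#+EXCLUDE_TAGS: noexport " noexport))
                     (cons "^#\\+EXPORT_FILE_NAME:.*$"
                           (concat "#+EXPORT_FILE_NAME: buildemacs-" (downcase os)))
                     (cons "^:prefix:.*$"
                           (concat ":prefix: " prefix))))
        (goto-char (point-min))
        (when (re-search-forward (car edit) nil t)
          (replace-match (cdr edit) t t)))))
#+end_src
```

Because every pattern is anchored at the start of a line, the indented strings inside the block itself are not matched when the block searches its own buffer.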
The debian variable is a boolean that is true if the distro we're switching to is Debian. Based on the value of this boolean, we'll set the noexport and prefix variables accordingly. The save-excursion block tells Emacs that we're going to be moving around in the document and to remember to put our point back where we started when the block finishes. After that, we essentially go to the top of the document and search and replace the subtitle, exclude_tags, export_file_name, and the prefix. Pretty cool. Let's see this in action. If we hit C-c C-c on this block, we should see the document automatically change a bit. And now when we export it, we get the Debian version of the doc. If we want to change it back, we can just head back over to the code block and change the default value for the os variable from "Debian" to "RedHat" and hit C-c C-c again. And now when we re-export, we're looking at the RedHat version of the document. Just as an aside, if you ever thought to yourself, "I should learn Emacs Lisp someday," make it someday soon. You'll be happy you did. Not only is it a fun programming language, but you can do powerful things with it in Emacs, which I hope is a point that folks take away from this talk.
[00:27:14.150] A tour
All right, that was a lot. Now that we've spent the past 20 minutes or so digging into some of the tips and tricks I used when creating my build Emacs from source document, we'll say goodbye to this document we've been working on and we'll start a tour of the actual literate document I wrote. A document that I'll demonstrate actually downloading and building a new Emacs when I export it on both my Ubuntu and RedHat virtual machines. I'll also show you how org-mode can generate slick professional looking PDF files through the power of LaTeX. We'll start here at the orgdemo2 directory, which I've cloned from GitLab. This repository has all the source materials for this talk. The buildemacs.org file is where most of the good stuff is. So that's where we'll start.
There's a lot of file-local variables that we'll need to confirm. So we'll do that too. So the first thing we're gonna do is hit C-u TAB twice, which will give us a top-level overview of all of our headings. As you can see, we've got a lot of the same familiar export keywords we had before. #+TITLE, #+SUBTITLE, #+AUTHOR, #+EMAIL, plus a few we haven't seen before. For example, I've squirreled away a lot of the #+LATEX_HEADER export keywords in this file called latex.setup. And I did this just so they don't clutter up the document. Much of the LaTeX magic that makes the exported document look good is in these headers. LaTeX commands begin with a backslash. And a common one we use a lot here is \usepackage. This lets us bring in packages like geometry, svg for the cool SeaGL SVG logo, fancyhdr and fancy verbatim [fancyvrb] to keep things looking pretty fancy. Using a scalable vector image format makes it possible for us to do really cool things like having a scaled-down version of the SeaGL logo appear in the fancy footer below. I also include some macros in a separate file just to help keep things tidy in the main document. Here I've got the familiar macros we've seen before for get_prop. But here I use different permutations depending on if I want results raw or raw verbatim or just verbatim. I also have a couple of macros here at the top of the file that are for pulling strings out of results blocks and then trimming them so there's no white space on either side. Like in the version of the document we worked on at the start of this talk, the real document also has a README section marked with the :noexport: tag. It also has a section about choosing which version of the document to export and a code block on how to switch between them. It's also got a lot of helpful information in it like what OS and Emacs versions the document has been tested to "run" on, a section on the LaTeX prerequisites and the section on executing the document's various code blocks.
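For readers who have not split LaTeX settings into a separate file before, the arrangement could look something like this sketch. The file name latex.setup and the four package names are from the talk; pulling the file in with #+SETUPFILE and the absence of package options are assumptions.

```org
# In buildemacs.org:
#+SETUPFILE: latex.setup

# In latex.setup:
#+LATEX_HEADER: \usepackage{geometry}
#+LATEX_HEADER: \usepackage{svg}
#+LATEX_HEADER: \usepackage{fancyhdr}
#+LATEX_HEADER: \usepackage{fancyvrb}
```

Each #+LATEX_HEADER line is copied into the preamble of the generated .tex file, so the main document stays free of LaTeX clutter.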
[00:30:16.200] TeX and LaTeX
The latter two sections we'll take a look at now. Out of the box on Fedora and Ubuntu server distros, the TeX typesetting system, also by noted computer scientist Donald Knuth, is not installed. So we'll need to install some packages. Starting out, we'll need the texlive package, which gets you a fully featured TeX setup. This also gets you LaTeX, which can be viewed as a distribution of TeX macros. You'll also need XeTeX. This gets you Unicode support and lets you use modern fonts. We'll also want to install pdfTeX. This gets us the ability to generate PDFs from TeX sources. And finally, we're gonna need to install latexmk, which is a Perl script that knows how to run LaTeX multiple times in order to properly deal with intra-document links.
[00:31:09.250] Other prerequisites
But wait, there's more. We're also gonna need Inkscape to rasterize our SeaGL vector logo at different resolutions. And we're gonna need the JetBrains Mono font to make our source code look snazzy. We'll also need the Inter font to make our prose look snazzy as well. I've helpfully added a bash code block in the README that you can hit C-c C-c on to install. This really does lock up Emacs for a few minutes and it's sort of annoying. When we export the document and turn off all caching and it actually builds Emacs for real, Emacs can be locked up for tens of minutes. There's a package called ob-async that I've been meaning to check out that might help here. But since I wanted this document to work on bog-standard Emacs setups, I didn't get around to it.
[00:32:00.060] Caching
Before we get into talking about running the document, let's talk briefly about results caching. We'll take a look at the section of the document where we talk about Git tags for an example. The num_tags bash code block determines how many tags there are in the Emacs Git repo. And when I hit C-c C-c on that block several days ago, when I was first creating the document, that number was 183. That result has remained cached in the document since then. And you can see a snippet of the SHA1 hash of the contents of the source block below. You can see where I referenced the result using the sr for string raw macro in the prose below, and how it gets rendered in the exported PDF document. All the source blocks in the exported sections of the document include cached results like this. If I export the document now, it won't take that long to do because while there are a ton of code blocks in the exported sections, they're all cached. Now let's get back to the section of the README that explains how to execute the code in the document. Here I explain that if you want to build Emacs on your computer using this document, you've got a couple of options. 
The first option is to manually invalidate the caches and hit C-c C-c on every code block in the main document. This lets you supervise the entire process, and it also creates new cached result blocks, but it's time consuming. There is also an internal link to the main document here, and you can jump to it with C-c C-o. This is one of those intra-document links that is really tricky to get right with LaTeX, and is why we opted to use the latexmk Perl script to build the PDF version of the document. I'm mentioning it specifically here because it took me forever to figure this out. The second option you've got is to change the default header arg from :cache yes to :cache no at the top of the document. If we cruise up to the top of the document, you can see that this header argument property basically says that unless a code block explicitly says otherwise, it's by default supposed to be cached. That's how we were able to export the document before so quickly. The code block named no_cache_no_confirm uses the save-excursion and regex replace trick that I demonstrated earlier to munge the default cache header arg from ":cache yes" to ":cache no". And it also turns off confirmations on bash code blocks. Let's do that now. Now we'll export the document to PDF, which will ignore the cached result blocks and clone the Git repository on Savannah, create a branch that points to the most recently tagged version of Emacs 29, run configure a handful of times, installing packages to fix missing dependencies along the way, build Emacs, install Emacs in our home directory, verify that it has successfully built a binary, run it in batch mode with some sample Elisp and show the file sizes and dates of the generated files. This is gonna take a while. And while it's running, we'll pop over to our Fedora box. All right, now we'll fire up Emacs, hit C-c C-c on the configure_document code block to configure the document for RedHat since Fedora here is a RedHat based distro. 
Then what we'll do is we'll pop down and hit C-c C-c on the rh_install_latex code block to install the LaTeX prerequisites for this Fedora virtual machine. Finally, we'll execute the no_cache_no_confirm block and then kick off the export. Then we'll go and check back on what's happening on the Ubuntu box. Ooh, top looks pretty quiet. I think the export is complete. Ooh, those are the words I love to see in the status area, PDF file produced!
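The cache switch described earlier in this section boils down to rewriting one keyword line near the top of the document. In the talk this is done by an Emacs Lisp block using save-excursion and a regex replace; the sketch below swaps in sed on a scratch file purely to show the before and after. The exact spelling of the property line is an assumption (the talk only says the default header arg goes from :cache yes to :cache no), and the file is a temporary stand-in, not the real buildemacs.org.

```shell
#!/bin/sh
# Hypothetical scratch file standing in for the top of the Org document.
# The "#+PROPERTY: header-args" spelling is an assumption for illustration.
demo_file=$(mktemp)
printf '%s\n' \
    '#+TITLE: Demo' \
    '#+PROPERTY: header-args :cache yes' > "$demo_file"

# Flip the default from ":cache yes" to ":cache no", touching only the
# header-args property line and leaving every other line alone.
toggled=$(sed '/^#+PROPERTY: header-args/ s/:cache yes/:cache no/' "$demo_file")

printf '%s\n' "$toggled"
rm -f "$demo_file"
```

With caching off by default, every block without its own :cache setting is re-run on export, which is why the export that follows takes so much longer than the cached one.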
[00:36:20.610] Looking at the PDF
Now I can't use my web browser to take a look at this PDF file because I haven't set up a web server or anything like that on the Ubuntu virtual machine. I can, however, use TRAMP with the ssh method to poke around on the ubuntu host on my personal version of Emacs. So let's do that. Okay, so now if we go into the source directory and then we hop into the orgdemo2 directory and then we look at the deb version of the PDF, there she blows. Now, if we go down to the Building Emacs section, we can see that it built. And if we look in the bin directory, we can see that at 17:01, that's when all of those files got created. Also the file creation date on the PDF is 17:01. So all of this code executed roughly the same time the PDF was created. All right, so now let's head back over to the Fedora box and then we'll navigate to the source directory, the orgdemo2 directory, and there is our RedHat version of the built Emacs PDF. And Bob's your uncle. And you can see it is the RedHat version of the document because this is a RedHat box. And if we go over to the What did we install? section, you can see that these binaries were built at 17:35. And now if we pop open dired and we take a look at the PDF, we can see it also was created at 17:35. All right, in the couple minutes remaining, I thought it would be a good idea just to take a look at the document and maybe just go through some of what it actually does in explaining how to build Emacs from source. We'll look at the RedHat version since we're here. And the first thing you do is you have to get access to the source code. And before you can do anything, this is a RedHat-specific section where you need to install some development tools. And this development tools group actually has Git. Now I installed Git earlier, but if you didn't do that, that would be the first thing that you need to do. We create a source directory, we cd into it, we clone the repo from Savannah. And then we start to take a look at some of the Git tags. 
And we showed this before where we check out how many different tags there are. And then we run this kind of funky Git command to sort of list all the tags that begin with 'emacs-29', and we sort them by when they were tagged. So we can see that Emacs 29.1.pretest is the most recent version. So that's the one we grab and that's the one we decide to build. And then we create a branch that is based on this tag. And this is dynamically generated based on what we saw here. So that's what we use here. In this case, we're piping standard error to where standard out goes. That's another trick. If you want to actually see an error get created, org-mode will capture any errors that code blocks produce, and it will show you the error message in a buffer. So if you actually wanna show what it looks like when something errors out, this is the trick you have to use. And then what we do is we look for a configure script and there isn't one. And then we realize, uh-oh, we're gonna have to deal with autotools. So, you know, we run the autogen script and it complains because we're missing some prerequisites. So we have to install autoconf, and then we run it again, and finally it generates a configure script. And this is another case where I pull this number right here into the actual prose. And I can see it's, oh, it's, you know, this how many bytes. When was the last time you wrote a shell script that was this many bytes long? And then we configure the build process. And, you know, it's not gonna work right away because we don't have GNU Texinfo installed. So we gotta do that, which we do with dnf install here. And then there's this section that is either RedHat- or Debian-specific that talks about, like, if you don't know the name of a package that contains a given file name, how do you query it? And in the RedHat world, you use dnf provides makeinfo. In the Debian world, you do something entirely different. And then we have to install the ncurses binary. 
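The standard-error trick mentioned above, where a block's error text is made to show up in its results, can be sketched in plain shell. Here missing_configure is a hypothetical stand-in for a command that fails, and its message text is invented for illustration; the only part taken from the talk is sending standard error to wherever standard output goes.

```shell
#!/bin/sh
# Hypothetical stand-in for a failing command, such as looking for a
# configure script that has not been generated yet.
missing_configure() {
    echo "configure: No such file or directory" >&2
    return 1
}

# Capturing standard output alone picks up nothing, because the message
# went to standard error. "|| true" keeps the failure from aborting.
stdout_only=$(missing_configure 2>/dev/null || true)

# With 2>&1, standard error follows standard output into the capture,
# so the error text becomes part of the result.
with_stderr=$(missing_configure 2>&1 || true)

printf 'stdout only: [%s]\n' "$stdout_only"
printf 'with 2>&1:   [%s]\n' "$with_stderr"
```

The same redirect at the end of a command inside a bash source block is what lets an error message be rendered in the exported document rather than only in a separate buffer.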
And finally we get like a minimal configuration and you can see that there's a whole bunch of nos here. So, you know, we don't have cairo, we don't have imagemagick, we don't have dbus, you know, there's a whole bunch of stuff we don't have. We don't have X, we don't have libjansson, no tree-sitter. This is really a bare-bones Emacs that is strictly terminal mode. Then we actually build Emacs, which is, you know, kind of boring, we're just gonna type make and then make is gonna run successfully. And make is gonna spew a ton of output, right? So here's where I do that /dev/null trick, where I pipe everything to /dev/null and then I, or I pipe standard output to /dev/null and then I pipe standard error to wherever standard output's going. And then at the end to say that it ran successfully, I say "Make ran successfully!" Then we take a look at the Emacs binary and you know, it's an ELF binary. And, you know, because this is running on my Mac, this is an ARM-based machine, this virtual machine is. Oops, and this is a bug. This really should be a macro call, but I think I have the wrong number of curly braces or something in there. I need to figure out why that's not right. I'll look into that later. And then we install Emacs and then we kind of show like the file sizes of everything in the home directory. And then we, you know, show the binaries that got installed.
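The /dev/null trick from the make step can be sketched the same way. noisy_build is a hypothetical stand-in for make so the example runs anywhere, and the "Make failed" branch is an addition for completeness; the talk only mentions the success message.

```shell
#!/bin/sh
# Hypothetical stand-in for "make": chatty on both streams, exits 0.
noisy_build() {
    echo "compiling lots of files..."
    echo "warning: something harmless" >&2
    return 0
}

# Send standard output to /dev/null first, then point standard error at
# the same place. Order matters: "2>&1 >/dev/null" would leave standard
# error attached to the original standard output.
status_line=$(
    if noisy_build >/dev/null 2>&1; then
        echo "Make ran successfully!"
    else
        echo "Make failed"
    fi
)

printf '%s\n' "$status_line"
```

Only the one status line reaches the results, so the exported document stays readable while the exit status still decides which message is printed.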
[00:42:31.990] Final thoughts
Anyway, so this is the final thoughts section. And my final thoughts are, is I hope you enjoyed this talk and I hope you actually learned a thing or two. All right, thanks everybody. And I'll see you all next time.

Captioner: jc

Q&A transcript (unedited)

10 or 15 minutes of on-stream Q&A time. But if there's more questions than that, people are welcome to stay. If Mike has the time to answer some more, then Awesome. conference. So I am spudpnds, which is spud upside down on IRC, if you want to hit me up on IRC. Nice. and it is, did you develop a variant of your document for CentOS? Red Hat distributions other than Fedora. I would like to expand the document out to Windows and to macOS as I think a lot of people really want to build Emacs on those platforms because it's much harder to get Emacs binaries running on those platforms. Although they're around on the internet, it's not as bad as it used to be, but building Emacs is a very fun thing to do. And I encourage everybody to do that. here on BigBlueButton. EXC or Matt saying, great talk, good demonstration of what's possible. And Aaron thanking Mike, saying awesome presentation. And they missed the first few minutes and have to rewatch to get the portion that they missed. into 40 minutes. So I spoke quickly. I have a feeling I may have left some folks behind who weren't paying close attention. So rewatching might help. the shell functionality or Babel and last March they added async evaluation into session code blocks. Very cool, especially when you're doing something that takes a long time. It would be nice if Emacs wasn't locked up. I will definitely have to check that out. I use this technique at work a lot, like when I write documents to explain things to coworkers and such. And one of the things I had to explain was how to build AWS MySQL databases and replicas, and how to build them with very specific parameters to work with a system called Vitess. And when I was running that document, building these kinds of MySQL databases in AWS would lock up Emacs for 20, 25 minutes at a time. So, yeah, I'm really excited about async evaluation. Totally. Oh yeah, Python mode I think has had async for shell blocks for a while. 
I think there's a third-party package on ELPA that adds async support for that. But yeah, I explicitly wanted to make sure that it would work with super vanilla stuff. Oh, it's built in. I see. Yeah, I didn't realize it was built in for Python blocks. I'll have to check that out. There's so much Emacs. It's hard to wrap your head even around a tiny portion of it. It's such a deep topic. Looks like somebody in IRC said, I can't wait to add some of this stuff to my documents. And that really makes me happy. I hope people go out and write literate Org Mode documents that do amazing things. When's the next talk? We have like, minutes live on stream for Q&A. Blaine asks, are you running Emacs from the host machine? And yeah, so I'm running Emacs on the exact same machine that I'm building Emacs on. And I had first thought about doing that over TRAMP. And I thought that would be a very cool demo to show how you could do that remotely on TRAMP so you didn't need Emacs on the host machine. But I decided it would be a lot easier, and as I ran into a deadline to get the talk completed, I abandoned that notion for the straightforward approach. But ideally, I would spin up virtual machines and then using the Org Mode document and having Org Mode reach out to those machines via SSH and TRAMP. Oh yeah, there's also a little bit of discussion on IRC about Org macros and how they made their way into the document. And I remember when I first discovered Org macros by reading the Org mode documentation, I was really excited because I thought I could limit a lot of the boilerplate I end up typing. But as we discussed, Org macros, I think, only work in one context in your Org mode document, and I think that's in the prose section. So you can't resolve a macro inside a header arg, for example, or inside an options block. It would be awesome if macros worked everywhere, but I'm happy to have them just as they are now. what's possible with literate documentation. This is mind-blowing. 
Yeah, I think so too. I first saw this technique in Howard's video, Literate DevOps, and I remember I was just picking up parts of my mind after it exploded after having watched that video. So I wanted to do some of it myself, and that's where I came up with a couple different approaches to that. It's not just for, you know, making literate Emacs configurations. question on the pad. Someone saying great presentation. The preparation is outstanding. And for someone like me that never touched the Org mode side of Emacs, what do you feel is the more complex part to tackle? You made it seem simple, but the complexity there. set up the way you want it is the hardest part. So some of the defaults are, you know, they don't look good when you render them out in LaTeX and finally PDF. And there's a lot of work to be done to tweak the LaTeX environment so it looks as pretty as you might want it. And then just Org Mode has a lot of knobs that you can tune, and they have a pretty large impact on how your document is exported. So I think the hardest part is just knowing what's possible and knowing where all the knobs are to tune and twist. And I think we have about a minute or so on the stream. So I'll read this question as well. But folks, you're welcome to continue on the pad or just come join here on BBB after myself and the stream move on to the next talk. Yeah, and the next question is, how do you normally debug, for example, view the logs or see failed statuses when the commands in the source blocks fail, especially if they output lots and lots of logs, and you need to see the full history of the build. whenever I export a document. If there's a failure, that's typically where it's written to. 
And I will actually kill the messages buffer before I export so I know that the only messages in the buffer are for my given export. And I mentioned that debugging trick where you name all of your org-mode source blocks. So if there is a problem in one of the blocks, it'll actually tell you the name of the block the error occurred in. If you don't do that, it just gives you a position number in the buffer. And whenever I tried to convert those position numbers to actual places where the error occurred, it was never exactly where I suspected it would be. So I found that very difficult in debugging. So the only real debugging tip I have is name your source blocks, even if you don't refer to them later. stream. And I also have to drop as well. But thanks again so much, Mike. And folks are welcome to come here and continue discussion here. Thanks again.

Questions or comments? Please e-mail emacsconf-org-private@gnu.org
