Parallel Text Replacement

Lovro, Valentino Picotti - IRC: hokomo,

Format: 15-min talk; Q&A: BigBlueButton conference room
Status: Q&A finished, IRC and pad will be archived on this page


00:00.000 Introduction 00:23.440 Problem: Goal 01:12.360 Problem: Naive Multi-pass 01:34.200 Problem: Clever Multi-pass 01:57.720 Problem: Terminology 03:04.440 Problem: Scaling Multi-pass 03:55.920 Solution: Single-pass 04:18.240 Solution: Existing 06:29.080 Solution: query-replace-parallel 06:55.240 Demonstration: Swap 07:53.970 Demonstration: LaTeX 08:48.700 Demonstration: Regex 09:36.320 Demonstration: Order 10:54.440 Demonstration: Fun 12:29.120 Implementation 14:18.740 End

Duration: 14:46 minutes


Listen to just the audio:
Duration: 10:16 minutes


We present our Emacs package for performing parallel text replacement.

"Parallel" in this context does not refer to improving efficiency through parallelism, but to the concept of performing more than one text replacement without them interfering with each other. This is in line with the usage of the term in the Lisp community when contrasting the behaviors of LET and LET*, SETQ and PSETQ, etc. (e.g.

We will present the package's features and its integration with Emacs' query-replace system, a comparison with previous solutions, and a few notes on our implementation. We will describe some common use-cases and showcase how the package is used.

The package is currently not yet published in a package archive, but the code is already publicly available at The name "query-replace-parallel" is not yet final and we are thinking of alternatives. Our current best candidate is "replace-parallel" (similar to the built-in "replace.el"), but suggestions are welcome.


Questions and answers

  • Q: This looks great, and was very well-presented.  Do you have plans to upstream this functionality into Emacs?
    • A: Would require some refactoring upstream, so not suitable for upstreaming as-is.
  • Q: Did you use pair-programming while developing it, or did you work independently, alternating and reviewing?
    • A: Yes, we did! I was at the keyboard, Valentino was at the whiteboard, and we kept bouncing ideas back and forth, trying out prototypes, coming up with various tests, checking the edge cases, etc.
  • Q: What is your background in programming? Was it difficult to implement following the same API and architecture as what is already in Emacs?
    • A: Both Valentino and I are PhD students in computer science, but a PhD or similar is definitely not a requirement. It wasn\'t too difficult because we could reuse the interactive functionality from query-replace\'s internals. Figuring out what and how to reuse is what took a bit of creativity, but a lot of the necessary knowledge for that came from just reading and poking around Emacs\' replace.el. Don\'t be afraid to go and read the source!
  • Q: What did you learn about Emacs programming or programming in general while working on this project?
    • A: That Emacs is so flexible that you can even advise its `message` function. Similarly, being able to prototype functionality so quickly and immediately integrate it into the rest of Emacs is so fun and so satisfying!


  • One usecase could be character names in a novel manuscript, if one has named a character and want to now rename it to some other character names or swap it with another one.
  • I never saw so many talks about repls. so great!!!
  • Nice, I was wondering if it utilized rx
  • package installed, ready to use!
  • excellent talk, and also such a cool package
  • great talk, very clever concept
  • that SRE "paper" you linked to is interesting
  • just saw the "Parallel Text Replacement" talk - 👏 great talk!


[00:00:00.000] Introduction

Hi everyone! Welcome to our talk on Parallel Text Replacement. My name is Lovro, and I'll be telling you about an interesting problem that my friend Valentino and I set out to solve one afternoon. We will describe the problem, take a look at some of the existing work and then present our solution. Afterwards, we will show some demos and conclude with a quick overview of the implementation. Let's get straight into it!

[00:00:23.440] Problem: Goal

Here is a problem that most of us have dealt with at some point. Assume we have a piece of code such as the following. We use a code example here, but in general what we're about to discuss can be applied to any piece of text. After a bit of thinking, we decide that the names of the two variables, "foo" and "bar", should actually be swapped. That is, "foo" should be replaced with "bar", and "bar" should be replaced with "foo". The question is: what is a good way to achieve this? We could perform the edits manually if the code is small enough, and we might even be done reasonably quickly. However, consider two things. Imagine the usual case where there's just too much code to edit by hand. We have no other option than to automate the task. More importantly though, we have a whole programmable text editor right at our fingertips. We should object to doing things that the computer can do for us.

[00:01:12.360] Problem: Naive Multi-pass

So, one way to automate it is by using our old friend query-replace (M-%) multiple times in a sequence. We first do a pass where we replace "foo" with "bar", then we do another pass where we replace "bar" with "foo". But that's clearly not right. We all know that this naive multi-pass approach doesn't work because it results in interference between the two replacements.

[00:01:34.200] Problem: Clever Multi-pass

Instead, we have to be a bit more clever. We should first replace "foo" with a temporary string, in this case "oof", that we will call a "token". To avoid interference, we must be careful to ensure that the token does not contain whatever we're about to replace next. Then we do a second pass to replace "bar" with "foo", and finally a third pass to replace the token with "bar". This gives us the result we want.

[00:01:57.720] Problem: Terminology

Putting the implementation aside for a moment, this style of text replacement, where we replace multiple sources with their targets, without running into interference issues between replacement pairs, is what we call a "parallel replacement". This is the essence of the problem we're trying to solve. The examples with swapping that we've shown so far are really just one of the many use cases that are supported by a general parallel replacement utility. To avoid confusion, let us clarify that the word "parallel" is not in reference to hardware parallelization, but rather comes from analogy with the Lisp let operator, where the bindings of variables are performed in parallel, rather than sequentially as in let*. Parallel in this context means that none of the bindings are in scope within any of the initial value forms. In other words, just like a let's initialization form cannot refer to any of the earlier bindings, a replacement pair's source should not be able to replace the previously substituted targets of any other pair. This is what we mean by "no interference".

[00:03:04.440] Problem: Scaling Multi-pass

However, manually invoking multiple carefully chosen query-replace commands gets old very quickly. Say we scaled up the problem and wanted to perform n swaps instead of just two, e.g. to swap, or rather, rotate, "foo" to "bar", "bar" to "baz", "baz" to "quux" and "quux" to "foo". We would first have to perform n - 1 additional replacements to introduce the necessary tokens, effectively doubling the number of steps. Even if we tried to automate this, think about what tokens the code would have to generate if we had no prior knowledge of the replacement pairs given by the user. We would have to program defensively and use long randomly-generated strings that, one, hopefully do not interfere with any of the replacement pairs, and two, might slow down the search if they're overly long. Can we do better?

[00:03:55.920] Solution: Single-pass

Yes we can! We can actually perform just a single pass. The trick is to alternate between the replacement pairs, replacing whichever source occurs the earliest, and making sure to continue scanning after the end of the substituted target in order to avoid interference. This interleaving of replacements is not something that's easy to do by hand with query-replace.

[00:04:18.240] Solution: Existing

Since this is Emacs we're talking about, of course there already exist solutions that implement this idea. Here are few that we could find. The EmacsWiki has a page dedicated to this problem. Stack Overflow has an old post where a couple of users provided their solutions. Mastering Emacs also gives a method along with other interesting query-replace-regexp (C-M-%) patterns. More recently, Tony Zorman made a blogpost providing a solution with an interface based on query-replace. I encourage you to take a look at these solutions if you're interested in the details. But while a step in the right direction, these solutions are not satisfactory because they all lack one or more of the following. One, they are not completely automated and require the user to come up with a relatively complicated and verbose query-replace-regexp invocation. Two, they are restricted to performing only 2-element swaps rather than general parallel replacements. Three, they don't provide any sort of interactivity during replacement and instead perform it in one shot. Four, they don't attempt to integrate with the familiar query-replace interface, which supports skipping, undo, history and more advanced features like Lisp expressions and recursive query edits. Most importantly however, five, none of them were designed with regular expressions in mind and instead only ever consider literal strings. In fact, the only one that comes close is the half-automated solution that invokes query-replace-regexp with a specially crafted replacement. As an example, here's how you would use this technique to perform a 3-element parallel regex replacement. It uses the backslash-comma Lisp expression feature in order to choose the appropriate target to substitute. Aside from being very clumsy and tedious to write out, this approach makes it really hard to use more complex regular expressions that make use of capture groups themselves. This was the biggest limitation that we wanted to get rid of and the main motivation for our work. So, as an alternative to the existing zoo of 80% solutions, we aim to provide a 100% solution, one that handles regexes and consolidates all of the existing ideas into a single package.

[00:06:29.080] Solution: query-replace-parallel

We call it query-replace-parallel. The package is free and open-source and can currently be found on GitHub under hokomo/query-replace-parallel. The name is not yet finalized and we're open to any suggestions. We hope to get it published on an Elisp package archive in the near future, but for now you can just download and load the main Elisp file manually. With all of that said, let's go through a few demos to illustrate some use cases and see how to use the package.

[00:06:55.240] Demonstration: Swap

Our first demo is a simple swap, like the one we showed at the beginning of the presentation. This chunk of text is actually one of the tests from our package's code. Assuming we have loaded the package, we can execute the query-replace-parallel command, a parallel version of the standard query-replace. This command works with literal strings and will ask for each source and target in turn. Our goal is to replace "foo" with "bar" and "bar" with "foo". After inputting our replacements, we terminate the prompt by pressing enter with empty input. At this point, everything functions the same as in a standard query-replace invocation. The echo area shows the match and the replacement we're about to make. We can perform replacements, undo them, skip them, execute them until the end, and so on.

[00:07:53.970] Demonstration: LaTeX

The second demo shows our first regex use case. Imagine we have the following LaTeX code. We realize that we haven't been completely consistent in our use and naming of macros, so we decide to fix the problem. This time we execute query-replace-parallel-regexp because we want to work with regex instead of literal strings. We want to achieve two things. First, we want to wrap all usages of the variable n with the natvar macro. Using the backslash-less-than and blackslash-greater-than constructs allows us to only match letters n not appearing as part of a larger word. Second, we want to rename natvar to intvar because the variables a, b and c are integers and not natural numbers. We enter empty input to terminate the prompt and can now perform the replacements. There we go, the fixes are done and we didn't have to think about in which order to apply them.

[00:08:48.700] Demonstration: Regex

We now take a look at a more complicated regex example to demonstrate that even advanced query-replace features are supported. Each "foo" and "bar" in this example is followed by a number. The goal is to not only swap "foo" and "bar", but also increase or decrease the corresponding number. We first match "foo" and capture the number that follows it. For the target, we make use of the backslash-comma Lisp expression feature in order to replace the match with "bar" followed by the number's successor. We do the same thing for "bar", except that we replace the number with its predecessor. Performing the replacements, we can see how each number is incremented or decremented appropriately.

[00:09:36.320] Demonstration: Order

We haven't covered it explicitly so some of you may be wondering how parallel replacement deals with overlapping matches and whether the order of the replacement pairs is significant. This demo will clarify the exact behavior. The first example has the sources "watch" and "stopwatch". Conceptually, the matches overlap, but the rule is that matches are always processed earliest first, regardless of their length or the ordering of the pairs. Therefore it is "stopwatch" that gets replaced, and not its substring "watch". The second example uses the sources "watch" and "watchword". Both of the matches now conceptually start at the same position. In situations like these the order of the pairs does matter, and ties are broken by prefering the pair that was entered first, which is behavior that is inherited from the Elisp regex engine. So, the substring "watch" in "watchword" is what gets replaced in this case. Situations where the order of the pairs is significant are not very common however, so the user generally doesn't have to worry about this edge case. The order only matters when two or more sources share the same prefix, as in this example.

[00:10:54.440] Demonstration: Fun

The final demo tests the limits of the package and shows that it fully integrates with query-replace. It is really just for fun and can even serve as a small Emacs brainteaser. See if you can keep up! We open a directory and enter Writable Dired mode in order to rename the directories "foo" and "bar". Instead of doing it quickly by hand, we decide to show off and use query-replace-parallel-regexp. We enter our pairs and make use of the backslash-question-mark query edit feature. Now whenever we perform a replacement, the query edit makes Emacs stop and prompt us for additional input to use as the target. We confirm the renames and now enter the "bar-lib" directory in order to perform the same kind of replacement on "baz" and "quux". Rather than save time, we decide to be extra lazy and take the long route. We recall the first pair and initiate a recursive invocation of query-replace-parallel-regexp. We are now replacing the replacement. We apply our fixes and then do the same thing again with the second pair. Recall and recurse. We confirm the prompt and finally rename our directories. Wow, that really paid off.

[00:12:29.120] Implementation

Before we finish, a few quick words about the implementation for the curious. Both query-replace-parallel and query-replace-parallel-regexp delegate to the complex perform-replace function which is the workhorse of query-replace's interactive mechanism. The way we achieve multiple interleaved replacements is by providing perform-replace with a big "matcher regex" and a special replacement function. Essentially, a complex parallel replacement like this is transformed into a standard replacement like this. This is similar to the trick shown earlier in the presentation. Each source is put in its own capture group to allow the replacement function to determine which one matched and return the appropriate target. However, we now take care to support arbitrary regular expressions as sources. We achieve this by converting each source regex into an equivalent one for which we can guarantee that its capture groups will not clash with our matcher regex. Information about this conversion is stored, and once the replacement function is called it has enough data to apply the replacement from the viewpoint of the original regex. The regex transformation is reliable because it uses the rx library, allowing us to treat regexes as s-expressions and avoid any nasty manual parsing. In fact, rx itself is based on one of Olin Shivers' 100% solutions: SRE, or the S-expression regex notation. We all stand on the shoulders of many giants, so let's strive to design good solutions that we can all benefit from, many years into the future! Finally, because query-replace's core is not completely customizable, we did have to sprinkle in some advice to get certain things working. This concerns only minor cosmetic fixes and not the core replacement functionality, but we have nontheless tried to do it in the simplest and least intrusive way possible.

[00:14:18.740] End

In conclusion, go download and play with the package. Even if you're not performing overlapping replacements, you can still use query-replace-parallel for the peace of mind knowing that things won't go wrong if you perform more than one replacement at a time. Feel free to let us know about any interesting or crazy use cases you might come up with, as well as improvements or bugs that make it only a 99% solution. Thanks for listening and have a great EmacsConf!

Questions or comments? Please e-mail