Back to the schedule
Previous: Optimizing Emacs Lisp Code
Next: Yak-shaving to a UI framework

Tree-edit: Structural editing for Java, Python, C, and beyond!

Ethan Leba

Q&A: live
Duration: 10:24

If you have questions and the speaker has not indicated public contact information on this page, please feel free to e-mail us at emacsconf-submit@gnu.org and we'll forward your question to the speaker.

Talk

Download video, 10:24, 15M

Download --compressed56.webm (13MB)
Download --main.vtt
View on Toobnix

Q&A

Download video, 1:59, 3.3M

Description

Name pronunciation	E-than LEE-ba
Pronouns	he/him
Homepage	https://ethan-leba.github.io/
Preferred contact info	ethanleba5@gmail.com

In this talk, I'll discuss a vision for how writing code could be, where the editing operations map directly to the primitives of the language itself -- and my humble attempt of implementing this vision. tree-edit seeks to provides a structural editing plugin supporting conceivably any language with a tree-sitter parser.

Structural editing does not have to be relegated to lisps or niche DSLs.

I liken the state of code editing today to writing assembly. The reason why people like Python more than assembly is that for most purposes, the building blocks of the language are mismatched with our thought process. We don't think in terms of registers and addresses, we think in terms of variables, functions, etc. So when we write and edit code, why do we edit in terms of deleting, inserting, replacing characters – not wrapping, inserting, raising, deleting expressions and statements?

I'll also discuss the implementation of tree-edit, which uses a novel combination of the fantastic tree-sitter parser with an embedded logic programming DSL (miniKanren, using elisp port reazon) to power it's syntax tree generation.

Check out the GitHub repo here!

Discussion

IRC nick: ethan

Q1: so tree-edit is orthogonal to the LSP features?
- A: only uses tree-sitter yeah
Q2:any chance you tried this with Clojure as well?
- A: haven't tried it yet, i don't think tree-sitter-langs has a clojure grammar AFAIK
Q3: Would we be able to do things like extract statement to a variable? For example, extract a a math operation happening in a fucntion call argument into a separate variable an then replace the funtion call arg with the variable name.
Q4: How do tree-edit and combobulate compare?
- A: a lot of similarities, tree-edit replaces traditional text editing style combobulate still implements
Q5: Are similar packages for structural editing common to other editors or they are just popular in Emacs cause of the paredit tradition?
- A: emacs seems to be a trend-setter
Q6: Great talk! How difficult do you imagine adding more languages to Tree-edit will be?
- A: Trying to add python, not super simple, C-like should be drop in replacements
Q7: @ethan Could tree-edit be made to work with Org (orgdown!) itself, or maybe rather what would be needed to get such a unified tree-editing framework to work also for complex Org trees?
Q8: Any plans for an Evil mode integration? evil-textobj-tree-sitter seems like it has a long way to go if it's to catch up to tree-edit.
any chance you tried this with Clojure as well?
- ethan: haven't tried it yet, i don't think tree-sitter-langs has a clojure grammar AFAIK
  - yeah I use sogaiu's (https://github.com/sogaiu/tree-sitter-clojure) but it does not have if else and the rest, only the main data types
so tree-edit is orthogonal to the LSP features?
did not know that miniKanren had an elisp port
Seems like a really cool use of logic programming. All the examples I've heard of are much simpler than this.
Wow, voice control is a good point
Could tree-edit be made to work with Org (orgdown!) itself, or maybe rather what would be needed to get such a unified tree-editing framework to work also for complex Org trees?
Awesome talk. I'm definetly going to try out tree-edit later
Amazing talk!! Such a cool project
`andrea: absolutely! Also for Orgdown - which is a good start since it is much easier.
Andrew Blinn's talk on Fructure and Ethan Leba's talk on Tree-edit are really insightful.
Agreed about the lack of formal grammar (only a proliferation of parsers) being a limiting factor. Maybe we could bridge directly to the available commands without going through a grammar though. A unified tree-editing framework across languages but definitely including Org would be awesome (ala lispy/etc).

Links and other notes:

Github repo : https://github.com/ethan-leba/tree-edit
editing operations that map directly to the structure of the language
inspired by paredit and lispy
Another similar project is https://github.com/mickeynp/combobulate by Mickey Petersen, the writer of Mastering Emacs.
It's an open source project so contributers are welcome
Future implication for this kind of work could be voice controlled code writing/editing

Outline

Discuss motivation (Why should I care?)
Demonstrate tree-edit (Live-coding with tree-edit)
Demonstrate tree-edit syntax tree generator (Elevator pitch on miniKanren)

Transcript

Hi. My name is Ethan, and today I'm going to be speaking about tree-edit, which is a package which aims to bring structural editing to everyday languages. So what is structural editing? The way that we typically write code today is working with characters, words, lines, paragraphs, and so on, and these objects have no real relation to the structure of programming languages. In contrast, tree-edit's editing operations map exactly to the structure of the programming language, which is typically in a tree form with different types of nodes such as identifiers, expressions, and statements. Using this structure can enable much more powerful editing operations, and crucially editing operations that map much more closely to the way that we think about code. tree-edit was inspired by paredit and lispy, which are two great Lisp structural editors. However, what makes tree-edit unique is that it can work with many languages, such as some of the more mainstream languages like C, Java, Python, and so on. So now I'm going to show off tree-edit in action, working with a Java program. So we can see on the left, we have a syntax tree, and the node in bold is what I call the current node. So instead of the concept of a cursor, where we have a point in 2D space, we instead work with a current node which all our editing operations take place upon. So we can move up and down, or rather side to side, move inwards down to the children of the tree, back up to the parents. We can also jump to a node by its type. So we're going to jump to a variable declaration. We can jump to an if statement. And as you might have noticed, tree-edit by default uses a vim-style mode of editing, so it's a verb, which would be jump, and then a type, which would be if statement. So now I'll show off the syntax tree modification in action. So if I delete this deleteme node, we can see the node is deleted, and also the comma is removed since it's no longer needed. We can add some nodes back in. Here we just have a placeholder node called tree, which we can swap out with whatever we like. So if we want to put in, for example, a plus or minus operator, it'll put these two TREE things here since there needs to be something there, but we can go fill them out as we like. So that's what that is. Then I'll delete these again. Next we can see raising. So if I raise reader, then it will replace the outer function call with the node itself. I could raise it again. The opposite operation to that is wrapping. So I can wrap reader back into function call, and I could wrap this again if I wanted to. So that is wrapping. We can also do it on a statement level, so if I want to wrap this in an if statement, I can wrap the statement, and there we go. And let's just raise it back up, raise it again. There we go. Finally, I'll show off slurping and barfing, which... a little bit gross words, but I think it accurately describes the action, so let me just add a couple breaks here. So let's say we want this if statement and a couple of breaks to be inside of the while, so we can just slurp this up, and if we don't actually want them, we can barf them back out. So that's where those words have come from. And we can just... delete as we please. So yeah, that's a quick overview of the tree editing plugin in action. So now I want to talk a little bit about the implementation of tree-edit. Tree-edit uses the tree-sitter parser to convert text into a syntax tree. Tree-sitter is used by GitHub for its syntax highlighting, and it's available in a bunch of editors, including Emacs, so it's a fairly standard tool. However, the unique part about tree-edit is how it performs correct editing operations on the syntax tree and then converts that back into text. So to do that, we use miniKanren, and miniKanren is an embedded domain-specific language for logic programming. So what exactly does that mean? In our case, it's just an Emacs Lisp library called reazon, which exposes a set of macros which enables us to program in this logic programming style. I'm not going to get into the details of how logic programming works. However, one of the most unique aspects about it is that we can define a predicate and then figure out all the inputs to the predicate that would hold to be true. So in this case, we have our query variable q, which will be what the output is, and we are asking for all the values of q that pass this predicate of being set-equal to 1 2 3 4. So if we execute this, it will take a little time... It shouldn't be taking this long. Oh, there it goes. We can see that it's generated a bunch of different answers that are all set-equal to 1 2 3 4. So it's just a bunch of different permutations of that. We can extend this notion to a parser. In tree-edit, we've defined a parser in reazon, and we can use that parser to figure out any tokens that match the type of node that we're trying to generate. If I execute this, we can see that reazon has generated these five answers that match what a try statement is in Java. Here we can see we can have an infinite amount of catches optionally ending with a finally, and we always have to start with a try and a block. We can see this again with an argument list. We have the opening and closing parentheses, and expressions which are comma delimited. Now, for a more complex example, and something that is along the lines of what's in tree-edit, is if we have this x here and we want to insert another expression, so x, y. We can assert that there's some new tokens, and we want an expression to be in those new tokens, and we can essentially state where we want these new tokens to go within the old list of tokens, so replacing it after the previous expression, before the closed parentheses, and then we can state that the whole thing parses. If we run that, we can see that as we wanted earlier, which was a comma and then expression, we have that here as well. We can see this again. Here, the only change is that we've moved the tokens to be before the expression. So we want to put an expression before this x, so we want something like y, x, and if we execute that, we can see that it is correctly asserted that it would be an expression and then a comma afterwards. One last example is if we have an if statement and we want to add an extra block, we can see that it correctly figures out that we need an else in order to have another statement in an if statement. So, next steps for tree-edit. The core of tree-edit is in place but there's a lot of usability features to add, and a lot of testing that needs to be done in order to iron out any bugs that exist. I'd like to add support for as many languages as is possible. I think my next step will probably be Python. There's some performance improvements that need to be made, since using this logic programming language is fairly intensive. There's some optimizations both on the library side and on tree-edit side that can be made. Contributors are of course welcome, as tree-edit is an open source project. For future work, I think the prospect of voice controlled development with tree-edit is actually something that's really exciting, since syntax can be very cumbersome when you're working with voice control software. I can envision something like saying, "Jump to identifier, add plus operator, jump to if statement, wrap if statement in while." So that's something I'd like to investigate. I also would just like to provide the core functionality of tree-edit as something that can be used as a library for other projects, such as refactoring packages, or other non-Vim-style approaches, and just making the syntax generation available for reuse. Finally, I'd like to thank the authors of reazon and elisp-tree-sitter, which in turn packages tree-sitter itself, since tree-edit relies very heavily on these two packages. I'd also like to thank the author of lispy, since a lot of the design decisions when it comes to the editing operations are based very heavily on lispy. So that's the end of my talk. Thank you for watching. captions by sachac

Back to the schedule
Previous: Optimizing Emacs Lisp Code
Next: Yak-shaving to a UI framework