00:00.000 Intro
00:41.520 What are transducers?
03:27.590 Common issues
05:47.280 Transducers
07:35.280 Using transducers
09:52.625 A more involved example with comp
11:49.333 In Emacs
14:29.469 Hash tables
14:58.040 Clarity
15:55.800 How do transducers work?
20:00.520 Transducers in the wild - CSV
26:03.240 Issues and next steps
01:09.920 Q: When I tried comparing transducers.el to cl-lib and dash (benchmark-compiled), I got the following results
05:40.840 Q: Do you know of any theoretical texts on transducers?
07:04.720 Q: Did you think about [compiler features, macros] viz your cl, fennel, elisp, porting of your transducers?
08:16.579 Q: Does t-buffer-read provide a lazy stream that\'s linewise, or charwise, or do something else entirely?
09:09.424 Q: Can the Elisp library be combined with the stream.el API or seq in general?
11:47.543 Q: How does one debug a t-comp expression? Can you single step and see intermediate results of the different statements you declare?
14:42.495 Q: Is there a path for transducers to enable elisp processing of otherwise overly large datasets as if just normal Emacs \"buffers\" (i.e. just pulling one thing at a time so essentially stream-like under the hood but buffer-like in interface), with none of the usual perf issues with a traditional buffer structure?
16:51.200 Q: Is there an option to read a csv/json and produce an alist or plist instead of a hash table for an entry?
17:50.520 Q: Is the common lisp version ready for 'production' use? Is it complete enough and the API stable enough?
18:17.477 Q: Do we need a pre-written \"t-\" version for every already existing reducing function like + or is there a function to construct them from already defined reducer 2-arg functions?
20:26.320 Q: Is the compelling argument for transducers is that it's a better abstraction?
Transducers are an ergonomic and extremely memory-efficient way to process a
data source. Here "data source" means simple collections like Lists or
Vectors,
but also potentially large files or generators of infinite data.
Transducers…
allow the chaining of operations like map and filter without allocating memory between each step.
aren't tied to any specific data type; they need only be implemented once.
vastly simplify "data transformation code".
have nothing to do with "lazy evaluation".
are a joy to use!
In this talk, Colin will introduce Transducers, show how to use them, and
demonstrate some Emacs-specific workflows that make live processing of large
data sets in JSON and CSV a breeze.
About the speaker:
Colin has been active in the FOSS world since 2011, publishing libraries and
applications primarily in Haskell and Rust. Since 2023 he has been using
Lisps
more and more, and after falling in love with Transducers from Clojure has
ported the pattern to three other Lisps.
Colin is originally from Canada and lives in Japan.
Discussion
Questions and answers
Q: When I tried comparing transducers.el to cl-lib and dash
(benchmark-compiled), I got the following results:
cl-lib: 0.5 sec, 2 GCs
dash: 0.3 sec, 1 GC,
transducers: 0.75 sec, 2 GC
cl-loop: 0.025 sec, 0 GC (note: 0.025, one order-of-magnitude faster)
I expected transducers to be slower than cl-loop but faster than the
cl-lib or dash. However this isn't the case. Any idea why? (benchmark
is here:
[https://old.reddit.com/r/emacs/comments/1h5c778/which_emacsconf_2024_talks_have_your_attention/m061dge/](https://old.reddit.com/r/emacs/comments/1h5c778/which_emacsconf_2024_talks_have_your_attention/m061dge/))
(benchmark-run-compiled 1000 (cl-loop for n from 1 below
2000 by 2 sum (* n n) into total
finally return total))
A: Loop is always going to win in cases like this where we are
not doing two nested (e.g.) change calls, what I called comp
steps. tranducers shines while we need to do things which chain
things together; we can sometimes express ourselves more clearly
vs loop. this may sound sounds like moving the goal-posts:
it's really about internal function calls, avoiding stepping
though each loop in ways which loop doesn't need to do, so loop
might "win".
Note: I'm comparing against cl-lib and dash -- the cl-loop is
only for reference. I'm curious about the performance gap
between transducers and cl-lib/dash. The number of function
calls is the same for cl-lib and dash and transducers.
Q: Did you think about generators as a source cf lists, vectors,
etc? Maybe I got a word wrong. After the development that generators
and Series operations needed-each-other, not being redundant as had
been thought. I forgot who the generators person was in lisp.
A: (not yet answered)
Q: Do you know of any theoretical texts on transducers?
A: My README and Rich Hickey (inventor of Clojure) may be the
key texts on transducers.
Q: Waters (lazy series in lisp, late 70s) said that this *should
have* been done as an additional compiler feature in compilers, but
if not it must be a macro package. Did you think about that viz your
cl, fennel, elisp, porting of your transducers?
A: I think my work could help provide the basis for this;
there's much more to be done.
Q: Does t-buffer-read provide a lazy stream that's linewise, or
charwise, or do something else entirely?
A: I'd say these libraries are completely orthogonal. (Re: what
made me ask this question was seeing `t-buffer-read' and
hoping that this doesn't convert a buffer into a string) With
seq/generics it should just work: magic of defgeneric
Q: How does one debug a t-comp expression? Can you single step and
see intermediate results of the different statements you declare?
A: In Common Lisp it is wonderful. In Emacs Lisp it is more
complicated. I don't know if step debugging would work, but
there is a "log" function to print out values.
Q: Is there a path for transducers to enable elisp processing of
otherwise overly large datasets as if just normal Emacs "buffers"
(i.e. just pulling one thing at a time so essentially stream-like
under the hood but buffer-like in interface), with none of the usual
perf issues with a traditional buffer structure?
A: Possibly so yes
Q: Re the performance issues mentioned and that popped up recently
in the lists/forums, to what extend does tailcall optimization and
other mechanisms (incl. inlining, GC-friendliness, etc.) could
eventually alleviate issues and enable their use at little to no
extra cost?
A: Over time, with further work and optimizations. Some already
there (tailcall notably)
Q: Is there an option to read a csv/json and produce an alist or
plist instead of a hash table for an entry?
A: Absolutely.
Q: Is the common lisp version ready for 'production' use? Is it
complete enough and the API stable enough?
A: I use it all the time. I use it extensively. Programming a
game, not realizing a dent in the frame rate.
Q: Do we need a pre-written "t-" version for every already
existing reducing function like + or is there a function to
construct them from already defined reducer 2-arg functions?
A: already defined. This is basically fold. Some built-in functions like plus already function like reducers. It's a coincidence that they do that. But there's an example in the README. Max is one that does not act like that. For instance, maybe I could screen share later, but if you just type in plus one, If you call plus one in Emacs or Common Lisp, you get back one. It actually only needs one argument. If you only type plus, it actually gives you zero. Plus and multiple satisfy the API of reducers. But if you have one that doesn't, like the max function, and similarly, just type in plus as a function call, just plus with nothing else, and you'll see. No, as a function. zero will come out. This basically means it satisfies the reducer API. But a function like max does not. If you just type in max and then one, it won't work. Pardon me, it did. But if you type in max with nothing else, it wouldn't work. Hence, we have to wrap it in something like fold. I would say go look at the fold function.
Q: Is the compelling argument for transducers is that it's a better
abstraction? Seems like a lot of concerns/objections (while
pragmatically valid) are focused on implementation. Can this
abstraction allow for advances in implementation?
A: Yes, what I've basically done is mostly followed the pattern of usage that exists in Clojure and in Scheme's SERP 171. In theory, the service level API is the same no matter where you're using this, and that's the idea. If you learn them in one list, you should be able to use them everywhere. Then what it's actually doing under the hood is free for us to change around. My implementations are mostly based on the scheme with a few alterations here and there. And in the Common Lisp case, like adding some Common Lisp isms to improve usage like UX a little bit. But overall, we are free to do whatever we want internally to speed up performance. I just haven't done that work.
Q: is there a path for transducers to enable elisp processing of otherwise excessively huge data ets as if just a normal Emacs "buffer" (i.e. but just pulling one thing at a time so essentially stream-like under the hood), with none of the usual issue when a traditional buffer structure?
(not yet answered)
Q: So the "reducer API" means that the function accepts a variable number of arguments in a monoidal way?
that's what I gathered
Q: From your investigations and tests so far, do you think there would be the necessity of transducers to eventually go down into the C level code for things like using them to solve "infinitely-big" buffer-like interfaces and such?
A: Yeah, like, if we really wanted to go that hardcore, maybe there's some like C level stuff that we could you know, significant demand for such a thing. You know, so far there hasn't been such demand, but maybe there will be in the future. Yeah, perhaps there's some custom stuff we could do.
Q: why does the reducer have to sometimes be a custom function, but other times just something like #'+?
it depends on if that reducer function needed special input or not
Q: do you have FSF copyright assignment? it is nice to get low-level libraries like transducers on ELPA so other copyright-assigned packages can use them (and so they can be included in Emacs when they reach wide adoption)
transducers is on MELPA
Q: Is that #' business for some lisps and not others the lisp-1/lisp-2 distinction?
Sharp quote refers to (symbol-function 'my-symbol) in a lisp2
yes, it emphasizes using the "function" associated with the symbol (if there's one in the "function slot" for the symbol) as opposed to some "variable"-type value
(and in this case pkal is not asking about the sharp quote but a t-prefixed function as opposed to a standard function like +)
that's because of the separate namespace for function symbols?
function rather than symbol-function to be extra-nitpicky (to accomodate lambda forms)
If I remember correctly, single quote (') does not respect when a function symbol is overridden by a let, but (pound quote) #' does?
yes, my question was about the t- prefix.
"let"s only bind the symbol value slot, not the function slot.
yes, iiuc; in effect, it's sort of an early-binding/late-binding distinction
My bad. Should've specified flet.
@can't speak for the elisp case, but in the clojure case using things in transducer form has a slightly different "typing" shape as you'd expect and may require a wrapper if a function can't be used as is and such, though I missed the context and that may not be relevant to your point
Q: Question about how the transducers video was made? Did you use Reveal.js? Do you have a pointer to the html hosted presentation? How did you generate the content for Reveal?
So the presentation itself was done with RevealJS from Org Mode. So as you saw, I had a raw Org Mode buffer, which was which was the presentation itself, which I then just exported with a few certain settings, a few customizations. And then for screen recording, I used OBS, which worked flawlessly on Arch Linux. I used Sway, Wayland, and all of that. So all of that just worked, which was very impressive. Where do the HTML host the presentation? I don't have that presentation hosted anywhere.
Text is a little small bth
no keep the latin on screen, i was trying to read that
It was Lorem Ipsum
Translating lorem ipsum plus plus
Thanks for larger text
Notes
What made transducers click for me way back when was
realizing that the usual operations (think map/reduce/filter/etc and
their generalizations) implicitly or explicitly demand realized
collections, plus intermediate ones between those operations (and
GC), instead of composing such transformations into one operation
and then making this applicable to streams of unknown-sizes and
such.
Sorry but map is collect and filter is select for me
there are many ways to get to them (some may
already think of those functions as folds, etc.), but for the bulk
of people familiar with map/reduce/filter/etc, it's useful to enter
the thinking realizing that you may have infinite streams (of any
type even) on input, and may not want to have realized collections
at every step of the usual function applications, thus the need to
compose the things that make up a map/reduce/filter/etc before
applying them one by one on a continued stream of things one at a
time
ellis: I wrote about half of one in binutils libctf
(generators, anyway). See the *_next() functions. Having seen this
I'll try to mutate the internals to something closer, right now
every iterator is independently implemented (mostly).
(inspired by generators, not transducers, since I
didn't know about them)
still *far* less ugly than "call a function with
every item" *_iter() stuff
Thanks for the answers fosskers, I'm quite hopeful
with transducers working their way into many things, so thinking
ahead to where that may happen and to solving perf hurdles
I'm totally sold. I'm working on a CL codebase
right now, and these are going in there immediately
(also CL does not require TCO but good compilers support
it to the extent that extensive use of FP is practical)
it's a tricky protocol to get perfect the first time for
sure, is something that transcends language barriers so always fun
to see more impls
definitely watching this one more carefully. if it's
CLOS-oriented i'm going to like it
note that full TCO is strictly more expressive than
special-case optimizations as with emacs's cl-labels
in the general case, there not need to actually process such a composed operation on an item from the input until needed/"pulled" from the output side, so yes in a way
yea i think the next step in terms of performance would be using a 'plan' object internally to rewrite the lambda calls
Julia does loop fusion through its broadcast operator.
that gets very hairy and makes it far less simple imo
Re the current answer, it doesn't yet, but it's on the path where it could eventually (i.e. introspecting on the composed transformation and simplifying it, be it fusion-type or otherwise)
Waters agreed that ergonomics was the key key key thing. (In the PhD world, because of studies that people have trouble reading other peoples' iteration = loop codes)
"galaxy brain" is great expression.
CLTL2 docs on SERIES for those who are curious http://www.cs.cmu.edu/afs/cs.cmu.edu/project/ai-repository/ai/html/cltl/clm/node347.html#SECTION003400000000000000000
i'm curious about what mathematical structures are related to transducers but that's probably a README question
Q: transducers C impl when
ecl slime repl + transducers or series
I wrote about half of one in binutils libctf (generators, anyway). See the *_next() functions. Having seen this I'll try to mutate the internals to something closer, right now every iterator is independently implemented (mostly).
note that full TCO is strictly more expressive than special-case optimizations as with emacs's cl-labels
(inspired by generators, not transducers, since I didn't know about them)
still far less ugly than "call a function with every item" *_iter() stuff
Thanks for the answers fosskers, I'm quite hopeful with transducers working their way into many things, so thinking ahead to where that may happen and to solving perf hurdles
I'm totally sold. I'm working on a CL codebase right now, and these are going in there immediately
(also CL does not require TCO but good compilers support it to the extent that extensive use of FP is practical)
it's a tricky protocol to get perfect the first time for sure, is something that transcends language barriers so always fun to see more impls
absolutely. I will try of course libctf is ABI-stable so I can't just change what the user sees, but maybe I can make the internals less repetitive
(and maybe add another API that is more transducery.)
Nice ielm
@pkal: that's what I meant back up, that some functions happen to have the right form for use in transducers, but fosskers expounds much better, small incompatibilities (think order of params, variadicity, etc)
I used emacs for so long before finding out that ielm existed. (I wish I knew sooner.)
Feedback:
👏👏
impressive
cool presentation
(still, very nice!)
I am so, so thrilled with this. This blew my mind. Thank you! 😊
👏
nice 👏👏
deeply not ugly
Thank you for the excellent talk!
Thanks for the talk
Thanks fosskers
👏
awesome 👏
always a pleasure to watch transducers in action - thanks fosskers!
Inspiring talk, promptly added to my aoc.asd. Will come in handy.
definitely watching this one more carefully. if it's CLOS-based i'm going to like it; CLOS-oriented, rather
Thank you fosskers! it is bound to end up in some, if not all of my common lisp projects
YouTube comment: Nice, this seems to do basically the same thing as C++ ranges, which I've enjoyed using a lot
Hi everyone, this is EmacsConf 2024. I'm Colin, and todayI'll be talking about transducers.After introducing them, I'll share a bit of history abouttransducers and the problems that they solve, some basicsabout how we can use them, how they work, like how they'reimplemented, some demonstrations of how we can actuallyuse them in the wild, and then some other discussions aboutissues that they have.
Okay, let's get right in. What are transducers?Transducers are a way to do streaming iteration with amodern API.Who are transducers for, and thereby, who isthis talk for? Well, it's for people who want to do streameddata processing in Emacs. It's for people who perhapsaren't satisfied with the existing APIs, for example, theseq API, or some other common libraries that providesimilar functionality. Maybe you're not a fan of the loopmacro. Some people find it difficult to understand. Ormaybe you've done a bunch of Clojure before, and you'd likemore aspects of Clojure in your Emacs Lisp. Or maybe you'rejust interested in transducers in general, because thepattern has now been ported to multiple different Lisps.So I'm Colin. I'm fosskers on everything online, and I domainly back-end programming work and a lot of open sourcesoftware. I wrote Haskell for a long time, both as a hobbyistand professionally. Since the COVID years, I've beenwriting Rust, both open source and professionally. But nowI find that in my spare time, I'm mostly writing Common Lisp.Some things I learned from my years of Haskell was that a lotof programming is just altering the shape of data. You know,sometimes we work through our algorithm line by line. We'retrying to just tell the computer exactly what to do. But if westep back, a lot of the time we're just getting in data of someshape, changing it, and then passing it along. A lot ofthese patterns are common, identifieddecades ago. For instance, we have some collection, and wewant to transform every element of that collection and thenpass it on. Or maybe we're trying to filter out bad elementsin that collection. Or maybe we're looking for a specificelement in that collection. Yes, you could write all thatwith for loops, but these kind of common patterns wereidentified and given names decades ago. So why not use them?They say that there are two major problems in computerscience, one being cache validation and the other beingnaming things.
I've identified five other problems thatcome up when we're trying to deal with collections of data,or big streams of data. One is that if we were trying toload a file all into memory all at once and process the wholething, sometimes we can have memory problems. You'veprobably seen out-of-memory errors or such things.A second issue that comes up is that if we were looking at agiant for loop, in particular a nested for loop or suchthings, it can be hard to tell just by looking at the code whatit's trying to do, what it intends. If we don't go characterby character or line by line, it can be hard to understand it.Furthermore, and this is particularly an issue with EmacsLisp, is that if one call, for instance, to seq-map, thenpiped into seq-filter, for instance, will have anintermediate allocation, the map will take the sourcecontainer, allocate a new one, and then the filter willoperate over the second one. This is wasteful.Furthermore, it can often be difficult to abort a stream.For instance, if we were filtering through our collection,but we knew we only wanted to go halfway, for instance, forsome reason, we have no way to stop it halfway through. Wejust have to process the whole thing, even if we know we don'tneed to. Another issue is that for languages that havetraits, or in Haskell they're called type classes, if youare defining what it means to map over something, you oftenhave to redefine that for every kind of container or thingthat you're iterating over. Wouldn't it be nice if we coulddefine things like map just once and then reuse themeverywhere? Now, transducers solve all five of these,without the addition of new language features, and withlittle more than plain old function composition.
If this is your first time hearing of transducers, yeah,no problem. They were originally invented in Clojure byRich Hickey, and this is a quote from him. He thinkstransducers are a fundamental primitive that decouplecritical logic from list or sequence processing, and if hehad to do Clojure all over, he'd put them at the bottom, at thevery bottom of all the fundamental primitives. Now, that'sRich speaking quite highly of them. And I think he has a pointhere.They were invented originally in Clojure. In morerecent years, they were brought over to Schemevia SRFI 171. That's where I found themwhen I was learning the Guile language.In the process of submitting a patch, I realizedthat there were other things to be improved. So I ported thepattern to Common Lisp, then Fennel, and then morerecently, Emacs Lisp. The Common Lisp and Emacs Lisp APIsare identical. And the Fennel one is not identical, butfairly similar. Overall, everywhere you findtransducers, they should basically be fairly uniform.When I originally made the Common Lisp variant first, Isampled the APIs from a number of different languages andcame up with what I believed to be a representative sample ofwhat most people would want out of such a library. I gavefunctions their common modern names. For instance, mapis map and filter is filter and so on.
What does the usage of transducers look like? Well,these examples will all be the Emacs Lisp variant, but theCommon Lisp will look basically exactly the same, minusthis little t- prefix.Running transducers requires three things. It requires asource. This could be an obvious thing like a list or avector, but it could be other things like a file, or in Emacslist in particular, a buffer.A reducer is a function. It's something likethe + operator or the * operator,or certain constructors of various containers.It takes values and collates them into some final version.Now, finally, we have what we're calling herea transducer chain. This could be one transducer functionor it could be multiple composed together. These are thefunctions that actually take data and transform themsomehow. For instance, this. We have a list of threeelements. We want to reduce it into a vector. How we aregoing to transform the elements along the way: we are doingplus one to each of them. If this syntax is new to you, justknow that this #' just means that this thing thatcomes after it is the name of the function. In Common Lisp andEmacs Lisp, this is necessary, but for Clojure and Scheme,it is not. So we can see here that just this example is not muchdifferent than any other normal map call you might see made,but if nothing else, it's a handy way to convert a list to avector or anything else. There are many, many reducersavailable and many different forms that we cancollate the final value into.
Let's see a more involved example.Okay, now we've got some more meat here.Here we can see usage of the comp functionand a custom source, ints.Ints is an infinite generator of integer values. That's notlike a list or a file. It will generate infinitely.Comp is letting us compose multiple transducer functionstogether. Notice that this is the opposite order of whatwe'd usually be used to from a function like comp. The orderhere is top to bottom, basically, so that the map goes first,then the filter, and then the take. So effectively is whatwe're doing is taking all the integers that exist,positive, adding one to them, filtering out only the evenones, but then just taking 10. Cons here is a function thatjust produces the ending result as a list. So what happenshere specifically is how we are avoiding intermediateallocations. First, the number 0 will come through.It will be pulled out of this source internally by transduce.It will make its way into the map. The map will add it. Then itwill immediately go into this filter step. So it's not likeall the maps occur, and then all the filters occur. We doeverything for each element. So the 0 comes in, now it's 1.The filter would occur. Well, it's going to fail thatbecause it's not even, so it will just bail there. Now we'llgo to the next one. Now 1 will come, it will become 2, thenit will be saved by this evenp call, and then the take willcapture it, because we only want 10 values here. You cansee 2, 4, 6, 8, and so on is the result that weexpect. So let's play around a little bit.
Let's jump into Emacs and see what we can do.Alright, you should see my Emacs screen here.These are the actual notes for the actualpresentation done in Org Mode. I'll boost that up in size fora little bit. That should be more than big enough for you.Just by changing the reducer, we can change the result.Okay, now it's a vector. Well, what else can we do to it? Well,let's just add up the results. Maybe we just want to count theresults. Oh, indeed, there were 10. What if we want to findthe average of the results? What if we want to find the medianof the results? And so on. Here's some more interestingthings that we could do. We could add different steps. Sohere we have all the integers. Let's add, hmm, okay, we'llkeep that. We're going to add t-enumerate. What enumerate doesis for each item that comes through, it isgoing to add a sort of index to it and make it a pair. In thiscase, it's going to be equal to what came in here. Well, we canchange it. If we start this at 1, now it will be different.1 will be paired with 0, and then 2 would be pairedwith 1, and so on. We'll accept that the even call will changethat a little bit. Why we're doing this is because we wantto form a hash table. Let's move that down to 3, maybewe'll get a better result. What do we see? Okay, here now theresult is a hash table. What are its values? Well, 0 seemsto have... The key of 0 seems to be paired with 2, the key of1 seems to be paired with 4,and 2 seems to be paired with 6.Maybe let's jazz that up even a little bit more.We're going to start from a stringand we'll call it hello.That's not going to work anymoreand neither is that, but what we could do iswe could say t-map #'string.I believe we'll do that.Let's see if that works. It did. So that'sgoing to convert a character into a string.Let's just go twojust to make it a little easier. Now you can see that we'veconstructed a hash table here. The key of 0 is mapped to thestring of h and 1 is mapped to e. Now, I really like havingthis reducer in particular.
Know that hash tables arealso legal sources. I find that both in Emacs Lisp and inCommon Lisp, dealing with hash tables--like creating themand altering them--can be a bit of a pain. Having themimmediately available like this with transducers is veryhandy, I find. We can work with something that wasn't a hashtable. We can construct it in a way that makes it amenable tothat, and then reduce it down into a hash table, and here yougo. Very handy.
One last point is that you can see very clearly whatthis is attempting to do, as opposed to, say, a for loop. It'svery clear what that step is doing, and then you can see whatthat is doing, and you know that the result is going to be two.Each line is kind of its own declarative step, and it shouldbe clear, just by staring at this, basically what you'regoing to get out. This is one main difference from otherlanguages that have things--say, for instance, Rust'siterator API--is the difference between the transducersand the reducers. If we go up here, for example, thedifference between the transducers and the reducers andthe sources is not explicitly laid out, whereas withtransducers, it is. You have to be aware of how these thingsare different. I think that that helps clarity.
Moving on. How do transducers work? Well,we want to go see the README.So, what we're going to do iswe're going to go to here.You should still be able to see this.This is the CL example, actually.Let's go to transducers.el.Their APIs and READMEs are the same,but just for the sake of it, we will go seehow this looks on the Emacs side,just so that nothing is a surprise.But recall that the APIs are essentially the samebetween the two. If you go to this section, writing yourown primitives, you can read about how transducers areactually formed, whether or not you want to write themyourself or not. We can see here t-map. We accept thefunction that you want to operate with. Then you've gotthis extra little lambda here that's coming in, and it'sreceiving a thing that is named reducer. Now, while herewe're calling it reducer, it's actually the chain of all thecomposed functions together. It's all those maintransducer steps. Finally, it's the reducer allcomposed together with normal function composition.That will matter very soon. Now here's the actual meat.We can see the accumulative result that's coming in with thecurrent element. Now we need to operate on this.Were it normally mapped, we would see usapplying the F to the input.But here, you can see us applying the F to the input and thencontinuing on. So us calling the rest of the composed chainhere is the effect of, in the previous slide, moving to thenext step. We could ignore this line for now.If you're curious, please read the README in detail.Now, what about reducers?What do those look like? Well, let's just scrolldown here. Recall that a reducer is a function that'sconsuming a stream, right? Zoom that up for you a little bit.Now, in the case of count, recall that this is how it'sworking, how we saw a moment ago. So clearly this list of fiveelements only has five things in it. Well, a reducer bystructure is a function of two, one, or zero arguments. So wecan see here in the case of two, this is the normal iterativecase. We don't care about the input for count, we just careabout the current accumulated count that we're doing, andwe add one to it, and that's it. This then goes back tothe loop and the whole process starts again with the nextelement. In this kind of done case, this is used internal tothat sort of the supervising function transduce. It's justconfirming the final result. Sometimes somepost-processing is necessary here, but in the case ofcount, as it is so simple, that is not necessary. And nowhere's the base case. This is also used within thatsupervising transduce function at the very top. Well, ifyou're counting, you have to start from somewhere, right?In this case, well, what you're starting with is zero.In the case of cons, you'd be starting with an empty list.In the case of vector, you'd be startingwith an empty vector and so on.Once again, if you are more curious, please take a look atthe README.
Okay, transducers in the wild. Well, let's go take a look atprocessing some CSV data.We're going to open up a new Emacs Lisp bracket here. So I havea file. And in this file, let's just go look at C-x b rightthere, you will see that we've got some bank transactioninformation. It's got these transactions from a wholebunch of different people into different accounts,whether it's money coming in, money going out, and then abasic description. How's your Latin? But for this littletest, what we want to do is we want to find Bob's final bankbalance. Let's get on to it. First of all, let'sjust confirm, let's do some basic stuff.with-current-buffer, find-file-noselect.What's the name of that file?This is pre-organized, so youwill just see it right here.t-transduce and t-comp. We don't know what we're going to compyet. Actually, I'll just pass to show you. And then we willsee, let's just do a little t-count just to confirm. What'sour source? Well, our source is a buffer, t-buffer-read.And note that because we're using with-current-buffer,if we go like this, if we go current-buffer, this will just work. Sonow let's... Well, that was odd. I should have done it likethat. There we go. So now we should make that a little smallerso you can see what it is. Now if we hit RET, we should get theright result. Okay, so there are 50,001 lines in this file,but the one extra one is the name of the headers, right?We want to process this file in more detail. So how can we dothat? Well, let's start by just automaticallyinterpreting the results as CSV. If we do that, okay, wellnow we only have 50,000 entries as we expected, right?Because it's going to pull out the header line. If we now saywe want to just filter out, you know, We only want Bob, right?So if... gethash, it was in the row of name. Each line here ismade into, at least by default, is made into a hash map. So ifwe go like this, we should see that. Okay, so 12,000 of theselines or thereabout belong to Bob.Let's just move that over a little bit. Actually, I suppose we don't evenneed that anymore. I'll just keep that full size for you.Okay, so all right, there's about 12,000 results for Bob ofthe 50,000. What's next? Well, we want to confirm,we want to pull out everything,all of the in and the out entries.Thank you. So, string to number, because we know thateverything came in as strings. Unfortunately, the from-csvdoesn't try to be smart at all, it's just pulling everythingin as string values. If you want actual things to benumbers or whatever, that is up to you to do the parsingyourself. Okay, so we have those two values now. We knowthat we saw from the data just a moment ago that you're onlygoing to have a value in one column or the other. It's eithergoing to be 0 in the empty one, or you're going to have somenumber in the other. So we know that we can just naively addthem. If it was in, it would always be positive. So we'll justadd that. But in the negative case, we want to just make itnegative really briefly before we add them all together.let's now just prove to ourselves that we are sane here. Whatwe're going to do is we're going to quickly go say take5 just to convince ourselves, and we'll go cons, and let'ssee if we get kind of results that make sense. Okay, thesesort of make sense. It looks like you know Bob's got some bigexpenses here. If we take say 15, does it look any better?Okay, looks like he had a payday. All right, good job Bob.Let's get back in there. Now we only really care aboutadding the final result, right? So there we go. Add that alltogether and we'll see what we get in a moment. Okay, wow,Bob's rich. Okay, so it looks like in his 12,000transaction, Bob has an overall net worth of $8.5 million.Looking pretty good.So here's an example of how you can, particularly in EmacsLisp, how you can very easily just get a file, consider it thecurrent buffer, and then just do whatever you want to it.Note that there is sort of first-class support for both CSVand JSON, and then you have, and both of those bring in theirvalues as hash maps, and then you're just free to do whateveryou want and process them, potentially both writing themback out as CSV or JSON once again.
Some issues with transducers that can come up isthat one, a zip operator is missing, but I'm working on it.Two is that performance, particularly in Emacs Lisp, isn'tthat great. It could be due to the sort of nested lambda callsthat have to occur internally, but the common Lispimplementation is quite good. and there's yet no supportfor parallelism. You can imagine that a lot of those stepsyou could potentially perform in parallel depending on theplatform, but research has not yet gotten that far. Okay,that's all. Thank you very much. If you have any questions,please contact me.
Captioner: sachac
Q&A transcript (unedited)
Hopefully the internet goes well. It's a nice Mondaymorning here in Tokyo.Are we connected all right?Okay, I seem to be struggling still with my audio. 1 2ndcalling. Yeah, you were muted for a moment there. Okay,there we are. Okay. All right. Sorry about that. I got a muteout my, my back office chatter. That's kind of distractingme a little bit. All right. Sorry. I may have lost the plot alittle bit. I think I did. However, find the 1st question.I got pretty distracted by conversation backstage. Yeah,
[00:01:09.920]Q: When I tried comparing transducers.el to cl-lib and dash (benchmark-compiled), I got the following results
no problem. So the first question here, someone's asking,when they first tried comparing transducers.el, the cl-liband Dash bookmark compiled, and they give some detailedresults we're sharing on the stream. Um, they expectedtransducers to be slower than CL loop, but faster than CL libor dash. However, this isn't the case, any idea why. And soI'll, I'll come back into their data to show there's they'reshowing, um, you know, there's not a lot of detail on the, onthe, on the use case here. We could certainly click throughit, do it.Oh, I should've waited to zoom until I find my spot here.There we are.All right, so there's our example.Looks like we are doing a simple map and a sum.Mm-hmm. Yeah, that's right. Yeah, question aboutperformance. So a case like this, a simple, I just want to ripthrough a collection of numbers and sum them all. That's acase where basically loop is always going to win becauseloop is optimized. This is true in both Emacs Lisp and inCommon Lisp. For a case like this where you're not reallydoing two nested of chained calls, like you don't have manysort of what I was compositional steps. If you're justripping through a collection of numbers, loop is alwaysgoing to win. Transducers kind of shines when you have to dothings that loop can't in terms of expressing yourself. Sothere are lots of different transducers that you can chaintogether. And in that case, you're kind of prioritizingdeveloper time and developer happiness because you'reable to yourself more clearly, whereas sometimes thosekind of algorithms can get very hairy if you're just usingloop. Now that sounds like I'm moving the goalposts, andthere's really no excuse for these things not being asperformant as possible. In this specific case, my guess isthat the transducers is slower because it has to do a wholebunch of like inner function calls in order to actually dothe adding and the collecting. So there's a lot of stuff thatjust the raw loop doesn't have to do, which transducersdoes. And so in this case, that's why it would be slower.All right, makes sense.Um... I cannot comment against Dash. And also a reminderthat transducers both in CL and in Emacs Lisp here doesn'tattempt to do any, you know, fun, you know, inner rewritingor, you know, what's called an Haskell fusion. Like if youhave two different map steps, like in a row, it's not gonnasee that and somehow fuse them internally. It's a fairly, inthat sense, the implementation is just as is.to make it you know as raw fast as possible. The idea beingthat ergonomics is more important up front. Yeah, that'skind of a whole fascinating sub-panel, right? My theme thisconference has been, oh, all these different things weshould try to get sub-panels going for and use that. Maybefill in the dev track or even have a third track or whatever.I'm not that concerned about the logistics of squeezinginto the schedule so much. But anyway, interesting, I mean,to say.
[00:05:40.840]Q: Do you know of any theoretical texts on transducers?
Did we already speak to theoretical texts? No, right? No,let's continue. Okay, so another question from the group.Do you know of any theoretical texts on transducers? Myreadme, particularly of the Common Lisp implementation,is the theoretical text on transducers. Rich Hickey hassome YouTube videos which also come close. I mean, heinvented the things. But in terms of having a fullexplanation of everything, it's my readme and it's alsothe...The info manual of Guile Scheme, their documentation onSurfy 171 is what I used to learn transducers and tore-implement them in other LISPs. So if you just want like adocument explaining them, MyReadMe is actually theclearest that I've found. Awesome. Okay, next question.And I'm sorry, you gave a name, you referred to somebody'svideos. Rich Hickey, the inventor of Clojure. Rich Hickey,thank you. Hope I got the spelling right, and maybe somebodycan catch that and fix it. If not, I'll reach on. Thank you.
[00:07:04.720]Q: Did you think about [compiler features, macros] viz your cl, fennel, elisp, porting of your transducers?
Reach on to the next question. Waters (Lazy Series inLisp, late 70s) said this should have been done as anadditional compiler feature in compilers, but if not, mustbe a macro package. Do you think about that vis your CL,Fennel, Elisp, porting of transducers? I think thatthere's definitelysome Galaxy Brain Lisp author out there is probably smartenough to turn a bunch of this stuff into macros. I believethat's how the common Lisp library series works. It seesthat you were calling map or whatever, and it actually knowsthat that's a special macro key. in order to be fast. I did notdo that. The implementation as I have it is very simple andsimplicity shouldn't be underestimated.I love it. What a nice succinct answer. Even I can manage totype that out as I scroll us to the next question.
[00:08:16.579]Q: Does t-buffer-read provide a lazy stream that's linewise, or charwise, or do something else entirely?
So, does t-buffer-read provide a lazy streamthat's line-wise or character-wise or do something elseentirely?Okay, there are two functions. I showedt-buffer-read. There's also one called t-file-read,which does that. You actually have the buffer open,it's much more clever.t-buffer-read, I believe, is simpler. As long as you have anEmacs list, what is called the current buffer active. I'mfairly sure you're able to just call next-line on it. I don'tbelieve that I'm doing anything fancy there, looking forline ends. I believe I'm just grabbing the next line and thenprocessing that line-wise. Very good.
[00:09:09.424]Q: Can the Elisp library be combined with the stream.el API or seq in general?
Can the Elisp library be combined with the stream.el APIor seq in general? I would say that these librariesare completely orthogonal. You saw that everythingwas prefixed by t-.Basically, transducer is its own zone. However, one thingthat I do in the common lisp, which is theoreticallypossible for the Emacs Lisp as well, is kind of like littleshim libraries. So I provide, at least for Common Lisp, for anumber of, you know, popular sort of third-partycollection types, I provide an ability to use them assources. Maybe that's what you mean. Likethe built-in containers for Emacs Lisp are alreadysupported. So, you know, a vector hash table and so on.make sense so i think what i heard there is yeah go aheadplease sorry in terms of mixing like you know like forinstance you know like seq-map used in transducerswe'll put it that wayi was just gonna say i think it um it it sounds like you'resaying Yeah, probably they are actually. We don't know yetabout any places where they don't play nicely together. Soquite possibly so. We can use sequence and transducerstogether, for example. As a source potentially, yeah. It'svery easy because that just uses defgeneric. As long as youhave a new, like if you have a new collection type, as long asyou implement a def method for it somewhere, it'll justmagically work with this library. That's the magic of...Yeah, as an Emacs user enjoying, you know, sort of therenaissance of new features it's had, or sorry, Emacs ERCuser for chat. I've seen a lot of awesome stuff get done in thelast couple of years with generic set. JP never was workingon that. And like, that's just making me my eyes pop and go,wow, that does make a whole lot of things simpler, doesn'tit? I think we're a lot of us running into generics and howthat solves problems in Emacs.
[00:11:47.543]Q: How does one debug a t-comp expression? Can you single step and see intermediate results of the different statements you declare?
How does one debug a t-compexpression? Can you talk in terms of single step,step-by-step, intermediate results of the differentstatements you declare? Yes. So in Common Lisp, this isand sly stickers and things like that. In Emacs Lisp, it's alittle bit, shall we say, more difficult. For stepdebugging,so what comp does is comp internally, it should be a macro,but currently it's not, although there's work to improvethat. It's doing an internal reduce and it's turning intoone giant kind of composed lambda inside. So I don't know ifstep debugging would work there. However, we do have onefunction called log, which lets you inspect intermediateresults. So you could technically use that to injectyourself somewhere into the transduction chain and, youknow, halt or, you know, inspect the current value, etcetera. So you get a bunch of questions lined up. I thinkwe're coming up, uh, within our last five minutes, uh,before some declared, uh, reset time that we haveinternally to just roll our closing credits, so to speak.Um, not that I would want to cut the question and answershort, but I might have to step away personally. But, um, aswe discussed before, you can just kind of run the QA, howeveryou want here. Um, or, or take questions offline if you'dlike to answer them off the pad. And I just want to say one moretime. Kitt said it managed later. Thanks again for your talkfor dedicating the time to this live QA. And I think we can seeby the many questions that are here. So I'll try to kind offlip us through as many of them as I can with our last couple ofminutes, if that sounds good. Alternately, this might be agood time if you have kind of wrap it up, final thoughts, asLeo Sopanda saying. By all means, have at. Sure, thanks alot. I'd say that if you are still curious, check out theread-me's because those have a lot of information,including a full description of the API and everythingthat's available.Otherwise, just give them a shot. Using these things is thebest way to learn them, of course. I use them everywhere,basically, all across my Emacs list and all across my commonlist now. They get a lot of mileage. All right. You'respeaking our language now. As Emacs users, all our ears pokeup when you say, I'm getting a lot of mileage. I'm using itacross everything. Every Emacs user has a story thatharmonizes with that, I think.
[00:14:42.495]Q: Is there a path for transducers to enable elisp processing of otherwise overly large datasets as if just normal Emacs \"buffers\" (i.e. just pulling one thing at a time so essentially stream-like under the hood but buffer-like in interface), with none of the usual perf issues with a traditional buffer structure?
So our next question, isthere a path for transducers to enable Elisp processing orotherwise overly large data sets as if just normal Emacsbuffers, i.e. just pulling one thing at a time. Soessentially stream like under the hood, but buffer like aninterface. I think that makes sense to me. with none of theusual performance issues, like as if, you know, the historywith long files is what that brings to mind, I guess. Yes, soas you saw before, the withBufferRead sort of streamfunction does have to have the actual buffer in memory, andthen you can go really fast. But there's another one withfile read. Now, again, I haven't tried to optimize that yet.But in theory, it is able to read right from the underlyingfile without having to open it as a buffer first.Awesome. Ari, the performance issues mentioned, and thatpopped up recently in the list and forums, to what extentdoes tail call optimization and other mechanisms likeinlining, garbage collection friendliness, and so on,could these alleviate issues, enable their use at little tono extra costs? I feel like we're leading the witness here,but I'm sure you see where we're going. Yeah, no problem. Soin terms of tail optimization, that's already happeningbecause the internal loop mechanism is using CL labels. Andin Emacs Lisp, CL labels is just a macro that is likeextremely tail recursive. So that's very, very fast. It'snot tail recursive, but it's using like goto. So it'sextremely, extremely fast, like the raw looping of it. So,okay, well then where does the slowness come from? It'sprobably coming from those lambdas and it's probablycoming from, uh, like extra consing, extra allocationsomewhere, which is, um, sort of what you were, what you'rereferring to with the GC friendliness. So perhaps there'ssome, um, um, yeah, some, like some fusion that I can do tospeed it up. Yeah, that just sounds fascinating endlessly.
[00:16:51.200]Q: Is there an option to read a csv/json and produce an alist or plist instead of a hash table for an entry?
Are there options to like read from a CSV, JSON, produce analist or plist instead of hash table? Absolutely.Yes, I need to double check that, but we can read both CSV andJSON, and you should be able to just turn on the plist option.I will double check, but there's fairly free conversionbetween those three types because hash table is not alwayswhat you want. And actually, I suspect that slowness that wesaw in the demo before was because it was allocating hashtables for every, like, all of the 50,000 lines. And had itbeen a plist, it would have been faster. Interesting, somaybe there's opportunities even if you end up with hashlists, but then they're shared strategically and you paythe cost of a little extra layer in there that buckets themtogether the way that we might group files by the first fourcharacters in the file name once we've got a million files.
[00:17:50.520]Q: Is the common lisp version ready for 'production' use? Is it complete enough and the API stable enough?
Anyway, is the Common Lisp version ready for productionuse? Do you want to comment on API stability? I use it all thetime. I'm writing a game in Common Lisp right now, and I'musing transducers everywhere in there, and it doesn't evenmake a dent in the frame rate, and I'm using themextensively. Okay, well, I'll just read from chat. Thanksso much for the answers.
[00:18:17.477]Q: Do we need a pre-written \"t-\" version for every already existing reducing function like + or is there a function to construct them from already defined reducer 2-arg functions?
Do we need a pre-written or t-minusversion for every already existing reducing function,plus, as an example? Or is there a function that constructs,in my, I'm going to add the word, auto-visualifies themalready, auto-defines or something, or just genericallywraps function calls some way? already defined. This isbasically fold. Some built-in functions like plus alreadyfunction like reducers. It's a coincidence that they dothat. But there's an example in the README. Max is one thatdoes not act like that. For instance, maybe I could screenshare later, but if you just type in plus one, If you call plusone in Emacs or Common Lisp, you get back one. It actuallyonly needs one argument. If you only type plus, it actuallygives you zero. Plus and multiple satisfy the API ofreducers. But if you have one that doesn't, like the maxfunction, and similarly, just type in plus as a functioncall, just plus with nothing else, and you'll see. No, as afunction. zero will come out. This basically means itsatisfies the reducer API. But a function like max does not.If you just type in max and then one, it won't work. Pardon me,it did. But if you type in max with nothing else, it wouldn'twork.Hence, we have to wrap it in something like fold. I would saygo look at the fold function. Right, which that I won't do.I'm not that well enough prepped. Darn it. Leo would havebeen here, but oh, well, you got me. Yeah, no problem. Butfold is sort of the ultimate reducer function. Great. So isthere, where was I? Here we go. We're way past this, right? So
[00:20:26.320]Q: Is the compelling argument for transducers is that it's a better abstraction?
is the compiling argument for transducers that it's abetter abstraction? It seems like there are concerns,objections, while problematically valid focused onimplementation. Can this abstraction allow for advancesin implementation? Yes, what I've basically done is mostlyfollowed the pattern of usage that exists in Clojure and inScheme's SERP 171. In theory, the service level API is thesame no matter where you're using this, and that's the idea.If you learn them in one list, you should be able to use themeverywhere. Then what it's actually doing under the hood isfree for us to change around. My implementations are mostlybased on the scheme with a few alterations here and there.And in the Common Lisp case, like adding some Common Lispismsto improve usage like UX a little bit. But overall, we arefree to do whatever we want internally to speed upperformance. I just haven't done that work. Awesome.Awesome. So here's where I have to, where we're getting thehook. We've just been pulled off the stream. The viewersjust saw the crawl by as it sent us over to the other pad where Iget to jump on and get involved with that now. But I can'tthank you enough, Colin. Would you like me to stop therecording here? Any other comments you'd like to make? Uh,yeah, sure. Like, I mean, I'll stick around for any more livequestions. I'm looking at both IRC and, and, um, uh, big bluebutton here. So if people have more questions, I'll hangaround for a bit. I'm going to leave the channel open. I seeyou do have a few people in here, so I'm just going to go aheadand leave the recording. We can always trim it. Um, trim itup. If you, uh, let us know, Hey, the last 10 minutes weren'tanything, you know, or whatever. No, no pressure, noworries, no mistakes. Thank you. Really appreciate you.Yep. Thanks a lot.
[00:22:31.960]Q: Question about how the transducers video was made? Did you use Reveal.js? Do you have a pointer to the html hosted presentation? How did you generate the content for Reveal?
OK, does anyone else have some questions? I see Mohsen in theBigBlueButton chat is asking how I made the video. So thepresentation itself was done with RevealJS from Org Mode.So as you saw, I had a raw Org Mode buffer, which waswhich was the presentation itself, which I then justexported with a few certain settings, a fewcustomizations. And then for screen recording, I used OBS,which worked flawlessly on Arch Linux. I used Sway,Wayland, and all of that. So all of that just worked, whichwas very impressive. Where do the HTML host thepresentation? I don't have that presentation hostedanywhere.I'll look at the.I don't see that.Here it is. So we've got the file here as well.Looks like that's it for questions, basically.Yep, and it looks like everyone's moved on for now. Let'ssee. I mean, it would be so this is answering lounge 81 on IRC.
[00:24:20.160]Q: From your investigations and tests so far, do you think there would be the necessity of transducers to eventually go down into the C level code for things like using them to solve "infinitely-big" buffer-like interfaces and such?
Yeah, like, if we really wanted to go that hardcore, maybethere's some like C level stuff that we couldyou know, significant demand for such a thing. You know, sofar there hasn't been such demand, but maybe there will be inthe future. Yeah, perhaps there's some custom stuff wecould do.And otherwise, magic one.Well, it looks like some people are quite happy with this.All right. That's about what I've seen. So why don't we end ithere? I think I can control the recording from my end. If Ipause it, will that work? All right. Thank you, everyone.