LLM clients in Emacs, functionality and standardization
Andrew Hyatt (he/him) - ahyatt@gmail.com - https://urbanists.social/@ahyatt - http://github.com/ahyatt
Format: 21-min talk ; Q&A: BigBlueButton conference room
Status: Q&A to be extracted from the room recordings
Talk
Duration: 20:26 minutes

00:00.000 Intro to the Talk
00:25.080 What are LLMs?
01:56.360 Power of LLMs (Magit Demo)
03:32.240 Drawbacks of LLMs (regex demo)
05:20.120 Embeddings
07:32.800 Image Generation
08:48.480 Fine-tuning
11:08.160 Open Source
12:02.840 The Future
14:08.200 LLMs in Emacs - existing packages
18:15.960 Abstracting LLM challenges
19:04.080 Emacs is the ideal interface for LLMs
20:01.960 Outro
Q&A
Description
As an already powerful way to handle a variety of textual tasks, Emacs seems uniquely well poised to take advantage of Large Language Models (LLMs). We'll go over what LLMs are and what they are used for, then list the significant LLM client packages already available for Emacs. The functionality these packages provide can be broken down into a set of basic features, but each package currently manages things in an uncoordinated way. Each might support different LLM providers, or perhaps local LLMs, and those LLMs support different functionality. Some packages connect directly to the LLM APIs, while others use popular non-Emacs packages to do so. The LLMs themselves are evolving rapidly. There is a need for some standardization, so users don't have to configure their API keys or other setup independently for each package, but also a risk that any standardization will be premature. We show what has been done in the area of standardization so far, and what should happen in the future.
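To make the configuration problem concrete, here is a minimal sketch of the "configure once, use everywhere" setup that a shared client library (such as the llm package discussed in the talk) makes possible. The constructor and function names follow that package's documented API at the time of writing and may differ in current versions, so check its README before copying this:

(require 'llm)
(require 'llm-openai)   ; or llm-ollama, llm-gpt4all, ... for local models

;; One provider object, configured once by the user...
(defvar my/llm-provider
  (make-llm-openai :key (getenv "OPENAI_API_KEY"))
  "The provider that every LLM-using package in this config could share.")

;; ...which any client package can then call without caring whether it is
;; talking to OpenAI, a local Ollama server, or something else entirely.
(llm-chat my/llm-provider
          (llm-make-simple-chat-prompt
           "Suggest a one-line commit message for a typo fix in the manual."))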
About the speaker:
Andrew Hyatt has contributed the Emacs websocket package, the triples package (a triple-based DB library), and the ekg package (a tag-based note-taking application). He has been using various other LLM integrations, and as part of extending ekg, he's been working on his own.
Discussion
Questions and answers
- Q: What is your use case for Embedding? Mainly for searching?
- A:
- I got you. It kind of expands our memory capacity. (A rough sketch of this kind of embedding search appears after this question list.)
- A:
- Q: What do you think about "Embed Emacs manual" vs. "GPT's Emacs manual"?
- A:
- Yes, GPTs actually work by kind of embedding your document into their memory and then using the logic provided by GPT-4 or other versions. I never tried that one, but I'm just wondering if you have ever tried comparing the difference.
- A:
- Q: When deferring commit messages to an LLM, what (if anything) do
you find you have lost?
- A:
- Q: Can you share your font settings in your Emacs config? (Yeah, those are some nice fonts for reading)
- A: I think it was Menlo, but I've since changed it (I'm experimenting with Monaspace).
- Q: In terms of standardisation, do you see a need for a medium-to-large scale effort?
- A:
- I mean, as a use case, the interface is quite simple because we're just providing an API to a server. I'm not sure what standardization we are really looking at. I mean, it's more about how we use those callbacks from the LLM.
- A:
- Q: What are your thoughts on the carbon footprint of LLM usage?
- A:
- Q: LLMs are slow in responding. Do you think Emacs should provide more async primitives to keep it responsive? E.g. url-retrieve is quite bad for building API clients.
- A:
- gptel.el is async, and very good at tracking the point.
- A:
- Q: Speaking of which, anyone trained/fine-tuned/prompted a model with their Org data yet and applied it to interesting use cases (planning/scheduling, etc.) and care to comment?
- A:
- I use GPTs for doing my weekly review. I don't rely on it entirely; it helps me find things I never thought about, and I just use it as an alternative way to do the review. I find it kind of interesting to do so.
- A:
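As promised in the embeddings answer above, here is a rough sketch of semantic search over a set of notes using the llm package's embedding call plus cosine similarity. The function names follow that package's documented API at the time of writing; the loop below re-embeds every note on each search purely for illustration (a real implementation, like ekg's, stores the vectors once), so treat this as a sketch rather than a recipe:

(require 'llm)
(require 'seq)

(defun my/cosine-similarity (a b)
  "Cosine similarity between the float vectors A and B."
  (let ((dot 0.0) (na 0.0) (nb 0.0))
    (dotimes (i (length a))
      (let ((x (aref a i)) (y (aref b i)))
        (setq dot (+ dot (* x y))
              na  (+ na (* x x))
              nb  (+ nb (* y y)))))
    (/ dot (* (sqrt na) (sqrt nb)))))

(defun my/search-notes (provider query notes)
  "Return NOTES (a list of strings) sorted by semantic similarity to QUERY.
Embeds every note on each call; a real implementation would cache these."
  (let ((query-vec (llm-embedding provider query)))
    (seq-sort-by (lambda (note)
                   (my/cosine-similarity query-vec
                                         (llm-embedding provider note)))
                 #'> notes)))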
Notes and discussion
- gptel is another package doing a good job with flexible configuration and choice of LLM/API
- I came across this adapter to run multiple LLMs, Apache 2.0 license too! https://github.com/predibase/lorax
- It will turn out that the escape hatch for AGI will be someone's integration of LLMs into their Emacs, enabling M-x control.
- I don't know what question to ask, but I found the presentation extremely useful. Thank you.
- I think we are close to getting semantic search down for our own files
- yeah, khoj uses embeddings to search Org, I think
- I tried it a couple of times, latest about a month ago. The search was quite bad unfortunately
  - did you try the GPT version or just the PyTorch version?
  - just the local ones. For GPT I used a couple of other packages to embed in OpenAI APIs. But I am too shy to send all my notes :D
  - Same for me. But I really suspect that GPT will be way better. They now also support LLama, which is hopeful
  - I keep meaning to revisit the idea of the Remembrance Agent and see if it can be updated for these times (and maybe local HuggingFace embeddings)
- I think Andrew is right that Emacs is uniquely positioned, being a unified integrated interface with good universal abstractions (buffers, text manipulation, etc), and across all use cases and notably one's Org data. Should be interesting...!
- Speaking of which, anyone trained/fine-tuned/prompted a model with their Org data yet and applied it to interesting use cases (planning/scheduling, etc.) and care to comment?
- The ubiquitous integration of LLMs (multi-modal) for anything and everything in/across Emacs and Org is both 1) exciting, 2) scary.
- I could definitely use semantic search across all of my stored notes. Can't remember what words I used to capture things.
- Indeed. A "working group" / "birds of a feather" type of thing around the potential usages and integration of LLMs and other models into Emacs and Org-mode would be interesting, especially as this is what pulls people into other platforms these days.
- To that end, Andrew is right that we'll want to abstract it into the right abstractions and interfaces. And not just LLMs by vendor/models, but what comes after LLMs/GPTs in terms of approach.
- I lean toward thinking that LLMs may have some value but to me a potentially wrong result is worse than no result
- I think it would depend on the use case. A quasi-instant first approximation that can readily be fixed/tweaked can be quite useful in some contexts.
- not to mention the "summarization" use cases (for papers, and even across papers I've found, like a summarization across abstracts/contents of a multiplicity of papers and publications around a topic or in a field - weeks of grunt work saved, not to mention of procrastination avoided)
- IMHO summarization is exactly where LLMs can't be useful because they can't be trusted to be accurate
- https://dindi.garjola.net/ai-assistants.html
- A friend wrote this: https://www.jordiinglada.net/sblog/llm.html
- https://blogs.microsoft.com/on-the-issues/2023/09/07/copilot-copyright-commitment-ai-legal-concerns/
- I have a feeling this is one of them "if you can't beat them, join them" scenarios. I don't see that ending with a big global rollback due to such issues anytime soon...
- (discussion about LLMs, copyright, privacy)
- I spent more time than I was hoping to setting up some custom Marginalia(s?) the other day, notably for cases where the "category" is dynamic, the annotation/affixation function varies, the candidates are an alist of key-value pairs and not just directly the value, and many little specificities like that. Idem for org-ql many moons back, org-agenda, etc. That sort of workflow always involves the same things: learning/reading, examples, trials, etc. I wonder if LLMs could be integrated at various points in that recurring exercise, to take just a sample case.
- that's yet another great use case for LLMs: externalizing one's thinking for its own sake, if only to hear back the echo of one's "voice", and do so with an infinitely patient quasi-omniscient second party.
- oooh, might be a good one for blog post writing: generate some follow-up questions people might have
- Yeah, a "rubber duck" LLM could be very handy
- I'm sure there would be great demand for such a thing, to dry-run one's presentations (video or text) and generate anticipated questions and so on. Great take. (A rough sketch of this idea appears after this list.)
- I've seen some journaling prompts along those lines. I think it'll get even more interesting as the text-to-speech and speech-to-text parts get better. Considering how much people bonded with Eliza, might be interesting to see what people can do with a Socratic assistant...
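Here is the hedged sketch referred to in the "follow-up questions" thread above: asking a provider, asynchronously, for the questions a reader of the current buffer might have. The llm-chat-async calling convention (response callback, then error callback) follows the llm package's documentation; the provider and model name in the usage comment are only examples, and the prompt wording is obviously something to tune:

(require 'llm)
(require 'llm-ollama)  ; assumes a local Ollama server; any provider works

(defun my/draft-follow-up-questions (provider)
  "Ask PROVIDER for follow-up questions a reader of the current buffer might have."
  (llm-chat-async
   provider
   (llm-make-simple-chat-prompt
    (concat "Here is a draft post.  List five follow-up questions a "
            "curious reader might ask after reading it:\n\n"
            (buffer-substring-no-properties (point-min) (point-max))))
   (lambda (response)                      ; called with the reply text
     (with-current-buffer (get-buffer-create "*draft-questions*")
       (erase-buffer)
       (insert response)
       (pop-to-buffer (current-buffer))))
   (lambda (_type msg)                     ; called on error
     (message "LLM error: %s" msg))))

;; Example use, with a local model (the model name is just an example):
;; (my/draft-follow-up-questions (make-llm-ollama :chat-model "mistral"))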
Transcript
Captioner: bala
Q&A transcript (unedited)
I think this is the start of the Q&A session. So people can just ask me questions here. Or I think maybe these questions are going to be read by someone. Yes, thank you. Should I start doing that? I also know that there's questions in the either pad room, so I could start out answering those as well. If you prefer to read the questions yourself, by all means, or if you would prefer me to read them to you, that also works. I think it'll just be more interesting then. what is your use case for embedding, mainly for searching? searching. And I think it is very useful when you're searching for something in a vague way. Just to give you an example, I have a note system called EKG. I type all my notes on it. You can find it on GitHub and Melba. But I wrote something at some point a year ago or something. I wrote something that I just vaguely remembered. Oh, this was about a certain kind of communication. I wanted communicating to large audiences. There's some interesting tip that I wrote down that was really cool. And I was like, well, I need to find it. So I did an embedding search for something like, you know, tips for communicating. Like those words may not have been in what I was trying to find at all, But it was able to find it. And that is something that's very hard to do in other ways. Like, you know, if you had to do this with normal search, you have to do synonyms. And like maybe those synonyms wouldn't cover it. Like with embedding, you can basically get at like the vague sentiment. You're like, you know, you're, you know, you can really query on like what things are about as opposed to what words they have. Also, it's super good for similarity search. So you could say, look, I have a bunch of things that are encoded with embeddings that I want to show. For example, you can make an embedding for every buffer. You'd be like, well, show me buffers that are similar to this buffer. That doesn't sound super useful, but this is the kind of thing you could do. And so if you have a bunch of notes or something else that you want to search on, you'd be like, what's similar to this buffer? Or what notes are similar to each other? What buffers are similar to each other? It's super good for this sort of thing. And it's also good for this kind of retrieval augmented generation, where you sort of, you retrieve things and the purpose is not for you to see them, but then you pass that to the LLM. And then it's able to be a little bit more accurate because it has the actual text that you're trying to, that is relevant, and it can cite from and things like that. And then it could give you a much better answer that's kind of, you know, not just from its own little neural nets and memory. next question. What do you think about embed Emacs manual versus GPT's Emacs manual? trying to say. So I mean, if someone wrote that and wants to expand on it a little bit, but I think that maybe you're saying like you could embed, have embeddings for like various, like every paragraph or something of the Emacs manual. But it's also the case that like GPT is already for sure already read it, right? And so you could ask questions that are about Emacs and our ELISP or whatever part of the manual you want to find. And it will do a reasonably good job, especially the better models will do a reasonably good job of saying you something that is vaguely accurate. But if you do this retrieval augmented generation with embeddings, you can get something that is very accurate. At least I think. 
I haven't tried it, but this is a technique that works in other similar cases. So you can also imagine like, oh, this whole thing I said, like, oh, you can query for vague things and get parts of the manual, perhaps. I'm not exactly sure if that would be useful, but maybe. Usually when I'm looking things up in the Emacs manual or Elist manual, I have something extremely specific and I kind of know where to look. But having other ways to get at this information is always good. if you would like to read that yourself, or would you like me to read it for you? I've never tried. Yeah, the question is like OK, there is a difference between the kind of thing as I just described. I have not tried the difference with the EMAX manual itself. It'd be interesting to see what this is, but I would expect like these techniques, the retrieval augmented generation is generally pretty good. And I suspect it would, I would bet money on the fact that it's gonna give you, you know, better results than just, you know, doing a free form query without any retrieval augmented generation. When deferring commit messages to an LLM, what, if anything, do you find you might have When deferring anything to a computer, like, you know, I used to have to remember how to get places, and now, you know, on the few occasions which I drive, like, It could just tell me how to get places. So similar things could occur here where like, okay, I'm just leaving the LLM. And so I'm kind of missing out on some opportunity to think coherently about a particular commit. Particular commits are kind of low level. I don't think it's usually relatively obvious and what they're doing. And in this case, I think there's not much loss. But for sure, in other cases, if you're starting to get into situations where it's writing your emails and all this stuff. First of all, it's in 1 sense, I'm not sure you might be losing something by delegating things. On the other hand, you know, when you're interacting with these LLMs, you have to be extremely specific about what you want, or else it's just not going to do a good job. And that might actually be a good thing. So the question might be that maybe you might gain things by using an LLM to do your work. It might not actually even save you that much time, at least initially, because you have to kind of practice again super specific about what you want to get out of the output it's going to give you so like oh I'm you know maybe you know you're on the emacs devel mailing list and you're like okay write this email about this about this And here's what I want to say. And here's the kind of tone I want to use. And here's the like, oh, you might want to specify like everything that you kind of want to get into this. Usually it's easier just to write the email. But I think that practice of kind of understanding what you want is not something you normally do. And I think it's going to be an interesting exercise that will help people understand. That said, I haven't done that much of that, so I can't say, oh, yeah, I've done this and it works for me. Maybe. I think it's an interesting thing to explore. Let's see. Can you share your font settings in your Emacs config? Those are some nice fonts for reading. Unfortunately, I don't save those kinds of things, like a history of this. I've kind of switched now to, what was that? I think I wrote it down in the, I switched to MunaSpace, which just came out like a week or 2 ago, and is also pretty cool. So I think it's Menlo. 
The internal question, what font are you using? as well that it might be Menlo. OK, Cool. Yeah, next question. In terms of standardization, do you see a need for the medium to large scale effort needed? And then they also elaborate about it. I don't know if it's large scale, but at least it's probably medium scale. There's a lot of things that are missing that we don't have right now in emacs when you're dealing with LLMs. 1 is, a prompting system. And by that, I mean, you know, prompts are just like big blocks of text, but there's also senses that like prompts need to be composable and you need to be able to iterate on parts of the prompt. And so it's also customizable. Users might want to customize it. On the other hand, it's not super easy to write the prompt. So you want really good defaults. So the whole prompt system is kind of complicated. That needs to be kind of standardized, because I don't think there's any tools for doing something like that right now. I personally use my system, my note system for EKG. I don't think that's appropriate for everyone, but it does, I did write it to have some of these capabilities of composability that I think are useful for a prompt generation. It'd be nice to have a system like that, but for general use. I don't, this is something I've been meaning to think about, like how to do it, but like this, you know, if someone's interested in getting this area, like, I would love to chat about that or, you know, I think there's a lot of interesting ideas that we could have to have a system that allows us to make progress here. And also, I think there's more to standardization to be done. 1 thing I'd also like to see that we haven't done yet is a system for standardizing on getting structured output. This is gonna be super useful. I have this for open AIs API, cause they support it. And it's really nice, cause then you can write elist functions that like, okay, I'm going to call the LLM. I'm gonna get structured output. I know what that structure is going to be. It's not going to be just a big block of text. I could turn it into a, you know, a P list or something. And then I could get the values out of that P list. And I know that way I could do, I could write actual apps that are, you know, very, very sort of, you know, useful for very specific purposes and not just for text generation. And I think that's 1 of the most important things we want to do. And I have some ideas about how to do it. I just haven't pursued those yet. But if other people have ideas, I think this would be really interesting to add to the LLM package. So contact me there. So I'm not sure how long we're going to be on stream for, because this is the last talk before the break. If we are on the stream long-term, then great. But if not, folks are welcome to continue writing questions on the pad. And hopefully, Andrew will get to them at some point. Or if Andrew maybe has some extra time available and wants to stay on BigBlueButton here, then folks are also welcome to join here and chat with Andrew directly as well. Okay, awesome. So yeah, the next question is, what are your thoughts on the carbon footprint of LLM usage? I don't have any particular knowledge or opinions about that. It's something I think we should all be educating ourselves more about. It is really, I mean, there's 2 parts of this, right? They take a, there's a huge amount of carbon footprint involved in training these things. Then running them is relatively lightweight. 
So the question is not necessarily like once it's trained, like I don't feel like it's a big deal to keep using it, but like training these things is kind of like the big carbon cost of it. But like right now, the way everything's going, like every, you know, all, you know, the top 5 or 6 tech companies are all training their LLMs, and this is all costing a giant amount of carbon probably. On the other hand these same companies are pretty good about using the least amount of carbon necessary you know they have their own their tricks for doing things very efficiently. responding. Do you think Emacs should provide more async primitives to keep it responsive? Like the URL retrieve is quite bad at building API clients with it. Building API clients with it? people should be using the LLM client. And So right now, 1 thing I should have mentioned at the top is that there are new packages that I recorded this talk that you just saw several months ago. And so like Elama, there's this package Elama that came out that is using the LM package. And so for example, it doesn't need to worry about this sort of thing because it just uses LLM and package and the LLM package worries about this. And while I'm on the subject of things I forgot to mention, I also should just mention very quickly that there is now an open source model, Mistral. And so that's kind of this new thing on the scene that happened after I recorded my talk. And I think it's super important to the community and important that we have the opportunity to use that if we want to. Okay, but to answer the actual question, there has been some talk about the problems with URL retrieve in the URL package in general in EmacsDevEl. It's not great. I would like to have better primitives. And I've asked the author of Please PLZ to kind of provide some necessary callbacks. I think that's a great library. And I'd like to see that kind of like, It's nice that we have options, and that is an option that uses curl on the back end, and that has some benefits. So there's this big debate about whether we should have primitives or just use curl. I'm not exactly sure what the right call is, but there has been discussions about this. is async and apparently very good at tracking the point. to LLM, although I believe it's going to move to LLM itself sometime soon. anyone trained or fine-tuned or prompted a model with their org data yet and applied it to interesting use cases like planning, scheduling, et cetera, and maybe care to comment? I think it is interesting. Like this is what I kind of mentioned at the very end of the talk. There is a lot of stuff there like you could you know if you especially mean an LLM can kind of work as sort of like a secretary kind of person that could help you prioritize Still it's a slightly unclear how what the best way to use it is So I think there's more of a question for the community about like what people have been trying. I see someone has mentioned that they are using it for weekly review. And it's kind of nice to like, maybe you could read your agenda or maybe this for like weekly review. It could like read all the stuff you've done and ask you questions about it. And like, what should happen next? Or like, is this going to cause a problem? Like, I can, I can understand if that could happen? That's like, that's kind of nice. 
And this kind of people have had good success out of using these LLMs to bounce ideas off of are, you know, for, you know, I've seen people say that like they want, they use it for reading and they kind of dialogue with the LM to kind of like do sort of active reading. So you can imagine doing something similar with your tasks where it's sort of you're engaged in dialogue about like planning your tax with some with a alum that could kind of understand what those are and ask you some questions I think it. You know, if it'd be nice. So, the problem is like there's no great way to share all this stuff. I guess if you have something like this, put it on Reddit. If you don't have Reddit, I don't know what to do. I would say put it somewhere. At the very least, I could maybe open up like an LLM discussion session on the LLM package GitHub, But not everyone likes to use GitHub. I don't know. It'd be nice if there's a mailing list or IRC chat for this sort of thing. But there isn't at the moment. of the questions on the pad so far. There was also some discussion or some chatter, I believe, on IRC. I'm not sure. Andrew, are you on IRC right has the chatter. So if there's chatter, then I'm not seeing it. channel. Oh, yes. I mean, I could see the channel, but I missed whatever came before. So if there's anything you want to kind of call out, I can try to answer it here. who are participating in the discussion there who have also joined here on BigBlueButton, Codin Quark and AeonTurn92. So you folks, if Andrew is still available and has time, you're welcome to chat here and ask questions or discuss here as well. and thank you for reading all the questions. great talk and the discussion. there's any questions. If not, I will log off after a few minutes. there was a small chat about local alarms. Because chat dpt is nice, no, but privacy concerns, and it's not free and stuff. Which, so The question is, what is the promise for local models? Misral, which you could run. The LLM package allows you to use, I think there's 3 kind of local things you could use. Like many of these things, there's like many kind of ways to do the same sort of thing. So LLM is supporting OLAMMA and LLAMMA-CPP. And let's see, 1 other. Which 1 is it? And maybe that's it. Maybe the, oh, GPT for all. So each 1 of these kind of has slightly different functionality. For example, I think GPT for all doesn't support embeddings. And I hear that Olama's embeddings are kind of currently broken. But basically they should support everything. And the open source models are, so the local models are reasonably good. Like I don't think you'd use them and be like, what is this horrible nonsense? Like it's, it gives you relatively good results. Like it's not gonna be at the level of like GPT 3.5 or 4, but it's not far away from GPT 3.5, I think. for connecting the actual working servers for Olama? what you could do is you could like for example you could download Olama which is just a way of setting up local models and running local models on your machine. So typically what it does, you like download a program, let's say Olama. Then Olama will have the ability to download models. And so you could choose from just a host of different models. Each 1 of these things has a bunch of different models. So it downloads all these things to your machine. But I would say that the key problem here is that it requires a fairly beefy machine. Why I was asking, because you briefly mentioned that there are some Israeli servers. 
I understand that they run it like a government or stuff like that? No, no, sorry. People want everyone? that sounded like Israeli servers. I know. Although, I'm sure the governments are working on their own LLMs, et cetera. But yeah, basically your choices are spend a, I mean, if you use open AI or something or anything else, you're really not spending any money. Like I've never been able to spend any money on OpenAI. Like unless you're doing something very intensive and really are using it to, you know, if you're using it for your personal use, it's just hard to spend any money. But on the other hand, it's not free. So you can, you know, There's no question about that. The problem is that it has a bad track record on privacy. This is probably the number 1 reason why you might want to use a local AI, a local LLM. Another 1 is like, you may not agree with the decisions. You know, there's a lot of trust and safety stuff that these companies have to do. Like they don't want like the LMs to kind of like give you, like tell you how you can make meth or how you can make a bomb, which they would do. They would totally do it. So, But each time you kind of restrict what is happening with what you can get out of the LM, it gets a little worse. So some people I guess even open source language modules will soon have HR spaces because it's simply a legal issue. probably will be, although I don't know of any offhand, that will are completely uncensored. I know people are interested and are running uncensored models. I don't know how to do it. I think it's a little bit dubious, but some people do want to do it. There's another reason for using local servers. Do you have any recommendation for models to run locally and also comments on whether a GPU is required? Usually a GPU, well, you can run it without a GPU, but it does run much better. Like for example, I think when I used, Lama is sort of like a standard. This was the model for that Facebook came out with for local use. And It was, yeah, it's good. It's, but it's now it's I think, Mistral is kind of like has a better performance, But there's also different model sizes. There's 7B, like the Lama 7B is OK. The Mistral 7B, 7 billion, are like, basically it'll take like, you can run it with like 16 gigs of RAM, is pretty good. It's probably about as equal to the LLAMA13B. Those are the number of parameters, if I remember correctly. And then there's a 7B, which I've never been able to run. And even if the 7B, if you run it without a GPU, it takes quite a while to answer. I think I've had experiences where it took literally like several, like 5 minutes before it even started responding, but you do eventually get something. And it could be that like things have gotten better since the last time I tried this, because things are moving fast. But it is super recommended to have a GPU. This is the problem. It's kind of like, yes, free software is great. But if free software is requiring that you have these kind of beefy servers and have all this hardware, that's not great. I think there's a case to be made. Yeah, yeah, that's right. it would be nice if FSL for all things could run something for open source model. And not free, but the key point is that it's Libre? I'll have to look it up, but I haven't explored this yet. But Google's server, which LLM does support, supports arbitrary models. So you can run LLMA or things like that. The problem is that even if you're running Mistral, which has no restrictions. 
So this is the kind of thing that like the Free Software Foundation cares a lot about. Like you want it to be like no restrictions, legal restrictions on you as you run the model. So even if it's running Mistral, just by using the server, the company server, it will impose some restrictions on you probably, right? There's gonna be some license that you have to, or something you have to abide by. So I think, yes, it depends on how much you care about it, I guess. I should find out more about that and make sure that it's a good point that I should, you know, people should be able to run free models over the server. So I should make sure we support that in the LLM package. So, is there any other questions Or is otherwise we can end the session. Yeah, all right. Thank you. Thank you. Thank you everyone who listened. I'm super happy like I, the interest is great. I think there's great stuff to be done here and I'm kind of super excited what we're going to do in the next year, so hopefully, like next year, and the conference we have something even more exciting to say about LLM and how they can be used with Emacs. So thank you.

Questions or comments? Please e-mail ahyatt@gmail.com