00:00.000 Introduction
00:37.400 Three activities in voice computing
01:02.560 Talk is not about ... and about ...
01:53.520 Motivations
03:33.240 Data
03:58.680 Voice In in the Chrome Store
04:25.628 Works in web pages with text areas
05:16.880 Built-in commands in Voice In Plus
06:41.740 Common errors made by Voice In
08:14.760 Custom speech-to-text commands
09:59.420 Custom speech-to-commands
10:37.540 Introducing Talon Voice
12:28.400 Talon GUI
14:02.540 Talon file with web scope
15:34.015 Terminals on remote and virtual machines
16:52.500 Recommendations
18:17.720 Acknowledgements
Voice computing uses speech recognition software to convert speech into text, commands, or code.
While there is a venerated program called Emacspeak for converting text into speech, an
"EmacsListens" for converting speech into text is not available yet.
The Emacs Wiki describes the underdeveloped situation for speech-to-text in Emacs.
I will explain how two external software packages convert my speech into text and computer
commands that can be used with Emacs.
First, I present some motivations for using voice computing.
These can be divided into two categories: productivity improvement and health-related issues.
In this second category, there is the underappreciated cure for "standing desk envy";
the cure is achievable with a large dose of voice computing while standing.
I found one software package (Voice In) to be quite accurate for speech-to-text or dictation
(Voice In Plus, https://dictanote.co/voicein/plus/), but less versatile for speech-to-commands.
I have used this package daily, and I found a three-fold increase in my daily word count almost
immediately.
Of course, there are limits here; you can talk for only so many hours per day.
Second, I found another software package that has a less accurate language model (Talon Voice,
http://talon.wiki/) but that supports custom commands that can be executed anywhere you can
place the cursor, including in virtual machines and on remote servers.
Talon Voice will appeal to those who like to tinker with configuration files, yet it is easy to
use.
I will explain how I have integrated these two packages into my workflow.
I have developed a library of commands that expand 94 English contractions when spoken.
This library eliminates tedious downstream editing of formal prose where I do not use
contractions.
The library is available on GitHub for both Voice In Plus
(https://github.com/mooersLab/voice-in-plus-contractions) and Talon Voice
(https://github.com/MooersLab/talon-contractions).
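The idea behind the contraction library can be illustrated with a minimal Python sketch. This is not how Voice In Plus or Talon Voice apply the mappings (both use their own custom-command mechanisms); it is a stand-alone post-processing function, and the four-entry table and the function name `expand_contractions` are assumptions chosen for illustration.

```python
import re

# Illustrative subset only; the library described above covers 94 contractions.
# Note that some contractions are ambiguous ("it's" can mean "it has"),
# so a real table needs a deliberate choice for each entry.
CONTRACTIONS = {
    "I'll": "I will",
    "don't": "do not",
    "it's": "it is",
    "can't": "cannot",
}

# Case-insensitive lookup table keyed by the lowercased contraction.
LOOKUP = {key.lower(): value for key, value in CONTRACTIONS.items()}

# One alternation pattern that matches any contraction as a whole word.
PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(key) for key in CONTRACTIONS) + r")\b",
    flags=re.IGNORECASE,
)


def expand_contractions(text: str) -> str:
    """Replace each known contraction with its expansion."""

    def _replace(match: re.Match) -> str:
        found = match.group(0)
        expansion = LOOKUP[found.lower()]
        # Keep a leading capital, e.g. "Don't" becomes "Do not".
        if found[0].isupper():
            expansion = expansion[0].upper() + expansion[1:]
        return expansion

    return PATTERN.sub(_replace, text)


if __name__ == "__main__":
    print(expand_contractions("I'll check; don't worry, it's fine."))
```

The sketch only handles straight apostrophes; dictated text that contains typographic apostrophes would need those added to the table.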
I store my daily writing in a multi-file LaTeX document with one tex file per day.
365 files are compiled into one PDF per year. This is usually about 1000 pages.
I am not going to push my luck with a multiyear document.
Each month is a chapter. The resulting PDF is a breeze to scroll and search.
It has an autogenerated table of contents and an index. I have posted
a blank version for 2023 and another for the upcoming year
(https://github.com/MooersLab/diary2024inLaTeX).
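The layout described above (one file per day, one chapter per month) can be sketched as a short Python generator for the body of the master file. The directory name `days/`, the ISO-date file names, and the use of `\input` are assumptions for illustration; the posted template may be organized differently.

```python
import calendar
from datetime import date


def master_file_lines(year: int) -> list:
    """Return the chapter and input lines for a one-file-per-day diary."""
    lines = []
    for month in range(1, 13):
        # Each month becomes a chapter.
        lines.append(f"\\chapter{{{calendar.month_name[month]}}}")
        days_in_month = calendar.monthrange(year, month)[1]
        for day in range(1, days_in_month + 1):
            # Hypothetical naming scheme: days/2024-01-01.tex, and so on.
            stem = date(year, month, day).isoformat()
            lines.append(f"\\input{{days/{stem}}}")
    return lines


if __name__ == "__main__":
    for line in master_file_lines(2024)[:4]:
        print(line)
```

Pasting the generated lines between `\begin{document}` and `\end{document}` in a book-class document would give the month-per-chapter structure; the table of contents and index still have to be set up in the preamble.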
One could take a similar approach in org-mode by using Bastian Bechtold's
org-journal package (https://github.com/bastibe/org-journal).
I gave a 60-minute talk on this topic to the Oklahoma Data Science Workshop
on 2023 Nov. 16 (https://mediasite.ouhsc.edu/Mediasite/Channel/python).
This workshop meets once a month and is for people interested in data
science and scientific computing. You do not have to be an Oklahoma
resident to attend. Send me e-mail if you want to be added to our mailing list.
About the speaker:
I am an Associate Professor of Biochemistry at the University of
Oklahoma Health Sciences Center. I use X-ray crystallography to study
the structures of RNA, proteins, and protein-drug complexes. I have
been using Python and LaTeX for a dozen years, and Jupyter Notebooks
since 2013. I have been using Emacs every day for 2.5 years. I
discovered voice computing this summer when my chronic repetitive
stress injury flared up while entering data in a spreadsheet. I
tripled my daily word count by using the speech-to-text, and I get a
kick out of running remote computers by speech-to-command.
Q: Could you comment on how speaking vs. typing affects your
logic/content. Thanks!
A: I find that this is like the difference between writing your thoughts
down on a blank piece of printer paper versus paper bound in a
leather notebook. I do not think there is any real difference. I know
that some people believe there is a solid, certain difference. But I am
using this for the purpose of generating the
first draft, because my skill with using my voice to edit my
text is still not very well developed. I am still more efficient using
the keyboard for that stage.
So the hardest part about
writing generally is getting the first crappy draft written. I
have found that dictation is perfectly fine for that phase. I
find it actually very conducive for just getting the text out. The
biggest problem that most of us have is applying our internal editor and
that inhibits us from generating words in a free-flowing
fashion.
I generally do my generative writing--actually, I divide my writing
into two categories: generative writing (generating the first crappy
draft) and then rewriting. Rewriting is probably 80-90% of writing
where you can go back and rework the order of the sentences, order of
paragraphs, the order of words in a sentence and so forth. It is
really hard work that is best done later in the day when I am more
awake. I do my generative writing first thing in the morning when I
feel horrible. That is when my internal editor is not very awake and I
can get more words out, more words past that gatekeeper. I can do this
sitting down. I can do this standing up. I can do this 20 feet away
from my computer looking out the window to give my eyes a break. I find
it is just very enjoyable to use it in this fashion. The downside is
that I wind up generating three times as much text. That makes for
three times as much work when it comes to rewriting the text, and that
means I am using the keyboard a lot later on in the day.
I have not made any progress on recovering from my own repetitive
stress injury. I hope that I will add the use of voice commands,
speech-to-commands, for editing the text in the future and I will
eventually give my hands more of a break.
This allows you to actually separate those two activities not only by
time... So many professional writers will spend several hours in the
morning doing the generative part and then they will spend the rest of
the day rewriting. They have separated these two activities temporally.
What most people actually do is they do the generative part and
then they write one sentence, and they apply that internal editor
right away because they want to write the first draft as a perfect
version, as a final draft, and that is what slows them down
dramatically.
This also allows you to separate these two activities in terms of
modality. You are going to do the generative writing by voice and the
rewriting by keyboard. I think this is one way
that many people can get into using speech-to-text in a productive way.
A: (not the author, just an audience member): So, for example, when
you're talking, you have an immense feeling of the topic you
have. You can close your eyes and do your body gestures to
manipulate a concept or idea, and you have... I just feel you
feel more creative than just tapping. Definitely you have much
more speed advantage over tapping, but more important thing is
you use your body as a whole to interact with those ideas.
[this one is done via voice...]
but typing is definitely good for accurate control, such as
M-x some-command ...
Q: Have you tried the ChatGPT voice chat interface, if so how has
been your experience of it? As someone experienced with voice
control, interested to hear your thoughts, performance relative to
the open source tools in particular.
A: I do not have much experience with that particular software. I have
used Whisper a little bit, and so that is related. Of course, you have
this problem of lag. I find that Whisper is good for spitting out a
sentence, maybe for a docstring in a programming file. I find that it
is very prone to hallucinations. I find myself spending half my
time deleting the hallucinations, and I feel like the net gain is
diminished as a result; there is not much of a net gain in terms of
what I am getting out of it.
Q: Are any of these voice command/dictation tools freemium?
A: [Voice In is freemium.] To be able to add custom commands, you have to pay
$48 a year. The Talon Voice software is free and the only
limitation there is access to the language model. If you want to get
the beta version, you need to subscribe to Patreon to support the
developer. I did that, and I really did not find much of
an improvement. I really do not intend to do that in the future.
But otherwise in Talon Voice, everything is open and free. The Slack
community is incredibly welcoming. Its parallels with
the Emacs Community are pretty striking.
Q: How good is Talon compared to Whisper?
A: With Talon, I find that the first part of the sentence will
be fairly accurate when I am doing dictation, and then towards
the end, the errors start to accumulate. In general, I think its error rate is
about five words out of 100 or so will be wrong. Whisper is
wonderful because it will insert punctuation for you, but I
guess its errors are longer, and it will hallucinate full
sentences for you. So they both have significant error rates.
They are just different kinds of errors. Hopefully, both over
time... [Talon] errors are generally shorter in extent. It does
not hallucinate for long stretches.
Q: are any of those voice command/dictation tools libre? i can not find that information on the web
Mistral 7B is apache 2.0 license i.e. no restrictions
Notes
From the speaker: I really appreciate the high level of accuracy that I am getting from
Voice In. I would use Talon Voice for dictation, but at this point,
there is a significant difference between the level of accuracy of
Voice In versus Talon Voice. It's large enough of a difference that I'll
probably use Voice In for a while until I can figure out how to get
Talon Voice to generate more accurate text.
When you do Org mode and you have the bullets, it allows you to naturally shard your thoughts in a way that is really easy to edit. ... It has a
summarizing capability. It allows you to pull back and get an
overview.
Hi, I'm Blaine Mooers. I'm an associate professor of biochemistry at the University of Oklahoma Health Sciences Center in Oklahoma City. My lab studies the role of RNA structure in RNA editing. We use X-ray crystallography to study the structures of these RNAs. We spend a lot of time in the lab preparing our samples for structural studies, and then we also spend a lot of time at the computer analyzing the resulting data. I was seeking ways of using voice computing to try to enhance my productivity.
I divide voice computing into three activities: speech-to-text or dictation, speech-to-commands, and speech-to-code. I'll be talking about speech-to-text and speech-to-commands today because these are two activities that are probably most broadly applicable to the workflows of people attending this conference.
This talk will not be about Emacspeak. This is a venerated program for converting text to speech. We're talking about the flow of information in the opposite direction, speech-to-text. We need an Emacs Listens. We don't have one, so I had to seek help from outside the Emacs world via Voice In Plus. This runs in the Google Chrome web browser, and it's very good for speech-to-text and very easy to learn how to use. It also has some speech-to-commands. However, Talon Voice is much better with the speech-to-commands, and it's also great at speech-to-code.
The motivations are, obviously, as I mentioned already, for improved productivity. So, if you're a fast typist who types faster than they can speak, then nonetheless you might still benefit from voice computing when you grow tired of using the keyboard. On the other hand, you might be a slow typist who talks faster than they can type. In this case, you're definitely going to benefit from dictation because you'll be able to encode more words in text documents in a given day. If you're a coder, then you may get a kick out of opening programs and websites and coding projects by using your voice. Then there are health-related reasons. You may have impaired use of your hands, eyes, or both due to accident or disease, or you may suffer from a repetitive stress injury. Many of us have this in a mild but chronic form. We can't take a three-month sabbatical from the keyboard without losing our jobs, so these injuries tend to persist. And then you may have learned that it's not good for your health to sit for prolonged periods of time while staring at a computer screen. You can actually dictate to your computer from 20 feet away while looking out the window, thereby giving your lower body a break and your eyes a break.
I'm not God, so I have to bring data. I have two data points here, the number of words that I wrote in June and July this year and in September and October. I adopted the use of voice computing in the middle of August. As you can see, I got an over three-fold increase in my output.
So this is the Chrome store website for Voice In. It's only available for Google Chrome. You just hit the install button to install it. To configure it, you need to select a language. It has support for 40 languages and it supports about a dozen different dialects of English, including Australian.
It works on web pages with text areas. I use it regularly on Overleaf and 750words.com, a distraction-free environment for writing. It also works in webmails. It works in Google. It works in Jupyter Lab, of course, because that runs in the browser. It also works in Jupyter Notebook and Colab Notebook. It should work in Cloudmacs. I've mapped option-L to opening Voice In when the cursor is on a web page that has a text area. So [the presence of a text area is] the main limiting factor.
[Voice In] has a number of built-in commands. You can turn it off by saying "stop dictation". It doesn't distinguish between a command mode and a dictation mode. It has an undo command. You use the command "copy that" to copy a selection. The "press" commands are used in the browser. You [say] "press enter" to issue a command or [submit] text that has been written in a web form, and then "press tab" will open up the next tab in a web browser. The scroll up and down will allow you to navigate a web page. I've put together a quiz about these commands so that you can go through this quiz several times until you get at least 90 percent of the questions correct, in order to boost your recall of the commands. I have a Python script with which you can probably pound through the quiz in less than a minute, once you know the commands. I also provide an Elisp version of this quiz, but it's a little slower to operate.
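A recall quiz of the kind described can be sketched in a few lines of Python. This is not the speaker's script; the three questions are drawn from the commands named in the talk, while the function names and the way the 90 percent threshold is checked are assumptions for illustration.

```python
import random

# (question, expected spoken command) pairs taken from the commands named above.
QUIZ = [
    ("What do you say to turn Voice In off?", "stop dictation"),
    ("What do you say to copy a selection?", "copy that"),
    ("What do you say to submit text in a web form?", "press enter"),
]


def grade(answers, quiz=QUIZ):
    """Return the percentage of answers that match, ignoring case and padding."""
    correct = sum(
        given.strip().lower() == expected
        for given, (_, expected) in zip(answers, quiz)
    )
    return 100.0 * correct / len(quiz)


def passed(score, threshold=90.0):
    """Apply the 'at least 90 percent correct' target mentioned in the talk."""
    return score >= threshold


def run_quiz(quiz=QUIZ, ask=input):
    """Ask every question once in random order and return the score."""
    items = random.sample(quiz, k=len(quiz))
    answers = [ask(question + " ") for question, _ in items]
    return grade(answers, items)


if __name__ == "__main__":
    score = run_quiz()
    print(f"Score: {score:.0f}% ({'pass' if passed(score) else 'try again'})")
```

Passing `ask` as a parameter keeps the quiz testable without a keyboard; the same loop could take its answers from dictated text.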
These are some common errors that I've run into with Voice In. It likes to contract statements like "I will" into "I'll". Contractions are not used in formal writing, and most of my writing is formal writing, so this annoys me. I will show you how I corrected for that problem. It also drops the first word in sentences quite often. This might be some speech issue that I have. It inserts the wrong word because the intended word is not in the dictionary that was used to train it. So, for example, the word PyMOL is the name of a molecular graphics program that we use in our field. It doesn't recognize PyMOL. Instead, it substitutes in the word "primal". Since I don't use "primal" very often, I've mapped the word "primal" to "PyMOL" in some custom commands I'll talk about in a minute. Then there's a problem that the commands that exist might get executed when you speak them when, in fact, you wanted to use the words in those commands during your dictation. So this is a problem, a pitfall of Voice In, in that it doesn't have a command mode that's separate from a dictation mode.
You can set up through a very easy-to-use GUI custom voice commands mapped to what you want inserted, so this is how misinterpreted words can be corrected. You just map the misinterpreted word to the intended word. You can also map the contractions to their expansions. I did this for 94 English contractions, and you can find these on GitHub. You can also insert acronyms and expand those acronyms. I apply the same approach to the first names of colleagues. I say "expand Fred", for example, to get Fred's first and last name with the [correct] spelling of his very long German name. You can also insert other trivia like favorite URLs. You can insert LaTeX snippets. It handles multi-line snippets correctly. You just have to enclose them in double quotes. You can even insert BibTeX cite keys for references that you use frequently. All fields have certain key references for certain methods or topics.
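The mapping from a spoken phrase to inserted text can be modeled as a lookup table. The Python sketch below is a stand-alone illustration of that idea, not Voice In's implementation or file format; the surname, the "insert itemize" phrase, and the function name are hypothetical.

```python
import re

# Spoken phrase (lowercase) -> text to insert. Only "primal" -> "PyMOL" and the
# "expand Fred" pattern come from the talk; the values are placeholders.
CUSTOM_COMMANDS = {
    "primal": "PyMOL",
    "expand fred": "Fred Example-Surname",  # hypothetical full name
    "insert itemize": "\\begin{itemize}\n\\item \n\\end{itemize}",  # multi-line snippet
}


def apply_custom_commands(transcript, commands=CUSTOM_COMMANDS):
    """Replace every spoken phrase in the transcript with its mapped text."""
    # Try longer phrases first so a multi-word phrase wins over a shorter one.
    phrases = sorted(commands, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(phrase) for phrase in phrases) + r")\b",
        flags=re.IGNORECASE,
    )
    # A function replacement inserts the mapped text literally, backslashes included.
    return pattern.sub(lambda match: commands[match.group(0).lower()], transcript)


if __name__ == "__main__":
    print(apply_custom_commands("I opened primal and emailed expand Fred."))
```

The trade-off is the one noted in the talk: once "primal" is mapped, the word itself can no longer be dictated, which is acceptable only because it is rarely needed.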
Then it has a set of commands that you can customize for the purpose of speech-to-commands to get the computer to do something like open up a specific website or save the current writing. In this case, we have "press: command-s" for saving current writing. You can change the language [with "lang:"], and you can change the case of the text [with "case:"].
But the speech-to-command repertoire is quite limited in Voice In, so it's now time to pick up on Talon Voice. This is an open source project. It's free. It is highly configurable via TalonScript, which is a subset of Python. You can use either TalonScript or Python to configure it, but it's easier to code up your configuration in TalonScript. It has a Python interpreter embedded in it, so you don't have to mess around with installing yet another Python interpreter. It runs on all platforms, and it has a dictation mode that's separate from a command mode. You can activate it, and it'll be in a listening state asleep. You just bark out "Talon Wake" to start to wake it up, and "Talon Sleep" to have it go into a listening state. It has a very welcoming community in the Talon Slack channel. Then I need to point out that there's several packages that others have developed that run on top of Talon, but one of particular note is by Pokey Rule. He has on his website some really well-done videos that demonstrate how he uses Cursorless to move the cursor around using voice commands. This, however, runs on VS Code. At least that's the text editor for which he's primarily developing Cursorless.
I followed the [install] protocol outlined by Tara Roys. She has a collection of tutorials on YouTube as well as on GitHub that are quite helpful. I followed her tutorial for installing Talon on macOS without any issues, but allow for half an hour to an hour to go through the process. When you're done, you'll have this Talon icon appear in the toolbar on the Mac. When it has this diagonal line across it, that means it's in the sleep state. So, this leads to cascading pull-down menus. This is it for the GUI. One of your first tasks is to select a language model that will be used to interpret the sounds that you generate as words. And the other key feature is that, under scripting, there's a view log pull-down that opens up a window displaying the log file. Whenever you make a change in a Talon configuration file, that change is implemented immediately. You do not have to restart Talon to get the change to take effect.
This is an example of a Talon file. It has two components. It has a header above the dash that describes the scope of the commands contained below the dash. Each command is separated by a blank line. If a voice command is mapped to multiple actions, these are listed separately on indented lines below the first line. The words that are in square brackets are optional. So, I have mapped the phrase "toggle voice in" to the keyboard shortcut Alt L in order to toggle on or off Voice In. If I toggle Voice In on, I need to immediately toggle off Talon, and this is done through this key command for Control T, which is mapped to speech toggle. Then there's a couple other examples. So, if there's no header present, it's an optional feature of Talon files, then the commands in the file will apply in all situations, in all modes.
Here we have two restrictions. These commands will only work when using the iTerm2 terminal emulator for the Mac, and then only when the title of the window in iTerm2 has this particular address, which is what appears when I've logged into the supercomputer at the University of Oklahoma. One of the commands in this file is checkjobs. It's mapped to an alias, a bash alias called cj for "check jobs", which in turn is mapped to a script called checkjobs.sh that, when it's run, returns a listing of the pending and running jobs on the supercomputer in a format that I find pleasing. This \n after cj, the new line character, enters the command, so I don't have to do that as an additional step. Likewise, here's a similar setup for interacting with an Ubuntu virtual machine.
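The scoping rule described here (a file with no header applies everywhere; a file with a header applies only when the application and window title match) can be modeled in plain Python. This is a conceptual sketch, not Talon's API or file syntax; the class name, the host name `login.example.edu`, and the action strings are assumptions for illustration.

```python
import re
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class CommandFile:
    """A set of voice commands plus the optional scope from the file header."""
    commands: Dict[str, str]
    app: Optional[str] = None            # None means "any application"
    title_pattern: Optional[str] = None  # None means "any window title"

    def matches(self, app: str, title: str) -> bool:
        if self.app is not None and self.app != app:
            return False
        if self.title_pattern is not None and not re.search(self.title_pattern, title):
            return False
        return True


def active_commands(files: List[CommandFile], app: str, title: str) -> Dict[str, str]:
    """Merge the commands of every file whose scope matches the focused window."""
    merged: Dict[str, str] = {}
    for command_file in files:
        if command_file.matches(app, title):
            merged.update(command_file.commands)
    return merged


# No header: active in all situations.
GLOBAL = CommandFile(commands={"toggle voice in": "key alt-l"})

# Two restrictions: the iTerm2 app and a window title containing the login host.
# The trailing "\n" submits the alias without a separate Enter step.
HPC = CommandFile(
    app="iTerm2",
    title_pattern=r"login\.example\.edu",  # hypothetical host name
    commands={"check jobs": "cj\n"},
)

if __name__ == "__main__":
    print(active_commands([GLOBAL, HPC], "iTerm2", "user@login.example.edu: ~"))
    print(active_commands([GLOBAL, HPC], "Emacs", "notes.org"))
```

Keeping the host-specific commands behind a title match means that saying "check jobs" in a local shell, where the alias does not exist, does nothing.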
In terms of picking up voice computing, these are my recommendations. You're going to run into more errors than you may like initially, and so you need some patience in dealing with those. And also, it'll take you a while to get your head wrapped around Talon and how it works. You'll definitely want to use these custom commands to correct the errors or shortcomings of the language models. And you've seen how, by opening up projects by voice commands, you can reduce friction in terms of restarting work on a project. You've seen how Voice In is preferred for more accurate dictation. I think my error rate is about 1 to 2 percent. That is, 1 to 2 out of 100 words are incorrect versus Talon Voice where I think the error rate is closer to 5 percent. I have put together [a library of English] contractions [and their expansion] for Talon [too], and they can be found here on GitHub. And I also have [posted] a quiz of 600 questions about some basic Talon commands.
I'd like to thank the people who've helped me out on the Talon Slack channel and members of the Oklahoma Data Science Workshop where I gave an hour-long talk on this topic several weeks ago. I'd like to thank my friends at the Berlin and Austin Emacs Meetup and at the M-x research Slack channel. And I thank these grant funding agencies for supporting my work. I'll be happy to take any questions.
Q&A transcript (unedited)
The stream is here. So folks, if you would please post your questions on the pad and we'll take them up here. Thank you. Thanks.
little bit, I can provide a live demonstration of the use of this Voice In plugin for Google Chrome. So I have, let's see, say new sentence. I'm on a website that is called 750 words. It provides a text area without any other distracting icons for the purpose of writing, and I'm using it for the purpose of capturing my words that I'm dictating, and I have enabled the Voice In plugin by hitting the option L command. New sentence. So it interpreted that command new sentence even though I didn't pronounce it correctly, which is a pretty good demonstration of its accuracy. New sentence. Oops, that didn't work. Undo. New sentence. So new sentence is a combination of two commands, period and new line. So I've found it more convenient just to say new sentence than having to say period and new line. You can see that it's able to keep up with most of my speech, and it has to interpret the sounds that I'm making and convert those into words, so there's always going to be a lag. New sentence. But I've found that I can generate up to 2,000 words an hour as I gather my thoughts and talk in my rather slow fashion of speaking. New sentence. If you're a really fast speaker, it might have trouble keeping up. New sentence. I like to write, when I'm using the keyboard, with one sentence per line, so that when I copy my text and paste it into Emacs, for example, I can re-sort the sentences very easily by just selecting one line at a time. I like to keep the sentences unwrapped in that fashion because that greatly eases the rewriting phase. And I almost have sort of a hybrid reverse outlining approach by doing that. New sentence. Looks like I have gotten ahead of it a bit and it has not kept up. But generally, it does keep up pretty well.
Let's see. I think we have. Yeah, sorry.
You can see that it has this EN, which means English, and then dash US.
There's actually about 40 languages that it supports, including several variants of German and about a dozen English dialects.
comments and questions trickling in. So someone is saying that there is a text to command application or utility called Clipia, C-L-I-P-I-A, that they think is awesome. And someone else is also saying that Sox, S-O-X, is another good alternative. So thank you very much for the suggestions.
page here in the chat and on the BigBlueButton if you'd like to open that up as well. But I'll continue reading the comments and questions. So the first question, I guess, is that could you comment on how speaking versus typing affects your logic or the content, quote unquote, that you write?
between writing your thoughts down on a blank piece of printer paper versus paper bound in a leather notebook. I don't think there's any real difference. I know that some people believe there is a solid, certain difference. But I'm using this for the purpose of generating the first draft because my skill with using my voice to edit my text is still not very well developed. I'm still more efficient using the keyboard for that stage. So the hardest part about writing generally is getting the first crappy draft written. And so I have found that dictation is perfectly fine for that phase. And I find it actually very conducive for just getting the text out. The biggest problem that most of us have is applying our internal editor. And that inhibits us from generating words in a free-flowing fashion. So I generally do my generative writing. So actually I divide my writing into two categories, generative writing, generating the first crappy draft, and then rewriting. Rewriting is probably 80, 90% of writing where you go back and rework the order of the sentences, order of paragraphs, the order of words in a sentence and so forth. The really hard work.
That's best done later in the day when I'm more awake. I do my generative writing first thing in the morning when I feel horrible. I'm not very alert. That's when my internal editor is not very awake and I can get more words out, more words past that gatekeeper. And so I can do this sitting down, I can do this standing up, I can do this 20 feet away from my computer looking out the window to give my eyes a break. So I find it's actually very enjoyable to use it in this fashion. And the downside is that I wind up generating three times as much text, and that makes for three times as much work when it comes to rewriting the text. And that means I'm using the keyboard a lot later on in the day and I haven't made any progress on recovering from my own repetitive stress injury. I hope that I will add the use of voice commands, speech to commands, for editing the text in the future. And I'll eventually give my hands more of a break.
flow of sort of being able to get your words out while your internal editor is still not inhibiting things. And then later in the day or days, get back to the actual rewriting and editing.
those two activities, not only by time. So many professional writers will spend several hours in the morning doing the generative part and then they'll spend the rest of the day rewriting. So they have separated those two activities temporally. What most people actually do is, you know, they do the generative part and then they write one sentence and they apply that internal editor right away because they want to write the first draft as a perfect version, as the final draft. And that slows them down dramatically. But this also allows you to separate these two activities in terms of modality. You're going to do the generative writing by voice and the rewriting by keyboard. So I think this is one way that many people can get into using speech to text in a productive way.
Let's see. I think we have about 3 or 4 minutes live.
So I think we have time for at least another question. Have you tried the ChatGPT voice chat interface? And if so, how has been your experience of it? As someone experienced with voice control, interested to hear your thoughts, performance relative to the free software tools in particular?
particular software. I have used Whisper a little bit. And so that's related. And of course you have this problem of lag, so I find that Whisper is good for spitting out a sentence, you know, maybe for a docstring in a programming file. But I find that it's very prone to hallucinations. And I find myself spending half my time deleting the hallucinations. I feel like the net gain is diminished as a result. There's not much of a net gain in terms of what I'm getting out of it. Whereas I really appreciate the high level of accuracy that I'm getting from Voice In. I would use Talon Voice for dictation, but at this point, there's a significant difference between the level of accuracy of Voice In versus Talon Voice. It's large enough of a difference that I'll probably use Voice In for a while until I can figure out how to get Talon Voice to generate more accurate text.
another 2 or 3 minutes. So if folks have any other questions, please feel free to post them on the pad and I'll check IRC now as well. Right, so I see one question on IRC asking, are any of these voice command slash dictation tools free/libre software? They cannot find that information. Which I think is part of it. You just mentioned
There's... It's a freemium, so the answer is no. To be able to add the commands, the custom commands, you have to pay $48 a year. The Talon Voice software is free. And the only limitation there is access to the language model. If you want to get the beta version, you need to subscribe to Patreon to help support the developer. And I found, I did do that and I really didn't find much of an improvement. So I really don't intend to do that in the future.
But otherwise, Talon Voice, everything is open and free, and the Slack community is incredibly welcoming. The parallels with the Emacs community are pretty striking.
I think we have about another minute on the live stream, but I believe the BigBlueButton room here is open and will be open, so if folks want to join, if Blaine maybe has a couple of extra minutes. Awesome. Yeah, then you're welcome to join and chat with Blaine and ask any further questions or just do general chatting.
compared to Whisper? So with Talon, I find that the first part of the sentence will be fairly accurate when I'm doing dictation. And then towards the end, the errors start to accumulate. So in general, I think its error rate is about 5 words out of a hundred or so will be wrong. And Whisper is wonderful because it will insert punctuation for you. But I guess its errors are longer and that it'll hallucinate full sentences for you. So they both have significant error rates. They're just different kinds of errors.
Right. Let's see. There's a question. Are the green block the author for this talk? Not sure what that question means.
think being generated from voice to text, speech to text. At the top of the pad, I think that's the question.
this GitHub, on this 750words.com site where I do my generative writing at the start of the day. And it just provides a text area that's free of distractions. And you can see the text that's being recorded as I talk. I haven't been saying the command new sentence, so there isn't any punctuation over our discourse. One thing that I do at the start of the day is I like to write in LaTeX. Ultimately, that's how I store my writing. So new sentence, new sentence. See, insert start day. So this is an example of a chunk of LaTeX code. So I have some reflections on, you know, what did I wake up this morning? And how do I feel? I have reflections on the prior day in terms of what did I get done yesterday? Do I remember what I did yesterday? What happened last night? Focus of today.
What's to be done today? And so on. So I actually, I think I have more down here. Then I've set up these lists so that I can expand them easily. If I say item, then the cursor shows up at the start of an item. And I have it coded so that the new phrase that I speak will start with a capital letter. As you can see, so capitalize the word and. So in spite of its rather limited command syntax, it's enough to get started, and maybe in the future they'll add more features.

you know, doing things like expanding the names of people. So you can set up commands like expand the name of a colleague to go from their first name to their full name with a proper spelling of their last name, which, you know, you can wind up spending a lot of time trying to look that up. And so this Voice In with the custom commands enables you to store hard-to-remember information like that.

How good is Talon compared to Whisper? I think you might have answered that already, at least partially, but...

Whisper will carry out hallucinations, so it will generate long tracts of error, whereas Talon will tend to generate more errors towards the ends of sentences, in my experience. And the errors are generally shorter in extent. It doesn't hallucinate for long tracts.

that we have on the pad. If folks want to join here on BigBlueButton for a few minutes and chat with Blaine, that also works. Let's see, I'm probably going to have to drop in a few minutes to catch the next speaker. But many thanks, Blaine, for a great talk and for the interesting demos and the question and answer.

this conference with people from all around the world connected together through web browsers.

if and when it's working correctly.

times, but when it's working, it's wonderful. Yep.
computers run the same code, so that people, you know, a lot of people work on the same thing and build upon each other's works.

For journaling, I found one good compromise between editing and stream-of-thought journaling, and it just kind of helps me do my thoughts even when I do it, is when you do Org mode and you have the bullets, it kind of allows you to naturally chart your thoughts in a way that's really easy to edit, reorder. I saw you kind of did that with your LaTeX macro where you said item and it would put you down to the next item. How much do you do stuff like that, where you use like Org mode headings and then you reorder them? Because I did that also with the Koutline from the Hyperbole package for Emacs, Org mode later on after the

so I have a lot of snippets for Org mode. I could have an Org mode version of my insert start day snippet and carry things out in Org mode. So I use Org mode from time to time. I often use it for the purpose of writing README files for projects to outline the purpose of the project, say for a directory that contains a coding project. And I think this would, so the main limitation of Voice In is it only works in a web page and you have to have an Internet connection, whereas Talon Voice is perfect for something like Org mode in that you don't need an Internet connection and it will operate anywhere that you can place a cursor. I haven't found a place where it doesn't work. It's amazing. So as you saw in my talk, perhaps, you can run it in a terminal or a remote computer. You can run it in a virtual

it will work. And so as you might imagine, if you use bash aliases, one of the first things I did was map Talon commands to bash aliases so that I can do all kinds of crazy things inside of the terminal. And there are, you know, there's some support already for using Talon in Emacs.
There's some Emacs functionality that's built into Talon. So when you are in Emacs, there's some features that are automatically available. And then others have developed or are developing packages, which I don't think are available yet in ELPA. There's one that does the font locking or syntax highlighting of Talon files, and another that adds some additional functionality that I'm regrettably not yet familiar with.

charting of the thoughts, like let's say, oh, how has my day gone? It went good for reasons 1, 2, 3, and bad for reasons A, B, C. And then later on, I might think, oh, my day also went good for reasons 4, 5, 6, then you jump up. And so, like I found, yeah, the Org mode subheadings, because you're able to jump around, easily reorder them after the fact, the very streamlined approach to the stream of thought and the editing.

just because, like, even when you're editing that in real time, like, oh, wait a minute, I thought of another reason that my day went good, even though I was talking about how it was going bad now. So you jump up. And then you do that. And then you have it. You easily summarize your thoughts and whatnot.

ideal for that kind of interact. So yeah, I see your point in terms of that sort of a blend of generative writing and editing. And it's also kind of parallel to mind mapping. I use this mind mapping software called iThoughtsX where I'll generate all these children items, and then I'll drag them around and re-sort them. And they can have children of their own and grandchildren and so on, in terms of the levels of the nodes. And it's pretty much the same sort of thing with a nested hierarchy that you can have with Org mode. I think having several alternate modes or modalities of playing with thoughts is useful.
So sometimes I'll hit a wall and we're just not really generating anything in a text mode. But if I switch to using the mind mapping, just seeing it arranged with the connecting lines plays on a different part of the brain, I think, and it can be incredibly stimulatory. It can stimulate a lot of new

too much with is the mind mapping software, but...

have to it in Emacs is Org-roam, in terms of like the 3D visualization with the Org-roam GUI, or diagrams and stuff like that. I think those two things would allow you, stuff like Org-roam or Denote, and then the diagrams would be the good ways of doing that in Emacs, but they don't have the mind map programs as well.

There are a couple mind mapping packages, but they're not as advanced.

it that Emacs interacted with. Very well. And so they kind of, you know, worked around and had a little integration with the two. So when you'd be jumping around your... When you'd be clicking on the web page, it would be pointing you to different places and buffers.

Okay.

There's, like, an Org-roam node program where it kind of shows, looks like a mind map. You can click and drag them a little bit, so it's a little interactive.

I'll have to look into that. That sounds very interesting.

though, than Org-roam, so it doesn't. I want to be able to, I don't like the feeling of being trapped inside Org mode documents. Like I want to be able to write, even though I don't really use Markdown and I like Org mode better than that. Like for instance, I also use the Koutline from the Hyperbole package. That's what my, I got a talk on the stream-of-thought journaling with Koutline, and I was like, I just don't like the feeling of being trapped in one document, and Denote has the ability to, it renames the file so you get keywords in like a PDF file, so you can link to that with your notes without it all disappearing because it's not an Org mode document.
Plus the ability of having it run on multiple computers or with multiple people, the database kind of gets screwed up when you try running it under Syncthing. More fragile.

How far are you? So are you a regular practitioner of the Zettelkasten approach?

I partly work too much, like, testing out the Org-roam versus the Denote to use it too much. So part of it is I just tweak with it too much before using it, and then, I know where they are. So whenever I do need them, I can use them, even though I don't always use them.

roam. Zettelkasten. I've actually, it's kind of cool that you can export it and move it into other programs. I have moved it to Obsidian and played with it in Obsidian for a while, maybe added to it in Obsidian, moved it back to Org-roam. But I'm not convinced. I mean, I think that Niklas Luhmann was very successful with it because he spent 5 hours a day or whatever working with it. And I think I would have to put in a similar amount of effort to get the kind of benefits that he gained from it. I'm waiting for somebody to do a scientific study, controlled trials, to prove whether there's a real benefit.

one of the things where you have the 1 for the sections, and then the 1.1, or you know how the notes that it does that's different. Denote, it has the ability to use a hierarchy manage, which Org-roam does everything it can to eliminate. But you can use them both in tandem. They call it signatures. And to me, one of the cool features of Denote would be being able to use, like, the signatures for the things that make sense. Like one of the ideas is if you don't exactly know where this is, but you know it goes to the section, you can just use the signature. Maybe don't even have too much of a file name.
Like, oh, this is just another thought on, well, you wouldn't use it for this, but like, my day went good for reasons 1, 2, 3, 4, 5, and you could just use the Denote signature to do 1, 2, 3, 4, 5, just as you have new ideas on, like, a subject, or like, this car is nice because of reasons X, Y, Z, or these types of four-wheelers are nice because of X, Y, Z. And you could just keep on doing that rather than having to get a new name for each one of those files. Or you could choose not to have it, but the ability to have it optionally, to me, sounds like a really nice combo. Because then you

I've actually imposed a hierarchy in my Zettelkasten and Org-roam. I just, I can't imagine having random ideas. They need some kind of structure. Always have some kind of parent node to attach them to.

it, part of it is I'm just trying to optimize the workflow before it feels really, really, really good, and I don't want to tweak with it, or I don't know. Or maybe I don't always need the tool, but some of the distinctions it seems like that I want is, I want a daily journal for your stream of thoughts, then I want a separate one for your to-do list, because what you like. You want very different properties for each of those. Like for to-do lists, you want hierarchical, limited. But if you have more than 3 priority items, you don't have a priority item and it's not a good to-do list. It's just unordered thoughts.

most of those things done beyond the first 3.

trying to do the other stuff, the stream of thoughts, all that stuff I probably don't want to go straight into, like, my Zettelkasten because some of those problems, like it's noisy, it might be redundant, you don't know how it fits into it because you haven't done that processing on it. This hasn't been refined. So, like, you don't want to refine it. Like, I find that spell checking is detrimental to me. I don't want spell checking.
I don't want spell checking. I don't want syntax highlighting. I just want to talk or to just write. If I have mistakes, I can turn that on later, do it. Because otherwise, it will distract me and makes that process flow.

you're doing the getting things done, like, that's why I would want them in separate files, is that you want them like ordered, numbered lists, smaller. And then with the other, with the stream of thought, with journaling, you'd want it just unordered. Thoughts land wherever they may. Maybe not even like machine-generated timestamps, so you don't even have to worry about the names of it, as an example. So yeah, very different properties for what you want for both of those modalities.

had that at, you know, working on my to-do list at the start of the day, but in a certain sense that is not the ideal time. I really haven't optimized the timing of assembly of the to-do list, I think, in retrospect. It's just by lifelong habit. I do that at the beginning of the day, but it probably would be better to do it at night or the night before. And so you sort of prime your brain to just get up and go, go after those items. Maybe you want to revise the items a little bit after sleeping on it, after your subconscious has worked on those items. Do you have a daily routine that you follow in terms of generating those kinds of lists?

for this stuff when I want to do it. I enjoy building the scaffolding and I know where the tools are when I need it. And I start using them when I need it, but I don't have it too consistent.

Org-roam, and you're using Koutline. And are there other tools that you've explored?

and nerd-dictation to do what your talk was about, speech to text, to see how that changes. Because it does change what you think, what you write down, when you speak it rather than write it.
Same thing as when you're thinking about when you eliminate the editing, it changes the way you write. When you have the spell checking, it changes the way you write to a much smaller degree. But that's the stuff I really haven't gotten working as well, or underdeveloped.

I'll move it in. Often I move it onto Overleaf, this website for LaTeX documents. I have a plug-in for Writefull, and I use that to clean up my word choices and some grammar. And I use Grammarly. I'll copy and paste. It just depends on the nature of the writing, how serious it is, how polished it has to be. If it's really vital, like for a grant application or something, I'll paste that into Grammarly and work on trying to get the writing level to the lowest possible grade level to make it as clear as possible to as wide of an audience as possible.

One of the things I kind of wish you could say is, hey, what would the Zettelkasten person think of what I wrote? What would Einstein think of what I wrote? Because rather than just trying to make one uniform way of talking, it's like, people talk differently, and that's an advantage. And I really wish, like, maybe these GPT programs could do well. I really wish it could help you with the grammar, that maybe give you thoughts on what your notes are. What does this person think of your thoughts?

even through ChatGPT now. I haven't spent time trying that out. But I bet those capabilities are already.

It would be nice if it was, like, built into Emacs, right? It's a package.

Yeah. That'd be very cool.

like, the grammar where they help you the way you write.
Like, for instance, removing redundant words. And yeah, it's supposed to be like beyond just spell checking, right?

package for Emacs, and you get some of the functionality out of it. I've paid for the subscription to get the advanced features, but maybe I don't have my configuration set up correctly. I just found it was easier to copy and paste a paragraph at a time into the desktop application, and it will go through and find those redundancies, junk English.

One of these, that was my problem with a lot of the Grammarly-type programs is, I want something that would do that, like, it'd be real interesting seeing one that's like an Old English type thing, or like a Luhmann person, where it's just like, how does this person write? Because it would spit out something a lot different. Just different. Like, yeah, you put different people.

completely different thinking and writing style. And so the purpose of doing that would be to stimulate a new way of thinking or writing, I guess, on your part

and writing, you know, one of the targets for that could be yourself, so it's like, I'd much rather have a comprehensible sentence than a truly correct one. One of those is far more valuable, and far more correct English, or

to yourself. Yes.

one's the other, you're trying to be used by the tool. And they're not the same thing.

responsible for my writing and being the final judge of it, and as a scientist, my mantra is it's got to be clear and then precise and then concise, in that order. And I claim that, you know, that's the order with which I go through doing revisions. Clarity is, you know, if it's not clear, it's useless. It's got to be clear to me, but it's got to be clear to a lot of people for whom English is not a first language. And then after that, I've got to worry about precision and then conciseness, but those can't be done at the expense of clarity.
So it's quite a battle.

where it's like if you have more than 3 items, like, the purpose of a to-do list is to help you choose what you're going to do for the day. Which is why if you have more than 3 items, if you have 50 items on there, you're not going to get 50 of those items done. So maybe you pick the easiest ones to do, not necessarily the ones that you want or need to be done. So it's like the process of choosing those, like, I don't know, I found that a very good rule, like up to 3 priority items, and then also when you look back and you see that you did those 3 items, who cares about this? I'd rather get those 3 items done than any number of secondary tasks.

very right about that. I used to, you know, use a pattern of assigning letters. And so you have, like, you know, based on like a hierarchy of, you've got the urgent and important, of course, that you've got to deal with those. And then the next thing down is the important and so on. But I tend to just generate these terribly long lists where most of those items would go on what is known as a grass catcher list of things that you may get to someday, but there's no way you can get to them today. But I feel compelled, I need to capture them. I may want to do them eventually. They wind up on my list.

Zettelkasten, where you have the day thoughts and the day journal, then you have your Zettelkasten, which I don't think should have too close of a connection, because one's a lot more, what's the word?

Yeah, that's the word. Yeah, one's actually much more processed. The other is you don't want that process because you want it to flow from your head with as little friction as possible. The other one you want to be processed so that when you look it up and stuff like that, it's more efficient. Same thing with your to-do things.
So like, oh, yeah, I guess there's one more category. I found my favorite way, rather than like priority 1, 2, 3, is primary tasks, which basically generally goes up to 3, secondary tasks, and then I like to have a third category, unplanned tasks, and I just have those written down in a heading in an Org mode file, and then I put the tasks in there, rather than using the agenda, like, too much. I don't know, I just found that that was my favorite way of doing it. And then you have, like, another file that would just be your dump of anything you want to do, and that would be what you could pull from to get your day. Or I guess something that's actually better than a day is doing it all by a week at a time. I found that that's actually a lot nicer, because thinking about what you do in a week seems like a nicer unit, where you have a week, then you have your day, and then you have the 3 categories of priority, secondary, and unplanned. At least that's been my favorite iteration on

planning on a weekly basis, and he would just get his weekly list of things to get done, and he was very good at pounding through that list and getting them done. I have been too much of a day-oriented person and a week-oriented person to adapt his approach, but I've been considering that too. I think what I don't do enough of is pulling back to the month level, semester level, year level, 5-year level, 10-year level. And...

is like, you can have, so you'd have your week, and then maybe you have, like, one section after Friday or the last day of the week, and this is just your, like, staging. So this is where you stage all the tasks, and then you can just stay in your staging, write them all down, and then use Alt and your arrow keys to quickly reorder all of them in the week. And then when you're looking at one day and you're just looking at ordering everything, well, it makes a lot of sense when you just say, I don't really want to do that. Like, I want this done this week. I don't necessarily want it done on this day. So it just,
that's why I found that the week approach works a lot nicer even.

in your week to do the staging.

like, these are the things I would like to get done. And then when you schedule it, then you kind of schedule it by just using the Alt-Left key, the Alt-Arrow keys to just, oh, I want this done. It looks like this would work really good on this day. This one looks like it would work on this day. I found that it works at least better without it. Yeah, that's fine. Because that way I also get a log of everything I've done, which I can't find a way that, it seems easier to just make new files for it. And rather than, like, you could use it with Org Agenda, but one of the things that you want with it is to look back at it, reflect. And so, if you open up the file with 2 levels or 3 levels of headings to where you just see the priority task, you can get a very nice overview of saying, I did my priority task this day. So you get the numbers next to the things. And so you can easily just say, I've done this. I mean, it would be nice if I could figure out a way of doing agenda to give me percentages. But I haven't figured that out. Seeing the granular level, I can easily scan that with my eyes. So I just did it by hand rather than the agenda.

times and pretty seriously, but I keep bouncing off it. I think I get too many things built in or scheduled and I just don't get to them. I feel bad about it and I wind up abandoning it. So that's one area where there's probably some potential for optimizing and making that work better. There's a lot of customizing you can do with Agenda. It's amazing.

I wanted there to be a separation between the daily to-do lists and, like, your grab bag, which I think agenda works a lot better for a grab bag.
I want a nice way of looking back at my daily to-do logs. So I kind of want them to be separated, so I just did them separate. With the agenda, I could never figure out exactly how I want that to work, how the files would look, and how all the Emacs settings would interact with it. I mean, I'm sure I could, but that's why I opted for weekly files. Or at least that's my most refined idea on the process.

is a little different, in that I'm generating this text on a daily basis and popping it into one document file per day, and, like, a diary on Overleaf as a big, so it winds

chapter and it's compiled quickly enough even though it's often up to 1,000 pages long by the end of the year. And I have all these, of course, with the PDF, I can search through it. So you can't do the kind of really sophisticated searching that you can do with Org mode. But just doing that, it sure has been very helpful in digging up information, like the little protocols on how I attack, accomplish a certain task that I have to do a year later, or to have a record of what I did on a certain day, and then somebody above me might be trying to hold me to account for what got done. I can look that up very quickly. It's documented. I find that to be, just any kind of thorough documentation system is very

rather than by a week file. I ran into trouble with, like, once you get a lot of items, like if you have 1,000 items, headings, I've had Org files with 1,000 headings. It can be so hard to scroll through.
Maybe it's some limitations I've run into with Emacs being single-threaded. It was like, that's one of the things, is like, how exactly do you want the information structured, because it can change how it's retrieved.

logs, and I put it all in the date and then the priority, secondary, unplanned tasks, and then I had it stay at that, get auto-expanded to that level by default so I didn't see the individual task, and then it would say, like, I completed 2 of 5 or something like that of secondary tasks. And then just being able to quickly scan all the days and say, oh, the feedback you get from that is worth a lot. And it's not something I could think of how you'd do in agenda. Even though I got it done in the text files, just because that doesn't expand all the way, so you can quickly just see, on this day I did this well, on this day I did this well, all within 4 lines per day. So that's not very visually verbose. Probably about as visually verbose as you want it. They're not super long. You easily see the 2 of 3 and stuff like that that you get done, so you can quickly say, oh well, these are the days where I got my primary tasks done, or this week, and this day I didn't do it well, and it helps you correlate, like, your feelings with your to-do lists and journals and whatnot.

Because it's summarizing capability. It allows you to, you know, pull back and get an overview.

from that, almost, when I did that, it feels like half the reason, or should be like half the reason, and it's something that, if you use the agenda as it is, I don't know how you would get it, like looking at the week-by-week basis, breakdowns. You might be able to get, like, percentages, which would be nice. Like, I did this well, or like habit, I don't, there might be things that could offer you, but.
Yeah,

on various kinds of projects, or various kinds of activities, and to get some feedback in that regard. And then you, but you got the, so I define a project as anything that requires work at different points in time, more than one

that I made that demonstrates that. I don't know if you, do you have your email in your talk notes or anything?

slide. There should be my email address. I can add it to my talk notes.

There's a share screen button, right? Can you not share the screen on this? Let's see. I see some stuff on here. Wonder if I'm still active. It shows share screen. Cancel.

I can put my email address in the chat.

but let's see. Yeah, I think the way that they did it on any of the other videos, if they shared the screen, they just took over the webcam with OBS and shared what they wanted with it. Yeah, I'll give that to you. Okay. I guess I'll let you go watch the rest of the Emacs videos.

Thank you very much. I appreciate your willingness to share your thoughts on this matter. This is vital, time management. It's a kind of key aspect of life.

Reasons to use Emacs, to use the keyboard, is it's not to speed you up. Like, yeah, that's nice. But it keeps you in the stream, keeps you in the flow state, which then just makes you think better, and the thing with that is I have no idea what the limits of that would be. Because, yes, it's not about speeding up how many words you say a minute. I mean, that's nice and all. But when you start doing that, when you start removing all these friction points, all of a sudden the number, quality, and types of thoughts you get start

Enjoy the rest of the meeting.