Track: Development

What I learned by writing test cases for GNU Hyperbole

Mats Lidell (he, him, his) - IRC: matsl, @matsl@mastodon.acc.sunet.se, matsl@gnu.org

Format: 27-min talk ; Q&A: BigBlueButton conference room
Status: Q&A to be extracted from the room recordings

Talk

00:03.120 Introduction
03:11.160 ERT: Emacs Lisp Regression Testing
04:14.360 Assertions with should
04:56.920 Running a test case
06:54.560 Debug a test
07:50.380 Commercial break: Hyperbole
09:10.480 Instrument function on the fly
10:39.120 Mocking
14:41.240 cl-letf
15:24.100 Hooks
15:55.720 Side effects and initial buffer state
17:05.100 with-temp-buffer
17:16.520 make-temp-file
17:33.288 buffer-string
18:09.920 buffer-name
18:51.980 major-mode
19:02.680 unwind-protect
20:15.100 Input, with-simulated-input
21:38.460 Running all tests
23:03.220 Batch mode
24:05.060 Skipping tests
26:08.460 Conclusion

Duration: 26:55 minutes

Q&A

Listen to just the audio:

Duration: 26:22 minutes

Description

I'm maintaining GNU Hyperbole. I volunteered for that at a time when FSF was asking for one since it was unmaintained. I did not have much elisp experience but a passion for the package. Not much happened.

To my great delight a few years ago the author of Hyperbole Bob Weiner joined the band and we started together to actively develop Hyperbole again.

One of my focus areas in that work has been to add test cases. We have now gone from no tests to over 300 ert tests for the package. This talk is about my test case journey. What I have learned by doing that.

Discussion

Questions and answers

Q: How many tests do you have for Hyperbole and how would you rate the test coverage compared to other packages?
- A:
  - With all tests, including the interactive ones, we have 354 tests. Having said that, I must point out that the size of the tests can be very different. I tend to split tests so they are logically (in some sense) different, so that if a test fails it will more likely point you to what the error is. This results in more tests. Codewise you could collect similar tests into one ert-deftest, making the name of the test point out some group or collection of functions, but I don't do that!
  - I have not studied other packages so I don't know how our test coverage compares to other packages. In fact I don't know what code coverage we have. That is another thing to look into.
Q: One small suggestion: to me 'should' means optional, whereas 'shall' or 'must' means required. Not sure if it is too late to make a major grammar change like that. Very nice presentation. (I see
- A: The assertions come from the ert package so any changes would have to be suggested to that. I guess you could make your own version of the assertions using aliases for should et al.
Q: FYI, you may find this helpful for running Emacs tests/lints, both from a command line and from within Emacs with a Transient menu: https://github.com/alphapapa/makem.sh It also works on remote CI.
- A: Thanks for the suggestion. I did have a look at makem.sh, but that was a long time ago, so I don't remember why we did not try to apply it. I might give it another look now that I have used plain ert more.
Q: Is it easy to run ad hoc tests inside of an Emacs session, given the command line scripts you need to run to get a batch test session running? In other words, can you tweak tests in an Emacs session and run them right away?
- A:
  - Yes, in principle you just load your tests and run them all using ert and give it the test selector t. That runs all loaded tests.
  - If you want to modify a test you can do that. You change it, evaluate it, and run it again. Just as you change any function.
Q: Did you have to change Hyperbole code and design to be more readily testable as you were increasing your test coverage?
- A:
  - Yes, we have done that to a small extent but we should do more of that. Some Hyperbole functions are large and therefore complicated to test. Splitting them into smaller logical parts can make testing easier.
  - Also, moving code into pure functions and avoiding side effects is a good thing. Pure functions are easier to test. Maybe having the side effects separated out into fewer places would help. This has not been applied but is something I have been thinking about. By side effects I here mean things like adding or modifying text in buffers.
Q: What's the craziest bug you found when writing these tests?
- A: This is not a bug, but I always assumed giving a prefix argument to a cursor movement would give the same result as hitting the key the same number of times. So C-u 2 C-f would be the same as hitting the C-f key twice. It is not! When moving over a hidden area, the three dots '...' at the end of a folded line in org-mode or outline-mode, you get different behavior. Trying to write a test case for kotl-mode and its folding behavior taught me that.
Q: Why do you prefer el-mock to mocking using cl-letf? (Question asked in BBB)
- A:
  - With cl-letf you need to keep track of whether the mocked functionality is being called or not. The el-mock package does that for you, which is what you normally want. Doing this with cl-letf means the definition becomes longer and more complicated. It sort of blurs the picture. el-mock is more to the point.
  - BUT since cl-letf does allow you to define a "new" function, it is more powerful and it can be the only option in cases where el-mock is too limited. So it is good to know of this possibility with cl-letf when el-mock does not provide what you need.

Transcript

[00:00:03.120] Introduction

Hi everyone! I'm Mats Lidell. I'm going to talk about my journey writing test cases for GNU Hyperbole and what I learned on the way. So, why write tests for GNU Hyperbole? There is some background. I'm the co-maintainer of GNU Hyperbole together with Bob Weiner. Bob is the author of the package. The package is available through the Emacs package manager and GNU ELPA if you want to try it out. The package has some age. I think it dates back to a first release around 1993, which is also when I got in contact with the package the first time. I was a user of the package for many years. Later, I became the maintainer of the package for the FSF. That was even though I did not have much knowledge of Emacs Lisp, and I still have a lot to learn. A few years ago, we started to work actively on the package, with setting up goals and having meetings. So my starting point is that I had experience with test automation from development in C++, Java and Python using different xUnit frameworks like CppUnit and JUnit. That was in my daytime work where the technique of using pull requests with changes backed up by tests was the daily routine. It was really a requirement for a change to go in to have supporting test cases. I believe that is a quite common setup and requirement these days. I also had been an Emacs user for many years, but with focus on being a user. So as I mentioned, I have limited Emacs Lisp knowledge. When we decided to start to work actively on Hyperbole again, it was natural for me to look into raising the quality by adding unit tests. This also goes hand in hand with running these regularly as part of a build process. All in all, following the current best practice of software development. But since Hyperbole had no tests at all, it would not be enough just to add tests for new or changed functionality. We wanted to add it even broader; ideally, everywhere. So work started with adding tests here and there based on our gut feeling where it would be most useful.
This work is still ongoing. So this is where my journey starts with much functionality to test, no knowledge of what testing frameworks existed, and not really knowing a lot about Emacs Lisp at all.

[00:03:11.160] ERT: Emacs Lisp Regression Testing

Luckily there is a package for writing tests in Emacs. It is called ERT: Emacs Lisp Regression Testing. It contains both support for defining tests and running them. Defining a test is done with the macro ert-deftest. In its simplest form, a test has a name, a doc string, and a body. The doc string is where you typically can give a detailed description of the test and has space for more info than what can be given in the test name. The body is where all the interesting things happen. It is here you prepare the test, run it and verify the outcome. Schematically, it looks like this. You have the ert-deftest, you have the test name, and the doc string, and then the body. It is in the body where everything interesting happens. The test is prepared, the function under test is executed, and the outcome of the test is evaluated. Did the test succeed or not?
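The schematic described above can be sketched roughly as follows. The test name and the choice of `sort` as the function under test are illustrative assumptions, not taken from the slides.

```elisp
(require 'ert)

;; Hypothetical test: name, doc string, then the body.
(ert-deftest demo-sort-ascending ()
  "Sorting a list of numbers with `<' yields ascending order.
The doc string has room for details that do not fit in the test name."
  ;; Prepare: build the input.
  (let ((input (list 3 1 2)))
    ;; Execute the function under test and verify the outcome.
    (should (equal (sort input #'<) '(1 2 3)))))
```

The three phases (prepare, execute, verify) often blur together in small tests like this one; in longer tests it can help to keep them visibly separate.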

[00:04:14.360] Assertions with should

The verification of a test is performed with one or more so-called assertions. In ERT, they are implemented with the macro should together with a set of related macros. should takes a form as argument, and if the form evaluates to nil, the test has failed. So let's look at an example. This simple test verifies that the function + can add the numbers 2 and 3 and get the result 5.
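A test along the lines of the one described here might look like the sketch below. The test names are assumptions, and the second test is an added illustration of two of the related macros, `should-not` and `should-error`, which are part of ERT.

```elisp
(require 'ert)

;; Hypothetical name; the talk only refers to it as "example 0".
(ert-deftest example-0 ()
  "The function `+' adds 2 and 3 and gets 5."
  (should (= (+ 2 3) 5)))

;; Illustration of related assertion macros from ERT.
(ert-deftest demo-related-assertions ()
  "`should-not' expects nil, `should-error' expects a signaled error."
  (should-not (= (+ 2 3) 6))
  (should-error (/ 1 0) :type 'arith-error))
```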

[00:04:56.920] Running a test case

So now we have defined a test case. How do we run it? The ERT package has the function (or rather convenience alias) ert. It takes a test selector. The test name works as a selector for running just one test. So here we have the example. Let's evaluate it. We define it and then we run it using ERT. As you see, we get prompted for a test selector but we only have one test case defined at the moment. It's the example 0. So let's hit RET. As you see here, we get some output describing what we have just done. There is one test case: it has passed, zero failed, zero skipped, total 1 of 1 test case, and some time stamps for the execution. We also see this green mark here indicating one test case and that it was successful. For inspecting the test, we can hit the letter l, which shows all the should forms that were executed during this test case. So here we see that we have one should executed, and we see the form with the equals comparison, and that it was 5 equals 5. So a good example of a successful test case.
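Besides the interactive prompt, a single test can also be run from Lisp. The following is a minimal sketch using functions that ERT provides (`ert-get-test`, `ert-run-test`, `ert-test-passed-p`); the test name and the helper `demo-test-passed-p` are hypothetical.

```elisp
(require 'ert)

(ert-deftest demo-addition ()
  "The function `+' adds 2 and 3 and gets 5."
  (should (= (+ 2 3) 5)))

;; Interactive use: M-x ert RET demo-addition RET, or evaluate
;;   (ert 'demo-addition)
;; which opens the results buffer described in the talk.

;; Programmatic use: run one test and inspect the result object.
(defun demo-test-passed-p (test-name)
  "Run the ERT test named TEST-NAME and return non-nil if it passed."
  (ert-test-passed-p (ert-run-test (ert-get-test test-name))))

(demo-test-passed-p 'demo-addition)
```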

[00:06:54.560] Debug a test

So now we've seen how we can run a test case. Can we debug it? Yes. For debugging a test case, the ert-deftest can be set up using edebug-defun, just as a function or macro is set up or instrumented for debugging. So let's try that. So we try edebug-defun here. Now it's instrumented for debugging. And we run it, ert, and we're inside the debugger, and we can inspect here what's happening. Step through it and yes it succeeded just as before.

[00:07:50.380] Commercial break: Hyperbole

It's time for a commercial break! Hyperbole itself can help with running tests and also help with running them in debug mode. That is because Hyperbole identifies the ert-deftest as an implicit button. An implicit button is basically a string or pattern that Hyperbole has assigned some meaning to. For the string ert-deftest, it is to run the test case. You activate the button with the Action Key. The standard binding is the middle mouse button, or from the keyboard, M-RET. So let's try that. We move the cursor here and then we type M-RET. And boom, the test case was executed. And to run it in debug mode we type C-u M-RET to get the Assist Key, and then we're in the debugger. So that's pretty useful and convenient.

[00:09:10.480] Instrument function on the fly

A related useful feature here is the step-in functionality bound to the letter i in debug mode. It allows you to step into a function and continue debugging from there. For the cases where your test does not do what you want, looking at what happens in the function under test can be really useful. Let's try that with another example. So here we have two helper functions: one, f1-add, that uses the built-in + function, and then we have my-add that uses that function. So we're going to test my-add. And then let's run this. Let's run this using Hyperbole in debug mode, C-u M-RET. We're in the debugger again, and let's step forward to my function under test and then press i for getting it instrumented and going into it for debugging. And here we can inspect that it's getting the arguments 1 and 3, and it returns the result 4 as expected. And yes, of course, our test case will then succeed.
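The two helpers are not shown in the transcript. A plausible reconstruction, assuming two-argument functions and a hypothetical test name, could be:

```elisp
(require 'ert)

;; Assumed definitions of the helpers named in the talk.
(defun f1-add (a b)
  "Return the sum of A and B using the built-in `+'."
  (+ a b))

(defun my-add (a b)
  "Return the sum of A and B by delegating to `f1-add'."
  (f1-add a b))

;; Hypothetical test for the function under test, `my-add'.
(ert-deftest demo-my-add ()
  "`my-add' of 1 and 3 is 4."
  (should (= (my-add 1 3) 4)))
```

With the test instrumented through `edebug-defun`, pressing i while point is before the call to `my-add` instruments that function too and continues stepping inside it.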

[00:10:39.120] Mocking

The next tool in our toolbox is mocking. Mocking is needed when we want to simulate the response from a function used by the function under test, that is, a function used in its implementation. This could be for various reasons. One example could be because it would be hard or impossible in the test setup to get the behavior you want to test for, like an external error case. But the mock can also be used to verify that the function is called with a specific argument. We can view it as a way to isolate the function under test from its dependencies. So in order to test the function in isolation, we need to cut out any dependencies to external behavior. Most obvious would be dependencies to external resources, such as web pages. As an example: Hyperbole contains functionality to link you to social media resources and other resources on the net. Testing that would require the test system to call out to the social media resources and would depend on it being available, etc. Nothing technically stops a test case from depending on the external resources, but it would, if nothing else, be flaky or slow. It could be part of an end-to-end suite where we want to test that it works all the way. In this case, we want to look at the isolated case that can be run with no dependency on external resources. What you want to do is to replace the function with a mock that behaves as the real function would do. The package I have found and have used for mocking is el-mock. The workhorse in this package is the with-mock macro. It looks like this: with-mock followed by a body. In the execution of the body, stubs and mocks defined in the body are respected. Let's look at some examples to make that clearer. In this case, we have the macro with-mock. It works so that the expression stub + => 10 is interpreted so that the function + will be replaced with the stub. The stub will return 10 regardless of how it is called.
Note that the stub function does not have to be called at this level but could be called at any level in the call chain. By knowing how the function under test is implemented and how the implementation works, you can find function calls you want to mock to force certain behavior that you want to test, or to avoid calls to external resources, slow calls, etc. Simply isolate the function under test and simulate its environment. Mock is a little bit more sophisticated and depends on the arguments that the mock function is called with. Or more precisely, it is checked after the with-mock clause that the arguments match the arguments it was called with or even if it was called at all. If it is called with other arguments there will be an error, and if it's not called, it is also an error. So this way, we are sure that the function we expected to be called actually was called. An important piece of the testing. So we are sure that the mock we have provided actually is triggered by the test case. So here we have an example of with-mock where the f1-add function is mocked, so that if it's called with 2 and 3 as arguments, it will return 10. Then we have a test case where we try the my-add function, as you might remember, and call that with 2 and 3 and see that it should also then return 10 because it's using f1-add.
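A minimal sketch of the stub and mock forms, assuming the third-party el-mock package is installed and using assumed definitions of f1-add and my-add. The stub here targets f1-add rather than + as in the talk, which is a deliberate simplification of this sketch; the test names are hypothetical.

```elisp
(require 'ert)
(require 'el-mock)  ; third-party package, must be installed separately

;; Assumed definitions of the helpers named in the talk.
(defun f1-add (a b) (+ a b))
(defun my-add (a b) (f1-add a b))

(ert-deftest demo-stub-f1-add ()
  "With a stub, `f1-add' returns 10 however it is called."
  (with-mock
    (stub f1-add => 10)
    ;; The stub is hit one level down the call chain, inside `my-add'.
    (should (= (my-add 1 1) 10))))

(ert-deftest demo-mock-f1-add ()
  "With a mock, `f1-add' must be called, and with exactly 2 and 3."
  (with-mock
    (mock (f1-add 2 3) => 10)
    (should (= (my-add 2 3) 10))))
```

If the body of the second test called `(my-add 1 1)` instead, or never reached f1-add at all, el-mock would signal an error and the test would fail.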

[00:14:41.240] cl-letf

Moving over to cl-letf. On rare occasions, the limitations of el-mock mean you would want to implement a full-fledged function to be used under test. Then the macro cl-letf can be useful. However, you need to handle the case yourself if the function was not called. Looking through the test cases where I have used cl-letf, I think most can be implemented using plain mocking. The remaining cases are where the args to the mock might be different due to environment issues. In that case, a static mock will not work.
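As a sketch of the cl-letf approach, the test below temporarily replaces the function cell of f1-add with a full replacement function and tracks by hand whether it was called. The helper definitions and the test name are assumptions.

```elisp
(require 'ert)
(require 'cl-lib)

;; Assumed definitions of the helpers named in the talk.
(defun f1-add (a b) (+ a b))
(defun my-add (a b) (f1-add a b))

(ert-deftest demo-cl-letf-f1-add ()
  "Replace `f1-add' with a full function and check that it was called."
  (let ((called nil))
    (cl-letf (((symbol-function 'f1-add)
               (lambda (a b)
                 ;; Bookkeeping that el-mock would otherwise do for us.
                 (setq called t)
                 (* 10 (+ a b)))))
      (should (= (my-add 2 3) 50)))
    ;; Without this check the test could pass without ever using the fake.
    (should called)))
```

The replacement can compute its result from the arguments, which is what makes this more flexible than a static mock. The original definition is restored when cl-letf exits.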

[00:15:24.100] Hooks

Another trick applies to functions that use hooks. You can overload or replace the hooks to do the testing. So you can use the hook function just to do the verification and not do anything useful in the hook. Also, here you need to be careful to make sure the test handler is called and nothing else.
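One way to read this is sketched below: the hook variable is let-bound to a single verification function, so only the test handler runs. The hook, the function under test, and the test name are all hypothetical.

```elisp
(require 'ert)

;; Hypothetical hook and function under test.
(defvar demo-after-insert-hook nil
  "Hook run after `demo-insert-greeting' has inserted its text.")

(defun demo-insert-greeting ()
  "Insert a greeting at point and run `demo-after-insert-hook'."
  (insert "hello")
  (run-hooks 'demo-after-insert-hook))

(ert-deftest demo-hook-is-run ()
  "The hook is run exactly once, after the text has been inserted."
  (let* ((calls 0)
         (seen nil)
         ;; Replace the hook contents so only the test handler is called.
         (demo-after-insert-hook
          (list (lambda ()
                  (setq calls (1+ calls))
                  (setq seen (buffer-string))))))
    (with-temp-buffer
      (demo-insert-greeting))
    (should (= calls 1))
    (should (string= seen "hello"))))
```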

[00:15:55.720] Side effects and initial buffer state

So far we have been talking about testing what the function returns. In the best of worlds, we have a pure function that only depends on its arguments and produces no side effects. Many operations produce side effects or operate on the contents of buffers, such as writing a message in the message buffer, changing the state of a buffer, moving point etc. Hyperbole is not an exception. Quite the contrary. Many of the functions creating links are just about updating buffers. This poses a special problem for tests. The test gets longer since you need to create buffers and files, and initialize the contents. Verifying the outcome becomes trickier since you need to make sure you look at the right place. At the end of the test, you need to clean up, both for not leaving a lot of garbage in buffers and files around, and even worse, not cause later tests to depend on the leftovers from the other tests. Here are some functions and variables I have found useful for this.

[00:17:05.100] with-temp-buffer

For creating tests: with-temp-buffer: it provides you a temp buffer that you visit, and afterwards, there is no need to clean up. This is the first choice if that is all you need.

[00:17:16.520] make-temp-file

make-temp-file: If you need a file, this is the function to use. It creates a temp file or a directory. The file can be filled with initial contents. This needs to be cleaned up after a test. Moving on to verifying and debugging:
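A small sketch of creating a temp file with initial contents and removing it afterwards. The prefix, suffix, contents, and test name are illustrative; the TEXT argument of `make-temp-file` requires Emacs 26.1 or later.

```elisp
(require 'ert)

(ert-deftest demo-make-temp-file ()
  "A temp file can be created with initial contents."
  ;; Arguments: PREFIX, DIR-FLAG, SUFFIX, TEXT.
  (let ((file (make-temp-file "demo" nil ".txt" "initial contents")))
    (unwind-protect
        (with-temp-buffer
          (insert-file-contents file)
          (should (string= (buffer-string) "initial contents")))
      ;; Unlike a temp buffer, the file has to be removed explicitly.
      (delete-file file))))
```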

[00:17:33.288] buffer-string

buffer-string: returns the full contents of the buffer as a string. That can sound a bit voluminous, but since tests are normally small, this often works well. I have in particular found good use of comparing the contents of buffers with the empty string. That would give an error, but as we have seen with the output produced by the should assertion, this is almost like a print statement and can be compared with the good old technique of debugging with print statements. There might be other ways to do the same as we saw with debugging.
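As an illustration, the sketch below verifies a buffer-modifying operation through `buffer-string`, with the empty-string trick left as a commented-out line. The test name and the use of `upcase-region` as the function under test are assumptions.

```elisp
(require 'ert)

(ert-deftest demo-buffer-string ()
  "Verify the full buffer contents after an operation with side effects."
  (with-temp-buffer
    (insert "Hello")
    (upcase-region (point-min) (point-max))
    ;; Debugging trick: temporarily enable the next line so the test
    ;; fails and ERT reports the actual buffer contents in the failure.
    ;; (should (string= (buffer-string) ""))
    (should (string= (buffer-string) "HELLO"))))
```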

[00:18:09.920] buffer-name

buffer-name: Getting the buffer name is good to verify what buffer we are looking at. I often found it useful to check that my assumptions on what buffer I am acting on are correct by adding should clauses in the middle of the test execution or after preparing the test input. Sometimes Emacs can switch buffers in strange ways, maybe because the test case is badly written, and making sure your assumptions are correct is a good sanity check. Even the ert package does some buffer and windows manipulation for its reporting that I have not fully learned how to master, so assertions for checking the sanity of the test are good.

[00:18:51.980] major-mode

Finally, major-mode: Verify the buffer has the proper mode. Can also be very useful and is a good sanity check.
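Sanity checks of this kind might look like the following sketch, where the buffer name and test name are hypothetical.

```elisp
(require 'ert)

(ert-deftest demo-buffer-sanity-checks ()
  "Check buffer identity and major mode before testing anything else."
  (let ((buf (generate-new-buffer "demo-sanity")))
    (unwind-protect
        (with-current-buffer buf
          (text-mode)
          ;; Are we really in the buffer we think we are in?
          (should (eq (current-buffer) buf))
          (should (string-prefix-p "demo-sanity" (buffer-name)))
          ;; Does the buffer have the mode the test assumes?
          (should (eq major-mode 'text-mode)))
      (kill-buffer buf))))
```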

[00:19:02.680] unwind-protect

Finally, cleaning up. unwind-protect. The tool for cleaning up is the unwind-protect form which ensures that the unwind forms always are executed regardless of the outcome of the body. So if your test fails, you are sure the cleanup is executed. Let's look at unwind-protect together with the temporary file example. Many tests look like this. You create some resource, you call unwind-protect, you do the test, and then afterwards you do the cleanup. The cleanup for a file and a buffer is so common that I have created a helper for that. It looks like this. The trick with the buffer-modified flag is to avoid getting prompted for killing a buffer that is not saved. The test buffers are often in the state where they have not been saved but modified.
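The helper itself is not shown in the transcript. A sketch of the pattern, with a hypothetical helper named `demo-delete-file-and-buffer` that is not claimed to match Hyperbole's own, could be:

```elisp
(require 'ert)

;; Hypothetical cleanup helper illustrating the described technique.
(defun demo-delete-file-and-buffer (file)
  "Kill any buffer visiting FILE without prompting, then delete FILE."
  (let ((buf (find-buffer-visiting file)))
    (when buf
      (with-current-buffer buf
        ;; Clearing the modified flag avoids the prompt about
        ;; killing a buffer with unsaved changes.
        (set-buffer-modified-p nil)
        (kill-buffer))))
  (delete-file file))

(ert-deftest demo-unwind-protect ()
  "Create a resource, test with it, and always clean up."
  (let ((file (make-temp-file "demo" nil ".txt")))
    (unwind-protect
        (with-current-buffer (find-file-noselect file)
          (insert "unsaved change")
          (should (buffer-modified-p)))
      ;; Runs whether the assertions above pass or fail.
      (demo-delete-file-and-buffer file))))
```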

[00:20:15.100] Input, with-simulated-input

Another problem for tests is input. In the middle of execution a function might want to have some interaction with the user. Testing this poses a problem, not only in that the input matters, but also in how to even get the test case to recognize the input. Ideally the tests are run in batch mode, which in some sense means no user interaction. In batch mode, there is no event loop running. Fortunately, there is a package, with-simulated-input, that gets you around these issues. This is a macro that allows us to define a set of characters that will be read by the function under test, and all of this works in batch mode. It looks like this. We have with-simulated-input, and then a string of characters, and then a body. The form takes a string of keys and runs the rest of the body, and if input is required, it is picked from the string of keys. In our example, the read-string call will read up until RET, and then return the characters read. As you see in the example, space needs to be provided by the string SPC, as return by the string RET.
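An example in the spirit of the one described, assuming the third-party with-simulated-input package is installed; the prompt text and test name are made up for illustration.

```elisp
(require 'ert)
(require 'with-simulated-input)  ; third-party package, must be installed

(ert-deftest demo-read-string ()
  "`read-string' reads the simulated keys up to RET."
  (should (string= (with-simulated-input "hello SPC world RET"
                     (read-string "Say something: "))
                   "hello world")))
```

The key string uses the same notation as `kbd`, which is why the space and the return key are spelled out as SPC and RET.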

[00:21:38.460] Running all tests

So now we have seen ways to create test cases and even make it possible to run some of them that have I/O in batch mode. But the initial goal was to run them all at once. How do you do that? Let's go back to the ert command. It prompts for a test selector. If we give it the selector t, it will run all tests we have currently defined. Let's try that with the subset of the Hyperbole tests. Here is the test folder in the Hyperbole directory. Let's go up here and load all the demo tests. And then try to run ert. Now we see that we have a bunch of test cases. We can run them all individually, but we can run them with t instead. We will run them all at once. So now, ert is executing all our test cases. So here we have a nice green display with all the test cases.
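Loading a folder of test files and running everything could be wrapped in a small command like the sketch below. The function name and the `-tests.el` file naming convention are assumptions for illustration.

```elisp
(require 'ert)

;; Hypothetical convenience function; the file name pattern is assumed.
(defun demo-load-and-run-all-tests (directory)
  "Load every file ending in -tests.el in DIRECTORY, then run all tests."
  (interactive "DTest directory: ")
  (dolist (file (directory-files directory t "-tests\\.el\\'"))
    (load file nil t))
  ;; The selector t matches every test that is currently defined.
  (ert t))
```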

[00:23:03.220] Batch mode

So that was fine, but we were still running it manually by calling ert. How could we run it from the command line? ERT comes with functions for running it in batch mode. For Hyperbole, we use make for repetitive tasks. So we have a make target that uses the ert batch functionality, and this is the line from the Makefile. This is a bit detailed, but you see that we have a part here where we load the test dependencies, for getting the packages such as el-mock and with-simulated-input etc. loaded. I also want to point out here the setting of auto-save-default to nil to get rid of the prompt for excessive backup files that can pile up after running the tests a few times.
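The Makefile line is not reproduced in the transcript. As a generic sketch, not Hyperbole's actual setup, a batch run can be driven by a small Lisp file. The file names `demo-batch.el` and `demo-tests.el` and the function name are hypothetical; `ert-run-tests-batch-and-exit` is the ERT entry point for batch runs.

```elisp
;; Hypothetical file demo-batch.el. Invoke from a shell or make target as:
;;   emacs --batch -l demo-batch.el -f demo-run-tests-batch
(require 'ert)

(defun demo-run-tests-batch ()
  "Load the test file and run all tests, exiting with a status code."
  ;; Avoid auto-save files piling up between runs.
  (setq auto-save-default nil)
  ;; Test dependencies such as el-mock would be required here as well.
  (load (expand-file-name "demo-tests.el"))
  ;; Prints results to standard output and exits Emacs; the exit
  ;; status is non-zero if any test failed unexpectedly.
  (ert-run-tests-batch-and-exit))
```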

[00:24:05.060] Skipping tests

Even with the help of simulated input, not all tests can be run in batch mode. They would simply not work there and have to be run in an interactive Emacs with the running event loop. One trick still to be able to use batch mode for automation is to put the guard at the top of each test case as the first thing to be executed, so that it kicks in before anything else and stops Emacs from trying to run the test case. Now, it looks like this: (skip-unless (not noninteractive)). So when ert sees that the test should be skipped, it skips it and makes a note of that, so you will see how many tests have been skipped. Too bad. We have a number of test cases defined, and to run them, we need to run them manually. Well, sort of. Not being able to run all tests easily is a bit counterproductive since our goal is to run all tests. There is however no ert function to run tests in batch mode with an interactive Emacs. The closest I have got is either to start the Emacs from the command line calling the ert function as we just have seen, and then killing it manually when done; or add a function to extract the contents of the ERT buffer when done and echo it to standard output. This is how it looks in the Makefile to get the behavior of cut and paste, getting the ERT output into a file so we can then kill Emacs and spit out the content of the ERT buffer. One final word here is that when you run this in a continuous integration pipeline, you might not have a TTY for getting Emacs to start, and that is then another problem with getting the interactive mode.
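The guard described above, placed as the first form in a test, can be sketched like this. The test name and the body after the guard are illustrative only.

```elisp
(require 'ert)

(ert-deftest demo-interactive-only ()
  "Needs a running event loop, so it is skipped in batch mode."
  ;; `noninteractive' is non-nil when Emacs runs with --batch.
  (skip-unless (not noninteractive))
  ;; Only reached in an interactive Emacs session.
  (should (windowp (selected-window))))
```

In a batch run ERT reports this test as skipped rather than passed or failed, so the summary still shows how many tests were left out.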

[00:26:08.460] Conclusion

We have reached the end of the talk. If you have any new ideas or have some suggestions for improvements, feel free to reach out because I am still on the learning curve of how to write good test cases. If you look at the test cases we have in Hyperbole and you think they might contradict what I am saying here, it is OK. It is probably right. I have changed the style as I go and we have not yet refactored all tests to benefit from new designs. That is also the beauty of the test case. As long as it serves its purpose, it is not terrible if it is not optimal or does not have the best style. And yes, thanks for listening. Bye.

Q&A transcript (unedited)

It's you and I. I have a question. How many tests do you have for hyperbole and How would you rate the test coverage compared to other packages? Well, that's a tricky 1. Shall I spell it out loud and then maybe type it at the same time? So, I believe it's around like more than 300 test cases now. But I cannot compare the test coverage to any other other package. Maybe I can type that later. What do you say, Badal? sure, yeah, that's totally fine. Feel free to just answer them with voice. 1 small suggestion to me, should means optional, where shall or must means required. Not sure if it is too late to make a major grammar change like that. Very nice presentation. So thanks for presentation, but the package ERT, well, it's not something that we have come up with. It's a standard package. So I believe it has been around for a long time. So, but please feel free to make suggestions and maybe you can, you know, like do a copy or like an alias for that. If you believe it makes more sense for your test cases to have that instead. And then we have another question here. For your info, you may find this helpful for running MX test lint both from a command line and from within MX with a transit menu. GitHub alpha papa make sure, yes. It also works on remote CI. Yeah, thank you, Alpha Papa. I think I've looked into that, but we haven't made any use of that. But maybe you'll inspire me to give it another look. Hi, Bob. Hey, how are you? Congratulations, man. Thanks, Hugh. Thank you. I have another question here. It is easy to run ad hoc tests inside an Emacs session given the command line scripts you need to run to get the batch test session running? You said it's to run an ad-hoc test. I'm not sure I understand that question. Yes, please. 
So I think what I understand is that since you have to use some of these command lines scripts to get a batch test session running, is it easy to run ad hoc tests in an Emacs session or does that, like in your experience, has that been difficult? if you look at the command line, you'll see that it's only like a few image functions to call to get that behavior to run the batch tests. So I think we made some support function for that in hyperbole. So it's not, I don't think it's possible out of the box to do it, but it's not complicated to do it. right? Just like a new function. So that's ad hoc. You just write your test and you can run it. but I got the impression it was about running all your tests like we did with the command line. Well, so the question is more about how would you run all your test cases from within Emacs? And the easy answer to that is actually you load all your test case files, and then you run ERT with the T as the test selector and then it will run all your test cases. their question a little bit as well, clarifying that. In other words, can you tweak tests in an Emacs session and run them right away? Which I believe, if I understand correctly what Bob was saying, you can basically define or redefine functions on the fly and then have them be run, right? you just change it and you run it again. And either you have to sort of load it or you can use like the commercial thing I did. You use hyperbole and just hit meta return on the test case and it will load it and run the test case again. So that's of course what you normally do when you're defining a test or debug a test case or develop a test case. Just start with something small, just make sure maybe you can prepare the test properly and run it again and again and again until you're ready with it. That's a good point. 
You can definitely do that and that's part of how I normally develop the test cases that I mean start with something small so I can see that I get there maybe the right input in the buffer that I want to test on or something and I expand on that more and more and add more and more more and more more how many test cases you have. I guess you commented on that and like what happens, you know, with the CICD pipeline, every time we commit, you know, across all the versions and what you have set up there because you know I wish people could see it. You can go and check on GitHub and you can see the logs right of any of the builds and but tell them a bit about that Mats because I think that's pretty impressive. CD, part of how we developed this using GitHub and workflows that you get out of the box from there. So this more than 300 test cases on our round for I think 5 different versions of Emacs when we do a pull request or a commit. So that's a good way to ensure that it works from version 27.2 up to the latest master version because there's some changes in Emacs over different versions that can affect your functions or your code. under 60 seconds I think you've got all of them run so you've got pretty extensive testing which does catch interesting bugs here and there, right? I mean, you normally develop with 1 version and then you think everything is okay. But then when you're tested with the different versions, you find out that there are some changes and there are things you might not sort of keep track of what's happening also. So that's a way to get noticed that the core developers of Emacs have changed something that you sort of based your code on. Now I got another question here. Did you have to change hyperbole code and design to be more readily testable as you were increasing your test coverage? 
Well, we haven't done that to a lot, to a big degree, although I believe that that is an important thing for sort of the future to do that because some of the hyperbolic functions are very complicated and long and that makes testing them rather difficult. So, at a few places we have sort of broken up functions in smaller pieces so it'd be easier to do like unit tests of the different parts of it. But there's a lot of more work that has to be done there. environment in Lisp where we're able to do a lot of interactive bottom-up testing before we even get to lighting tech pieces. So it does tend to be more higher level bugs, I think, that get caught in cross-functional interaction. We had 1 recently that was an Emacs version change. It had been a function that had existed for a long time. It had an and rest in it, in its argument list, so it would assemble the list of arguments from individual arguments that you would give it, and they decided in a recent version, I think with Stefan's input, to change that to a list and allow the prior behavior, but it would issue a warning if you use the prior behavior. So all of a sudden, the way you were supposed to do it became semi-invalid. And so we started getting the warning, and we've tried to eliminate all those warnings in recent hyperbole developments. So we're like, what do we do? You know, because we wanted to be backward compatible to where you couldn't use a list. It required you to use individual arguments. And now it's sort of requiring you to do that. And all of that was caused by the automatic testing on it. So you said, Max, you were going to tell us what you learned. So what are the major things that you learned in doing all of this work? All of this work? presentation, but as I was going along, the presentation became like twice as long as fitted into the time we had so I had to cut it out. But I think some of the core things still is in the presentation. 
From a personal perspective, and this might not be hard to realize, forcing yourself to test functions, to test code, really forces you to understand the code a little bit better, in a way that sort of makes it easier than just reading the code. I don't know how it is for the rest listening to this, but for me it works so that if I just read the code, then I don't sort of become as sharp as I should be, but if I try to write the test case for it, then I really need to understand better all the edge cases and all the sort of states, et cetera, that are involved. And I think that's sort of one of the learning things I wanted to communicate as well, that I don't think I covered in detail in the presentation. Maybe all this, but try it.
One other thing, sort of more from the fun side, is that I really think it's fun to write the tests. So if you don't have tests in your package, you should start doing that, because it is fun. It might feel like some extra work, but it really pays off in the long run, especially if you have it in, like, a pipeline where you can run it regularly when you do new commits, et cetera. So, I mean, that's maybe obvious if you look from the commercial side or your work side, to do it like that. But even for your hobby project, it can sort of pay off really well.
... functionality, or we're changing some of the plumbing in the system. You know, you go and you do some surgery and then you run the tests. And sometimes 6 to 10 tests will fail. And you find there, you know, it tends to be they're all interconnected and it leads you back to the single source. You fix that, and, you know, it could be an edge case, an off-by-one, or sometimes it's an assumption about the way something is used and it's not actually always true. And so Mats is just really good at identifying some of those scenarios and keeping us honest, I guess I would say. So I love it. I run it as much as I can, you know, even before I commit something.
So I get to see, you know, if anything has progressed. So yeah, I really recommend this process to people. I haven't seen it done. I don't know any other package that has done it to this level. And it's been working really great for us. And I think, well, we'll see too when we release to the general public.
... different packages is not the first thing you look at. So I know there are packages that have testing, a lot of testing, but how much testing they have or not, I don't know. It's not what you normally look into when you look at someone else's code. You look maybe on the functionality side but not on how they've done the sort of quality side. So there could be other packages out there that are well equipped.
... writing these tests?
Well, what springs to my mind just now is that I was doing some tests for when you narrow, what do you say? When you, in outlining, sort of compress things in an outline, so you just, sorry Bob, maybe you have it, when you hide. So I was doing some cursor movement over that. And I always assumed that if you give a prefix argument to a simple cursor movement, like Control-F moving one character position, say you want to move 2 or 3 positions, you would do Control-U 3 and then Control-F and you move 3. I always assumed that that would be exactly the same as if you just hit the key Control-F 3 times, but it's not. So it's not a bug, it's a feature, but that was the craziest thing. I spent the night trying to figure out why our code was wrong, but it turns out that's how Emacs behaves. Try it out yourself. Try to move over the 3 dots at the end of that and see what happens. Do it by hitting the cursor key or by using a prefix argument and you'll see it behaves differently. That was the craziest thing. I think there was some other crazy thing or deep learning also, but I can't come up with it at the moment.
So maybe I can write it in the Q&A later.
... but people are welcome to join Mats and Bob here on BigBlueButton to further discuss this. Thank you both.
Makaay. Thank you. I don't know, is it only me and Bob here? So Bob, do you want to say something?
And I'm glad we did this. It takes a lot of energy. I'm just really excited about the progress on this, and we're actually doing a lot of QA at work, in my professional software work, and looking at, you know, how we can do more test-driven development. And so everybody's talking about this, you know, we've got AI over here that can generate test cases. But, you know, strangely enough, with the rapidity of development in web applications, I think the level of testing has gone down in recent years compared to where it used to be, right? Because the pace has gone up. And so I think it's starting to turn again, where people are saying, we can't just release crap into the Webisphere and we have to better ourselves. And with all these advanced tool sets that you have, that you can do CI/CD testing with, you know, I just see it coming around, you know, as people develop new things.
So that's kind of exciting to me, because I came from a manufacturing culture originally where our company actually started a lot of the manufacturing quality efforts that you saw in Japan and elsewhere in America for a long time, and that was, you know, entirely through testing. We used to just build incredible test cases because we were combining software with hardware. And if, you know, the hardware doesn't work and you ship a million units, you're in trouble. So that was just something we had to do. And so it's nice to start to see that curve come around. And I think, you know, Mats is very modest, but I think he's really the one that started us down this path and really made it into a reality. So everybody else just gets to benefit from that work. So thanks.
... more here, then maybe we should just close this and I go over to write in the etherpad the replies we had.
I see one other person here, I believe Ihor just joined us. Yeah. Yeah, so if you do want to discuss with Mats and Bob, you're welcome to, otherwise, yeah, we can close the room now.
... because I had a power outage, but the part I heard was about the mock library. And you mentioned that you don't like cl-letf, but instead you use mock.
... lot more work when you use cl-letf. It's for more ambitious and maybe more complicated cases where you want to really make a new implementation, a test implementation. If you use the mock, you get a lot of things out of the box, verifying that the mock was actually called, for instance, whereas if you do it with cl-letf, you would have to keep track of that yourself. And so, a lot more work.
Oh yeah. ... used for simple cases actually. Because, just for example, the function always returns the same. And it tends to be a simple lambda that ignores all the input arguments. So that's really trivial most of the time, but I actually thought the opposite, that mock is supposed to be used for non-trivial cases.
Mock was supposed to be used for non-trivial. Yeah, I mean, I don't know how to explain this. I mean, cl-letf can be used for non-trivial cases, definitely. You can define any behavior you want then. You can write your own function, but you need to keep track of whether that function is called or not, for instance. So you have to make note that the function was called, so you can fire sort of an error in case your function wasn't called, because that would be one error case.
... mocked function was actually not called?
... you sort of document with the mock also your assumptions about how your code is going to be called. And if those are wrong, you will get an error. So if the implementation would maybe change, for instance, and not call the thing you're mocking, then you will notice that.
But if you use cl-letf, then you will have to keep track of that yourself.
Okay, I see. I see. ... test. In Org mode, we have a lot of tests and we don't use third-party libraries at all.
Yeah. Yeah. Yeah. At first I found it very powerful to use that, but then I sort of learned more about how we can use the mocking library for what I needed. And I prefer that at the moment.
Because I had seen it, but I didn't consider that it's gonna be useful even in simple cases.
So it's like life, how you turn depends. But maybe I should look more into Org mode and its test cases to learn more about that. So thanks for pointing that out.
It's almost impossible for Org. But yeah, we keep adding more tests.
Someone's typing. I don't know. Any more questions? No? Okay, then I'll go back and try to document this in the etherpad. Thank you, everybody. Take care. Bye-bye.
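
The point made in the discussion about `cl-letf`, that you have to keep track yourself of whether the replaced function was called, can be sketched roughly as follows. This is only an illustration, not code from Hyperbole or Org: the functions `my-demo-fetch-name` and `my-demo-greeting` and the test name are hypothetical and exist just for this example. Only `ert` and `cl-lib`, which ship with Emacs, are used.

```elisp
;;; -*- lexical-binding: t -*-
(require 'ert)
(require 'cl-lib)

;; Hypothetical helper: stands in for something slow or interactive
;; that a test should never really run.
(defun my-demo-fetch-name ()
  "Pretend to look up a name; signal an error if really called."
  (error "my-demo-fetch-name should be replaced in tests"))

;; Hypothetical code under test.
(defun my-demo-greeting ()
  "Return a greeting built from `my-demo-fetch-name'."
  (format "Hello, %s!" (my-demo-fetch-name)))

(ert-deftest my-demo-greeting-with-cl-letf ()
  "Replace the helper with `cl-letf' and count the calls by hand."
  (let ((calls 0))
    (cl-letf (((symbol-function 'my-demo-fetch-name)
               ;; Simple lambda that ignores its arguments, records
               ;; the call and returns a fixed value.
               (lambda (&rest _args)
                 (setq calls (1+ calls))
                 "Emacs")))
      (should (string= (my-demo-greeting) "Hello, Emacs!")))
    ;; Manual bookkeeping: without this check the test would not
    ;; notice if the code under test stopped calling the helper.
    (should (= calls 1))))
```

The test can be run interactively with `M-x ert RET my-demo-greeting-with-cl-letf RET`, or in batch mode with `ert-run-tests-batch-and-exit`. As described in the discussion, a mocking library (presumably el-mock, which is an assumption here) does the call verification for you, so the `calls` counter and the final `should` are the extra work that the hand-written `cl-letf` version needs.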

Questions or comments? Please e-mail matsl@gnu.org