The video above was recorded at the inaugural Assert.js 2018 conference in San Antonio, Texas.
If you’ve ever been frustrated with how a test faked out a network request, mocked something out, or continued passing while production burned, then this talk is for you. In it, I do my best to boil down nearly a decade of practicing test-driven development with JavaScript to summarize the Do’s and Don’ts of poking holes in reality when writing tests.
By watching this talk, you’ll learn about 9 abuses of mocks and 1 workflow that uses test doubles effectively—each explained at a level of detail you probably haven’t seen elsewhere. While you’re at it, you’ll also get a pretty good handle on how to use our testdouble.js library.
If you enjoy the talk, please share it with your friends and colleagues! And if you know a team that could use additional developers and is looking to improve its JavaScript craft, we’d love to hear from you.
Transcript of the talk
[00:00:00] Hello, alright, let's get rolling. The title of this presentation is Please Don't Mock Me. That's what my face looked like seven years ago. I go by my last name, Searls, on most internet things. If you'd like my contact info, you could npm install me. I am the creator and the current maintainer of the world's second most popular javascript mocking library.
[00:00:23] If you don't count Jest's mocks or Jasmine's spies. Here's a download chart. That's SinonJS, the most popular, and then that's us down there. And then this was a really important day. We finally caught up last month, but then NPM turned the servers back on. So we're fighting. You can help goose our numbers by installing testdouble
[00:00:45] (that's the name of the library) today and playing with it. We're gonna see a bunch of examples of it today. And the reason I'm here is to tell you that the popularity of a mocking library doesn't matter. And of course, you should respond by saying you're just saying that because your thing's not popular.
[00:00:58] And that's probably true. I'm rationalizing a bit here. But really, the reason that I'm saying that is because literally nobody knows how to use mocks. And that's shocking. I probably shouldn't say that; I should say figuratively nobody knows how to use mocks.
[00:01:14] And the reason comes down to this Venn diagram. Because there is a group of people who can explain how to best mock things. And then there's another group who always mock things consistently. But unfortunately, there's a much larger universe of people who use mocking libraries. And I wasn't aware of this because when we designed Test Double JS it was really just targeting the intersection of people who know how to use fake things well in their tests.
[00:01:39] And it's a small group of people, so just becoming more popular doesn't actually solve any problems. So instead, our goal in both writing the library and its messages and its opinionated documentation, as well as this conversation itself, is to grow the intersection of people who know how to use mock objects well, so that they can write better designed code and better designed tests.
[00:01:58] But before we dig in, we need to define a few terms. First, subject, or subject under test. Whenever I say this, imagine you're performing an experiment and this is your test subject, the thing you're testing. Dependency I'm gonna use for anything that your subject relies upon to get its job done.
[00:02:15] Usually it's another module that it requires. Unit test: this is a loosey-goosey term that has lots of definitions, but for today we're just going to say that a unit test is anything that exercises a module's API by invoking a function, whether or not it calls through to its real dependencies.
[00:02:32] Test double is a catch all term that just means fake thing that's used in your test. It's meant to evoke the image of a stunt double, like for the purpose of a test instead of a movie. So whether you call them stubs or spies or mocks or fakes, for the purpose of this conversation, it doesn't really matter.
[00:02:47] Also, oddly enough, I come from a company called Test Double. We are a consultancy. We pair with teams that are probably just like your team, if you're looking for extra software developers. We'll deploy, work with you, get stuff done, but also with an eye to making things better. You can learn more about us at our website.
[00:03:02] And part of the reason I like that we named the company Test Double is that something about mocking is an evergreen problem because it sits at this intersection of communication, design, technology, testing, all these different concerns. They need to be sorted out among every single different team. And so it's never like it's going to truly be solved.
[00:03:20] It's something we can keep on working at getting better at. Of course, I did not think very hard about creating a library and a company with the same name and the brand confusion that would result. So I've had a few people stop me and ask, do I really have 30 full-time employees who just work on a mocking library?
[00:03:34] And an unpopular one at that. The answer is no. We're consultants. And there's going to be lots and lots of code examples in this talk today. But there's no test at the end. They're just here for illustrative purposes. I don't expect everyone to follow every single curly brace and semicolon.
[00:03:51] Also, there are no semicolons, because I use standard. And, if you're a pedant like me: I'm going to be very lazily using the word mock in situations where mock isn't technically the correct word, if you're into jargon. It's just monosyllabic and popular, so I'm gonna be saying mock a whole lot.
[00:04:07] Alright, this presentation has four parts. The first part is obvious abuses of mock objects. Then we're gonna move into the slightly less obvious abuses. Then we're going to talk about use cases for mocking stuff that are rational, but nevertheless dubious. And then finally, the only real good use that I've found after a decade of doing this.
[00:04:24] And so let's just start off, top to bottom, with the obvious abuses and talk about partial mocks. So let's say that you run a park ticket kiosk and people tap on this screen so that they can order tickets. Say it's a 12-year-old: we call through to our inventory to ensure we have child tickets available before we sell them.
[00:04:42] If the person had said they were 13, we'd call through to the inventory to make sure we have adult tickets. And then either way, regardless of the age you type in, we want to try to upsell them to express passes, and so we ensure that we have those available. The logic for this might look something like this function: if they're less than 13, we call the child-ticket function.
[00:04:59] If it's over, call the adult. And then if the express module's turned on, we call that. And then the rest of your purchasing flow. Now how to test this? You'd probably think I'll create a kiosk test module. I'll invoke it just like I always would. And then I've got this call to inventory, but this method doesn't return anything.
[00:05:18] It doesn't have any other measurable side effects. And so what I'll do is mock it out, and then I'll just verify that the call took place how I expected. You can do that in a test like this: you replace that method on the inventory, have the test pass in the 12-year-old, and then verify that ensureChildTicket was called one time.
[00:05:34] To do the adult code path, you do basically the same thing: punch a hole in reality. While you're there, you can make sure that you don't call the adult ticket during the child code path. And then everything else is copy-paste, but with changed values: you change the age, you make sure it's calling adult instead of child, and so forth.
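To make this concrete, here's a rough sketch of the kind of module and partial-mock test being described. The module paths and function names here (kiosk.js, inventory.js, ensureChildTicket, and so on) are my guesses for illustration, not the actual slide code:

```js
// lib/kiosk.js (hypothetical)
const inventory = require('./inventory')
const config = require('./config')

module.exports = function kiosk (age) {
  if (age < 13) {
    inventory.ensureChildTicket()
  } else {
    inventory.ensureAdultTicket()
  }
  if (config.expressEnabled) {
    inventory.ensureExpressPass() // the upsell
  }
  // ...rest of the purchasing flow
}
```

```js
// test/kiosk-test.js — the partial mock: two of inventory's
// functions are faked, but ensureExpressPass stays real
const td = require('testdouble')
const inventory = require('../lib/inventory')
const kiosk = require('../lib/kiosk')

module.exports = function childTicketPath () {
  td.replace(inventory, 'ensureChildTicket')
  td.replace(inventory, 'ensureAdultTicket')

  kiosk(12)

  td.verify(inventory.ensureChildTicket(), { times: 1 })
  td.verify(inventory.ensureAdultTicket(), { times: 0 })
}
```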
[00:05:50] Pretty perfunctory. You run the test, it passes, everything's great. The problem is, time passes, and as time passes you get a phone call because the build is broken and it's your test that's failing. So you run the test again, and sure enough it's failing, and it's failing in a weird way: the child code path is calling the adult ticket checker, and the adult one is calling it twice. You don't know what's going on, so you open up the code, you look, you try to understand it, and you realize the only other potential cause for this is this call to ensureExpressPass. So you open up that inventory.js,
[00:06:21] and remember, we faked out two of these functions, but not all of them. So this ensureExpressPass is still the real thing, and it looks like somebody changed it to make sure we don't try to upsell an express pass when an adult ticket isn't available. And so who's to blame in this case, right?
[00:06:36] You can't really blame the person maintaining this for not having anticipated that somebody would have a half-real, half-fake inventory floating around their tests. So what do we do when we hit this situation? One quick fix is to just punch a hole in that ensureExpressPass thing too, and doing nothing more than that will make this test pass. But it doesn't quite feel right.
[00:06:56] That's like advising: when your ship is sinking, just keep poking more holes until it stops. Something is wrong with this design, and the problem is that it's only superficially simple. It feels simple because the text in that test is only concerned with what we care about, but it's not actually forming a good experimental control.
[00:07:13] So instead, I'd knock all that out, replace the whole inventory module, and then require the thing that I depend on. That way I'd know that it's not going to change for nonsensical reasons. So in general, rule of thumb: if your dependencies are real, things are going to be really easy, because then we're invoking the subject just like we always would.
[00:07:33] If the dependencies are all fake, then it's still pretty straightforward, because mocking libraries are pretty sophisticated. But if you have a dependency that's simultaneously real and fake, that's when you start to get feedback like, what's being tested here, or what's the value of this test? It's just a bad time. Don't do it.
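With testdouble.js, that whole-module replacement might look roughly like this (same hypothetical paths as above); note that replacing a module by path has to happen before the subject is required:

```js
// test/kiosk-test.js — every function on inventory is now fake
const td = require('testdouble')
const inventory = td.replace('../lib/inventory')
const kiosk = require('../lib/kiosk')

module.exports = function childTicketPath () {
  kiosk(12)

  td.verify(inventory.ensureChildTicket(), { times: 1 })
  td.verify(inventory.ensureAdultTicket(), { times: 0 })
}
```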
[00:07:47] There's a special subtype of partial mocking called mocking the subject. And long story short, I'm just exhorting people: please don't fake parts of the thing that's being tested. And you might laugh and say, that's funny, why would you fake out the thing that you're testing? That doesn't make any sense.
[00:08:02] But people do it so much that I've had to name a pattern after it, called contaminated test subjects. And I get a lot of issues opened about this. Typically the retort somebody has for me, to say they've got a good reason for this, is: oh, this module is so huge that I've got to fake out this thing and that thing in order to test this other thing.
[00:08:19] But if your thing is too long, poking holes all over it isn't going to make it shorter. Now you just have two problems: a big object that nobody understands and tests of it that nobody trusts. So I don't have anything more to say here. Just don't do that.
[00:08:31] A third obvious abuse of mock objects: replacing some but not all of your dependencies. Some people use the term over-mocking, as if mocking is a thing, like an affordance to be moderated. It makes me think that there's this invisible mock-o-meter or something, and as you mock stuff it starts to fill up, but be careful, because it might blow up, and now you've over-mocked because you crossed this invisible threshold.
[00:08:54] It just makes no sense to me, and to explain why, we're going to take an example. Say that you handle seat reservations. So you're pair programming, and in your setup you think: for this thing that requests seats, I need a seat map, something to price the seats, something to tell whether they're vacant, and then finally something to make the booking request.
[00:09:10] And that's what our setup looks like. And I imagine you have different test cases, like the seat's open, or taken, or expensive, and yada. And so because you're pairing, you got to normalize on approach. And the person on the right prefers writing isolated unit tests that mock out all of their dependencies.
[00:09:23] And the person on the left just wants maximally realistic tests to make sure that things work, and wants to avoid mocking. And since you're pairing, what do you do? You compromise. So you just flip a coin and mock out half the things instead of all of them. Again, that's not a laugh line; that happens all the time. You check your mock-o-meter on this whole over-mocking ideology and you're only at 47 percent, so it looks okay.
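Sketched out, that compromise setup might look something like this (all of the module names are invented for illustration):

```js
// test/request-seat-test.js — the coin-flip compromise: half of
// the dependencies are faked, half are left real
const td = require('testdouble')
const bookSeat = td.replace('../lib/book-seat')      // faked
const priceSeat = td.replace('../lib/price-seat')    // faked
const seatMap = require('../lib/seat-map')           // real
const checkVacancy = require('../lib/check-vacancy') // real
const requestSeat = require('../lib/request-seat')

// ...test cases: seat is open, seat is taken, seat is too expensive
```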
[00:09:44] These half-mocked tests just sneak in there. Time passes, and you get a call again because the test is failing. What's happening here is that this requestSeat function, the thing under test, is calling the seat map to get a seat back. Normally it gets that seat, but it's not getting it anymore, because the format of the string is now, instead of 23A, it's 23 A, and nobody called us, so our test started failing.
[00:10:06] Now that person on the right who prefers to isolate their subject might say, this failure had nothing to do with the subject, and she'd be right. Because if you look at where a seat number is here, it's just simply passed to this other thing. There's no logic or string stuff here. And it's the dependency that should be dealing with that.
[00:10:20] This test shouldn't necessarily have broken. So she fixes it, so to speak, by going into that require and just knocking it out with a fake, updating the test, and getting us passing again. Now more time passes, but this time production is broken. And what's happening in this case is that bookSeat dependency, the one we pass the itinerary and the seat and the price to in order to actually make the booking, is faked out.
[00:10:41] And it turns out that the person who maintains that transposed the second and third argument and now expects price and then seat in that call order. But again, nobody called us, and this time our test is still passing because the test didn't enforce the arity of that particular method. And so now production blew up because no test was there to catch it.
[00:10:59] And of course, now the developer on the left, who likes tests to be as realistic as possible, will say this wouldn't have happened if we hadn't mocked bookSeat, and he's right too. So if we look at this here, sure enough, we're calling bookSeat with the incorrect argument order.
[00:11:12] There it is right there, and he's upset. So he goes in, looks at where we've mocked out bookSeat, and replaces it with the real thing. I'm pretty sure this isn't what people mean when they say ping-pong pairing. But it's the kind of passive-aggressive back and forth that I see in codebases all the time when people can't agree on a common approach to known situations.
[00:11:33] The funny thing about it is that tests are supposed to fail. If a test never ever fails in the thousands of times that you run it, that test has told you effectively nothing. So we should plan for that failure by designing how things should fail. Take a wired unit test, for example. This one calls through to all of its dependencies.
[00:11:52] Nothing is mocked, our mock-o-meter readings are safe. And you ask: when should it fail? The test should fail whenever its behavior or its dependencies' behavior changes. And as a result of that, we need to be careful and cognizant of the amount of redundant coverage across all of our tests, to reduce the cost of change.
[00:12:09] For example, if one of those dependencies is required by 35 other things and then you need to change it, now you've got 35 broken tests. That's a pain. Now in an isolated unit test, you have the opposite situation where all of your dependencies are mocked out. And it's a hundred percent mocked out, so people are flipping out because it's so over mocked.
[00:12:25] Readings are off the charts. And you ask, when should it fail? That test in particular is going to fail when the dependency's protocol changes, when the contract or the relationship between the caller changes. And to be safe, you should create some kind of integration test that just ensures that everything's plugged in together correctly.
[00:12:39] That's typically all you need, but you do need it. But what about this test, where things are like half real and half fake? From the over mocking ideology perspective, it's safe, it looks fine, but you ask, when should it fail, and the answer is, anything could change and break this test. And so therefore, we should not write tests like this.
[00:12:58] Please don't do this. So instead of critiquing how much somebody is mocking, critique how mocks are being used and for what purpose, and start to get into the actual strategy behind a test. The common thread between all these obvious abuses is that we have an intrinsic default setting to value maximal realness
[00:13:17] over experimental control. I really like this quote: instead of treating realness as a virtue to be maximized, we should clarify what our test is trying to prove, and then use all of our tools, mocks included, to ensure we have sufficient experimental control. Unfortunately, that quote was me just now, because I wish somebody had told me this years ago.
[00:13:39] So let's move on to the less obvious abuses, starting with mocking out third-party dependencies. This is a frequent request on our GitHub issues. So let's say you depend on this third-party dependency in your application called AwesomeLib, and it's so awesome that references to it have leaked all throughout your code base.
[00:13:56] And the usage of it is a little bit strange, though. First you call it: it takes in a config file, which you read from disk, and it returns an instance with this optimize method that takes a callback, and then you get these cool interpreter hints to make your code go fast. The problem is that's a really weird usage of a library, and so it's really hard to mock out AwesomeLib. It's really painful. Here you can see we have to replace the whole file system. Then we replace the module itself. Then we have to create sort of an instance that the lib will return.
[00:14:23] And then we have to do all this stubbing. We say: when the file system reads this path, it gets this config. When that config is passed to AwesomeLib, it returns that instance. And then finally, when we call that optimize method on the instance, the callback gets triggered and passes back the hint.
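Here's a sketch of roughly how much stubbing that takes with testdouble.js; the file names, config shape, and hint are stand-ins, and AwesomeLib itself is the talk's hypothetical library:

```js
// test/optimizer-test.js — faking an awkward third-party API directly
const td = require('testdouble')
const fs = td.replace('fs')                    // replace the file system
const awesomeLib = td.replace('awesome-lib')   // replace the module itself
const subject = require('../lib/optimizer')

module.exports = function appliesInterpreterHints () {
  const config = { fast: true }
  const instance = { optimize: td.func('optimize') } // the instance the lib returns
  td.when(fs.readFileSync('awesome.json')).thenReturn(config)
  td.when(awesomeLib(config)).thenReturn(instance)
  td.when(instance.optimize()).thenCallback('gotta go fast')

  // ...finally, invoke the subject and assert that it applied the hint
}
```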
[00:14:36] That's a lot of work. And because it's a lot of test setup and it says td everywhere, it's easy to look at that and blame the mocking library, but really the root cause is the design of the code. Hard-to-mock code is typically hard-to-use code, but because we're staring at the test, we tend to blame the test.
[00:14:54] That pain just kind of festers throughout our code base. We've got a lot of messiness in our code, but also a lot of duplication in our tests. And so maybe this'll save us: there's been an announcement that AwesomeLib 2 is coming out, and so you're excited, like maybe this'll make it easier.
[00:15:07] But of course, all they did was change the callbacks to promises. So you've gotta update all those references. And now, check it out, instead of stubbing returns and callbacks, you've got to change those to resolves. And you've got to do it in 18 different places, which is frustrating. And now, to add salt to the wound, Hacker News calls, and nobody uses AwesomeLib anymore.
[00:15:26] Instead, everyone uses MindBlow now. And your team goes from frustrated to angry, just watching the churn of all these third-party dependencies go through your system. So, TL;DR: if something is hard to mock and you own that thing, meaning it's a private module with its own API that you can change, the remedy is to change it.
[00:15:45] That test is providing you with valuable feedback, saying the design of this thing is hard to use. But if you're mocking a third-party thing and it's painful to mock, what do you do then? Send a pull request? You do nothing. It's just useless pain. And useless pain matters, even if everyone here cares about testing.
[00:16:01] You came to a conference just about JavaScript testing, but if your team is suffering a lot of useless pain in its testing workflow, other people are going to notice. That's money going out the window, and you're going to lose the credibility and political capital to keep on testing. So we should be trying to minimize it.
[00:16:15] So instead of mocking out third-party dependencies, I encourage people to wrap them in little custom modules that you do own. This is a module that wraps AwesomeLib. It tucks all of the complexity of how to call it under a rug, almost; instead, you get this simple little callback API.
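A sketch of such a wrapper, under the same assumptions about AwesomeLib's odd API:

```js
// lib/optimize.js — a tiny module we own that hides how AwesomeLib
// likes to be called; the rest of the app only ever requires this
const fs = require('fs')
const awesomeLib = require('awesome-lib')

module.exports = function optimize (callback) {
  const config = fs.readFileSync('awesome.json', 'utf8')
  awesomeLib(config).optimize(callback)
}
```

Isolated tests elsewhere can now fake this one small module with a single td.replace call, and if AwesomeLib 2 (or MindBlow) changes the calling convention, this is the only file that has to know.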
[00:16:33] And as you think, you realize you now have breathing room to consider how we use this thing. You realize we could probably cache that file read, right? Because now there's just a single place that's doing it, whereas before, in the interest of terseness, we'd copied that third-party reference everywhere.
[00:16:47] Or maybe it's got certain errors we understand that we could handle, or at least translate to something more domain-specific. The tests, obviously, get a lot easier, because we're gonna depend on that wrapper and mock that out. So we can delete all this, and then this stuff gets a little bit simpler too.
[00:17:02] It just creates a lot more space and breathing room to consider the design of what you actually care about, which is the subject, instead of all of this cognitive energy going to test setup. You don't need to worry about writing unit tests of your wrappers; trust the framework, they probably work.
[00:17:18] You definitely don't want them to be isolation tests, because then if they are hard to mock, you're gonna have that useless pain again, which is just money out the window. Another thing that I see people do that isn't obviously a problem, but usually turns out to be one, is when they tangle up logic with their delegation to dependencies.
[00:17:35] And so let's take an example. Let's say you own a fencing studio and you rent out swords. You might want to be able to ask: hey, how many swords do we expect to have in stock next Tuesday? And you could write a function for that. So you pass in a date, you call something to fetch rentals, and it gives you a callback with the rentals that are currently outstanding, and then you need to filter them down.
[00:17:53] So you take the duration in days and convert it to milliseconds, and you add that to the start time to figure out when they're due. If they're due before the given date, you can expect the rental to be back by then, and then you just pluck the swords out of those. In addition, you have current standing inventory, so you call your inventory module, which gives you a callback.
[00:18:12] And you just concatenate that inventory with the swords that you expect to be back. So it's pretty straightforward. We can write a test of it: start by creating a rental object, and we'll stub fetchRentals to return that rental to us. We'll stub the current inventory to return at least a sword to us.
[00:18:28] We'll call it with the correct dates so they all match up. And then we'll just assert that we get those two swords back from the function. This test passes. That's great. So time marches on yet again. This time your test didn't break, but rather somebody points out, hey, this current inventory thing with the callback, we can make this faster now because we have a new inventory cache, which is synchronous.
[00:18:47] And so she jumps into the code, deletes that callback method, calls the synchronous one instead, and outdents things. Things are a little bit cleaner now, a little tidier, and definitely a lot faster. But of course it broke the test. And that's really frustrating. You look at the test, here it is.
[00:19:02] The test is specifying that it relies on currentInventory, and some guy on the team shows up and says, oh, this test is "coupled to the implementation," and that is very bad, and we should feel bad. Those scare quotes are mine, because I think it misdiagnoses the real root-cause problem here, which is that we have this naive assumption that tests exist to make change safe, but we rarely really inspect in our minds
[00:19:26] what kinds of tests make what kinds of change safe. And the answer is different depending on the type of test. So let's dig in. First, if you're writing a test of something that has no dependencies, no mocking at all, it probably specifies logic, ideally a pure function. And if it does, that means that if the logic's rules change, then the test needs to change.
[00:19:45] So take, for example, converting a for loop to a forEach call on an array: that's just a refactor. It's probably not going to change the behavior, so the test should keep passing. But if we're changing the duration to the duration plus a one-day grace period, you should absolutely expect that test to fail, because you've just changed the rules of the game, and the test's job is to specify those rules.
[00:20:05] Consider the case of writing a test of something with mocked-out dependencies. The test's job there isn't to specify logic; it's to specify relationships. How is this subject going to orchestrate this work across these three things that actually go and do it? And so when the contract between it and the things it depends on changes, that's when the test needs to change.
[00:20:24] So if this duration and grace period thing happens, that test should keep on passing because that should be implemented in something responsible for actually doing the work. But if we start calling inventory cache instead of current inventory, of course you should expect that to change because the relationships and the contracts are different.
[00:20:39] So where things fly off the rails is when somebody asks: what if the subject has both mocked dependencies and logic? To graph out our current situation, we have this swordStock module, which depends on currentInventory and fetchRentals, but it also has this third responsibility that it just brings in house, which is to calculate when swords are due.
[00:20:57] And so the first one specifies a relationship. The second one, it's specifying a relationship. But the third responsibility it's implementing this logic itself. And you can see this really clearly in the test, right? This stuff's focused on relationships. This stuff's focused on logic. Same thing in the code.
[00:21:14] This stuff, relationships. This, a big ol' chunk of logic. And what it represents is, in a sense, mixed levels of abstraction, where two-thirds of our thing is concerned with delegating and one-third is concerned with logic. Which is a classic design smell. And we only got to find out about it because we were mocking.
[00:21:31] What I'd recommend you do: spin off that third responsibility so that swordStock's only job is to be a delegator of responsibility to other things, and trust me, you'll feel better. And on top of it, swordsDue, taking in a date and rentals and returning swords, is now a pure function, and your intuition will start to agree with you as to when it should change and when it shouldn't, and what kind of safety you're going to get from it.
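Here's roughly what that split might look like; the property names on a rental (startsAt, durationInDays, sword) are invented for the sketch:

```js
// lib/swords-due.js — the extracted logic as a pure function:
// given a date and the outstanding rentals, which swords will be back?
module.exports = function swordsDue (date, rentals) {
  return rentals
    .filter(rental => {
      // startsAt is assumed to be epoch milliseconds
      const dueAt = rental.startsAt + rental.durationInDays * 24 * 60 * 60 * 1000
      return dueAt <= date.getTime()
    })
    .map(rental => rental.sword)
}

// lib/sword-stock.js — the subject now only delegates
const fetchRentals = require('./fetch-rentals')
const currentInventory = require('./current-inventory')
const swordsDue = require('./swords-due')

module.exports = function swordStock (date, callback) {
  fetchRentals(rentals => {
    currentInventory(inventory => {
      callback(inventory.concat(swordsDue(date, rentals)))
    })
  })
}
```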
[00:21:53] In short, if a test specifies logic, that makes me feel great, because there's nothing easier to test than passing in a couple of values and getting one back. If a test specifies relationships, I feel good, because I've got some sort of evidence that the design is being validated, that things are easy and simple to use.
[00:22:09] But if a test specifies both, I feel no safety whatsoever from it; I'm just bracing for the pain of a hard-to-maintain test. If you struggle with this, or if your team does, and you need some slogan to tape to a wall, have it be this one: functions should either do or delegate, but never both.
[00:22:27] I think a lot of our designs could improve if we followed that. Another less obvious abuse is when people mock data providers at intermediate levels in the stack. So let's say hypothetically that we invoice travel expenses and we have an invoicing app that calls via HTTP to an expense system.
[00:22:43] We get some JSON back, and so our subject under test sends these invoices. The system's way over there, though, because we depend on building invoices, which depends on filtering approved expenses, which depends on grouping them by purchase order, which depends on loading the expenses, which depends on an HTTP GET, which finally calls through to the other system. And then of course we get all the data transformed all the way back up the stack. So when you have this layered an architecture, it's reasonable to ask: okay, what layer am I supposed to mock when I write my test?
[00:23:10] It depends. If what you're writing is an isolated unit test of sendInvoice, then you always want to mock out the direct dependency because it's going to specify the contract that you have with it and the test data that you need is going to be absolutely minimal to just exercise the behavior in the subject that you're concerned with.
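As a sketch, an isolated unit test of sendInvoice would fake only its direct dependency, something like this (the module names, arguments, and data shapes are invented):

```js
// test/send-invoice-test.js — fake the direct dependency only
const td = require('testdouble')
const buildInvoices = td.replace('../lib/build-invoices')
const sendInvoice = require('../lib/send-invoice')

module.exports = async function sendsAnInvoice () {
  // minimal, made-up data: just enough to exercise the subject
  td.when(buildInvoices('trip-123')).thenResolve([{ total: 100 }])

  await sendInvoice('trip-123')

  // ...assert on whatever sendInvoice's own job is with those invoices
}
```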
[00:23:25] But if you're trying to get regression safety and make sure that the thing works, then you either want to keep everything realistic or stay as far away as possible. Ideally, at the HTTP layer, because then those fixtures that you save off can have some kind of meaning later on if anything changes, and you can use that to negotiate with whoever's writing the expense system.
[00:23:44] But what you don't want to do is just pick some arbitrary depth, like this group-by-purchase-order thing, and mock that out. Maybe it's the nearest thing, or maybe it feels convenient, but it's going to fail for reasons that have nothing to do with the subject, sendInvoice. And the data at that layer is this internal structure that doesn't mean anything outside of our system.
[00:24:03] So if the mocked layer is a direct dependency and it fails, it's a good thing, because it means that our contract changed and we need to go and inspect that. If the mocked layer is the external system and it blows up, it's another good failure, because it's the start of a conversation about how this API's contract is changing underneath us.
[00:24:21] But if you mock out at an intermediate layer, again, it's another form of useless pain. You learn nothing through the exercise except that you had to spend half a day fixing the build. The common thread between all these lesser abuses of mock objects is that through uncareful, undisciplined use of mocks, we undercut our tests' return on investment.
[00:24:41] We're just not clear about what the value of the test is anymore. And if you can't defend the value of your tests... excuse me: we can't defend the value of our tests if they fail for reasons that don't mean anything, or have to change in ways that we can't anticipate.
[00:24:56] Unfortunately, that's also me, because there aren't that many quotes about mocks out there, it turns out. All right, I got my own quote wrong in how I read it, too. All right, so let's move on to uses of mocks that are rational but, in my opinion, often dubious. The first is using mocks in tests of existing code.
[00:25:14] So let's say that you work at an Internet of Things doorbell startup, and because you're a startup, you move fast and loose. You've got a lot of spaghetti code that you wish you could clean up, but you just won more funding, and so now you're excited: you can finally start writing some unit tests.
[00:25:27] You've been putting this off forever. And it's been hanging over your head for a long time that you got zero test coverage. So you write what you think is gonna be the easiest test that you could ever write. It's just pass a doorbell to the thing that rings the doorbell and increments a ding count.
[00:25:42] So here we go. We require our doorbell. We require the subject that rings it. We create a doorbell. We pass it to the subject. And then we should assert that the doorbell's ding count goes up to one. Great. Couldn't be easier. So we run our test, and of course it fails. And it says, oh, by the way, doorbell requires a door property.
[00:25:58] Okay, we can go in, we require a door, create a door, pass it to the doorbell, run our test again. Oh, you can't have a door if you don't have a house. Alright, fine. So we'll import a house, create a house, pass it to the door, which we pass to the doorbell, which we pass to the subject, and run our test.
[00:26:13] It says: house requires a paid subscription. And now we're talking about payment processing and databases, because this code was not written to be used more than once. It was just used once, in production. So it's not very usable. And so we run into this pain of all this test setup.
[00:26:25] And so a lot of people here will look at this, they'll table-flip, they'll delete all that stuff, and then they'll just replace it with a mock object: something fake that they can control for the purpose of just getting a test to run. And they'll set the ding count to zero, they'll tuck it all in. And clearly that smells.
[00:26:40] And if your mocking smell-o-meter is not finely attuned, the nature of this smell is actually really subtle, because every other example of a mock that I've shown so far replaces a dependency of the subject, something that does part of the work the subject delegates, not a value that passes through it as either an argument or a return value.
[00:26:59] The reason for that is that values carry type information, which is useful, but also that they should be really cheap and easy to instantiate, free of side effects. So if your values are hard to instantiate, the fix is easy: just make them better. Don't just punch a hole in reality and mock them out.
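For instance, one way to "make the value better" is to give it cheap defaults, so a test can build a real one in a single line. A sketch, with all of the details invented:

```js
// lib/doorbell.js — nothing here forces a door, a house, or a
// paid subscription into existence just to make a doorbell
module.exports = class Doorbell {
  constructor ({ door = null, dingCount = 0 } = {}) {
    this.door = door
    this.dingCount = dingCount
  }
}

// The original test can then use the real value:
//   const doorbell = new Doorbell()
//   ringDoorbell(doorbell)
//   assert.equal(doorbell.dingCount, 1)
```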
[00:27:15] And nevertheless, you got that test going, you got a little bit of test coverage. Another thing I see teams do is if they've got a really big, hairy module they'll want to tackle that first when they start writing tests. And so they want to write a unit test and if they try to write it symmetrically against the private API of this thing, it tends to just result in a gigantic test that's every bit as unwieldy as the module was, and gets you absolutely no closer to cleaning that huge module up and making the design better.
[00:27:42] Because if you depend on 15 different things, whether you mock those dependencies out or not, you're going to have a ton of test setup pain. It's going to be really hard to write that test. And the underlying smell was obvious before you even started writing the test, which is that this object is too big.
[00:27:58] And so if you know that the design has problems, what more is a unit test going to tell you, except by being painful? So instead: odds are the reason you're writing tests is that you want refactor safety, so that you can improve the design of that thing. If you're looking for safety, you need to get further away from the blast radius of that code; increase your distance.
[00:28:17] The way that I would do that is I'd consider the module in the context of an application. There are other complexities here, like maybe there's a router in front of it or something, but go create a test that runs in a separate process and then invokes the code the same way a real user would, by maybe sending a POST request.
[00:28:31] And then you can probably rely on it going and talking to some data store or something and after you've triggered the behavior that you want to observe, maybe you can interrogate it with additional HTTP requests, or maybe you can go and look at the SQL directly. But this provides a much more reliable way for creating a safety net to go and aggressively refactor that problematic subject.
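A sketch of that kind of test, assuming the app has already been started separately on a local port and that we're on a Node version with a built-in fetch (the routes and payloads are invented):

```js
// test/ring-doorbell-integration-test.js — drive the app from the
// outside, the same way a real client would
const assert = require('assert')

module.exports = async function ringingADoorbellIncrementsItsDingCount () {
  await fetch('http://localhost:3000/doorbells/42/rings', { method: 'POST' })

  const response = await fetch('http://localhost:3000/doorbells/42')
  const doorbell = await response.json()

  assert.equal(doorbell.dingCount, 1)
}
```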
[00:28:51] And integration tests, they're slower, they're finickier, they're a scam in the sense that if you write only integration tests, your build duration will grow at a super linear rate and eventually become an albatross killing your team. But, they do provide refactor safety. So in the short and the medium term, if that's what you need, that's great.
[00:29:10] And plus, they're more bang for your buck anyway: when you don't have a lot of tests, you'll get a lot more coverage out of the gate. Another thing I see from people who have existing code that they're trying to get under test using mock objects: they'll use mocks just as a cudgel to shut up any side effects.
[00:29:23] So you might see something like this: initBell is being knocked out whenever we require init, which is the thing responsible for starting up the app. And somebody might point out, hey, why are we faking this? And it turns out, oh, it dings every doorbell in some startup sequence at require time.
[00:29:37] Now, that's a smell, right? The root cause is module design: a good module is free of side effects and idempotent, and you can require the thing over and over again. But because we didn't have any tests, we got into the habit of having all these side effects everywhere, because everything was only required by one thing.
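A sketch of the better-module version (names invented): export the side effect as a function instead of running it at require time, so requiring the module stays idempotent.

```js
// lib/init-bell.js
const doorbells = require('./doorbells')

// Before (the smell), this ran the moment anything required the file:
//   doorbells.forEach(doorbell => doorbell.ding())

// After: the app's entry point decides when the side effect happens
module.exports = function initBell () {
  doorbells.forEach(doorbell => doorbell.ding())
}
```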
[00:29:50] So again, root cause: write better modules; don't just knock out reality left and right in your tests. Unless you're open to the idea that writing isolated unit tests is going to inform your design, and that you're going to improve your design as a result, don't even bother writing them.
[00:30:05] Write some other kind of test. Another rational-but-dubious use that I see a lot is when people use mocks to facilitate overly layered architectures. One thing that I've learned over lots of years of test-driven development and testing is that people who write a lot of tests tend to feel pushed towards making smaller things.
[00:30:25] So instead of one horse-sized duck, we have a hundred duck-sized horses. And if you've got a big order.js, normally, and you start writing a lot of tests, it wouldn't be surprising at all to find you end up with an order router, an order controller, and yada yada, lots of small, tiny things. It's really easy, especially up on a stage, to say this big, knotty mess of code is really bad, and maybe it's even provably bad. But it's also important to note that smaller isn't better just by virtue of being smaller. It has to be focused as well. It has to be meaningful. Because look at this case, where we've got this sort of stack of all these different units.
[00:30:58] If every single feature that we do has the same six cookie-cutter things that we have to create, then all we're really doing is creating that large object over and over again with extra steps. Now there's more files and indirection, but we're not actually designing code better. You might ask: what does this unsolicited code design advice have to do with mocking?
[00:31:17] If we have a test of one of these layers, and we mock out all the layers beneath it, then what we can do using that tool is create like a skyscraper of infinitely many layers, never having to worry about the fact that if something down at the bottom changes, it'll break everything above us. And so you tend to see teams who use a lot of mock objects create really highly layered architectures.
[00:31:37] And I get it, right? Because layering feels a lot like abstraction. You're doing a lot of the same motions. You're creating a new file. It's a small thing. It's got a really clear purpose. But it's not actually necessarily like the same as thoughtful design of a system. Instead, I'd focus on generalizing the concerns that are common across the features, instead of just creating a checklist of, we do these six things for every story in our backlog.
[00:32:02] So, for instance, have a generic router and controller and builder that's variadic and can handle all the different resources in your application. And then, when you need some custom order behavior, treat the controller like an escape hatch, like a main method. So this is just plain old JavaScript that's focused on nothing other than whatever is special about orders, as opposed to just building another stack in an increasingly broad application.
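Very roughly, and with every name here invented, that escape-hatch idea might look like:

```js
// orders.js — one generic resource() helper handles the cookie-cutter
// routing/persistence plumbing; only order-specific behavior lives here
const resource = require('./lib/resource')

module.exports = resource('orders', {
  // plain old JavaScript, focused on whatever is special about orders
  create (attributes, orders) {
    return orders.save({ ...attributes, placedAt: new Date() })
  }
})
```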
[00:32:23] So yes, make small things, but make sure that they're meaningfully small and not just arbitrarily small. The last dubious thing that I see people do with mocks is over reliance on verification, or verifying that a particular call took place. And I'm a firm believer that, like, how we write assertions in our tests, especially if we're practicing some kind of test first methodology, will slowly and subtly steer the design of our code.
[00:32:49] So let's say you run a petting zoo, and you were given a copy of testdouble.js for your birthday. And you're really excited, because you love the petting zoo and you can finally verify every pet. Somebody pets the sheep: it was pet one time. Somebody pets the llama: pet one time. Sheep again: now you can verify two times. Really exciting. Nobody pets the crocodile (sorry, crocodile), but you can write a test for it.
[00:33:10] And there's a problem in your petting zoo, which is that kids' hands are dirty, and they make the animals dirty, but the pet function doesn't say how dirty, and so we just have to guess, and we clean them daily with a 10 p.m. cron job that hoses down all the llamas. It's hugely inefficient and a waste of water.
[00:33:28] How did we get here? Take a look at this test, right? We have a few stubbings where we ask if the kid likes the sheep: yeah, they like the sheep. We ask if they like the llama: they do. They don't like the crocodile. We pass all that into the subject, and then we verify the sheep was pet and the llama was pet, but the croc wasn't.
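A sketch of that verification-heavy test and the subject it implies; the dependency names (likes, pet) and data shapes are my guesses:

```js
// test/petting-zoo-test.js
const td = require('testdouble')
const likes = td.replace('../lib/likes')
const pet = td.replace('../lib/pet')
const pettingZoo = require('../lib/petting-zoo')

module.exports = function petsTheAnimalsTheKidLikes () {
  const kid = { name: 'Jane' }
  const [sheep, llama, croc] = [{ type: 'sheep' }, { type: 'llama' }, { type: 'croc' }]
  td.when(likes(kid, sheep)).thenReturn(true)
  td.when(likes(kid, llama)).thenReturn(true)
  td.when(likes(kid, croc)).thenReturn(false)

  pettingZoo(kid, [sheep, llama, croc])

  td.verify(pet(sheep))
  td.verify(pet(llama))
  td.verify(pet(croc), { times: 0 })
}
```

```js
// lib/petting-zoo.js — the subject that "writes itself"
const likes = require('./likes')
const pet = require('./pet')

module.exports = function pettingZoo (kid, animals) {
  animals.forEach(animal => {
    if (likes(kid, animal)) pet(animal)
  })
}
```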
[00:33:44] And that's a real tidy-looking test, and the subject almost writes itself as a result. We have those two dependencies; it takes in the kid and the animals; for each of the animals, if the kid likes the animal, we pet the animal. Great, passing test. But because the mock library made it so easy to verify that call,
[00:34:01] nobody thought to ask: what should pet return? If we didn't have a mocking library, it would have to return something for us to be able to assert on it, but we didn't have to, because we'd faked it out anyway. And it's an important question to ask, and I love this example because it gets at a fundamental point about tooling generally: tools exist to save time by reducing the need for humans to think things and take actions in order to get a particular job done.
[00:34:27] And that's fantastic, unless it turns out that those thoughts were actually useful. So I'm always on guard against any tool that can reduce the necessary and useful thoughtfulness as I'm doing my job. And that's one example of it. So, I've been critical of snapshot testing, because the median number of tests on a JavaScript project has been and remains zero.
[00:34:50] And so when you tell people Hey, here's the snapshot testing tool and it'll test stuff automatically. People are like, sweet, I'll check that box. And then that means I never have to worry about testing again. And because that's a very obvious form of abuse, I'm critical of snapshot testing tools.
[00:35:05] That said, if you're a highly functional team and you know this nuanced use case where you're only using it to make sure that like your dependencies don't break underneath you, great, but that's 1%. 99 percent are just like abusing them. And I feel the same way, really, about most uses of mock objects.
[00:35:22] Being able to verify calls may end up discouraging a lot of people from writing pure functions when they otherwise would. And pure functions are great, because they don't have side effects, and things are more composable, so we should be writing more of them. If you do use mock objects when you're isolating your tests, just be sure to ask: what should things return?
[00:35:39] And always default to assuming that things should return values, because it just leads to better design. And in this case, when we call pet and we pass it an animal, what could it return but a dirtier animal? And then we could use that information to figure out when to wash our animals. Taking the same example, let's set up these stubbings instead of the verifications.
[00:35:56] So when we pet the sheep, we get a dirtier sheep, and when we pet the llama, we get a dirtier llama. And then we get a result back now from our subject, so we can actually assert that we end up with a dirty sheep and a dirty llama and a clean crocodile. So now, the change here: first we're gonna get rid of this forEach.
[00:36:12] Anytime you see forEach anywhere in your code, it means side effect, because it doesn't return anything. So we're gonna change that to a map, so we're trading one array for another. And then, in the code path where the kid does like the animal, we'll return whatever pet returns; otherwise, we'll just return the animal as it was.
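Reworked around return values, the same sketch might become:

```js
// test/petting-zoo-test.js — stubbings instead of verifications
const td = require('testdouble')
const assert = require('assert')
const likes = td.replace('../lib/likes')
const pet = td.replace('../lib/pet')
const pettingZoo = require('../lib/petting-zoo')

module.exports = function returnsDirtierAnimals () {
  const kid = { name: 'Jane' }
  const [sheep, llama, croc] = [{ type: 'sheep' }, { type: 'llama' }, { type: 'croc' }]
  td.when(likes(kid, sheep)).thenReturn(true)
  td.when(likes(kid, llama)).thenReturn(true)
  td.when(likes(kid, croc)).thenReturn(false)
  td.when(pet(sheep)).thenReturn({ type: 'sheep', dirty: true })
  td.when(pet(llama)).thenReturn({ type: 'llama', dirty: true })

  const result = pettingZoo(kid, [sheep, llama, croc])

  assert.deepEqual(result, [
    { type: 'sheep', dirty: true },
    { type: 'llama', dirty: true },
    { type: 'croc' }
  ])

  // leftover verifications from the earlier version; these are what
  // trigger the "stubbed and verified the same call" warning below
  td.verify(pet(sheep))
  td.verify(pet(llama))
}
```

```js
// lib/petting-zoo.js — forEach becomes map, and each branch returns
const likes = require('./likes')
const pet = require('./pet')

module.exports = function pettingZoo (kid, animals) {
  return animals.map(animal => likes(kid, animal) ? pet(animal) : animal)
}
```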
[00:36:31] And now we got that test to pass. The benefit here is now we know who to wash and when, saving on water. And that's really great. There's just one little problem remaining, which is like, when you run this test, you're gonna get this big, nasty paragraph that I wrote three years ago that says Hey, yo, you just stubbed and verified the exact same method call.
[00:36:50] That's probably redundant. And so that's a little bit quizzical. So just to explain: we've got all these verify calls down at the bottom, but they are almost provably unnecessary, because we're stubbing the same thing and also asserting on the return value. The only way for the subject to get the dirty sheep is to call that method exactly that way.
[00:37:08] And a lot of people, like a llama at a petting zoo, just can't let go of verifying the hell out of every single call. Trust me, you don't need to; just delete it. In fact, I generally only verify calls to the outside world at this point. So if I'm writing to a log, or if I'm scheduling something in a job queue, or something like that where it doesn't give me any kind of response or I don't care about the response, then an invocation verification might be worthwhile.
[00:37:33] But usually, if I'm designing my own private APIs, I always want to be returning values. The common thread between all three of these things is that there's a habit, I think, of neglecting test design feedback when we're using mocks. And the whole purpose of mock objects, the reason they were invented, the reason these patterns exist, is to write isolated unit tests in order to arrive at better designs.
[00:37:54] It was not just to fake out a database. So if that's why you use mocks: spoiler alert, you've been doing it wrong. I hate to say that, but it's really depressing how they have been misused. I'm all done talking about that now, except to say that when a test inflicts pain, people tend to blame the test, because that's the thing you're looking at; it's the thing that feels like it's causing you pain. But the root cause is almost always in the code's design, and hard-to-test code is hard-to-use code.
[00:38:21] So we're all done talking about these different abuses. I want to move on to, finally, the one good use, the positive, happy way that I love using mocks, the reason that I maintain a library for it. And it all comes back to this thing I call London-school test-driven development.
[00:38:36] I use this phrase because it's based on the work of Nat Pryce and Steve Freeman and their really seminal book, Growing Object-Oriented Software, Guided by Tests. It's called London because they're members of the London Extreme Programming group. Some people call this outside-in test-driven development.
[00:38:51] Some people say test-driven design. I call it discovery testing, because I've changed a few things; it's very important that I brand everything that I do. And what it does is it helps me deal with the fact that I'm a highly anxious and panicky person. I'm a really bad learner, for example.
[00:39:11] I seem to take every single wrong path before I end up at the right ones. And I gave a whole talk just about this, how we're all different people and how developers can customize workflows to suit their particular traits, called How to Program, last year. And it's the ideology that undergirds this particular presentation.
[00:39:30] Because earlier in my career, I was full of fear. I had blank-page syndrome; I was afraid, staring at my IDE, that I'd never figure out how to make code happen inside of it. That fear led to really uncertain designs, big, knotty, complex stuff with tons of conditionals and comments, and as soon as I got anything working, I would just publish it immediately for fear that any future changes would break it. And complete doubt: if anyone asked me, hey, could you finish this in a week?
[00:39:57] I just had zero confidence that I'd be able to do it. But where I am now is I have a certain ease about myself when I approach development, because I have processes that give me incremental progress; I'm going to show you that in a second. And confidence, because through that process, I've realized that coding isn't nearly as hard as I always thought it was.
[00:40:16] And doubt, because I'm just a really self-loathing individual, but I like to think that I'm doubting more useful things now than I used to. So suppose you write talks, but it takes you way too long to make all these slides, and so you decide we're gonna automate it. We're gonna take something that takes our notes, runs it through sentiment analysis to pair it up with emoji, and then creates a Keynote file.
[00:40:37] So we start with a test, and at the beginning of the test we don't know how we're gonna do this. It seems almost impossibly ambitious. So we just write the code that we wish we had. I wish I had something that could load these notes. I wish I had something that could pair those notes up with emoji and give me slides back.
[00:40:52] And then I wish I had something that could create a file given those slide objects. Then of course I've got my subject. And the test can be just as simple; I'll probably only need one test case: creates keynote from notes, or something. So I run my test. It errors out. It says loadNotes doesn't exist. But hey, that's a file;
[00:41:09] I can create a file. I run my test. Oh, this other file doesn't exist, so I touch it, run my test. This third file doesn't exist, touch it, run my test. And now my test passed. Now, these are really tiny steps; it's just plumbing, but to me it's the difference between getting nothing done through analysis paralysis and having incremental forward progress throughout my day be baked into my workflow.
[00:41:31] It feels like paint-by-number, and you can design workflows that actually give you messages telling you what to do next, to prompt and guide your actions in a productive way. So here's where we're at so far: we have this entry point, and we think we could maybe break things down into these three units.
[00:41:49] But let's continue with writing the test itself. I'll start with some notes. I know that I'm gonna need notes, so I invent right here a note value object. And then I say, okay, if I call loadNotes, I probably need to pass it a string to search for topics, and it'll call back with these notes.
[00:42:03] And then I need some slides, so I invent a slide object. And when I pass my notes to this pairEmoji thing, it should give me those slides back. And my subject: I pass in that topic and probably a file path to save the document to. And so finally I'll verify that createFile was called with the slides and the file path.
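Putting that together, the discovery test might look something like this sketch; the names match the example as I've described it, and the note and slide shapes are invented:

```js
// test/create-keynote-test.js
const td = require('testdouble')
const loadNotes = td.replace('../lib/load-notes')
const pairEmoji = td.replace('../lib/pair-emoji')
const createFile = td.replace('../lib/create-file')
const subject = require('../lib/create-keynote')

module.exports = function createsKeynoteFromNotes () {
  const notes = [{ text: 'Mocking is great' }]
  const slides = [{ text: 'Mocking is great', emoji: '🎉' }]
  td.when(loadNotes('mocking')).thenCallback(notes)
  td.when(pairEmoji(notes)).thenReturn(slides)

  subject('mocking', 'talk.key')

  td.verify(createFile(slides, 'talk.key'))
}
```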
[00:42:21] Run my test, and now of course I get a different failure. It says loadNotes is not a function. This is an artifact of the fact that the default export of an empty file in Node.js is just an object. So I can go in and just export a function instead. And I can see the other two coming, so I'll go ahead and export functions from those files as well.
[00:42:39] Run my test again. And now what it says is: it expected to be called with an array of this slide and with this file path, but it wasn't called. And so finally, after all this work, we get to write some real code. It turns out, though, that writing this code is an exercise in obviousness, because we've already thought through literally everything it does.
[00:42:59] We know that we have these three dependencies. We know that our exported API is going to have these two arguments. We know that load notes takes a topic and then calls back with notes. We know we pass those notes to pair emoji, which gives us slides. We know that the create file method takes slides and a path and then we're done.
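And the subject that falls out of it is about five lines, along these lines:

```js
// lib/create-keynote.js
const loadNotes = require('./load-notes')
const pairEmoji = require('./pair-emoji')
const createFile = require('./create-file')

module.exports = function createKeynote (topic, path) {
  loadNotes(topic, notes => {
    createFile(pairEmoji(notes), path)
  })
}
```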
[00:43:15] So we run our test, the test passes, and that's it. And you might be looking at it and thinking, that was a lot of talking to write a five-line module. And you'd be right, it was a lot of talking. But even those five lines, I think, belie the amount of stuff that we just accomplished.
[00:43:32] We decided pretty clearly, through usage, what our customers need to provide us to implement this thing. We figured out that if we pass a topic into this thing, we can get some sort of value type back. We also identified a data transformation, converting those notes to slides. And finally, we were confident that if we pass slides and a file path to something, we'd be further along in the story of automating this workflow.
[00:43:55] Job number one: the subject is done. That top layer, even though it's really short, very rarely will I ever have to go back and change it again, unless the requirements significantly change. Number two: the work has now been broken down. So instead of having the entire complexity of my overall system in my working memory, I have this outside-in approach where I've got these three jobs remaining, I can do them in any order, they don't have to know about each other, and it's much easier to understand and think about.
[00:44:20] And third: all these contracts are clearly defined. I know what goes in and out of each of these things, so there's no risk of squirreling away and building a bunch of objects with complexity that ends up not being necessary, or an API that doesn't fit quite right with the thing that ultimately needs it.
[00:44:35] Playing things ahead a little bit, I've got these values note and slide, and those exist off to the side because they're just passed in as arguments back and forth through the system. But if I play forward, like, how the rest of this tree would flow, I imagine that to load my notes, I'd probably need to read a file, parse that outline into some sort of logical set, and then flatten all the points out so that the story is sequential, like a slide deck.
[00:44:59] This first thing is IO, so I'd wrap it with a wrapper like we talked about earlier. The other two, though, are pure functions, which means I can test them delightfully, with no mocks needed. The pairEmoji job would probably need to tokenize those notes, pass them to something that does sentiment analysis, and then convert those to slide objects.
[00:45:16] The first thing would be a pure function. The second thing would probably wrap any of a dozen npm modules that do sentiment analysis. The third thing would be another pure function. Real easy. The third job, creating that file: we probably want to take those slide objects and indicate layout, like text up here and emoji down here. We'd probably want to generate AppleScript commands in some sort of structured way, so we can rifle through a set of them
[00:45:40] to create the presentation. And then finally, something to automate Keynote. The first thing, again, is a pure function. So is the second item. And the third one, we'd probably just shell out to that osascript binary, so we just wrap that and pass all those commands on through. What this approach has given me, as I've developed it and mapped my personal workflow to it over the years, is reliable and incremental progress at work every day, which is something that I didn't have earlier in my career.
[00:46:07] Also, all of the things that I write have a single responsibility. It just shakes out that way. They're all small and focused. And they all have intention revealing names because all of the things that I write start out as just a name in a test, so they'd better say what they do clearly. It also leads to discoverable organization.
[00:46:24] So a lot of people are afraid of, oh man, too many small things, I'll never find anything. But because I visualize everything as an outside-in tree, I just tuck every subsequent layer of complexity behind another nested directory. So if you're interested in how this thing works, maybe only at a cursory level, you look at that first file; but if you want to dig in, you know that you can dive into the directories underneath it.
[00:46:47] It separates out values and logic. And what I've found is that by keeping my data values off to the side and my logic completely stateless, things tend to be more maintainable. And what I end up with at the end of the day is a maximum number of simple, synchronous, pure functions that are just logic and don't need to know anything about promises or callbacks or anything like that.
[00:47:06] And so this has made me really happy. So that's all I got. You may mock me now. I know you guys have been super patient; this is a long talk and there's a lot of slides. Again, I come from a company, Test Double. If your team is looking for senior developers and you care a lot about improving (and this is one of the ways you might be interested in improving), myself and Michael Schoonmaker are here today.
[00:47:25] Wave, Michael. Yeah, he's up in the front row. We'd love to meet you tonight and hang out. You can read more about us here or contact us. We're also always looking for new double agents. If you're interested and passionate about this stuff and you want to do consulting check out our join page and learn what it's like to work with us.
[00:47:39] Quick shout-out to an even less popular library that we wrote called teenytest. All of the code examples and test examples in this talk are real working tests. It's our zero-API test runner, which we really love using for small projects and examples. The only other time I was in San Antonio, I gave another talk about testing.
[00:47:54] I apparently have a niche. It's titled How to Stop Hating Your Tests, and if you liked this talk, I think that would be a really good companion, because it's about more than just mocking. So finally, I would love all your feedback, would love your tweets and whatnot. Just excited to meet as many of you as I can today.
[00:48:10] I got a lot of stickers and business cards for you. So that's it. I'm all done. Thanks a lot for your time.