
Seven Shipping Principles

2025/1/8

REWORK

People

David Heinemeier Hansson
Jason Fried
Kimberly Rhodes
Topics
Jason Fried: 37signals' release process prioritizes software quality over speed. Even if a feature is mostly done, it doesn't ship while small problems remain; what ships must be complete and high quality, not merely a minimum viable product (MVP). Software design should focus on user needs rather than technical implementation, consider multiple approaches, and pick the best one. Designs should be simple and clear, with no hidden rules or special cases. When things get hard, revisit the essence of the problem: simplify it first, then refine the solution.

David Heinemeier Hansson: 37signals only ships high-quality software and would rather delay a release than ship substandard work. 'Good' is a high bar, which sometimes means refusing to ship work that doesn't meet it. 'Good' is a holistic judgment of the whole product, not the sum of individual features, and team members should be empowered to block releases that fall short. A 100% ship rate is not the goal; some work failing to meet the bar is normal. Raw test-coverage numbers don't measure quality or reliability. Testing rigor should match the severity of the problem: confidence should be proportional to criticality and calibrated by experience and context, much like the graded burdens of proof in legal verdicts. Most development work is of middling criticality and doesn't need extreme rigor, while critical systems such as billing demand higher standards and stricter review. Applying one uniform testing standard to everything wastes resources on trivial problems. Junior programmers tend to over-test, which hurts efficiency; the skill is balancing quality against speed, and some sensible risk-taking is necessary once quality is assured. Developers should own the issues that appear after shipping; if they are insulated from that feedback, they become less careful and quality drops. The ultimate goal is fault-free software, not satisfied test metrics. Leaving cool-down time after a release to fix bugs and handle customer feedback improves quality; back-to-back cycles leave developers no time for post-release issues, which hurts collaboration and efficiency. Developers should follow their work through its whole lifecycle, take responsibility, and learn from real feedback. Avoid chasing perfection at the expense of core functionality: perfect is the enemy of good. Prioritize the clarity of the core concept over the sophistication of the implementation; a clever implementation can mask a flawed premise. Don't lower quality standards to hit a deadline, and don't make 'better than nothing' the bar for shipping. Have confidence in your work, hold it to a higher standard, and scope what and when you ship to your own capacity and resources.

Deep Dive

Key Insights

What is the first shipping principle of 37signals, and why is it significant?

The first shipping principle is 'We only ship good work.' It is significant because it sets a high bar for quality, ensuring that the software is not just 'okay' but genuinely good. This principle allows the team to say no to shipping subpar work, even if significant time has been invested. It emphasizes the importance of delivering software that the team is proud of, rather than rushing to meet deadlines with mediocre results.

Why does 37signals critique the concept of MVPs (Minimum Viable Products)?

37signals critiques MVPs because they believe the 'minimally viable' bar is too low. Testing incomplete or half-baked ideas often yields poor feedback, which doesn't help in evaluating the true potential of a feature. Instead, they advocate for shipping complete, well-thought-out ideas, even if they are smaller in scope, to ensure the feedback received is meaningful and the product is of high quality.

How does 37signals determine when they are confident enough to ship a product?

Confidence in shipping is determined by the criticality of the problem being solved. For less critical issues, a lower burden of proof is acceptable, while for high-stakes features (e.g., billing systems), rigorous testing and reviews are required. The team avoids applying the same level of rigor to all tasks, instead calibrating their confidence based on the potential impact of bugs or failures.

What does the principle 'We own the issues after we ship' mean at 37signals?

This principle means that the person or team responsible for a feature must also handle any issues that arise after it is shipped. They monitor error trackers, respond to customer feedback, and fix problems. This approach ensures accountability and provides developers with real-world feedback, helping them improve their work and avoid isolating themselves from the consequences of their decisions.

Why does 37signals avoid shipping features that 'aren't right'?

Shipping features that 'aren't right' can lead to hidden rules, awkward user experiences, and unsatisfactory explanations for edge cases. 37signals prioritizes solving the root problem effectively, even if it means stepping back and rethinking the approach. They avoid shipping incomplete or flawed solutions, ensuring that the final product aligns with their high standards and provides a seamless user experience.

What does 'We ship to our appetite' mean in the context of 37signals' shipping principles?

'We ship to our appetite' means that 37signals avoids overextending themselves by pursuing ideas beyond their capacity or interest. They focus on delivering good work within their six-week cycles, avoiding 'gold plating' or over-decorating features. This principle ensures that they prioritize clear, impactful ideas over perfection, maintaining a sustainable pace and avoiding burnout.

How does 37signals balance the trade-off between perfection and practicality in their shipping process?

37signals balances this trade-off by focusing on delivering good work rather than perfect work. They prioritize clear, impactful ideas and avoid over-engineering solutions. For example, if a feature is technically clever but based on a flawed premise, they will step back and rethink the approach. This ensures that their solutions are both practical and aligned with their high standards.

Transcript


Welcome to Rework, a podcast by 37signals about the better way to work and run your business. I'm your host, Kimberly Rhodes, and I'm joined as always by the co-founders of 37signals, Jason Fried and David Heinemeier Hansson. This week, I want to dive into one of the write-ups on the 37signals website called the Seven Shipping Principles, which is our guidelines for how we ship and shape and build our

software at a sustainable pace is how it's written. David, I believe you wrote this, if I'm not mistaken. Let's go through some of these seven principles. We won't dive into all of them in too much detail, but a few of them I think are really important and interesting for people to hear about. The first one, we only ship good work. I think that's pretty self-explanatory, but tell us a little bit about your thoughts on that. That was actually the point that motivated writing up these shipping principles in the first place. And it sounds self-evident.

But it's not actually as self-evident as it sounds when you have spent weeks on something, you're out of time, and you've built something, something that you could rationalize yourself into being okay to ship.

But it's not okay to ship because it's not good. And good is actually a relatively high bar. It's better than okay. We could have written, we ship okay software. Do you know what? I am sure that would actually be a high bar in some establishments, but that is not our bar. Our bar is good software.

And by articulating that as the bar, it gives us permission to say no. It gives us permission to say, yes, we've spent a fair amount of time on this one feature, this cycle, but it's not good software yet. So we're not going to ship it. We're going to resist that instinct to always deliver something. And occasionally (it's quite rare; I can remember only a handful of instances) we literally had to invoke this as a

stop block: this is not going out because it's just not good enough. But it is also one of those things, and I think we talked about it in one of the previous podcasts, that the ultimate test of 'we only ship good work' really comes in the 11th hour. And that was the other reason I wrote this down. Good software is an all-inclusive evaluation of what you've built. It's not this little piece here, this little piece there. It's

All of it, just as it's ready to go out. And that's usually when we're ready to ship or when we want to ship is when we've finished assembling all the pieces. So it's no wonder that that is the ultimate test for whether this should go out the door or not.

And I wrote it down in part because this is the role Jason and I often play, that we are the last stop before something goes out the door. But it'd also be nice if there were more people in the organization who felt empowered because it was written down to essentially say, you know what, we need a timeout here. Maybe this project, for whatever reason, just didn't come together in time.

Which is not a huge amount of shame. In fact, I think a little suspiciously sometimes about our shipping rate. I think maybe our shipping rate is actually too high; you should not aim to ship 100% of all the work you set out to do. You should aim to ship, I don't know,

80, 90, 95%. But there's got to be some part of it that's not good enough. Because otherwise, either you're just the best software maker that's ever been, highly unlikely, or your standards are just not where they need to be. So shipping good work.

good software, something that I would be proud to use, that's got to be the bar. And that's why we wrote it down as such. Well, I was going to add one small thing. This is like, there's been a lot of discussion about MVPs and whatnot lately. I think that's one of the issues we take with MVPs, which is like minimally viable. That's too low a bar. It also is like people use it to test ideas. I saw David wrote up a nice tweet about this recently. It's like,

Testing incomplete half-ass ideas isn't really going to yield you the answers that you're looking for. That's why this stuff has to be good. Because if you want to evaluate whether or not something's any good, you have to put something out there that you think is good, not just minimally viable. Because you're going to get minimally viable feedback on that, which is not really that good either. So that's kind of the idea. When you put something out there in the world...

It also should be a complete idea. It can be a smaller idea than you initially wanted it to be, but it needs to be a complete idea and not some partial half-ass, half version of it in a sense. It can be half, not half-ass in a sense, but the half needs to be good. What you put out there needs to be good. It can't just be a half-ass version of something.

Okay, I'm going to read the beginning of this next one because I think it's interesting. I'm going to get your take on it. The next one is we ship when we're confident. We talk a lot about confidence and like gut reactions here on the podcast. It says the reason we do automated and exploratory testing is so we can ship work with confidence that it won't cause major problems. But exactly how much confidence is required depends both on the team and the criticality of the problem. David.

I think this comes up often in software development circles when we talk about quantification, when we talk about having such-and-such test coverage for a product. I remember once upon a time people would say, I need 90% test coverage, which basically meant that out of all the lines of code in your product, 90% of them are exercised by tests.

Or they have a test ratio: we write three lines of test code for every one line of production code. That's a ratio I've heard stated in the past. And I always go like...

That doesn't tell me anything. If you're writing all this sort of mechanically, I need three lines of code for every line of production code, whether what you're implementing is completely trivial and banal or whether you're implementing something that if there's a bug, it costs people thousands of dollars or in the most extreme case of criticality that someone could get seriously hurt or worse from it.

You're not keeping the context in mind. You should apply far less rigor to something where a problem is a mere inconvenience, or even perhaps an aesthetic issue, versus "I am losing millions" or "I'm putting people in harm's way." That's what the criticality ladder is about,

which I think is a great way of looking at the whole problem of confidence. Because what are you confident about? And how confident are you really? So we generally try to have confidence that a problem is not likely. And this is almost like burdens of proof, right? Like ours is...

Beyond reasonable doubt. That is much higher as a burden of proof than I forget what the one below it is called. Something preponderance of evidence. Yes. Something like that. Right. Like in the in the legal world, they clarify these levels of criticality. If you're convicting someone for murder, you damn well better be sure beyond a reasonable doubt. If you're having a dispute about some contract, right.

That's much lower criticality. We don't have to deal with it with the same rigor. We may not even need 12 peers to adjudicate this issue. And we try to look at this the same way, that so much of the work that we do falls in the middle territory. Do you know what? If it doesn't work or there's a bug, that's not great, but also we could fix it quickly.

Sometimes we go, do you know what? I know there might be an issue here (which is always, by the way, a good cue that there will be an issue), but the issue isn't going to be that large. I have pushed out plenty of fixes like that. And then when we tinker with, for example, what we call Queen Bee, which is where we do all our billing, we take on a higher burden of proof. We want to have more tests. They have to be more regimented. We want to have more reviews, right?

But that level of ceremony needs to be proportionate. And the biggest error I've seen is to pick one protocol, one level of ceremony, and just apply it to all of it. Apply it to the trivial stuff, and now you're...

There's a Danish saying of shooting sparrows with cannons. You're shooting sparrows with cannons. That's just overkill. And at the same time, that process may also be underkill if you're literally putting people in harm's way or you could potentially lose hundreds of thousands of dollars, right? So having that sense of fluidity

Of confidence. This is why I like confidence as a word. It's not a numerical scale; it doesn't go from one to five. It's, again, a gut feel. Now, that takes a while to train.

I see very often junior programmers come in and they want to do everything as safely as possible. And then it takes forever, right? They want to write a million tests. They want to write five lines of test code for one line of production code. And then they're not going to be able to ship. They're not going to fit within our six-week cycles. They're going to turn a two-week project into a five-week project. And then they quickly realize, you know what?

ah, okay, all this ceremony that I was just doing because that's the protocol, I can't actually do that and accomplish what I wanted to accomplish. I have to now gauge things and calibrate them such that it is proportionate to the criticality. And that's where the magic happens, in those trade-offs. And I also think this is where the enjoyment happens. There is nothing more frustrating than going through

meaningless checklists, bullshit checklists that are just like, oh, I have to do this because that's what the chart says. Right. When you're doing something that you know isn't proportionate, that you know isn't a fit for the right thing, it's just demoralizing. It feels like you're spinning your wheels and you're wasting your time. I would rather err, if anything, a little bit on the cowboy side. Right. We don't build pacemakers. Right.

So we're not going to kill anyone if we make a slight mistake. So we should err a little bit on the cowboy side. Not so much that no one wants to use our damn software because it's total crap and full of bugs and they constantly run into issues. That's not rewarding either, right? But there's a sweet spot in between these two things that fit within the context of the work that you're doing.
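The criticality ladder David describes can be made concrete. Below is a minimal sketch in Ruby (the language 37signals builds in), with entirely hypothetical names and numbers, not their actual code: a trivial display helper earns a single smoke check, while a billing-style calculation (the "Queen Bee" territory) earns boundary and error cases.

```ruby
# Hypothetical sketch of proportionate test rigor; not 37signals' actual code.

# Low criticality: a display helper. A bug here is a cosmetic annoyance.
def short_label(title, max = 20)
  title.length <= max ? title : "#{title[0, max - 1]}…"
end

# High criticality: billing proration. A bug here costs customers real money.
def prorated_cents(monthly_cents, days_used, days_in_month)
  unless (0..days_in_month).cover?(days_used)
    raise ArgumentError, "days_used out of range"
  end
  (monthly_cents * days_used) / days_in_month # integer cents, rounds down
end

# Proportionate rigor: one smoke check is plenty for the trivial helper...
raise "label" unless short_label("Seven Shipping Principles") == "Seven Shipping Prin…"

# ...while the billing path gets boundary and error cases as well.
raise "full"    unless prorated_cents(9900, 30, 30) == 9900
raise "zero"    unless prorated_cents(9900, 0, 30)  == 0
raise "partial" unless prorated_cents(9900, 10, 30) == 3300
begin
  prorated_cents(9900, 31, 30)
  raise "should have rejected impossible usage"
rescue ArgumentError
  # expected: out-of-range usage is refused, not silently billed
end
```

The point isn't these specific checks; it's that the ceremony scales with the cost of being wrong.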

Okay, David, you brought up issues. So I'm going to skip to the fourth shipping principle, which is we own the issues after we ship. I think this one's interesting because it kind of ties in with our theory of managers of one. It says, clean up your own mess if you make one. Pay attention to the error tracker for a while after launching. Be the first port of call for support when they get feedback from customers. If you did the work, you have the context and you should put it straight if it's crooked. Okay.

This is all about incentives. If you are a product designer or developer and you get to throw things over the wall and not worry about how they land or what kind of mess they make when they land, you will inevitably be less careful. Your confidence meter will be off because you have no feedback from reality to calibrate it by. You think you're doing enough testing because you don't see the damn bugs.

You don't see the damn feedback from customers. So you're isolated from reality.

Whenever you're isolated from reality, you will inevitably just invent a reality of your own making. Oh, I ship great software. I don't know what you're talking about. This was wonderful. I did all the tests. The tests don't matter. What matters is whether the software is fault-free or not. That's the ultimate outcome. We're not looking for these intermediary outcomes. We're not looking for you to check off boxes about how many tests you've written. We're looking for

faultless software, or faultless enough for the criticality that we're aiming for. So I think we have gotten this wrong on occasion in the past. And some of it came from before we either had, or fully respected, our cool-down. That's usually part of the magic of cool-down. So we work in six-week cycles, and then we have two weeks of cool-down before the next cycle starts.

And that time needs to be for two things. It needs to be for you to roam and clean things up if you didn't have a project that ran right to the line. But it's also essentially overtime if you had a project that ran right to the line.

But you need to clean up after it, after it goes into production. We have had instances where things ran too close back to back. Someone would work on a major feature, they'd push it out and boom, they'd be on project number two. And then, yeah, who's going to clean up that mess? Whosoever is, as we call it, on call. And now they're sat there going like, oh, I have to understand this feature from first principles. I didn't build it.

That creates actually a bit of friction within teams. And I've seen that develop even within our teams in the past when we didn't respect this so fully. And I think it's just, it's not good for any party. It's not good for the person who feels like they're just being handed a bag of trash.

And they have to figure out how it all works. And it's actually not good for the person who just gets to throw shit over the wall because they don't get to learn. They actually don't get to improve. They don't get to test whether their theories of software design are adequate or even good because they're robbed of that feedback from reality. And I think there's a sweet spot for software developers who get to work on new stuff and

but also get to see that stuff in the wild. That's the kick of shipping, the kick of seeing something you built actually being put to good use, not from two kilometers away, but from right up front, right up front that you're seeing, oh shit, yeah, there's an issue there. So I think it's just a gift to both parties to make it such that there's a fairness principle that the shit you put in there, do you know what, you have to take it back out again. Okay.

Okay, this one, I feel like, Jason, this one probably applies to the design side just as much as the programming side, which is we don't ship if it isn't right. I know you mentioned a couple podcasts ago that we were working on Highrise for six months, completely redesigned it. Tell me a little bit about how this applies to the design side. Yeah, I would say it's like more on the product side, which is all of it. But I'll give you an example. So we're working on a feature right now in HEY,

which originally started out as being able to move through emails using the keyboard, J and K or the arrows back and forth, whatever. And there were some technical reasons why we couldn't really do it everywhere we wanted to. So it was only going to work in a few spots. And so we kind of built it that way. And I was using it. Something isn't right here. It's like a smell, like to use David's line, like there's something that isn't smelling right about this. Like

Yeah, in most cases it works, but here's an edge condition where it's not that unusual. I could find myself in this corner pretty frequently and like, it won't work and it's frustrating and I don't know why. I can't explain it to people. I mean, I could technically, but like,

It's awkward. It would be a very unsatisfactory explanation. So we backed off of that for a second. We could have shipped it because it mostly worked. But I'm like, yeah, let's back off this for a second here. What are we trying to do? So the main thing we're trying to do here was we're trying to make it easy basically to power through your unread emails.

And the reason you might want to use the keyboard is because you want to look at one and go to the next one and go to the next one. You want to move through them rapidly. The idea is to move through them rapidly. That's the idea. And a keyboard is faster than a mouse in most cases for that. So we stepped back and said, okay, so we're trying to do that. Is there another way we could do that?

And we looked at the way we do things in HEY already, and it happens to be an interesting layout where we stack things on top of each other. Scrolling is very fast on computers, both on mobile and desktop. Like, what if we just let you scroll through your unreads in a way that you can then respond to some that you need to, mark some read that you don't need to respond to but want to mark read, and then at the end when you're done, say, mark them all done if you want to. Like,

Maybe we could do that. So we started out trying to solve the problem one way, started using it, could have shipped it, but it wasn't quite right, backed off, literally took a step back and looked at it more broadly and came up with, I think, a better idea. When I read that paragraph that David wrote, that's what it means to me.

And it can also be an aesthetic thing, like the proportions aren't right and the buttons aren't in the right place and all that stuff. But to me, it's more about, is this doing what it should do? Is this the best way it can do it? Or not the best way, because there's always a better way, I suppose. But does this feel like the right way to solve this problem? Are we really getting to the root of it? And are we leaving any unknown magical features in the corner? Let me explain what I mean by that. I don't like special rules, basically, or hidden rules, I should say. I call these magical corners. That's my little stupid...

internal version of it. But basically, when there's hidden rules, like, well, the arrow works in every case, but these four. And no one knows that. No one's going to know that. And when they run into it, they're going to ask us questions. And we have to explain these hidden conditions. Well, in this case, if you have this situation, this won't work. I don't like those.

So whenever I run into those, I start to step back and go, I'm not sure this is right. I don't think we can ship this yet. Maybe sometimes we can, but oftentimes we can't. Let's figure out a different way where we don't have these hidden rules. That's how I think about that particular one. I don't know if that's what it was intended to be, but that's how I interpret it. I think one of the corollaries to this or things that are connected to it is this idea that when it gets hard, when there's something that isn't quite working, you

you should always question the premise. Why is this hard? As Jason said, what are we trying to do here? I think with that specific example, we started out with a solution

rather than a problem. So the solution is I want to be able to go through my emails quickly using a keyboard. Okay. That's not the same as stepping back one step further and looking, what is the problem? I want to catch up quicker on my emails. That's a more generic version of that. That's not about a keyboard. It's not even about stepping from one to the other. It is about a fundamental premise. And when we ran into these technical issues as to why the solution didn't work in all the cases,

That's exactly sort of that hard stuff, what the shipping principles are supposed to do. It's just remind you, wait a second, let's lean back here. There's a reason this is hard.

Could we make it easy? There's a great saying in programming that is essentially: if you're going to make a hard change, first make the change easy, then make the easy change. Now, it may be hard to make the change easy, but that's how you clean things up, both conceptually and practically. We needed to clean up our concept of what we're trying to solve a little bit here.
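That "make the change easy" saying can be illustrated with a tiny hypothetical refactor (invented names, not code from the episode): a hard change, adding a new plan to a conditional chain, becomes an easy change, one line of data, once the structure is cleaned up first.

```ruby
# Hypothetical illustration of "first make the change easy, then make the easy change".

# Before: pricing lives in a conditional chain, so adding a plan is a hard change.
def price_before(plan)
  if plan == :basic
    900
  elsif plan == :pro
    2900
  else
    raise ArgumentError, "unknown plan #{plan}"
  end
end

# Step 1, make the change easy: refactor the chain into data.
PRICES = { basic: 900, pro: 2900 }

def price(plan)
  PRICES.fetch(plan) { raise ArgumentError, "unknown plan #{plan}" }
end

# Step 2, make the easy change: the new plan is now one line of data.
PRICES[:max] = 9900

raise unless price(:basic) == price_before(:basic) # behavior preserved by the refactor
raise unless price(:max) == 9900                   # the hard change became easy
```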

And as it often happens with these shipping principles (why they're called shipping principles, not development principles), that moment of clarity arrives in the 11th hour. It arrives when all the pieces are put back together. That's often the only time you can really lean back and question the entire thing. Is this right?

And that's why we need to write it down, because you have the other impulse: we're done, we're shipping, right? It's called shipping principles because we're ready to push it out into the world. That's when you need to pause. That's when the hardest pause needs to happen, because you could push deploy, it could go out there, and you could convince yourself that what you're pushing out there is better than nothing.

That's a terribly low burden of proof. I don't ever want to be or rarely want to be in this situation where we sign off on better than nothing. I mean, what a depressing way to work. Better than nothing? That's your bar? Like non-existence or something? Whatever that something is, even if it's total shit?

I mean, come on. You've got to set your sights a little higher than better than nothing. So it's one of those things where I try to coach myself: when I hear my internal monologue go "better than nothing," there's just a reaction, right? It connects to this: we don't ship if it isn't right. Better than nothing? Don't have such low standards. Raise your chin a bit. Look proud of what you're about to ship.

Okay. The last one I want to dive into is the last one. We ship to our appetite. David, you talked a little bit about our six-week cycles and our two-week cool down. I do want to read this one phrase that I found interesting, which says, like a great company can be a terrible stock at an exorbitant price, so too can a great pitch become a mistake if it's pursued far beyond the appetite. Yeah.

Love to get your thoughts on that. And then if you have some examples of maybe where we've pushed the boundaries of the appetite. I think just as a general principle here, in programming circles this is usually called gold plating. You take something...

That has a valid reason to exist, and you just adorn it. There's so much decoration, so much veneer on top of it, all these other things, because you want to make it perfect. Which is also a thread that runs through all of this: perfect is the enemy of good.

And good is what we're trying to do. The first principle is "we ship good work," not "we ship perfect work." That would be a total trap to fall into. And oftentimes that perfection masks issues that are actually deeper. You're perfecting a specific solution to a flawed premise. There's no joy in that. I'd much rather have a good-enough solution to a great premise, to a great insight,

And I think that's sometimes what we have to weigh, right? When we're trying to fit things into a short amount of time, as these six-week cycles are, we have to weigh the quality of our insight, of our premise, of our design against the implementation itself.

And when I'm being forced to choose between those two things, I'll always pick the clear concept, always the clear idea, the clear line. And if you look actually at the example Jason referred to, going back and forth between your emails, the implementation of using the keyboard in the more traditional, just jump to another, was actually quite sophisticated.

Like we had to do some sophisticated work to get the caching to be right. And I could look at it like, that's pretty clever. That's a clever implementation of that. Yeah, it's a clever implementation of a flawed premise. Because now when I look at the other solution we've come up with, I go like, oh yeah, we're just reusing some implementational bits we already have, but it's being used in service of a quite novel insight into what people are actually trying to do.

Okay. Well, I'm going to link to the Seven Shipping Principles, which are at 37signals.com/thoughts. Until then, Rework is a production of 37signals. You can find show notes and transcripts on our website at 37signals.com/podcast. Full video episodes are on YouTube. If you have a question for Jason or David about a better way to work and run your business, leave us a voicemail at 708-628-7850. You can text that number or send us an email to rework@37signals.com.