JoelOnSoftware

Joel on Software

Things You Should Never Do, Part I

绝对不要做 (第一部分)

by Joel Spolsky Thursday, April 06, 2000

Netscape 6.0 is finally going into its first public beta. There never was a version 5.0. The last major release, version 4.0, was released almost three years ago. Three years is an awfully long time in the Internet world. During this time, Netscape sat by, helplessly, as their market share plummeted.

网景6.0终于要迎来它的第一次公测了,从来没有5.0. 上一次大的发布4.0几乎已经是3年前的事情了。 3年对于因特网世界来说绝对是很长的时间。 这三年里,网景公司只能绝望的坐视它们的市场份额暴跌。

It's a bit smarmy of me to criticize them for waiting so long between releases. They didn't do it on purpose, now, did they?

Well, yes. They did. They did it by making the single worst strategic mistake that any software company can make:

They decided to rewrite the code from scratch.

我隔了这么长时间再来评判他们好像显得有点做作。 这也不是他们故意想要造成的结果,不是么?

当然,他们没想过会这样。 他们只不过是制定了世界上所有软件公司能制定的最最糟糕战略性决策而已:他们决定从头开始重写代码。

Netscape wasn't the first company to make this mistake. Borland made the same mistake when they bought Arago and tried to make it into dBase for Windows, a doomed project that took so long that Microsoft Access ate their lunch, then they made it again in rewriting Quattro Pro from scratch and astonishing people with how few features it had. Microsoft almost made the same mistake, trying to rewrite Word for Windows from scratch in a doomed project called Pyramid which was shut down, thrown away, and swept under the rug. Lucky for Microsoft, they had never stopped working on the old code base, so they had something to ship, making it merely a financial disaster, not a strategic one.

网景绝不是第一个犯这样错误的公司。 Borland做出过同样错误的决定,他们买下了Arago然后决定将其制作成Windows上的dBase, 这个计划花了如此之长的时间乃至微软的Access吞食了他们的市场份额; 然后他们决定重写QuattroPro的时候又犯了同样的错误,人们惊讶的发现居然只有这么点儿功能。 微软几乎同样的错误,他们决定启动一个注定会是失败的叫做金字塔的项目来重写WORD,不过这个项目最终被关停,扔到一边,掩藏起来。 微软很幸运这样他们继续在就的代码库上没有停止过开发, 这样他们就能有些东西能发布, 最终这个决定不过造成了财务大问题,而不是策略大失误。

We're programmers. Programmers are, in their hearts, architects, and the first thing they want to do when they get to a site is to bulldoze the place flat and build something grand. We're not excited by incremental renovation: tinkering, improving, planting flower beds.

There's a subtle reason that programmers always want to throw away the code and start over. The reason is that they think the old code is a mess. And here is the interesting observation: they are probably wrong. The reason that they think the old code is a mess is because of a cardinal, fundamental law of programming:

我们是程序员。 程序员从本质上来说更像架构师, 他们每到一个新的地方就想铲平该处从头再来。 我们不会满足于增量式的创新: 修修补补, 改进,修修花床。

有个微妙的原因能说明程序员为什么想扔掉现有的代码库从头再来。 因为他们觉得旧的代码库简直一团糟。 有个有趣的观察结论:他们很可能错了。 他们认为旧的代码一团糟的原因是因为编程的基本法则:

It’s harder to read code than to write it.

This is why code reuse is so hard. This is why everybody on your team has a different function they like to use for splitting strings into arrays of strings. They write their own function because it's easier and more fun than figuring out how the old function works.

阅读代码要比重写代码容易。

这就是为什么代码重用很难。 这就是为什么你团队里的组员每人都有一个不同的函数用来将字符串分裂成一个数组。 因为比起搞清楚旧的代码如何工作 自己重新写一个显得更简单而且更有意思

As a corollary of this axiom, you can ask almost any programmer today about the code they are working on. "It's a big hairy mess," they will tell you. "I'd like nothing better than to throw it out and start over."

Why is it a mess?

"Well," they say, "look at this function. It is two pages long! None of this stuff belongs in there! I don't know what half of these API calls are for."

Before Borland's new spreadsheet for Windows shipped, Philippe Kahn, the colorful founder of Borland, was quoted a lot in the press bragging about how Quattro Pro would be much better than Microsoft Excel, because it was written from scratch. All new source code! As if source code rusted.

作为这个公理的一个推论就是, 你可以马上去问所有程序员关于他们现在工作的代码。 “简直糟透了,”他们会告诉你。 “我更想把它扔了从头开始。”

为什么这是一团糟呢?

“恩,”他们会说,“看这个函数。 居然有两页长! 没有任何一段东西应该放在这里! 我甚至不知道一大半这些API调用在干什么。”

在Borland公司的新Windows表单程序发布之前, Borland那个带种族歧视的创始人PhilippeKahn在媒体上大吹特吹说QuattroPro会如何如何比微软Excel好, 因为它是重头开始重写的。 完全是崭新的代码! 说的好像代码会生锈似的。

The idea that new code is better than old is patently absurd. Old code has been used. It has been tested. Lots of bugs have been found, and they've been fixed. There's nothing wrong with it. It doesn't acquire bugs just by sitting around on your hard drive. Au contraire, baby! Is software supposed to be like an old Dodge Dart, that rusts just sitting in the garage? Is software like a teddy bear that's kind of gross if it's not made out of all new material?

Back to that two page function. Yes, I know, it's just a simple function to display a window, but it has grown little hairs and stuff on it and nobody knows why. Well, I'll tell you why: those are bug fixes. One of them fixes that bug that Nancy had when she tried to install the thing on a computer that didn't have Internet Explorer. Another one fixes that bug that occurs in low memory conditions. Another one fixes that bug that occurred when the file is on a floppy disk and the user yanks out the disk in the middle. That LoadLibrary call is ugly but it makes the code work on old versions of Windows 95.

说新的代码比就的代码好简直就是荒谬。 旧的代码至少被用过。 至少被测试过。 大量的代码被发现被纠正。 旧代码没什么不好的。 它不会因为呆在硬盘上就长出了新的错误。 反过来刚,小朋友! 软件难道应该像飞镖一样么?扔车库里就会生锈? 软件难道应该像泰迪熊一样么?如果不用新的材料做就会让人觉得恶心么?

回到那个两页 函数的问题。 我知道, 这就是个显示窗口的简单函数, 它确实多长了点儿头发之类的玩意儿, 没人在知道为啥。 我告诉你为啥吧: 因为修复代码错误。 中间一段代码修复了Nancy尝试在她电脑上安装时候出现的错误因为她电脑上没有IE。 另外一段修复了在低内存机器上会出现觉得错误。 还有一段修复了当软件放在软盘上安装到一半的时候用户弹出软盘的时候会出现的错误。 那段LoadLibrary调用确实很丑,但是它使代码能够在旧版本的Windows95上能工作。

Each of these bugs took weeks of real-world usage before they were found. The programmer might have spent a couple of days reproducing the bug in the lab and fixing it. If it's like a lot of bugs, the fix might be one line of code, or it might even be a couple of characters, but a lot of work and time went into those two characters.

When you throw away code and start from scratch, you are throwing away all that knowledge. All those collected bug fixes. Years of programming work.

所有这些错误都要在实际中用个几个礼拜才能发现。 修复这些错误的程序员也许要在实验室里花上个几天才能重现这些错误。 如果跟大多数的代码一样也许修复只要一行代码, 或者只要几个字符, 但是就这几个字符花了大量的时间和精力。 当你扔掉这些代码从头开始的时候, 你扔掉了所有这些知识。 所有那些代码错误的搜集和数年的编程工作。

You are throwing away your market leadership. You are giving a gift of two or three years to your competitors, and believe me, that is a longtime in software years.

你扔掉的是你的市场主导份额。 你给竞争对手送了两到三年的时间做礼物, 相信我, 在软件行业这绝对是很长的时间。

You are putting yourself in an extremely dangerous position where you will be shipping an old version of the code for several years, completely unable to make any strategic changes or react to new features that the market demands, because you don't have shippable code. You might as well just close for business for the duration.

你将你自己放在一个危险的境地,你花了几年发布的还是一个旧版本, 根本不会根据市场的需要制定战略性决策或者添加新的功能, 因为你没有能发布的代码。 你在那段时间内就像是关张不做生意一样。

You are wasting an outlandish amount of money writing code that already exists.

你花了难以想象的金钱来重写那些已经存在的代码。

Is there an alternative? The consensus seems to be that the old Netscape code base was really bad. Well, it might have been bad, but, you know what? It worked pretty darn well on an awful lot of real world computer systems.

有其他办法么?共识好像是网景的旧代码确实非常糟。 好吧,它可能确实很糟, 但是,你知道么? 但是它却能在现实中很多很多电脑上运行。

When programmers say that their code is a holy mess (as they always do), there are three kinds of things that are wrong with it.

但程序员说他们的代码超级糟糕(他们总是这么说), 有三样事情错了。

First, there are architectural problems. The code is not factored correctly. The networking code is popping up its own dialog boxes from the middle of nowhere; this should have been handled in the UI code. These problems can be solved, one at a time, by carefully moving code, refactoring, changing interfaces. They can be done by one programmer working carefully and checking in his changes all at once, so that nobody else is disrupted. Even fairly major architectural changes can be done without throwing away the code. On the Juno project we spent several months rearchitecting at one point: just moving things around, cleaning them up, creating base classes that made sense, and creating sharp interfaces between the modules. But we did it carefully, with our existing code base, and we didn't introduce new bugs or throw away working code.

第一,架构有问题。 代码没有正确重构。 网络相关的代码到处弹出对话框;这些代码应该在UI代码里出现。 这些问题能够一次一个的小心的移动代码,重构代码,修改接口来实现。 可以通过一个程序员一次小心的提交所有这些更改,这样就避免影响到其他的程序员。 再大的架构改动也能够通过这种方式的变动达成而不需要扔掉代码。 在Juno项目上,到某个点的时候我们曾经花了几个月冲架构: 仅仅是把东西搬来搬去,清理干净, 创建更合理的基类, 在模块之间创建更多合理的接口。 但是我们实在现有的代码库的基础上小心的做的, 我们没有引入新的错误也没有扔掉旧的代码。

A second reason programmers think that their code is a mess is that it is inefficient. The rendering code in Netscape was rumored to be slow. But this only affects a small part of the project, which you can optimize or even rewrite. You don't have to rewrite the whole thing. When optimizing for speed, 1% of the work gets you 99% of the bang.

程序员认为旧的代码一团糟的第二个理由就是它不够高效。 传说网景的渲染代码不够高效。 但这只影响了项目的一小部分, 这部分你可以优化甚至重写。 你不需要把整个玩意儿重写啊。 说到优化速度, 1%的工作量就可以带来99%的变化。

Third, the code may be doggone ugly. One project I worked on actually had a data type called a FuckedString. Another project had started out using the convention of starting member variables with an underscore, but later switched to the more standard "m". So half the functions started with "" and half with "m_", which looked ugly. Frankly, this is the kind of thing you solve in five minutes with a macro in Emacs, not by starting from scratch.

第三点, 代码实在太特么丑了。 我工作的一个项目曾经有一个数据类型叫做 干尼玛的字符串。 另一个项目一开始的时候规定程序应该以下划线—开始,后来又改成了m接下划线。 所以一般的函数以下划线开始一半的 函数以m下划线开始,看起来实在丑。 老实说,这就那种你在Emacs下面用宏能够在五分钟之内解决的问题。而不是从头开始。

It's important to remember that when you start from scratch there isabsolutely no reason to believe that you are going to do a better job than you did the first time. First of all, you probably don't even have the same programming team that worked on version one, so you don't actually have "more experience". You're just going to make most of the old mistakes again, and introduce some new problems that weren't in the original version.

很重要的一点要记住:没有任何迹象表明如果你从头开始写就一定会比开始写的更好。 首先,你可能没有你写第一个版本时候的那个团队, 所以你实际上没有“更多的经验”。 你会再次犯所有那些旧的错误, 然后还会引入旧的版本里没有的那些新问题。

The old mantra build one to throw away is dangerous when applied to large scale commercial applications. If you are writing code experimentally, you may want to rip up the function you wrote last week when you think of a better algorithm. That's fine. You may want to refactor a class to make it easier to use. That's fine, too. But throwing away the whole program is a dangerous folly, and if Netscape actually had some adult supervision with software industry experience, they might not have shot themselves in the foot so badly.

老口头禅说:“搭个新的然后扔掉”, 应用到大规模的商业软件上这是很危险的。 如果你只是写个实验性质的代码,你可能像扯掉你上个礼拜写的代码因为你想到了一个新的算法。 这没关系。 你想重构一个类让它更好用,也没关系。 但是扔掉整个程序则是很愚蠢的危险行为, 如果网景公司有一个软件行业实际经验的高管的花, 他们也许就不会这么严重的自杀了。