Joel on Software
Daily Builds Are Your Friend
by Joel Spolsky Saturday, January 27, 2001
In 1982, my family took delivery of the very first IBM-PC in Israel. We actually went down to the warehouse and waited while our PC was delivered from the port. Somehow, I convinced my dad to get the fully-decked out version, with two floppy disks, 128 K memory, and both a dot-matrix printer (for fast drafts) and a Brother Letter-Quality Daisy Wheel printer, which sounds exactly like a machine gun when it is operating, only louder. I think we got almost every accessory available: PC-DOS1.0 the 75$ technical reference manual with a complete source code listing of the BIOS, Macro Assembler, and the stunning IBM Monochrome display with a full 80 columns and . lower case letters! The whole thing cost about $10,000 including Israel's then-ridiculous import taxes. Extravagant!
1982年,我家在以色列提取了第一台IBM-PC。实际上,我们电脑从港口交货的时候,我们全家跑去了仓库并在那儿等着。我设法说法了我爸购买了完整版,两张软盘,128K内存,还有一个点阵打印机(快速打印草稿)和打印书信的菊轮式打印机,菊轮打印机工作的时候就像机关枪一样,甚至更响。我觉得我们买齐了几乎所有能买到的附件:PC-DOS1.0 75美元的技术参考手册以及完整BIOS,宏汇编源代码,以及IBM80列黑白显示器,还有…小写字母!包含以色列那个时候可笑的进口税总共花了约10,000$。太奢侈了!
Now, "everybody" knew that BASIC was a children's language that requires you to write spaghetti code and turns your brain into Camembert cheese. So we shelled out $600 for IBM Pascal, which came on three floppy diskettes. The compiler's first pass was on the first diskette, the second pass was on the second diskette, and the linker was on the third diskette. I wrote a simple "hello, world" program and compiled it. Total time elapsed: 8 minutes.
现在,几乎所有人都懂得BASIC是小朋友语言,它让你写出意大利面条一样的代码,然后把你的脑袋变成卡门佩尔奶酪。所以我们掏了600$买了IBM Pascal,分三张软盘。 编译器的第一遍程序放在第一张软盘上, 第二遍程序放在第二张软盘上, 连接器放在第三章软盘上。 我写的简单的“hello World”程序,编译了一下,总共花费:8分钟。
Hmm. That's a long time. I wrote a batch file to automate the process and shaved it down to 7 1/2 minutes. Better. But when I tried to write long programs like my stunning version of Othello which always beat me, I spent most of the time waiting for compiles. "Yep," a professional programmer told me, "we used to keep a sit-up board in the office and do sit-ups while we were doing compiles. After a few months of programming I had killer abs."
恩,那是很长的时间了。我写了个批处理来自动处理这个过程,把时间节省到7个半分钟。好点儿了。但当我尝试去写特别长的程序 就像那个总是能打败我的令人惊奇的“奥赛罗”的时候,大部分的时间都花在等待编译上。 “是的,”一个专业的程序员告诉我,“我们曾经在办公室里放了块仰卧起坐练习板,编译的时候就去联系仰卧起坐。在编了几个月程序之后就练就了杀手腹肌。”
One day, a spiffy program called Compas Pascal appeared from Denmark, which Philippe Kahn bought and renamed Borland Turbo Pascal. Turbo Pascal was sort of shocking, since it basically did everything that IBM Pascal did, only it ran in about 33K of memoryincluding the text editor. This was nothing short of astonishing. Even more astonishing was the fact that you could compile a small program in less than one second. It's as if a company you had never heard of introduced a clone of the Buick LeSabre which could go 1,000,000 MPH and drive around the world on so little gasoline than an ant could drink it without getting sick.
有一天,丹麦出现了一个叫做Compas Pascal的漂亮程序, Philippe Kahn买下来并将其改名为Borland Turbo Pascal。 Turbo Pascal有点让人震惊,因为它基本上包含IBM Pascal的所有功能,但是仅花费33K内存而且还包含文本编辑器。 这实在是无法不让人震惊。 更让人震惊的是你可以在不到一秒之内编译一个小程序。这就好像你从来没听过的小公司发明了别克LeSabre的克隆体,能够开到1,000,000英里/小时 然后即使绕着世界开一圈也才消耗很少汽油, 比蚂蚁吃了生病的剂量都少 。
Suddenly, I became much more productive.
我突然就变得很有效率了。
That's when I learned about the concept of the REP loop. REP stands for "Read, Eval, Print", and it describes what a lisp interpreter does for a living: it "reads" your input, evaluates it, and prints the result. An example of the REP loop is shown below: I type something, and the lisp interpreter reads it, evaluates it, and prints the result.
那是当我第一次听说REP循环这个概念的时候。REP就是:“READ,EVAL,PRINT”(阅读,演算,打印),它描述了LISP解释器的生存哲学:“它‘阅读’你的输入,演算,然后打印结果。 REP循环的一个例子如下所示:我输入了些东西,然后LISP解释器读了这些输入,运行,然后打印出结果。
On a slightly larger scale, when you're writing code, you are in a macro-version of the REP loop called the Edit-Compile-Test loop. You edit your code, compile it, test it, and see how well it works.
从稍微大一点的尺度来讲,当你在写代码的时候,你就在一个宏观版本的REP循环里,叫做 编辑-编译-测试 循环。 当你编辑你的代码,编译,测试来证实它到底工作得怎样。
A crucial observation here is that you have to run through the loop again and again to write a program, and so it follows that the faster the Edit-Compile-Test loop, the more productive you will be, down to a natural limit of instantaneous compiles. That's the formal, computer-science-y reason that computer programmers want really fast hardware and compiler developers will do anything they can to get super-fast Edit-Compile-Test loops. Visual Basic does it by parsing and lex-ing each line as you type it, so that the final compile can be super-quick. Visual C++ does it by providing incremental compiles, precompiled headers, and incremental linking.
这里的关键点是:你在写程序的时候必须一遍又一遍的执行这个循环,结果就是你执行 编辑-编译-测试 越快 你的效率也就越高,直到瞬间编译的自然极限。 这就是正式的,计算机科学式的理由为什么程序员要非常快的硬件和编译器。 开发者只要能得到超级快的 编辑-编译-测试 循环愿意做任何事情。 VisualBasic通过在你输入的同时进行语法解析和词法分析来达到这一点, 这样最终编译的时候就能够超级快。 Visual C++ 则是通过增量编译, 预编译头, 增量链接来达到这一点。
But as soon as you start working on a larger team with multiple developers and testers, you encounter the same loop again, writ larger (it's fractal, dude!). A tester finds a bug in the code, and reports the bug. The programmer fixes the bug. How long does it take before the tester gets the fixed version of the code? In some development organizations, this Report-Fix-Retest loop can take a couple of weeks, which means the whole organization is running unproductively. To keep the whole development process running smoothly, you need to focus on getting the Report-Fix-Retest loop tightened.
但是一旦你开始在大一点儿的团队,团队里有多个开发人员和测试人员,你还是会遇到相同的循环,更大而已(朋友,这就跟分形一样)。测试人员发现了代码的错误,报告了错误。程序员修复错误。测试人员获得修正的代码要多久?在有些开发机构里,这个 报告-修正-重新测试 循环能花上几个礼拜, 这意味着整个组织的运行都很低效。 为了让整个开发过程运行得很平滑, 你应该专注于精简 报告-修正-重新测试 循环。
One good way to do this is with daily builds. A daily build is anautomatic, daily, complete build of the entire source tree.
一个实现这个目标的好的方法就是每日构建。 每日构建是自动的,每天完整的构建源代码树。
Automatic - because you set up the code to be compiled at a fixed time every day, using cron jobs (on UNIX) or the scheduler service (on Windows).
自动 – 因为你设定每天都在固定的时间来编译代码, 在UNIX上用Cron 或者在Windows是就是用计划任务。
Daily - or even more often. It's tempting to do continuous builds, but you probably can't, because of source control issues which I'll talk about in a minute.
每天 – 或者更频繁。 很容易就想做持续集成,但你可能没法做到,因为一个我马上要讨论的话题。
Complete - chances are, your code has multiple versions. Multiple language versions, multiple operating systems, or a high-end/low-end version. The daily build needs to build all of them. And it needs to build every file from scratch, not relying on the compiler's possibly imperfect incremental rebuild capabilities.
完整 – 有可能你的代码有多个版本。 多种语言版本, 多种操作系统, 或者 高端/低端 版本。 每日构建应该要构建所有的这些版本。 而且必须是从头构建每一个文件, 而不是依赖于编译器的可能并不完美的增量编译功能。
Here are some of the many benefits of daily builds:
下面是每日构建的一些优点:
- When a bug is fixed, testers get the new version quickly and can retest to see if the bug was really fixed.
- Developers can feel more secure that a change they made isn't going to break any of the 1024 versions of the system that get produced, without actually having an OS/2 box on their desk to test on.
- Developers who check in their changes right before the scheduled daily build know that they aren't going to hose everybody else by checking in something which "breaks the build" -- that is, something that causes nobody to be able to compile. This is the equivalent of the Blue Screen of Death for an entire programming team, and happens a lot when a programmer forgets to add a new file they created to the repository. The build runs fine ontheir machines, but when anyone else checks out, they get linker errors and are stopped cold from doing any work.
- Outside groups like marketing, beta customer sites, and so forth who need to use the immature product can pick a build that is known to be fairly stable and keep using it for a while.
- By maintaining an archive of all daily builds, when you discover a really strange, new bug and you have no idea what's causing it, you can use binary search on the historical archive to pinpoint when the bug first appeared in the code. Combined with good source control, you can probably track down which check-in caused the problem.
- When a tester reports a problem that the programmer thinks is fixed, the tester can say which build they saw the problem in. Then the programmer looks at when he checked in the fix and figure out whether it's really fixed.
Here's how to do them. You need a daily build server, which will probably be the fastest computer you can get your hands on. Write a script which checks out a complete copy of the current source code from the repository (you are using source control, aren't you?) and then builds, from scratch, every version of the code that you ship. If you have an installer or setup program, build that too. Everything you ship to customers should be produced by the daily build process. Put each build in its own directory, coded by date. Run your script at a fixed time every day.
下面是如何进行每日构建。 你需要一个每日构建服务器, 这个服务器最好得是你手边能获得的最快的电脑。 写个脚本从当前的代码仓库完整的检出所有的当前源代码(你用了代码管理,是吧?)然后从头开始构建,包括你递交的每一版代码。如果你有个安装程序,同时也要构建这个安装程序。你要发布给客户的所有东西都应该由每日构建过程产生。把每次的构建结果放在不同的文件夹下,以日期编码。然后在每天固定的时候运行你的代码。
- It's crucial that everything it takes to make a final build is done by the daily build script, from checking out the code up to and including putting the bits up on a web server in the right place for the public to download (although during the development process, this will be a test server, of course). That's the only way to insure that there is nothing about the build process that is only "documented" in one person's head. You never get into a situation where you can't release a product because only Shaniqua knows how to create the installer, and she was hit by a bus. On the Juno team, the only thing you needed to know to create a full build from scratch was where the build server was, and how to double-click on its "Daily Build" icon.
- There is nothing worse for your sanity then when you are trying to ship the code, and there's one tiny bug, so you fix that one tiny bug right on the daily build server and ship it. As a golden rule, you should only ship code that has been produced by a full, clean daily build that started from a complete checkout.
- Set your compilers to maximum warning level (-W4 in Microsoft's world; -Wall in gcc land) and set them to stop if they encounter even the smallest warning.
- If a daily build is broken, you run the risk of stopping the whole team. Stop everything and keep rebuilding until it's fixed. Some days, you may have multiple daily builds.
- Your daily build script should report failures, via email, to the whole development team. It's not too hard to grep the logs for "error" or "warning" and include that in the email, too. The script can also append status reports to an HTML page visible to everyone so programmers and testers can quickly determine which builds were successful.
- One rule we followed on the Microsoft Excel team, to great effect, was that whoever broke the build became responsible for babysitting builds until somebody else broke it. In addition to serving as a clever incentive to keep the build working, it rotated almost everybody through the job of buildmaster so everybody learned about how builds are produced.
- If your team works in one time zone, a good time to do builds is at lunchtime. That way everybody checks in their latest code right before lunch, the build runs while they're eating, and when they get back, if the build is broken, everybody is around to fix it. As soon as the build is working everybody can check out the latest version without fear that they will be hosed due to a broken build.
- If your team is working in two time zones, schedule the daily build so that the people in one time zone don't hose the people in the other time zone. On the Juno team, the New York people would check things in at 7 PM New York time and go home. If they broke the build, the Hyderabad, India team would get into work (at about 8 PM New York Time) and be completely stuck for a whole day. We started doing two daily builds, about an hour before each team went home, and completely solved that problem.
For Further Reading:
更多阅读。
- Some discussion on tools for daily builds
- Making daily builds is important enough that it's one of the 12 steps to better code.
- There's a lot of interesting stuff about the builds made (weekly) by the Windows NT team in G. Pascal Zachary's bookShowstopper.
- Steve McConnell writes about daily builds here.