JoelOnSoftware

Joel on Software

The Law of Leaky Abstractions

抽象失效定律

by Joel Spolsky Monday, November 11, 2002

There's a key piece of magic in the engineering of the Internet which you rely on every single day. It happens in the TCP protocol, one of the fundamental building blocks of the Internet.

你每天依赖的因特网工程学上有种关键的魔法力量。 这种力量存在于TCP协议之中,也是构筑起因特网的基石。

TCP is a way to transmit data that is reliable. By this I mean: if you send a message over a network using TCP, it will arrive, and it won't be garbled or corrupted.

TCP是可靠传输数据的方式,这样说的意思是: 如果你在网络上用TCP协议传输一则消息, 消息最终会送达,而不会丢失或者损坏。

We use TCP for many things like fetching web pages and sending email. The reliability of TCP is why every exciting email from embezzling East Africans arrives in letter-perfect condition. O joy.

我们使用TCP做很多事情:查看网页,发送email。 TCP的稳定性保证了从东非海盗湾来的电子有点都一字不阿查。 哦真开心。

By comparison, there is another method of transmitting data called IP which is unreliable. Nobody promises that your data will arrive, and it might get messed up before it arrives. If you send a bunch of messages with IP, don't be surprised if only half of them arrive, and some of those are in a different order than the order in which they were sent, and some of them have been replaced by alternate messages, perhaps containing pictures of adorable baby orangutans, or more likely just a lot of unreadable garbage that looks like the subject line of Taiwanese spam.

相比之下,还有一种传输数据的方式叫做IP则是不可靠的。 没有保证说数据一定会到达, 在到达之前可能就弄错了。 如果你用IP发一堆消息,如果只有一半到达了也不要感到很吃惊,其中有且到达的顺序跟发送相比还不同, 有一些消息被其他的消息代替了,例如包含着可爱的婴儿猩猩的照片, 或者更可能的是一大堆无法阅读的垃圾,看起来就像台语写的垃圾邮件的标题。

Here's the magic part: TCP is built on top of IP. In other words, TCP is obliged to somehow send data reliably using only an unreliable tool.

神奇的部分是:TCP构建与IP之上。 换句话说, TCP能够在用不可靠的工具发送可靠的数据。

To illustrate why this is magic, consider the following morally equivalent, though somewhat ludicrous, scenario from the real world.

为了解释为什么有这种魔法,请你考虑以下观念上相同但是有点儿滑稽的现实场景。

Imagine that we had a way of sending actors from Broadway to Hollywood that involved putting them in cars and driving them across the country. Some of these cars crashed, killing the poor actors. Sometimes the actors got drunk on the way and shaved their heads or got nasal tattoos, thus becoming too ugly to work in Hollywood, and frequently the actors arrived in a different order than they had set out, because they all took different routes. Now imagine a new service called Hollywood Express, which delivered actors to Hollywood, guaranteeing that they would (a) arrive (b) in order (c) in perfect condition. The magic part is that Hollywood Express doesn't have any method of delivering the actors, other than the unreliable method of putting them in cars and driving them across the country. Hollywood Express works by checking that each actor arrives in perfect condition, and, if he doesn't, calling up the home office and requesting that the actor's identical twin be sent instead. If the actors arrive in the wrong order Hollywood Express rearranges them. If a large UFO on its way to Area 51 crashes on the highway in Nevada, rendering it impassable, all the actors that went that way are rerouted via Arizona and Hollywood Express doesn't even tell the movie directors in California what happened. To them, it just looks like the actors are arriving a little bit more slowly than usual, and they never even hear about the UFO crash.

想象我们有某种方式把演员从百老汇送到好莱坞,先把他们塞入汽车,然后横穿整个美国。这当中有些汽车会撞毁,害死那些可怜的演员;有时候有些演员会在半路上喝醉,然后剃头甚至给鼻子刺上纹身,这样就变得太丑而不能在好莱坞工作了;大多数情况下,因为走的路线不同,演员们会以出发的时候不同的顺序到达。现在来想象一下一项叫做好莱坞快递的服务,专门将演员运送到好莱坞,并且保证:一:送达,二:顺序 三:完好无损。神奇的地方就是:除了演员们塞进汽车然后横穿美国之外好莱坞快递并没有任何方法来运这些演员。好莱坞快递在演员们被送达的时候会检查每个演员是否处于毫发无损的状态,如果不是的话,就会打电话给办公室,再送演员的孪生双胞胎过来;如果演员们以错误的顺序抵达,好莱坞快递就会重新安排它们的顺序。如果一个大型的UFO在前往内华达州51区高速公路上坠毁,使得该条公路无法通过,那么走该条路线的演员们就会重新选亚利桑那的路线,好莱坞快递甚至都不会告诉加利福尼亚的电影导演发生了什么。对他们而言,演员们只是看起来比平常晚了一些,他们甚至根本不会知道什么UFO坠毁了。

That is, approximately, the magic of TCP. It is what computer scientists like to call an abstraction: a simplification of something much more complicated that is going on under the covers. As it turns out, a lot of computer programming consists of building abstractions. What is a string library? It's a way to pretend that computers can manipulate strings just as easily as they can manipulate numbers. What is a file system? It's a way to pretend that a hard drive isn't really a bunch of spinning magnetic platters that can store bits at certain locations, but rather a hierarchical system of folders-within-folders containing individual files that in turn consist of one or more strings of bytes.

这大概就是tcp的魔法,也就是计算机科学家喜欢提的抽象。所谓抽象,就是对内部发生的更加复杂的事情的一种简化。结果。,大量的计算机编程就包含了构建抽象。什么是字符串库?就是一种假设计算机能够像处理数字一样容易地处理字符串的方式。什么是文件系统?就是假设硬盘驱动器并不仅仅是一堆能够在特定的位置存储比特的旋转磁盘,而是一种文件夹包含文件夹包含单独的文件进而包含一个或多个字符串比特的分级式系统。

Back to TCP. Earlier for the sake of simplicity I told a little fib, and some of you have steam coming out of your ears by now because this fib is driving you crazy. I said that TCP guarantees that your message will arrive. It doesn't, actually. If your pet snake has chewed through the network cable leading to your computer, and no IP packets can get through, then TCP can't do anything about it and your message doesn't arrive. If you were curt with the system administrators in your company and they punished you by plugging you into an overloaded hub, only some of your IP packets will get through, and TCP will work, but everything will be really slow.

回到tcp的话题。前面问的简单起见我撒了个谎,这样有些人可能会怒气冲天,因为这个谎言可能要把你逼疯了。我说tcp保证你的消息会送达,其实不是。如果你的宠物蛇卡在通往你计算机的网络电缆里,那么没有ip包能够通过,对此tcp也无能为力,你的消息就无法送达。公司的系统管理员给你穿小鞋,他们把你的网络电缆插入到一个过载的集线器,那么你的ip包还是能通过,tcp还能工作,但是所有的事情都会变得非常慢。

This is what I call a leaky abstraction. TCP attempts to provide a complete abstraction of an underlying unreliable network, but sometimes, the network leaks through the abstraction and you feel the things that the abstraction can't quite protect you from. This is but one example of what I've dubbed the Law of Leaky Abstractions:

这就是我说的有纰漏的抽象。Tcp尝试给内在不可靠的网络提供一个完整的抽象,而网络有的时候能脱离这种抽象,然后你就能发现那些抽象并不能保证的东西。这不过是我发明的纰漏抽象定律的一个例子而已:

All non-trivial abstractions, to some degree, are leaky.

所有的非基本抽象,在一定程度上,都是有纰漏的。

Abstractions fail. Sometimes a little, sometimes a lot. There's leakage. Things go wrong. It happens all over the place when you have abstractions. Here are some examples.

抽象会失效。有时候多点,有时少点,但总归有纰漏。东西会出错。所有有抽象的地方都会发生这种情。下面是几个例子:

  • Something as simple as iterating over a large two-dimensional array can have radically different performance if you do it horizontally rather than vertically, depending on the "grain of the wood" -- one direction may result in vastly more page faults than the other direction, and page faults are slow. Even assembly programmers are supposed to be allowed to pretend that they have a big flat address space, but virtual memory means it's really just an abstraction, which leaks when there's a page fault and certain memory fetches take way more nanoseconds than other memory fetches.
  • 哪怕是遍历二维数组这么简单的事情,如果你是横着做,而不是竖着做,都会出现极端不同的性能表现。这取决于“木头的纹路” --一个方向相比另外一个方向会导致大量的缺页错误,而缺页中断则非常慢。哪怕汇编程序员都可以假设有很大的线性地址空间,但对于虚拟内存来讲那只是一个抽象,这种抽象在产生缺页中断或者内存存取相比其他普通的内存存取远超纳秒级别的时候会失效。
  • The SQL language is meant to abstract away the procedural steps that are needed to query a database, instead allowing you to define merely what you want and let the database figure out the procedural steps to query it. But in some cases, certain SQL queries are thousands of times slower than other logically equivalent queries. A famous example of this is that some SQL servers are dramatically faster if you specify "where a=b and b=c and a=c" than if you only specify "where a=b and b=c" even though the result set is the same. You're not supposed to have to care about the procedure, only the specification. But sometimes the abstraction leaks and causes horrible performance and you have to break out the query plan analyzer and study what it did wrong, and figure out how to make your query run faster.
  • SQL是想抽象那些查询数据库所需的具体步骤,这样你就只需要弄清楚你要什么然后让数据库来搞清楚查询这些数据所需的具体步骤。但是在某些情况下,有一些sql查询会比逻辑上相等的其他查询慢一千倍。一个著名的例子就是在一些sql服务器上面如果你规定"where a=b and b=c and a=c"要比你只归定"where a=b and b=c"快很多,哪怕结果其实是一样的。你本不应该关心步骤只需要关心数据规范。但是有的时候抽象是有纰漏的,并且给你造成了可怕的性能问题。你只好打断查询计划分析器,然后研究他哪做错了,并且搞清楚如何让你的查询运行更快。
  • Even though network libraries like NFS and SMB let you treat files on remote machines "as if" they were local, sometimes the connection becomes very slow or goes down, and the file stops acting like it was local, and as a programmer you have to write code to deal with this. The abstraction of "remote file is the same as local file" leaks. Here's a concrete example for Unix sysadmins. If you put users' home directories on NFS-mounted drives (one abstraction), and your users create .forward files to forward all their email somewhere else (another abstraction), and the NFS server goes down while new email is arriving, the messages will not be forwarded because the .forward file will not be found. The leak in the abstraction actually caused a few messages to be dropped on the floor.
  • 即使NFS和SMB这样的网络程序库能够你像处理本地文件一样处理远程机器上的文,有的时候网络连接会变得非常慢甚至断开,网络文件就开始变得不像本地文件一样,然后作为程序员就必须写代码来处理这种情况。“远程文件像本地文件一样”这种抽象就会失效。这里有一个Unix系统管理的具体例子:如果你把用户的home文件夹放在NFS映射的网络驱动(一层抽象)上面,然后你的用户创建了.forward文件来把他们所有的电子邮件转发到其他地方(另一个抽象),然后NFS服务器宕机的新的电子邮件到达了,这样电子邮件不会被转发因为无法发现.forward文件。抽象的这种纰漏实际上导致一些信息被丢在了地板上。
  • C++ string classes are supposed to let you pretend that strings are first-class data. They try to abstract away the fact thatstrings are hard and let you act as if they were as easy as integers. Almost all C++ string classes overload the + operator so you can write s + "bar" to concatenate. But you know what? No matter how hard they try, there is no C++ string class on Earth that will let you type "foo" + "bar", because string literals in C++ are always char*'s, never strings. The abstraction has sprung a leak that the language doesn't let you plug. (Amusingly, the history of the evolution of C++ over time can be described as a history of trying to plug the leaks in the string abstraction. Why they couldn't just add a native string class to the language itself eludes me at the moment.)
  • C++的字符串类是想能够让你处理第一类数据那样处理字符串。他们想尝试着抽象掉字符串困难的那一面,想你觉得字符串跟整数一样简单。几乎所有的C++字符串类都会重载+号运算符以便让你能够书写。S+“bar” 来实现字符串拼接。但你知道吗?不管他们多努力地去尝试,世界上都没有c++字符串类能够让你输入“foo”+“bar”,因为在c++里字符串常数都是char*类型的,而不是字符串类型。这种抽象有一种语言不让你实现的弹性漏洞(搞笑的是,c++随时间进化的历史可以描述成尝试着往字符串抽象插入漏洞的历史。他们为什么不在语言本身就加入一个原始的字符串类型在此刻让我陷入了沉思)
  • And you can't drive as fast when it's raining, even though your car has windshield wipers and headlights and a roof and a heater, all of which protect you from caring about the fact that it's raining (they abstract away the weather), but lo, you have to worry about hydroplaning (or aquaplaning in England) and sometimes the rain is so strong you can't see very far ahead so you go slower in the rain, because the weather can never be completely abstracted away, because of the law of leaky abstractions.
  • 就像下雨天的时候你不能够开得很快,哪怕你的汽车有雨刮器,有头灯,有顶棚,有取暖器,所有的这些都能够让你不用去思考现在外面在下雨(他们抽象了下雨),但是捏,担心下雨打滑(英语里叫aquaplaning),有的时候雨下的如此之大以至于你不能够看到你前方很远的地方,你只好在雨中慢下来。因为纰漏抽象定理,天气永远不能够被抽象掉。

One reason the law of leaky abstractions is problematic is that it means that abstractions do not really simplify our lives as much as they were meant to. When I'm training someone to be a C++ programmer, it would be nice if I never had to teach them about char's and pointer arithmetic. It would be nice if I could go straight to STL strings. But one day they'll write the code "foo" + "bar", and truly bizarre things will happen, and then I'll have to stop and teach them all about char's anyway. Or one day they'll be trying to call a Windows API function that is documented as having an OUT LPTSTR argument and they won't be able to understand how to call it until they learn about char*'s, and pointers, and Unicode, and wchar_t's, and the TCHAR header files, and all that stuff that leaks up.

抽象定理有问题的一个原因就在于:他们不像看上去那样能够极大的简化我们的生活。当我在培训一个人成为c++程序员的时候,如果可以不用教他们char和指针运算似乎很好,我可以直接教他们使用STL字符串。但是有一天刚开始写“foo”+“bar” 这样的代码的时候就会发生奇怪的事情,然后我还是得停下来教他们Char。或者有一天他们会去尝试调用一个WindowsAPI函数,函数的文档上写着该函数有一个OUT LPTSTR的参数,然后他们就不明白如何调用这个函数,直到他们学习了char*,指针,Unicode wchar_t,以及TCHAR的头文件,以及所有抽象遗漏的地方。

In teaching someone about COM programming, it would be nice if I could just teach them how to use the Visual Studio wizards and all the code generation features, but if anything goes wrong, they will not have the vaguest idea what happened or how to debug it and recover from it. I'm going to have to teach them all about IUnknown and CLSIDs and ProgIDS and ... oh, the humanity!

在教某人关于com编程的时候,如果我能够只教他们如何使用VisualStudio向导以及代码自动生成的功能会很方便,但是一旦有任何事情出错,他们就浑然不知发生了什么?或者如何去调试,如何从该错误中恢复。然后我还是得教他们所有关于Iunkown CLSID ProgID… 天哪还有人性么!

In teaching someone about ASP.NET programming, it would be nice if I could just teach them that they can double-click on things and then write code that runs on the server when the user clicks on those things. Indeed ASP.NET abstracts away the difference between writing the HTML code to handle clicking on a hyperlink () and the code to handle clicking on a button. Problem: the ASP.NET designers needed to hide the fact that in HTML, there's no way to submit a form from a hyperlink. They do this by generating a few lines of JavaScript and attaching an onclick handler to the hyperlink. The abstraction leaks, though. If the end-user has JavaScript disabled, the ASP.NET application doesn't work correctly, and if the programmer doesn't understand what ASP.NET was abstracting away, they simply won't have any clue what is wrong.

在教某人关于asp.net编程的时候,如果能够只教他们如何在某样东西上面双击然后编写用户点击的时候会在服务器上运行的代码会很方便。 实际上ASP.NET抽象掉了编写HTML代码处理超链接()点击和处理按钮点击的不同指出。 问题在于:ASP.NET的设计者需要抽象掉在html里面没有办法使用超链接提交表单这种事情。通过给超链接的onclick处理函数添加一小段JS代码来实现一种抽象。而如果终端用户如果禁止使用JS,那么这种抽象就会失效,ASP.NET程序就无法正确工作。如果程序员不理解ASP.NET的这种抽象,他们就完全不知道什么错了。

The law of leaky abstractions means that whenever somebody comes up with a wizzy new code-generation tool that is supposed to make us all ever-so-efficient, you hear a lot of people saying "learn how to do it manually first, then use the wizzy tool to save time." Code generation tools which pretend to abstract out something, like all abstractions, leak, and the only way to deal with the leaks competently is to learn about how the abstractions work and what they are abstracting. So the abstractions save us time working, but they don't save us time learning.

抽象失效定律也意味着:不管什么时候什么人发明一种超级棒的新的代码生成工具来让所有人变得高效无比的时候,你经常会听到别人说:“先学习如何手动完成这件事情,然后用伟大的工具来节省时间。”代码生成工具就像所有的抽象那样,先假定能够抽象的一些问题,然后所有的抽象那样也会失效,要处理这种失效的唯一办法就是去学习抽象是如何工作的,什么东西被抽象掉了。这样,抽象确实节省了我们的工作时间,但是他们不能节省我们学习的时间。

And all this means that paradoxically, even as we have higher and higher level programming tools with better and better abstractions, becoming a proficient programmer is getting harder and harder.

然而矛盾的是。哪怕我们有着越来越高级的编程工具,越来越好的抽象,要成为一个熟练的程序员却越来越难。

During my first Microsoft internship, I wrote string libraries to run on the Macintosh. A typical assignment: write a version of strcat that returns a pointer to the end of the new string. A few lines of C code. Everything I did was right from K&R -- one thin book about the C programming language.

我在微软第一份实习的时候,曾经写了一个运行在mac上的字符串库。很典型的作业: 编写一个strcat函数要能返回新拼接字符串的末尾的指针。 只用了几行的C代码。所有的操作都源自于K&R那本很薄的《C语言编程》。

Today, to work on CityDesk, I need to know Visual Basic, COM, ATL, C++, InnoSetup, Internet Explorer internals, regular expressions, DOM, HTML, CSS, and XML. All high level tools compared to the old K&R stuff, but I still have to know the K&R stuff or I'm toast.

今天,工作在CityDesk上的时候,我需要知道:VB,COM,ATL,C++, InnoSetup, IE内部调用,正则表达式, DOM,HTML,CSS和XML。相比于K&R的所有这些高级工具,到我还是得知道K&R否则我就完了。

Ten years ago, we might have imagined that new programming paradigms would have made programming easier by now. Indeed, the abstractions we've created over the years do allow us to deal with new orders of complexity in software development that we didn't have to deal with ten or fifteen years ago, like GUI programming and network programming. And while these great tools, like modern OO forms-based languages, let us get a lot of work done incredibly quickly, suddenly one day we need to figure out a problem where the abstraction leaked, and it takes 2 weeks. And when you need to hire a programmer to do mostly VB programming, it's not good enough to hire a VB programmer, because they will get completely stuck in tar every time the VB abstraction leaks.

十年以前,我们想象会出现新的编程范式,让我们今天的编程变得更容易。事实上十年来我们开发出的新的抽象确实能够让我们处理在十年之前或者15年之前不需要面对的软件开发复杂度,就像用户界面编程和网络编程。然而正当这些伟大的工具,就像现代基于面向对象的语言让我们确实能够很快的完成手头工作的时候。今天我们还是得处理抽象失效的一个问题,然后这个问题需要花两个礼拜。当你需要雇一个程序员来做大部分VB编程的时候,单单雇一个只会VB的程序员是不够的,因为每当VB抽象失效的时候,他们就会完全卡壳。

The Law of Leaky Abstractions is dragging us down.

抽象失效定律拖了我们的后腿。