diff --git a/docs/ch04.md b/docs/ch04.md index fc3ac9bf..0df4c7c3 100644 --- a/docs/ch04.md +++ b/docs/ch04.md @@ -18,9 +18,9 @@ 不幸的是,这种理想是无法实现的。模块必须通过调用彼此的函数或方法来协同工作。结果,***模块必须相互了解。模块之间将存在依赖关系***:如果一个模块发生更改,则可能需要更改其他模块以进行匹配。例如,方法的参数在方法本身与调用该方法的任何代码之间创建了依赖关系。如果更改了要求的参数,则必须修改该方法的所有调用以符合新的签名。依赖关系可以采用许多其他形式,并且它们可能非常微妙。模块化设计的目标是最大程度地 ***减少模块之间的依赖性***。 -> In order to manage dependencies, we think of each module in two parts: an interface and an *implementation*. The interface consists of everything that a developer working in a different module must know in order to use the given module. Typically, the interface describes what the module does but not how it does it. The implementation consists of the code that carries out the promises made by the interface. A developer working in a particular module must understand the interface and implementation of that module, plus the interfaces of any other modules invoked by the given module. A developer should not need to understand the implementations of modules other than the one he or she is working in. +> In order to manage dependencies, we think of each module in two parts: an *interface* and an *implementation*. The interface consists of everything that a developer working in a different module must know in order to use the given module. Typically, the interface describes what the module does but not how it does it. The implementation consists of the code that carries out the promises made by the interface. A developer working in a particular module must understand the interface and implementation of that module, plus the interfaces of any other modules invoked by the given module. A developer should not need to understand the implementations of modules other than the one he or she is working in. -为了管理依赖关系,我们将每个模块分为两个部分:接口和实现。接口包含在其他模块上工作的开发人员在使用这个模块时必须知道的所有内容。通常,接口描述模块做什么,而不描述模块如何做。实现则由承载接口承诺的代码组成。在特定模块中工作的开发人员必须了解该模块的接口和实现,以及由该模块调用的任何其他模块的接口。除了正在其中工作的模块,开发人员应该无需了解其他模块的实现。 +为了管理依赖关系,我们将每个模块分为两个部分:*接口* 和 *实现*。接口包含在其他模块上工作的开发人员在使用这个模块时必须知道的所有内容。通常,接口描述模块做什么,而不描述模块如何做。实现则由承载接口承诺的代码组成。在特定模块中工作的开发人员必须了解该模块的接口和实现,以及由该模块调用的任何其他模块的接口。除了正在其中工作的模块,开发人员应该无需了解其他模块的实现。 > Consider a module that implements balanced trees. The module probably contains sophisticated code for ensuring that the tree remains balanced. However, this complexity is not visible to users of the module. Users see a relatively simple interface for invoking operations to insert, remove, and fetch nodes in the tree. To invoke an insert operation, the caller need only provide the key and value for the new node; the mechanisms for traversing the tree and splitting nodes are not visible in the interface. diff --git a/docs/en/ch04.md b/docs/en/ch04.md index 5bba6b7b..ff46876c 100644 --- a/docs/en/ch04.md +++ b/docs/en/ch04.md @@ -8,7 +8,7 @@ In modular design, a software system is decomposed into a collection of modules Unfortunately, this ideal is not achievable. Modules must work together by calling each others’s functions or methods. As a result, modules must know something about each other. There will be dependencies between the modules: if one module changes, other modules may need to change to match. For example, the arguments for a method create a dependency between the method and any code that invokes the method. If the required arguments change, all invocations of the method must be modified to conform to the new signature. Dependencies can take many other forms, and they can be quite subtle. The goal of modular design is to minimize the dependencies between modules. -In order to manage dependencies, we think of each module in two parts: an interface and an *implementation*. The interface consists of everything that a developer working in a different module must know in order to use the given module. Typically, the interface describes what the module does but not how it does it. The implementation consists of the code that carries out the promises made by the interface. A developer working in a particular module must understand the interface and implementation of that module, plus the interfaces of any other modules invoked by the given module. A developer should not need to understand the implementations of modules other than the one he or she is working in. +In order to manage dependencies, we think of each module in two parts: an *interface* and an *implementation*. The interface consists of everything that a developer working in a different module must know in order to use the given module. Typically, the interface describes what the module does but not how it does it. The implementation consists of the code that carries out the promises made by the interface. A developer working in a particular module must understand the interface and implementation of that module, plus the interfaces of any other modules invoked by the given module. A developer should not need to understand the implementations of modules other than the one he or she is working in. Consider a module that implements balanced trees. The module probably contains sophisticated code for ensuring that the tree remains balanced. However, this complexity is not visible to users of the module. Users see a relatively simple interface for invoking operations to insert, remove, and fetch nodes in the tree. To invoke an insert operation, the caller need only provide the key and value for the new node; the mechanisms for traversing the tree and splitting nodes are not visible in the interface. @@ -68,8 +68,6 @@ A modern implementation of the Unix I/O interface requires hundreds of thousands - How can recently accessed file data be cached in memory in order to reduce the number of disk accesses? - How can a variety of different secondary storage devices, such as disks and flash drives, be incorporated into a single file system? ---- - All of these issues, and many more, are handled by the Unix file system implementation; they are invisible to programmers who invoke the system calls. Implementations of the Unix I/O interface have evolved radically over the years, but the five basic kernel calls have not changed. Another example of a deep module is the garbage collector in a language such as Go or Java. This module has no interface at all; it works invisibly behind the scenes to reclaim unused memory. Adding garbage collection to a system actually shrinks its overall interface, since it eliminates the interface for freeing objects. The implementation of a garbage collector is quite complex, but that complexity is hidden from programmers using the language. diff --git a/docs/en/ch05.md b/docs/en/ch05.md index 252c272e..a7977435 100644 --- a/docs/en/ch05.md +++ b/docs/en/ch05.md @@ -14,8 +14,6 @@ The information hidden within a module usually consists of details about how to - How to schedule threads on a multi-core processor. - How to parse JSON documents. ---- - The hidden information includes data structures and algorithms related to the mechanism. It can also include lower-level details such as the size of a page, and it can include higher-level concepts that are more abstract, such as an assumption that most files are small. Information hiding reduces complexity in two ways. First, it simplifies the interface to a module. The interface reflects a simpler, more abstract view of the module’s functionality and hides the details; this reduces the cognitive load on developers who use the module. For instance, a developer using a B-tree class need not worry about the ideal fanout for nodes in the tree or how to keep the tree balanced. Second, information hiding makes it easier to evolve the system. If a piece of information is hidden, there are no dependencies on that information outside the module containing the information, so a design change related to that information will affect only the one module. For example, if the TCP protocol changes (to introduce a new mechanism for congestion control, for instance), the protocol’s implementation will have to be modified, but no changes should be needed in higher-level code that uses TCP to send and receive data. diff --git a/docs/en/ch07.md b/docs/en/ch07.md index 409e9d32..1171ddbf 100644 --- a/docs/en/ch07.md +++ b/docs/en/ch07.md @@ -5,8 +5,6 @@ Software systems are composed in layers, where higher layers use the facilities - In a file system, the uppermost layer implements a file abstraction. A file consists of a variable-length array of bytes, which can be updated by reading and writing variable-length byte ranges. The next lower layer in the file system implements a cache in memory of fixed-size disk blocks; callers can assume that frequently used blocks will stay in memory where they can be accessed quickly. The lowest layer consists of device drivers, which move blocks between secondary storage devices and memory. - In a network transport protocol such as TCP, the abstraction provided by the topmost layer is a stream of bytes delivered reliably from one machine to another. This level is built on a lower level that transmits packets of bounded size between machines on a best-effort basis: most packets will be delivered successfully, but some packets may be lost or delivered out of order. ---- - If a system contains adjacent layers with similar abstractions, this is a red flag that suggests a problem with the class decomposition. This chapter discusses situations where this happens, the problems that result, and how to refactor to eliminate the problems. ## 7.1 Pass-through methods @@ -77,8 +75,6 @@ Before creating a decorator class, consider alternatives such as the following: - Could you merge the new functionality with an existing decorator, rather than creating a new decorator? This would result in a single deeper decorator class rather than multiple shallow ones. - Finally, ask yourself whether the new functionality really needs to wrap the existing functionality: could you implement it as a stand-alone class that is independent of the base class? In the windowing example, the scrollbars could probably be implemented separately from the main window, without wrapping all of its existing functionality. ---- - Sometimes decorators make sense, but there is usually a better alternative. ## 7.4 Interface versus implementation diff --git a/docs/en/ch09.md b/docs/en/ch09.md index 50f366f3..708ee8bd 100644 --- a/docs/en/ch09.md +++ b/docs/en/ch09.md @@ -9,8 +9,6 @@ When deciding whether to combine or separate, the goal is to reduce the complexi - Subdivision creates separation: the subdivided components will be farther apart than they were before subdivision. For example, methods that were together in a single class before subdivision may be in different classes after subdivision, and possibly in different files. Separation makes it harder for developers to see the components at the same time, or even to be aware of their existence. If the components are truly independent, then separation is good: it allows the developer to focus on a single component at a time, without being distracted by the other components. On the other hand, if there are dependencies between the components, then separation is bad: developers will end up flipping back and forth between the components. Even worse, they may not be aware of the dependencies, which can lead to bugs. - Subdivision can result in duplication: code that was present in a single instance before subdivision may need to be present in each of the subdivided components. ---- - Bringing pieces of code together is most beneficial if they are closely related. If the pieces are unrelated, they are probably better off apart. Here are a few indications that two pieces of code are related: - They share information; for example, both pieces of code might depend on the syntax of a particular type of document. @@ -18,15 +16,13 @@ Bringing pieces of code together is most beneficial if they are closely related. - They overlap conceptually, in that there is a simple higher-level category that includes both of the pieces of code. For example, searching for a substring and case conversion both fall under the category of string manipulation; flow control and reliable delivery both fall under the category of network communication. - It is hard to understand one of the pieces of code without looking at the other. ---- - The rest of this chapter uses more specific rules as well as examples to show when it makes sense to bring pieces of code together and when it makes sense to separate them. ## 9.1 Bring together if information is shared Section 5.4 introduced this principle in the context of a project implementing an HTTP server. In its first implementation, the project used two different methods in different classes to read in and parse HTTP requests. The first method read the text of an incoming request from a network socket and placed it in a string object. The second method parsed the string to extract the various components of the request. With this decomposition, both of the methods ended up with considerable knowledge of the format of HTTP requests: the first method was only trying to read the request, not parse it, but it couldn’t identify the end of the request without doing most of the work of parsing it (for example, it had to parse header lines in order to identify the header containing the overall request length). Because of this shared information, it is better to both read and parse the request in the same place; when the two classes were combined into one, the code got shorter and simpler. -## 9.2 Bring together if it will simplify the interface​​ +## 9.2 Bring together if it will simplify the interface When two or more modules are combined into a single module, it may be possible to define an interface for the new module that is simpler or easier to use than the original interfaces. This often happens when the original modules each implement part of the solution to a problem. In the HTTP server example from the preceding section, the original methods required an interface to return the HTTP request string from the first method and pass it to the second. When the methods were combined, these interfaces were eliminated. diff --git a/docs/en/ch10.md b/docs/en/ch10.md index 95d9cd51..995adc1e 100644 --- a/docs/en/ch10.md +++ b/docs/en/ch10.md @@ -13,8 +13,6 @@ A particular piece of code may encounter exceptions in several different ways: - In a distributed system, network packets may be lost or delayed, servers may not respond in a timely fashion, or peers may communicate in unexpected ways. - The code may detect bugs, internal inconsistencies, or situations it is not prepared to handle. ---- - Large systems have to deal with many exceptional conditions, particularly if they are distributed or need to be fault-tolerant. Exception handling can account for a significant fraction of all the code in a system. Exception handling code is inherently more difficult to write than normal-case code. An exception disrupts the normal flow of the code; it usually means that something didn’t work as expected. When an exception occurs, the programmer can deal with it in two ways, each of which can be complicated. The first approach is to move forward and complete the work in progress in spite of the exception. For example, if a network packet is lost, it can be resent; if data is corrupted, perhaps it can be recovered from a redundant copy. The second approach is to abort the operation in progress and report the exception upwards. However, aborting can be complicated because the exception may have occurred at a point where system state is inconsistent (a data structure might have been partially initialized); the exception handling code must restore consistency, such as by unwinding any changes made before the exception occurred. diff --git a/docs/en/ch11.md b/docs/en/ch11.md index b2429b83..9d69e531 100644 --- a/docs/en/ch11.md +++ b/docs/en/ch11.md @@ -12,8 +12,6 @@ After you have roughed out the designs for the alternatives, make a list of the - Is one interface more general-purpose than another? - Does one interface enable a more efficient implementation than another? In the text example, the character-oriented approach is likely to be significantly slower than the others, because it requires a separate call into the text module for each character. ---- - Once you have compared alternative designs, you will be in a better position to identify the best design. The best choice may be one of the alternatives, or you may discover that you can combine features of multiple alternatives into a new design that is better than any of the original choices. Sometimes none of the alternatives is particularly attractive; when this happens, see if you can come up with additional schemes. Use the problems you identified with the original alternatives to drive the new design(s). If you were designing the text class and considered only the line-oriented and character-oriented approaches, you might notice that each of the alternatives is awkward because it requires higher level software to perform additional text manipulations. That’s a red flag: if there’s going to be a text class, it should handle all of the text manipulation. In order to eliminate the additional text manipulations, the text interface needs to match more closely the operations happening in higher level software. These operations don’t always correspond to single characters or single lines. This line of reasoning should lead you to a range-oriented API for text, which eliminates the problem with the earlier designs. diff --git a/docs/en/ch12.md b/docs/en/ch12.md index 4350f009..1f10131e 100644 --- a/docs/en/ch12.md +++ b/docs/en/ch12.md @@ -13,8 +13,6 @@ When developers don’t write comments, they usually justify their behavior with - “Comments get out of date and become misleading.” - “The comments I have seen are all worthless; why bother?” In the sections below I will address each of these excuses in turn. ---- - ## 12.1 Good code is self-documenting Some people believe that if code is written well, it is so obvious that no comments are needed. This is a delicious myth, like a rumor that ice cream is good for your health: we’d really like to believe it! Unfortunately, it’s simply not true. To be sure, there are things you can do when writing code to reduce the need for comments, such as choosing good variable names (see Chapter 14). Nonetheless, there is still a significant amount of design information that can’t be represented in code. For example, only a small part of a class’s interface, such as the signatures of its methods, can be specified formally in the code. The informal aspects of an interface, such as a high-level description of what each method does or the meaning of its result, can only be described in comments. There are many other examples of things that can’t be described in the code, such as the rationale for a particular design decision, or the conditions under which it makes sense to call a particular method. @@ -49,8 +47,6 @@ Chapter 2 described three ways in which complexity manifests itself in software - Cognitive load: in order to make a change, the developer must accumulate a large amount of information. - Unknown unknowns: it is unclear what code needs to be modified, or what information must be considered in order to make those modifications. ---- - Good documentation helps with the last two of these issues. Documentation can reduce cognitive load by providing developers with the information they need to make changes and by making it easy for developers to ignore information that is irrelevant. Without adequate documentation, developers may have to read large amounts of code to reconstruct what was in the designer’s mind. Documentation can also reduce the unknown unknowns by clarifying the structure of the system, so that it is clear what information and code is relevant for any given change. Chapter 2 pointed out that the primary causes of complexity are dependencies and obscurity. Good documentation can clarify dependencies, and it fills in gaps to eliminate obscurity. diff --git a/docs/en/ch13.md b/docs/en/ch13.md index 53f47413..fa974d47 100644 --- a/docs/en/ch13.md +++ b/docs/en/ch13.md @@ -114,8 +114,6 @@ Precision is most useful when commenting variable declarations such as class ins - If a variable refers to a resource that must eventually be freed or closed, who is responsible for freeing or closing it? - Are there certain properties that are always true for the variable (invariants), such as “this list always contains at least one entry”? ---- - Some of this information could potentially be figured out by examining all of the code where the variable is used. However, this is time-consuming and error-prone; the declaration’s comment should be clear and complete enough to make this unnecessary. When I say that the comment for a declaration should describe things that aren’t obvious from the code, “the code” refers to the code next to the comment (the declaration), not “all of the code in the application.” The most common problem with comments for variables is that the comments are too vague. Here are two examples of comments that aren’t precise enough: @@ -262,8 +260,6 @@ The interface comment for a method includes both higher-level information for ab - A method’s interface comment must describe any exceptions that can emanate from the method. - If there are any preconditions that must be satisfied before a method is invoked, these must be described (perhaps some other method must be invoked first; for a binary search method, the list being searched must be sorted). It is a good idea to minimize preconditions, but any that remain must be documented. ---- - Here is the interface comment for a method that copies data out of a Buffer object: ```java @@ -318,8 +314,6 @@ Now let’s consider what information needs to be included in the interface comm 4. Whether or not IndexLookup issues multiple requests to different servers concurrently. 5. The mechanism for handling server crashes. ---- - Here is the original version of the interface comment for the IndexLookup class; the excerpt also includes a few lines from the class’s definition, which are referred to in the comment: ```cpp @@ -359,8 +353,6 @@ Before reading further, see if you can identify the problems with this comment. - Most of the first paragraph concerns the implementation, not the interface. As one example, users don’t need to know the names of the particular remote procedure calls used to communicate with the servers. The configuration parameters referred to in the second half of the first paragraph are all private variables that are relevant only to the maintainer of the class, not to its users. All of this implementation information should be omitted from the comment. - The comment also includes several things that are obvious. For example, there’s no need to tell users to include IndexLookup.h: anyone who writes C++ code will be able to guess that this is necessary. In addition, the text “by providing all necessary information” says nothing, so it can be omitted. ---- - A shorter comment for this class is sufficient (and preferable): ```java @@ -523,5 +515,3 @@ Does a developer need to know each of the following pieces of information in ord 3. The data structure used to store indexes on servers. No: this information should be encapsulated on the servers; not even the implementation of IndexLookup should need to know this. 4. Whether or not IndexLookup issues multiple requests to different servers concurrently. Possibly: if IndexLookup uses special techniques to improve performance, then the documentation should provide some high-level information about this, since users may care about performance. 5. The mechanism for handling server crashes. No: RAMCloud recovers automatically from server crashes, so crashes are not visible to application-level software; thus, there is no need to mention crashes in the interface documentation for IndexLookup. If crashes were reflected up to applications, then the interface documentation would need to describe how they manifest themselves (but not the details of how crash recovery works). - ---- diff --git a/docs/en/ch15.md b/docs/en/ch15.md index cb6c83ec..817f1042 100644 --- a/docs/en/ch15.md +++ b/docs/en/ch15.md @@ -22,8 +22,6 @@ I use a different approach to writing comments, where I write the comments at th - Finally, I fill in the bodies of the methods, adding implementation comments as needed. - While writing method bodies, I usually discover the need for additional methods and instance variables. For each new method I write the interface comment before the body of the method; for instance variables I fill in the comment at the same time that I write the variable declaration. ---- - When the code is done, the comments are also done. There is never a backlog of unwritten comments. The comments-first approach has three benefits. First, it produces better comments. If you write the comments as you are designing the class, the key design issues will be fresh in your mind, so it’s easy to record them. It’s better to write the interface comment for each method before its body, so you can focus on the method’s abstraction and interface without being distracted by its implementation. During the coding and testing process you will notice and fix problems with the comments. As a result, the comments improve over the course of development. diff --git a/docs/en/ch20.md b/docs/en/ch20.md index 75315e96..3a917ab4 100644 --- a/docs/en/ch20.md +++ b/docs/en/ch20.md @@ -67,8 +67,6 @@ Figure 20.2 shows the original code for the critical path, which starts with the - The code checks twice to see if the current allocation has enough room for the new data: once in Buffer::Allocation::allocateAppend, and again when its return value is tested by Buffer::allocateAppend. - Buffer::alloc tests the return value from Buffer::allocAppend to confirm yet again that the allocation succeeded. ---- - Furthermore, rather than trying to expand the last chunk directly, the code allocates new space without any consideration of the last chunk. Then Buffer::alloc checks to see if that space happens to be adjacent to the last chunk, in which case it merges the new space with the existing chunk. This results in additional checks. Overall, this code tests 6 distinct conditions in the critical path. The second problem with the original code is that it has too many layers, all of which are shallow. This is both a performance problem and a design problem. The critical path makes two additional method calls in addition to the original invocation of Buffer::alloc. Each method call takes additional time, and the result of each call must be checked by its caller, which results in more special cases to consider. Chapter 7 discussed how abstractions should normally change as you pass from one layer to another, but all three of the methods in Figure 20.2 have identical signatures and they provide essentially the same abstraction; this is a red flag. Buffer::allocateAppend is nearly a pass-though method; its only contribution is to create a new allocation if needed. The extra layers make the code both slower and more complicated. diff --git a/docs/en/summary.md b/docs/en/summary.md index 30c07239..04c23c24 100644 --- a/docs/en/summary.md +++ b/docs/en/summary.md @@ -20,8 +20,6 @@ Here are the most important software design principles discussed in this book: 14. Software should be designed for ease of reading, not ease of writing (see p. 149). 15. The increments of software development should be abstractions, not features (see p. 154). ---- - ## Summary of Red Flags Here are a few of of the most important red flags discussed in this book. The presence of any of these symptoms in a system suggests that there is a problem with the system’s design: @@ -41,8 +39,6 @@ Here are a few of of the most important red flags discussed in this book. The pr - Hard to Describe: in order to be complete, the documentation for a variable or method must be long. (see p. 131). - Nonobvious Code: the behavior or meaning of a piece of code cannot be understood easily. (see p. 148). ---- - ## About the Author John Ousterhout is the Bosack Lerner Professor of Computer Science at Stanford University. He is the creator of the Tcl scripting language and is also well known for his work in distributed operating systems and storage systems. Ousterhout received a BS degree in Physics from Yale University and a PhD in Computer Science from Carnegie Mellon University. He is a member of the National Academy of Engineering and has received numerous awards, including the ACM Software System Award, the ACM Grace Murray Hopper Award, the National Science Foundation Presidential Young Investigator Award, and the U.C. Berkeley Distinguished Teaching Award.