"We cut nature up, organise it into concepts, and ascribe significances as we do, largely because we are parties to an agreement that holds throughout our speech community and is codified in the patterns of our language ... we cannot talk at all except by subscribing to the organisation and classification of data which the agreement decrees." Benjamin Lee Whorf (1897-1941)
The genesis of the computer revolution was in a machine. The genesis of our programming languages thus tends to look like that machine.
But computers are not so much machines as they are mind amplification tools ("bicycles for the mind," as Steve Jobs is fond of saying) and a different kind of expressive medium. As a result, the tools are beginning to look less like machines and more like parts of our minds, and also like other forms of expression such as writing, painting, sculpture, animation, and filmmaking. Object-oriented programming (OOP) is part of this movement toward using the computer as an expressive medium.
This chapter will introduce you to the basic concepts of OOP, including an overview of development methods. This chapter, and this book, assume that you have had experience in a procedural programming language, although not necessarily C. If you think you need more preparation in programming and the syntax of C before tackling this book, you should work through the Thinking in C: Foundations for C++ and Java training CD ROM, bound in with this book and also available at www.BruceEckel.com.
This chapter is background and supplementary material. Many people do not feel comfortable wading into object-oriented programming without understanding the big picture first. Thus, there are many concepts that are introduced here to give you a solid overview of OOP. However, many other people don't get the big picture concepts until they've seen some of the mechanics first; these people may become bogged down and lost without some code to get their hands on. If you're part of this latter group and are eager to get to the specifics of the language, feel free to jump past this chapter; skipping it at this point will not prevent you from writing programs or learning the language. However, you will want to come back here eventually to fill in your knowledge so you can understand why objects are important and how to design with them.
All programming languages provide abstractions. It can be argued that the complexity of the problems you're able to solve is directly related to the kind and quality of abstraction. By "kind" I mean, "What is it that you are abstracting?" Assembly language is a small abstraction of the underlying machine. Many so-called "imperative" languages that followed (such as Fortran, BASIC, and C) were abstractions of assembly language. These languages are big improvements over assembly language, but their primary abstraction still requires you to think in terms of the structure of the computer rather than the structure of the problem you are trying to solve. The programmer must establish the association between the machine model (in the "solution space," which is the place where you're modeling that problem, such as a computer) and the model of the problem that is actually being solved (in the "problem space," which is the place where the problem exists). The effort required to perform this mapping, and the fact that it is extrinsic to the programming language, produces programs that are difficult to write and expensive to maintain, and as a side effect created the entire "programming methods" industry.
The alternative to modeling the machine is to model the problem you're trying to solve. Early languages such as LISP and APL chose particular views of the world ("All problems are ultimately lists" or "All problems are algorithmic," respectively). PROLOG casts all problems into chains of decisions. Languages have been created for constraint-based programming and for programming exclusively by manipulating graphical symbols. (The latter proved to be too restrictive.) Each of these approaches is a good solution to the particular class of problem they're designed to solve, but when you step outside of that domain they become awkward.
The object-oriented approach goes a step further by providing tools for the programmer to represent elements in the problem space. This representation is general enough that the programmer is not constrained to any particular type of problem. We refer to the elements in the problem space and their representations in the solution space as "objects." (Of course, you will also need other objects that don't have problem-space analogs.) The idea is that the program is allowed to adapt itself to the lingo of the problem by adding new types of objects, so when you read the code describing the solution, you're reading words that also express the problem. This is a more flexible and powerful language abstraction than what we've had before. Thus, OOP allows you to describe the problem in terms of the problem, rather than in terms of the computer where the solution will run. There's still a connection back to the computer, though. Each object looks quite a bit like a little computer; it has a state, and it has operations that you can ask it to perform. However, this doesn't seem like such a bad analogy to objects in the real world: they all have characteristics and behaviors.
Some language designers have decided that object-oriented programming by itself is not adequate to easily solve all programming problems, and advocate the combination of various approaches into multiparadigm programming languages.[2]
Alan Kay summarized five basic characteristics of Smalltalk, the first successful object-oriented language and one of the languages upon which Java is based. These characteristics represent a pure approach to object-oriented programming:
Booch offers an even more succinct description of an object:
An object has state, behavior and identity.
This means that an object can have internal data (which gives it state) and methods (to produce behavior), and that each object can be uniquely distinguished from every other object. To put this in a concrete sense, each object has a unique address in memory.[3]
Aristotle was probably the first to begin a careful study of the concept of type; he spoke of "the class of fishes and the class of birds." The idea that all objects, while being unique, are also part of a class of objects that have characteristics and behaviors in common was used directly in the first object-oriented language, Simula-67, with its fundamental keyword class that introduces a new type into a program.
Simula, as its name implies, was created for developing simulations such as the classic "bank teller problem." In this, you have a bunch of tellers, customers, accounts, transactions, and units of money: a lot of "objects." Objects that are identical except for their state during a program's execution are grouped together into "classes of objects," and that's where the keyword class came from. Creating abstract data types (classes) is a fundamental concept in object-oriented programming. Abstract data types work almost exactly like built-in types: You can create variables of a type (called objects or instances in object-oriented parlance) and manipulate those variables (called sending messages or requests; you send a message and the object figures out what to do with it). The members (elements) of each class share some commonality: every account has a balance, every teller can accept a deposit, etc. At the same time, each member has its own state: each account has a different balance, each teller has a name. Thus, the tellers, customers, accounts, transactions, etc., can each be represented with a unique entity in the computer program. This entity is the object, and each object belongs to a particular class that defines its characteristics and behaviors.
So, although what we really do in object-oriented programming is create new data types, virtually all object-oriented programming languages use the "class" keyword. When you see the word "type" think "class" and vice versa.[4]
Since a class describes a set of objects that have identical characteristics (data elements) and behaviors (functionality), a class is really a data type because a floating point number, for example, also has a set of characteristics and behaviors. The difference is that a programmer defines a class to fit a problem rather than being forced to use an existing data type that was designed to represent a unit of storage in a machine. You extend the programming language by adding new data types specific to your needs. The programming system welcomes the new classes and gives them all the care and type-checking that it gives to built-in types.
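To make the idea of a class as a programmer-defined data type more concrete, here is a minimal sketch based on the bank teller discussion. The Account class, its field, and its method names are assumptions invented for illustration; they do not come from any library.

```java
// Hypothetical class for illustration: a new data type that fits the problem.
class Account {
    private long balanceCents;              // state: every account has its own balance

    Account(long openingCents) {
        balanceCents = openingCents;
    }

    // Behavior shared by all accounts: each one can accept a deposit.
    void deposit(long cents) {
        if (cents <= 0) {
            throw new IllegalArgumentException("deposit must be positive");
        }
        balanceCents += cents;
    }

    long balance() {
        return balanceCents;
    }
}

class AccountDemo {
    public static void main(String[] args) {
        Account a = new Account(10_000);    // two objects of the same class...
        Account b = new Account(500);
        a.deposit(2_500);                   // ...each with separate state
        System.out.println(a.balance());
        System.out.println(b.balance());
        // a.deposit("ten");  // would not compile: the new type is type-checked
        //                    // just as a built-in type would be
    }
}
```

Both variables are used much as you would use an int or a double, except that the operations available are the ones the class defines.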
The object-oriented approach is not limited to building simulations. Whether or not you agree that any program is a simulation of the system you're designing, the use of OOP techniques can easily reduce a large set of problems to a simple solution.
Once a class is established, you can make as many objects of that class as you like, and then manipulate those objects as if they are the elements that exist in the problem you are trying to solve. Indeed, one of the challenges of object-oriented programming is to create a one-to-one mapping between the elements in the problem space and objects in the solution space.
But how do you get an object to do useful work for you? There must be a way to make a request of the object so that it will do something, such as complete a transaction, draw something on the screen, or turn on a switch. And each object can satisfy only certain requests. The requests you can make of an object are defined by its interface, and the type is what determines the interface. A simple example might be a representation of a light bulb:
Light lt = new Light();
lt.on();
The interface establishes what requests you can make for a particular object. However, there must be code somewhere to satisfy that request. This, along with the hidden data, comprises the implementation. From a procedural programming standpoint, it's not that complicated. A type has a function associated with each possible request, and when you make a particular request to an object, that function is called. This process is usually summarized by saying that you "send a message" (make a request) to an object, and the object figures out what to do with that message (it executes code).
Here, the name of the type/class is Light, the name of this particular Light object is lt, and the requests that you can make of a Light object are to turn it on, turn it off, make it brighter, or make it dimmer. You create a Light object by defining a "reference" (lt) for that object and calling new to request a new object of that type. To send a message to the object, you state the name of the object and connect it to the message request with a period (dot). From the standpoint of the user of a predefined class, that's pretty much all there is to programming with objects.
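The text shows only the client's side of Light. As a rough sketch of what might sit behind that interface, the class below supplies one function per request plus some hidden data. The method names brighten and dim, the accessors isOn and brightness, and the 0 to 10 brightness scale are all assumptions made for this example.

```java
// A possible implementation behind the Light interface (names are assumed).
class Light {
    private boolean lit = false;     // hidden data: part of the implementation
    private int brightness = 5;      // assumed scale of 0..10

    void on()  { lit = true; }       // one function per request in the interface
    void off() { lit = false; }

    void brighten() {
        if (brightness < 10) brightness++;
    }

    void dim() {
        if (brightness > 0) brightness--;
    }

    // Accessors added only so the example can show the object's state.
    boolean isOn()   { return lit; }
    int brightness() { return brightness; }
}

class LightDemo {
    public static void main(String[] args) {
        Light lt = new Light();      // define a reference and request a new object
        lt.on();                     // send the "on" message with the dot
        lt.brighten();
        System.out.println(lt.isOn() + " " + lt.brightness());
    }
}
```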
The diagram shown above follows the format of the Unified Modeling Language (UML). Each class is represented by a box, with the type name in the top portion of the box, any data members that you care to describe in the middle portion of the box, and the member functions (the functions that belong to this object, which receive any messages you send to that object) in the bottom portion of the box. Often, only the name of the class and the public member functions are shown in UML design diagrams, and so the middle portion is not shown. If you're interested only in the class name, then the bottom portion doesn't need to be shown, either.
It is helpful to break up the playing field into class creators (those who create new data types) and client programmers[5] (the class consumers who use the data types in their applications). The goal of the client programmer is to collect a toolbox full of classes to use for rapid application development. The goal of the class creator is to build a class that exposes only what's necessary to the client programmer and keeps everything else hidden. Why? Because if it's hidden, the client programmer can't use it, which means that the class creator can change the hidden portion at will without worrying about the impact on anyone else. The hidden portion usually represents the tender insides of an object that could easily be corrupted by a careless or uninformed client programmer, so hiding the implementation reduces program bugs. The concept of implementation hiding cannot be overemphasized.
In any relationship it's important to have boundaries that are respected by all parties involved. When you create a library, you establish a relationship with the client programmer, who is also a programmer, but one who is putting together an application by using your library, possibly to build a bigger library.
If all the members of a class are available to everyone, then the client programmer can do anything with that class and there's no way to enforce rules. Even though you might really prefer that the client programmer not directly manipulate some of the members of your class, without access control there's no way to prevent it. Everything's naked to the world.
So the first reason for access control is to keep client programmers' hands off portions they shouldn't touch: parts that are necessary for the internal machinations of the data type but not part of the interface that users need in order to solve their particular problems. This is actually a service to users because they can easily see what's important to them and what they can ignore.
The second reason for access control is to allow the library designer to change the internal workings of the class without worrying about how it will affect the client programmer. For example, you might implement a particular class in a simple fashion to ease development, and then later discover that you need to rewrite it in order to make it run faster. If the interface and implementation are clearly separated and protected, you can accomplish this easily.
Java uses three explicit keywords to set the boundaries in a class: public, private, and protected. Their use and meaning are quite straightforward. These access specifiers determine who can use the definition that immediately follows. public means the following definition is available to everyone. The private keyword, on the other hand, means that no one can access that definition except you, the creator of the type, inside member functions of that type. private is a brick wall between you and the client programmer. If someone tries to access a private member, they'll get a compile-time error. protected acts like private, with the exception that an inheriting class has access to protected members, but not private members (in Java, protected members are also accessible to other classes in the same package). Inheritance will be introduced shortly.
Java also has a "default" access, which comes into play if you don't use one of the aforementioned specifiers. This is sometimes called "friendly" access because classes can access the friendly members of other classes in the same package, but outside of the package those same friendly members appear to be private.
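The four levels of access can be seen side by side in a small sketch. The Gauge class and its members are hypothetical, chosen only to give each specifier something to guard; the comments describe what a client could and could not do.

```java
// Hypothetical class showing each access level on one member.
class Gauge {
    private static final int SCALE = 10;   // private: internal detail
    private int raw = 0;                    // private: only Gauge's own methods touch it

    // public: available to everyone.
    public int reading() {
        return raw / SCALE;
    }

    // protected: available to inheriting classes (and, in Java, to the same package).
    protected void calibrate(int offset) {
        raw += offset * SCALE;
    }

    // No specifier: default ("friendly") access, visible only inside this package.
    void reset() {
        raw = 0;
    }
}

class AccessDemo {
    public static void main(String[] args) {
        Gauge g = new Gauge();
        g.calibrate(3);     // legal here only because this class shares Gauge's package
        System.out.println(g.reading());
        g.reset();          // friendly access: also legal within the package
        // g.raw = 99;      // would not compile: raw has private access in Gauge
    }
}
```

Because raw and SCALE are private, the class creator could later change how the reading is stored without breaking any code that calls reading().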
Once a class has been created and tested, it should (ideally) represent a useful unit of code. It turns out that this reusability is not nearly so easy to achieve as many would hope; it takes experience and insight to produce a good design. But once you have such a design, it begs to be reused. Code reuse is one of the greatest advantages that object-oriented programming languages provide.
The simplest way to reuse a class is to just use an object of that class directly, but you can also place an object of that class inside a new class. We call this "creating a member object." Your new class can be made up of any number and type of other objects, in any combination that you need to achieve the functionality desired in your new class. Because you are composing a new class from existing classes, this concept is called composition (or more generally, aggregation). Composition is often referred to as a "has-a" relationship, as in "a car has an engine."
(The above UML diagram indicates composition with the filled diamond, which states there is one car. I will typically use a simpler form: just a line, without the diamond, to indicate an association.[6])
Composition comes with a great deal of flexibility. The member objects of your new class are usually private, making them inaccessible to the client programmers who are using the class. This allows you to change those members without disturbing existing client code. You can also change the member objects at run-time, to dynamically change the behavior of your program. Inheritance, which is described next, does not have this flexibility since the compiler must place compile-time restrictions on classes created with inheritance.
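The "car has an engine" relationship might be sketched as follows. Engine, Car, and every method name here are assumptions for illustration. The point is that the member object is private and can be replaced while the program runs.

```java
// Hypothetical classes illustrating composition (a "has-a" relationship).
class Engine {
    private final int horsepower;

    Engine(int horsepower) {
        this.horsepower = horsepower;
    }

    int horsepower() {
        return horsepower;
    }
}

class Car {
    private Engine engine;              // member object, hidden from client programmers

    Car(Engine engine) {
        this.engine = engine;
    }

    // The member object can be swapped at run-time, changing the car's behavior.
    void replaceEngine(Engine newEngine) {
        engine = newEngine;
    }

    String describe() {
        return "Car with " + engine.horsepower() + " hp";
    }
}

class CompositionDemo {
    public static void main(String[] args) {
        Car car = new Car(new Engine(90));
        System.out.println(car.describe());
        car.replaceEngine(new Engine(150));
        System.out.println(car.describe());
    }
}
```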
Because inheritance is so important in object-oriented programming it is often highly emphasized, and the new programmer can get the idea that inheritance should be used everywhere. This can result in awkward and overly complicated designs. Instead, you should first look to composition when creating new classes, since it is simpler and more flexible. If you take this approach, your designs will be cleaner. Once you've had some experience, it will be reasonably obvious when you need inheritance.
By itself, the idea of an object is a convenient tool. It allows you to package data and functionality together by concept, so you can represent an appropriate problem-space idea rather than being forced to use the idioms of the underlying machine. These concepts are expressed as fundamental units in the programming language by using the class keyword.
It seems a pity, however, to go to all the trouble to create a class and then be forced to create a brand new one that might have similar functionality. It's nicer if we can take the existing class, clone it, and then make additions and modifications to the clone. This is effectively what you get with inheritance, with the exception that if the original class (called the base or super or parent class) is changed, the modified "clone" (called the derived or inherited or sub or child class) also reflects those changes.
(The arrow in the above UML diagram points from the derived class to the base class. As you will see, there can be more than one derived class.)
A type does more than describe the constraints on a set of objects; it also has a relationship with other types. Two types can have characteristics and behaviors in common, but one type may contain more characteristics than another and may also handle more messages (or handle them differently). Inheritance expresses this similarity between types using the concept of base types and derived types. A base type contains all of the characteristics and behaviors that are shared among the types derived from it. You create a base type to represent the core of your ideas about some objects in your system. From the base type, you derive other types to express the different ways that this core can be realized.
For example, a trash-recycling machine sorts pieces of trash. The base type is "trash," and each piece of trash has a weight, a value, and so on, and can be shredded, melted, or decomposed. From this, more specific types of trash are derived that may have additional characteristics (a bottle has a color) or behaviors (an aluminum can may be crushed, a steel can is magnetic). In addition, some behaviors may be different (the value of paper depends on its type and condition). Using inheritance, you can build a type hierarchy that expresses the problem you're trying to solve in terms of its types.
A second example is the classic "shape" example, perhaps used in a computer-aided design system or game simulation. The base type is "shape," and each shape has a size, a color, a position, and so on. Each shape can be drawn, erased, moved, colored, etc. From this, specific types of shapes are derived (inherited): circle, square, triangle, and so on, each of which may have additional characteristics and behaviors. Certain shapes can be flipped, for example. Some behaviors may be different, such as when you want to calculate the area of a shape. The type hierarchy embodies both the similarities and differences between the shapes.
Casting the solution in the same terms as the problem is tremendously beneficial because you don't need a lot of intermediate models to get from a description of the problem to a description of the solution. With objects, the type hierarchy is the primary model, so you go directly from the description of the system in the real world to the description of the system in code. Indeed, one of the difficulties people have with object-oriented design is that it's too simple to get from the beginning to the end. A mind trained to look for complex solutions is often stumped by this simplicity at first.
When you inherit from an existing type, you create a new type. This new type contains not only all the members of the existing type (although the private ones are hidden away and inaccessible), but more important, it duplicates the interface of the base class. That is, all the messages you can send to objects of the base class you can also send to objects of the derived class. Since we know the type of a class by the messages we can send to it, this means that the derived class is the same type as the base class. In the previous example, "a circle is a shape." This type equivalence via inheritance is one of the fundamental gateways in understanding the meaning of object-oriented programming.
Since both the base class and derived class have the same interface, there must be some implementation to go along with that interface. That is, there must be some code to execute when an object receives a particular message. If you simply inherit a class and don't do anything else, the methods from the base-class interface come right along into the derived class. That means objects of the derived class have not only the same type, they also have the same behavior, which isn't particularly interesting.
You have two ways to differentiate your new derived class from the original base class. The first is quite straightforward: You simply add brand new functions to the derived class. These new functions are not part of the base class interface. This means that the base class simply didn't do as much as you wanted it to, so you added more functions. This simple and primitive use for inheritance is, at times, the perfect solution to your problem. However, you should look closely for the possibility that your base class might also need these additional functions. This process of discovery and iteration of your design happens regularly in object-oriented programming.
Although inheritance may sometimes imply (especially in Java, where the keyword that indicates inheritance is extends) that you are going to add new functions to the interface, that's not necessarily true. The second and more important way to differentiate your new class is to change the behavior of an existing base-class function. This is referred to as overriding that function.
To override a function, you simply create a new definition for the function in the derived class. You're saying, "I'm using the same interface function here, but I want it to do something different for my new type."
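Both ways of differentiating a derived class can be sketched with the trash example. The class names follow the text, but the fields, the integer units, and the pricing rules are assumptions invented for this illustration. Bottle adds a new function, while Paper overrides an existing one.

```java
// Base type: every piece of trash has a weight and a value.
class Trash {
    private final int weightGrams;

    Trash(int weightGrams) {
        this.weightGrams = weightGrams;
    }

    int weight() {
        return weightGrams;
    }

    // Assumed flat rate: one cent per 100 grams.
    int valueCents() {
        return weightGrams / 100;
    }
}

// First way: add a brand new function that is not in the base-class interface.
class Bottle extends Trash {
    private final String color;

    Bottle(int weightGrams, String color) {
        super(weightGrams);
        this.color = color;
    }

    String color() {
        return color;
    }
}

// Second way: override an existing base-class function.
class Paper extends Trash {
    private final boolean clean;

    Paper(int weightGrams, boolean clean) {
        super(weightGrams);
        this.clean = clean;
    }

    @Override
    int valueCents() {
        // Same interface function, different behavior for this type
        // (assumed rule: clean paper is worth double, soiled paper nothing).
        return clean ? weight() / 50 : 0;
    }
}

class TrashDemo {
    public static void main(String[] args) {
        System.out.println(new Bottle(500, "green").valueCents()); // inherited behavior
        System.out.println(new Paper(500, true).valueCents());     // overridden behavior
    }
}
```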
There's a certain debate that can occur about inheritance: Should inheritance override only base-class functions (and not add new member functions that aren't in the base class)? This would mean that the derived type is exactly the same type as the base class since it has exactly the same interface. As a result, you can exactly substitute an object of the derived class for an object of the base class. This can be thought of as pure substitution, and it's often referred to as the substitution principle. In a sense, this is the ideal way to treat inheritance. We often refer to the relationship between the base class and derived classes in this case as an is-a relationship, because you can say "a circle is a shape." A test for inheritance is to determine whether you can state the is-a relationship about the classes and have it make sense.
There are times when you must add new interface elements to a derived type, thus extending the interface and creating a new type. The new type can still be substituted for the base type, but the substitution isn't perfect because your new functions are not accessible from the base type. This can be described as an is-like-a[7] relationship; the new type has the interface of the old type but it also contains other functions, so you can't really say it's exactly the same. For example, consider an air conditioner. Suppose your house is wired with all the controls for cooling; that is, it has an interface that allows you to control cooling. Imagine that the air conditioner breaks down and you replace it with a heat pump, which can both heat and cool. The heat pump is-like-an air conditioner, but it can do more. Because the control system of your house is designed only to control cooling, it is restricted to communication with the cooling part of the new object. The interface of the new object has been extended, and the existing system doesn't know about anything except the original interface.
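The heat pump situation might look like this in code. The class names CoolingSystem, AirConditioner, and HeatPump follow the discussion, but Thermostat and all the method names are assumptions made for this sketch.

```java
// The original interface: the house knows only how to ask for cooling.
class CoolingSystem {
    String cool() {
        return "cooling";
    }
}

class AirConditioner extends CoolingSystem {
    @Override
    String cool() {
        return "air conditioner cooling";
    }
}

// Is-like-a: the heat pump has the old interface plus something extra.
class HeatPump extends CoolingSystem {
    @Override
    String cool() {
        return "heat pump cooling";
    }

    String heat() {                     // extension, unknown to the existing system
        return "heat pump heating";
    }
}

// Hypothetical control system, written against the base type only.
class Thermostat {
    String lowerTemperature(CoolingSystem unit) {
        return unit.cool();
        // unit.heat();  // would not compile: CoolingSystem has no heat()
    }
}

class HeatPumpDemo {
    public static void main(String[] args) {
        Thermostat controls = new Thermostat();
        System.out.println(controls.lowerTemperature(new AirConditioner()));
        System.out.println(controls.lowerTemperature(new HeatPump()));
    }
}
```

The heat pump substitutes for the air conditioner without any change to Thermostat, but its heating ability stays out of reach through the base-type reference.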
Of course, once you see this design it becomes clear that the base class "cooling system" is not general enough, and should be renamed to "temperature control system" so that it can also include heating, at which point the substitution principle will work. However, the diagram above is an example of what can happen in design and in the real world.
When you see the substitution principle it's easy to feel like this approach (pure substitution) is the only way to do things, and in fact it is nice if your design works out that way. But you'll find that there are times when it's equally clear that you must add new functions to the interface of a derived class. With inspection both cases should be reasonably obvious.
When dealing with type hierarchies, you often want to treat an object not as the specific type that it is, but instead as its base type. This allows you to write code that doesn't depend on specific types. In the shape example, functions manipulate generic shapes without respect to whether they're circles, squares, triangles, or some shape that hasn't even been defined yet. All shapes can be drawn, erased, and moved, so these functions simply send a message to a shape object; they don't worry about how the object copes with the message.
Such code is unaffected by the addition of new types, and adding new types is the most common way to extend an object-oriented program to handle new situations. For example, you can derive a new subtype of shape called pentagon without modifying the functions that deal only with generic shapes. This ability to extend a program easily by deriving new subtypes is important because it greatly improves designs while reducing the cost of software maintenance.
There's a problem, however, with attempting to treat derived-type objects as their generic base types (circles as shapes, bicycles as vehicles, cormorants as birds, etc.). If a function is going to tell a generic shape to draw itself, or a generic vehicle to steer, or a generic bird to move, the compiler cannot know at compile-time precisely what piece of code will be executed. That's the whole point: when the message is sent, the programmer doesn't want to know what piece of code will be executed; the draw function can be applied equally to a circle, a square, or a triangle, and the object will execute the proper code depending on its specific type. If you don't have to know what piece of code will be executed, then when you add a new subtype, the code it executes can be different without requiring changes to the function call. Therefore, the compiler cannot know precisely what piece of code is executed, so what does it do? For example, in the following diagram the BirdController object just works with generic Bird objects, and does not know what exact type they are. This is convenient from BirdController's perspective because it doesn't have to write special code to determine the exact type of Bird it's working with, or that Bird's behavior. So how does it happen that, when move( ) is called while ignoring the specific type of Bird, the right behavior will occur (a Goose runs, flies, or swims, and a Penguin runs or swims)?
The answer is the primary twist in object-oriented programming: the compiler cannot make a function call in the traditional sense. The function call generated by a non-OOP compiler causes what is called early binding, a term you may not have heard before because you've never thought about it any other way. It means the compiler generates a call to a specific function name, and the linker resolves this call to the absolute address of the code to be executed. In OOP, the program cannot determine the address of the code until run-time, so some other scheme is necessary when a message is sent to a generic object.
To solve the problem, object-oriented languages use the concept of late binding. When you send a message to an object, the code being called isn't determined until run-time. The compiler does ensure that the function exists and performs type checking on the arguments and return value (a language in which this isn't true is called weakly typed), but it doesn't know the exact code to execute.
To perform late binding, Java uses a special bit of code in lieu of the absolute call. This code calculates the address of the function body, using information stored in the object (this process is covered in great detail in Chapter 7). Thus, each object can behave differently according to the contents of that special bit of code. When you send a message to an object, the object actually does figure out what to do with that message.
In some languages (C++, in particular) you must explicitly state that you want a function to have the flexibility of late-binding properties. In these languages, by default, member functions are not dynamically bound. This caused problems, so in Java dynamic binding is the default and you don't need to remember to add any extra keywords in order to get polymorphism.
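Late binding can be observed directly with the bird example. Bird, Goose, Penguin, BirdController, and move( ) come from the discussion above; the relocate method and the strings returned are assumptions made so the sketch has something to show.

```java
class Bird {
    String move() {
        return "bird moves";
    }
}

class Goose extends Bird {
    @Override
    String move() {
        return "goose flies";
    }
}

class Penguin extends Bird {
    @Override
    String move() {
        return "penguin swims";
    }
}

class BirdController {
    // Compiled knowing only that b is some kind of Bird. Which move() body
    // runs is decided at run-time from the actual object (late binding).
    String relocate(Bird b) {
        return b.move();
    }
}

class BirdDemo {
    public static void main(String[] args) {
        BirdController controller = new BirdController();
        Bird[] flock = { new Goose(), new Penguin(), new Bird() };
        for (Bird b : flock) {
            System.out.println(controller.relocate(b));
        }
    }
}
```

No extra keyword is needed on move( ) to get this behavior, which matches the point that dynamic binding is the default in Java.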
Consider the shape example. The family of classes (all based on the same uniform interface) was diagrammed earlier in this chapter. To demonstrate polymorphism, we want to write a single piece of code that ignores the specific details of type and talks only to the base class. That code is decoupled from type-specific information, and thus is simpler to write and easier to understand. And, if a new type (a Hexagon, for example) is added through inheritance, the code you write will work just as well for the new type of Shape as it did on the existing types. Thus, the program is extensible.
If you write a method in Java (as you will soon learn how to do):
void doStuff(Shape s) {
  s.erase();
  // ...
  s.draw();
}
This function speaks to any Shape, so it is independent of the specific type of object that it's drawing and erasing. If in some other part of the program we use the doStuff( ) function:
Circle c = new Circle();
Triangle t = new Triangle();
Line l = new Line();
doStuff(c);
doStuff(t);
doStuff(l);
The calls to doStuff( ) automatically work correctly, regardless of the exact type of the object.
This is actually a pretty amazing trick. Consider the line:
doStuff(c);
What's happening here is that a Circle is being passed into a function that's expecting a Shape. Since a Circle is a Shape it can be treated as one by doStuff( ). That is, any message that doStuff( ) can send to a Shape, a Circle can accept. So it is a completely safe and logical thing to do.
We call this process of treating a derived type as though it were its base type upcasting. The name "cast" is used in the sense of casting into a mold, and the "up" comes from the way the inheritance diagram is typically arranged, with the base type at the top and the derived classes fanning out downward. Thus, casting to a base type is moving up the inheritance diagram: "upcasting."
An object-oriented program contains some upcasting somewhere, because that's how you decouple yourself from knowing about the exact type you're working with. Look at the code in doStuff( ):
s.erase();
// ...
s.draw();
Notice that it doesn't say "If you're a Circle, do this, if you're a Square, do that, etc." If you write that kind of code, which checks for all the possible types that a Shape can actually be, it's messy and you need to change it every time you add a new kind of Shape. Here, you just say "You're a shape, I know you can erase( ) and draw( ) yourself, do it, and take care of the details correctly."
What's impressive about the code in doStuff( ) is that, somehow, the right thing happens. Calling draw( ) for Circle causes different code to be executed than when calling draw( ) for a Square or a Line, but when the draw( ) message is sent to an anonymous Shape, the correct behavior occurs based on the actual type of the Shape. This is amazing because, as mentioned earlier, when the Java compiler is compiling the code for doStuff( ), it cannot know exactly what types it is dealing with. So ordinarily, you'd expect it to end up calling the version of erase( ) and draw( ) for the base class Shape, and not for the specific Circle, Square, or Line. And yet the right thing happens because of polymorphism. The compiler and run-time system handle the details; all you need to know is that it happens, and more important, how to design with it. When you send a message to an object, the object will do the right thing, even when upcasting is involved.
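The fragments above can be assembled into a small program that compiles and runs. This is only a sketch: the method bodies, the String return values, and the ShapeDemo class are assumptions made for illustration, not the book's own listing.

```java
// Illustrative sketch; ShapeDemo and all method bodies are hypothetical.
class ShapeDemo {
    // Talks only to the base class; each argument is upcast to Shape at the call.
    static String doStuff(Shape s) {
        return s.erase() + " " + s.draw();
    }

    public static void main(String[] args) {
        Circle c = new Circle();
        Triangle t = new Triangle();
        Line l = new Line();
        System.out.println(doStuff(c)); // Circle.erase Circle.draw
        System.out.println(doStuff(t)); // Triangle.erase Triangle.draw
        System.out.println(doStuff(l)); // Line.erase Line.draw
    }
}

class Shape {
    String draw()  { return "Shape.draw"; }
    String erase() { return "Shape.erase"; }
}

class Circle extends Shape {
    @Override String draw()  { return "Circle.draw"; }
    @Override String erase() { return "Circle.erase"; }
}

class Triangle extends Shape {
    @Override String draw()  { return "Triangle.draw"; }
    @Override String erase() { return "Triangle.erase"; }
}

class Line extends Shape {
    @Override String draw()  { return "Line.draw"; }
    @Override String erase() { return "Line.erase"; }
}
```

Note that doStuff( ) never mentions Circle, Triangle, or Line, yet the derived-class versions run. Adding a Hexagon class would require no change to doStuff( ).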
Often in a design, you want the base class to present only an interface for its derived classes. That is, you don't want anyone to actually create an object of the base class, only to upcast to it so that its interface can be used. This is accomplished by making that class abstract using the abstract keyword. If anyone tries to make an object of an abstract class, the compiler prevents them. This is a tool to enforce a particular design.
You can also use the abstract keyword to describe a method that hasn't been implemented yet, as a stub indicating "here is an interface function for all types inherited from this class, but at this point I don't have any implementation for it." An abstract method may be created only inside an abstract class. When the class is inherited, that method must be implemented, or the inheriting class becomes abstract as well. Creating an abstract method allows you to put a method in an interface without being forced to provide a possibly meaningless body of code for that method.
The interface keyword takes the concept of an abstract class one step further by preventing any function definitions at all. The interface is a very handy and commonly used tool, as it provides the perfect separation of interface and implementation. In addition, you can combine many interfaces together, if you wish, whereas inheriting from multiple regular classes or abstract classes is not possible.
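A minimal sketch of how the abstract and interface keywords fit together might look like this. Every name here (Erasable, Named, AbstractShape, Hexagon, AbstractDemo) is hypothetical and chosen only for illustration.

```java
// Illustrative sketch; all type names are assumptions, not from the text.
class AbstractDemo {
    public static void main(String[] args) {
        // AbstractShape s = new AbstractShape(); // would not compile: the class is abstract
        AbstractShape s = new Hexagon();          // upcast to the abstract base type
        System.out.println(s.redraw());           // Hexagon.erase then Hexagon.draw
        Named n = new Hexagon();                  // the same class seen through another interface
        System.out.println(n.name());             // hexagon
    }
}

// Interfaces declare methods without supplying bodies.
interface Erasable {
    String erase();
}

interface Named {
    String name();
}

abstract class AbstractShape implements Erasable {
    // A stub: every concrete derived class must implement it.
    abstract String draw();

    // Ordinary method written in terms of the not-yet-implemented ones.
    String redraw() {
        return erase() + " then " + draw();
    }
}

// One base class, but several interfaces combined.
class Hexagon extends AbstractShape implements Named {
    @Override String draw()         { return "Hexagon.draw"; }
    @Override public String erase() { return "Hexagon.erase"; }
    @Override public String name()  { return "hexagon"; }
}
```

Hexagon extends a single class but implements two interfaces, which is the combination described above.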
Technically, OOP is just about abstract data typing, inheritance, and polymorphism, but other issues can be at least as important. The remainder of this section will cover these issues.
One of the most important factors is the way objects are created and destroyed. Where is the data for an object and how is the lifetime of the object controlled? There are different philosophies at work here. C++ takes the approach that control of efficiency is the most important issue, so it gives the programmer a choice. For maximum run-time speed, the storage and lifetime can be determined while the program is being written, by placing the objects on the stack (these are sometimes called automatic or scoped variables) or in the static storage area. This places a priority on the speed of storage allocation and release, and control of these can be very valuable in some situations. However, you sacrifice flexibility because you must know the exact quantity, lifetime, and type of objects while you're writing the program. If you are trying to solve a more general problem such as computer-aided design, warehouse management, or air-traffic control, this is too restrictive.
The second approach is to create objects dynamically in a pool of memory called the heap. In this approach, you don't know until run-time how many objects you need, what their lifetime is, or what their exact type is. Those are determined at the spur of the moment while the program is running. If you need a new object, you simply make it on the heap at the point that you need it. Because the storage is managed dynamically, at run-time, the amount of time required to allocate storage on the heap is significantly longer than the time to create storage on the stack. (Creating storage on the stack is often a single assembly instruction to move the stack pointer down, and another to move it back up.) The dynamic approach makes the generally logical assumption that objects tend to be complicated, so the extra overhead of finding storage and releasing that storage will not have an important impact on the creation of an object. In addition, the greater flexibility is essential to solve the general programming problem.
Java uses the second approach, exclusively[8]. Every time you want to create an object, you use the new keyword to build a dynamic instance of that object.
There's another issue, however, and that's the lifetime of an object. With languages that allow objects to be created on the stack, the compiler determines how long the object lasts and can automatically destroy it. However, if you create it on the heap the compiler has no knowledge of its lifetime. In a language like C++, you must determine programmatically when to destroy the object, which can lead to memory leaks if you don't do it correctly (and this is a common problem in C++ programs). Java provides a feature called a garbage collector that automatically discovers when an object is no longer in use and destroys it. A garbage collector is much more convenient because it reduces the number of issues that you must track and the code you must write. More important, the garbage collector provides a much higher level of insurance against the insidious problem of memory leaks (which has brought many a C++ project to its knees).
The rest of this section looks at additional factors concerning object lifetimes and landscapes.
If you don't know how many objects you're going to need to solve a particular problem, or how long they will last, you also don't know how to store those objects. How can you know how much space to create for those objects? You can't, since that information isn't known until run-time.
The solution to most problems in object-oriented design seems flippant: you create another type of object. The new type of object that solves this particular problem holds references to other objects. Of course, you can do the same thing with an array, which is available in most languages. But there's more. This new object, generally called a container (also called a collection, but the Java library uses that term in a different sense so this book will use "container"), will expand itself whenever necessary to accommodate everything you place inside it. So you don't need to know how many objects you're going to hold in a container. Just create a container object and let it take care of the details.
Fortunately, a good OOP language comes with a set of containers as part of the package. In C++, it's part of the Standard C++ Library and is sometimes called the Standard Template Library (STL). Object Pascal has containers in its Visual Component Library (VCL). Smalltalk has a very complete set of containers. Java also has containers in its standard library. In some libraries, a generic container is considered good enough for all needs, and in others (Java, for example) the library has different types of containers for different needs: a vector (called an ArrayList in Java) for consistent access to all elements, and a linked list for consistent insertion at all elements, for example, so you can choose the particular type that fits your needs. Container libraries may also include sets, queues, hash tables, trees, stacks, etc.
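As a rough illustration of a container that expands itself, the following sketch fills a java.util.ArrayList without ever stating how many elements it will hold. The ContainerDemo class and its fill( ) method are hypothetical. The `<Object>` annotation is how a modern JDK spells "holds Object references" and is assumed here.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch; ContainerDemo and fill() are hypothetical names.
class ContainerDemo {
    // Creates a container with no size hint and lets it grow as elements arrive.
    static List<Object> fill(int count) {
        List<Object> container = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            container.add("item " + i);
        }
        return container;
    }

    public static void main(String[] args) {
        List<Object> small = fill(3);
        List<Object> large = fill(10_000);
        System.out.println(small.size() + " " + large.size()); // 3 10000
        System.out.println(large.get(9_999));                  // item 9999
    }
}
```

The same fill( ) code serves for three elements or ten thousand. The container, not the caller, decides when to acquire more storage.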
All containers have some way to put things in and get things out; there are usually functions to add elements to a container, and others to fetch those elements back out. But fetching elements can be more problematic, because a single-selection function is restrictive. What if you want to manipulate or compare a set of elements in the container instead of just one?
The solution is an iterator, which is an object whose job is to select the elements within a container and present them to the user of the iterator. As a class, it also provides a level of abstraction. This abstraction can be used to separate the details of the container from the code that's accessing that container. The container, via the iterator, is abstracted to be simply a sequence. The iterator allows you to traverse that sequence without worrying about the underlying structure, that is, whether it's an ArrayList, a LinkedList, a Stack, or something else. This gives you the flexibility to easily change the underlying data structure without disturbing the code in your program. Java began (in versions 1.0 and 1.1) with a standard iterator, called Enumeration, for all of its container classes. Java 2 has added a much more complete container library that contains an iterator called Iterator that does more than the older Enumeration.
From a design standpoint, all you really want is a sequence that can be manipulated to solve your problem. If a single type of sequence satisfied all of your needs, there'd be no reason to have different kinds. There are two reasons that you need a choice of containers. First, containers provide different types of interfaces and external behavior. A stack has a different interface and behavior than that of a queue, which is different from that of a set or a list. One of these might provide a more flexible solution to your problem than the other. Second, different containers have different efficiencies for certain operations. The best example is an ArrayList and a LinkedList. Both are simple sequences that can have identical interfaces and external behaviors. But certain operations can have radically different costs. Randomly accessing elements in an ArrayList is a constant-time operation; it takes the same amount of time regardless of the element you select. However, in a LinkedList it is expensive to move through the list to randomly select an element, and it takes longer to find an element that is further down the list. On the other hand, if you want to insert an element in the middle of a sequence, it's much cheaper in a LinkedList than in an ArrayList. These and other operations have different efficiencies depending on the underlying structure of the sequence. In the design phase, you might start with a LinkedList and, when tuning for performance, change to an ArrayList. Because of the abstraction via iterators, you can change from one to the other with minimal impact on your code.
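To make the iterator abstraction concrete, here is a small sketch in which the traversal code sees only a java.util.Iterator and never learns which container is behind it. IteratorDemo, join( ), and describe( ) are hypothetical names, and the element type is written with a modern JDK's angle-bracket notation.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Iterator;
import java.util.LinkedList;

// Illustrative sketch; IteratorDemo and its methods are hypothetical.
class IteratorDemo {
    // Sees only a sequence; never asks which container produced the iterator.
    static String join(Iterator<String> it) {
        StringBuilder sb = new StringBuilder();
        while (it.hasNext()) {
            sb.append(it.next());
            if (it.hasNext()) {
                sb.append(", ");
            }
        }
        return sb.toString();
    }

    // Fills whatever container it is handed, then traverses it via its iterator.
    static String describe(Collection<String> container) {
        container.add("circle");
        container.add("triangle");
        container.add("line");
        return join(container.iterator());
    }

    public static void main(String[] args) {
        // Swapping the underlying structure changes only these two constructor calls.
        System.out.println(describe(new LinkedList<>())); // circle, triangle, line
        System.out.println(describe(new ArrayList<>()));  // circle, triangle, line
    }
}
```

Moving from a LinkedList to an ArrayList during performance tuning touches the line that creates the container and nothing in join( ).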
In the end, remember that a container is only a storage cabinet to put objects in. If that cabinet solves all of your needs, it doesn't really matter how it is implemented (a basic concept with most types of objects). If you're working in a programming environment that has built-in overhead due to other factors, then the cost difference between an ArrayList and a LinkedList might not matter. You might need only one type of sequence. You can even imagine the "perfect" container abstraction, which can automatically change its underlying implementation according to the way it is used.
One of the issues in OOP that has become especially prominent since the introduction of C++ is whether all classes should ultimately be inherited from a single base class. In Java (as with virtually all other OOP languages) the answer is "yes" and the name of this ultimate base class is simply Object. It turns out that the benefits of the singly rooted hierarchy are many.
All objects in a singly rooted hierarchy have an interface in common, so they are all ultimately the same type. The alternative (provided by C++) is that you don't know that everything is the same fundamental type. From a backward-compatibility standpoint this fits the model of C better and can be thought of as less restrictive, but when you want to do full-on object-oriented programming you must then build your own hierarchy to provide the same convenience that's built into other OOP languages. And in any new class library you acquire, some other incompatible interface will be used. It requires effort (and possibly multiple inheritance) to work the new interface into your design. Is the extra "flexibility" of C++ worth it? If you need it (if you have a large investment in C), it's quite valuable. If you're starting from scratch, other alternatives such as Java can often be more productive.
All objects in a singly rooted hierarchy (such as Java provides) can be guaranteed to have certain functionality. You know you can perform certain basic operations on every object in your system. A singly rooted hierarchy, along with creating all objects on the heap, greatly simplifies argument passing (one of the more complex topics in C++).
A singly rooted hierarchy makes it much easier to implement a garbage collector (which is conveniently built into Java). The necessary support can be installed in the base class, and the garbage collector can thus send the appropriate messages to every object in the system. Without a singly rooted hierarchy and a system to manipulate an object via a reference, it is difficult to implement a garbage collector.
Since run-time type information is guaranteed to be in all objects, you'll never end up with an object whose type you cannot determine. This is especially important with system level operations, such as exception handling, and to allow greater flexibility in programming.
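The guarantees described above can be seen in a few lines. In this sketch, a method that accepts Object works on any argument and can always ask for its run-time type. Airplane and RootDemo are hypothetical names, while getClass( ), getSimpleName( ), and equals( ) are standard methods.

```java
// Illustrative sketch; Airplane and RootDemo are hypothetical names.
class RootDemo {
    // Accepts anything, because every class ultimately inherits from Object.
    static String describe(Object o) {
        // getClass() and equals() are inherited from Object by every class.
        return o.getClass().getSimpleName() + " equalsSelf=" + o.equals(o);
    }

    public static void main(String[] args) {
        System.out.println(describe(new Airplane()));     // Airplane equalsSelf=true
        System.out.println(describe("text"));             // String equalsSelf=true
        System.out.println(describe(Integer.valueOf(7))); // Integer equalsSelf=true
    }
}

// Declares no base class, so it implicitly extends Object.
class Airplane { }
```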
Because a container is a tool that you'll use frequently, it makes sense to have a library of containers that are built in a reusable fashion, so you can take one off the shelf and plug it into your program. Java provides such a library, which should satisfy most needs.
To make these containers reusable, they hold the one universal type in Java that was previously mentioned: Object. The singly rooted hierarchy means that everything is an Object, so a container that holds Objects can hold anything. This makes containers easy to reuse.
To use such a container, you simply add object references to it, and later ask for them back. But, since the container holds only Objects, when you add your object reference into the container it is upcast to Object, thus losing its identity. When you fetch it back, you get an Object reference, and not a reference to the type that you put in. So how do you turn it back into something that has the useful interface of the object that you put into the container?
Here, the cast is used again, but this time you're not casting up the inheritance hierarchy to a more general type; you cast down the hierarchy to a more specific type. This manner of casting is called downcasting. With upcasting, you know, for example, that a Circle is a type of Shape so it's safe to upcast, but you don't know that an Object is necessarily a Circle or a Shape so it's hardly safe to downcast unless you know that's what you're dealing with.
It's not completely dangerous, however, because if you downcast to the wrong thing you'll get a run-time error called an exception, which will be described shortly. When you fetch object references from a container, though, you must have some way to remember exactly what they are so you can perform a proper downcast.
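A short sketch shows both outcomes of a downcast: one that succeeds and one that fails with the standard ClassCastException. The classes and the fetchAndDraw( ) method are hypothetical. The container is declared to hold Object (spelled `<Object>` on a modern JDK, an assumption here) so that it behaves like the Object-holding containers just described.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch; DowncastDemo, Shape, Circle and fetchAndDraw() are hypothetical.
class DowncastDemo {
    static String fetchAndDraw(List<Object> container, int index) {
        Object o = container.get(index); // comes back as a plain Object reference
        Shape s = (Shape) o;             // downcast, checked at run time
        return s.draw();
    }

    public static void main(String[] args) {
        List<Object> container = new ArrayList<>();
        container.add(new Circle());     // upcast to Object on the way in
        container.add("not a shape");    // the container accepts this too
        System.out.println(fetchAndDraw(container, 0)); // Circle.draw
        try {
            fetchAndDraw(container, 1);
        } catch (ClassCastException e) {
            System.out.println("bad downcast caught");
        }
    }
}

class Shape {
    String draw() { return "Shape.draw"; }
}

class Circle extends Shape {
    @Override String draw() { return "Circle.draw"; }
}
```

Nothing stops the String from going into the container. The mistake surfaces only when the element is fetched and cast, which is why you have to keep track of what you stored.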
Downcasting and the run-time checks require extra time for the running program, and extra effort from the programmer. Wouldn't it make sense to somehow create the container so that it knows the types that it holds, eliminating the need for the downcast and a possible mistake? The solution is parameterized types, which are classes that the compiler can automatically customize to work with particular types. For example, with a parameterized container, the compiler could customize that container so that it would accept only Shapes and fetch only Shapes.
Parameterized types are an important part of C++, partly because C++ has no singly rooted hierarchy. In C++, the keyword that implements parameterized types is "template." Java currently has no parameterized types since it is possible for it to get by, however awkwardly, using the singly rooted hierarchy. However, a current proposal for parameterized types uses a syntax that is strikingly similar to C++ templates.
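Java releases later than the one described here (Java 5 onward) did add parameterized types, under the name generics. Assuming such a JDK, a container that accepts only Shapes and fetches only Shapes can be sketched as follows; the Shape and Circle classes and the GenericDemo name are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch for a JDK with generics; all class names are hypothetical.
class GenericDemo {
    static String drawAll(List<Shape> shapes) {
        StringBuilder sb = new StringBuilder();
        for (Shape s : shapes) {          // elements come back as Shape: no downcast
            sb.append(s.draw()).append(';');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<Shape> shapes = new ArrayList<>();
        shapes.add(new Circle());
        shapes.add(new Shape());
        // shapes.add("not a shape");     // would not compile: a String is not a Shape
        System.out.println(drawAll(shapes)); // Circle.draw;Shape.draw;
    }
}

class Shape {
    String draw() { return "Shape.draw"; }
}

class Circle extends Shape {
    @Override String draw() { return "Circle.draw"; }
}
```

The mistake that a downcast would reveal only at run time is rejected here by the compiler.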
Each object requires resources in order to exist, most notably memory. When an object is no longer needed it must be cleaned up so that these resources are released for reuse. In simple programming situations the question of how an object is cleaned up doesn't seem too challenging: you create the object, use it for as long as it's needed, and then it should be destroyed. It's not hard, however, to encounter situations that are more complex.
Suppose, for example, you are designing a system to manage air traffic for an airport. (The same model might also work for managing crates in a warehouse, or a video rental system, or a kennel for boarding pets.) At first it seems simple: make a container to hold airplanes, then create a new airplane and place it in the container for each airplane that enters the air-traffic-control zone. For cleanup, simply delete the appropriate airplane object when a plane leaves the zone.
But perhaps you have some other system to record data about the planes; perhaps data that doesn't require such immediate attention as the main controller function. Maybe it's a record of the flight plans of all the small planes that leave the airport. So you have a second container of small planes, and whenever you create a plane object you also put it in this second container if it's a small plane. Then some background process performs operations on the objects in this container during idle moments.
Now the problem is more difficult: how can you possibly know when to destroy the objects? When you're done with the object, some other part of the system might not be. This same problem can arise in a number of other situations, and in programming systems (such as C++) in which you must explicitly delete an object when you're done with it this can become quite complex.
With Java, the garbage collector is designed to take care of the problem of releasing the memory (although this doesn't include other aspects of cleaning up an object). The garbage collector "knows" when an object is no longer in use, and it then automatically releases the memory for that object. This (combined with the fact that all objects are inherited from the single root class Object and that you can create objects only one way, on the heap) makes the process of programming in Java much simpler than programming in C++. You have far fewer decisions to make and hurdles to overcome.
If all this is such a good idea, why didn't they do the same thing in C++? Well of course there's a price you pay for all this programming convenience, and that price is run-time overhead. As mentioned before, in C++ you can create objects on the stack, and in this case they're automatically cleaned up (but you don't have the flexibility of creating as many as you want at run-time). Creating objects on the stack is the most efficient way to allocate storage for objects and to free that storage. Creating objects on the heap can be much more expensive. Always inheriting from a base class and making all function calls polymorphic also exacts a small toll. But the garbage collector is a particular problem because you never quite know when it's going to start up or how long it will take. This means that there's an inconsistency in the rate of execution of a Java program, so you can't use it in certain situations, such as when the rate of execution of a program is uniformly critical. (These are generally called real-time programs, although not all real-time programming problems are this stringent.)
The designers of the C++ language, trying to woo C programmers (and most successfully, at that), did not want to add any features to the language that would impact the speed or the use of C++ in any situation where programmers might otherwise choose C. This goal was realized, but at the price of greater complexity when programming in C++. Java is simpler than C++, but the trade-off is in efficiency and sometimes applicability. For a significant portion of programming problems, however, Java is the superior choice.
Ever since the beginning of programming languages, error handling has been one of the most difficult issues. Because it's so hard to design a good error handling scheme, many languages simply ignore the issue, passing the problem on to library designers who come up with halfway measures that can work in many situations but can easily be circumvented, generally by just ignoring them. A major problem with most error handling schemes is that they rely on programmer vigilance in following an agreed-upon convention that is not enforced by the language. If the programmer is not vigilant (often the case if they are in a hurry), these schemes can easily be forgotten.
Exception handling wires error handling directly into the programming language and sometimes even the operating system. An exception is an object that is "thrown" from the site of the error and can be "caught" by an appropriate exception handler designed to handle that particular type of error. It's as if exception handling is a different, parallel path of execution that can be taken when things go wrong. And because it uses a separate execution path, it doesn't need to interfere with your normally executing code. This makes that code simpler to write since you aren't constantly forced to check for errors. In addition, a thrown exception is unlike an error value that's returned from a function or a flag that's set by a function in order to indicate an error condition; these can be ignored. An exception cannot be ignored, so it's guaranteed to be dealt with at some point. Finally, exceptions provide a way to reliably recover from a bad situation. Instead of just exiting you are often able to set things right and restore the execution of a program, which produces much more robust programs.
Java's exception handling stands out among programming languages, because in Java, exception handling was wired in from the beginning and you're forced to use it. If you don't write your code to properly handle exceptions, you'll get a compile-time error message. This guaranteed consistency makes error handling much easier.
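The throw-and-catch mechanism can be sketched briefly. SensorException, read( ), and safeRead( ) are hypothetical names invented for this example. Note that the compile-time enforcement applies to checked exceptions such as the one defined here (classes derived from Exception but not from RuntimeException).

```java
// Illustrative sketch; SensorException, read() and safeRead() are hypothetical.
class ExceptionDemo {
    // The "throws" clause is part of the signature, so callers must
    // either catch SensorException or declare it themselves.
    static int read(int raw) throws SensorException {
        if (raw < 0) {
            throw new SensorException("negative reading: " + raw);
        }
        return raw * 2;
    }

    // Normal path and error path are kept apart; no return codes to check.
    static String safeRead(int raw) {
        try {
            return "ok " + read(raw);
        } catch (SensorException e) {
            return "recovered from " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(safeRead(21)); // ok 42
        System.out.println(safeRead(-1)); // recovered from negative reading: -1
    }
}

// A checked exception: it extends Exception, not RuntimeException.
class SensorException extends Exception {
    SensorException(String message) {
        super(message);
    }
}
```

Removing the try block from safeRead( ) without adding a throws clause would make the program fail to compile, which is the enforcement described above.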
It's worth noting that exception handling isn't an object-oriented feature, although in object-oriented languages the exception is normally represented with an object. Exception handling existed before object-oriented languages.
A fundamental concept in computer programming is the idea of handling more than one task at a time. Many programming problems require that the program be able to stop what it's doing, deal with some other problem, and then return to the main process. The solution has been approached in many ways. Initially, programmers with low-level knowledge of the machine wrote interrupt service routines and the suspension of the main process was initiated through a hardware interrupt. Although this worked well, it was difficult and nonportable, so it made moving a program to a new type of machine slow and expensive.
Sometimes interrupts are necessary for handling time-critical tasks, but there's a large class of problems in which you're simply trying to partition the problem into separately running pieces so that the whole program can be more responsive. Within a program, these separately running pieces are called threads, and the general concept is called multithreading. A common example of multithreading is the user interface. By using threads, a user can press a button and get a quick response rather than being forced to wait until the program finishes its current task.
Ordinarily, threads are just a way to allocate the time of a single processor. But if the operating system supports multiple processors, each thread can be assigned to a different processor and they can truly run in parallel. One of the convenient features of multithreading at the language level is that the programmer doesn't need to worry about whether there are many processors or just one. The program is logically divided into threads and if the machine has more than one processor then the program runs faster, without any special adjustments.
All this makes threading sound pretty simple. There is a catch: shared resources. If you have more than one thread running that's expecting to access the same resource you have a problem. For example, two processes can't simultaneously send information to a printer. To solve the problem, resources that can be shared, such as the printer, must be locked while they are being used. So a thread locks a resource, completes its task, and then releases the lock so that someone else can use the resource.
Java's threading is built into the language, which makes a complicated subject much simpler. The threading is supported on an object level, so one thread of execution is represented by one object. Java also provides limited resource locking. It can lock the memory of any object (which is, after all, one kind of shared resource) so that only one thread can use it at a time. This is accomplished with the synchronized keyword. Other types of resources must be locked explicitly by the programmer, typically by creating an object to represent the lock that all threads must check before accessing that resource.
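As a minimal sketch of object-level locking, the following program lets several Thread objects update one shared counter whose methods are marked synchronized. Counter, SyncDemo, and runThreads( ) are hypothetical names, and the lambda syntax used to give each thread its task assumes a modern JDK.

```java
// Illustrative sketch; Counter, SyncDemo and runThreads() are hypothetical.
class SyncDemo {
    static int runThreads(int threadCount, int incrementsPerThread) {
        Counter counter = new Counter();           // the shared resource
        Thread[] workers = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            // Each thread of execution is represented by one Thread object.
            workers[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();                     // wait for the worker to finish
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(runThreads(4, 10_000)); // 40000
    }
}

class Counter {
    private int count = 0;

    // synchronized: only one thread at a time may run these on a given Counter.
    synchronized void increment() { count++; }
    synchronized int get()        { return count; }
}
```

Without synchronized on increment( ), two threads could read the same old value of count and one update would be lost, so the total could come out lower than expected.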
When you create an object, it exists for as long as you need it, but under no circumstances does it exist when the program terminates. While this makes sense at first, there are situations in which it would be incredibly useful if an object could exist and hold its information even while the program wasn't running. Then the next time you started the program, the object would be there and it would have the same information it had the previous time the program was running. Of course, you can get a similar effect by writing the information to a file or to a database, but in the spirit of making everything an object it would be quite convenient to be able to declare an object persistent and have all the details taken care of for you.
Java provides support for "lightweight persistence," which means that you can easily store objects on disk and later retrieve them. The reason it's "lightweight" is that you're still forced to make explicit calls to do the storage and retrieval. In addition, JavaSpaces (described in Chapter 15) provide for a kind of persistent storage of objects. In some future release more complete support for persistence might appear.
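The explicit store and retrieve calls can be sketched with the standard java.io object streams. FlightPlan, PersistDemo, and the method names are hypothetical. To keep the sketch free of file paths, the bytes go to an in-memory array, which is an assumption of this example; wrapping a FileOutputStream and FileInputStream instead would place them on disk.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Illustrative sketch; PersistDemo, FlightPlan and the method names are hypothetical.
class PersistDemo {
    // Explicit call number one: turn the object into bytes.
    static byte[] store(FlightPlan plan) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(plan);
        }
        return bytes.toByteArray();
    }

    // Explicit call number two: rebuild an object from those bytes.
    static FlightPlan retrieve(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (FlightPlan) in.readObject();
        }
    }

    // Convenience wrapper that converts the checked exceptions for callers.
    static FlightPlan roundTrip(FlightPlan plan) {
        try {
            return retrieve(store(plan));
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }

    public static void main(String[] args) {
        FlightPlan copy = roundTrip(new FlightPlan("N123", 3500));
        System.out.println(copy.callSign + " " + copy.altitude); // N123 3500
    }
}

// Implementing Serializable marks the class as storable by the object streams.
class FlightPlan implements Serializable {
    private static final long serialVersionUID = 1L;
    final String callSign;
    final int altitude;

    FlightPlan(String callSign, int altitude) {
        this.callSign = callSign;
        this.altitude = altitude;
    }
}
```

The retrieved FlightPlan is a new object carrying the same information, and nothing happens unless the program asks for the storage and retrieval itself, which is the "lightweight" aspect.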
If Java is, in fact, yet another computer programming language, you may question why it is so important and why it is being promoted as a revolutionary step in computer programming. The answer isn't immediately obvious if you're coming from a traditional programming perspective. Although Java is very useful for solving traditional stand-alone programming problems, it is also important because it will solve programming problems on the World Wide Web.
The Web can seem a bit of a mystery at first, with all this talk of "surfing," "presence," and "home pages." There has even been a growing reaction against "Internet-mania," questioning the economic value and outcome of such a sweeping movement. It's helpful to step back and see what it really is, but to do this you must understand client/server systems, another aspect of computing that's full of confusing issues.
The primary idea of a client/server system is that you have a central repository of information (some kind of data, often in a database) that you want to distribute on demand to some set of people or machines. A key to the client/server concept is that the repository of information is centrally located so that it can be changed and so that those changes will propagate out to the information consumers. Taken together, the information repository, the software that distributes the information, and the machine(s) where the information and software reside are called the server. The software that resides on the remote machine, communicates with the server, fetches the information, processes it, and then displays it on the remote machine is called the client.
The basic concept of client/server computing, then, is not so complicated. The problems arise because you have a single server trying to serve many clients at once. Generally, a database management system is involved so the designer "balances" the layout of data into tables for optimal use. In addition, systems often allow a client to insert new information into a server. This means you must ensure that one client's new data doesn't walk over another client's new data, or that data isn't lost in the process of adding it to the database. (This is called transaction processing.) As client software changes, it must be built, debugged, and installed on the client machines, which turns out to be more complicated and expensive than you might think. It's especially problematic to support multiple types of computers and operating systems. Finally, there's the all-important performance issue: you might have hundreds of clients making requests of your server at any one time, and so any small delay is crucial. To minimize latency, programmers work hard to offload processing tasks, often to the client machine, but sometimes to other machines at the server site, using so-called middleware. (Middleware is also used to improve maintainability.)
The simple idea of distributing information to people has so many layers of complexity in implementing it that the whole problem can seem hopelessly enigmatic. And yet it's crucial: client/server computing accounts for roughly half of all programming activities. It's responsible for everything from taking orders and credit-card transactions to the distribution of any kind of data: stock market, scientific, government, you name it. What we've come up with in the past is individual solutions to individual problems, inventing a new solution each time. These were hard to create and hard to use, and the user had to learn a new interface for each one. The entire client/server problem needs to be solved in a big way.
The Web is actually one giant client/server system. It's a bit worse than that, since you have all the servers and clients coexisting on a single network at once. You don't need to know that, since all you care about is connecting to and interacting with one server at a time (even though you might be hopping around the world in your search for the correct server).
Initially it was a simple one-way process. You made a request of a server and it handed you a file, which your machine's browser software (i.e., the client) would interpret by formatting onto your local machine. But in short order people began wanting to do more than just deliver pages from a server. They wanted full client/server capability so that the client could feed information back to the server, for example, to do database lookups on the server, to add new information to the server, or to place an order (which required more security than the original systems offered). These are the changes we've been seeing in the development of the Web.
The Web browser was a big step forward: the concept that one piece of information could be displayed on any type of computer without change. However, browsers were still rather primitive and rapidly bogged down by the demands placed on them. They weren't particularly interactive, and tended to clog up both the server and the Internet because any time you needed to do something that required programming you had to send information back to the server to be processed. It could take many seconds or minutes to find out you had misspelled something in your request. Since the browser was just a viewer it couldn't perform even the simplest computing tasks. (On the other hand, it was safe, since it couldn't execute any programs on your local machine that might contain bugs or viruses.)
To solve this problem, different approaches have been taken. To begin with, graphics standards have been enhanced to allow better animation and video within browsers. The remainder of the problem can be solved only by incorporating the ability to run programs on the client end, under the browser. This is called client-side programming.
The Web's initial server-browser design provided for interactive content, but the interactivity was completely provided by the server. The server produced static pages for the client browser, which would simply interpret and display them. Basic HTML contains simple mechanisms for data gathering: text-entry boxes, check boxes, radio buttons, lists and drop-down lists, as well as a button that can only be programmed to reset the data on the form or "submit" the data on the form back to the server. This submission passes through the Common Gateway Interface (CGI) provided on all Web servers. The text within the submission tells CGI what to do with it. The most common action is to run a program located on the server in a directory that's typically called "cgi-bin." (If you watch the address window at the top of your browser when you push a button on a Web page, you can sometimes see "cgi-bin" within all the gobbledygook there.) These programs can be written in most languages. Perl is a common choice because it is designed for text manipulation and is interpreted, so it can be installed on any server regardless of processor or operating system.
Many powerful Web sites today are built strictly on CGI, and you can in fact do nearly anything with it. However, Web sites built on CGI programs can rapidly become overly complicated to maintain, and there is also the problem of response time. The response of a CGI program depends on how much data must be sent, as well as the load on both the server and the Internet. (On top of this, starting a CGI program tends to be slow.) The initial designers of the Web did not foresee how rapidly this bandwidth would be exhausted for the kinds of applications people developed. For example, any sort of dynamic graphing is nearly impossible to perform with consistency because a GIF file must be created and moved from the server to the client for each version of the graph. And you've no doubt had direct experience with something as simple as validating the data on an input form. You press the submit button on a page; the data is shipped back to the server; the server starts a CGI program that discovers an error, formats an HTML page informing you of the error, and then sends the page back to you; you must then back up a page and try again. Not only is this slow, it's inelegant.
The solution is client-side programming. Most machines that run Web browsers are powerful engines capable of doing vast work, and with the original static HTML approach they are sitting there, just idly waiting for the server to dish up the next page. Client-side programming means that the Web browser is harnessed to do whatever work it can, and the result for the user is a much speedier and more interactive experience at your Web site.
The problem with discussions of client-side programming is that they aren't very different from discussions of programming in general. The parameters are almost the same, but the platform is different: a Web browser is like a limited operating system. In the end, you must still program, and this accounts for the dizzying array of problems and solutions produced by client-side programming. The rest of this section provides an overview of the issues and approaches in client-side programming.
One of the most significant steps forward in client-side programming is the development of the plug-in. This is a way for a programmer to add new functionality to the browser by downloading a piece of code that plugs itself into the appropriate spot in the browser. It tells the browser "from now on you can perform this new activity." (You need to download the plug-in only once.) Some fast and powerful behavior is added to browsers via plug-ins, but writing a plug-in is not a trivial task, and isn't something you'd want to do as part of the process of building a particular site. The value of the plug-in for client-side programming is that it allows an expert programmer to develop a new language and add that language to a browser without the permission of the browser manufacturer. Thus, plug-ins provide a "back door" that allows the creation of new client-side programming languages (although not all languages are implemented as plug-ins).
Plug-ins resulted in an explosion of scripting languages. With a scripting language you embed the source code for your client-side program directly into the HTML page, and the plug-in that interprets that language is automatically activated while the HTML page is being displayed. Scripting languages tend to be reasonably easy to understand and, because they are simply text that is part of an HTML page, they load very quickly as part of the single server hit required to procure that page. The trade-off is that your code is exposed for everyone to see (and steal). Generally, however, you aren't doing amazingly sophisticated things with scripting languages so this is not too much of a hardship.
This points out that the scripting languages used inside Web browsers are really intended to solve specific types of problems, primarily the creation of richer and more interactive graphical user interfaces (GUIs). However, a scripting language might solve 80 percent of the problems encountered in client-side programming. Your problems might very well fit completely within that 80 percent, and since scripting languages can allow easier and faster development, you should probably consider a scripting language before looking at a more involved solution such as Java or ActiveX programming.
The most commonly discussed browser scripting languages are JavaScript (which has nothing to do with Java; it's named that way just to grab some of Java's marketing momentum), VBScript (which looks like Visual Basic), and Tcl/Tk, which comes from the popular cross-platform GUI-building language. There are others out there, and no doubt more in development.
JavaScript is probably the most commonly supported. It comes built into both Netscape Navigator and the Microsoft Internet Explorer (IE). In addition, there are probably more JavaScript books available than there are for the other browser languages, and some tools automatically create pages using JavaScript. However, if you're already fluent in Visual Basic or Tcl/Tk, you'll be more productive using those scripting languages rather than learning a new one. (You'll have your hands full dealing with the Web issues already.)
If a scripting language can solve 80 percent of the client-side programming problems, what about the other 20 percent, the "really hard stuff"? The most popular solution today is Java. Not only is it a powerful programming language built to be secure, cross-platform, and international, but Java is being continually extended to provide language features and libraries that elegantly handle problems that are difficult in traditional programming languages, such as multithreading, database access, network programming, and distributed computing. Java allows client-side programming via the applet.
An applet is a mini-program that will run only under a Web browser. The applet is downloaded automatically as part of a Web page (just as, for example, a graphic is automatically downloaded). When the applet is activated it executes a program. This is part of its beauty: it provides you with a way to automatically distribute the client software from the server at the time the user needs the client software, and no sooner. The user gets the latest version of the client software without fail and without difficult reinstallation. Because of the way Java is designed, the programmer needs to create only a single program, and that program automatically works with all computers that have browsers with built-in Java interpreters. (This safely includes the vast majority of machines.) Since Java is a full-fledged programming language, you can do as much work as possible on the client before and after making requests of the server. For example, you won't need to send a request form across the Internet to discover that you've gotten a date or some other parameter wrong, and your client computer can quickly do the work of plotting data instead of waiting for the server to make a plot and ship a graphic image back to you. Not only do you get the immediate win of speed and responsiveness, but the general network traffic and load on servers can be reduced, preventing the entire Internet from slowing down.
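The date check mentioned above is the kind of work that can happen on the client before anything is sent to the server. Here is a minimal sketch of the idea, not applet code: it assumes the form field holds a date in ISO form (yyyy-MM-dd) and that a modern JDK with the java.time package is available. The class and method names are hypothetical.

```java
import java.time.LocalDate;
import java.time.format.DateTimeParseException;

// Hypothetical helper: checks a form field locally, before any server request.
class DateFieldCheck {
    // Returns true only if the text is a real calendar date in ISO form (yyyy-MM-dd).
    static boolean isValidDate(String text) {
        if (text == null) {
            return false;
        }
        try {
            LocalDate.parse(text);  // rejects malformed text and impossible dates
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }
}
```

A client could call DateFieldCheck.isValidDate on the field's text and show an error immediately, submitting the form only when the check passes.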
One advantage a Java applet has over a scripted program is that it's in compiled form, so the source code isn't available to the client. On the other hand, a Java applet can be decompiled without too much trouble, but hiding your code is often not an important issue. Two other factors can be important. As you will see later in this book, a compiled Java applet can comprise many modules and take multiple server "hits" (accesses) to download. (In Java 1.1 and higher this is minimized by Java archives, called JAR files, that allow all the required modules to be packaged together and compressed for a single download.) A scripted program will just be integrated into the Web page as part of its text (and will generally be smaller and reduce server hits). This could be important to the responsiveness of your Web site. Another factor is the all-important learning curve. Regardless of what you've heard, Java is not a trivial language to learn. If you're a Visual Basic programmer, moving to VBScript will be your fastest solution, and since it will probably solve most typical client/server problems you might be hard pressed to justify learning Java. If you're experienced with a scripting language you will certainly benefit from looking at JavaScript or VBScript before committing to Java, since they might fit your needs handily and you'll be more productive sooner.
To some degree, the competitor to Java is Microsoft's ActiveX, although it takes a completely different approach. ActiveX was originally a Windows-only solution, although it is now being developed via an independent consortium to become cross-platform. Effectively, ActiveX says "if your program connects to its environment just so, it can be dropped into a Web page and run under a browser that supports ActiveX." (IE directly supports ActiveX and Netscape does so using a plug-in.) Thus, ActiveX does not constrain you to a particular language. If, for example, you're already an experienced Windows programmer using a language such as C++, Visual Basic, or Borland's Delphi, you can create ActiveX components with almost no changes to your programming knowledge. ActiveX also provides a path for the use of legacy code in your Web pages.
Automatically downloading and running programs across the Internet can sound like a virus-builder's dream. ActiveX especially brings up the thorny issue of security in client-side programming. If you click on a Web site, you might automatically download any number of things along with the HTML page: GIF files, script code, compiled Java code, and ActiveX components. Some of these are benign; GIF files can't do any harm, and scripting languages are generally limited in what they can do. Java was also designed to run its applets within a "sandbox" of safety, which prevents it from writing to disk or accessing memory outside the sandbox.
ActiveX is at the opposite end of the spectrum. Programming with ActiveX is like programming Windows: you can do anything you want. So if you click on a page that downloads an ActiveX component, that component might cause damage to the files on your disk. Of course, programs that you load onto your computer that are not restricted to running inside a Web browser can do the same thing. Viruses downloaded from Bulletin-Board Systems (BBSs) have long been a problem, but the speed of the Internet amplifies the difficulty.
The solution seems to be "digital signatures," whereby code is verified to show who the author is. This is based on the idea that a virus works because its creator can be anonymous, so if you remove the anonymity individuals will be forced to be responsible for their actions. This seems like a good plan because it allows programs to be much more functional, and I suspect it will eliminate malicious mischief. If, however, a program has an unintentional destructive bug it will still cause problems.
The Java approach is to prevent these problems from occurring, via the sandbox. The Java interpreter that lives on your local Web browser examines the applet for any untoward instructions as the applet is being loaded. In particular, the applet cannot write files to disk or erase files (one of the mainstays of viruses). Applets are generally considered to be safe, and since this is essential for reliable client/server systems, any bugs in the Java language that allow viruses are rapidly repaired. (It's worth noting that the browser software actually enforces these security restrictions, and some browsers allow you to select different security levels to provide varying degrees of access to your system.)
You might be skeptical of this rather draconian restriction against writing files to your local disk. For example, you may want to build a local database or save data for later use offline. The initial vision seemed to be that eventually everyone would get online to do anything important, but that was soon seen to be impractical (although low-cost "Internet appliances" might someday satisfy the needs of a significant segment of users). The solution is the "signed applet" that uses public-key encryption to verify that an applet does indeed come from where it claims it does. A signed applet can still trash your disk, but the theory is that since you can now hold the applet creator accountable they won't do vicious things. Java provides a framework for digital signatures so that you will eventually be able to allow an applet to step outside the sandbox if necessary.
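The idea behind verifying where code came from can be sketched with the standard java.security classes. This is an illustration of public-key signing in general, not the actual applet-signing mechanism; the class name, the choice of RSA with SHA-256, and the 2048-bit key size are all assumptions made for the example.

```java
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;

// Hypothetical sketch: the author signs the code bytes, the client verifies them.
class SignedCodeSketch {
    private static final String ALGORITHM = "SHA256withRSA";  // assumed choice

    // Creates the author's key pair; the public half would be given to clients.
    static KeyPair newKeyPair() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Author side: produce a signature over the code bytes with the private key.
    static byte[] sign(byte[] code, PrivateKey key) {
        try {
            Signature signer = Signature.getInstance(ALGORITHM);
            signer.initSign(key);
            signer.update(code);
            return signer.sign();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Client side: true only if the bytes are exactly what the key holder signed.
    static boolean verify(byte[] code, byte[] signature, PublicKey key) {
        try {
            Signature verifier = Signature.getInstance(ALGORITHM);
            verifier.initVerify(key);
            verifier.update(code);
            return verifier.verify(signature);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Note what the check does and does not establish: a successful verify says the bytes came from the holder of the private key and were not altered, which is the accountability argument above. It says nothing about whether the code is free of bugs.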
Digital signatures have missed an important issue, which is the speed at which people move around on the Internet. If you download a buggy program and it does something untoward, how long will it be before you discover the damage? It could be days or even weeks. By then, how will you track down the program that's done it? And what good will it do you at that point?
The Web is the most general solution to the client/server problem, so it makes sense that you can use the same technology to solve a subset of the problem, in particular the classic client/server problem within a company. With traditional client/server approaches you have the problem of multiple types of client computers, as well as the difficulty of installing new client software, both of which are handily solved with Web browsers and client-side programming. When Web technology is used for an information network that is restricted to a particular company, it is referred to as an intranet. Intranets provide much greater security than the Internet, since you can physically control access to the servers within your company. In terms of training, it seems that once people understand the general concept of a browser it's much easier for them to deal with differences in the way pages and applets look, so the learning curve for new kinds of systems seems to be reduced.
The security problem brings us to one of the divisions that seems to be automatically forming in the world of client-side programming. If your program is running on the Internet, you don't know what platform it will be working under, and you want to be extra careful that you don't disseminate buggy code. You need something cross-platform and secure, like a scripting language or Java.
If you're running on an intranet, you might have a different set of constraints. It's not uncommon that your machines could all be Intel/Windows platforms. On an intranet, you're responsible for the quality of your own code and can repair bugs when they're discovered. In addition, you might already have a body of legacy code that you've been using in a more traditional client/server approach, whereby you must physically install client programs every time you do an upgrade. The time wasted in installing upgrades is the most compelling reason to move to browsers, because upgrades are invisible and automatic. If you are involved in such an intranet, the most sensible approach to take is the shortest path that allows you to use your existing code base, rather than trying to recode your programs in a new language.
When faced with this bewildering array of solutions to the client-side programming problem, the best plan of attack is a cost-benefit analysis. Consider the constraints of your problem and what would be the shortest path to your solution. Since client-side programming is still programming, it's always a good idea to take the fastest development approach for your particular situation. This is an aggressive stance to prepare for inevitable encounters with the problems of program development.
This whole discussion has ignored the issue of server-side programming. What happens when you make a request of a server? Most of the time the request is simply "send me this file." Your browser then interprets the file in some appropriate fashion: as an HTML page, a graphic image, a Java applet, a script program, etc. A more complicated request to a server generally involves a database transaction. A common scenario involves a request for a complex database search, which the server then formats into an HTML page and sends to you as the result. (Of course, if the client has more intelligence via Java or a scripting language, the raw data can be sent and formatted at the client end, which will be faster and less load on the server.) Or you might want to register your name in a database when you join a group or place an order, which will involve changes to that database. These database requests must be processed via some code on the server side, which is generally referred to as server-side programming. Traditionally, server-side programming has been performed using Perl and CGI scripts, but more sophisticated systems have been appearing. These include Java-based Web servers that allow you to perform all your server-side programming in Java by writing what are called servlets. Servlets and their offspring, JSPs, are two of the most compelling reasons that companies who develop Web sites are moving to Java, especially because they eliminate the problems of dealing with differently abled browsers.
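The "format the search result into an HTML page" step described above can be sketched in plain Java. This is not servlet code and uses no Web-server API; it only shows the formatting work the server-side program would do once the database has returned its rows. The class name, method names, and page layout are hypothetical.

```java
import java.util.List;

// Hypothetical server-side helper: turns search results into an HTML page.
class SearchResultPage {
    // Escapes the characters that a browser would otherwise read as markup.
    static String escape(String text) {
        return text.replace("&", "&amp;")
                   .replace("<", "&lt;")
                   .replace(">", "&gt;");
    }

    // Builds a complete page listing one result per list item.
    static String render(String query, List<String> rows) {
        StringBuilder html = new StringBuilder();
        html.append("<html><body>\n");
        html.append("<h1>Results for ").append(escape(query)).append("</h1>\n");
        html.append("<ul>\n");
        for (String row : rows) {
            html.append("<li>").append(escape(row)).append("</li>\n");
        }
        html.append("</ul>\n</body></html>\n");
        return html.toString();
    }
}
```

In a real system the string returned by render would be written to the response that goes back to the browser. If the client is smart enough to do its own formatting, the server could send the raw rows instead and skip this step.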
Much of the brouhaha over Java has been over applets. Java is actually a general-purpose programming language that can solve any type of problem, at least in theory. And as pointed out previously, there might be more effective ways to solve most client/server problems. When you move out of the applet arena (and simultaneously release the restrictions, such as the one against writing to disk) you enter the world of general-purpose applications that run standalone, without a Web browser, just like any ordinary program does. Here, Java's strength is not only in its portability, but also its programmability. As you'll see throughout this book, Java has many features that allow you to create robust programs in a shorter period than with previous programming languages.
Be aware that this is a mixed blessing. You pay for the improvements through slower execution speed (although there is significant work going on in this area; JDK 1.3, in particular, introduces the so-called "hotspot" performance improvements). Like any language, Java has built-in limitations that might make it inappropriate to solve certain types of programming problems. Java is a rapidly evolving language, however, and as each new release comes out it becomes more and more attractive for solving larger sets of problems.
The object-oriented paradigm is a new and different way of thinking about programming. Many folks have trouble at first knowing how to approach an OOP project. Once you know that everything is supposed to be an object, and as you learn to think more in an object-oriented style, you can begin to create "good" designs that take advantage of all the benefits that OOP has to offer.
A methodology (sometimes simply called a method) is a set of processes and heuristics used to break down the complexity of a programming problem. Many OOP methodologies have been formulated since the dawn of object-oriented programming. This section will give you a feel for what you're trying to accomplish when using a methodology.
Especially in OOP, methodology is a field of many experiments, so it is important to understand what problem the methodology is trying to solve before you consider adopting one. This is particularly true with Java, in which the programming language is intended to reduce the complexity (compared to C) involved in expressing a program. This may in fact alleviate the need for ever-more-complex methodologies. Instead, simple methodologies may suffice in Java for a much larger class of problems than you could handle using simple methodologies with procedural languages.
It's also important to realize that the term "methodology" is often too grand and promises too much. Whatever you do now when you design and write a program is a methodology. It may be your own methodology, and you may not be conscious of doing it, but it is a process you go through as you create. If it is an effective process, it may need only a small tune-up to work with Java. If you are not satisfied with your productivity and the way your programs turn out, you may want to consider adopting a formal methodology, or choosing pieces from among the many formal methodologies.
While you're going through the development process, the most important issue is this: Don't get lost. It's easy to do. Most of the analysis and design methodologies are intended to solve the largest of problems. Remember that most projects don't fit into that category, so you can usually have successful analysis and design with a relatively small subset of what a methodology recommends[9]. But some sort of process, no matter how limited, will generally get you on your way in a much better fashion than simply beginning to code.
It's also easy to get stuck, to fall into "analysis paralysis," where you feel like you can't move forward because you haven't nailed down every little detail at the current stage. Remember, no matter how much analysis you do, there are some things about a system that won't reveal themselves until design time, and more things that won't reveal themselves until you're coding, or not even until a program is up and running. Because of this, it's crucial to move fairly quickly through analysis and design, and to implement a test of the proposed system.
This point is worth emphasizing. Because of the history we've had with procedural languages, it is commendable that a team will want to proceed carefully and understand every minute detail before moving to design and implementation. Certainly, when creating a Database Management System (DBMS), it pays to understand a customer's needs thoroughly. But a DBMS is in a class of problems that is very well-posed and well-understood; in many such programs, the database structure is the problem to be tackled. The class of programming problem discussed in this chapter is of the "wild-card" (my term) variety, in which the solution isn't simply re-forming a well-known solution, but instead involves one or more "wild-card factors": elements for which there is no well-understood previous solution, and for which research is necessary[10]. Attempting to thoroughly analyze a wild-card problem before moving into design and implementation results in analysis paralysis because you don't have enough information to solve this kind of problem during the analysis phase. Solving such a problem requires iteration through the whole cycle, and that requires risk-taking behavior (which makes sense, because you're trying to do something new and the potential rewards are higher). It may seem like the risk is compounded by "rushing" into a preliminary implementation, but it can instead reduce the risk in a wild-card project because you're finding out early whether a particular approach to the problem is viable. Product development is risk management.
It's often proposed that you "build one to throw away." With OOP, you may still throw part of it away, but because code is encapsulated into classes, during the first pass you will inevitably produce some useful class designs and develop some worthwhile ideas about the system design that do not need to be thrown away. Thus, the first rapid pass at a problem not only produces critical information for the next analysis, design, and implementation pass, it also creates a code foundation.
That said, if you're looking at a methodology that contains tremendous detail and suggests many steps and documents, it's still difficult to know when to stop. Keep in mind what you're trying to discover:
If you come up with nothing more than the objects and their interfaces, then you can write a program. For various reasons you might need more descriptions and documents than this, but you can't get away with any less.
The process can be undertaken in five phases, and a Phase 0 that is just the initial commitment to using some kind of structure.
You must first decide what steps you're going to have in your process. It sounds simple (in fact, all of this sounds simple), and yet people often don't make this decision before they start coding. If your plan is "let's jump in and start coding," fine. (Sometimes that's appropriate when you have a well-understood problem.) At least agree that this is the plan.
You might also decide at this phase that some additional process structure is necessary, but not the whole nine yards. Understandably, some programmers like to work in "vacation mode," in which no structure is imposed on the process of developing their work; "It will be done when it's done." This can be appealing for a while, but I've found that having a few milestones along the way helps to focus and galvanize your efforts around those milestones instead of being stuck with the single goal of "finish the project." In addition, it divides the project into more bite-sized pieces and makes it seem less threatening (plus the milestones offer more opportunities for celebration).
When I began to study story structure (so that I will someday write a novel) I was initially resistant to the idea of structure, feeling that I wrote best when I simply let it flow onto the page. But I later realized that when I write about computers the structure is clear enough to me that I don't have to think about it very much. But I still structure my work, albeit only semi-consciously in my head. Even if you think that your plan is to just start coding, you still somehow go through the subsequent phases while asking and answering certain questions.
Any system you build, no matter how complicated, has a fundamental purpose; the business that it's in, the basic need that it satisfies. If you can look past the user interface, the hardware- or system-specific details, the coding algorithms and the efficiency problems, you will eventually find the core of its being, simple and straightforward. Like the so-called high concept from a Hollywood movie, you can describe it in one or two sentences. This pure description is the starting point.
The high concept is quite important because it sets the tone for your project; it's a mission statement. You won't necessarily get it right the first time (you may be in a later phase of the project before it becomes completely clear), but keep trying until it feels right. For example, in an air-traffic control system you may start out with a high concept focused on the system that you're building: "The tower program keeps track of the aircraft." But consider what happens when you shrink the system to a very small airfield; perhaps there's only a human controller, or none at all. A more useful model won't concern the solution you're creating as much as it describes the problem: "Aircraft arrive, unload, service and reload, then depart."
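To see how the second statement reads as a model, here is a minimal sketch that describes only the problem: an aircraft moving through the stages named in the sentence. Nothing here refers to a tower, a controller, or any tracking software. The class name, the stage names, and the call sign are hypothetical choices for illustration.

```java
// Hypothetical model of the problem statement, not of any tower software.
class Aircraft {
    // The stages come straight from the sentence: arrive, unload,
    // service and reload, then depart.
    enum Stage { ARRIVE, UNLOAD, SERVICE_AND_RELOAD, DEPART }

    private final String callSign;
    private Stage stage = Stage.ARRIVE;

    Aircraft(String callSign) {
        this.callSign = callSign;
    }

    String callSign() {
        return callSign;
    }

    Stage stage() {
        return stage;
    }

    // Moves to the next stage in order; an aircraft that has departed stays departed.
    void advance() {
        Stage[] order = Stage.values();
        if (stage.ordinal() < order.length - 1) {
            stage = order[stage.ordinal() + 1];
        }
    }
}
```

Because the model says nothing about who or what does the tracking, it holds equally for a large airport and for the small airfield with no controller at all.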
In the previous generation of program design (called procedural design), this is called "creating the requirements analysis and system specification." These, of course, were places to get lost; intimidatingly named documents that could become big projects in their own right. Their intention was good, however. The requirements analysis says "Make a list of the guidelines we will use to know when the job is done and the customer is satisfied." The system specification says "Here's a description of what the program will do (not how) to satisfy the requirements." The requirements analysis is really a contract between you and the customer (even if the customer works within your company, or is some other object or system). The system specification is a top-level exploration into the problem and in some sense a discovery of whether it can be done and how long it will take. Since both of these will require consensus among people (and because they will usually change over time), I think it's best to keep them as bare as possible (ideally, to lists and basic diagrams) to save time. You might have other constraints that require you to expand them into bigger documents, but by keeping the initial document small and concise, it can be created in a few sessions of group brainstorming with a leader who dynamically creates the description. This not only solicits input from everyone, it also fosters initial buy-in and agreement by everyone on the team. Perhaps most importantly, it can kick off a project with a lot of enthusiasm.
It's necessary to stay focused on the heart of what you're trying to accomplish in this phase: determine what the system is supposed to do. The most valuable tool for this is a collection of what are called "use cases." Use cases identify key features in the system that will reveal some of the fundamental classes you'll be using. These are essentially descriptive answers to questions like[11]:
If you are designing an auto-teller, for example, the use case for a particular aspect of the functionality of the system is able to describe what the auto-teller does in every possible situation. Each of these "situations" is referred to as a scenario, and a use case can be considered a collection of scenarios. You can think of a scenario as a question that starts with: "What does the system do if...?" For example, "What does the auto-teller do if a customer has just deposited a check within the last 24 hours, and there's not enough in the account without the check having cleared to provide a desired withdrawal?"
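Writing a scenario down as code forces an answer to the "What does the system do if...?" question. The sketch below covers the check-deposit scenario; the source only poses the question, so the policy shown (refuse the withdrawal until the check clears) is an assumption, as are the class name, the outcome names, and the use of whole cents for amounts.

```java
// Hypothetical sketch of one auto-teller scenario; the refusal policy is assumed.
class AutoTellerScenario {
    enum Outcome { DISPENSE_CASH, REFUSE_DEPOSIT_NOT_CLEARED, REFUSE_INSUFFICIENT_FUNDS }

    // clearedCents:   money already available in the account
    // unclearedCents: recent check deposits that have not cleared yet
    // requestedCents: the withdrawal the customer asks for
    static Outcome withdraw(long clearedCents, long unclearedCents, long requestedCents) {
        if (requestedCents <= clearedCents) {
            return Outcome.DISPENSE_CASH;
        }
        if (requestedCents <= clearedCents + unclearedCents) {
            // The scenario from the text: only the uncleared check would cover it.
            return Outcome.REFUSE_DEPOSIT_NOT_CLEARED;
        }
        return Outcome.REFUSE_INSUFFICIENT_FUNDS;
    }
}
```

Each branch corresponds to one scenario, and the three together begin to form the use case for withdrawals. A different bank policy would change the middle branch, which is exactly the kind of decision the scenario is meant to surface early.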
Use case diagrams are intentionally simple to prevent you from getting bogged down in system implementation details prematurely:
Each stick person represents an actor, which is typically a human or some other kind of free agent. (These can even be other computer systems, as is the case with ATM.) The box represents the boundary of your system. The ellipses represent the use cases, which are descriptions of valuable work that can be performed with the system. The lines between the actors and the use cases represent the interactions.
It doesn't matter how the system is actually implemented, as long as it looks like this to the user.
A use case does not need to be terribly complex, even if the underlying system is complex. It is only intended to show the system as it appears to the user. For example:
The use cases produce the requirements specifications by determining all the interactions that the user may have with the system. You try to discover a full set of use cases for your system, and once you've done that you have the core of what the system is supposed to do. The nice thing about focusing on use cases is that they always bring you back to the essentials and keep you from drifting off into issues that aren't critical for getting the job done. That is, if you have a full set of use cases, you can describe your system and move on to the next phase. You probably won't get it all figured out perfectly on the first try, but that's OK. Everything will reveal itself in time, and if you demand a perfect system specification at this point you'll get stuck.
If you do get stuck, you can kick-start this phase by using a rough approximation tool: describe the system in a few paragraphs and then look for nouns and verbs. The nouns can suggest actors, context of the use case (e.g., "lobby"), or artifacts manipulated in the use case. Verbs can suggest interactions between actors and use cases, and specify steps within the use case. You'll also discover that nouns and verbs produce objects and messages during the design phase (and note that use cases describe interactions between subsystems, so the "noun and verb" technique can be used only as a brainstorming tool, as it does not generate use cases)[12].
The boundary between a use case and an actor can point out the existence of a user interface, but it does not define such a user interface. For a process of defining and creating user interfaces, see Software for Use by Larry Constantine and Lucy Lockwood (Addison-Wesley Longman, 1999) or go to www.ForUse.com.
Although it's a black art, at this point some kind of basic scheduling is important. You now have an overview of what you're building, so you'll probably be able to get some idea of how long it will take. A lot of factors come into play here. If you estimate a long schedule then the company might decide not to build it (and thus use their resources on something more reasonable; that's a good thing). Or a manager might have already decided how long the project should take and will try to influence your estimate. But it's best to have an honest schedule from the beginning and deal with the tough decisions early. There have been a lot of attempts to come up with accurate scheduling techniques (much like techniques to predict the stock market), but probably the best approach is to rely on your experience and intuition. Get a gut feeling for how long it will really take, then double that and add 10 percent. Your gut feeling is probably correct; you can get something working in that time. The "doubling" will turn that into something decent, and the 10 percent will deal with the final polishing and details[13]. However you want to explain it, and regardless of the moans and manipulations that happen when you reveal such a schedule, it just seems to work out that way.
In this phase you must come up with a design that describes what the classes look like and how they will interact. An excellent technique in determining classes and interactions is the Class-Responsibility-Collaboration (CRC) card. Part of the value of this tool is that it's so low-tech: you start out with a set of blank 3 x 5 cards, and you write on them. Each card represents a single class, and on the card you write:
You may feel like the cards should be bigger because of all the information you'd like to get on them, but they are intentionally small, not only to keep your classes small but also to keep you from getting into too much detail too early. If you can't fit all you need to know about a class on a small card, the class is too complex (either you're getting too detailed, or you should create more than one class). The ideal class should be understood at a glance. The idea of CRC cards is to assist you in coming up with a first cut of the design so that you can get the big picture and then refine your design.
One of the great benefits of CRC cards is in communication. It's best done real time, in a group, without computers. Each person takes responsibility for several classes (which at first have no names or other information). You run a live simulation by solving one scenario at a time, deciding which messages are sent to the various objects to satisfy each scenario. As you go through this process, you discover the classes that you need along with their responsibilities and collaborations, and you fill out the cards as you do this. When you've moved through all the use cases, you should have a fairly complete first cut of your design.
Before I began using CRC cards, the most successful consulting experiences I had when coming up with an initial design involved standing in front of a team (who hadn't built an OOP project before) and drawing objects on a whiteboard. We talked about how the objects should communicate with each other, and erased some of them and replaced them with other objects. Effectively, I was managing all the "CRC cards" on the whiteboard. The team (who knew what the project was supposed to do) actually created the design; they "owned" the design rather than having it given to them. All I was doing was guiding the process by asking the right questions, trying out the assumptions, and taking the feedback from the team to modify those assumptions. The true beauty of the process was that the team learned how to do object-oriented design not by reviewing abstract examples, but by working on the one design that was most interesting to them at that moment: theirs.
Once you've come up with a set of CRC cards, you may want to create a more formal description of your design using UML[14]. You don't need to use UML, but it can be helpful, especially if you want to put up a diagram on the wall for everyone to ponder, which is a good idea. An alternative to UML is a textual description of the objects and their interfaces, or, depending on your programming language, the code itself[15].
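As a rough illustration of letting the code itself serve as the design description, the responsibilities and collaborators from a card can be transcribed into Java interfaces. This is only a sketch: the Account, TransactionLog, and SimpleAccount names are hypothetical and do not come from any design in this book, and the implementation is there only to show that the description compiles and runs.

```java
import java.util.ArrayList;
import java.util.List;

// Collaborator named on the hypothetical card.
interface TransactionLog {
    void record(String entry);
}

// Responsibilities from the hypothetical card become method signatures.
interface Account {
    long balanceCents();
    void deposit(long cents);
    boolean withdraw(long cents); // returns false if funds are insufficient
}

// One possible implementation of the interface described above.
class SimpleAccount implements Account {
    private long balance;
    private final TransactionLog log;

    SimpleAccount(TransactionLog log) {
        this.log = log;
    }

    public long balanceCents() {
        return balance;
    }

    public void deposit(long cents) {
        balance += cents;
        log.record("deposit " + cents);
    }

    public boolean withdraw(long cents) {
        if (cents > balance) {
            log.record("declined " + cents);
            return false;
        }
        balance -= cents;
        log.record("withdraw " + cents);
        return true;
    }
}

class DesignAsCode {
    public static void main(String[] args) {
        List<String> entries = new ArrayList<>();
        Account account = new SimpleAccount(entries::add); // the list acts as the log
        account.deposit(10_000);
        System.out.println(account.withdraw(2_500)); // true
        System.out.println(account.balanceCents());  // 7500
        System.out.println(entries);                 // [deposit 10000, withdraw 2500]
    }
}
```

The two interfaces are the part that plays the role of the diagram; they can be read and argued about before any implementation exists.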
UML also provides an additional diagramming notation for describing the dynamic model of your system. This is helpful in situations in which the state transitions of a system or subsystem are dominant enough that they need their own diagrams (such as in a control system). You may also need to describe the data structures, for systems or subsystems in which data is a dominant factor (such as a database).
You'll know you're done with Phase 2 when you have described the objects and their interfaces. Well, most of them; there are usually a few that slip through the cracks and don't make themselves known until Phase 3. But that's OK. All you are concerned with is that you eventually discover all of your objects. It's nice to discover them early in the process, but OOP provides enough structure so that it's not so bad if you discover them later. In fact, the design of an object tends to happen in five stages, throughout the process of program development.
The design life of an object is not limited to the time when you're writing the program. Instead, the design of an object appears over a sequence of stages. It's helpful to have this perspective because you stop expecting perfection right away; instead, you realize that the understanding of what an object does and what it should look like happens over time. This view also applies to the design of various types of programs; the pattern for a particular type of program emerges through struggling again and again with that problem (this is chronicled in the book Thinking in Patterns with Java, downloadable at www.BruceEckel.com). Objects, too, have their patterns that emerge through understanding, use, and reuse.
1. Object discovery. This stage occurs during the initial analysis of a program. Objects may be discovered by looking for external factors and boundaries, duplication of elements in the system, and the smallest conceptual units. Some objects are obvious if you already have a set of class libraries. Commonality between classes suggesting base classes and inheritance may appear right away, or later in the design process.
2. Object assembly. As you're building an object you'll discover the need for new members that didn't appear during discovery. The internal needs of the object may require other classes to support it.
3. System construction. Once again, more requirements for an object may appear at this later stage. As you learn, you evolve your objects. The need for communication and interconnection with other objects in the system may change the needs of your classes or require new classes. For example, you may discover the need for facilitator or helper classes, such as a linked list, that contain little or no state information and simply help other classes function.
4. System extension. As you add new features to a system you may discover that your previous design doesn't support easy system extension. With this new information, you can restructure parts of the system, possibly adding new classes or class hierarchies. This is also a good time to consider taking features out of a project.
5. Object reuse. This is the real stress test for a class. If someone tries to reuse it in an entirely new situation, they'll probably discover some shortcomings. As you change a class to adapt to more new programs, the general principles of the class will become clearer, until you have a truly reusable type. However, don't expect most objects from a system design to be reusable; it is perfectly acceptable for the bulk of your objects to be system-specific. Reusable types tend to be less common, and they must solve more general problems in order to be reusable.
These stages suggest some guidelines when thinking about developing your classes:
This is the initial conversion from the rough design into a compiling and executing body of code that can be tested, and especially that will prove or disprove your architecture. This is not a one-pass process, but rather the beginning of a series of steps that will iteratively build the system, as you'll see in Phase 4.
Your goal is to find the core of your system architecture that needs to be implemented in order to generate a running system, no matter how incomplete that system is in this initial pass. You're creating a framework that you can build on with further iterations. You're also performing the first of many system integrations and tests, and giving the stakeholders feedback about what their system will look like and how it is progressing. Ideally, you are also exposing some of the critical risks. You'll probably also discover changes and improvements that can be made to your original architecture: things you would not have learned without implementing the system.
Part of building the system is the reality check that you get from testing against your requirements analysis and system specification (in whatever form they exist). Make sure that your tests verify the requirements and use cases. When the core of the system is stable, you're ready to move on and add more functionality.
Once the core framework is running, each feature set you add is a small project in itself. You add a feature set during an iteration, a reasonably short period of development.
How big is an iteration? Ideally, each iteration lasts one to three weeks (this can vary based on the implementation language). At the end of that period, you have an integrated, tested system with more functionality than it had before. But what's particularly interesting is the basis for the iteration: a single use case. Each use case is a package of related functionality that you build into the system all at once, during one iteration. Not only does this give you a better idea of what the scope of a use case should be, but it also gives more validation to the idea of a use case, since the concept isn't discarded after analysis and design, but instead it is a fundamental unit of development throughout the software-building process.
You stop iterating when you achieve target functionality or an external deadline arrives and the customer can be satisfied with the current version. (Remember, software is a subscription business.) Because the process is iterative, you have many opportunities to ship a product rather than a single endpoint; open-source projects work exclusively in an iterative, high-feedback environment, which is precisely what makes them successful.
An iterative development process is valuable for many reasons. You can reveal and resolve critical risks early, the customers have ample opportunity to change their minds, programmer satisfaction is higher, and the project can be steered with more precision. But an additional important benefit is the feedback to the stakeholders, who can see by the current state of the product exactly where everything lies. This may reduce or eliminate the need for mind-numbing status meetings and increase the confidence and support from the stakeholders.
This is the point in the development cycle that has traditionally been called "maintenance," a catch-all term that can mean everything from "getting it to work the way it was really supposed to in the first place" to "adding features that the customer forgot to mention" to the more traditional "fixing the bugs that show up" and "adding new features as the need arises." So many misconceptions have been applied to the term "maintenance" that it has taken on a slightly deceiving quality, partly because it suggests that you've actually built a pristine program and all you need to do is change parts, oil it, and keep it from rusting. Perhaps there's a better term to describe what's going on.
I'll use the term evolution[16]. That is, "You won't get it right the first time, so give yourself the latitude to learn and to go back and make changes." You might need to make a lot of changes as you learn and understand the problem more deeply. The elegance you'll produce if you evolve until you get it right will pay off, both in the short and the long term. Evolution is where your program goes from good to great, and where those issues that you didn't really understand in the first pass become clear. It's also where your classes can evolve from single-project usage to reusable resources.
What it means to "get it right" isn't just that the program works according to the requirements and the use cases. It also means that the internal structure of the code makes sense to you, and feels like it fits together well, with no awkward syntax, oversized objects, or ungainly exposed bits of code. In addition, you must have some sense that the program structure will survive the changes that it will inevitably go through during its lifetime, and that those changes can be made easily and cleanly. This is no small feat. You must not only understand what you're building, but also how the program will evolve (what I call the vector of change). Fortunately, object-oriented programming languages are particularly adept at supporting this kind of continuing modification: the boundaries created by the objects are what tend to keep the structure from breaking down. They also allow you to make changes (ones that would seem drastic in a procedural program) without causing earthquakes throughout your code. In fact, support for evolution might be the most important benefit of OOP.
With evolution, you create something that at least approximates what you think you're building, and then you kick the tires, compare it to your requirements, and see where it falls short. Then you can go back and fix it by redesigning and reimplementing the portions of the program that didn't work right[17]. You might actually need to solve the problem, or an aspect of the problem, several times before you hit on the right solution. (A study of Design Patterns is usually helpful here. You can find information in Thinking in Patterns with Java, downloadable at www.BruceEckel.com.)
Evolution also occurs when you build a system, see that it matches your requirements, and then discover it wasn't actually what you wanted. When you see the system in operation, you find that you really wanted to solve a different problem. If you think this kind of evolution is going to happen, then you owe it to yourself to build your first version as quickly as possible so you can find out if it is indeed what you want.
Perhaps the most important thing to remember is that by default (by definition, really), if you modify a class, its super- and subclasses will still function. You need not fear modification (especially if you have a built-in set of unit tests to verify the correctness of your modifications). Modification won't necessarily break the program, and any change in the outcome will be limited to subclasses and/or specific collaborators of the class you change.
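Here is a small sketch of that kind of contained modification, under the assumption that the subclass depends only on the interface of its base class. Tally and CappedTally are hypothetical names. Imagine that an earlier version of Tally kept a single running total in an int field; the version shown stores every sample instead, and the subclass and the check in main() did not need to change.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical base class. Only the private storage differs from the imagined
// earlier version; the methods other classes call are the same.
class Tally {
    private final List<Integer> samples = new ArrayList<>();

    void add(int value) {
        samples.add(value);
    }

    int total() {
        int sum = 0;
        for (int sample : samples) {
            sum += sample;
        }
        return sum;
    }
}

// Subclass written against Tally's interface, not its private storage.
class CappedTally extends Tally {
    private final int cap;

    CappedTally(int cap) {
        this.cap = cap;
    }

    @Override
    void add(int value) {
        super.add(Math.min(value, cap)); // values above the cap are clipped
    }
}

class TallyCheck {
    public static void main(String[] args) {
        Tally tally = new CappedTally(10);
        tally.add(3);
        tally.add(25); // stored as 10
        System.out.println(tally.total()); // 13
    }
}
```

A check like the one in main() is the sort of unit test that lets you confirm the modification did no harm.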
Of course you wouldn't build a house without a lot of carefully drawn plans. If you build a deck or a dog house your plans won't be so elaborate, but you'll probably still start with some kind of sketches to guide you on your way. Software development has gone to extremes. For a long time, people didn't have much structure in their development, but then big projects began failing. In reaction, we ended up with methodologies that had an intimidating amount of structure and detail, primarily intended for those big projects. These methodologies were too scary to use; it looked like you'd spend all your time writing documents and no time programming. (This was often the case.) I hope that what I've shown you here suggests a middle path: a sliding scale. Use an approach that fits your needs (and your personality). No matter how minimal you choose to make it, some kind of plan will make a big improvement in your project as opposed to no plan at all. Remember that, by most estimates, over 50 percent of projects fail (some estimates go up to 70 percent!).
By following a plan (preferably one that is simple and brief) and coming up with design structure before coding, you'll discover that things fall together far more easily than if you dive in and start hacking. You'll also realize a great deal of satisfaction. It's my experience that coming up with an elegant solution is deeply satisfying at an entirely different level; it feels closer to art than technology. And elegance always pays off; it's not a frivolous pursuit. Not only does it give you a program that's easier to build and debug, but it's also easier to understand and maintain, and that's where the financial value lies.
I have studied analysis and design techniques, on and off, since I was in graduate school. The concept of Extreme Programming (XP) is the most radical, and delightful, that I've seen. You can find it chronicled in Extreme Programming Explained by Kent Beck (Addison-Wesley, 2000) and on the Web at www.xprogramming.com.
XP is both a philosophy about programming work and a set of guidelines to do it. Some of these guidelines are reflected in other recent methodologies, but the two most important and distinct contributions, in my opinion, are "write tests first" and "pair programming." Although he argues strongly for the whole process, Beck points out that if you adopt only these two practices you'll greatly improve your productivity and reliability.
Testing has traditionally been relegated to the last part of a project, after you've "gotten everything working, but just to be sure." It's implicitly had a low priority, and people who specialize in it have not been given a lot of status and have often even been cordoned off in a basement, away from the "real programmers." Test teams have responded in kind, going so far as wearing black clothing and cackling with glee whenever they break something (to be honest, I've had this feeling myself when breaking compilers).
XP completely revolutionizes the concept of testing by giving it a priority equal to (or even greater than) that of the code. In fact, you write the tests before you write the code that will be tested, and the tests stay with the code forever. The tests must be executed successfully every time you do an integration of the project (which is often, sometimes more than once a day).
Writing tests first has two extremely important effects.
First, it forces a clear definition of the interface of a class. I've often suggested that people "imagine the perfect class to solve a particular problem" as a tool when trying to design the system. The XP testing strategy goes further than that: it specifies exactly what the class must look like, to the consumer of that class, and exactly how the class must behave. In no uncertain terms. You can write all the prose, or create all the diagrams you want, describing how a class should behave and what it looks like, but nothing is as real as a set of tests. The former is a wish list, but the tests are a contract that is enforced by the compiler and the running program. It's hard to imagine a more concrete description of a class than the tests.
While creating the tests, you are forced to completely think out the class and will often discover needed functionality that might be missed during the thought experiments of UML diagrams, CRC cards, use cases, etc.
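The following sketch shows what writing the test first might look like in plain Java, with no testing framework assumed. The BoundedStack class and its methods are hypothetical; the point is only that the test was (in this imagined history) written before the class, so every constructor and method in BoundedStack exists because the test asked for it.

```java
// Step 1, written first: the test states what a consumer expects of the class.
class BoundedStackTest {
    static void run() {
        BoundedStack stack = new BoundedStack(2);
        check(stack.isEmpty(), "a new stack is empty");
        check(stack.push(1), "a push within capacity succeeds");
        check(stack.push(2), "a second push succeeds");
        check(!stack.push(3), "a push beyond capacity is refused");
        check(stack.pop() == 2, "pop returns the last value pushed");
        check(stack.pop() == 1, "then the value pushed before it");
        check(stack.isEmpty(), "the stack is empty again");
    }

    static void check(boolean condition, String description) {
        if (!condition) {
            throw new AssertionError("failed: " + description);
        }
    }

    public static void main(String[] args) {
        run();
        System.out.println("all tests passed");
    }
}

// Step 2, written second: the simplest class that satisfies the test.
class BoundedStack {
    private final int[] items;
    private int size;

    BoundedStack(int capacity) {
        items = new int[capacity];
    }

    boolean isEmpty() {
        return size == 0;
    }

    // Returns false instead of growing when the stack is full.
    boolean push(int value) {
        if (size == items.length) {
            return false;
        }
        items[size++] = value;
        return true;
    }

    int pop() {
        if (size == 0) {
            throw new IllegalStateException("empty stack");
        }
        return items[--size];
    }
}
```

Notice that a question like "what happens when the stack is full?" had to be answered while writing the test, which is exactly the kind of functionality that is easy to miss on a diagram.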
The second important effect of writing the tests first comes from running the tests every time you do a build of your software. This activity gives you the other half of the testing that's performed by the compiler. If you look at the evolution of programming languages from this perspective, you'll see that the real improvements in the technology have actually revolved around testing. Assembly language checked only for syntax, but C imposed some semantic restrictions, and these prevented you from making certain types of mistakes. OOP languages impose even more semantic restrictions, which if you think about it are actually forms of testing. "Is this data type being used properly?" and "Is this function being called properly?" are the kinds of tests that are being performed by the compiler or run-time system. We've seen the results of having these tests built into the language: people have been able to write more complex systems, and get them to work, with much less time and effort. I've puzzled over why this is, but now I realize it's the tests: you do something wrong, and the safety net of the built-in tests tells you there's a problem and points you to where it is.
But the built-in testing afforded by the design of the language can only go so far. At some point, you must step in and add the rest of the tests that produce a full suite (in cooperation with the compiler and run-time system) that verifies all of your program. And, just like having a compiler watching over your shoulder, wouldn't you want these tests helping you right from the beginning? That's why you write them first, and run them automatically with every build of your system. Your tests become an extension of the safety net provided by the language.
One of the things that I've discovered about the use of more and more powerful programming languages is that I am emboldened to try more brazen experiments, because I know that the language will keep me from wasting my time chasing bugs. The XP test scheme does the same thing for your entire project. Because you know your tests will always catch any problems that you introduce (and you regularly add any new tests as you think of them), you can make big changes when you need to without worrying that you'll throw the whole project into complete disarray. This is incredibly powerful.
Pair programming goes against the rugged individualism that we've been indoctrinated into from the beginning, through school (where we succeed or fail on our own, and working with our neighbors is considered "cheating"), and media, especially Hollywood movies in which the hero is usually fighting against mindless conformity[18]. Programmers, too, are considered paragons of individuality: "cowboy coders," as Larry Constantine likes to say. And yet XP, which is itself battling against conventional thinking, says that code should be written with two people per workstation. And that this should be done in an area with a group of workstations, without the barriers that the facilities-design people are so fond of. In fact, Beck says that the first task of converting to XP is to arrive with screwdrivers and Allen wrenches and take apart everything that gets in the way.[19] (This will require a manager who can deflect the ire of the facilities department.)
The value of pair programming is that one person is actually doing the coding while the other is thinking about it. The thinker keeps the big picture in mind: not only the picture of the problem at hand, but the guidelines of XP. If two people are working, it's less likely that one of them will get away with saying, "I don't want to write the tests first," for example. And if the coder gets stuck, they can swap places. If both of them get stuck, their musings may be overheard by someone else in the work area who can contribute. Working in pairs keeps things flowing and on track. Probably more important, it makes programming a lot more social and fun.
I've begun using pair programming during the exercise periods in some of my seminars and it seems to significantly improve everyone's experience.
The reason Java has been so successful is that the goal was to solve many of the problems facing developers today. The goal of Java is improved productivity. This productivity comes in many ways, but the language is designed to aid you as much as possible, while hindering you as little as possible with arbitrary rules or any requirement that you use a particular set of features. Java is designed to be practical; Java language design decisions were based on providing the maximum benefits to the programmer.
Classes designed to fit the problem tend to express it better. This means that when you write the code, you're describing your solution in the terms of the problem space ("Put the grommet in the bin") rather than the terms of the computer, which is the solution space ("Set the bit in the chip that means that the relay will close"). You deal with higher-level concepts and can do much more with a single line of code.
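As a small, hedged illustration, here is how "Put the grommet in the bin" might read once classes are shaped to fit the problem. The Grommet and Bin classes are hypothetical and exist only for this sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical problem-space types: the names come from the problem, not the machine.
class Grommet {
    final int diameterMm;

    Grommet(int diameterMm) {
        this.diameterMm = diameterMm;
    }
}

class Bin {
    private final List<Grommet> contents = new ArrayList<>();

    void put(Grommet grommet) {
        contents.add(grommet);
    }

    int count() {
        return contents.size();
    }
}

class ProblemSpace {
    public static void main(String[] args) {
        Bin bin = new Bin();
        Grommet grommet = new Grommet(6);
        bin.put(grommet); // reads like the sentence it implements
        System.out.println(bin.count()); // 1
    }
}
```

The single line bin.put(grommet) says nothing about lists, indexes, or memory; those details live inside Bin, where they can change without touching the code that uses it.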
The other benefit of this ease of expression is maintenance, which (if reports can be believed) takes a huge portion of the cost over a program's lifetime. If a program is easier to understand, then it's easier to maintain. This can also reduce the cost of creating and maintaining the documentation.
The fastest way to create a program is to use code that's already written: a library. A major goal in Java is to make library use easier. This is accomplished by casting libraries into new data types (classes), so that bringing in a library means adding new types to the language. Because the Java compiler takes care of how the library is used (guaranteeing proper initialization and cleanup, and ensuring that functions are called properly), you can focus on what you want the library to do, not how you have to do it.
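One way to picture the initialization and cleanup guarantees is the sketch below. PrinterLink is a hypothetical library type invented for this example. The constructor is the only way to get an instance, so a caller cannot use an object that was never set up; for cleanup, the sketch assumes a version of Java that supports the try-with-resources statement (Java 7 or later), which calls close() automatically.

```java
// Hypothetical library type. It cannot exist in an uninitialized state because
// the constructor validates and stores everything it needs.
class PrinterLink implements AutoCloseable {
    private final String target;
    private boolean open;

    PrinterLink(String target) {
        if (target == null || target.isEmpty()) {
            throw new IllegalArgumentException("target required");
        }
        this.target = target;
        this.open = true;
    }

    String send(String message) {
        if (!open) {
            throw new IllegalStateException("link is closed");
        }
        return target + " <- " + message;
    }

    boolean isOpen() {
        return open;
    }

    @Override
    public void close() {
        open = false;
    }
}

class LibraryUse {
    public static void main(String[] args) {
        PrinterLink link = new PrinterLink("printer");
        // close() runs when the block ends, even if send() were to throw.
        try (PrinterLink active = link) {
            System.out.println(active.send("hello")); // printer <- hello
        }
        System.out.println(link.isOpen()); // false
    }
}
```

The caller writes what it wants done (send a message) and the type, with the compiler's help, takes care of the order in which setup and cleanup happen.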
Error handling in C is a notorious problem, and one that is often ignored; finger-crossing is usually involved. If you're building a large, complex program, there's nothing worse than having an error buried somewhere with no clue as to where it came from. Java exception handling is a way to guarantee that an error is noticed, and that something happens as a result.
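A minimal sketch of that guarantee, using a checked exception, follows. ConfigException and Config.parsePort() are hypothetical names made up for the example. Because parsePort() declares that it throws ConfigException, the compiler refuses to build any caller that neither catches the exception nor declares it in turn, so the error cannot be silently dropped.

```java
// Hypothetical checked exception for this sketch.
class ConfigException extends Exception {
    ConfigException(String message) {
        super(message);
    }
}

class Config {
    // The throws clause is part of the method's contract: callers must
    // either catch ConfigException or declare it themselves.
    static int parsePort(String text) throws ConfigException {
        try {
            int port = Integer.parseInt(text.trim());
            if (port < 1 || port > 65535) {
                throw new ConfigException("port out of range: " + port);
            }
            return port;
        } catch (NumberFormatException e) {
            throw new ConfigException("not a number: " + text);
        }
    }
}

class ErrorHandlingDemo {
    public static void main(String[] args) {
        for (String candidate : new String[] {"8080", "80 80", "70000"}) {
            try {
                System.out.println("port " + Config.parsePort(candidate));
            } catch (ConfigException e) {
                // The message says what went wrong and with which input.
                System.out.println("rejected: " + e.getMessage());
            }
        }
    }
}
```

Removing the catch block from main() turns the oversight into a compile-time error rather than a mystery at run time.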
Many traditional languages have built-in limitations to program size and complexity. BASIC, for example, can be great for pulling together quick solutions for certain classes of problems, but if the program gets more than a few pages long, or ventures out of the normal problem domain of that language, it's like trying to swim through an ever-more viscous fluid. There's no clear line that tells you when your language is failing you, and even if there were, you'd ignore it. You don't say, "My BASIC program just got too big; I'll have to rewrite it in C!" Instead, you try to shoehorn a few more lines in to add that one new feature. So the extra costs come creeping up on you.
Java is designed to aid programming in the large, that is, to erase those creeping-complexity boundaries between a small program and a large one. You certainly don't need to use OOP when you're writing a "hello world" style utility program, but the features are there when you need them. And the compiler is aggressive about ferreting out bug-producing errors for small and large programs alike.
If you buy into OOP, your next question is probably, "How can I get my manager/colleagues/department/peers to start using objects?" Think about how you (one independent programmer) would go about learning to use a new language and a new programming paradigm. You've done it before. First comes education and examples; then comes a trial project to give you a feel for the basics without doing anything too confusing. Then comes a "real world" project that actually does something useful. Throughout your first projects you continue your education by reading, asking questions of experts, and trading hints with friends. This is the approach many experienced programmers suggest for the switch to Java. Switching an entire company will of course introduce certain group dynamics, but it will help at each step to remember how one person would do it.
Here are some guidelines to consider when making the transition to OOP and Java:
The first step is some form of education. Remember the company's investment in code, and try not to throw everything into disarray for six to nine months while everyone puzzles over how interfaces work. Pick a small group for indoctrination, preferably one composed of people who are curious, work well together, and can function as their own support network while they're learning Java.
An alternative approach that is sometimes suggested is the education of all company levels at once, including overview courses for strategic managers as well as design and programming courses for project builders. This is especially good for smaller companies making fundamental shifts in the way they do things, or at the division level of larger companies. Because the cost is higher, however, some may choose to start with project-level training, do a pilot project (possibly with an outside mentor), and let the project team become the teachers for the rest of the company.
Try a low-risk project first and allow for mistakes. Once you've gained some experience, you can either seed other projects from members of this first team or use the team members as an OOP technical support staff. This first project may not work right the first time, so it should not be mission-critical for the company. It should be simple, self-contained, and instructive; this means that it should involve creating classes that will be meaningful to the other programmers in the company when they get their turn to learn Java.
Seek out examples of good object-oriented design before starting from scratch. There's a good probability that someone has solved your problem already, and if they haven't solved it exactly you can probably apply what you've learned about abstraction to modify an existing design to fit your needs. This is the general concept of design patterns, covered in Thinking in Patterns with Java, downloadable at www.BruceEckel.com.
The primary economic motivation for switching to OOP is the easy use of existing code in the form of class libraries (in particular, the Standard Java libraries, which are covered throughout this book). The shortest application development cycle will result when you can create and use objects from off-the-shelf libraries. However, some new programmers don't understand this, are unaware of existing class libraries, or, through fascination with the language, desire to write classes that may already exist. Your success with OOP and Java will be optimized if you make an effort to seek out and reuse other people's code early in the transition process.
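To make the point about off-the-shelf libraries concrete, here is a minimal sketch that sorts a list of names using nothing but classes from java.util. The class name ReuseLibrary and the method sortedCopy( ) are hypothetical names chosen for this illustration; ArrayList and Collections.sort( ) are part of the standard library.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical example class: the point is what it does NOT contain,
// namely a hand-written container or a hand-written sort.
public class ReuseLibrary {
    // Returns a sorted copy of the given names, leaving the input untouched.
    static List<String> sortedCopy(List<String> names) {
        List<String> copy = new ArrayList<>(names); // library container
        Collections.sort(copy);                     // library sort (natural ordering)
        return copy;
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<>();
        names.add("Whorf");
        names.add("Budd");
        names.add("Fowler");
        System.out.println(sortedCopy(names)); // prints [Budd, Fowler, Whorf]
    }
}
```

A newcomer fascinated with the language might be tempted to write a growable array and a sorting routine by hand; both already exist, are tested, and cost nothing to use.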
It is not usually the best use of your time to take existing, functional code and rewrite it in Java. (If you must turn it into objects, you can interface to the C or C++ code using the Java Native Interface, described in Appendix B.) There are incremental benefits, especially if the code is slated for reuse. But chances are you aren't going to see the dramatic increases in productivity that you hope for in your first few projects unless that project is a new one. Java and OOP shine best when taking a project from concept to reality.
If you're a manager, your job is to acquire resources for your team, to overcome barriers to your team's success, and in general to try to provide the most productive and enjoyable environment so your team is most likely to perform those miracles that are always being asked of you. Moving to Java falls in all three of these categories, and it would be wonderful if it didn't cost you anything as well. Although moving to Java may be cheaper (depending on your constraints) than the OOP alternatives for a team of C programmers (and probably for programmers in other procedural languages), it isn't free, and there are obstacles you should be aware of before trying to sell the move to Java within your company and embarking on the move itself.
The cost of moving to Java is more than just the acquisition of Java compilers (the Sun Java compiler is free, so this is hardly an obstacle). Your medium- and long-term costs will be minimized if you invest in training (and possibly mentoring for your first project) and also if you identify and purchase class libraries that solve your problem rather than trying to build those libraries yourself. These are hard-money costs that must be factored into a realistic proposal. In addition, there are the hidden costs in loss of productivity while learning a new language and possibly a new programming environment. Training and mentoring can certainly minimize these, but team members must overcome their own struggles to understand the new technology. During this process they will make more mistakes (this is a feature, because acknowledged mistakes are the fastest path to learning) and be less productive. Even then, with some types of programming problems, the right classes, and the right development environment, it's possible to be more productive while you're learning Java (even considering that you're making more mistakes and writing fewer lines of code per day) than if you'd stayed with C.
A common question is, "Doesn't OOP automatically make my programs a lot bigger and slower?" The answer is, "It depends." The extra safety features in Java have traditionally exacted a performance penalty compared with a language like C++. Technologies such as "hotspot" and improved compilation techniques have increased the speed significantly in most cases, and efforts continue toward higher performance.
When your focus is on rapid prototyping, you can throw together components as fast as possible while ignoring efficiency issues. If you're using any third-party libraries, these are usually already optimized by their vendors; in any case it's not an issue while you're in rapid-development mode. When you have a system that you like, if it's small and fast enough, then you're done. If not, you begin tuning with a profiling tool, looking first for speedups that can be done by rewriting small portions of code. If that doesn't help, you look for modifications that can be made in the underlying implementation so no code that uses a particular class needs to be changed. Only if nothing else solves the problem do you need to change the design. The fact that performance is so critical in that portion of the design is an indicator that it must be part of the primary design criteria. You have the benefit of finding this out early using rapid development.
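The middle step, changing the underlying implementation while leaving client code alone, can be sketched as follows. Everything here is hypothetical and exists only for illustration: the interface Tally, the two implementations, and the client method countOf( ). The first implementation is the kind of thing you might write during rapid prototyping; the second keeps the same methods but stores its data in a HashMap from the standard library.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TuningSketch {
    // The public face that client code depends on (hypothetical).
    interface Tally {
        void add(String word);
        int count(String word);
    }

    // First cut: written quickly, scans the whole list on every count().
    static class ListTally implements Tally {
        private final List<String> words = new ArrayList<>();
        public void add(String word) { words.add(word); }
        public int count(String word) {
            int n = 0;
            for (String w : words) {
                if (w.equals(word)) n++;
            }
            return n;
        }
    }

    // Tuned version: same methods, hash-based storage underneath.
    static class MapTally implements Tally {
        private final Map<String, Integer> counts = new HashMap<>();
        public void add(String word) {
            Integer old = counts.get(word);
            counts.put(word, old == null ? 1 : old + 1);
        }
        public int count(String word) {
            Integer n = counts.get(word);
            return n == null ? 0 : n;
        }
    }

    // Client code: written once, unaware of which implementation it is given.
    static int countOf(Tally tally, String[] input, String target) {
        for (String w : input) {
            tally.add(w);
        }
        return tally.count(target);
    }

    public static void main(String[] args) {
        String[] input = { "java", "c", "java", "python" };
        System.out.println(countOf(new ListTally(), input, "java")); // prints 2
        System.out.println(countOf(new MapTally(), input, "java"));  // prints 2
    }
}
```

Whether the second version is actually faster for your data is something only the profiler can tell you; the sketch shows only that countOf( ) did not have to change.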
If you find a function that is a particular bottleneck, you can rewrite it in C/C++ using Java's native methods, the subject of Appendix B.
When starting your team into OOP and Java, programmers will typically go through a series of common design errors. This often happens due to insufficient feedback from experts during the design and implementation of early projects, because no experts have been developed within the company, and because there may be resistance to retaining consultants. It's easy to feel that you understand OOP too early in the cycle and go off on a bad tangent. Something that's obvious to someone experienced with the language may be a subject of great internal debate for a novice. Much of this trauma can be skipped by using an experienced outside expert for training and mentoring.
Java looks a lot like C++, and so naturally it would seem that C++ will be replaced by Java. But I'm starting to question this logic. For one thing, C++ still has some features that Java doesn't, and although there have been a lot of promises about Java someday being as fast as or faster than C++, we've seen steady improvements but no dramatic breakthroughs. Also, there seems to be a continuing interest in C++, so I don't think that language is going away any time soon. (Languages seem to hang around. Speaking at one of my "Intermediate/Advanced Java Seminars," Allen Holub asserted that the two most commonly used languages are Rexx and COBOL, in that order.)
I'm beginning to think that the strength of Java lies in a slightly different arena than that of C++. C++ is a language that doesn't try to fit a mold. Certainly it has been adapted in a number of ways to solve particular problems. Some C++ tools combine libraries, component models, and code-generation tools to solve the problem of developing windowed end-user applications (for Microsoft Windows). And yet, what do the vast majority of Windows developers use? Microsoft's Visual Basic (VB). This despite the fact that VB produces the kind of code that becomes unmanageable when the program is only a few pages long (and syntax that can be positively mystifying). As successful and popular as VB is, it's not a very good example of language design. It would be nice to have the ease and power of VB without the resulting unmanageable code. And that's where I think Java will shine: as the "next VB." You may or may not shudder to hear this, but think about it: so much of Java is intended to make it easy for the programmer to solve application-level problems like networking and cross-platform UI, and yet it has a language design that allows the creation of very large and flexible bodies of code. Add to this the fact that Java has the most robust type checking and error handling systems I've ever seen in a language and you have the makings of a significant leap forward in programming productivity.
Should you use Java instead of C++ for your project? Other than Web applets, there are two issues to consider. First, if you want to use a lot of existing C++ libraries (and you'll certainly get a lot of productivity gains there), or if you have an existing C or C++ code base, Java might slow your development down rather than speeding it up.
If you're developing all your code primarily from scratch, then the simplicity of Java over C++ will significantly shorten your development time; the anecdotal evidence (stories from C++ teams that I've talked to who have switched to Java) suggests a doubling of development speed over C++. If Java performance doesn't matter or you can somehow compensate for it, sheer time-to-market issues make it difficult to choose C++ over Java.
The biggest issue is performance. Interpreted Java has been slow, even 20 to 50 times slower than C in the original Java interpreters. This has improved greatly over time, but it will still remain an important number. Computers are about speed; if it weren't significantly faster to do something on a computer then you'd do it by hand. (I've even heard it suggested that you start with Java, to gain the short development time, then use a tool and support libraries to translate your code to C++, if you need faster execution speed.)
The key to making Java feasible for most development projects is the appearance of speed improvements like so-called "just-in-time" (JIT) compilers, Sun's own "hotspot" technology, and even native code compilers. Of course, native code compilers will eliminate the touted cross-platform execution of the compiled programs, but they will also bring the speed of the executable closer to that of C and C++. And cross-compiling a program in Java should be a lot easier than doing so in C or C++. (In theory, you just recompile, but that promise has been made before for other languages.)
You can find comparisons of Java and C++ and observations about Java realities in the appendices of the first edition of this book (available on this book's accompanying CD ROM, as well as at www.BruceEckel.com).
This chapter attempts to give you a feel for the broad issues of object-oriented programming and Java: why OOP is different, why Java in particular is different, the concepts behind OOP methodologies, and finally the kinds of issues you will encounter when moving your own company to OOP and Java.
OOP and Java may not be for everyone. It's important to evaluate your own needs and decide whether Java will optimally satisfy those needs, or if you might be better off with another programming system (including the one you're currently using). If you know that your needs will be very specialized for the foreseeable future and if you have specific constraints that may not be satisfied by Java, then you owe it to yourself to investigate the alternatives[20]. Even if you eventually choose Java as your language, you'll at least understand what the options were and have a clear vision of why you took that direction.
You know what a procedural program looks like: data definitions and function calls. To find the meaning of such a program you have to work a little, looking through the function calls and low-level concepts to create a model in your mind. This is the reason we need intermediate representations when designing procedural programs: by themselves, these programs tend to be confusing because the terms of expression are oriented more toward the computer than to the problem you're solving.
Because Java adds many new concepts on top of what you find in a procedural language, your natural assumption may be that the main( ) in a Java program will be far more complicated than for the equivalent C program. Here, you'll be pleasantly surprised: A well-written Java program is generally far simpler and much easier to understand than the equivalent C program. What you'll see are the definitions of the objects that represent concepts in your problem space (rather than the issues of the computer representation) and messages sent to those objects to represent the activities in that space. One of the delights of object-oriented programming is that, with a well-designed program, it's easy to understand the code by reading it. Usually there's a lot less code as well, because many of your problems will be solved by reusing existing library code.
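As a small illustration of what such a main( ) can look like, here is a sketch built around a hypothetical Light class. The class, its brightness scale, and its method names are assumptions made up for this example; what matters is that main( ) reads as a series of messages sent to an object from the problem space.

```java
public class LightDemo {
    // A concept from the problem space, expressed directly as a class.
    // (Hypothetical: the 0-10 brightness scale is an arbitrary choice.)
    static class Light {
        private boolean lit = false;
        private int level = 0; // 0 (dark) to 10 (full brightness)

        void on()  { lit = true;  level = 5; }
        void off() { lit = false; level = 0; }
        void brighten() { if (lit && level < 10) level++; }
        void dim()      { if (lit && level > 1)  level--; }

        boolean isLit() { return lit; }
        int level()     { return level; }
    }

    public static void main(String[] args) {
        // main() describes activity in the problem space,
        // not the details of the computer representation.
        Light hallway = new Light();
        hallway.on();
        hallway.brighten();
        System.out.println("lit=" + hallway.isLit() + " level=" + hallway.level());
        // prints lit=true level=6
        hallway.off();
    }
}
```

The equivalent procedural program would typically pass a struct or a pair of variables to free-standing functions, and the reader would have to reconstruct the idea of "a light" from them.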
[2] See Multiparadigm Programming in Leda by Timothy Budd (Addison-Wesley 1995).
[3] This is actually a bit restrictive, since objects can conceivably exist in different machines and address spaces, and they can also be stored on disk. In these cases, the identity of the object must be determined by something other than memory address.
[4] Some people make a distinction, stating that type determines the interface while class is a particular implementation of that interface.
[5] I'm indebted to my friend Scott Meyers for this term.
[6] This is usually enough detail for most diagrams, and you don't need to get specific about whether you're using aggregation or composition.
[7] My term.
[8] Primitive types, which you'll learn about later, are a special case.
[9] An excellent example of this is UML Distilled, 2nd edition, by Martin Fowler (Addison-Wesley 2000), which reduces the sometimes-overwhelming UML process to a manageable subset.
[10] My rule of thumb for estimating such projects: If there's more than one wild card, don't even try to plan how long it's going to take or how much it will cost until you've created a working prototype. There are too many degrees of freedom.
[11] Thanks for help from James H Jarrett.
[12] More information on use cases can be found in Applying Use Cases by Schneider & Winters (Addison-Wesley 1998) and Use Case Driven Object Modeling with UML by Rosenberg (Addison-Wesley 1999).
[13] My personal take on this has changed lately. Doubling and adding 10 percent will give you a reasonably accurate estimate (assuming there are not too many wild-card factors), but you still have to work quite diligently to finish in that time. If you want time to really make it elegant and to enjoy yourself in the process, the correct multiplier is more like three or four times, I believe.
[14] For starters, I recommend the aforementioned UML Distilled, 2nd edition.
[15] Python (www.Python.org) is often used as executable pseudocode.
[16] At least one aspect of evolution is covered in Martin Fowler's book Refactoring: Improving the Design of Existing Code (Addison-Wesley 1999), which uses Java examples exclusively.
[17] This is something like "rapid prototyping," where you were supposed to build a quick-and-dirty version so that you could learn about the system, and then throw away your prototype and build it right. The trouble with rapid prototyping is that people didn't throw away the prototype, but instead built upon it. Combined with the lack of structure in procedural programming, this often leads to messy systems that are expensive to maintain.
[18] Although this may be a more American perspective, the stories of Hollywood reach everywhere.
[19] Including (especially) the PA system. I once worked in a company that insisted on broadcasting every phone call that arrived for every executive, and it constantly interrupted our productivity (but the managers couldn't begin to conceive of stifling such an important service as the PA). Finally, when no one was looking I started snipping speaker wires.
[20] In particular, I recommend looking at Python (http://www.Python.org).