Chapter 8. Applying Good Software Processes

The Java MARIAN server was developed mainly by 15 graduate students as a CS5604 class project in the fall semester of 1998. The students selected it as their term project. As mentioned in Chapter 1, system reliability is a particular concern in class projects. To verify that our approach to system reliability is effective even in such an environment, software processes based on the Software Engineering Institute's Capability Maturity Model were tailored and applied throughout this project. These include software product engineering, training, measurement and estimation, process improvement, and defect prevention. This chapter describes how we tailored and applied these software processes in the development of the Java MARIAN server.

8.1. Project Development Lifecycle

The development of the Java MARIAN server broadly followed the waterfall lifecycle. Figure 8.1 illustrates the basic phases we followed.

Figure 8.1. Basic development phases

We first performed the requirements analysis, then the analysis of the original project (the C/C++ MARIAN server), then the high-level design of the system, and then the detailed design. After that, the source code of the system and the test code were written in parallel. Finally, we performed unit testing and integration testing (the design of the integration tests began after we finished the high-level design). At the end of each phase, we performed one or more reviews to make sure all the problems found in that phase had been solved.

8.1.1. Requirements Analysis

In the requirements analysis phase, we considered our resources in terms of the number of students available, the number of hours each student would spend on this project, and the complexity of converting each module of the C/C++ MARIAN server to Java. We reduced our requirement from reengineering six modules from C/C++ to Java to reengineering five, after we found that 15 students had enrolled in this project rather than the 19 to 26 we originally expected. The group assignments were done in this phase. Six groups were identified. The original project understanding group was responsible for analyzing the C/C++ modules which needed to be converted to Java. The high-level design group was responsible for the high-level design of the new system based on the work generated by the original project understanding group. The uip group was responsible for designing, implementing, and testing the Java "uip". There were two detailed design, implementation, and testing groups, which were responsible for carrying forward the work generated by the high-level design group. The last group, the webgate interface group, was responsible for producing an attractive new system interface and optimizing the code in the Java "webgate". At the end of this phase, a review meeting was held to make sure that what we proposed to do was reasonable, given the resources available. The size of our product was also estimated in that meeting.

8.1.2. Original Project Analysis

Although this is not a typical phase in the waterfall lifecycle model, we found it necessary since this was a reengineering project. In this phase, the members of the original project understanding group read the code of the parts of the C/C++ MARIAN server which were to be converted to Java. They wrote documentation describing the detailed operation of the parts they were analyzing. Figures and graphs were widely used in the documentation to help the following group(s) understand these operations quickly, without diving into the sea of C/C++ source code. Several review meetings were held between the members of the original project understanding group and the high-level design group to make sure that the documents generated in this phase were easily understood and detailed enough to support the following work -- the high-level design of the system.

8.1.3. High Level Design

In the high-level design phase, the designers read the documents produced by the understanding group and then produced the design of the new system. Major classes of the system were identified. The high-level design document contains the class relationship diagram of the top-level classes in the system. Each relation between them (message passing or function call) was explained in detail. The signatures of all the public methods of those classes were specified. Furthermore, the high-level designers wrote scenarios for the high-level design to help the following group(s) better understand the operation of the system. Several review meetings were held and a significant amount of time was spent in design discussions. We compared several design choices in terms of their flexibility, scalability, complexity, testability, reusability, and efficiency. In the end, we believe we found the best choice for the system in the long run.

8.1.4. Detailed Design

In the detailed design phase, all the methods of all the classes were identified, and the flowchart of each method was generated. The flowcharts were detailed to the extent that on the average each page of flowchart corresponded to 60 ~ 70 lines of source code. Detailed designs were reviewed at the flowchart stage, before coding began. In the review meetings, we removed tricky algorithm errors and optimized parts of the system. Testers participated in those meetings so that they could start designing test cases for branch coverage without seeing the source code of the system.

8.1.5. Coding

In the coding phase, developers were required to write code based on the detailed design document (flowchart). Code reviews were performed after they produced the code but before they compiled it, since according to [28], reviewing code before compiling it can remove syntax defects not detected by a compiler and at the same time generate satisfaction that comes from doing a quality job. We checked boundary conditions and typos as well as misuse of the Java language, during the code reviews. We verified that the code written really followed the flowchart produced in the detailed design. We also emphasized the places where inline comments should be added to make the code easily readable. Some good coding styles were found from the meetings and applied in the development of the project. Based on the number of problems found per thousand lines of code, we performed additional code reviews to guarantee that as many bugs as possible were removed from the code before we tested it. On average, two to three reviews were needed.

8.1.6. Unit Testing

The unit test design and implementation were done in parallel with the development of the source code. Some test code was written even before the source code to be tested was ready. Code reviews were performed on the test code too. On the one hand this removed bugs that existed in the test code. On the other hand some bugs in the source code were also found and removed just by reviewing the test code. The review meetings also generated some good coding styles especially suitable for test code: they make the test code easy to follow and test results easy to check by human beings.

Unit testing was done following a bottom-up strategy. We first tested and debugged classes that are not using services provided by other classes. Then we tested and debugged classes that use the services of the classes we had already tested and debugged. Sometimes, testers coded dummy classes. Due to the weak coupling of the system, coding those dummy classes was relatively easy.
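The role of a dummy class in this strategy can be sketched as follows. This is a minimal illustration rather than code from the project: the names SessionStore, DummySessionStore, and SessionOpener are hypothetical, and the sketch assumes the class under test reaches its collaborator only through an interface, which is the kind of weak coupling that keeps such dummies cheap to write.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical service that a higher-level class depends on.
interface SessionStore {
    void add(String sessionId);
    int size();
}

// Dummy class: the simplest implementation that lets the caller be tested
// independently of the real store.
class DummySessionStore implements SessionStore {
    private final List<String> ids = new ArrayList<String>();

    public void add(String sessionId) { ids.add(sessionId); }

    public int size() { return ids.size(); }
}

// Class under test: it sees the store only through the interface.
class SessionOpener {
    private final SessionStore store;

    SessionOpener(SessionStore store) { this.store = store; }

    // Registers a session and returns how many sessions are now open.
    int open(String sessionId) {
        store.add(sessionId);
        return store.size();
    }
}
```

A tester can construct SessionOpener with a DummySessionStore and check its behavior before the real store exists or has been debugged.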

8.1.7. Integration Testing

There is no clear line between unit testing and integration testing. Since we were using a bottom-up strategy, the later phase of unit testing was really testing the cooperation of many classes. The integration testing here refers to the testing of the communication between "session_manager" and "uip", and also the testing of connecting the Java "webgate", Java MARIAN server, and the C/C++ MARIAN server together.

8.2. Project Management

A significant amount of time and effort was put into the management of this project, since we believe this is crucial to the success of a multi-person project.

8.2.1. Group Assignment

Group assignment was done at the very beginning of the semester. Six groups along with their responsibilities were identified for this project. For each group, workload at different times during the semester had been specified so that members could arrange their work accordingly. For example, the original project understanding group would be very busy in the first third of the semester while the detailed design, implementation and testing groups had almost nothing to do until the middle of the semester.

To choose the most suitable person for each position, and to respect each member's interests as much as possible, a group preference questionnaire was sent to each member. Each member was required to list, in order, the four positions he/she preferred, and to explain why he/she liked each position in terms of his/her background and interests. Analyzing the questionnaires, we found that 8 students listed the webgate interface group as their first preference while we needed only three. On the other hand, only one student expressed tentative interest in testing while we needed at least 3 testers. Thus there was strong competition for the webgate interface group positions, and in the end we chose those who had usability engineering experience and had done some interface design before.

8.2.2. Training

Training was used to make sure each member had the required skill to perform his/her task effectively. The Java programming training performed during the development of this project proved to be very effective.

Many members said that one of the reasons they chose this project was that they did not know, or were not familiar with, the Java language. They hoped to learn some Java programming skills by working on this project. We therefore decided they needed to be trained in Java programming. Otherwise they would have ended up spending a lot of time studying the language by themselves, and much of what they learned that way might not have been used in the project. The training was thus aimed at helping them devote more time to the development of the project.

A training questionnaire was sent out to 9 members based on their positions and roles in the project. We only trained those who needed to know Java for their task. There were 12 questions in the questionnaire. Some questions were very simple (like "Do you know how to compile a Java program?") while others were complicated (like "Do you know how to use the synchronization control in Java?"). To save time and meet the deadline of the project, only questions related to the development of the project were put in the training questionnaire. The purpose of the questionnaire was to identify how much needed to be taught for each member.

Based on the answers to the survey, members were assigned to three groups. The members of the basic level group knew nothing about Java including how to compile and run a Java program. The middle level group members knew some Java but were not familiar with complicated operations. The advanced level group members knew and had used Java before but needed some reminding since they had not used it for some time. There were two members who knew the answers to all the questions in the questionnaire and thus were excused from the training.

Training for the three groups was performed in parallel by three trainers (Java experts) at the same time. Different training strategies were used for different groups since members had different backgrounds. A training page was posted online after the training. The training page explains how to download and install JDK, the usage of some system classes as well as some features of the Java language. We built the page in such a way that everything on it was useful for the development of this project, and almost everything used in this project (in terms of Java language) was on the page. This page helped greatly in our software development.

The result of the training was striking. Even those members who knew nothing about Java at the beginning could write code quickly in the coding phase by consulting the training page. They said the training saved them a tremendous amount of time in learning Java programming.

8.2.3. Measurement & Estimation

Measurement and estimation also were very important in the development of this project. On the one hand they helped us make reasonable plans based on previous performance. On the other hand by analyzing the data collected we figured out ways to improve the processes we followed in developing the system, thus increasing our efficiency.

The project manager held weekly meetings with each group and with individual members. In each meeting they planned the next week's tasks for the group or member. A public Web page was set up with all the deadlines and task assignments for groups and members. When a deadline came, another meeting was held to determine whether or not the tasks had been accomplished. After that, a colored percentage number was posted near the deadline: 100% success was shown as green, 50% as yellow, and 0% as red. If much less than 100% was achieved, a reason was posted too. Since every member wanted to see the deadlines under his/her name posted in green (which was considered in grading), this page acted as a motivational tool. As a result, a majority of students achieved all green, and some even achieved deep green (indicating more than 100% success -- i.e., ahead of schedule).

That page alone was not enough since we still had the problem of making reasonable estimates in the meetings. We used a table with the following format (see Table 8.1) to measure the performance of all the members.

Table 8.1. Measurement and estimation table

date | name | position | task description | estimated time | actual time | comments
12/01/98 | XXX | Developer | XXX | 20 hours | 18 hours |
12/03/98 | XXX | Tester | XXX | 10 hours | 20 hours | Understand the design document took too much time
... | ... | ... | ... | ... | ... | ...

The column "date" records when the corresponding task has been finished. The "name" column records who performed the task. The "position" column records the position of the person when he/she performed the task. The "task description" column describes the task performed. This description should be understandable in terms of workload -- for instance, "code classes AAA and BBB based on detailed design". Where possible, numbers were used to help measure it quantitatively. The "estimated time" and "actual time" columns show the estimated and the real time taken to perform the task, respectively. The "comments" column is used if there is a big difference between the estimated and the actual time used.

This table along with the public deadline page was introduced into the project late in the semester. We benefited from them very much in terms of making accurate estimations based on a member's previous performance. Many estimates made at the beginning were far from accurate since we had nothing to base them on. By the end of the semester, our estimates were almost 90% accurate. We found that design time is more difficult to estimate than coding since it involves much more creativity. Also we made "poor" estimates at the beginning of the testing because we didn't consider the significant amount of time taken for the tester to understand the operation of the system.
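The way such a table supports estimation can be sketched in a few lines of Java. This is an illustration under stated assumptions, not the tooling used in the project: the class names TaskRecord and EstimationLog are hypothetical, and the accuracy measure (one minus the relative error of the estimate against the actual time) is an assumed definition, since the text does not say how the "almost 90%" figure was computed.

```java
import java.util.ArrayList;
import java.util.List;

// One row of the measurement and estimation table (columns as in Table 8.1).
class TaskRecord {
    final String date;
    final String name;
    final String position;
    final String taskDescription;
    final double estimatedHours;
    final double actualHours;   // assumed to be greater than zero
    final String comments;

    TaskRecord(String date, String name, String position, String taskDescription,
               double estimatedHours, double actualHours, String comments) {
        this.date = date;
        this.name = name;
        this.position = position;
        this.taskDescription = taskDescription;
        this.estimatedHours = estimatedHours;
        this.actualHours = actualHours;
        this.comments = comments;
    }

    // Assumed measure: 1 - |actual - estimated| / actual, never below 0.
    double accuracy() {
        double relativeError = Math.abs(actualHours - estimatedHours) / actualHours;
        return Math.max(0.0, 1.0 - relativeError);
    }
}

class EstimationLog {
    private final List<TaskRecord> rows = new ArrayList<TaskRecord>();

    void add(TaskRecord row) { rows.add(row); }

    // Mean accuracy over one member's finished tasks, or -1 if there are none.
    double meanAccuracy(String name) {
        double sum = 0.0;
        int count = 0;
        for (TaskRecord row : rows) {
            if (row.name.equals(name)) {
                sum += row.accuracy();
                count++;
            }
        }
        return count == 0 ? -1.0 : sum / count;
    }
}
```

Under this assumed measure, the two sample rows of Table 8.1 score about 0.89 (20 hours estimated, 18 actual) and 0.50 (10 hours estimated, 20 actual). A member's running mean could then be used to scale his/her next estimate.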

During the unit testing phase, we built a bug history table (see Table 8.2) to record all the bugs we encountered.

Table 8.2. Bug history table

time | bug description | severity | bug reason | time taken to fix | comments
12/10/98 | Class "coverage_string_pair", couldn't set coverage using the constructor | Severe | Typo in the parameter of the constructor of the class "coverage_string_pair", "converage" should be "coverage" | 15 minutes | In the constructor when assigning variables try to use different names, like "my_coverage = coverage"
02/10/99 | Class "uip_log_manager" couldn't log data correctly when the logging level is 2 | Not very severe | Class "rpc_function" method "to_stream()" forgets to flush the stream when the logging level is not high enough | 20 minutes |
... | ... | ... | ... | ... | ...

There is one row for each bug found during the testing of the system. The "time" column specifies the date the bug was fixed. The column "bug description" specifies the behavior that created awareness of the bug. The "severity" column specifies how severe the bug would be to the whole system if not fixed. The column "bug reason" specifies what caused the bug, and the column "time taken to fix" specifies how long it took the corresponding developer to find the reason and fix the bug. The "comments" column is used to write something which the developer or tester believes can prevent similar bugs from occurring again. Analyzing this table helped us improve the efficiency of our programming.
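The first row of Table 8.2 shows how a "comments" entry turns into a preventive coding rule. The sketch below is a hypothetical reconstruction, since the original "coverage_string_pair" source is not reproduced here: the class names BuggyPair and FixedPair, the field types, and the exact form of the faulty assignment are all assumptions chosen to match the description of the bug.

```java
// Hypothetical reconstruction of the defect described in Table 8.2.
class BuggyPair {
    private double coverage;
    private String text;

    // Typo: the parameter is spelled "converage", so "coverage" below names
    // the field on both sides and the assignment changes nothing.
    BuggyPair(double converage, String text) {
        coverage = coverage;   // self-assignment; the field keeps its default 0.0
        this.text = text;
    }

    double getCoverage() { return coverage; }

    String getText() { return text; }
}

// The rule from the "comments" column: give fields names that differ from
// the parameters, as in "my_coverage = coverage".
class FixedPair {
    private double my_coverage;
    private String my_text;

    FixedPair(double coverage, String text) {
        my_coverage = coverage;
        my_text = text;
    }

    double getCoverage() { return my_coverage; }

    String getText() { return my_text; }
}
```

With distinct names, the same typo in the parameter would leave "coverage" undefined inside the constructor, so the compiler would reject the code instead of letting the bug reach testing.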

We kept records for all the review meetings we held during the development of the project. For each meeting we recorded persons attending, time taken, things covered as well as conclusions reached. The time taken and things covered were used to estimate future review meetings.

8.2.4. Process Improvement and Defect Prevention

Though we believed we had followed very good processes, we knew they could never be perfect. So we kept an eye on possible improvements of our process throughout the development of the project. We observed the result of our review meetings, and analyzed the data we collected using the estimation and measurement table. We also asked group members for their thoughts on the bottlenecks in the development during our weekly meetings.

For example, during several high-level design and detailed design review meetings, developers and testers complained that they had difficulty understanding the design and operation of the system. They said they only saw "a batch of functions there, but don't know how they communicated with each other". We analyzed the reason for this and concluded that our high-level design document (which at that time was only a class relationship diagram with explanations) gave only a static view of the system. To perform their tasks efficiently, developers and testers also needed a dynamic view of the system, in terms of how different methods and objects communicate with each other. The high-level designers were therefore required to write another document describing how the system operates in terms of scenarios. Since the high-level designers were very familiar with the design (they made it), it took them little time to write the operation scenarios (only 2 to 3 hours for 4 to 5 thousand lines of code). This saved developers and testers much time in understanding the operation of the system. They reported that, using the scenarios, it took them only 20 to 30 minutes to understand the operation of a system of several thousand lines of code.

Another process improvement was the addition of synchronization control checking to the detailed design review meetings. This system provides service to multiple users at the same time. No one can predict what each user will do at a certain time, and there may be multiple threads running independently inside the system. To make sure our system would not crash or behave undesirably in any possible situation, we decided that a synchronization control checking process needed to be added to our detailed design review. In this process, we first examined each method of an object to see whether more than one thread could execute it at the same time. If so, we checked whether such execution could produce undesirable results in our system. If that was possible, we marked the method as a synchronized method. Second, for each method of an object, we checked whether it could be executed while other methods of the same object were being executed by other threads. If there was such a possibility, we checked whether it could produce undesirable results. If so, we marked both methods as synchronized methods. Many methods were marked as synchronized as a result of this process. It was also in this process that we identified the need for the "reader_writer_mutex" class. Mainly due to this process, no synchronization bugs were found in the Java MARIAN server during testing, while such bugs were found both in the Java "webgate" and in the C/C++ MARIAN server, which were developed without such a process.
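The two checks can be illustrated with a small Java class. This is a sketch under assumptions, not a class from the project: SharedTable and its methods are hypothetical, and the hammer() helper exists only to exercise the class from several threads.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical shared table; not the project's actual "session_table".
class SharedTable {
    private final Map<String, Integer> hits = new HashMap<String, Integer>();

    // Check 1: several threads may run record() at once, and the
    // read-modify-write on the map could lose updates, so it is synchronized.
    synchronized void record(String key) {
        Integer old = hits.get(key);
        hits.put(key, old == null ? 1 : old + 1);
    }

    // Check 2: count() may run while another thread is inside record() and
    // could observe the map mid-update, so both methods are synchronized.
    synchronized int count(String key) {
        Integer n = hits.get(key);
        return n == null ? 0 : n;
    }

    // Helper: starts `threads` threads that each call record("k") `perThread`
    // times, waits for them, and returns the final count.
    static int hammer(int threads, int perThread) {
        SharedTable table = new SharedTable();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    table.record("k");
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return table.count("k");
    }
}
```

Because both methods lock the same object, no increment is lost: four threads making 1,000 calls each always yield a count of 4,000. Without the synchronized keyword, the result could be lower or the map could be corrupted.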

A significant amount of effort also was put into avoiding duplicated mistakes or preventing mistakes from happening at all. From the beginning of the semester we maintained a common error page (http://video.dlib.vt.edu:90/marian/cs5604/management/err.html) for our programmers. The content of this page was collected from several Java experts based on the mistakes they made when they did Java programming before. All our programmers were required to read the common error page (the time it took them to read the page was also estimated and counted into their workload) to avoid making similar mistakes. In addition to that, during our code reviews, if we found a mistake and believed it was not a special case, we posted it on our common error page and informed other programmers about this. Sometimes we even made use of mistakes made by students in other projects (like those who were doing projects in the software engineering class), posting them on our common error page. We maintained a page (http://video.dlib.vt.edu:90/marian/cs5604/management/code_style.html) about the coding style in this project. Again programmers were required to read this page before they started coding. We updated this page dynamically when we found some good coding styles which we believe could help other people understand the program better or reduce the bug rate. All these were very effective: our programmers became more and more efficient and wrote more and more professional code with higher speed.

8.2.5. Software Reuse

We believe software reuse can reduce time and increase system quality. So during the development of this project we kept an eye on software reuse and also developed some reusable components when our time constraints permitted.

When designing the services provided by a class, we not only considered those which were needed in this project at that time, but also those which should be provided to make the services provided by this class complete. This increased the possibility that this class could be used in other projects or in the future development of this project.

For example, when we designed our "uip", we allowed two ways of passing and receiving functions through it. Running in "DIRECT_CALL" mode, when it receives a function from its "call_back_processor" to pass to the other subsystem, it will block the "call_back_processor" until the function is written to the other subsystem through a socket. Running in "THREAD_CALL" mode, it will generate a thread to write the function while at the same time returning the control to the call back processor immediately. The same thing happens in the other direction -- when the "uip" receives a function from the other subsystem it either generates a thread to pass the function to its "call_back_processor" to process or passes the function to the "call_back_processor" directly, depending on the mode it is running in. The advantage of "THREAD_CALL" is that it will not block the caller when the function takes the callee a long time to process. The disadvantage is that more threads are created in the system and they consume system resources. The user is allowed to choose either mode for the "uip" or even to choose different modes for different directions (sending or receiving functions) by changing a configuration file. Being able to provide this feature makes our "uip" suitable for different environments, thus increases its reusability. The design of our "reader_writer_mutex" class also illustrates this by allowing the user to specify a maximum number of concurrent readers.
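The difference between the two modes can be sketched as follows. This is an assumed, simplified shape and not the actual "uip" code: the names FunctionSink and UipSketch are hypothetical, a function is represented by a plain String, and the socket write is abstracted behind an interface. Only the mode names DIRECT_CALL and THREAD_CALL come from the design described above.

```java
// Hypothetical stand-in for "write the function to the other subsystem".
interface FunctionSink {
    void write(String function);
}

class UipSketch {
    static final int DIRECT_CALL = 0;
    static final int THREAD_CALL = 1;

    private final int mode;
    private final FunctionSink sink;

    UipSketch(int mode, FunctionSink sink) {
        if (mode != DIRECT_CALL && mode != THREAD_CALL) {
            throw new IllegalArgumentException("unknown mode: " + mode);
        }
        this.mode = mode;
        this.sink = sink;
    }

    // Passes a function to the sink. Returns the worker thread in THREAD_CALL
    // mode, or null in DIRECT_CALL mode.
    Thread send(String function) {
        if (mode == DIRECT_CALL) {
            sink.write(function);   // the caller is blocked until the write is done
            return null;
        }
        Thread worker = new Thread(() -> sink.write(function));
        worker.start();             // the caller regains control immediately
        return worker;
    }
}
```

In this sketch the mode is a constructor argument. Reading it from a configuration file, with separate settings for the sending and receiving directions, would follow the design described above.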

We reused code throughout the development of this project. The class "client_uip" was used both in "webgate" and the Java MARIAN server. It also will be used in our future design to distribute search engines, as illustrated in Figure 7.10. The "uip_log_manager" class was used both in "client_uip" and "server_uip". The "reader_writer_mutex" class was used in "session_table" and "server_uip_receiver_table" for synchronization control. Since the constructors of many classes need to read configuration information from a file, we developed and tested some code to do this and let all our developers use this piece of code in writing the constructors of their classes.
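A reader/writer lock with a configurable limit on concurrent readers, in the spirit of the "reader_writer_mutex" class, might look like the sketch below. It is not the project's implementation: the class name ReaderWriterMutex, its method names, and the non-blocking tryLock variants are assumptions made for illustration, and the sketch gives writers no priority over readers.

```java
// Sketch of a reader/writer lock with an upper bound on concurrent readers.
class ReaderWriterMutex {
    private final int maxReaders;
    private int activeReaders = 0;
    private boolean writing = false;

    ReaderWriterMutex(int maxReaders) {
        if (maxReaders < 1) {
            throw new IllegalArgumentException("maxReaders must be at least 1");
        }
        this.maxReaders = maxReaders;
    }

    // Blocks while a writer is active or the reader limit has been reached.
    synchronized void lockRead() throws InterruptedException {
        while (writing || activeReaders >= maxReaders) {
            wait();
        }
        activeReaders++;
    }

    // Non-blocking variant: returns false instead of waiting.
    synchronized boolean tryLockRead() {
        if (writing || activeReaders >= maxReaders) {
            return false;
        }
        activeReaders++;
        return true;
    }

    synchronized void unlockRead() {
        if (activeReaders == 0) {
            throw new IllegalStateException("no active reader");
        }
        activeReaders--;
        notifyAll();   // wake waiting readers and writers
    }

    // Blocks until there are no readers and no other writer.
    synchronized void lockWrite() throws InterruptedException {
        while (writing || activeReaders > 0) {
            wait();
        }
        writing = true;
    }

    // Non-blocking variant: returns false instead of waiting.
    synchronized boolean tryLockWrite() {
        if (writing || activeReaders > 0) {
            return false;
        }
        writing = true;
        return true;
    }

    synchronized void unlockWrite() {
        if (!writing) {
            throw new IllegalStateException("no active writer");
        }
        writing = false;
        notifyAll();
    }
}
```

A table class would call lockRead() and unlockRead() around lookups and lockWrite() and unlockWrite() around updates, typically in a try/finally block. Because writers get no priority here, a steady stream of readers could delay a writer indefinitely, and a production version would need to address that.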

We also reused design patterns we created earlier. The design of "server_uip" and that of the "session_manager" follow similar patterns. Also similar design patterns can be found in the relationship between "server_uip_receiever_manage" thread and "server_uip_receiver_table", the relationship between "session_manage" thread and "session_table", the relationship between "user_manage" thread and "user_manager", and the relationship between "dynamic_uip_manage" thread and "uip_manager".

We also reused some design patterns from other projects -- the design of "request_response" thread in the Java "webgate" made use of the design of one of the projects of the Networking class (CS5516) to develop an HTTP server. In addition to that, after we found possible improvements in one design, we traced all the similar designs to see if it were possible for them to benefit from those improvements.

8.3. Results

The Java MARIAN server contains about 40 classes and 20,000 lines of code. About 1,500 man-hours were spent in developing this subsystem. This figure includes the time spent in requirements analysis, original project analysis, high-level design, detailed design, coding, unit testing, and integration testing. It also includes the time spent in all the review meetings and training.

During the unit testing phase, eleven bugs (including non-severe ones) were found and fixed. The total time spent in debugging them was less than 3 hours (not including the time to document the fixes). Since then, no further bugs have been found. This indicates that the system we developed following such processes is very reliable.

If we assume a full-time employee works 40 hours per week, and 4 weeks per month, then the time we spent in developing the 20,000-line Java server is roughly 9 man-months. Considering the quality of the system, and considering the background of our members at the beginning of the semester (they had little experience in developing digital library systems, and quite a few developers knew almost nothing about Java), the results we achieved are very encouraging.
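The arithmetic behind these figures can be made explicit. The sketch below takes 40 hours as the weekly working time; the class and method names are hypothetical, and the defects-per-thousand-lines figure is derived here from the numbers reported in this section rather than stated in the text.

```java
// Worked arithmetic for the effort and defect figures reported in this section.
class EffortSummary {
    // Converts man-hours to man-months for a given working schedule.
    static double manMonths(double manHours, double hoursPerWeek, double weeksPerMonth) {
        return manHours / (hoursPerWeek * weeksPerMonth);
    }

    // Defects found per thousand lines of code.
    static double defectsPerKloc(int defects, int linesOfCode) {
        return defects / (linesOfCode / 1000.0);
    }
}
```

EffortSummary.manMonths(1500, 40, 4) evaluates to 9.375, which is the "roughly 9 man-months" quoted above, and EffortSummary.defectsPerKloc(11, 20000) evaluates to about 0.55 bugs per thousand lines found during unit testing.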

The results of the experiments described in the next chapter further support the conclusion that the system developed following those processes is reliable.


