Chapter 13
Joint Architecture Design

Thus it is that in war the victorious strategist only seeks battle after the victory has been won, whereas he who is destined to defeat first fights and afterwards looks for victory.
- Sun Tzu

So far in Part II, Building Processes for Scale, we have focused on many reactive processes such as managing issues, crisis management, and determining headroom. In this chapter and the next, we are going to introduce two processes that are proactive, not reactive. We are going to shift from reaction (what to do when something goes wrong) to discuss how to build the application in a scalable manner in the first place.

These two processes are cross functional and are interwoven within the product development life cycle. They are the Joint Architecture Design (JAD) and the Architecture Review Board (ARB). In this chapter, we are going to focus on JAD, which comes earlier in the product development life cycle than does ARB and sets the stage for designing a system that scales. Using a very simple sport analogy of running, JAD would be equivalent to the training or preparation for the race. ARB, continuing the analogy, would be the actual race.

JAD is a collaborative design process wherein all engineering assets necessary to develop some new major functionality work together to define a design consistent with the architectural principles of the organization. The ARB is a review board of select architects from each of the functional or business areas, whose job is to ensure that prior to final sign-off of a design, all company architectural principles have been incorporated, and that best practices have been applied.

Fixing Organizational Dysfunction

In the introduction, we mentioned that the JAD process was cross functional. In dysfunctional organizations, the implementation of JAD is challenging but absolutely necessary to help cure the dysfunction.
If you are part of one of those organizations where engineers do not trust operations staff and vice versa, unfortunately you are among the majority. It is sad but common to have distrust and even animosity between teams. Before we can figure out how to overcome this dysfunction and start building scalable applications through the use of cross-functional teams, we need to understand why this problem exists.

In most software development shops, it is not too difficult to find an engineer who feels that the architects and the operations staff (database administrators, systems administrators, and network engineers) are either not knowledgeable about coding or don’t fully understand the software development process. The reverse distrust is also prevalent: operations staff or architects feel that software engineers only know how to code and do not understand higher level design or total systems concepts. Even worse, each believes that the other’s job is in direct opposition to accomplishing his own goals. Operations staff can often be heard mumbling that “they could keep the servers up if the developers would stop putting code on them” and developers mumble back that “they could develop and debug much faster if they didn’t have operations making them follow silly policies.” This set of perceptions and misgivings is destructive to the scalability of the application and organization. It also shows how the “experiential chasm,” which we discussed in Chapter 6, Making the Business Case, can exist among technology teams as easily as it can between the business and technology teams.

As a brief refresher on the experiential chasm, we proposed that the differences in education and experience between the two groups of people cause a type of destructive interference in communication.
The formal education of a software developer and a systems administrator at an undergraduate level may be very similar (the same computer science degree) or may vary significantly (computer science versus computer engineering). The on-the-job education is where the really large differences begin to emerge. Systems administrators or database administrators typically get mentored by more senior administrators for several years until they become proficient with a specific technology, ever increasing their specialization in that field. Software engineers typically follow a similar path but are focused on a particular programming language. What server the application runs on or what database the application calls is for the most part abstracted away for the software engineers so they can concentrate on feature development.

With the experiential chasm as the starting point between the two groups, when we add differing and sometimes opposing goals, these groups drift so far apart that they see no common ground. Most organizations do not share goals across teams. This is problematic if the intent is to get these teams working together instead of fighting each other. The operations team usually is saddled with the goal of uptime or availability for the site. Any downtime gets taken out of their bonuses. The development team is usually given the goal of feature delivery. A missed delivery date results in lower bonuses for them. At the CTO level, the CTO thinks that all of his goals are being handled by one of his teams and therefore everything is covered. The reality is that separating goals like this actually causes strife among his teams.

The development goal of feature delivery pushes developers to want to get code out fast, and if it breaks, they figure they can fix it on the fly. This is by far the fastest way to achieve initial delivery of code, which is usually all that is measured.
In the long run, this approach actually takes more time because problems take a lot of time to find, fix, and redeploy. As we mentioned earlier, this post-delivery time is usually not measured and therefore is missed as being part of the delivery goal.

The operations team wants to keep the site up and increase the availability as per its goal. It is therefore motivated to keep changes out of production because changes are the primary cause of issues. It decides that the fewer code pushes or changes made to the production environment, the more likely the team is to meet its goal. Whether consciously or not, the team is suddenly not so motivated to get code pushed out and in fact will start to find any reason for delays.

As you can hopefully see by now, you have two or more groups that have incredibly valuable knowledge about how systems and architectures work in general and specific knowledge about how your system operates, but they are naturally wary of each other and are likely being incented to not work together. How can you fix this? The JAD process is a great place to start. As we’ll discuss in the next section of this chapter, JAD is a collaborative process that pulls cross-functional team members together with a common goal. The JAD team either succeeds or fails together and this reflects on its organizations and its leadership team.

The basic process of JAD is to assign a major feature not only to a software engineer but also to an architect, at least one operations engineer (database administrator, systems administrator, or network engineer), and optionally a product manager, project manager, and quality assurance engineer as needed for this specific feature. The responsibility of this team is to come up with a design that follows the established architecture principles of the organization, that will allow the system to continue to scale, that allows the feature to meet the product requirements, and that will be able to pass the ARB.
This team is composed of the people who will ultimately present the design to the ARB, which, as we will discuss in the next chapter, is made up of peers and managers who get to decide if this design satisfies the exit criteria. Fortunately, this collaboration does not just stop at the design; because these individuals have put their credentials on the line with this feature, they are now motivated to watch it through the entire life cycle to ensure it is a success. Engineers are now being held responsible for the design and how it performs in production. The database administrators are being held accountable for designing this feature to not only scale but to also meet the business requirements. Now we have the developers, architects, and operations staff working together, jointly, with a shared goal.

Designing for Scale Cross Functionally

We discussed briefly the structure and mechanism of JAD. Now, we can get into more detail. JAD is a collaborative design process wherein all engineering assets necessary to develop some new major functionality or architectural modification work together to define a design consistent with the architectural principles and best practices of the company to implement the business unit requirements. This group of engineering assets is composed of the software engineer responsible for ultimately coding the feature, an architect, at least one but possibly multiple operations staff, and, as needed based on the feature, the product manager, a project manager, and a quality assurance engineer.

As mentioned earlier, each brings unique knowledge, perspectives, experiences, and goals that augment each other as well as counter-balance each other. Although the operations engineer now has the goal of designing a feature that meets the business requirements, she also still has the goal from her organization of maintaining availability.
This helps ensure that she is as vigilant as ever about what goes into production.

By involving each of the technology groups, the team can weigh tradeoffs among hardware, software, middleware, and build versus buy approaches, which can help shave time to market, cost of development, and cost of operations, and increase overall quality. The software engineer has typically been abstracted from the hardware by the services of the operations team. So trying to have the software engineer design a feature for image storage (see the “Image Storage Feature” sidebar for the complete example) without knowledge of the storage device that can and should be used is asking to fail in meeting the requirements, fail in cost-effectiveness, or fail in the scalability of the system. A shared goal of scalability ensures the culture is pervasive; when there are issues or crises, all hands are on deck because of shared ownership.

This JAD approach is not limited to waterfall development methodologies where one phase of product development must take place before the other. JAD can be and has been successfully used in conjunction with all types of development methodologies such as iterative or agile, in which specifications, designs, and development evolve as greater insights are gained about the product feature. Each time a design is being modified or appended to, a JAD can be called to help with it. The type of architecture does not preclude the use of JAD either. Whether it is a traditional three-tier Web architecture, Service Oriented Architecture, or simply a monolithic application, the collaboration of engineering, operations, and architects to arrive at a better design simply takes advantage of the fact that solutions arrived at by teams are better than those arrived at by individuals. The more diverse the background of the team members, the more holistic the solution is likely to be.

The actual structure of the JAD is very informal.
After the team has been assigned to the feature, one person takes the lead on coordinating the design sessions; this is typically the software engineer or the project manager, if assigned. There are usually multiple design sessions that can last between one and several hours depending on people’s schedules. For very complex features, multiple design sessions for various components of the feature should be scheduled. For example, one session might focus on the database and a separate one on the cache servers.

Typically, the sessions start with a discussion covering the background of the feature and the business requirements. During this phase, it is a good idea to have the product manager present and then on call for any clarifications as questions come up. After the product requirements have been discussed, a review of the architectural principles that relate to this area of the design is usually a good idea. Following this, the team brainstorms about various solutions and typically arrives at a few different possible solutions. These are written up at the end of the meeting and sent around for people to ponder until the next session. Usually only a session or two are required to come to an agreement on the best approach for the design of the feature. The final design is written down and documented for presentation at the ARB.

Image Storage Feature

At our fictional company AllScale, a feature for the human resource management (HRM) application has been requested that will allow for the storage of pictures of personnel to be displayed in their personnel folders that the HR and hiring managers bring up to conduct reviews, salary adjustments, and so on. The software engineer, Sam Codur, who has been assigned to this feature, has very little idea of the hardware or storage devices that are used in production.
He has overheard the operations folks talk about a SAN or NAS but he is really clueless about the differences. Furthermore, he has never even heard of different classes of storage and has never given a single minute of thought to backup and recovery of storage in the event of data corruption, hardware failure, or natural disasters. Figure 13.1 depicts Sam trying to decide on all the nuances of hardware, software, and network devices alone without any other experts to aid him.

Figure 13.1 Software Engineer Pondering Classes of Storage

The product manager has specified for this feature that any standard image format be acceptable, that all past profile images be available, and that the size be less than 500KB per image. To Sam, the software engineer, this seems reasonable and instead of soliciting guidance from the operations staff, he decides that he can code the feature and let ops worry about how and where the images actually get stored. The result, after ten days of coding and another five days of quality assurance testing, is that the feature gets noticed in the notes for the upcoming release by Tom Harde, VP of operations. Tom sends a set of questions to Mike Softe, VP of engineering, asking how this feature was designed, the response time requirements, and the storage estimates per year. After this email gets passed back and forth several times, it eventually gets escalated to Johnny Fixer, the CTO, with both sides insisting that the other is being unreasonable.

Johnny now has to get involved and make some hard decisions to either delay the release in order that the feature be redeveloped to meet the operations team’s standards (images less than 100KB, no multiple images, timeouts coded for response times greater than 200msec, no guarantee of image storage, etc.) or push the feature out as developed and worry about it causing issues across the site.
Johnny decides to pull the feature from the release, which requires some retesting to be performed and a delay of a day for the release date. Instead of just fixing this single feature, Johnny decides that he needs to fix the process to make sure there are not more features like this one in the future.

Johnny gathers Mike and Tom to introduce the Joint Architecture Design process. He explains that when an engineer is developing a feature and it is either part of the core modules/services of the HRM system or it is estimated to take more than five days of development, then a JAD must take place. The participants will be the engineer developing the feature, a systems architect, and an operations staff member assigned by Tom or his managers. Johnny continues to explain that this team of individuals owns the design and will be held accountable for the success or failure of the feature in terms of its performance, availability, and scalability. Tom and Mike see this process as a way to achieve a win-win situation and depart quickly to explain it to their teams.

JAD Checklist

Here is a quick checklist for how to conduct the JAD sessions to ensure you do not skip any of the most important steps. As you feel more comfortable with this process, feel free to modify this and create your own JAD checklist for your organization to follow:

1. Assign participants.
2. Mandatory. Software engineer, architect, operations engineer (database administrator, systems administrator, and/or network engineer).
3. Optional. Product manager, project manager, quality assurance engineer.
4. Schedule one or more sessions. Divide sessions by component if possible: database, server, cache, storage, etc.
5. Start the session by covering the specifications.
6. Review the architectural principles related to this session’s component.
7. Brainstorm approaches. No need for complete detail.
8. List pros and cons of each approach.
9. If multiple sessions are needed, have someone write down all the ideas and send them around to the group.
10. Arrive at consensus for the design. Use voting, rating, ranking, or any other decision technique that everyone can support.
11. Create the final documentation of the design in preparation for the ARB.

Don’t be afraid to modify this checklist as necessary for your organization.

Entry and Exit Criteria

With the JAD process, we recommend that specific criteria be met before a feature can begin the JAD process. Likewise, certain criteria must be met in order for that feature to move out of JAD. By holding fast to these entrance and exit criteria, you will preserve the integrity of the JAD process and not weaken it. Some examples of allowing these criteria to be bypassed are admitting features that aren’t large enough to require a team effort to design, or allowing a feature without an operations engineer on the team to start JAD because the operations team is swamped handling a crisis. Giving in to these one-off requests will ultimately devalue the JAD, and participants will believe that they can stop attending meetings or that they are not being held accountable for the outcome. Do not even start down this slippery slope; make the entrance and exit criteria rigorous and unwavering, no exceptions.

The entrance criteria for JAD are the following:

• Feature Significance. The feature must be significant enough to require the focus of a cross-functional team. The exact nature of significance can be debated. We suggest measuring this in three ways:
   1. The first is size. For size, we use the amount of effort to develop as the measurement. Features requiring more than 10 days of total effort are considered significant. To calculate this for features that have multiple engineers assigned to them, sum all engineering days estimated for the feature.
   2. The second is potential impact on the overall application or system.
If the feature touches many of the core components of the system, this should be considered significant enough to design cross functionally.
   3. The third is complexity of the feature. If the feature requires components that are not typically involved in features, such as caching or storage, it should go through JAD. A feature that runs on the same type of application server as the rest of the site and retrieves data from the database is not complex enough to meet this requirement.
• Established Team. The feature must have representatives assigned and present from engineering, architecture, and operations (database and system administration, possibly network). If needed, members from quality assurance, project management, and product management should be assigned. If these required team members are not assigned and made available to attend the meetings, the feature should not be allowed to participate in JAD.
• Product Requirements. The feature must have product requirements and a business case in order to participate. The reason is that tradeoffs are likely to be made based on different architectural solutions, and the team will need to distinguish the critical requirements from the nice-to-have ones. Also, understanding the revenue generated by this feature will help when deciding how much investment should be considered for different solutions.
• Empowered. The JAD team must be empowered to make decisions that will not be second-guessed by other engineers or architects. The only team that can approve or deny the JAD design is the ARB, which gets final review of the architecture. In RASCI terminology, the JAD team is the R (Responsible) for the design and the ARB is the A (Accountable).

The exit criteria for a feature coming out of JAD are the following:

• Architectural Principles. The final design of the feature must follow all architectural principles that have been established in the organization.
If there are exceptions to this rule, they should be documented and expected to be questioned in ARB, resulting in a possible rejection of the design. We will talk more about the ARB process in the next chapter.
• Consensus. The entire team should be in agreement and support the final design. The time for dissension is during the team meetings and not afterward. If someone from the JAD team begins second-guessing team decisions, this should be grounds for requiring the JAD to be conducted again, and any development on the feature should be stopped immediately.
• Tradeoffs Documented. If there were any significant tradeoffs made in the design with respect to the requirements, cost, or principles, these should be spelled out and documented for the ARB to review and for any other team member to reference when reviewing the design of the feature.
• Final Design Documented. The final design must be documented and posted for reference. The design may or may not be reviewed by ARB, but the design must be made available for all teams to review and reference in the future. These designs will soon become system documentation as well as design patterns that engineers, architects, and operations folks can reference when they are participating in future JADs.
• ARB. The final step in the JAD process is to decide whether the feature needs to go to ARB for final review and approval. We will talk in more detail in the next chapter about what features should be considered for ARB but here are our basic recommendations. If this feature meets any of the following, it should proceed through ARB:
   1. Noncompliance with architectural principles. If any of the architectural principles were violated, this feature should go through ARB.
   2. Projects that cannot reach consensus on design. If the team fails to reach consensus, the feature can either be reassigned to a new JAD team or be sent to ARB for a final decision on the competing designs.
   3. Significant tradeoffs made. If tradeoffs had to be made in terms of product requirements, cost, or other nonarchitectural principles, this should flag a feature to proceed to ARB.
   4. High risk features. We will discuss how to assess risk in much more detail in Chapter 16, Determining Risk, but if the feature is considered a high risk feature, it should go through ARB. A quick way of determining if this is high risk is to look at how many core components the feature touches or how different it is from other features. The more core components that are touched or the greater the difference from other features, the higher the risk.

Conclusion

In this chapter, we covered the Joint Architecture Design (JAD) process. We started by understanding the dysfunction in technology organizations that causes features to be designed in silos. We revisited the experiential chasm as it played a role in this dysfunction. We also saw how differing goals among different technology teams can add to this problem. The fix is forcing the teams to work together for a shared goal. This occurs with the JAD process.

We then covered in detail the JAD process, including who were mandatory participants in the process and who were some of the optional team members. We described how the design meetings should be structured based on components and how important it was to start by making sure every team member was familiar with the business requirements of the feature as well as the applicable architecture principles of the organization.

We shared with you a JAD checklist that will be useful to get you and your organization started quickly with the JAD process. Our recommendation for using this was to start with our standard steps but fill it out as necessary to make it part of your organization. And then of course document the process so it becomes fixed in your organization’s culture and processes.
We closed the chapter with the entry and exit criteria of JAD. The entry criteria are focused on the preparation to ensure the JAD will be successful and to ensure that the integrity of the process remains. Letting features slip into a JAD without all the required team members is a sure way to cause the process to lose focus and not be as effective as it should be. The exit criteria are focused on ensuring that the feature design has been agreed upon by all members of the team and that if necessary it is prepared to be presented in the Architecture Review Board (ARB), which we will discuss in the next chapter.

Key Points

• Designing applications in a vacuum leads to problems; the best designs are done involving multiple groups offering different perspectives.
• The JAD is the best way to involve a cross-functional team that may not be incented to work together.
• The JAD team must include members from engineering, architecture, and operations (database administrators, systems administrators, or network engineers).
• The optional members of the JAD team include project management, product management, and quality assurance. These people should be added to the team as required by the feature.
• JAD is most successful when the integrity of the process is respected and entry and exit criteria are rigorously upheld.
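
The entry criteria and the ARB routing rules described in this chapter are mechanical enough to sketch in code, which can be handy if an organization wants to record them in a planning tool. The following Python is only an illustration under stated assumptions: the class, field, and function names are hypothetical, the cutoff for "many core components" is an invented placeholder (the chapter gives no number), and the three significance measures are treated as alternatives, so meeting any one is enough. Only the 10-day effort threshold and the lists of criteria come from the text. The Empowered criterion is an organizational commitment and is not modeled.

```python
from dataclasses import dataclass, field

# The 10-day threshold is from the chapter; the core-component cutoff is a
# hypothetical stand-in for "touches many of the core components".
SIGNIFICANT_EFFORT_DAYS = 10
MANY_CORE_COMPONENTS = 2

MANDATORY_ROLES = {"software engineer", "architect", "operations engineer"}


@dataclass
class Feature:
    name: str
    engineer_day_estimates: list[float]      # one estimate per assigned engineer
    core_components_touched: int = 0
    uses_atypical_components: bool = False   # e.g. caching or storage
    has_requirements_and_business_case: bool = False
    team_roles: set[str] = field(default_factory=set)


def is_significant(feature: Feature) -> bool:
    """Assumes any one of size, impact, or complexity makes a feature significant."""
    total_effort = sum(feature.engineer_day_estimates)  # summed across engineers
    return (
        total_effort > SIGNIFICANT_EFFORT_DAYS
        or feature.core_components_touched >= MANY_CORE_COMPONENTS
        or feature.uses_atypical_components
    )


def unmet_entry_criteria(feature: Feature) -> list[str]:
    """Return the entry criteria the feature fails; an empty list means it may start JAD."""
    unmet = []
    if not is_significant(feature):
        unmet.append("feature significance")
    missing = MANDATORY_ROLES - feature.team_roles
    if missing:
        unmet.append("established team (missing: " + ", ".join(sorted(missing)) + ")")
    if not feature.has_requirements_and_business_case:
        unmet.append("product requirements")
    return unmet


@dataclass
class JadOutcome:
    violates_principles: bool = False
    consensus_reached: bool = True
    significant_tradeoffs: bool = False
    high_risk: bool = False


def needs_arb(outcome: JadOutcome) -> bool:
    """A design goes to ARB if it meets any of the four conditions listed in the chapter."""
    return (
        outcome.violates_principles
        or not outcome.consensus_reached
        or outcome.significant_tradeoffs
        or outcome.high_risk
    )


if __name__ == "__main__":
    # Made-up numbers loosely modeled on the image storage sidebar.
    image_storage = Feature(
        name="image storage",
        engineer_day_estimates=[10, 5],
        uses_atypical_components=True,
        has_requirements_and_business_case=True,
        team_roles={"software engineer", "architect"},
    )
    print(unmet_entry_criteria(image_storage))
    # prints ['established team (missing: operations engineer)']
```

Returning the list of unmet criteria, instead of a single yes or no, keeps the reason for a refusal visible, which fits the chapter's advice that the criteria be upheld without exceptions.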