| | .. title:: |
---|
| | s5 DEMO!!!! - RandomCo Thoughts |
---|
| | =================================== |
---|
| | Using *s5* With *Restructured Text* |
---|
| | =================================== |
---|
| | |
---|
| | .. class:: center |
---|
| | |
---|
| | **Some Thoughts On RandomCo** |
---|
| | |
---|
| | .. class:: center |
---|
| | |
---|
| | *By R.U. Kidding-Me, Ph.D., Th.D, DDS, CNE, EIEIO* |
---|
| | |
---|
| | .. footer:: |
---|
| | $Id: s5demo.txt,v 1.1 2006/04/05 17:11:27 tundra Exp $ |
---|
| | $Id: s5demo.txt,v 1.101 2006/04/07 00:46:27 tundra Exp $ |
---|
| | |
---|
| | |
---|
| | Presentation Overview |
---|
| | --------------------- |
---|
| | |
---|
| | .. contents:: |
---|
| | |
---|
| | |
---|
| | |
---|
| | Meeting Mechanics |
---|
| | ----------------- |
---|
| | |
---|
| | - It would be best if we could get all the way through the presentation |
---|
| | in "Read Only" mode. This will allow you to understand the entire |
---|
| | analysis in context. This will be followed by an open-ended Q&A session. |
---|
| | |
---|
| | - Ed Murphy will present our findings using interpretive dance. |
---|
| | |
---|
| | .. image:: images/dance.jpg |
---|
| | :height: 200 |
---|
| | :width: 200 |
---|
| | :align: center |
---|
| | :alt: Dancing Penguin Image Missing! (Where Did He Go?) |
---|
| | |
---|
| | Possible Responses |
---|
| | ------------------ |
---|
| | |
---|
| | - As you hear us present today, you will have one of three responses: |
---|
| | |
---|
| | - "Oh, I/we already know that." |
---|
| | - "Hmm, that's new to me - good point." |
---|
| | - "I thoroughly disagree with what you just said." |
---|
| | - "Dude, that is *Sooooooo* wrong!" |
---|
| | |
---|
| | Each of these is an important way for you to validate and/or act upon |
---|
| | our findings. |
---|
| | |
---|
| | Assumptions |
---|
| | ----------- |
---|
| | |
---|
| | - A lot of baseline data such as machine inventories, levels of |
---|
| | utilization, application response time, and arrival rate profiles |
---|
| | was either missing, incomplete, or inconsistent across systems. |
---|
| | |
---|
| | - Some data (such as pricing) was completely unavailable to us. |
---|
| | |
---|
| | - We've thus built the models using estimates for certain critical |
---|
| | datapoints. |
---|
| | |
---|
| | - RandomCo can update these models at will to make use of |
---|
| | "real" data and thereby get better model output. |
---|
| | |
---|
| | |
---|
| | Key Findings |
---|
| | ------------ |
---|
| | |
---|
| | - The RandomCo IT culture is end-user/project focused to a fault. |
---|
| | Infrastructure is (mostly) enhanced incrementally and is not treated |
---|
| | as a common asset designed to serve the entire breadth of |
---|
| | applications. |
---|
| | |
---|
| | - This has led to an overprovisioning of some classes of servers |
---|
| | (Windows) and a greater variety of system types (Unix) than is |
---|
| | strictly necessary. |
---|
| | |
---|
| | - Operational disciplines such as asset inventory control, measurement, |
---|
| | and reporting vary greatly in depth and quality. This makes it hard |
---|
| | to manage what is not consistently measured. |
---|
| | |
---|
| | - Business, Architecture, Development, Infrastructure, and |
---|
| | Operations are not bound together with a common overarching view of |
---|
| | IT at RandomCo. There is a tendency for each of these to operate |
---|
| | as a silo. The Architecture team tends to have the broadest |
---|
| | view of these issues. |
---|
| | |
---|
| | - Customizations to key business subsystems such as SAP and Manugistics |
---|
| | are creating a high degree of operational complexity that may |
---|
| | not be justified. |
---|
| | |
---|
| | - The strict commitment to a Windows/.NET-only development environment |
---|
| | is unnecessarily constraining the organization's agility, time-to-market, |
---|
| | and ability to control costs. |
---|
| | - RandomCo is in big trouble |
---|
| | - Ed Murphy is a fabulous dancer |
---|
| | |
---|
| | |
---|
| | Core Themes |
---|
| | ----------- |
---|
| | And In Closing ... |
---|
| | ------------------ |
---|
| | |
---|
| | - Infrastructure provisioning needs to move from a project-centric |
---|
| | model to an Enterprise-wide service model. |
---|
| | .. note:: |
---|
| | |
---|
| | 1) This will maximize reusability of extant infrastructure assets. |
---|
| | I would love to say more here but I have no meaningful content. |
---|
| | |
---|
| | 2) This will enable a systemic perspective for provisioning new |
---|
| | infrastructure with attendant economies of scale. |
---|
| | .. warning:: |
---|
| | |
---|
| | 3) The current complexity, variety, and underutilization of systems |
---|
| | at RandomCo is a direct consequence of making project-based |
---|
| | infrastructure decisions. No amount of migration/consolidation |
---|
| | will make a permanent difference if the underlying root cause |
---|
| | practice that caused the situation in the first place is not |
---|
| | addressed. |
---|
| | If I ever enter politics, I will have lots of use |
---|
| | for meaningless content. |
---|
| | |
---|
| | - Measurement, Monitoring, and Management need to be made more |
---|
| | consistent and reach more widely across the IT operational |
---|
| | environment: |
---|
| | |
---|
| | 1) Basic asset information such as server inventory, software |
---|
| | revision levels, machine age, and so on varies widely by |
---|
| | platform. This makes business case cost calculations for new |
---|
| | initiatives difficult, and in some cases, impossible. |
---|
| | |
---|
| | 2) Today there is wide variation in the depth and quality of |
---|
| | performance and capacity metrics available across all the |
---|
| | datacenter assets. This makes tuning and capacity planning a |
---|
| | vertical, per-server activity (if at all), rather than a |
---|
| | systemic infrastructure concern. |
---|
| | |
---|
| | 3) In short, "You Cannot Manage What You Do Not Measure." |
---|
| | |
---|
| | - RandomCo today is already making good use of virtualization |
---|
| | in the "Big Box" Unix and Mainframe areas. This needs to be |
---|
| | extended to the Windows-class servers as well. |
---|
| | |
---|
| | 1) This will allow more efficient use of existing server capacity. |
---|
| | |
---|
| | 2) This will enable rapid resource (re)provisioning on a per-project |
---|
| | or perhaps even per-event basis. |
---|
| | |
---|
| | 3) This will decouple applications software from underlying |
---|
| | operating environments by testing and certifying the application |
---|
| | to the *virtual OS*, not the physical hardware. This will |
---|
| | materially reduce the retesting burden currently incurred when |
---|
| | hardware is upgraded or changed. |
---|
| | |
---|
| | - RandomCo should begin the necessary steps to reduce the number of |
---|
| | different Unix variants within the IT organization and reduce its |
---|
| | total dependence on Windows as a server platform. Wherever |
---|
| | possible, these should be migrated to SLES Linux across the required |
---|
| | breadth of hardware. RandomCo will benefit in doing so because: |
---|
| | |
---|
| | 1) This creates a common operational platform thereby reducing |
---|
| | training cost and maximally leveraging the employee skill set. |
---|
| | |
---|
| | 2) This makes the organization hardware-agnostic thereby providing |
---|
| | negotiation leverage with the hardware vendors. |
---|
| | |
---|
| | 3) The net software licensing cost should drop significantly: |
---|
| | |
---|
| | a) RandomCo already has an Enterprise License for SLES. |
---|
| | |
---|
| | b) SLES will be bundled with Xen virtualization in future |
---|
| | releases. This should be considerably less expensive than the |
---|
| | separate licensing of Windows and VMware on today's servers. |
---|
| | |
---|
| | 4) The first candidate for elimination is AIX. |
---|
| | |
---|
| | |
---|
| | - RandomCo needs to embrace Linux as a development platform for its own |
---|
| | customized software: |
---|
| | |
---|
| | 1) This will give it many more degrees of freedom in how it designs, |
---|
| | deploys, and operates its own applications. |
---|
| | |
---|
| | 2) This will open the door to cost reduction by replacing |
---|
| | expensive enabling components (like IIS) with free or very |
---|
| | inexpensive open source equivalents (like Apache). |
---|
| | |
---|
| | 3) This will enable "scale" at the *organizational* level. Today, |
---|
| | there is a significant difference in worldview, skillset, and |
---|
| | approach between the Windows developers and the rest of the |
---|
| | RandomCo IT community. By moving to make Linux one of the common |
---|
| | development platforms, RandomCo will open the door to having the |
---|
| | in-house applications it develops run on everything from an |
---|
| | entry-level machine through an Enterprise-class mainframe. This |
---|
| | will be done with a common set of development tools, |
---|
| | technologies, and *people* across the organization, with a far |
---|
| | stronger alignment between Architecture, Development, and |
---|
| | Operations. |
---|
| | |
---|
| | |
---|
| | Specific Technical Recommendations |
---|
| | ---------------------------------- |
---|
| | |
---|
| | - There are a number of areas for improvement that are "Quick Hits". |
---|
| | These are relatively low risk/low complexity and can be acted |
---|
| | upon fairly quickly: |
---|
| | |
---|
| | 1) Audit all printers and replace any that are still using PC |
---|
| | print server hosts with direct network connected printers. |
---|
| | |
---|
| | 2) Migrate the datacenter core LAN fabric from 100BASE-T to |
---|
| | Gigabit Ethernet everywhere. |
---|
| | |
---|
| | 3) Build out the datacenter switch topology to accommodate more |
---|
| | ports, be 1G capable, and accommodate future growth. Get rid |
---|
| | of the daisy-chained switches used today. |
---|
| | |
---|
| | 4) Build an IP-connected NAS in the datacenter and migrate all the |
---|
| | corporate file servers away from locally attached storage to |
---|
| | the NAS to provide consolidated storage, backup, management, & |
---|
| | recovery. (It may be the case that it is easier/more consistent |
---|
| | to actually mount this on the existing SAN and expand the |
---|
| | SAN capacity accordingly.) |
---|
| | |
---|
| | 5) Continue/accelerate the path to virtualizing Dev/Test/QA |
---|
| | images. BUT, place the provisioning of these images into the |
---|
| | hands of the infrastructure organization, not each and every |
---|
| | disparate development project. |
---|
| | |
---|
| | - The QIP DNS infrastructure needs to be audited: |
---|
| | |
---|
| | 1) Revisit the overall Enterprise DNS architecture and make sure |
---|
| | it still makes sense. |
---|
| | |
---|
| | 2) Ensure that the versions of 'bind' and 'dhcpd' deployed in QIP |
---|
| | are new enough to overcome the known security holes of the |
---|
| | older versions of these tools. |
---|
| | |
---|
| | 3) The competitive landscape should be revisited here to see if there |
---|
| | are better/newer/cheaper integrated DNS solutions. |
---|
| | |
---|
| | 4) Examine the possibility of augmenting standard "bare" 'named' and |
---|
| | 'dhcpd' with open source or commercial DNS/DHCP configuration tools. |
---|
| | |
---|
| | - The Windows server farm provides a strong opportunity for *consolidation*: |
---|
| | |
---|
| | 1) Many machines are lightly utilized and thus can be consolidated |
---|
| | via virtualization. |
---|
| | |
---|
| | 2) The data on Windows server utilization is spotty at best. Instead |
---|
| | of attempting to analytically determine which servers to virtualize, |
---|
| | do so *empirically*, as follows: |
---|
| | |
---|
| | a) Select the servers that today represent the least powerful |
---|
| | 20% of Windows servers. |
---|
| | |
---|
| | b) Begin adding servers from that 20% virtually to a target |
---|
| | machine *while monitoring utilization*. When the machine |
---|
| | hosting the virtual server images reaches some threshold |
---|
| | average utilization (we suggest 75%), consider it "full", |
---|
| | and start adding virtual servers to the next physical machine. |
---|
| | |
---|
| | c) Over time, you will discover what a reasonable average |
---|
| | level of utilization is for the machines hosting the virtual |
---|
| | servers and thus how many such virtual images a given class of |
---|
| | hardware can support. (The business case assumes consolidation |
---|
| | ratios of 3:1, 2:1, and 1:1 for small, medium, and large class |
---|
| | servers respectively.) |
---|
| | |
---|
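The empirical fill-until-full procedure above can be sketched roughly as follows. This is only an illustration under stated assumptions: the ``Host`` class, the ``place_guests`` function, and the ``measure`` callback are hypothetical names, and ``measure`` stands in for whatever monitoring tool actually reports average utilization. The 75% threshold is the figure suggested in the text.

```python
from dataclasses import dataclass, field

FULL_THRESHOLD = 75.0  # percent average utilization; the figure suggested in the text


@dataclass
class Host:
    """A physical machine hosting virtual server images (hypothetical model)."""
    name: str
    guests: list = field(default_factory=list)
    utilization: float = 0.0  # last measured average utilization, in percent


def place_guests(guests, measure, threshold=FULL_THRESHOLD):
    """Add guests to hosts one at a time, opening a new host when one is "full".

    guests:  iterable of guest names, least powerful servers first.
    measure: callable(host) -> observed average utilization in percent once the
             newest guest is running. It is a stand-in for real monitoring.
    """
    hosts = []
    current = None
    for guest in guests:
        # Start a new physical machine if there is none yet or the current
        # one has reached the threshold.
        if current is None or current.utilization >= threshold:
            current = Host(name=f"host-{len(hosts) + 1}")
            hosts.append(current)
        current.guests.append(guest)
        current.utilization = measure(current)
    return hosts


if __name__ == "__main__":
    # Fake monitoring for illustration only: each guest adds 20 points of load.
    def fake_measure(host):
        return 20.0 * len(host.guests)

    for host in place_guests([f"win-{i}" for i in range(1, 10)], fake_measure):
        print(host.name, len(host.guests), host.utilization)
```

The number of guests each host ends up carrying is the observed consolidation ratio for that class of hardware, which can then be compared against the ratios assumed in the business case.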
| | - The Unix server farm offers some opportunities for *migration*: |
---|
| | |
---|
| | 1) SAP needs to be migrated to run on Linux instead of AIX. |
---|
| | |
---|
| | 2) The various flavors of Oracle currently in use need to be |
---|
| | migrated to a high-availability Oracle RAC environment, running |
---|
| | on Linux on either the existing Z-Series mainframe or a new |
---|
| | farm of purpose-built Linux servers. |
---|
| | |
---|
| | - The Unix server farm offers some slight opportunity for *consolidation*: |
---|
| | |
---|
| | <Niel needs to fill this in as regards to the non-SAP AIX servers |
---|
| | and their consolidation.> |
---|
| | |
---|
| | - There is a meaningful opportunity for Linux/Open Source in the Retail |
---|
| | Store environment: |
---|
| | |
---|
| | <Niel/Tom need to fill this in here> |
---|
| | |
---|
| | - A detailed analysis/audit of the core FLEX pricing algorithms needs |
---|
| | to be undertaken to determine whether more hardware or better |
---|
| | algorithms (or both) can be brought to bear: |
---|
| | |
---|
| | 1) Need to determine the nature of the computational constraint. |
---|
| | |
---|
| | 2) We suspect the problem being solved is "NP-Complete". If so, |
---|
| | there needs to be an investigation of improving/introducing |
---|
| | bounding heuristics to improve computation speed. |
---|
| | |
---|
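To make "bounding heuristics" concrete, here is a minimal branch-and-bound sketch for the classic 0/1 knapsack problem. It is not the FLEX pricing algorithm, whose details are not known here; knapsack is used only as a familiar NP-hard stand-in, and the function name is hypothetical. The point is the pruning step: a cheap optimistic bound lets the search discard whole subtrees that cannot beat the best answer found so far.

```python
def knapsack_branch_and_bound(items, capacity):
    """Return the best total value for a 0/1 knapsack.

    items: list of (value, weight) pairs with positive weights.
    """
    # Sort by value density so the greedy fractional bound below is valid.
    items = sorted(items, key=lambda vw: vw[0] / vw[1], reverse=True)
    best = 0

    def bound(index, value, room):
        # Optimistic estimate: fill the remaining room greedily, allowing a
        # fraction of the first item that does not fit.
        for v, w in items[index:]:
            if w <= room:
                value += v
                room -= w
            else:
                return value + v * room / w
        return value

    def search(index, value, room):
        nonlocal best
        if value > best:
            best = value
        if index == len(items) or bound(index, value, room) <= best:
            return  # prune: this subtree cannot beat the best known answer
        v, w = items[index]
        if w <= room:
            search(index + 1, value + v, room - w)  # branch: take the item
        search(index + 1, value, room)  # branch: skip the item

    search(0, 0, capacity)
    return best
```

The tighter the bound, the more of the search tree is pruned; whether a comparably cheap bound exists for the pricing problem is exactly what the proposed audit would need to establish.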
| | |
---|
| | Other Scenarios Considered |
---|
| | -------------------------- |
---|
| | |
---|
| | - Retail Store Server Consolidation |
---|
| | |
---|
| | 1) We examined the possibility of collapsing the 4 servers currently |
---|
| | used in each store into 2 larger servers. |
---|
| | |
---|
| | 2) This scenario is currently a nonstarter because new hardware is |
---|
| | still being rolled out to the stores this year. The cost recovery |
---|
| | thus isn't there to justify a store server consolidation. |
---|
| | |
---|
| | |
---|
| | Major Risks |
---|
| | ----------- |
---|
| | |
---|
| | - The absence of "before" SLA metrics means that any consolidations or |
---|
| | other changes made to the system may get blamed for subsequently |
---|
| | seen "poor" performance. When this happens there is no way to |
---|
| | compare the "after" to the "before" conditions. Senior management |
---|
| | needs to understand this and be prepared to manage through it. |
---|
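One way to reduce this risk is to capture a simple baseline before any change is made. The sketch below assumes response-time samples in milliseconds are available from some existing monitoring source; the function names and the choice of median and 95th percentile as summary metrics are illustrative assumptions, not an SLA definition from this analysis.

```python
import statistics


def summarize(samples_ms):
    """Return the median and 95th-percentile response time for the samples."""
    ordered = sorted(samples_ms)
    # statistics.quantiles with n=20 returns 19 cut points; the last one is
    # the 95th percentile. It requires at least two samples.
    p95 = statistics.quantiles(ordered, n=20)[-1]
    return {"median": statistics.median(ordered), "p95": p95}


def compare(before_ms, after_ms):
    """Return the relative change in each summary metric, as a fraction.

    Assumes the "before" metrics are non-zero.
    """
    before = summarize(before_ms)
    after = summarize(after_ms)
    return {key: (after[key] - before[key]) / before[key] for key in before}
```

With a stored "before" summary, a complaint about "poor" performance after a consolidation can be checked against numbers rather than impressions.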
| | |
---|
| | |
---|
| | |
---|
| | |
---|
| | |
---|
| | |