Friday, November 19, 2010

Capacity Review - IT Assets planning

IT asset management - One of the core principles of Information Technology best practice is "Confidentiality, Availability, Integrity" (OK, that's three, but they are grouped as one). Capacity review serves the second of these, availability. As business fluctuates and product and transaction cycles keep speeding up, it is in IT management's interest to perform a capacity review periodically. Because of the cost and complexity involved, an annual review is sufficient in most cases.

What exactly is a capacity review anyway? In its simplest form, it ensures that IT assets - hardware, software, storage and connecting links (networks, telecommunications, access to and from the internet, the cloud etc.) - are sufficient to meet peak demand as defined by company policy or best practice. If no such policy exists at your organization, review the practices of similar firms in your industry, define one and seek approval for it.

As an example, one brokerage firm that I worked at had a policy requiring the main brokerage applications to be capable of handling twice the load of the busiest day of the year. So if the servers ran at 45% of capacity for an extended period on the busiest day, doubling that load would take them to 90%, which still fits within the available capacity and so satisfies management's requirement.
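The arithmetic behind that policy is simple enough to sketch in a few lines of Python. This is only an illustration, not the firm's actual tooling: the function names are made up here, and it assumes that server load scales linearly with demand, which real systems rarely do exactly.

```python
def projected_utilization(peak_utilization, demand_multiplier=2.0):
    """Utilization expected if peak demand grows by demand_multiplier.

    Assumption: load scales linearly with demand (a simplification).
    Utilization is a fraction, so 0.45 means 45% of capacity.
    """
    if peak_utilization < 0 or demand_multiplier <= 0:
        raise ValueError("utilization must be >= 0 and multiplier > 0")
    return peak_utilization * demand_multiplier


def meets_policy(peak_utilization, demand_multiplier=2.0, ceiling=1.0):
    """True if the projected utilization stays at or below the ceiling."""
    return projected_utilization(peak_utilization, demand_multiplier) <= ceiling


if __name__ == "__main__":
    # Brokerage example from the text: 45% on the busiest day, 2x policy.
    print(meets_policy(0.45))  # 45% doubled is 90%, within capacity
    print(meets_policy(0.55))  # 55% doubled is 110%, over capacity
```

A stricter shop might set `ceiling` below 1.0 so that even the doubled load leaves some breathing room.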

One of my roles, among the many hats that I wore, was to do a capacity review of all the distributed applications at the end of the year. To prepare for this, I first went to our metrics site, where I sifted through reams of data on all the web servers, application servers, database servers etc. Then I organized the data into spreadsheets, where I sorted through page views, server resource loads etc. From that data, using algorithms developed in-house and looking at the back-end (mainframe, database) data and connectivity analysis, I built a map of how much capacity was used on the distributed side and on the in-bound and out-bound feeds. That told us whether the distributed servers could handle the capacity required for an unusually busy day (think of a very volatile day in the markets - a major business collapse, terrorist attacks etc.). This is only an example from one specific industry, but such capacity reviews are de rigueur in every industry - telecommunications, transportation and retail are some that come to mind.
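The in-house algorithms are not reproduced here, but one common building block is easy to show: separating a load that is sustained "for an extended period" from a momentary spike. The sketch below is a generic rolling average, not the method used at the firm; `sustained_peak` and the sample figures are made up for illustration.

```python
def sustained_peak(samples, window):
    """Highest average over any run of `window` consecutive samples.

    `samples` is a list of utilization readings taken at a fixed interval.
    A single spike moves this figure far less than it moves max(samples).
    """
    if window <= 0 or window > len(samples):
        raise ValueError("window must be between 1 and len(samples)")
    running = sum(samples[:window])
    best = running
    for i in range(window, len(samples)):
        # Slide the window one sample to the right.
        running += samples[i] - samples[i - window]
        best = max(best, running)
    return best / window


if __name__ == "__main__":
    # Hypothetical CPU readings (percent) at fixed intervals.
    readings = [10, 50, 40, 45, 20]
    print(sustained_peak(readings, window=2))
```

With the hypothetical readings above, the busiest pair of consecutive samples is 50 and 40, so the sustained peak over a two-sample window is 45, while the single highest reading is 50.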

On the software side, analysts and reviewers can simulate many things - online transaction processing simulation, for example, is common. But it might be more useful to ask about the connectivity - sure, the retail front-end web site can take 10,000 orders a second, but can the back-end handle it? Can the connections to the credit checks work successfully and simultaneously at that level? How about order fulfillment - do the fulfillment centers/warehouses have the capacity to handle huge backlogs? If not, or if nobody is sure, how long would it take to clear them and, more importantly, can orders be tracked adequately in the meantime? Can the supply chain handle it - can it be tested? Has it been tested? If these processes are outsourced, does the vendor make any explicit guarantees in the contractual agreements? How often do they test, and how willing and able would they be to run a simulated test? (Note: this is different from a disaster recovery test, which normally only simulates average loads at a backup site or offers an alternate way to do what you already do.)
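Two of those questions reduce to simple arithmetic: the chain can only move as fast as its slowest stage, and a backlog only shrinks if orders are processed faster than they arrive. Here is a rough sketch of both. Apart from the 10,000 orders a second for the front end, every figure and name is hypothetical.

```python
def find_bottleneck(stage_capacity):
    """Return (stage name, capacity) for the slowest stage in the chain.

    `stage_capacity` maps each stage to the orders per second it can handle.
    End-to-end throughput cannot exceed this figure.
    """
    if not stage_capacity:
        raise ValueError("need at least one stage")
    name = min(stage_capacity, key=stage_capacity.get)
    return name, stage_capacity[name]


def backlog_clear_hours(backlog, service_rate, arrival_rate):
    """Hours needed to work off `backlog` orders.

    Rates are in orders per hour. The backlog only shrinks by the
    difference between what is processed and what keeps arriving.
    """
    spare = service_rate - arrival_rate
    if spare <= 0:
        raise ValueError("backlog never clears: arrivals match or exceed capacity")
    return backlog / spare


if __name__ == "__main__":
    stages = {
        "web_front_end": 10000,     # from the text
        "credit_check": 2500,       # hypothetical
        "order_fulfillment": 4000,  # hypothetical
    }
    print(find_bottleneck(stages))
    print(backlog_clear_hours(backlog=6000, service_rate=500, arrival_rate=300))
```

In this made-up case the credit check link caps the whole chain at 2,500 orders a second, no matter what the front end can accept.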

Capacity reviews are also useful for feeding forecasts into future budgets and for justifying spending. Additionally, as technology moves forward at an ever faster pace, these reviews show which old IT assets can be updated, upgraded or replaced ("These 25 servers operating at 90% capacity can be replaced with 5 new ones operating at an average of 50% - with a payback period of 1.5 years"). This also shows management that you and your team have done your homework and can back it up with facts.
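The numbers in a statement like that can be checked with two small calculations. The sketch below uses the server counts and utilization figures quoted above. The cost and savings amounts are invented purely to show how a 1.5 year payback could come about, and the payback formula is the simple undiscounted one.

```python
def required_capacity_ratio(old_count, old_utilization, new_count, new_utilization):
    """How many old servers' worth of capacity each new server must provide.

    Total work is measured in "old server units" (count x utilization)
    and is assumed to move over to the new servers unchanged.
    """
    if new_count <= 0 or new_utilization <= 0:
        raise ValueError("new_count and new_utilization must be positive")
    work = old_count * old_utilization
    return work / (new_count * new_utilization)


def payback_years(upfront_cost, annual_savings):
    """Simple (undiscounted) payback period in years."""
    if annual_savings <= 0:
        raise ValueError("annual_savings must be positive")
    return upfront_cost / annual_savings


if __name__ == "__main__":
    # 25 old servers at 90% replaced by 5 new ones at 50%, as in the text.
    print(required_capacity_ratio(25, 0.90, 5, 0.50))
    # Hypothetical money figures chosen to give a 1.5 year payback.
    print(payback_years(upfront_cost=150_000, annual_savings=100_000))
```

For the consolidation to work as quoted, each new server has to do the work of roughly nine of the old ones, which is worth confirming against vendor benchmarks before the proposal goes to management.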

Another way such reviews can help is in finding underutilized resources, which comes in handy at a time of budgetary pressure. For example, for a new application project with a limited budget, I was able to point to underutilized servers (which, of course, I knew about from my capacity planning exercise) that could host the new app. By sharing resources (servers, existing software licenses on those servers, storage and network chargebacks, and backup site servers), costs were kept down.
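Picking such servers out of the review data can be as simple as a filter. This is a minimal sketch with made-up server names and loads. It assumes that loads simply add up when applications share a server, and it uses a 50% ceiling so that the combined load could still double on a busy day, in the spirit of the brokerage policy described earlier.

```python
def candidate_hosts(peak_by_server, new_app_load, ceiling=0.5):
    """Servers that can absorb `new_app_load` and stay at or below `ceiling`.

    `peak_by_server` maps server name to its peak utilization (a fraction).
    Assumption: loads are additive when applications share a server.
    Returns server names, least loaded first.
    """
    fits = [
        (name, peak)
        for name, peak in peak_by_server.items()
        if peak + new_app_load <= ceiling
    ]
    return [name for name, _ in sorted(fits, key=lambda item: item[1])]


if __name__ == "__main__":
    # Hypothetical peak utilization taken from a capacity review.
    peaks = {"app01": 0.45, "app02": 0.10, "app03": 0.25}
    print(candidate_hosts(peaks, new_app_load=0.20))
```

Here `app01` is ruled out because adding the new application would push it past the ceiling, while `app02` and `app03` both have room.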

Capacity Review is a tool that business analysts and business / IT / process / operations managers can use to plan, streamline and optimize their assets and thereby provide more value to the business.
