Monday, October 30, 2006

Ignore Pre-Automation Process Optimization at Your Own Peril!

I just came off a Pilot implementation of my company’s Data Palette automation product at a Fortune 100 company. The Pilot started off with optimistic jubilance at being given an opportunity to implement our technology in such a large and reputable organization, but soon that exhilaration gave way to moroseness and even despair. Finally, by the time the Pilot was nearing completion, the prevailing mood changed back to a sense of positive achievement as the effort successfully wound up in spite of certain deficiencies. This roller-coaster of emotions could be attributed to the missed opportunities and changing goals evidenced during the Pilot.

When we started out, the primary goal of the Pilot was to prove that Data Palette could be made to work in the client environment. (Because the environment is very large and complex, with several moving parts and heavy human intervention, automation was not considered easily doable.) I helped carry out a detailed discovery there and felt we could take on the challenge. However, I did so intending to optimize parts of their processes along the way, to minimize the areas where human involvement existed but was really unnecessary. The client CIO agreed that optimizing their IT processes was a good thing prior to standardizing and automating them. However, once the Pilot started and we laid out the standard operating procedures, the client team operating at the ground level was resistant to any changes in their day-to-day processes. They felt human DBAs needed to be in the driver’s seat and any decision making needed to be kept manual, even the mundane decisions that made them get up in the middle of the night.

That’s when the demons of doubt started plaguing me. Could we still automate the processes the way they were and attain significant productivity gains? Sometimes the manual actions and logistics interspersed throughout a task take up more time than the task itself. If these time-consuming manual interchanges cannot be avoided, any automation of the task itself can seem insignificant. (I’m sure you have often heard IT personnel say "well, the actual task only takes about 20 minutes, but the logistics around it cause it to take up 2 hours, so I’m not sure this task can be automated or whether it would be valuable to have it automated!")

I really wasn’t worried about Data Palette’s ability to accommodate manual decision making and control, and the end result proved this out. However, the extent of benefit to the client environment was bothering me. In certain areas, I wasn’t content to let them operate the way they were. Standardization and automation bring the right amount of value only when they are preceded by process optimization - that optimization is both a prerequisite and a side benefit of the exercise. I was concerned that, due to the ground-level staff’s rigidity, value wasn’t being realized to its full potential.

Let me explain my concerns via a simple IT call center operation. When a call or alert comes in, it’s typically handled by a Help Desk or Tier 1 group. They document the problem in a trouble ticket, look at the nature of the problem and try to resolve it if possible. If they can’t, they assign it to the appropriate Tier 2 group and move on to the next call. Depending on the urgency of the problem, a Tier 2 person may be paged and assigned to the issue so it can be worked on. Now what happens if the Tier 1 person handling the issue is unsure about “when” to pass it on to the right Tier 2 group? Let’s say this person is obsessed with solving that problem and is unwilling to pass it on quickly. She takes hours trying to figure out what is causing it and it becomes a matter of personal pride. That’s very nice of the individual, but this kind of manual decision making hurts the caller and the company. The caller has to wait a lot longer to get a resolution. If there were a business rule stating that a Tier 1 person could spend no more than 5 minutes trying to identify the problem and after that had to pass it on to a different group, that would help matters. That would allow the right (more senior) person to evaluate the problem and implement the solution. And that would free up the Tier 1 person to take additional calls, and the operation would run much more smoothly. The process would dictate when and how each call would be escalated rather than placing the onus on the Tier 1 individual to decide how to deal with a call.
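To make that concrete, here is a minimal sketch in Python of what such a time-boxed escalation rule might look like when expressed as code rather than left to individual judgment. The ticket fields, queue names and paging hook are all hypothetical:

```python
from datetime import datetime, timedelta

# Hypothetical rule: Tier 1 may work an issue for at most 5 minutes
# before it must be escalated to the appropriate Tier 2 queue.
TIER1_TIME_BUDGET = timedelta(minutes=5)

# Hypothetical mapping from problem category to the Tier 2 group that owns it.
TIER2_QUEUES = {
    "database": "dba-oncall",
    "network": "netops-oncall",
    "storage": "storage-oncall",
}

def page_oncall(queue, ticket):
    # Placeholder: a real system would call its paging/alerting API here.
    print(f"Paging {queue} for ticket {ticket['id']}...")

def check_escalation(ticket):
    """Escalate the ticket if Tier 1 has exceeded its time budget."""
    elapsed = datetime.now() - ticket["tier1_start_time"]
    if ticket["tier"] == 1 and elapsed > TIER1_TIME_BUDGET:
        queue = TIER2_QUEUES.get(ticket["category"], "tier2-general")
        ticket["tier"] = 2
        ticket["assigned_queue"] = queue
        page_oncall(queue, ticket)

# Example: a database ticket opened 7 minutes ago gets escalated.
ticket = {"id": 1042, "tier": 1, "category": "database",
          "tier1_start_time": datetime.now() - timedelta(minutes=7)}
check_escalation(ticket)
```

The code itself is trivial; the point is that the escalation decision now belongs to the process, not to whoever happens to be holding the ticket.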

Similarly, if the Tier 1 person had to go to a manager’s office and ask for her permission to assign the problem to a Tier 2 group, that would introduce an even higher level of inefficiency. Or if, during assignment of the ticket, the Tier 1 person had to call the Tier 2 person and, unable to reach him, had to walk up to this person’s cube (say, in a different part of the office) to try and catch him, or wait for him to get off the phone, again the process would be grossly inefficient. These are all areas that involve unnecessary human involvement and delays, and they pose a barrier to automation. With the right level of process optimization and decision automation, these inefficient areas can be completely eliminated.

As you peruse the above examples, some of you may feel, "This is a no-brainer. Why would anyone tolerate such inefficiencies in their day-to-day IT processes?" Well, pause for a minute now. Take a deep breath and look hard inside your own environment. The inefficiencies may have manifested in a different way. Many managers and ground-level personnel feel, “We know what’s best for our business and there’s a reason why we have developed certain processes to accommodate our unique needs. Any automation efforts need to take those into consideration.” It’s easy to be rigid about these things. And some areas really do deserve such rigidity. But many areas really don’t. When you have external specialists working with your internal experts, it’s a unique opportunity to re-evaluate the basis for operating in a certain way and adjust those processes to yield maximum productivity and faster task completion prior to automation. The results are usually in black and white. If the productivity gain is demonstrable and not just a spreadsheet exercise, the change should be embraced.

In this particular Pilot, we were able to eventually convince the staff that it was in their best interest to accept the process improvements in certain areas. That allowed the feeling of jubilation to return all around. However, it may last only until the next Pilot, since this arm wrestling over process optimization is somewhat of a recurring pattern...

Ignore process optimization at your own peril: it could very well be the best part of the automation effort. Such optimization often means the difference between a hefty 30%-plus efficiency gain and a mere 5-10% improvement.

Tuesday, October 24, 2006

Are IT Automation Initiatives Here to Stay?

A few weeks ago, I was one of the lucky few (well, relatively speaking!) speakers at the Dow Jones DataCenter Ventures 2006 conference in San Jose (http://datacenterventures.dowjones.com/Default.aspx?pageid=111). The event reviewed emerging technologies in the data center space and provided innovative companies a platform to present their solutions, especially around virtualization, automation and ways to block the nasties - viruses, worms, spam and the like - that cause losses in productivity and revenue.

Ben Horowitz, CEO of Opsware, one of the bigger success stories at the conference, was a keynote speaker and shared how his team was able to get their company on the map - the IT automation map, that is. Several venture capitalists, industry analysts, trade journalists and large-company technology scouts were present. But what made the trip worthwhile was the presence of many technology leaders, including CIOs, CTOs and other real-world end-users with assorted titles.

Just about 18 months ago, a similar presentation from me at a CIO round-table drew barely 6 attendees (about 18 were expected). It just seemed like CIOs had more pressing stuff to attend to than being at a boring “ping, power and pipe” event learning about ways to keep the crown jewels of the business accessible and secure. But now, things seem to have taken a turn for the better. Optimal data center management is beginning to be seen as sexy and glamorous, almost like business intelligence was five years ago. So much so that popular publications like eWeek are increasingly running articles on this topic (see “Five Biggest Data Center Concerns” at: http://www.eweek.com/article2/0,1895,2034035,00.asp?kc=EWEWKEMLP102306STR2).

So what triggered this behavioral change? Data center management has always been one of the more costly items in IT due to hosting space, power and the cost of human capital (not necessarily in that order). So why change perceptions now? Mind you, I’m not complaining. This obviously bodes well for the IT industry in general and the database industry in particular - since databases are ubiquitous in any data center. However, I’m fairly curious about evolutions in the IT landscape and their short-term and long-term impact on business. Is the current IT optimization movement a mere blip, or is it here to stay?

My personal opinion (and hope) is that this is for keeps. CIOs are beginning to realize just how much they are truly spending on IT administration. Worse still, they are tying up their smartest resources with mundane problems rather than the biggest opportunities. The popular press and word of mouth are causing them to take a hard look at emerging alternatives rather than treating the toll merely as a cost of doing business. If innovative technology can bring about even a 15% efficiency improvement (half of what many of these vendors promise) in these more expensive administrative areas, it would tremendously impact the bottom line, free up IT budgets and ripple through to positively affect the priority of relevant IT projects - something users and shareholders would greatly appreciate.

Besides my optimism, another indicator that this renewed focus on IT optimization is here to stay is the advancement in the several sub-technologies that make it possible. Key areas such as agent architecture, push/pull models, network and server security, autonomic computing, server virtualization and grid management, decision automation and expert systems have matured significantly over the last 10 years, allowing vendors to apply them to specific niches - culminating in the beneficial situation prevailing in the industry today.

So what does all this mean for the business besides just reducing costs? Well, such optimization results in lower downtime, higher scalability and more predictable performance. But more than anything, it affords businesses the opportunity to align their smartest people with the business’s biggest challenges and best opportunities, rather than have them merely be the best command-line junkies. And that is the biggest win-win for the business and all stakeholders, including the employee.

Saturday, October 14, 2006

Baby steps, automation and OMM

I often hear this from my client IT managers, even battle-hardened ones - “We think automation and OMM are good things for us, but we don’t know how to really get started. We manage so many databases and applications here that it’s tough to standardize anything. We have no say in the matter. Our DBAs are busy working their tails off; it’s unrealistic to tell them to drop what they are doing and work on automating stuff. It’s just not going to happen!”

I know there are many environments out there that are probably struggling with the same issues. The desire is there to get to higher levels of operational maturity using broad-based frameworks such as ITIL and OMM, but the challenges are steep. (BTW, if you are unfamiliar with OMM, or the Operations Maturity Model, my company has a white paper on its website on the topic: http://www.stratavia.com/downloads/9_adaptiveimplementation.pdf.)

For many companies, there is a myriad of 3rd-party ERP, CRM and custom home-grown applications. There is often more than one mainstream database platform in use - Oracle, SQL Server and/or DB2. The developers are playing with open-source DBMS platforms, and there’s a good chance your DBAs will need to start supporting them in the near future, if they aren’t already! There are hundreds, possibly thousands, of tickets being worked on by the DBA team every month. Pretty soon, there will be new projects that need to go live. You will need to hire one or more additional DBAs to support those projects. Hopefully your CIO will give the green light to add that head-count. And then, there’s talk that your company might acquire that pain-in-the-derriere competitor of yours… Your top DBAs carry a ton of tribal knowledge inside their heads and don’t always have the time to document things or train others. They are all running at 100 miles an hour. In spite of this, the business users don’t seem exactly pleased; heck - truth be told, they aren’t even remotely sympathetic. Your team is already working so hard; what else can they do? In such “saturated manpower” situations, is any improvement possible? Isn’t standardization merely a nice-to-have, a luxury that one just doesn’t have time for? Is automation a realistic goal to chase, given the team’s lack of bandwidth?

My question to those managers usually is: is this situation sustainable? How long can you go on like this? And how do you avoid getting smoked by competitors that have already figured this out? If IT is not providing the competitive advantage your business needs, the writing is on the wall. Regardless of the fact that your environment is bursting at the seams, you need to act now ‘cause there’s never a good time to do this. You have to take some tangible steps to inject a sense of order into your environment and get things under control.

So how do you go about doing that? This is where baby steps come in.

The first step one can take is to simplify the environment. Don’t even think about relatively lofty initiatives such as automation yet. Focus whole-heartedly on identifying where complexity abounds in your environment and attempt to strike it down. For instance, if your DBA group uses a variety of tools, put together a “tools/needs analysis” spreadsheet that lists each tool, who uses it and why. (Send me an email if you would like me to share some of the spreadsheet templates I use in this regard.) Look for tools that are not actively in use, and those that are redundant. Similarly, make an inventory of what scripts are in use. Look for inefficiency indicators – such as, different DBAs using different scripts to accomplish the same thing.
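Even a throwaway script can help with that inventory. Here is a rough sketch in Python (the scripts directory is a placeholder) that groups files by content hash, so byte-identical copies of the same script squirreled away under different names show up immediately:

```python
import hashlib
import os
from collections import defaultdict

SCRIPTS_DIR = "/dba/scripts"  # placeholder: wherever your team keeps its scripts

def inventory_scripts(root):
    """Group every file under root by a hash of its contents."""
    by_hash = defaultdict(list)
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as f:
                digest = hashlib.md5(f.read()).hexdigest()
            by_hash[digest].append(path)
    return by_hash

# Any hash with more than one path is a candidate for consolidation.
for digest, paths in inventory_scripts(SCRIPTS_DIR).items():
    if len(paths) > 1:
        print("Possible duplicates:", ", ".join(paths))
```

This only catches exact copies; near-duplicates (the more common case) still need a human eye, but the report tells you where to look first.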

Here I’m assuming that your environment already has decent processes for change control, release management and configuration management. If not, implement simple processes that help establish a level of stability and predictability in your environment without injecting additional layers of complexity. If your existing processes are too complex, try to get the relevant stakeholders to sufficiently dumb down the processes. To determine whether a process is “simple” or not, try describing it on a single sheet of 8½ x 11 paper. If you can’t, chances are the process is too convoluted.

Once you feel your environment is as simple as can be, it’s time to embark on standardization. Compile a list of the top 5 most common (repetitive) tasks per platform. Make note of how different DBAs in your team carry out these tasks. There are different ways to accomplish this. The simplest way would be for the DBA to jot down notes on each step she is taking the next time she is asked to perform that task. Now remember, it is important for her to do this WHILE she is performing the task rather than jotting down the steps from memory. The latter scenario leaves room for errors and omitted steps. If the DBAs are too busy or plain unwilling to do this and you are unwilling to MAKE THEM do it, you may need to watch over their shoulders and take notes while they do it. (In most cases, merely threatening to watch over their shoulders will make them do it just to get you off their backs!)

Also, it’s been my experience that if you reassure them and keep them in the loop about what you are trying to achieve, they will support you - because, after all, their quality of life goes up as well thanks to such initiatives.

Your goal should be to compile a spreadsheet listing each DBA and his/her task recipe for each of the 5 tasks. Look for areas of commonality or divergence. If multiple DBAs are carrying out the same task in different ways, determine the best method based on the success criteria most relevant to you and your organization - indicators such as fewer errors (“first time right”), faster completion time (“on-time delivery”), etc. Once the most efficient method per task is nailed down via a formal SOP document (email me if you would like me to share a good SOP documentation template), coach all your DBAs to use that method each time they perform that specific task. Regardless of which shift they work or where they are physically located, they need to follow that SOP.
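If you would rather not eyeball the spreadsheet, even a toy comparison like the one below can highlight divergence quickly. The DBA names and recipe steps are made up purely for illustration:

```python
# Each DBA's recipe for the same task, as an ordered list of steps.
recipes = {
    "alice": ["verify backup", "shutdown immediate", "apply patch", "startup", "run smoke test"],
    "bob":   ["shutdown abort", "apply patch", "startup"],
    "carol": ["verify backup", "shutdown immediate", "apply patch", "startup"],
}

# Flag any step that not everyone performs - each flag is a conversation
# to have when writing the SOP.
all_steps = set().union(*recipes.values())
for step in sorted(all_steps):
    users = [dba for dba, steps in recipes.items() if step in steps]
    if len(users) < len(recipes):
        print(f"Divergence: '{step}' performed only by {', '.join(users)}")
```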

Once you have the best recipe for each of the top 5 tasks selected, and everyone in the DBA team has signed off on it and is actively using it, it’s the right time to automate it. This can be achieved using an automation platform such as Data Palette.

BTW, this Simplification -> Standardization -> Automation cycle is not a one-time thing; it needs to be iterative. Once you have successfully done it for the top 5 tasks, look at the next 5. Continue until most mundane tasks in your environment are running on auto-pilot - or, to put it in OMM terms, until you have reached Level 3 or higher.

One more interesting fact: you don’t have to automate a large percentage of your mundane tasks to gain value. Even just the top 5 or 10 tasks will give you great returns on your efforts!

Friday, October 06, 2006

Automation Attempts without Executive Sponsorship = Failure

I’m part of a team that works with numerous companies attempting to inject a higher degree of automation into their IT operations, specifically within database administration (DBA). We are more successful in making this happen in some companies than in others. Even companies in the same industry sometimes have dramatically varied results. I have often wondered why this is the case. After all, DBA work is DBA work and there is a huge amount of commonality - no matter what individuals working within a given company may think.

When I put my finger on the pulse of this issue, it boils down to one core element: executive management involvement, or the lack of it. While we typically go into a company from the top down (at the C-level or VP-level), the Executive often hands us over to middle management, who in some cases then hands us over to the DBA team. We are very successful in implementing automation within companies where the Executive keeps track of where we are traversing within his/her organization and the reactions we are getting from his/her team, even after the hand-off. In cases where the DBAs or their managers are not that automation-friendly and push back saying their environment is too complex to automate, a red flag goes up in the Executive’s head. He/she probes further: Why is this the case? Have we backed ourselves into a corner due to the variety of customizations we have done over the years? What can be done to simplify the environment? Can we still automate the tasks that are the most painful/time-consuming? Without this thought process occurring and without the Executive being engaged, it is difficult to attain positive results.

Does this mean that middle managers and DBAs are not to be trusted with evaluating automation options? Not necessarily… Middle managers and DBAs who are typically tasked with evaluating and validating automation-enabling products are already busy with their day-to-day work. Any free time is taken up by meetings. This makes it hard for even the DBA with the best of intentions to invest time in the process, causing it to die on the vine or be pushed out until no one remembers any longer what the effort was all about. In spite of automation having the capability to significantly enhance their environment and reduce their task burden, the effort never takes off.

There are seven simple things Executives can do to ensure this doesn’t happen:
1. Be up-to-date on virtualization and automation technologies, especially the activity from startups in this space. (Interestingly, almost all innovation comes from startups; I feel industry giants are incapable of innovating - I will elaborate on that in a future blog…).

2. Talk to your peers to find out what they are doing to reduce the pain of managing large numbers of databases. Have they pursued any innovative approaches?

3. Talk to your favorite analysts about which companies they see as emerging stars in this space. (For instance, an analyst I follow closely is Noel Yuhanna from Forrester Research. He provides a rather unique perspective on this space. For example, see http://www.forrester.com/findresearch/results?Ntt=Noel+Yuhanna&Ntk=MainSearch&Ntx=mode+matchallany&N=0).

4. Don’t hesitate to get into the weeds. Meet with your line-level IT managers and DBAs at least once a quarter. Challenge them to think out of the box. Ask them to come up with solutions besides just throwing more bodies at the problem; the latter approach doesn’t scale in the long run. Make them feel comfortable that their jobs are not at stake. Show them by example that their value to the organization will only increase as they invest time in strategic initiatives such as standardization and automation.

5. Help your executive admins be more aware of technology areas that are interesting to you (such as data center management technologies, IT automation tools, etc.). That way, when vendors in this space attempt to reach you, your gatekeepers can properly vet them, see if their offerings fall into any of the “interesting” categories and, if so, bring them to your attention.

6. After the initial due diligence with these vendors, if any of them are brought in for a more detailed evaluation or a proof of concept, keep your ear to the ground on how their efforts are progressing. Have periodic status meetings with your internal people as well as the vendor representatives to understand each party’s perspectives and coach both sides to achieve success.

7. Work with your internal team and help them re-prioritize some of their activities so they have appropriate time and energy to work on these high-value areas with the vendor. Otherwise, if the attention span of the internal staff is very limited during the evaluation, the whole thing becomes an exercise in futility.

These steps may seem overly dewy-eyed, yet it is painful to see so many Executives ignore them and toss any automation-related messaging to their subordinates, expecting them to magically have the time to fit yet another thing into their already packed schedules. It’s a straight equation:

New Technology Initiatives without Adequate Exec Sponsorship = Failure!

With the above approach, Executives will be able to more effectively spearhead new technology initiatives and leverage the ones that really add value to their organizations in the shortest amount of time.

Wednesday, October 04, 2006

Don’t Let Security be a Deterrent to DBA Outsourcing

If I got a dollar each time an IT Manager in a medium-sized company said “we like the offering, but our security policies prevent us from outsourcing our database support…”, I would have hundreds of dollar bills stashed in my pockets! Most recently, I heard this from an experienced DBA Manager at a company in the entertainment industry. That made me wonder - many of the goliaths in the financial services industry, including large credit card processors, insurance providers and banks (with the notable exception of JP Morgan Chase, which actually backsourced its IT services from IBM; see http://www.cio.com/archive/090105/whiplash.html?action=print), rely every day on outsourced IT services. So why is this a problem for a company in the entertainment business?

Maybe it’s that the much larger organizations have the financial and professional clout to invest in the outsourcing relationship (or make the vendor pay for dedicated redundant leased lines, specialized staff, etc.) to get their security concerns addressed, and it’s the small and medium-sized businesses (SMBs) that encounter this problem.

But regardless of company size, I just don’t believe that security should be a deterrent to outsourcing. If good security policies are followed by the company and enforced by the outsourcing partner, the company can avail itself of all the benefits of outsourcing without fear of compromising security. The DBA outsourcing industry has matured to offer numerous advantages, and some of the better-known companies in this space seem committed to addressing their customers’ diverse security requirements.

In my experience, three primary areas come into play to ensure the outsourcing vendor is not going to jeopardize your environment:
· Confidentiality agreements.
· A secure environment for the outsourcer to log in and work from, regardless of where they are working (on-site, from home or from their remote office). Make key security requirements part of the overall service level agreement.
· A tamper-proof multi-level audit trail.

A good vendor will walk you through these in detail and will usually bring the relevant tools and templates with them.

In the case of the confidentiality agreements, look at the confidentiality clauses in your standard employment agreement that an in-house DBA (employee) would sign, and ensure that the vendor agreement is a super-set of that agreement. (After all, every company out there trusts certain employees sufficiently to let them access their mission-critical systems and data.)

In terms of providing a secure environment, larger companies with the financial resources may ask for dedicated leased lines prior to commencing work. However, most security issues can be kept at bay by following good security practices internally, auditing the vendor to ensure they have similar or better policies, and having a good VPN solution with a key fob for any remote access that may be necessary. When it comes to remote access, regardless of your outsourcing philosophy, you need good policies to allow even your internal personnel to work from home, especially during off-hours. Ask your vendor for their security policy manual. If required, hire a third-party security consultant (or use your internal sys/network admin, if you have one and if he/she is well-versed in infrastructural security requirements) to look for any gaps in their policies. (The following site has links to some really useful security-related publications pertaining to industry standards: http://www.csrc.nist.gov/publications/nistpubs.) Ensure the vendor agrees to address any significant gaps prior to commencing work. Once that is done, have an audit performed at the vendor’s site to ensure they indeed implement everything they mention in their policy manual and that all gaps have been dealt with.

If required, you could even segregate DBA work such that vendor personnel cannot access any raw data in the database. Any work that requires access to data can be routed through internal personnel/managers. In certain databases (like Oracle), it is also possible to control this at a granular level by granting privileges to carry out physical DBA tasks (sysoper) without access to everything (sysdba) - specifically, without access to user data within the database.
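As a hypothetical illustration of that separation on Oracle, the sketch below creates a vendor account with sysoper but not sysdba; the account name, password and connection details are placeholders, and it uses the cx_Oracle driver:

```python
import cx_Oracle

# Connect as a privileged administrator to set up the restricted account.
conn = cx_Oracle.connect("sys", "sys_password", "orcl", mode=cx_Oracle.SYSDBA)
cur = conn.cursor()

# SYSOPER permits physical tasks such as STARTUP, SHUTDOWN and
# backup/recovery, but unlike SYSDBA it does not confer blanket
# access to user data in the database.
cur.execute("CREATE USER vendor_dba IDENTIFIED BY change_me")
cur.execute("GRANT SYSOPER TO vendor_dba")

cur.close()
conn.close()
```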

Lastly, a good audit trail needs to be maintained both within and outside the database (at the operating system and network level) so events such as login times, userids, etc. can be checked and correlated when required. Ideally, rather than just waiting for violations to show up and then investigating the root cause, it is advisable to set up events within the audit software to look for violation patterns and take appropriate automated actions, including sending out alerts. For database-level auditing, there are multiple tools that accomplish various things - some do not even require native database auditing to be turned on. Depending on the tool, they sniff SQL statements over the network, latch onto shared memory and/or periodically capture database activity and alert on problem signatures. All of these are effective methods to keep an eye on your environment and ensure the vendor is complying with your policies.
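As a simple example of looking for a violation pattern instead of waiting for one to be noticed, the sketch below (Python with the cx_Oracle driver; connection details are placeholders) assumes Oracle’s native auditing is enabled (AUDIT_TRAIL=DB, with AUDIT SESSION issued) and flags repeated failed logins recorded in the audit trail:

```python
import cx_Oracle

conn = cx_Oracle.connect("auditor", "password", "orcl")  # placeholder credentials
cur = conn.cursor()

# ORA-01017 (returncode 1017) marks a failed login attempt. Flag any
# user/host pair with 3 or more failures in the last 24 hours.
cur.execute("""
    SELECT username, userhost, COUNT(*)
    FROM   dba_audit_trail
    WHERE  returncode = 1017
    AND    timestamp > SYSDATE - 1
    GROUP  BY username, userhost
    HAVING COUNT(*) >= 3
""")
for username, userhost, failures in cur:
    print(f"ALERT: {failures} failed logins for {username} from {userhost}")

cur.close()
conn.close()
```

A real setup would of course feed this into whatever alerting mechanism you already use, and run it on a schedule rather than by hand.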

In addition to these, if your business requires it, you can also make other requests of your outsourcing partner, such as not using offshore resources in your environment, and so on. Experienced vendors will already be familiar with such requests and may have special packages to accommodate them (often at a premium; but that premium may be well worth it if the overall costs are still significantly lower, and the quality better, than doing it yourself - and to be able to sleep at night).