Follow the sun in global operations

Managing infrastructure in a global business with 24x7x365 availability is not just about watching SLA performance.  Continuously improve systems, follow the sun in your support models, and leverage industry frameworks to keep your systems flying.

  • Operations means continuous improvement of end-to-end services: heatmapping the systems and addressing the areas of greatest need.  I use a technique called strategy mapping to find where business value is (1) most visible to the users, so we can define development work that keeps a competitive edge, and (2) reliant on the basic utility of a function, so we can evaluate where we might source a 100% quality service.  I like a “one hand to shake” approach with vendors, where they understand their accountability and are transparent about performance and progress toward perfection.
  • Following the sun with geographically distributed support services relieves your US-based staff of 24x7x365 coverage and, depending on the support model, may also improve service.  Cross-training team members and setting up a buddy system for overnight on-call distributes the work across the team.
  • IT Service Management (ITSM) is not rocket science, and thankfully there are frameworks like ITIL we can implement.  A few key practices for managing challenges in this space are socializing the IT strategy, implementing demand management, publishing a service catalog with operational level agreements (OLAs), defining and testing infosec policies and BC/DR plans, rolling out change management, and finally relentlessly executing on both incident and problem management.  These are the core functions in managing the performance of global business capabilities.
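
To make the follow-the-sun idea concrete, here is a minimal sketch of how an incident might be routed to whichever region is in its working day.  The three regions, their UTC shift boundaries, and the function name are illustrative assumptions on my part, not a prescription; a real rota would also handle holidays, handover overlap, and escalation.

```python
from datetime import datetime, timezone

# Hypothetical shift table: (region, start hour UTC inclusive, end hour UTC exclusive).
# Regions and hours are assumptions chosen for illustration only.
SHIFTS = [
    ("APAC", 0, 8),
    ("EMEA", 8, 16),
    ("AMER", 16, 24),
]


def region_on_duty(now: datetime) -> str:
    """Return the region whose shift covers the given timezone-aware moment."""
    if now.tzinfo is None:
        raise ValueError("pass a timezone-aware datetime")
    hour = now.astimezone(timezone.utc).hour
    for region, start, end in SHIFTS:
        if start <= hour < end:
            return region
    # Unreachable while SHIFTS covers all 24 hours; guards against gaps in the table.
    raise ValueError(f"no shift covers hour {hour} UTC")


if __name__ == "__main__":
    print(region_on_duty(datetime.now(timezone.utc)))
```

With a table like this nobody carries the pager overnight, and the buddy system only has to cover whatever gaps remain in the rota.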

And let’s not forget about good design, architecture and testing…

  • High availability and reliability are architected into the systems with redundant data flow paths and infrastructure, eliminating single points of failure (SPOF) before going into production.
  • To verify that the systems are properly designed, the teams conduct operational acceptance testing for functional performance, penetration testing for security risk evaluation, and variations of business continuity / disaster recovery (BC/DR) testing to account for component failure.
  • Failures happen – so the teams need to fail early, often, small, and gracefully.  Proper planning and testing minimize risks and maximize uptime and performance.
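
As a rough illustration of why redundancy matters, the sketch below applies the standard availability arithmetic: components in series multiply their availabilities, while redundant components in parallel fail only if all of them fail.  It assumes failures are independent, which real systems rarely satisfy perfectly, and the function names and figures are mine, chosen only for the example.

```python
from math import prod


def series(availabilities):
    """Availability of a chain in which every component must be up."""
    return prod(availabilities)


def parallel(availabilities):
    """Availability of redundant components where any one being up is enough.

    Assumes independent failures (an idealization).
    """
    return 1 - prod(1 - a for a in availabilities)


if __name__ == "__main__":
    single_path = 0.99                       # hypothetical availability of one path
    redundant = parallel([single_path] * 2)  # two such paths side by side
    print(f"single path:     {single_path:.4%}")
    print(f"redundant pair:  {redundant:.4%}")
    # A single point of failure in series drags the whole chain down again.
    print(f"pair behind SPOF: {series([redundant, 0.99]):.4%}")
```

Under these assumptions, two 99% paths in parallel come out at roughly 99.99%, but putting that pair behind a single 99% component brings the end-to-end figure back to about 99%, which is why the SPOFs have to go before production.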
