IMPORTANT: During the initial product launch and support phase, we will likely have to rely on Cloud Compute to fulfill or verify some tasks related to the actual failover.

Overview

This page will eventually contain the full procedures for handling all types of DR Failover Requests.

The most current version of the Managed DRaaS Run Book document is at \\192.168.2.5\Product Development\ITaaS\Managed DRaaS\Managed DRaaS Run Book


Cloud Service DRaaS ZT page

Priority Levels

Priority Level | Event
1              | Disaster Declared, Unscheduled/On Demand Failover
4              | Scheduled Live or Test Failovers, Non-Critical Communication
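
The priority levels above could be captured as a small lookup, for example when scripting ticket creation. This is only a minimal sketch: the enum and function names are hypothetical and are not part of ConnectWise or any existing tooling.

```python
from enum import Enum


class FailoverEvent(Enum):
    """Event types from the Priority Levels table (names are illustrative)."""
    DISASTER_DECLARED = "disaster_declared"
    UNSCHEDULED_ON_DEMAND = "unscheduled_on_demand"
    SCHEDULED_LIVE = "scheduled_live"
    SCHEDULED_TEST = "scheduled_test"
    NON_CRITICAL_COMMUNICATION = "non_critical_communication"


# Priority values are taken directly from the table: 1 or 4.
PRIORITY_BY_EVENT = {
    FailoverEvent.DISASTER_DECLARED: 1,
    FailoverEvent.UNSCHEDULED_ON_DEMAND: 1,
    FailoverEvent.SCHEDULED_LIVE: 4,
    FailoverEvent.SCHEDULED_TEST: 4,
    FailoverEvent.NON_CRITICAL_COMMUNICATION: 4,
}


def ticket_priority(event: FailoverEvent) -> int:
    """Return the priority level for a given event type."""
    return PRIORITY_BY_EVENT[event]
```

Keeping the mapping in one place means a change to the table only needs to be made once.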

Test Failover Procedures

Scheduled Live and on-demand Failover Procedures

Open the Managed DRaaS Run Book for the customer and have it ready at all times.

  1. Start a ticket in ConnectWise
    1. Log as a P1 ticket
    2. Log the details of the live or on-demand failover situation. If there is an actual disaster, log a brief general statement or two about the scenario (weather-related, fire-related, equipment-related, etc.) and the estimated time to original site recovery.
    3. Save the ticket # and provide it to the caller
  2. Authenticate the contact
    1. Notify the contact that you must authenticate them by calling the phone # designated in the company's Managed DRaaS Run Book document
    2. End the call, then call the user back at the phone # listed in the Run Book. You can further authenticate by asking the user for the ticket # that you provided in Step 1.
  3. Take action or plan the failover
    1. Disaster declared/on-demand failover request: Ticket is a P1 and action needs to be taken within 1 hour. Explain to the client that you will begin the process of failing over their environment within the hour.
    2. Live/planned failover request: Schedule the failover with the client and move on to the next step.
  4. Notify stakeholders
    Disaster declared/on-demand failover
    1. Help Desk: If you receive this request during normal business hours, email the following information to the MSP Team Austin distribution group and CC your manager:
      1. The client
      2. The ticket #
      3. The details of the request (is this a live/planned failover or a disaster-declared/on-demand failover request?)
    2. On-call: Proceed with the above steps. Additionally, notify your on-call contacts: your secondary and the on-call system administrator and/or on-call system engineer.
    Scheduled live failover
    1. Generally speaking, notify anyone who will be working during the scheduled time frame. If the work is to be performed after hours, hand the ticket to the Primary on-call technician.
  5. Begin the failover process: consult the DRaaS-ZT - Live Failover procedure and the Managed DRaaS Run Book
  6. Address any network configuration or DNS-related issues. Cloud Compute must assist with this process.
    1. If necessary, test with a workstation VM to ensure business continuity
    2. Verify printers, mapped drives, etc.
    3. Get a peer review to verify conclusions/business continuity
  7. Verify all servers, VMs, and the network
  8. Notify the customer CTA of the results of the failover (see: How to find a client CTA)
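
The stakeholder notification in Step 4 and the 1-hour action window in Step 3 could be sketched as follows. This is an illustration only: the `FailoverRequest` class, its field names, and both functions are hypothetical, and the sketch builds the message text without sending any email or touching ConnectWise.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class FailoverRequest:
    """Hypothetical record of a failover request (fields are assumptions)."""
    client: str
    ticket_number: str
    is_disaster_or_on_demand: bool  # False means a scheduled live failover
    details: str
    received_at: datetime


def action_deadline(req: FailoverRequest) -> Optional[datetime]:
    """Disaster-declared/on-demand requests need action within 1 hour."""
    if req.is_disaster_or_on_demand:
        return req.received_at + timedelta(hours=1)
    return None  # Scheduled failovers are arranged with the client instead.


def build_notification(req: FailoverRequest) -> str:
    """Build the message body: client, ticket #, and request details."""
    request_type = (
        "Disaster declared/on-demand failover"
        if req.is_disaster_or_on_demand
        else "Scheduled live failover"
    )
    lines = [
        f"Client: {req.client}",
        f"Ticket #: {req.ticket_number}",
        f"Request type: {request_type}",
        f"Details: {req.details}",
    ]
    deadline = action_deadline(req)
    if deadline is not None:
        lines.append(f"Action required by: {deadline:%Y-%m-%d %H:%M}")
    return "\n".join(lines)
```

The resulting text would still be sent by hand to the MSP Team Austin distribution group, with your manager on CC.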



Jordan Bean 2:06 PM

One internal note for Managed DRaaS: if there is an XTIUM outage, we need to prioritize Managed DRaaS customers for restoration, since they are actually paying for the service. We will want to add that rule to the general product support area we create.
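
As a rough sketch of that rule, a restoration queue could place Managed DRaaS customers first while leaving the rest of the order unchanged. The `Customer` class and `restoration_order` function are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Customer:
    """Hypothetical customer record (fields are assumptions)."""
    name: str
    has_managed_draas: bool


def restoration_order(customers: List[Customer]) -> List[Customer]:
    """Put Managed DRaaS customers first; keep the original order otherwise.

    sorted() is stable, so customers with the same flag keep their
    relative positions from the input list.
    """
    return sorted(customers, key=lambda c: not c.has_managed_draas)
```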