Live Ops - Continuous Operations Admin Guide



Overview


LiveOps features are being introduced in Eyeglass releases to allow customers to move from DR solutions to “Continuous Operations”.   

  1. Continuous Operations lets storage administrators perform online operations during production hours without risk.

  2. Ensures all daily data protection and storage management functions are synced to the replication pair cluster.

  3. Allows a storage admin to fail over some or all data to the paired cluster and operate the business with storage management policies, data, configuration, and snapshot and dedupe settings all in sync.

LiveOPS Continuous Operations Key Features


The LiveOPS Continuous Operations features will introduce a dashboard on the DR Dashboard that provides Continuous Operations Readiness, in the same way DR Readiness is calculated and displayed for clusters under Eyeglass management.

  1. DR Test Mode

  2. Snapshot settings Sync

  3. Dedupe settings sync


What’s New


  1. Release 1.9 introduces support for 1 minute SyncIQ policies to create the 3rd copy and maintain a synced copy.

    1. When DR test mode is enabled, Eyeglass checks for running policies and waits until they complete before setting the schedule to manual for the duration of test mode. When exiting DR test mode, the schedule is re-applied to the 3rd copy policy.

    2. This ensures the policy will not run during the test and cause a SyncIQ error on the policy.

    3. NOTE: If the appliance is restarted while DR test mode is enabled, the schedule is not re-applied automatically when exiting DR test mode.


LiveOPS Continuous Operations Dashboard


Similar to the DR Dashboard, which provides status on DR Readiness and errors for Access Zones and clusters, the LiveOPS icon on the Eyeglass Desktop provides a single pane of glass to see cluster and policy sync status for snapshot and dedupe settings on clusters.


You can also see cluster reachability from the Eyeglass appliance to all managed clusters, along with the cluster release level and API interop version in use.


What do the columns Mean?


  1. Cluster Name lists each cluster managed by Eyeglass.

  2. Cluster Reachability indicates whether Eyeglass can log in to the cluster (tested every minute).

  3. Cluster Version indicates the detected version of OneFS the cluster is running.

  4. Effective Cluster Version shows the API version Eyeglass actually uses. When Eyeglass is in mixed API mode, it uses the API of the lowest cluster version. In the screenshot below, a OneFS 8 cluster is operating at API 7.x, which means only objects or attributes supported in OneFS 7.x will sync to 8.x clusters.

  5. Continuous Ops Status summarizes each policy and the cluster-wide status, indicating whether snapshot and dedupe settings are in sync and audited between the replicating cluster pairs.


Screen Shot 2016-08-22 at 2.56.52 PM.png
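The Effective Cluster Version rule can be pictured as "take the lowest version among the managed clusters". The following is a minimal Python sketch of that idea only. The function name, the cluster names and the dotted version format are illustrative assumptions and are not taken from Eyeglass itself.

```python
def effective_cluster_version(cluster_versions):
    """Return the lowest OneFS version among the managed clusters.

    cluster_versions maps a cluster name to a dotted version string.
    Names and version format are assumptions made for this example.
    """
    def as_numbers(version):
        # Compare numerically so that "10.0" sorts above "8.0".
        return tuple(int(part) for part in version.split("."))

    return min(cluster_versions.values(), key=as_numbers)


# Hypothetical cluster pair: the OneFS 8 cluster is held to the 7.x level.
versions = {"prod-cluster": "7.2.1.0", "dr-cluster": "8.0.0.1"}
print(effective_cluster_version(versions))
```

With the hypothetical pair above, the result is the 7.x version, matching the mixed API mode behaviour described for the column.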

Overview - LiveOPS Snapshot Sync


Replicating pairs of clusters now have a new job type in the Jobs window that scans for snapshot policies at or under SyncIQ policy paths and syncs them to the peer cluster, on the same path or on a different path when the SyncIQ target path differs. The job also detects changes to the schedule or other settings and syncs those changes.


How it Works

  1. Any changes to the snapshot settings will be synced to the peer cluster of the policy including path changes.  

  2. If the policy is failed over, then the target cluster owns the snapshot settings and any changes or new snapshots will be synced back to the source cluster.

  3. Any other options on the settings tab are not synced.
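As an illustration of how a snapshot path on the source cluster relates to the path on the peer cluster, the sketch below translates a path using the SyncIQ policy source and target paths. It is a simplified model written for this guide, not Eyeglass code. The function name and the example paths are assumptions.

```python
from pathlib import PurePosixPath


def map_to_peer_path(snapshot_path, policy_source, policy_target):
    """Translate a snapshot path on the source cluster to the peer cluster.

    Returns None when the snapshot path is not at or under the SyncIQ
    policy source path, in which case it would not be synced.
    """
    try:
        relative = PurePosixPath(snapshot_path).relative_to(PurePosixPath(policy_source))
    except ValueError:
        return None
    return str(PurePosixPath(policy_target) / relative)


# Hypothetical policy: /ifs/data/corpdata replicates to /ifs/data/dr/corpdata.
print(map_to_peer_path("/ifs/data/corpdata/app1",
                       "/ifs/data/corpdata",
                       "/ifs/data/dr/corpdata"))
```

When the policy source and target paths are identical, the translated path is the same as the original path.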


How to Enable

  1. Open Jobs icon

  2. Screen Shot 2016-07-27 at 7.39.21 PM.png

  3. Enable the Snapshot Sync jobs

  4. Select one and run it, or wait until Configuration sync runs again.

  5. The Running Jobs window shows which policies have snapshot updates that are synced to the peer cluster on the same path specified by the SyncIQ policy

  6. Screen Shot 2016-07-27 at 7.40.59 PM.png

  7. Verify by checking the snapshot schedules tab in the OneFS UI.

Overview - LiveOPS Dedupe Sync


Replicating dedupe settings between replicating clusters allows dedupe to process the data on the DR cluster and achieve the same disk space savings, so that post failover the cluster usage matches the source cluster usage.  This ensures normal operations on the target cluster without any time delay for dedupe jobs to reduce the data before normal usage levels are achieved.

How it Works

  1. File system paths and assessment paths that are added to the source cluster AND that match a SyncIQ policy path will sync to the target cluster of the SyncIQ policies managed by Eyeglass.

    1. Screen Shot 2016-07-27 at 9.54.18 PM.png

  2. The dedupe scheduled job is not synced and must be set up or changed on the target cluster manually.

  3. Licenses for dedupe should exist on the target cluster


How to Enable

  1. Dedupe paths are synced automatically with normal configuration replication; no setup is required.

  2. Only dedupe paths that match a SyncIQ policy path, or sit below it in the filesystem, are added to the target cluster.

  3. Dedupe settings for a path are synced once for all policies, not once per policy.

  4. Screen Shot 2016-07-27 at 7.41.09 PM.png

  5. Verify paths are added to the target cluster
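The path rule above (only dedupe paths at or below a SyncIQ policy path are synced, and each path is synced once regardless of how many policies it matches) can be sketched as a simple filter. This is an illustrative model only. The function names and example paths are assumptions.

```python
from pathlib import PurePosixPath


def is_at_or_below(path, base):
    """True when path equals base or sits below it in the filesystem."""
    path, base = PurePosixPath(path), PurePosixPath(base)
    return path == base or base in path.parents


def dedupe_paths_to_sync(dedupe_paths, policy_paths):
    """Keep dedupe paths covered by at least one SyncIQ policy path.

    Each dedupe path appears at most once in the result, even when it
    matches several policy paths.
    """
    return [
        path for path in dedupe_paths
        if any(is_at_or_below(path, base) for base in policy_paths)
    ]


# Hypothetical dedupe settings on the source cluster.
print(dedupe_paths_to_sync(
    ["/ifs/data/userdata", "/ifs/data/userdata/hr", "/ifs/scratch"],
    ["/ifs/data/userdata", "/ifs/data"],
))
```

In this example /ifs/scratch is left out because no policy path covers it, and the two userdata paths are each listed once although both policy paths cover them.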

Overview - LiveOPS DR Test mode


Eyeglass has introduced LiveOps features aimed at zero downtime, with the first feature offering DR testing that allows IT use cases to be executed without incurring downtime or impact to production systems.

  1. Test DR procedures during normal business hours with a full copy of production data. Storage administrators can validate application data integrity post failover and test end-to-end application procedures under various simulated conditions.

  2. Test application upgrades in a sandbox that provides a writeable copy of production data.

  3. Execute planned monthly or quarterly DR tests.

  4. Mirror shares and exports to test access to production data in a sandbox.


The key capability introduced with LiveOps DR Testing is that testing does not impact DR readiness for failover: SyncIQ replication stays fully operational and Eyeglass configuration sync keeps running between production and DR clusters. Production to DR replication and failover abilities are never disrupted during DR Test mode.

Operating  View


Eyeglass Isilon Edition - SyncIQ DR Orchestration Appliance Overview v35 (1).png


The 3rd copy of data is read-only until ready for DR Testing. Data and configuration are kept in sync with production in a sandbox Access Zone, which isolates data and access by using a dedicated IP pool for DR testing.


In Eyeglass it's easy to see the status of the 3rd copy from the DR Dashboard.


Screen Shot 2016-02-27 at 8.55.30 PM.png


Prerequisites

  1. Eyeglass Enterprise Licensing with LiveOPS DR test key

    1. Screen Shot 2016-02-27 at 3.57.02 PM.png

  2. SyncIQ licensed clusters

  3. Free Disk space on the Target DR cluster equal to the size of the test data set

  4. Superna Eyeglass Release 1.5 or higher

  5. DR test policy with a daily sync interval, to select the DR test data and maintain a staggered copy of the DR data

  6. Supported Isilon OneFS Release as per the Superna Eyeglass Feature Release Compatibility matrix found in the Release Notes.

How it’s Configured


NOTE: This solution requires enough disk space on the target cluster to store the DR copy of the data and enough space to contain the “Test data set” which can be all the DR data or a subset based on the policy path created.

If the cluster has a dedupe license key, enabling dedupe on the DR folder and the DR test folder can reclaim nearly 100% of the disk space used by this feature. In the example below, /ifs/data/userdata is the target folder for the production cluster to copy DR data with SyncIQ. This data is copied into /ifs/data/dr-testing, and configuring and running the dedupe policy can reclaim nearly 100% of the disk usage for DR testing.

Screen Shot 2016-06-17 at 7.27.34 AM.png

Screen Shot 2016-06-17 at 7.30.06 AM.png

  1. Production to DR clusters SyncIQ synchronizes data as follows:

    1. Example:  SyncIQ policy on Prod /ifs/data/corpdata

    2. Destination path DR /ifs/data/dr/corpdata

  2. Eyeglass syncs configuration data between Prod and DR clusters (shares, exports, aliases)

    1. NOTE: Quotas are not used with DR test mode since the SyncIQ policies are one-way failover. Quotas will not exist on the DR cluster's read-only path and no quotas will be detected in DR test mode. If quotas are required for DR testing, they will need to be created manually.

  3. Create an Access Zone on the DR cluster that will be used to test access to production data. Suggested base path: /ifs/data/dr-testing (example only). The Access Zone can have any name, but a descriptive name such as “DR-Testing-Zone” is recommended.

    1. The Access Zone will need an associated IP pool and SmartConnect zone name, and should be set up for DR testing applications on a dedicated network.

    2. The Access Zone MUST be different than the Access Zone used for production data.

    3. The Access Zone MUST use the same Authentication Provider as the production Access Zone.

  4. On the DR cluster, a SyncIQ policy is created with a name that uses the prefix “Eyeglass-DR-Testing” and with the DR copy of the data as its source path (note: the name can include a hyphen and other text after the prefix to add a description)

Limit: 1 SyncIQ Job for DR Testing

    1. SyncIQ policy source path for DR cluster  /ifs/data/dr/corpdata

    2. Destination /ifs/data/dr-testing (Note: this path must match the Access Zone base path created above. A sub folder can be used as well; the key requirement is that the target path MUST fall under the base path of the Access Zone that will be used for DR Testing)

      1. NOTE: If you change the target path after the DR Testing Access Zone exists, Eyeglass Zone Replication will not be able to update the path for the existing DR Testing Access Zone, as per Isilon default behaviour.

    3. Destination cluster is the DR cluster (same cluster policy)

      1. Use the IP address of a subnet service IP on the DR cluster, and leave the default of using all nodes to speed up the copy and sync process

      2. Manually run the policy after creating it (the first copy can take a long time depending on the amount of data and will consume cluster resources, so schedule this for off-peak times)

    4. Set up a schedule on this policy with an interval that matches your testing requirements to maintain sync with production data.
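Two of the requirements above lend themselves to a quick pre-check before the policy is created: the policy name must start with the Eyeglass-DR-Testing prefix, and the target path must fall at or under the DR testing Access Zone base path. The sketch below shows such a check. It is a hypothetical helper, not part of Eyeglass or OneFS, and it assumes the prefix comparison is case-sensitive.

```python
from pathlib import PurePosixPath

DR_TEST_PREFIX = "Eyeglass-DR-Testing"


def check_dr_test_policy(name, target_path, zone_base_path):
    """Return a list of problems found; an empty list means both checks pass.

    Hypothetical helper: case-sensitive prefix matching is an assumption.
    """
    problems = []
    if not name.startswith(DR_TEST_PREFIX):
        problems.append("policy name must start with " + DR_TEST_PREFIX)
    target = PurePosixPath(target_path)
    base = PurePosixPath(zone_base_path)
    if target != base and base not in target.parents:
        problems.append("target path must be at or under the Access Zone base path")
    return problems


# Example values taken from the paths used in this section.
print(check_dr_test_policy("Eyeglass-DR-Testing-corpdata",
                           "/ifs/data/dr-testing",
                           "/ifs/data/dr-testing"))
```

An empty list for the example values indicates the name and target path satisfy both requirements.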

IMPORTANT: For the case where the Eyeglass-DR-Testing SyncIQ Policy Source Path is the same as the production SyncIQ Policy Target Path

  • Stagger the schedules so that Eyeglass-DR-Testing policy jobs do NOT start while the production SyncIQ policy is in a running state. An Eyeglass-DR-Testing policy that starts while the production SyncIQ policy is already running will result in Sync Job failure. In practice this means the DR sync policy schedule should use an interval of at least 12 or 24 hours to maintain a near copy of production data.

  • This is not an issue if the Eyeglass-DR-Testing policy is a sub-folder of the production SyncIQ Policy Target Path

  • Example:

Production SyncIQ Policy Target Path | Eyeglass-DR-Testing SyncIQ Policy Source Path | Schedule Overlap Allowed
/ifs/data/dr/corpdata                | /ifs/data/dr/corpdata                         | No
/ifs/data/dr/corpdata                | /ifs/data/dr/corpdata/app1                    | Yes
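The two cases in the example can be expressed as a small check on the paths. The sketch below covers only those two documented cases (identical paths and a sub-folder). Any other relationship between the paths is not described in this guide, so the hypothetical helper raises an error rather than guessing.

```python
from pathlib import PurePosixPath


def schedule_overlap_allowed(prod_target_path, dr_test_source_path):
    """Apply the overlap rule from the example table.

    Same path -> overlap not allowed; sub-folder -> overlap allowed.
    Hypothetical helper; other path relationships are not covered.
    """
    prod = PurePosixPath(prod_target_path)
    test = PurePosixPath(dr_test_source_path)
    if test == prod:
        return False
    if prod in test.parents:
        return True
    raise ValueError("path relationship not covered by the documented rule")


print(schedule_overlap_allowed("/ifs/data/dr/corpdata", "/ifs/data/dr/corpdata"))
print(schedule_overlap_allowed("/ifs/data/dr/corpdata", "/ifs/data/dr/corpdata/app1"))
```

The first call corresponds to the "No" row and the second to the "Yes" row.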


   

NOTE: Make sure to run the policy once after creating it to ensure the target path folder structure has been created.

    1. After config sync has run successfully, check the Eyeglass Jobs window for the new policy AND enable it, since it will be in the user-disabled state by default. It will appear in a new section called “Disaster Recovery Testing”

      1. Screen Shot 2016-02-27 at 4.05.02 PM.png

      2. Run the job now by selecting it and using Bulk Actions -> Run Now.

    2. Now check the cluster Access Zone SMB and export screens on OneFS to verify the configuration data from the production access zone has been created in the DR Testing access zone.

      1. Example below shows the DR test access zone and path has configuration data created with the correct path for the DR testing Access zone used in this example.

      2. Screen Shot 2016-02-27 at 4.09.35 PM.png

      3. Screen Shot 2016-02-27 at 4.09.44 PM.png





How to Enable DR Test Mode


Once DR Test mode is enabled from the DR Assistant icon,  the Access Zone data in the DR Test access zone becomes writable and available for testing.   The feature will make a final sync to ensure a recent copy is available for testing.  The configuration is also synced to ensure it’s a close mirror of the production configuration.  


  1. Eyeglass will detect the DR test policies based on the name prefix “Eyeglass-DR-Testing” and place each policy on the DR Assistant DR Testing tab.


  2. The configuration data is also synced to the Access Zone created above. NOTE: The target path of the policy must match the Access Zone target for DR testing. Eyeglass will automatically match the path in the policy to an Access Zone with a matching path. Never use a target path of a production Access Zone.

  3. The configuration data is now synced from the DR cluster SyncIQ path used as the source (which is a mirror of the production access zone path configuration data).

    1. All configuration data that matches the SOURCE path used on the DR testing policy is now created in the Access Zone that was matched to the target path of the DR Testing policy.

    2. This allows control over exactly what portion of the DR data and its configuration is synced for testing purposes.

    3. If a subset of the DR data is needed for testing, then a subset of the data and config can be synced by building a policy with different source paths.

  4. To enable write access:

IMPORTANT: For the case where the Eyeglass-DR-Testing SyncIQ Policy Source Path is the same as the production SyncIQ Policy Target Path

DO NOT enable DR Test Mode while the production SyncIQ policy is in a running state, or allow this policy to run during a DR test. We recommend a daily schedule, or setting the schedule to manual on the day of the DR test procedure. An Eyeglass-DR-Testing policy that starts while the production SyncIQ policy is already running will result in Sync Job failure.

NOTE: Release 1.9 now waits for the DR test mode policy to finish if it is already running, then runs the policy one last time, then sets the schedule to manual and caches the schedule in memory. This allows the DR Test mode policy to be running at the time enable write access is turned on.

    1. Open the DR assistant

    2. Select DR Testing tab (see screenshot above)

    3. Select the policy you want to enable for testing

    4. Select a bulk action -> Make Target Writeable

    5. Go to the running jobs window to monitor completion of the DR test enable job. A job named “Enable DR test mode” is created and can be viewed from the running jobs window.

    6. Once data and config are fully in sync:

  1. DR Test mode is enabled

  2. DR Testing can now begin.


How to Disable DR Test Mode


When DR Test mode is disabled on a policy, writes are disabled on the DR testing policy path.  


  1. Open the DR Assistant

  2. Select DR Testing tab

  3. Select the policy to disable DR Test mode

  4. Select a bulk action -> Make Target Read-Only

  5. Screen Shot 2016-12-13 at 9.25.28 AM.png

  6. View running jobs window to confirm all steps are executed to reset and resync data and configuration from production Access zone

  7. Screen Shot 2016-02-27 at 7.54.13 PM.png

  8. Disable DR Test mode completed

  9. The next time DR Test mode is enabled, a refreshed copy of data and configuration will be available for testing.

  10. NOTE: Release 1.9 and later will re-apply the cached schedule on the DR test mode policy only if the Eyeglass service has not been restarted since write access was enabled.
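The Release 1.9 behaviour around schedules (wait for a running policy, run a final sync, set the schedule to manual, cache the original schedule in memory, and re-apply it on disable unless the service has restarted) can be modelled in a few lines. The sketch below is an in-memory simulation written for illustration. The class names, the log messages and the schedule string are assumptions, and a restart is modelled simply as a new, empty cache.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DrTestPolicy:
    """Stand-in for a DR test SyncIQ policy (hypothetical model)."""
    name: str
    schedule: Optional[str]            # None stands for a manual schedule
    running: bool = False
    log: List[str] = field(default_factory=list)


class ScheduleCacheModel:
    """Keeps cached schedules in memory only, so a restart loses them."""

    def __init__(self) -> None:
        self._cached: Dict[str, Optional[str]] = {}

    def enable_test_mode(self, policy: DrTestPolicy) -> None:
        if policy.running:
            # Wait for the in-progress job instead of failing.
            policy.log.append("waited for running job")
            policy.running = False
        policy.log.append("final sync")
        self._cached[policy.name] = policy.schedule
        policy.schedule = None
        policy.log.append("schedule set to manual")

    def disable_test_mode(self, policy: DrTestPolicy) -> bool:
        """Re-apply the cached schedule; return False if none is cached."""
        if policy.name not in self._cached:
            return False
        policy.schedule = self._cached.pop(policy.name)
        return True


policy = DrTestPolicy("Eyeglass-DR-Testing-app1", "every 1 days", running=True)
service = ScheduleCacheModel()
service.enable_test_mode(policy)
print(policy.schedule, policy.log)
print(service.disable_test_mode(policy), policy.schedule)
```

Creating a fresh ScheduleCacheModel between enable and disable models a service restart: disable_test_mode then returns False and the policy stays on a manual schedule, which is the situation the note above warns about.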

 

How Jobs are displayed in Eyeglass UI

  1. Jobs UI

    1. A job from the production cluster is displayed as job type “Configuration Replication”.

    2. A job from the DR cluster that is prefixed with “Eyeglass-DR-Testing” is displayed as job type “Disaster Recovery Testing (AUTOMATIC)”.

    3. A DR type job can be run manually or on schedule with other configuration replication jobs.

    4. The user can put the job in the USERDISABLED or ENABLED state, while the job itself is put in the POLICYDISABLED state after Enable DR Test mode is run.



  2. DR Assistant

    1. All jobs that are prefixed with “Eyeglass-DR-Testing” are displayed in DR Assistant. This is the UI from which DR Test Mode is enabled or disabled.


  3. DR Dashboard

    1. The DR Dashboard shows the state of DR type jobs. A job can be in one of the following states:

      1. Entering DR Testing: When Enable DR test mode is initiated from DR Assistant and the job is still running, its state is “Entering DR Testing”.

      2. Target Writable: When the DR Testing job finishes its “Entering DR Testing” phase, it ends in the “Target Writable” state.

      3. Exiting DR Testing: When Disable DR test mode is initiated from DR Assistant and the job is still running, its state is “Exiting DR Testing”.

      4. Target Readonly: When the DR Testing job finishes its “Exiting DR Testing” phase, it ends in the “Target Readonly” state.



[Note: “Entering DR Testing” and “Exiting DR Testing” are transitory states that exist only during the execution phase of a DR Testing job. The stable states of a DR Testing job are “Target Writable”, in which the policy corresponding to the job is disabled and writes on the target destination are allowed, and “Target Read-Only”, in which the policy attached to the job is re-enabled and writes on the target destination are disallowed.]
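The four states and the order in which a job moves through them form a small cycle, which the sketch below models as a state machine. The state labels follow the list above. The event names (enable, disable, job_finished) are assumptions chosen for this example and are not Eyeglass identifiers.

```python
from enum import Enum


class DrTestState(Enum):
    TARGET_READONLY = "Target Readonly"
    ENTERING = "Entering DR Testing"
    TARGET_WRITABLE = "Target Writable"
    EXITING = "Exiting DR Testing"


# Allowed transitions; event names are illustrative assumptions.
NEXT_STATE = {
    (DrTestState.TARGET_READONLY, "enable"): DrTestState.ENTERING,
    (DrTestState.ENTERING, "job_finished"): DrTestState.TARGET_WRITABLE,
    (DrTestState.TARGET_WRITABLE, "disable"): DrTestState.EXITING,
    (DrTestState.EXITING, "job_finished"): DrTestState.TARGET_READONLY,
}

STABLE_STATES = {DrTestState.TARGET_READONLY, DrTestState.TARGET_WRITABLE}


def advance(state, event):
    """Return the next state, or raise if the event is invalid in this state."""
    try:
        return NEXT_STATE[(state, event)]
    except KeyError:
        raise ValueError(f"{event!r} is not valid in state {state.value!r}") from None


state = DrTestState.TARGET_READONLY
for event in ("enable", "job_finished", "disable", "job_finished"):
    state = advance(state, event)
    print(event, "->", state.value, "(stable)" if state in STABLE_STATES else "(transitory)")
```

Running the loop walks one full enable and disable cycle and ends back in the read-only state.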


Advanced DR Test mode Configuration

To control which applications are tested, multiple policies can be configured, each selecting a different source path and directing data to a different target path.

Reasons for multiple DR test mode policies

  1. There is too much data to make a full copy. Multiple policies allow targeting shares, exports, or both for application-specific testing

  2. Different groups testing different data need to execute DR testing at different times

  3. Application upgrade testing only requires a subset of the overall DR data to test with

Procedures to DR Test different data sets independently

Copy into 3 separate DR testing access zones with 3 separate policies

  1. Create each policy with the required policy prefix name followed by a name for the application test scenario (for example, append - followed by the name of the app, or -shares)

  2. Create each policy to select a source path that matches the target application (NOTE: ensure the path includes shares and exports required for testing)

    1. Create the target path inside the target test Access Zone created for DR Testing

    2. The target path can be at any depth below the base Access Zone path, which allows moving the data to a different path than exists in production, for example /ifs/data/dr-testing/applications/shares/application1

    3. NOTE: Each policy target path must be in a different Access Zone

  3. Run policies to sync data and set sync schedule as normal on SyncIQ policies to match your sync requirements to maintain a full copy up to date.

  4. Enable the DR Test mode policies in the Eyeglass jobs window after they have been discovered

  5. Run the job with bulk actions run now option

  6. Verify the configuration data is created in the DR test access zone.

  7. Open DR Assistant, DR Testing tab

  8. Select one or more policies to enable for DR Test mode in a writeable file system.

  9. two policies writeable.png


Copy data set into 1 DR Testing access zone from 3 separate policies

  1. Create 3 separate policies with the required policy prefix name and name for application test scenario. For example: Eyeglass-DR-Testing-1, Eyeglass-DR-Testing-2 and Eyeglass-DR-Testing-3.

  2. Create each policy to select a source path that matches the target application within its own access zone base path. Each policy source path should be within an individual access zone (NOTE: ensure the path includes shares and exports required for testing).

    1. Create the target path inside the target test Access Zone created for DR Testing

    2. The target path can be at any depth below the base Access Zone path, which allows moving the data to a different path than exists in production, for example /ifs/data/dr-testing/applications/shares/application1

    3. NOTE: Each policy target path must be in the same Access Zone

  3. Run policies to sync data and set sync schedule as normal on SyncIQ policies to match your sync requirements to maintain a full copy up to date.

  4. Enable the DR Test mode policies in the Eyeglass jobs window after they have been discovered

  5. Run the job with bulk actions run now option

  6. Verify the configuration data is created in the DR test access zone.

  7. Open DR Assistant, DR Testing tab

  8. Select one or more policies to enable for DR Test mode in a writeable file system.
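Both procedures depend on which Access Zone each policy target path falls under: a different zone per policy in the first layout, the same zone for all policies in the second. The sketch below checks a planned set of policies against a set of zone base paths before anything is created on the cluster. It is an illustrative helper only. The zone names and paths are hypothetical, and choosing the zone with the longest matching base path is an assumption, not documented Eyeglass behaviour.

```python
from pathlib import PurePosixPath


def zone_for_target(target_path, zone_base_paths):
    """Return the zone whose base path contains target_path, or None.

    Assumption: when several zones match, the longest base path wins.
    """
    target = PurePosixPath(target_path)
    matches = [
        (len(PurePosixPath(base).parts), zone)
        for zone, base in zone_base_paths.items()
        if target == PurePosixPath(base) or PurePosixPath(base) in target.parents
    ]
    return max(matches)[1] if matches else None


def classify_layout(policy_targets, zone_base_paths):
    """Describe how the policy target paths are spread across zones."""
    zones = [zone_for_target(path, zone_base_paths) for path in policy_targets.values()]
    if None in zones:
        return "unmatched"
    if len(set(zones)) == len(zones):
        return "separate zones"
    if len(set(zones)) == 1:
        return "single zone"
    return "mixed"


# Hypothetical plan: three policies copying into one DR testing zone.
zones = {"DR-Testing-Zone": "/ifs/data/dr-testing"}
targets = {
    "Eyeglass-DR-Testing-1": "/ifs/data/dr-testing/applications/shares/application1",
    "Eyeglass-DR-Testing-2": "/ifs/data/dr-testing/applications/shares/application2",
    "Eyeglass-DR-Testing-3": "/ifs/data/dr-testing/applications/shares/application3",
}
print(classify_layout(targets, zones))
```

A result of "unmatched" means at least one target path is outside every listed zone base path, and "mixed" means the plan matches neither of the two procedures.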

DR Test Mode States