The RosettaHealth High Availability Policy establishes policies and procedures designed to ensure continuous availability of the RosettaHealth Platform to customers. This policy is maintained by the RosettaHealth Security Officer and CTO.
This RosettaHealth High Availability Policy has been developed as required under the Office of Management and Budget (OMB) Circular A-130, Management of Federal Information Resources, Appendix III, November 2000, and the Health Insurance Portability and Accountability Act (HIPAA) Final Security Rule, §164.308(a)(7), which requires the establishment and implementation of procedures for responding to events that damage systems containing electronic protected health information.
Applicable Standards
Applicable Standards from the HITRUST Common Security Framework
- 12.c - Developing and Implementing Continuity Plans Including Information Security
Applicable Standards from the HIPAA Security Rule
- 164.308(a)(7)(i) - Contingency Plan
Architecture Based Approach
The architecture of the RosettaHealth platform is based on the concept of high availability (HA). High availability is defined as providing a solution that is resilient to unexpected surges in demand as well as unexpected degradation of capability. Five main mechanisms provide this HA capability:
- Two geographically separated data centers
- Replication of all platform components (servers and systems) across both data centers
- Rapidly scalable capacity
- Multi-level load balancers that route traffic not only between the data centers but also between the platform components within each data center
- 24/7 monitoring of RosettaHealth components
Geographically Separated Data Centers
The RosettaHealth Platform is hosted in Amazon Web Services (AWS) at 2 distinct locations, or Availability Zones (AZ), in the Northern Virginia Region. AWS describes them as follows: "Each Availability Zone is designed as an independent failure zone. This means that Availability Zones are physically separated within a typical metropolitan region and are located in lower risk flood plains (specific flood zone categorization varies by AWS Region). In addition to discrete uninterruptible power supply (UPS) and onsite backup generation facilities, they are each fed via different grids from independent utilities to further reduce single points of failure. Availability Zones are all redundantly connected to multiple tier-1 transit providers." (Source: https://docs.aws.amazon.com/whitepapers/latest/aws-overview/global-infrastructure.html)
Replication of Platform Components
The RosettaHealth Platform is a system of systems composed of multiple components that fall into one of three categories:
- HealthBus components developed and/or maintained by RosettaHealth
- AWS infrastructure components managed by ClearDATA
- AWS cloud services managed by ClearDATA
For HealthBus components developed and/or maintained by RosettaHealth, each component is duplicated in each AZ. Those components rely on AWS infrastructure components (e.g., EC2, EBS, …) that are managed by ClearDATA. The RosettaHealth technical team coordinates with ClearDATA to ensure that each infrastructure component is available to support the HealthBus components. Additionally, AWS cloud services are utilized (e.g., Lambda, RDS, S3, …). All of these services are redundant across multiple AZs.
Rapidly Scalable Capacity
The use of 2 AZs provides sufficient capacity for normal RosettaHealth operations. In addition, if demand on components suddenly increases, whether from customer usage or from degradation in one AZ, the capacity of individual platform components can be adjusted, either manually or automatically, to handle the increase. Most operations can therefore continue in a single AZ for at least a limited amount of time if needed.
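The single-AZ claim above amounts to a simple capacity check. The following is a minimal sketch of that arithmetic, not RosettaHealth's actual sizing method: the function names, the requests-per-second unit, the 25% headroom default, and the per-AZ instance limit are all hypothetical values chosen for illustration.

```python
import math


def instances_needed(demand_rps: float, per_instance_rps: float,
                     headroom: float = 0.25) -> int:
    """Return how many instances are needed to serve the demand.

    headroom is an assumed safety margin (0.25 means 25% spare capacity);
    it is illustrative and not taken from the policy.
    """
    if per_instance_rps <= 0:
        raise ValueError("per_instance_rps must be positive")
    required = demand_rps * (1 + headroom) / per_instance_rps
    # Always keep at least one instance running.
    return max(1, math.ceil(required))


def can_run_in_single_az(demand_rps: float, per_instance_rps: float,
                         az_instance_limit: int,
                         headroom: float = 0.25) -> bool:
    """Check whether one AZ alone could carry the full demand."""
    needed = instances_needed(demand_rps, per_instance_rps, headroom)
    return needed <= az_instance_limit


if __name__ == "__main__":
    # Hypothetical figures: 900 requests/s, 200 requests/s per instance.
    print(instances_needed(900, 200))
    print(can_run_in_single_az(900, 200, az_instance_limit=6))
```

A check like this could run before planned maintenance or as an input to an automatic scaling rule, so that the loss of one AZ is known in advance to be survivable.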
Multi-Level Load Balancers
The replication of HealthBus components across two AZs is supported by 2 levels of load balancers. Level 1 consists of AWS Network Load Balancers (NLB) that balance traffic between the two AZs. All inbound traffic is passed to one of the two NLBs. Each NLB then passes the traffic to the Level 2 load balancers in a round-robin pattern. If an NLB detects that either of the Level 2 load balancers is unavailable, it re-routes traffic to the available one. The Level 2 load balancers are based on HAProxy running on 2 EC2 instances, each in a different AZ. Each Level 2 load balancer routes traffic to the appropriate HealthBus component, also using a round-robin strategy to distribute traffic to components across both AZs.
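The round-robin behavior with failover described above can be illustrated with a small model. This is a conceptual sketch only: it does not reproduce how AWS NLB or HAProxy are implemented or configured, and the class name and target names such as `haproxy-az1` are hypothetical.

```python
from typing import Iterable, List, Set


class RoundRobinBalancer:
    """Distribute requests across targets in turn, skipping unhealthy ones."""

    def __init__(self, targets: Iterable[str]) -> None:
        self._targets: List[str] = list(targets)
        if not self._targets:
            raise ValueError("at least one target is required")
        self._next = 0               # index of the next target to try
        self._down: Set[str] = set()  # targets currently marked unhealthy

    def mark_down(self, target: str) -> None:
        """Record that a health check failed for this target."""
        self._down.add(target)

    def mark_up(self, target: str) -> None:
        """Record that this target is healthy again."""
        self._down.discard(target)

    def pick(self) -> str:
        """Return the next healthy target in round-robin order."""
        for _ in range(len(self._targets)):
            target = self._targets[self._next]
            self._next = (self._next + 1) % len(self._targets)
            if target not in self._down:
                return target
        raise RuntimeError("no healthy targets available")


if __name__ == "__main__":
    # Hypothetical Level 2 load balancers, one per AZ.
    level2 = RoundRobinBalancer(["haproxy-az1", "haproxy-az2"])
    print([level2.pick() for _ in range(4)])  # alternates between the two
    level2.mark_down("haproxy-az1")
    print([level2.pick() for _ in range(3)])  # only the healthy one remains
```

The same model applies at both levels: Level 1 picks a Level 2 load balancer, and Level 2 picks a HealthBus component instance.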
24/7 Monitoring
Overseeing this HA architecture is a set of policies and procedures described in the RosettaHealth Auditing Policy concerning Monitoring and Alerting. These support the HA approach by providing continuous monitoring of all components. In most cases, a component that fails initiates a restart process to bring itself back online. If the component cannot restart automatically, monitors alert RosettaHealth administrators, who can then restore the component or take other remediating actions. This provides the RosettaHealth platform with a real-time Disaster Recovery capability at both the data center level and the individual component level.
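The restart-then-alert sequence described above can be sketched as follows. This is an illustrative outline under stated assumptions, not the monitoring tooling RosettaHealth uses: the `restart` and `alert` callbacks, the component name, and the limit of three restart attempts are hypothetical.

```python
from typing import Callable, List


def handle_failure(component: str,
                   restart: Callable[[str], bool],
                   alert: Callable[[str], None],
                   max_restarts: int = 3) -> bool:
    """Try to restart a failed component; alert administrators if that fails.

    restart(component) should return True when the component came back online.
    Returns True if the component recovered automatically, False otherwise.
    """
    for _ in range(max_restarts):
        if restart(component):
            return True  # recovered without human intervention
    # Automatic recovery did not work, so escalate to administrators.
    alert(f"{component} did not recover after {max_restarts} restart attempts")
    return False


if __name__ == "__main__":
    sent: List[str] = []
    # Simulate a component whose restart never succeeds.
    recovered = handle_failure("example-component",
                               restart=lambda name: False,
                               alert=sent.append)
    print(recovered, sent)
```

Passing the restart and alert actions as callbacks keeps the escalation logic independent of any particular process manager or paging service.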
Line of Succession
The following order of succession ensures that decision-making authority for RosettaHealth is uninterrupted. The Chief Technology Officer (CTO) is responsible for ensuring the safety of personnel and the execution of procedures documented within this RosettaHealth Contingency Plan. If the CTO is unable to function as the overall authority or chooses to delegate this responsibility to a successor, the CEO shall function as that authority. If the contingency plan needs to be initiated, use the contact list below.
- Kevin Puscas, CTO: (301) 919-2978, kevin.puscas@rosettahealth.com
- Buff Colchagoff, CEO: (202) 345-0298, buff.colchagoff@rosettahealth.com
Responsibilities
The RosettaHealth Tech Team is responsible for working with ClearDATA to set up the HA capabilities of the RosettaHealth production environment in AWS, including AWS services, network services, and all EC2 servers. The RosettaHealth Tech Team is directly responsible for ensuring that all RosettaHealth Platform components are working.
Testing and Maintenance
The HA capability is routinely tested as part of normal maintenance operations. During these operations, one component in one data center is taken offline and traffic is automatically rerouted to the healthy instance in the other data center. Once maintenance on the component is complete, it is brought back up and added back to the load balancer(s).
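The maintenance cycle described above (remove from rotation, perform maintenance, add back) can be sketched as a context manager. This is a hypothetical model, not the actual maintenance tooling: the pool is represented as a plain set of instance names, and the names themselves are invented for illustration.

```python
from contextlib import contextmanager
from typing import Iterator, Set


@contextmanager
def out_of_rotation(pool: Set[str], member: str) -> Iterator[None]:
    """Temporarily remove one instance from a load balancer pool.

    Refuses to proceed if removing the instance would leave the pool empty,
    so traffic always has a healthy instance to be rerouted to.
    """
    if member not in pool:
        raise ValueError(f"{member} is not in the pool")
    if len(pool) < 2:
        raise RuntimeError("cannot remove the last instance from the pool")
    pool.discard(member)   # traffic now goes only to the remaining instances
    try:
        yield              # maintenance happens here
    finally:
        pool.add(member)   # re-add the instance even if maintenance failed


if __name__ == "__main__":
    # Hypothetical pool with one instance of a component in each AZ.
    pool = {"component-az1", "component-az2"}
    with out_of_rotation(pool, "component-az1"):
        print("serving from:", sorted(pool))
    print("restored:", sorted(pool))
```

In a real environment the instance would normally be re-added only after passing a health check; the unconditional re-add in the `finally` clause is a simplification.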