Saviynt’s Journey to AWS – Part 1 – From Snowflakes to Phoenix Servers
Adopting any new technology or platform comes with a learning curve for an organization, and Saviynt was no exception.
As part of Saviynt’s Cloud Adoption, we went through the typical “Crawl, Walk and Run” learning curve and are currently between the “Walk” and “Run” phases.
This 3-part blog series aims to explain some of the architectural processes and patterns Saviynt has adopted and refined over time on its journey to AWS.
Part 1 – From Snowflakes to Phoenix Servers
Part 2 – The arduous road to securing native AWS Services
Part 3 – Challenges in securing DevOps and CI/CD Processes
Let’s try to understand these processes and patterns in detail:
1. Process of Dissection and Layering the Cloud
To take full advantage of the elasticity and scalability of AWS infrastructure services, we had to dissect Saviynt’s application and network architecture and map it to different layers on AWS.
The process of dissection resulted in the following layers on AWS:
- Static Layer – Bare-bone Network Tier – This layer included the core network services, such as VPCs (Virtual Private Clouds), ELBs (Elastic Load Balancers), VPN connections, subnet routing and route table definitions, and was defined to be more or less “static” in nature.
- Dynamic Layer – Application Tier – This layer included the various Saviynt application services distributed across multiple VMs. It was marked as “dynamic” because it would change frequently, and the end goal was to be able to re-create it from scratch whenever required. This laid down the guiding principle for implementing an Immutable Infrastructure.
- Persistence Layer – Logging and Data Tier – This layer included all the components which required state and had to be segregated from the dynamic layer. It required the application logging to be centralized and pushed out from individual application servers. This also required the various application services to be refactored and defined to be stateless in nature.
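To make the layering concrete, here is a minimal sketch of how components could be tagged with the layer they belong to. The component names and the `safe_to_rebuild` helper are hypothetical, chosen only to mirror the examples above; this is not Saviynt's actual tooling.

```python
from enum import Enum


class Layer(Enum):
    STATIC = "static"            # bare-bone network tier
    DYNAMIC = "dynamic"          # application tier, rebuilt from scratch
    PERSISTENCE = "persistence"  # logging and data tier, holds state


# Hypothetical component names, used only for illustration.
LAYER_BY_COMPONENT = {
    "vpc": Layer.STATIC,
    "elb": Layer.STATIC,
    "vpn_connection": Layer.STATIC,
    "route_table": Layer.STATIC,
    "app_server": Layer.DYNAMIC,
    "central_log_store": Layer.PERSISTENCE,
    "database": Layer.PERSISTENCE,
}


def layer_of(component: str) -> Layer:
    """Return the layer a component belongs to (KeyError if unmapped)."""
    return LAYER_BY_COMPONENT[component]


def safe_to_rebuild(component: str) -> bool:
    """Only the dynamic layer is meant to be destroyed and re-created at will."""
    return layer_of(component) is Layer.DYNAMIC
```

The point of such a mapping is that anything holding state lands outside the dynamic layer, so the dynamic layer can be thrown away without data loss.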
2. From Snowflake Servers to Phoenix Servers – Moving towards Immutable Infrastructure
I believe every one of us has spent countless hours troubleshooting issues in a clustered setup. Surprisingly often, the cause was a code patch or configuration change missing from a few servers in the cluster. Single-threading the infrastructure components at each tier (web, app) and digging into logs across architecture layers to identify the “culprit” server had come to be considered ‘part and parcel’ of software deployments and integration. Such configuration changes, accumulating over a period of time, lead to ‘Snowflakes’.
‘Snowflakes’ are servers which keep evolving and changing from their first deployed state due to code changes, patching or configuration changes.
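One way to picture how a “culprit” server could be spotted is to fingerprint each server's configuration and compare it with the intended baseline. The sketch below is only an illustration under assumed inputs: the configuration keys, version numbers and server names are hypothetical, and it is not how Saviynt detects drift.

```python
import hashlib
import json


def fingerprint(config: dict) -> str:
    """Hash a server's configuration in a key-order-independent way."""
    canonical = json.dumps(config, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def find_snowflakes(baseline: dict, servers: dict) -> list:
    """Return the names of servers whose configuration differs from the baseline."""
    expected = fingerprint(baseline)
    return sorted(name for name, cfg in servers.items()
                  if fingerprint(cfg) != expected)


# Hypothetical cluster: app-2 missed a patch, app-3 has a hand-edited setting.
baseline = {"war_version": "5.1.0", "heap_mb": 4096}
cluster = {
    "app-1": {"heap_mb": 4096, "war_version": "5.1.0"},
    "app-2": {"heap_mb": 4096, "war_version": "5.0.9"},
    "app-3": {"heap_mb": 8192, "war_version": "5.1.0"},
}
print(find_snowflakes(baseline, cluster))
```

Detection like this only tells you that drift exists; the Phoenix Server pattern discussed next avoids the drift in the first place.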
Saviynt’s vision with its Cloud Adoption has been to implement “Immutable Infrastructure” by adopting the “Phoenix Server” pattern.
‘Phoenix servers’ can be created from scratch by rebuilding them anytime. Martin Fowler has written a great article on the concept of Phoenix Servers and why they are important to manage drift.
Any change in our code, no matter how small, requires us to deploy the complete application stack. This deployment model has helped us tremendously in cutting down the troubleshooting effort caused by “missed configurations” on “culprit servers”. The idea was to build the entire “dynamic layer” from scratch whenever required, just like a “Phoenix Bird” rises from nothing.
As part of its CI/CD processes, Saviynt has built automated deployment pipelines, with all the configurations defined per environment, that deploy the complete “dynamic layer” within minutes. Not only has this helped us tremendously in optimizing our deployment procedures, it has also saved countless hours of troubleshooting time.
The “Static layer” and “Persistence layer” are defined as “Infrastructure Code” templates and are created from scratch in disaster recovery (DR) scenarios or emergencies.
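As a rough illustration of what an “Infrastructure Code” template for part of the static layer might look like, the sketch below generates a CloudFormation-style document for a VPC and its subnets. The post does not say which templating tool is used, so CloudFormation is an assumption here, and the function name and CIDR ranges are hypothetical.

```python
import json


def static_layer_template(vpc_cidr: str, subnet_cidrs: list) -> dict:
    """Build a CloudFormation-style template for a VPC and its subnets."""
    resources = {
        "Vpc": {
            "Type": "AWS::EC2::VPC",
            "Properties": {"CidrBlock": vpc_cidr},
        }
    }
    for index, cidr in enumerate(subnet_cidrs, start=1):
        resources[f"Subnet{index}"] = {
            "Type": "AWS::EC2::Subnet",
            # {"Ref": "Vpc"} points at the VPC resource defined above.
            "Properties": {"VpcId": {"Ref": "Vpc"}, "CidrBlock": cidr},
        }
    return {"AWSTemplateFormatVersion": "2010-09-09", "Resources": resources}


# Hypothetical address ranges, for illustration only.
template = static_layer_template("10.0.0.0/16", ["10.0.1.0/24", "10.0.2.0/24"])
print(json.dumps(template, indent=2))
```

Because the template is plain data generated from a few parameters, the same definition can be replayed in another region or account when the layer has to be rebuilt.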
3. Auto Scaling Integration is a “must” for elasticity
I used to dread the semi-annual server inventory process, which meant maintaining a large Excel sheet listing your servers, their OS types, the people with access to them and other mundane details. The concept of “static infrastructure”, or to be precise “a static application tier”, no longer exists. With Auto Scaling, the application tier ceases to be static, so maintaining an Excel-based inventory is neither required nor advisable. The “Dynamic Layer” has to be auto scaled to reap the true benefits of AWS elasticity, not only for scalability and reliability but also for the tremendous cost savings. However, making the application tier use Auto Scaling required Saviynt’s engineering team to make significant changes to the application design. With the application services now deeply integrated with the Auto Scaling features of AWS, the robustness and scalability of Saviynt services have improved significantly.
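To show why a fixed inventory stops making sense, here is a simplified proportional scaling rule: the instance count follows the load and is clamped to the group's bounds. This is only a sketch of the idea, not the algorithm AWS Auto Scaling actually runs; the function name, the metric and all the numbers are assumptions.

```python
import math


def desired_capacity(current: int, metric: float, target: float,
                     min_size: int, max_size: int) -> int:
    """Scale the instance count in proportion to load, clamped to group bounds."""
    if target <= 0:
        raise ValueError("target must be positive")
    proposed = math.ceil(current * metric / target)
    return max(min_size, min(max_size, proposed))


# Hypothetical group: 4 instances, aiming for 60% average CPU, sized 2 to 10.
print(desired_capacity(4, 90.0, 60.0, 2, 10))   # load above target: scale out
print(desired_capacity(4, 15.0, 60.0, 2, 10))   # load well below target: scale in
```

The answer to “how many servers do we have?” becomes a function of load at that moment, which is exactly why a spreadsheet cannot keep up.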
4. SSH-Less environment is our ultimate goal
Having a developer mindset, I always had the urge to SSH into any workload, put in a patch, deploy a new WAR or make configuration changes to “fix things”. This works fine for sandbox/development workloads.
Now increase the number of developers and the workloads ten times. Do you see the problem? Not yet?
Increase the number of developers and the workloads ten more times.
Managing drift and scaling this model has limitations. Allowing SSH connections only via “Bastion Hosts” or “Jump Boxes” is not the solution to this problem. Let’s understand why –
- Managing drift (app server configuration changes, application configuration, integration wirings like API URLs, serviceAccount IDs) on servers would become extremely difficult and result in creation of “Snowflake Servers”, leading to inconsistent environments
- Managing a secure process to distribute SSH keys/passwords to developers is tedious
- Rotation of passwords/SSH keys would require the re-distribution of new SSH keys/passwords to developers
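The “ten times, then ten more times” thought experiment can be put into numbers. The sketch below assumes the worst case, where every developer holds a credential for every workload, and starts from a hypothetical team of 5 developers and 20 workloads.

```python
def credentials_to_manage(developers: int, workloads: int) -> int:
    """Worst case: every developer holds a credential for every workload."""
    return developers * workloads


developers, workloads = 5, 20  # hypothetical starting point
for step in range(3):
    count = credentials_to_manage(developers, workloads)
    print(f"{developers} developers x {workloads} workloads = {count} credentials")
    developers *= 10
    workloads *= 10
```

Each tenfold increase on both sides multiplies the credentials to distribute and rotate by a hundred, which is why a bastion host alone does not address the scaling problem.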
Creating a pure SSH-less environment is something we are working on very actively at Saviynt, and the team has already started seeing the benefits. The application design changes needed to support an SSH-less environment are already in beta.
In the next post of this series, read about how we use Saviynt to secure Saviynt services on AWS throughout the CI/CD process, and how we use it to secure and monitor AWS security configurations.
Saviynt has also implemented several AWS access security design patterns. If you are interested in learning more about them, please read my previous blog posts:
- Cloud Infra Design Patterns – https://www.linkedin.com/pulse/securing-cloud-infrastructure-services-3-design-pattern-sinha
- AWS Privileged Access Management – https://www.linkedin.com/pulse/privilege-access-management-aws-vibhuti-sinha
- AWS IAM Governance – https://www.linkedin.com/pulse/aws-role-policy-governance-guiding-principles-vibhuti-sinha