Easing into the waters
Some food for thought about early infrastructure efforts for teams.
I keep reading good articles about Kubernetes, Nomad and other orchestration solutions for containers. They all look interesting. Yet many times I can't help but wonder whether all of that is just reinventing the wheel on top of an existing set of tools (imagine people building a road on top of a railway).
In parallel to those readings I am happy to read posts about why we might not need Kubernetes (or Nomad) and that rings a bit more true to my heart, especially for small and medium scales.
We are lucky to have offerings such as AWS, Azure and GCP. They already provide services that allow a small team to get the benefits of auto scaling and monitoring without worrying about much more than setting up the resources to run the services.
A tale of simplicity
What most small companies and early startups want is to get a service (an HTTP one) running with a few bits around it : a SQL or NoSQL database, a cache database and possibly a way to process background jobs. There may be a few other HTTP services around, but they are likely to follow the same principles as the first one.
In the case of AWS one can get containers built from code and deployed into an automatically prepared auto scaling group of EC2 instances in just a few clicks. This would rely on CodeCommit, CodeBuild and Elastic Beanstalk.
Using Terraform one could very well set this up, or go a bit further by replacing the Beanstalk blob with custom EC2 Auto Scaling Groups (ASG) and a Load Balancer (ALB). All of those can be tied directly to CloudWatch for performance metrics and logs.
Now … think about what we need to set up those things :
- a VPC
- a set of private and public subnets with their internet and NAT gateways
- a set of EC2 ASGs with the relevant Launch Template and scaling rules
- an ALB
- CloudWatch log groups and rules
- CodeCommit repositories and CodeBuild build projects
- ECR repositories for the container images
As for the data stores, you would most probably rely on :
- an RDS instance (whatever the engine)
- an ElastiCache instance (whatever the engine)
As for deployments and orchestration of your ASGs, you could rely on some conventions and a handful of Lambda functions tied to your CodeCommit and CodeBuild chains. Every successful build would see its release id added to a table in DynamoDB, and the live release for each ASG can be marked there too. A Lambda function can serve this through an API Gateway to a simple curl call in the user data of your Launch Template (or in the init process of your custom AMI), so that each instance knows which container release to pull and start.
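To make that convention a bit more concrete, here is a minimal sketch in Python of the release lookup. It is only an illustration under assumptions of mine : the table layout, the function names (record_build, promote, handler) and the "asg" query parameter are hypothetical, and a plain dict stands in for the DynamoDB table so the logic can be read and run without any AWS access.

```python
import json

# Hypothetical stand-in for the DynamoDB table: one record per ASG,
# listing the successful builds and marking which one is live.
RELEASES = {
    "web": {"builds": ["r41", "r42", "r43"], "live": "r42"},
    "worker": {"builds": ["r12"], "live": "r12"},
}


def record_build(table, asg, release_id):
    """Called after a successful build: remember the release id."""
    entry = table.setdefault(asg, {"builds": [], "live": None})
    if release_id not in entry["builds"]:
        entry["builds"].append(release_id)


def promote(table, asg, release_id):
    """Mark an already recorded release as the live one for an ASG."""
    entry = table.get(asg)
    if entry is None or release_id not in entry["builds"]:
        raise ValueError(f"unknown release {release_id!r} for {asg!r}")
    entry["live"] = release_id


def handler(event, context=None, table=RELEASES):
    """Lambda-style handler, assuming an API Gateway proxy-shaped event."""
    params = event.get("queryStringParameters") or {}
    asg = params.get("asg")
    entry = table.get(asg)
    if entry is None or entry["live"] is None:
        body = {"error": "no live release"}
        return {"statusCode": 404, "body": json.dumps(body)}
    body = {"asg": asg, "release": entry["live"]}
    return {"statusCode": 200, "body": json.dumps(body)}
```

On the instance side, the user data would then only need a curl call against the API Gateway URL, followed by a pull of the container tagged with the returned release id. In a real setup the dict would be replaced by reads and writes on the DynamoDB table.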
All those steps rely on AWS primitives, classic Linux tools and a bit of Python, Ruby, Go or sh scripting skills.
All this will get you started and more. If you build all this with Terraform you can extend the infrastructure by reusing the same principles and modules again.
There are also a lot of tools that already work well with these building blocks. They are mature and polished because we have been using them for so long.
Problems, problems, problems
Tools like Kubernetes are made to solve some problems :
- automatic deployment
- application network configuration
- resource allocation
- distribution of services
- load balancing
- automatic fail over
- automatic scaling
- storage sharing
One has to keep in mind that those problems arise, and really are an issue, when you start from a set of racks with servers, aka “bare metal”.
If you start from AWS offerings instead, we just saw that a setup relying on the platform's primitives already covers a lot of this :
- automatic deployment : one can figure out how to deploy automatically on EC2 instances when they start
- application network configuration : done with Security Groups when the service is set up, and it can be precise enough to limit access to data stores to specific ASGs only
- resource allocation : size up the instances before launch, adjust as time goes by depending on the
- distribution of services : non existent as each service has its resources
- load balancing : dedicated resource among the primitives we identified
- automatic fail over : AWS takes care of that internally, bad hosts are replaced
- automatic scaling : that too is handled by AWS with ASGs and scaling rules
- storage sharing : that one is possibly the remaining issue, but it’s covered in many places on the web with different solutions, including … not sharing storage between instances and just doing something else
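To give an idea of what a scaling rule in the list above boils down to, here is a small Python sketch of a proportional rule that tries to keep an average metric (CPU usage, for example) close to a target. This is a simplified model written for illustration, not the algorithm AWS actually runs, and the function name and parameters are my own.

```python
import math


def desired_capacity(current, metric, target, minimum, maximum):
    """Estimate how many instances keep the average metric near the target.

    Simplified, hypothetical model: scale the current instance count by
    the ratio metric / target, then clamp to the bounds of the group.
    """
    if current <= 0 or target <= 0:
        raise ValueError("current and target must be positive")
    wanted = math.ceil(current * metric / target)
    return max(minimum, min(maximum, wanted))
```

With 4 instances averaging 90% CPU and a 60% target, this sketch asks for 6 instances. The minimum and maximum bounds are what keep a noisy metric from emptying the group or from running up the bill.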
Jumping on the bandwagon
It’s great to see the ecosystem so active and presenting all those new tools, libraries, abstractions and so on.
Yet, before jumping on the bandwagon I tend to ponder and ask myself : is that actually solving something for me ? Do I want to fight this fight ?
It all depends on the project, the team and the timeline we have in sight. Big IaaS providers such as AWS, Azure and GCP already give you a lot to start with and hit the ground running. With a minimal engineering team (2, 3 or 4 people) you can get something off the ground that already has auto scaling, network policing, monitoring, alerting, a good set of CI/CD processes and security for the data.
This might lock you in with that provider for a time. Yet, at this stage, are you able to deal with more than just making your product ?
Tools such as Terraform already go a long way towards making it easy to set up what are, indeed, complex infrastructures for small and medium products within AWS.
Kubernetes and Nomad might also give you a good set of tools, but are you ready to tackle the complexity they add on top of something that already works ok ?
And then, if the need arises and your team is ready for it, you will still be able to make it more complex by replacing some bits with Nomad, Kubernetes or something else.
Remember that when Unix was made, the first idea was to keep things simple, one tool was to do one thing and do it well. The second idea was to be able to chain small, working, blocks to make more complex systems. That way, each administrator was able to tailor, on the go, those chains by replacing one or a few links with what they actually needed in their specific ecosystem.
All that applies if you are willing to go with AWS, GCP or another big IaaS provider and their services. If you are not, then you are back to square one, and tools like Kubernetes and Nomad do make sense to get you out of the starting blocks faster : they will be your IaaS and you will be free from the start.
Conclusion ? Make a choice based on your team and the battle you want to fight, not just a fad you read or heard about at the coffee shop.
Have fun.
Interested in this architecture or something similar ? I am happy to tell you more about this and help you. I am based in France, and can consult remotely and on site. Contact me : thomas@imfiny.com.