08 August 2019

Prometheus with Azure Monitor

Tags: Cloud Azure

I have been using Azure AKS for quite some time now and haven't had many problems with it. I run my own observability stack (Prometheus + Grafana) and logging stack (EFK) on AKS. I recently noticed on Azure's blog that I can ditch my Prometheus from inside kubernetes and have Azure take care of scraping and storing the application metrics. If you have been using Prometheus with AKS as a developer, then not much will change for you. If, however, you have been using Prometheus for DevOps-related work and are quite good at writing PromQL, then brace yourself for KQL (Kusto Query Language).

An example query against the InsightsMetrics table looks like this:
InsightsMetrics
| where Name == "user_management"
| extend dimensions = parse_json(Tags)
| extend request_status = tostring(dimensions.request_status)
| where request_status == "fail"
| where TimeGenerated > todatetime('2019-08-02T09:40:00.000')
| where TimeGenerated < todatetime('2019-08-02T09:54:00.000')
| project request_status, Val, TimeGenerated
| render timechart
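
If you want an aggregated view instead of raw samples, the usual KQL summarize / bin pattern works as well. A minimal sketch, assuming the same hypothetical user_management metric and request_status label as above:

InsightsMetrics
| where Name == "user_management"
| extend dimensions = parse_json(Tags)
| where tostring(dimensions.request_status) == "fail"
// average the metric value per 1-minute bucket and plot it over time
| summarize avg(Val) by bin(TimeGenerated, 1m)
| render timechart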


24 June 2019

Streaming kubernetes logs from multiple replicas

I have been really busy these days with kubernetes, not only managing it but also developing microservices on top of it. One thing that has been particularly annoying is viewing logs in the command line from multiple replicas. First I installed stern and then I tried kubetail. Both are fine, but they require me to remember yet more commands on top of everything I already have to remember for kubectl.

I was reading this document the other day and noticed the paragraph:

Begin streaming the logs from all containers in pods defined by label app=nginx
kubectl logs -f -l app=nginx --all-containers=true

I am not sure when this feature was added but I wanted to try it out and see if it worked the way I expected. So I tried it out on one of the microservices I am working on.
I have a microservice called user-information; it is of kind deployment and has the replica count set to 3. I want to stream all pod logs of this service, so I tried:

kubectl logs -f --namespace user-information -l app=user-information

and there they were, all logs from all pods displayed in my console! Nice. No more external tools, just plain kubectl and grep.
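
Combined with grep this gets you pretty far. For example, to follow only the error lines across all replicas of the same service (the grep pattern is just an illustration):

kubectl logs -f -n user-information -l app=user-information --all-containers=true | grep -i error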

Bonus

If your services are structured and all of them have labels (which they should), you can further simplify the above command by adding a function to your ~/.functions file. Open the ~/.functions file with your preferred editor and paste the following block inside it:

logs() {
  # $1 is both the namespace and the app label of the service
  kubectl logs -f -n "$1" -l app="$1" --max-log-requests 5 --all-containers=true --timestamps=true
}

In a new terminal I can then simply view the logs with:

logs user-information

Caveats

Unfortunately, this works only on deployments / statefulSets with a replica count <= 5. If you try the above and encounter this error:

error: you are attempting to follow 6 log streams, but maximum allowed concurrency is 5

You might need to resort to something other than kubectl.

05 May 2018

Kubernetes in a private AWS network

I have been working on setting up a kubernetes cluster on AWS. Usually, the setup isn't difficult and there are many tools that can assist, for example kops, spray, conjure-up and probably many others I am forgetting. The problem I had with these tools is that they are configured to create all the resources a kubernetes cluster might need. For example, trying to use kops to create a cluster in a private subnet fails if no internet gateway exists, so it will try to create one. But what if you are in a corporate environment and no internet gateway has been provisioned? What if the internet breakout should go through your own data center? Now your options are limited to:

  1. Setting up the cluster manually (the hard way)

  2. Digging deep into the kops / spray code and modifying it to do what you need

Obviously, both options are time-consuming. What if there is another way? The semi-automated way. Actually, the creators of kubernetes thought about the case where flexibility and customizability are needed. That's why they gave us kubeadm, currently my go-to tool for provisioning and managing kubernetes clusters. I call it the semi-automated way because I have to ssh into the master / nodes and issue kubeadm commands and write some config files, but with a little bash scripting and terraform knowledge it's rather easy to automate everything.
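
A minimal sketch of that flow, assuming a recent kubeadm and a flannel-style pod CIDR (the IP, token and hash below are placeholders, not values from my setup):

# on the master, inside the private subnet
sudo kubeadm init --pod-network-cidr=10.244.0.0/16
mkdir -p $HOME/.kube
sudo cp /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

# on every worker node, using the join command printed by kubeadm init
sudo kubeadm join 10.0.0.10:6443 --token <token> --discovery-token-ca-cert-hash sha256:<hash>

Everything above can be wrapped in terraform user-data or a small bash script, which is what I mean by semi-automated.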

02 February 2018

Docker for developers: part 2

Tags: Docker

In a previous post, I wrote about some useful commands developers should know when using docker as a development environment. In this post, I'll add two more topics, namely multi-stage builds and permission problems with host mounted volumes. Let's start with docker multi-stage builds.

Docker multi-stage builds

In docker version 17.05 multi-stage builds were introduced, which simplified life for those of us who use docker in production and testing. Back when applications were deployed to production servers over SCP or similar, production servers were kept lean. Only packages related to the apps were added to the OS and nothing else. Keeping the production OS lean was a way of reducing the attack surface and thus the probability of vulnerabilities.

Similar to that logic, when you deploy containers to production you want the images to be as lean as possible. Build-essentials in the container? No way.
However, some packages require compilers to be installed during the build. The way to solve this puzzle prior to docker multi-stage builds was to have two separate Dockerfiles: one for building the binaries and one for running the actual application. With docker multi-stage builds you can now combine the two Dockerfiles into one and you will end up with a lean image for production. An example might be useful to show this:

20 January 2018

Docker for developers

Tags: Docker

With the rise of microservices, the complexity of dev environments has increased. Technology stacks can be chosen independently and thus the choice of programming languages has widened. Companies used to be proficient in one language and implemented their solutions in a monolithic way, using the tools/libraries available in that language. Today with containers it's totally different. If you are a Ruby on Rails shop but need to implement a real-time service you might dip into NodeJS. Or you are a PHP shop and need to implement a high-traffic service, so you might look into Golang. Or you have been using python2 but want the new service to use asyncio (python3). Now imagine the time it will take onboarding developers and setting up their dev environments. That can be quite time-consuming for the developer and for you! So what is the solution? Docker of course :)

Usually, docker is utilized for CI (building / testing / QA, etc.) but I think that every project should begin with a Dockerfile and a docker-compose file. That way, when somebody new joins the team they just need to run docker-compose up to start working. NodeJS 6 or 8? No problem, just define it in the Dockerfile and the developer doesn't have to bother with installing anything.
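
A minimal sketch of what that could look like for a hypothetical NodeJS service (both files shown together; names and the port are illustrative):

# Dockerfile
FROM node:8
WORKDIR /app
COPY package.json .
RUN npm install
COPY . .
CMD ["npm", "start"]

# docker-compose.yml
version: "3"
services:
  app:
    build: .
    ports:
      - "3000:3000"

With that in the repo, docker-compose up is all a new developer needs to run.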

For optimum results, the developer has to know some useful docker commands. In this post, I intend to go over some of the good-to-know docker commands that I have found myself using on a regular basis while using docker as my development environment.

[image: docker in dev envs]

22 December 2017

When to use DynamoDB and when not to

I have seen projects that use DynamoDB as their persistence layer with good results, but I have also seen many projects where DynamoDB wasn't suitable yet was used regardless. When I asked why DynamoDB was chosen, the answers were usually unclear and defensive. Before adopting a new technology, all good architects and engineers should ask themselves what benefits it will bring and whether those benefits warrant its inclusion.
In this post, I will list a couple of answers to the simple question "Why did you go with DynamoDB?", with my remarks based on my experience with DynamoDB. Let's start with the most common answer:

We don't have to think about scaling the system, it is done for us

Yes, scaling is done behind the scenes by AWS: roughly every 10 GB of data a new node (partition) is added. What people forget is that the provisioned RCUs/WCUs are a total shared across all nodes, so if you initially provision 30 RCU/WCU and your data suddenly grows to 2-3 nodes, each node only gets a fraction of that total. Hence, if a query throws a ProvisionedThroughputExceededException you can't provision more throughput on a node-by-node basis; you will have to overprovision your throughput or re-design your data model. Also, with the newly introduced auto-scaling feature for provisioned throughput, care must be taken not to exceed budget limits because of a bad initial data model design.
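
To make that concrete, a rough back-of-the-envelope example using the numbers above (assuming the pre-adaptive-capacity split behaviour this post describes):

provisioned throughput: 30 RCU on a table that has grown to 3 nodes
per-node budget:        30 / 3 = 10 RCU
a hot key that lands on one node throttles at ~10 RCU, even though the table-level figure is 30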

12 December 2017

Aerospike vs Redis vs Tarantool

I have been working on a project where we have to store application data (clicks, views, mentions, etc.) in order to do some analytics on the data later on. There are several ways to architect a solution for this kind of problem. Although in the past I architected a similar solution using Apache Kafka and the Citus extension for PostgreSQL, this time we decided to use an in-memory storage engine and implement a batch job that regularly aggregates the data and stores it in PostgreSQL.

The question was: which in-memory storage engine should we go for? In this post, I will compare three promising tools and do some performance testing on them in order to better evaluate which one to go with. Before I start I have to mention that I am slightly biased towards Redis, because I have used it regularly in the past and know my way around it, but after having now worked with both Aerospike and Tarantool I have to say that all of them are worthy candidates.

Test Setup

The tests are run on AWS. I use terraform to provision the infrastructure. The picture below shows an overview of the infrastructure. The tools were provisioned as single instances on their own nodes:
YCSB-tester on a c4.2xlarge (this instance was used to run the YCSB tool to perform the tests)
Redis-tester (v. 4.0.6) on a m4.xlarge
Tarantool-tester (v. 1.8) on a m4.xlarge
Aerospike-tester (v. 3.15.0.1) on a m4.xlarge
Elasticache-tester (v. 3.2.4) on a cache.m4.xlarge
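
For reference, the YCSB invocations look roughly like this (shown here for the Redis node; the host IP, record counts and workload are placeholders, not the exact parameters used in these tests):

# load the dataset into Redis, then run the mixed read/write workload
./bin/ycsb load redis -s -P workloads/workloada -p redis.host=10.0.1.10 -p redis.port=6379 -p recordcount=1000000
./bin/ycsb run redis -s -P workloads/workloada -p redis.host=10.0.1.10 -p redis.port=6379 -p operationcount=1000000 -threads 32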