06 November 2020

Azure SAS tokens on Cloudflare Workers

Tags: Cloud Azure

I have been playing with Cloudflare Workers a lot recently and fell in love with the tech, although you need a different attitude when using Cloudflare Workers than when building NodeJS apps. For starters, Cloudflare Workers run as service workers on the edge, which means you can't use every NodeJS library out there.

In my opinion this also gives Cloudflare Workers an advantage. We are so used to doing npm install in the NodeJS world that we have almost forgotten how to write a left-padding algorithm ourselves :)

With Cloudflare Workers we have more constraints and have to think carefully about the libraries we include. These constraints made me start writing the libraries my applications need myself instead of relying on npm for everything, and it has been refreshing and fun :)

Just recently I wrote a library that generates Azure SAS tokens. 

An Azure SAS token is a string that gives your clients temporary access to Azure Storage.


 

Microsoft provides a JavaScript SDK that can generate SAS tokens, but unfortunately it only works in NodeJS and not in service workers or browsers.

I decided to write a library that can generate SAS tokens in service workers and is compatible with browsers and Cloudflare edge workers. For this I used the Web Crypto API. You can view the code on GitHub. I wrote it in TypeScript, so you can use it from JavaScript or TypeScript.
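To give an idea of what the Web Crypto part looks like: the signature in a SAS token is an HMAC-SHA256 over a canonical string-to-sign, computed with the storage account key. Here is a minimal sketch of that step (the function name is mine, not necessarily the library's internals):

// Sketch: sign an Azure SAS string-to-sign with the Web Crypto API.
// The account key is base64 encoded; the resulting signature goes into
// the token's sig parameter, also base64 encoded.
async function signStringToSign(accountKey: string, stringToSign: string): Promise<string> {
  const keyBytes = Uint8Array.from(atob(accountKey), (c) => c.charCodeAt(0));
  const cryptoKey = await crypto.subtle.importKey(
    "raw",
    keyBytes,
    { name: "HMAC", hash: "SHA-256" },
    false,
    ["sign"]
  );
  const signature = await crypto.subtle.sign(
    "HMAC",
    cryptoKey,
    new TextEncoder().encode(stringToSign)
  );
  return btoa(String.fromCharCode(...new Uint8Array(signature)));
}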

The API of the library is also pretty simple. Just call createBlobSas and use the returned string to store / view files in Azure.

For example, imagine you have a someBlob.txt file stored in Azure Storage and you want to give somebody access to view the file for five minutes. You could use a code snippet like the following:
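(A sketch only: apart from createBlobSas, the import path and option names below are assumptions of mine, so check the repository's README for the real signature.)

// Illustrative sketch: only createBlobSas is taken from the library itself;
// the import path and option names are assumptions.
import { createBlobSas } from "azure-sas-tokens";

declare const AZURE_ACCOUNT_KEY: string; // e.g. bound as a Worker secret

const sas = await createBlobSas({
  accountName: "mystorageaccount",
  accountKey: AZURE_ACCOUNT_KEY,
  containerName: "my-container",
  blobName: "someBlob.txt",
  permissions: "r",                                // read-only
  expiresOn: new Date(Date.now() + 5 * 60 * 1000), // valid for five minutes
});

// Hand this URL to the client; it only works for the next five minutes.
const url = `https://mystorageaccount.blob.core.windows.net/my-container/someBlob.txt?${sas}`;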
 

08 August 2019

Prometheus with Azure Monitor

Tags: Cloud Azure

I have been using Azure AKS for quite some time now and haven't had many problems with it. I run my own observability stack (Prometheus + Grafana) and logging stack (EFK) on AKS myself. I recently noticed on Azure's blog that I can ditch the Prometheus instance running inside Kubernetes and have Azure take care of scraping and storing the application metrics. If you have been using Prometheus with AKS, then not much will change for you as a developer. If, however, you have been using Prometheus for DevOps-related work and are quite good at writing PromQL, then brace yourself for KQL (Kusto Query Language).

InsightsMetrics
| where Name == "user_management"
| extend dimensions = parse_json(Tags)
| where tostring(dimensions.request_status) == "fail"
| where TimeGenerated > todatetime('2019-08-02T09:40:00.000')
| where TimeGenerated < todatetime('2019-08-02T09:54:00.000')
| project TimeGenerated, Val
| render timechart
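For comparison, the rough PromQL equivalent of the query above (assuming user_management is the metric name and request_status one of its labels) would simply be:

user_management{request_status="fail"}

with the time range selected in the Prometheus / Grafana UI rather than written into the query.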

 

24 June 2019

Streaming kubernetes logs from multiple replicas

I have been really busy these days with Kubernetes, not only managing it but also developing microservices on top of it. One thing that has been particularly annoying is viewing logs from multiple replicas in the command line. First I installed stern and then I tried kubetail. Both are fine, but they require me to remember yet more commands on top of all the commands I already have to remember for kubectl.

I was reading this document the other day and noticed the paragraph:

Begin streaming the logs from all containers in pods defined by label app=nginx
kubectl logs -f -l app=nginx --all-containers=true

I am not sure when this feature was added, but I wanted to try it out and see if it works the way I expect. So I tried it on one of the microservices I am working on.
I have a microservice called user-information; it is a Deployment with the replica count set to 3. I want to stream the logs of all its pods, so I tried:

kubectl logs -f --namespace user-information -l app=user-information

and there you go: all logs from all pods displayed in my console! Nice. No more external tools, just plain kubectl and grep.

Bonus

If your services are structured consistently and all of them have labels (which they should), you can further simplify the above command by adding a function to your ~/.functions file. Open the ~/.functions file with your preferred editor and paste the following block inside it:

logs() {
  kubectl logs -f -n "$1" -l app="$1" --max-log-requests 5 --all-containers=true --timestamps=true
}

In a new terminal I can then simply view the logs with:

logs user-information

Caveats

Unfortunately, out of the box this only works on Deployments / StatefulSets with a replica count <= 5. If you try the above and encounter this error:

error: you are attempting to follow 6 log streams, but maximum allowed concurrency is 5

You can either raise the limit with --max-log-requests (see the example below) or resort to something other than kubectl.
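For example, with six replicas something like this should do the trick:

kubectl logs -f --namespace user-information -l app=user-information --all-containers=true --max-log-requests 10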

 

 

05 May 2018

Kubernetes in a private AWS network

I have been working on setting up a Kubernetes cluster on AWS. Usually, the setup isn't difficult and there are many tools that can assist, for example kops, kubespray, conjure-up and probably many others I am forgetting. The problem I had with these tools is that they are configured to create all the resources a Kubernetes cluster might need. For example, kops expects an internet gateway when creating a cluster in a private subnet, and if none exists it will try to create one. But what if you are in a corporate environment and no internet gateway has been provisioned? What if the internet breakout should go through your own data center? Now your options are limited to:

  1. Setting up the cluster manually (hard way)

  2. Dig deep into the kops / kubespray code and modify it to do what you need 

Obviously, both options are time-consuming. What if there is another way? The semi-automated way. Actually, the creators of Kubernetes thought about the case where flexibility and customizability are needed. That's why they gave us kubeadm, currently my go-to tool for provisioning and managing Kubernetes clusters. I call it the semi-automated way because I have to SSH into the masters / nodes, issue kubeadm commands and write some config files, but with a little bash scripting and Terraform knowledge it's rather easy to automate everything.
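As a rough sketch of that flow (the flags and addresses below are placeholders and depend on your network plugin and VPC layout):

# on the master, once terraform has created the instances
kubeadm init --pod-network-cidr=10.244.0.0/16

# on each worker node, using the token and CA hash printed by kubeadm init
kubeadm join <master-private-ip>:6443 --token <token> --discovery-token-ca-cert-hash sha256:<hash>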

02 February 2018

Docker for developers: part 2

Tags: Docker

In a previous post, I wrote about some useful commands developers should know when using docker as a development environment. In this post, I'll add two more topics, namely multi-stage builds and permission problems with host mounted volumes. Let's start with docker multi-stage builds.

Docker multi-stage builds

In Docker version 17.05 multi-stage builds were introduced, which simplified life for those of us who use Docker in production and testing. Back when applications were deployed to production servers over SCP or similar, production servers were kept lean. Only packages related to the apps were added to the OS and nothing else. Keeping the production OS lean was a way of reducing the attack surface and thus the probability of vulnerabilities.

Following the same logic, when you deploy containers to production you want the images to be as lean as possible. Build-essential in the container? No way.
However, some packages require compilers to be installed at build time. The way to solve this puzzle prior to Docker multi-stage builds was to have two separate Dockerfiles: one for building the binaries and one for running the actual application. With Docker multi-stage builds you can now combine the two Dockerfiles into one and end up with a lean image for production. An example might be useful to show this.
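A minimal sketch of such a combined Dockerfile (the Go application and image tags are placeholders of my choosing; the same pattern works for any stack that needs build tooling):

# build stage: compilers and build tooling live only in this image
FROM golang:1.10 AS builder
WORKDIR /src
COPY . .
RUN CGO_ENABLED=0 go build -o /bin/app .

# runtime stage: only the compiled binary is copied into a small base image
FROM alpine:3.7
COPY --from=builder /bin/app /usr/local/bin/app
ENTRYPOINT ["app"]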

20 January 2018

Docker for developers

Tags: Docker

With the rise of microservices, the complexity of dev environments has increased. Technology stacks can be chosen independently, and thus the choice of programming languages has widened. Companies used to be proficient in one language and implement their solutions in a monolithic way, using the tools/libraries available in that language. Today, with containers, it's totally different. If you are a Ruby on Rails shop but need to implement a real-time service, you might dip into NodeJS. Or you are a PHP shop that needs to implement a high-traffic service, so you might look into Golang. Or you have been using Python 2 but want the new service to use asyncio (Python 3). Now imagine the time it will take to onboard developers and set up their dev environments. That can be quite time-consuming for the developer and for you! So what is the solution? Docker of course :)

Usually, Docker is utilized for CI (building / testing / QA etc.), but I think that every project should begin with a Dockerfile and a docker-compose file. That way, when somebody new joins the team they just need to do a docker-compose up to start working. NodeJS 6 or 8? No problem, just define it in the Dockerfile and the developer doesn't have to bother with installing anything.
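As a minimal sketch of what that starting point could look like (the Node version, port and file layout are only examples):

# Dockerfile
FROM node:8
WORKDIR /app
COPY package*.json ./
RUN npm install
COPY . .
CMD ["npm", "start"]

# docker-compose.yml
version: "3"
services:
  app:
    build: .
    ports:
      - "3000:3000"
    volumes:
      - .:/app             # mount the source for live editing
      - /app/node_modules  # keep the image's node_modules instead of the host's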

For optimal results, the developer has to know some useful Docker commands. In this post, I intend to go over some of the good-to-know Docker commands I have found myself using on a regular basis while using Docker as my development environment.

docker in dev envs

22 December 2017

When to use DynamoDB and when not to

I have seen projects that use DynamoDB as their persistence layer with good results, but I have also seen many projects where DynamoDB wasn't suitable yet was used regardless. Upon asking why DynamoDB was chosen, the answers were usually unclear and defensive. Before adopting a new technology, all good architects and engineers should ask themselves what benefits the technology will bring and whether those benefits warrant its inclusion.
In this post, I will list a couple of answers to the simple question "Why did you go with DynamoDB?" along with my remarks based on my experience with DynamoDB. Let's start with the most common answer:

We don't have to think about scaling the system, it is done for us

Yes, scaling is done behind the scenes by AWS: for every 10 GB of data a new node (partition) is added. What people forget is that when you initially provision 30 RCU/WCU and your data suddenly grows to 2-3 nodes, the provisioned RCUs/WCUs are the total across all nodes. Hence, if a query throws a ProvisionedThroughputExceededException you can't provision more throughput on a node-by-node basis; you will have to overprovision your throughput or redesign your data model. Also, with the newly introduced auto-scaling feature for provisioned throughput, care must be taken not to exceed budget limits because of a bad initial data model design.
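To make that concrete, here is a rough back-of-the-envelope example (the per-partition limits come from the partition sizing guidance AWS published at the time, so treat the exact numbers as illustrative):

  Provisioned: 30 RCU and 30 WCU, table size: 25 GB
  Partitions ≈ max(25 GB / 10 GB, 30/3000 + 30/1000) ≈ 3
  Per-partition throughput: 30 RCU / 3 = 10 RCU and 30 WCU / 3 = 10 WCU

So a hot key whose reads all land on one partition gets throttled at roughly 10 RCU, even though the table as a whole is provisioned for 30.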