Cloud Run is Google Cloud's serverless product for running containers. After a short period in beta, the product is now generally available and ready for production workloads. Let me tell you three reasons why I think Cloud Run is different!
With Cloud Run, you push a container and get an HTTPS endpoint back. Other serverless platforms, like AWS Lambda, Cloud Functions, or App Engine, are source-based instead: you upload your source code and the platform builds it for you. That is very convenient and helps you get started quickly, but it does have some drawbacks.
I favor a container-based platform over a source-based one. A container draws a very clear division of responsibilities between me, the developer, and the platform: my job is to provide a container that starts a binary and listens for HTTP requests on a port; the platform's job is to run that container. This container runtime contract is very stable and unlikely to change. It does require me to write a Dockerfile and start my own HTTP server, but that is a small price to pay for long-term stability and portability.
When you practice continuous delivery, you want to separate build and deploy. With Cloud Run, you first build an artifact (a Docker container). You can test that artifact, deploy it to a testing or staging environment, and finally promote it to production. With a source-based platform, you never really know what ends up in production: you only see the outcome of the build once it is deployed.
Cloud Run implements Knative. This is both an open API and a Kubernetes-based platform providing a uniform interface for running serverless workloads. For you, as a developer, this means you can take your Cloud Run workload and deploy it to your own on-premises Kubernetes cluster, or to a managed Kubernetes cluster on Azure, Google Cloud, AWS, Alibaba Cloud, and more.
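To give you an idea of what that portability looks like, a Cloud Run service corresponds to a Knative Service. A minimal manifest sketch (the service name and image are hypothetical) that you could apply to any cluster with Knative Serving installed:

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: hello                             # hypothetical service name
spec:
  template:
    spec:
      containers:
        - image: gcr.io/my-project/hello  # hypothetical container image
```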
The Knative project is here to stay. It is supported by around 450 individual contributors and organizations like Google, IBM, Red Hat, and Pivotal.
The fully managed Cloud Run platform is scalable. By default, it can scale out to 1,000 container instances, and you can request a higher limit. Scaling is very quick with Cloud Run, and this is why: autoscaling is based on concurrency, not on system metrics.
Most traditional autoscalers watch a group of instances and monitor system metrics. When the group gets too busy, as measured by CPU usage for instance, they add a new instance to the group. System metrics are relatively slow to collect, so when you have a sudden spike in traffic, you're always lagging behind.
Cloud Run takes a different approach: you specify how many HTTP requests a single container instance can handle at the same time. The platform uses this concurrency setting to decide when to add new instances, which is what makes scaling with Cloud Run so much faster.
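The arithmetic behind that decision can be sketched as follows. This is a simplified model I'm assuming for illustration, not the actual autoscaler, which also smooths and rate-limits its decisions over time:

```python
import math

def instances_needed(concurrent_requests: int, concurrency: int) -> int:
    """Target instance count: just enough instances that each one stays
    at or below its configured per-instance concurrency."""
    return math.ceil(concurrent_requests / concurrency)

# With the hypothetical default concurrency of 80, a spike to 1,000
# simultaneous requests asks for 13 instances right away, rather than
# waiting for CPU metrics to catch up.
print(instances_needed(1000, 80))
```

Because the input (in-flight requests) is available immediately at the load balancer, the target count reacts the moment traffic changes, instead of waiting for a metrics-collection cycle.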
When your system sees very stable, constant demand, you might think fast scaling is not that important. But consider what happens when you deploy a new version: the new version has to scale up to meet demand while the old version scales down. With fast scaling, that happens quickly, which lets you push new versions to production fast, even when your system is under load.
I’ve built a little game to get you started with Cloud Run. It lets you experience the workflow of building and deploying a serverless container from the command line. No credit card or Google Cloud project required. Check out the challenge on instruqt.com
I’m writing the O’Reilly book “Mastering Serverless Applications with Google Cloud Run”. The first two chapters are already available online for Safari subscribers. With Early Release ebooks, you get books in their earliest form: the raw and unedited content as I write it.
I think Cloud Run is a revolutionary serverless product: it will enable a whole new category of systems to become serverless. If you are interested, be sure to watch this space; I will be sharing more.