DEVOPS WEEKLY
ISSUE #421 - 20th January 2019

Some good modern operations posts this week, with a few debugging tales, discussion of monitoring, getting started with cloud services and a good introduction to service level objectives.


From our sponsor, VictorOps
======================

Incident commanders and lifecycle coordinators help organize incident response to make on-call suck less. See how DevOps teams benefit from centralizing alerting, incident response and collaboration:

http://try.victorops.com/devopsweekly/incident-lifecycle-coordinators


News
====

A great tale of debugging an incident in a world of interlinked distributed systems. Some good takeaways for anyone interested in resilient architecture as well.

https://medium.com/@daniel.p.woods/on-infrastructure-at-scale-a-cascading-failure-of-distributed-systems-7cff2a3cd2df


A long and very detailed post on how Stack Overflow do monitoring. Lots of internal dashboards, graphs and discussions of approaches and different facets of the monitoring problem.

https://nickcraver.com/blog/2018/11/29/stack-overflow-how-we-do-monitoring/


A quick introduction to Service Level Objectives, Service Level Indicators and Error Budgets for managing reliability, accompanied by an experience report from one team starting to implement them.

https://medium.com/kudos-engineering/managing-reliability-with-slos-and-error-budgets-37346665abf6


Kubernetes cluster cost overhead can be an issue, often down to the cost of idle resources. This post explores how to measure and reduce that overhead, in particular looking at CPU, memory and disk. It also explores why autoscaling probably shouldn’t be step one.

https://medium.com/kubecost/detect-overspending-by-measuring-idle-kubernetes-resources-d5d97eb205e0


An interesting article on how many metrics an individual application might expose, including examples of a few real world apps. Interesting to think about when considering capacity and cost of monitoring.

https://www.robustperception.io/how-many-metrics-should-an-application-return


When public cloud services launched they had a small number of services, and you could learn the APIs and tools fairly quickly. Not so today. This post acts as a reminder, as well as a good collection of links for anyone starting getting up to speed to with AWS or GCP.

https://peadarcoyle.com/2019/01/12/im-lost-how-do-i-learn-to-use-the-cloud/


A walkthrough of building, and debugging a simple deployment pipeline using GitLab. Takes in some certificate issues and overriding existing Docker images used in each step.

https://joaorosa.io/2019/01/13/using-flyway-and-gitlab-to-deploy-a-mysql-database-to-aws-rds-securely/


A detailed walkthrough for anyone looking to implement structured logging in Java.

https://github.com/tersesystems/terse-logback


A look at a multi-cloud data processing pipeline. Starting in AWS, this walks through a few ways of duplicating data into GCP, utilising existing services wherever possible. No mention of the cost of transfer or storage though.

https://www.vibrato.com.au/blog/towards-a-multi-cloud-serverless-data-warehouse


A quick introduction to Kubernetes networking concepts, and a basic explanation of the role of ingress, with an example using the Traefik proxy.

https://www.cloudops.com/2018/06/why-kubernetes-ingresses-arent-as-difficult-as-you-think/


Tools
====

GoFish is a cross-platform systems package manager, bringing the ease of use of Homebrew to Linux and Windows. Early days but a few packages starting to be available.

https://gofi.sh/index.html
https://github.com/fishworks/fish-food


Incident commanders and lifecycle coordinators help organize incident response to make on-call suck less. See how DevOps teams benefit from centralizing alerting, incident response and collaboration:

http://try.victorops.com/devopsweekly/incident-lifecycle-coordinators



If you received this email directly then you're already signed up, thanks! If however someone forwarded this email to you and you'd like to get it each week then you can subscribe at http://devopsweekly.com

--

You opted in for Devops Weekly at http://devopsweekly.com

You can always unsubscribe by visiting https://devopsweekly.us2.list-manage.com/unsubscribe?u=b6635e37e35fa5eff0c2a947a&id=a63f24d068&e=[UNIQID]&c=ba07f06830

If you have other queries you can contact the list maintainer at gareth@morethanseven.net

Our mailing address is 43 Gwydir Street, Cambridge, UK, CB1 2LG