I'm reminded of a time when the Ingres Engineering team had to take ownership of the Sustaining Engineering role for the first 6 months of a new release. Having a vested interest in the code quality drives better code. The SRE approach is very similar. I like it.
In my experience, production freezes cause downtime very often. Many of the "changes" that developers release are monitors, responses to new data flows from vendors, etc., without which the services are either down or irrelevant. A causal analysis will usually show how a production freeze now, even if it improves reliability, can increase crashes later, since larger bundles of features get deployed per release after the freeze.
Appreciation for the culture, tools and drive needed to make it work. Thank you.
This is just mind-blowing!
Interesting. I loved the "error budget": people are responsible enough to use it well, taking risks when it's worth it.
Great presentation.
Thank you kindly!
Thanks ❤️
Example of SRE project work starts at 9:30.
All devs need to do more SRE work if they are involved in writing software that runs a site (aka cloud).
hi
Google BigQuery ❤️