Computer Science in Practice - University of Houston

Computer Science in Practice

Shrinking Production Incidents

When: Monday, March 2, 2020
Where: PGH 563
Time: 11:00 AM

RSVP: https://forms.gle/LBbNAHGh9wK4r3EM9

Speaker: Annalee Nagami, Google

Host: Dr. Omprakash Gnawali

“Hope is not a strategy” - Google Site Reliability Engineering motto

For large-scale systems, the question is not whether something will go wrong, but when. Site Reliability Engineers manage this risk. This talk will outline strategies for detecting problems, mitigating their effects, shortening their duration, and reducing their frequency.

Bio:

Annalee Nagami is a Site Reliability Engineer at Google. Her job is to keep the account management infrastructure highly available. Annalee graduated from the University of Houston in 2011 with a Bachelor of Science in Computer Science. She started her career working in Compiler Support at Intel. In 2014, Annalee joined Google to develop internal tools. She built integration testing infrastructure and led initiatives to standardize best practices for correctness testing. In early 2019, she joined SRE to pursue her interest in building large-scale production systems.