Senior Software Engineer - Site Reliability

  • Coinbase
  • San Francisco, CA
  • Jul 10, 2018

Job Description

Our vision is to bring more innovation, efficiency, and equality of opportunity to the world by building an open financial system. Our first step on that journey is making digital currency accessible and approachable for everyone. Two principles guide our efforts. First, be the most trusted company in our domain. Second, create user-focused products that are easier and more delightful to use.

The SRE team helps realize that vision by supporting Coinbase engineering teams in building software with world-class reliability. As a core service team, the Coinbase SREs work closely with the rest of engineering. We actively seek out and gather state-of-the-art best practices from the industry at large. Through education and advocacy, we seek to ensure that reliability is a core value of our engineering culture. We level up other engineers by sharing deep knowledge, performing proactive analysis, and improving processes, tools, and automation. Ultimately, SRE succeeds when all engineering teams are able to build reliable software on their own.

Our SRE team highly values individuals with intellectual curiosity and openness. We collaborate across the organization, helping our engineers think big and take risks while building a culture of diversity, positive energy and blameless truth-seeking. We encourage self-starting on high-impact projects within the context of strong support and mentorship.

Reliability at Coinbase
  • Scale & Integrations: 5 Million Users, 33 Countries, NYSE Integrations
  • Transparency: Public Post-Mortems, Open Sourcing Our Infrastructure
  • Outreach: Deploying ElasticSearch at Coinbase
  • Blog Post: Scaling & Developer Productivity

Responsibilities
  • Educate and mentor the engineering team on improving the reliability of our systems, and advocate for reliability as a core value of the Coinbase engineering culture.
  • Work closely with product engineering teams to identify and measure SLIs and corresponding SLOs.
  • Proactively find and analyze reliability problems across our business units and stack, then design and implement software to create step-function improvements.
  • Build automation and improve systems to eliminate toil and operations work.

Requirements
  • At least 4 years of software engineering experience in industry.
  • Deep knowledge of UNIX/Linux system internals such as system calls, TCP/IP and debugging tools.
  • Understanding of data structures & algorithms, especially as they pertain to performance and reliability.
  • Ability to debug complex systems and evolve a running environment while maximizing availability.
  • Fluency in at least one dynamic programming language such as Python, Ruby or JavaScript.

Bonus points for:
  • Experience with AWS.
  • Experience with databases such as MongoDB, Redis, PostgreSQL.
  • Expertise in operating distributed systems at scale.


We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.