Senior Site Reliability Engineer

Remotive • Remote • 1 week ago

About

Virta Health is on a mission to transform diabetes care and reverse the type 2 diabetes epidemic. Current treatment approaches aren’t working: over half of US adults have either type 2 diabetes or prediabetes. Virta is changing this by helping people reverse type 2 diabetes through innovations in technology, personalized nutrition, and virtual care delivery reinvented from the ground up. We have raised over $350 million from top-tier investors, and partner with the largest health plans, employers, and government organizations to help their employees and members restore their health and live diabetes-free. Join us on our mission to reverse diabetes in 100M people.

As an SRE on the Infrastructure team at Virta, you will be building the foundation that will help our company move as fast as possible while meeting security and compliance requirements. Key projects for the team over the next two quarters include:

Implementing an AI-driven observability and metrics platform that automatically detects anomalies and highlights SLO risks, enabling product teams to make data-driven decisions.
Enhancing system observability, reliability, and efficiency using off-the-shelf technology combined with internal tools developed in Python and Go, increasing transparency and visibility into our systems and centralizing data.
Building out more products for our Product Development teams, such as observability modules (SLOs, alerting, dashboards), so they can spin up an MVP out of the box.
Improving incident readiness with better tooling and the right hygiene practices, such as game days.
Engaging with feature development teams in toil reduction exercises, capacity planning, load testing, the SLO process, and other best practices, including partnering with product teams to replace manual capacity planning with predictive/AI-driven scaling models and to codify self-healing runbooks that minimize toil.
Improving the velocity and quality of our developer platform and tooling.
General AI fluency is desired: comfort with concepts like prompt engineering, operational chatbots, and AI-assisted workflows to accelerate incident response and reliability improvements.

We are in the midst of redefining our incident response tooling and strategy, improving test tooling, and developing a strategy to ensure all applications are performant and available. Joining Virta would make you one of the key people defining and driving the future vision of what reliability and observability should look like.

Responsibilities

Ship automation and tooling that reduces toil, with high-quality, well-structured code.
Design and codify self-healing workflows and guardrails to minimize toil and improve reliability.
Steward SLO dashboards enhanced with AI/ML-assisted insights, leveraging AIOps-style observability to surface anomalies, predict error-budget burn, and improve signal quality across golden signals.
Integrate load-testing into reliability engineering efforts, ensuring outcomes directly inform SLOs, scaling strategies, and capacity planning.
Partner with product teams to replace manual capacity planning with predictive/AI-driven scaling models and implement burn-rate based alerting.
Coach and mentor engineers; champion best practices and pragmatic reliability trade-offs.

90 Day Plan

Within your first 90 days at Virta, we expect you will do the following:

Teach and inspire other engineering team members through knowledge sharing, pair programming, and giving feedback during code reviews
Propose and implement one or more process improvements related to reliability and observability to make our engineering team even better
Deliver a proof-of-concept for an AIOps initiative, demonstrating how a manual reliability or observability process can be transformed into automation to reduce toil and improve insight

Must-Haves

Highly proficient in shipping backend code in high-quality production environments, with strong hands-on coding and automation expertise, and a deep understanding of reliability and production readiness practices
Hands-on expertise with automation and infrastructure-as-code (Terraform modules preferred), ideally with experience in observability
Experience designing and implementing highly observable, scalable systems — with a proven track record configuring AIOps / ML-based monitoring platforms — that support large numbers of users while reducing operational burden
Applied and general AI fluency: ability to leverage AI/ML-assisted observability (e.g., anomaly detection, error-budget burn prediction) while also being comfortable with concepts like prompt engineering, operational chatbots, and AI-assisted workflows to accelerate incident response and reliability improvements
Growth mindset and craftsmanship: ability to coach, mentor, and evangelize AI-first insights while continually improving engineering practices and following best practices

Values-driven culture

Virta’s company values drive our culture, so you’ll do well if:

You put people first and take care of yourself, your peers, and our patients equally
You have a strong sense of ownership and take initiative while empowering others to do the same
You prioritize positive impact over busy work
You have no ego and understand that everyone has something to bring to the table regardless of experience
You appreciate transparency and promote trust and empowerment through open access to information
You are evidence-based and prioritize data and science over seniority or dogma
You take risks and rapidly iterate

Is this role not quite what you're looking for? Join our Talent Community and follow us on LinkedIn to stay connected!

As part of your duties at Virta, you may come in contact with sensitive patient information that is governed by HIPAA. Throughout your career at Virta, you will be expected to follow Virta's security and privacy procedures to ensure our patients' information remains strictly confidential. Security and privacy training will be provided.

Virta has a location-based compensation structure. Starting pay will be based on a number of factors and commensurate with qualifications and experience. For this role, the compensation range is $167,249 - $216,000. Information about Virta’s benefits is on our Careers page at: https://www.virtahealth.com/careers.

As a remote-first company, our team is spread across various locations with office hubs in Denver and San Francisco.
Clinical roles: We currently do not hire in the following states: AK, HI, RI
Corporate roles: We currently do not hire in the following states: AK, AR, DE, HI, ME, MS, NM, OK, SD, VT, WI.

