Site Reliability Engineer

July 11, 2022
Application ends: September 7, 2022

Job Description

Key Responsibilities:
Establish and maintain expert-level knowledge of all systems & applications
Leverage a suite of SaaS-based observability tools to ensure our platform is scalable, fault tolerant, and highly available
Take ownership of reported customer issues and see them through to resolution
Attend in-person meetings with clients to analyze, troubleshoot and diagnose technical and data-related issues
Coordinate resolutions via Jira tickets with the Technology Development team
Test resolutions in conjunction with the Quality Assurance team
Communicate resolution of technical issues to Client Success team members and internal stakeholders
Partner with Level 1 support to ensure timely resolution of support issues
Collaborate with the Product Development team to fix areas with high issue volume
Improve operations by conducting systems analysis and recommending changes in policies and procedures
Update job knowledge by studying best practices in technical support
Work with teams across the organization to build and maintain monitorable, performant, reliable, and highly scalable software systems
Participate in timely post-mortems of production incidents

Critical Skills/Experience:
5+ years of experience developing and monitoring mission-critical systems
Experience with SaaS-based observability tools, such as CloudWatch, Sentry, New Relic, Datadog, and Uptime
Working knowledge of and passion for automating software delivery processes
Proven track record of designing and building top-tier monitoring and alerting infrastructure
Experience administering SaaS-based application cloud environments, preferably AWS
Experience administering CI/CD pipelines
Thorough understanding of security & compliance best practices
Strong written and verbal communication skills with both internal team members and external customers with varying levels of technical knowledge
Strong initiative to find ways to improve solutions, operations, and processes
Internally motivated, with the ability to work proficiently both independently and in a team environment
A roll-up-your-sleeves, GSD approach to the day-to-day