Site Reliability Engineer (req-174), CATHEXIS, VA, US

Job Description

Team CATHEXIS elevates the government contracting experience through rapid response, deep skill, and thoughtful problem-solving and communication. Our core capabilities are our top-tier program and project management, data analytics, and audit services, the backbone of which is our integrated approach to operational excellence.

You worked hard to get to where you are. You strive to make every day better than the day before. So do we. Team CATHEXIS operates with an all-in mindset. We are working together to create a company that supports our shared values and individual goals. Our values are centered around Respect, Engagement, Customer Service, Integrity, Teamwork, and Excellence in everything we do for our employees, clients, partners, and communities. We believe success is best when we listen and lead with empathy; model high standards of ethics to provide a rewarding candidate experience; work hard, have fun, and appreciate the strengths we all bring to the team; and empower our employees to create innovative and trusted results.

We are looking for a dynamic Site Reliability Engineer (SRE) to join our team.  The Site Reliability Engineer (SRE) will manage, monitor, and optimize our clusters on Kubernetes. Together, we’re accelerating our clients’ digital transformation through the building and deployment of data-driven, scalable AI solutions.  The ideal candidate will have a deep understanding of Kubernetes, Cloud Infrastructure, and Infrastructure as Code (IaC) practices. You will be responsible for ensuring the reliability and scalability of our Kubernetes clusters and Cloud Infrastructure.

Responsibilities:

  • Monitor and Manage Kubernetes Clusters: Ensure the stability, health, and scalability of Kubernetes Clusters, deploying applications and services on Kubernetes
  • Kubernetes Management: Deploy, monitor, and scale applications on Kubernetes clusters. Maintain Helm charts, manage services, and ensure resource allocation for optimal cluster performance
  • Cloud Infrastructure Management: Work with leading Cloud Platforms (AWS, GCP, Azure) to set up, configure, and manage infrastructure resources using Infrastructure as Code (Terraform, CloudFormation, etc.)
  • Monitoring & Incident Response: Set up monitoring solutions, define alerts, and manage the incident response process for any issues related to Jenkins, or Kubernetes clusters
  • Automate Infrastructure Processes: Build automation tools for scaling, monitoring, and maintaining infrastructure using modern tools like Terraform, Ansible, or equivalent
  • Collaborate Across Teams: Work closely with development, services, and operations teams to ensure a seamless integration between application development and infrastructure
  • Security & Compliance: Ensure all systems follow best practices in terms of security and compliance with relevant regulations. This includes role-based access, encryption, and automated vulnerability scanning

Requirements:

  • Active Secret Clearance is required
  • Bachelor’s degree (or equivalent) in computer science or related discipline
  • A minimum of two(2) years of experience working with on-premise and off-premise cloud environments
  • Experience with AWS, Azure and / or GCP
  • Ability to program (structured and OOP) using one or more high-level languages, such as Python, Java, C/C++, Ruby, and JavaScript
  • Experience with distributed storage technologies such as NFS, HDFS, Ceph, and Amazon S3, as well as dynamic resource management frameworks (Apache Mesos, Kubernetes, Yarn)
  • Proactive approach to identifying problems, performance bottlenecks, and areas for improvement
  • Agile/Scrum experience

CATHEXIS offers competitive compensation packages to all eligible employees. Our goal is to provide a compensation package that reflects the value you bring to our team, is competitive with market rates, and promotes your financial security and personal well-being. The annual salary range for this role is $136,000 - $170,000. Please note that the salary information provided is a general guideline. CATHEXIS considers various factors in its final offer, including location, qualifications, experience, and skills. 

 Benefits

  • Performance Bonuses
  • Medical Insurance
  • Dental Insurance
  • Vision Insurance
  • 401(k) Plan (Traditional and ROTH)
  • Life Insurance (Basic, Voluntary & AD&D)
  • Paid Time Off
  • 11 Federal Holidays
  • Parental Leave
  • Commuter Benefits
  • Short Term & Long Term Disability
  • Training & Development
  • Wellness Program
  • Community Outreach Initiatives

CATHEXIS is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to sex, gender identity, sexual orientation, race, color, religion, national origin, disability, protected Veteran status, age, or any other characteristic protected by law. If you are an individual with a disability and would like to request a reasonable accommodation as part of the employment selection process, please contact the Recruiting@cathexiscorp.com.

AI-Powered Job Matching

Get personalized insights and tailored applications with our AI tools:

AI Match Scoring

Get your exact compatibility score for each job based on your CV and experience

CV Tailoring

Automatically optimize your CV for each specific job application

Gap Analysis

Identify missing skills and get actionable improvement recommendations

Start Free Today

No credit card required • 100% free to start

Get Your Personal Job Feed

Join thousands of professionals getting AI-powered job recommendations tailored to their skills.

Daily job alerts matching your profile
AI match scores for every job
One-click CV tailoring
Application tracking
Get Started Free

Frequently Asked Questions about Site Reliability Engineer (req-174) Jobs in VA, US