Site Reliability Engineer (SRE) - Kubernetes

Job opening ID

Posting title
Site Reliability Engineer (SRE) - Kubernetes

Roles and responsibilities
Estimated duration is long term with the potential for extension or conversion.
W-2 employment with full benefits.
Customer- and contract-specific training will be required and provided.

Job Description:
Provide an Enterprise Software Engineer who will use software engineering expertise to ensure availability, low latency, performance, and capacity for our Enterprise Search platform and an ever-expanding portfolio of mission-critical web applications hosted on a cloud-native distributed container orchestration platform. In addition to traditional systems operations responsibilities, the engineer will fix, extend, and scale the code to keep it working and harden it against the ever-evolving demands of our missions. The role requires both systems and software experience, applying expertise in coding, algorithms, complexity analysis, and large-scale system design to tackle complex problems and continually improve the reliability of JPL systems and processes.
●Design, write, and deliver software to improve the availability, scalability, latency, and efficiency of JPL production systems.
●Work with other JPL application development teams to provide access to resources and guidance, and to optimize deployments.
●Solve problems relating to mission-critical services and build automation to prevent problem recurrence, with the goal of automating the response to all non-exceptional service conditions.
●Influence and create new designs, architectures, standards, and methods for large-scale distributed systems.
●Engage in service capacity planning and demand forecasting, software performance analysis, and system tuning.
●Perform periodic on-call duties.

Required Skills:
●Must be a US Citizen or Green Card Holder
●Offer contingent on ability to successfully pass a background check and drug screen
●Typically requires a Bachelor’s Degree in Computer Science or related discipline and nine (9) years of related work experience
●At least ten (10) years of experience in software development/engineering, including requirements analysis, software development, installation, integration, evaluation, enhancement, maintenance, testing, and problem diagnosis/resolution
●At least five (5) years of experience in engineering software products used in large, complex information systems, enterprise IT environments, and preferably cloud environments
●Five (5) or more years of experience developing software to deploy web and application platforms on Unix/Linux operating systems, from kernel to shell and beyond, taking in system libraries, file systems, and client-server protocols along the way.
●Familiarity with running web services at scale; understanding of Linux systems internals and networking.
●Systematic problem-solving approach, coupled with a strong sense of ownership and drive.
●Experience in one or more of: Python, Java, Ruby, Go, C, C++.
●Experience with Amazon Web Services: Route 53, ELB, EC2, RDS, S3, EBS, SQS, etc.
●Experience in:

Desired Skills:
●MS degree in Computer Science or related technical field.
●Expertise in designing, analyzing and troubleshooting large-scale distributed systems.
●Experience with Docker and containerized application development patterns.
●Experience in one or more container orchestration technologies: Kubernetes, Docker Swarm, Mesosphere DC/OS.
●Experience with Continuous Integration and Continuous Deployment technologies such as Jenkins or CircleCI.
●Experience with algorithms, data structures, complexity analysis and software design.
●Networking: knowledge and understanding of network theory, such as different protocols (TCP/IP, UDP, ICMP, etc.), MAC addresses, IP packets, DNS, OSI layers, and load balancing.
●Experience in:
-Elasticsearch and ELK stack
-.NET (and .NET Core)
-Postgres, Oracle

Number of positions