Cloud Site Reliability Engineer (ISD Engineer III)

Location: VA Winchester - Operations
Full/Part Time: Full-Time
Regular/Temporary: Regular

Job Description


You have goals, dreams, hobbies and things you’re passionate about.

What’s Important to You Is Important to Us
We’re looking for people who not only want to do meaningful, challenging work, keep their skills sharp and move ahead, but who also take time for the things that matter to them—friends, family and passions. And we're looking for team members who are passionate about our mission—making a difference in military members' and their families' lives. Together, we can make it happen.

Don’t take our word for it.

• Military Times 2021 Best for Vets Employers
• WayUp Top 100 Internship Programs
• Forbes® 2022 The Best Employers for New Grads
• Forbes® America's Best Employers
• Newsweek Top 100 Most Loved Workplaces
• Fortune Best Workplaces for Women
• Fortune 100 Best Companies to Work For®
• Computerworld® Best Places to Work in IT

Basic Purpose

The Site Reliability Engineer is a member of the Cloud Team and provides support for software development, operations and maintenance while working with complex infrastructure to improve performance, visibility, stability, availability and reliability using automated solutions. This role will provide Tier 3 support, either directly or by engaging with other stakeholders, for applications and platforms residing in the Cloud. The ideal candidate has hands-on experience with, and an understanding of, the software development lifecycle from inception to implementation. The successful candidate should understand how software is maintained and will be responsible for ensuring the reliability and speed of the software.


•    Monitor alerts, metrics and logs to detect incidents and events, and correlate them to find the root cause of outages.
•    Conduct Post-Incident Reviews with various roles, including developers, infrastructure engineers, product owners, system owners and information security, to identify the cause and a solution through automation that improves the agility and performance of the system.
•    Work with other Site Reliability Engineers to drive standards and consistency around best practices.
•    Create and modify runbooks and knowledge base articles that other engineers can follow to resolve incidents quickly. Identify opportunities and implement the automation needed to address and prevent operational issues.
•    Understand and modify existing code and scripts used to automate the building of applications and infrastructure. Clearly identify and enable new alerts and monitors for critical services impacting system reliability.
•    Drive increased efficiency across the teams, eliminating duplication, leveraging common DevOps processes, tools, and technology.
•    Collaborate with the team in defining architecture; identify potential risks to successful implementation.
•    Work closely with business partners and software development teams in a matrixed organization structure.
•    Automate tasks to reduce manual work, reduce outages, and enhance customer and employee experience.
•    Communicate and resolve complex production issues and implement preventative measures.
•    Implement and tune monitoring, metric collection and alerting.

•    Solid hands-on experience in setting up and correlating SIEM monitoring tools including but not limited to: Azure Sentinel, Azure Log Analytics, Azure Monitor, Application Insights, Splunk, Moogsoft and CA APM/Wily Introscope.
•    Senior-level software development experience building applications with technologies such as Java, Spring Boot, Spring Framework, .NET Core, Angular, React and Vue.js
•    Hands-on experience with a variety of database technologies, including relational databases such as Azure SQL, SQL Server, MySQL and PostgreSQL, or NoSQL databases such as Azure Cosmos DB and MongoDB, etc.
•    Hands-on experience integrating systems with REST APIs, databases (RDBMS), LDAP, Active Directory, Azure Active Directory, RabbitMQ, Redis Cache and Azure Functions (serverless)
•    Hands-on experience deploying applications to production through automated CI/CD pipelines or automated scripts using tools such as Maven, Gradle, Docker, Git, JUnit, MSTest, Tomcat, SonarQube, Fortify, Selenium, Cucumber, Contrast Security, etc.
•    Understanding and experience delivering Twelve-Factor cloud-native applications
•    Understanding and experience with Microservices architecture
•    Knowledge, understanding and experience using ticketing systems for catalogs and change management, such as ServiceNow, HP ITSM and BMC Remedy.
•    Excellent communication and coordination skills to interact with both technical and non-technical stakeholders

Desired Skills 

•    Knowledge, understanding and experience with DevOps and Agile methodologies
•    Experience with Microsoft Azure technologies
•    Experience with Tanzu Application/Container Services (TAS/TKS) (previously Pivotal Cloud Foundry) or equivalent container-based platforms/products like OpenShift, Azure Kubernetes Service, Google Container Services, etc.
•    Experience using ServiceNow ITOM and ITSM to create catalogs or to automate processes by integrating with other systems
•    Knowledge and understanding of how software is built and managed

Hours: Monday - Friday, 8:00AM - 4:30PM

Location: 820 Follin Lane, Vienna, VA 22180 | 5550 Heritage Oaks Dr. Pensacola, FL 32526 | 141 Security Dr. Winchester, VA 22602 | Remote

Navy Federal is now hybrid! Our standard enterprise requirement for a hybrid schedule is to report on-site 4-16 days each month. The number of days reporting on-site will ultimately be determined by the employee's leadership and business unit needs. You will learn more throughout the hiring and onboarding process.

Salary Range: $83,100 - $156,100 annually

Navy Federal Credit Union assesses market data to establish salary ranges that enable us to remain competitive. You are paid within the salary range, based on your experience, location and market position.

Equal Employment Opportunity

Navy Federal values, celebrates, and enacts diversity in the workplace.  Navy Federal takes affirmative action to employ and advance in employment qualified individuals with disabilities, disabled veterans, Armed Forces service medal veterans, recently separated veterans, and other protected veterans.  EOE/AA/M/F/Veteran/Disability

COVID-19 Safety Protocols

All employees are expected to follow our COVID-19 safety protocols.


Navy Federal reserves the right to fill this role at a higher/lower grade level based on business need. An assessment may be required to compete for this position.

Bank Secrecy Act

Remains cognizant of and adheres to Navy Federal policies and procedures, and regulations pertaining to the Bank Secrecy Act.

Employee Referrals

This position is eligible for the TalentQuest employee referral program. If an employee referred you for this job, please apply using the system-generated link that was sent to you.