Cloud Site Reliability Engineer (ISD Engineer IV)

Location: VA Vienna - Headquarters
Full/Part Time: Full-Time
Regular/Temporary: Regular

Job Description


You have goals, dreams, hobbies and things you’re passionate about.

What’s Important to You Is Important to Us
We’re looking for people who not only want to do meaningful, challenging work, keep their skills sharp and move ahead, but who also take time for the things that matter to them—friends, family and passions. And we're looking for team members who are passionate about our mission—making a difference in military members' and their families' lives. Together, we can make it happen.

Don’t take our word for it.

  • FORTUNE 100 Best Companies to Work For®
  • Computerworld® Best Places to Work in IT
  • FORTUNE® Best Workplaces for Millennials
  • Forbes® America’s Best Employers


Basic Purpose

The Site Reliability Engineer is a member of the Cloud Team and provides support for software development, operations and maintenance while working with complex infrastructure to improve performance, visibility, stability, availability and reliability using automated solutions. This role will provide Tier 3 support, either directly or by engaging with other stakeholders, for applications and platforms residing in the Cloud. The ideal candidate has hands-on experience with and an understanding of the software development lifecycle from inception to implementation. The successful candidate should have knowledge and understanding of software maintenance and will be responsible for ensuring the reliability and speed of the software.


-    Monitor alerts, metrics and logs to detect incidents and events, and correlate them to find the root cause of outages.
-    Conduct Post-Incident Reviews with developers, infrastructure engineers, product owners, system owners and information security to identify the cause of each incident and an automation-based solution that improves the agility and performance of the system.
-    Work with other Site Reliability Engineers to drive standards and consistency around best practices.
-    Create and modify runbooks and knowledge base articles that other engineers can follow to resolve incidents quickly. Identify opportunities and implement the automation needed to address and prevent operational issues.
-    Understand and modify existing code and scripts used to automate the building of applications and infrastructure. Identify and enable new alerts and monitors for critical services that affect system reliability.
-    Drive increased efficiency across teams by eliminating duplication and leveraging common DevOps processes, tools, and technology.
-    Collaborate with the team to define architecture and identify potential risks to successful implementation.
-    Work closely with business partners and software development teams in a matrixed organization structure.
-    Automate tasks to reduce manual work, reduce outages, and enhance customer and employee experience.
-    Communicate and resolve complex production issues and implement preventative measures.
-    Implement and tune monitoring, metric collection and alerting.

Required Skills 

-    Solid hands-on experience in setting up SIEM and monitoring tools and correlating their data, including but not limited to: Azure Sentinel, Azure Log Analytics, Azure Monitor, Application Insights, Splunk, Moogsoft and CA APM/Wily Introscope.
-    Senior-level software development experience building applications with technologies such as Java, Spring Boot, Spring Framework, .NET Core, Angular, React and Vue.js
-    Hands-on experience with a variety of database technologies, including relational databases such as Azure SQL, SQL Server, MySQL and PostgreSQL, and NoSQL databases such as Azure Cosmos DB and MongoDB.
-    Hands-on experience integrating systems with REST APIs, databases (RDBMS), LDAP, Active Directory, Azure Active Directory, RabbitMQ, Redis Cache and Azure Functions (serverless)
-    Hands-on experience in deploying applications to production through automated CI/CD pipelines or automated scripts using tools such as Maven, Gradle, Docker, Git, JUnit, MSTest, Tomcat, SonarQube, Fortify, Selenium, Cucumber and Contrast Security.
-    Understanding and experience delivering Twelve-Factor cloud-native applications
-    Understanding and experience with Microservices architecture
-    Knowledge, understanding and experience using ticketing systems for catalogs and change management, such as ServiceNow, HP ITSM or BMC Remedy.
-    Excellent communication and coordination skills to interact with both technical and non-technical stakeholders

Preferred Skills 

-    Knowledge, understanding and experience of DevOps and Agile Methodologies
-    Experience in Microsoft Azure Technologies
-    Experience in Tanzu Application/Container Services (TAS/TKS) (previously Pivotal Cloud Foundry) or equivalent container-based platforms/products such as OpenShift, Azure Kubernetes Service, Google Container Services, etc.
-    Experience using ServiceNow ITOM and ITSM to create catalogs or to automate processes by integrating with other systems
-    We highly encourage SREs, DevOps engineers, application developers, system developers and system engineers who have knowledge and understanding of how software is built and managed to apply

Hours: Monday - Friday, 8:00am - 4:30pm

Location: 820 Follin Lane, Vienna, VA 22180

*Due to COVID-19 and social distancing, this position will be temporarily working from home with plans to return to campus at the desired location listed once Navy Federal is back to normal operations. The specific logistics for returning to campus will be determined at a future date by individual leadership* 

Equal Employment Opportunity

Navy Federal values, celebrates, and enacts diversity in the workplace.  Navy Federal takes affirmative action to employ and advance in employment qualified individuals with disabilities, disabled veterans, Armed Forces service medal veterans, recently separated veterans, and other protected veterans.  EOE/AA/M/F/Veteran/Disability


Navy Federal reserves the right to fill this role at a higher/lower grade level based on business need.
An assessment may be required to compete for this position.

Bank Secrecy Act

Remains cognizant of and adheres to Navy Federal policies and procedures, and regulations pertaining to the Bank Secrecy Act.


Employee Referrals

This position is eligible for the TalentQuest employee referral program. Please indicate the employee who referred you when applying.