Sr Principal Engineer IT Resiliency

  • Posted on: Dec 15, 2024
  • Appsierra Group
  • Pune, Maharashtra
  • Salary: Not Available
  • Full-time

Job Title :

Sr Principal Engineer IT Resiliency

Job Type :

Full-time

Job Location :

Pune, Maharashtra, India

Remote :

No

Job Description :

Job Mode : Hybrid
Notice period : 15 days or 30 days

Roles & Responsibilities

  End-to-End Engineering Leadership: Oversee the design and implementation of resilient engineering across the technology domains.

  Cloud and On-Premises Infrastructure Expertise: Design and review resilient solutions in both cloud-based and on-premises environments.

  Chaos Engineering Infrastructure Initiatives: Lead chaos engineering efforts to proactively identify and mitigate potential system weaknesses.

  Standards for Monitoring and Alerting: Collaborate with teams to evolve existing standards for system monitoring and alerting to ensure rapid detection and response.

  Resiliency Architecture Reviews: Represent the IT Resiliency Office during the Architectural Review Board.

  Enterprise-wide Collaboration and stakeholder management: Collaborate with various teams across the organization to align and prioritize resiliency and recovery efforts.

  Automation: Expertise with Infrastructure as Code (IaC) and automation tools such as Ansible.

  Incident Response and Recovery: Participate in the post-mortem process following a major incident to identify opportunities for enhancing resiliency.

  Development: Evangelize standards and practices across the Technology organization to strengthen our resiliency posture.

  Reporting and Documentation: Develop standardized, regular reporting on resilience activities, risks, and improvements for the leadership team.

 

Experience & Qualifications:

  • Bachelor's degree or equivalent experience.

  • 5-10 years of experience in platform engineering, with a focus on IaC, DevOps practices, and orchestration tools.

  • Preferred but not required: experience as a team lead or hands-on technical manager who can engage with projects and deliver them to completion.

  • A track record of successfully architecting and deploying enterprise-level solutions that prioritize system uptime and data integrity across various operational scenarios.

  • Demonstrated ability to design and implement systems that ensure high availability, support massive transaction volumes, and facilitate seamless disaster recovery processes.

  • Infrastructure and service architecture & engineering experience, including functional and technical requirements gathering, and solution development.

  • Strong dedication to customer needs, with excellent communication and the ability to build lasting relationships, alongside the capability to articulate complex resilience strategies in a clear and impactful manner.

  • Deep insight into the complexities of multi-AZ and multi-Region cloud platforms, with a keen understanding of how these impact system resilience and disaster recovery planning.

  • Proven experience in the ongoing management of mission-critical systems that require constant uptime, including out-of-hours support and rapid response to incidents.

  • Knowledgeable in evaluating and deciding on trade-offs between consistency, availability, and partition tolerance, especially in the context of system failures and recovery strategies.

  • Well-versed in various cloud service models such as SaaS, PaaS, and IaaS, with hands-on experience in designing resilient services on leading public cloud platforms.

  • Proficient in Chaos Engineering principles and practices, with experience in designing and conducting experiments to validate the system's capability to withstand turbulent conditions.

  • Skilled in implementing observability solutions that provide real-time insights into the performance and health of systems, aiding in proactive issue detection and resolution.

  • Practical experience operating in an Agile development environment.

Position Details

Posted:

Dec 15, 2024

Employment:

Full-time

Salary:

Not Available

Snaprecruit ID:

SD-PIT-93c5a0ffc676c5e60594e2f59f031b0e1cf2e812c82ccdbca821e964e905ccce

City:

Pune

Job Origin:

PITCHNHIRE
