Role W Sr Sre Site Reliability

Posted on: Sep 18, 2024

Comprehensive Resources INC
Cosmos, Minnesota
Salary: Not Available
Full-time

Job Title: Role W Sr Sre Site Reliability

Job Type: Full-time

Job Location: Cosmos, Minnesota, United States

Remote:

Job Description:

Sr. SRE (Site Reliability Engineer) - Data DevOps/DataOps, NoSQL, Kafka, Databricks, Kubernetes, Terraform

Seattle-based client

Location: Seattle, WA. The candidate needs to come to the office 3 days a week.

Visa: any

Duration: 10-12 months

Important note:

This is a Sr. SRE role, not a DevOps role.

Kubernetes: expert skill level is required

Kafka: expert skill level is required

Terraform: expert skill level is required

Databricks: intermediate skill level is OK

NoSQL databases (Cassandra, Mongo, Postgres): very important for this role

Please match skills before submitting resumes.

Core skills needed:

Azure Cloud, AKS: scalability, monitoring, deployment, checking logs, ensuring node and pod health.

Databases include Cassandra, Mongo, and Postgres.

Databricks Notebooks: there are a lot of jobs on Databricks, so the candidate needs enough Databricks experience to know how a notebook is created and run, to run queries against the database, and to find discrepancies and perform fixes.

Based microservices, responsible for deployment; the scripting language is Python.

Should have an understanding of Terraform.

Emphasis on logs and monitoring (Datadog and Splunk).

Summary of Experience

Requires 10-12 years of experience in the IT industry.
Requires 9+ years of software development and DevOps engineering.
Experience working in a cloud environment; Azure preferred.
Experience with Kubernetes; Azure Kubernetes Service (AKS) preferred.
Experience using Kafka, Event Hub, NATS, or any other messaging broker.
Experience with Cassandra, PostgreSQL, Mongo, Elasticsearch, and Cosmos DB.
Experience with Azure DevOps, Jenkins, Python, Terraform, and Ansible.
Experience with Databricks.
Experience with Datadog, Splunk, or other logging and APM tools.
Experience working in a Linux environment.

Summary of Key Responsibilities

Responsibilities and essential job functions include but are not limited to the following:

• Responsible for the health of the production system

• Develop monitoring dashboards

• Configure alerts and automate processes for system recovery

• Monitor alerts and take proactive steps to resolve system issues

• Troubleshoot production issues

• Lead production troubleshooting calls

• Responsible for patches and updates on production systems.

• Design and build cutting-edge, multi-microservice solutions to support Starbucks's growth worldwide.

• Help the CI/CD team roll out applications and infrastructure globally.

• Collaborate with the development team and other Information Technology (IT) teams' developer leads. Initiate process improvements for new and existing systems.

• Participate in a production support rotation that includes pager responsibilities.

• Accurately break down complex application designs into component deliverables and estimate design and development timelines.