Site Reliability Engineer (SRE)

TekWissen LLC

Atlanta, Georgia

Full-time
Salary: 85 per hour

Posted on: Sep 09, 2024

JOB TITLE:

Site Reliability Engineer (SRE)

JOB TYPE:

Full-time

JOB LOCATION:

Atlanta, Georgia, United States

REMOTE:

Yes

JOB DESCRIPTION:

Overview:

TekWissen Group is a workforce management provider operating throughout the USA and in many other countries. Our client is a provider of digital technology and transformation, information technology, and services.

Position: Site Reliability Engineer (SRE)

Location: Atlanta, GA / Frisco, TX

Duration: 9 Months

Job Type: Contract

Work Type: Remote

Job Description:

Client's telecommunications practice is looking for dynamic and driven professionals to join a rapidly growing, high-performance team.
Our client is a leading provider of digital Global System for Mobile Communications/wireless voice and data technology standards.
Position duties and responsibilities include, but are not limited to:
Provide consulting services for improved system stability, availability, performance, and reliability.
Assist in determining the impact of operational issues and provide input into their resolution via data extraction and quantification.
Work through day-to-day support issues, ensure effective and timely resolution of issues in the production environment, and troubleshoot customer-impacting issues.
Forecast and plan for a rapidly growing environment.
Support multiple applications, specifically those running on Kubernetes/Java-based systems in an enterprise environment.
Apply monitoring and create complex alerts and dashboards for production systems using Grafana and Prometheus.
Provide capacity analysis and tuning analysis for cloud applications on a Linux and container platform.
Be available to provide 24x7 on-call support on a rotating basis with other team members.
Lead efforts in troubleshooting, recovery, and root cause investigation.
Perform analysis of user requirements and problems to automate or improve systems, and review system capabilities, workflow, and scheduling limitations.
Facilitate DR (Disaster Recovery) exercises to ensure that the team is fully prepared for any event.
Lead root cause analysis sessions to understand what causes issues in production and come up with solutions that will prevent them from recurring.
Ensure documentation is created and remains updated for any related work.
Strong understanding of UNIX operating systems and any scripting language.

Skill requirements:

Strong experience with infrastructure and support.
Strong experience with Linux OS.
Strong experience with Kubernetes.
Experience with Cloud Native Applications.
Experience with REST or SOAP API support.
Experience with tools such as Docker, Postman, SoapUI, ELK, AppDynamics, CI/CD tools and GitLab, Prometheus, and Grafana.
Good experience in performance measurement and tuning, capacity planning and management, and contingency and disaster recovery.
Strong scripting knowledge and experience, preferably in Python.
Good understanding of networking and routing.

Mandatory Skills:

Kubernetes, DevOps, Reliability Engineering, Python
SRE with Kubernetes; advanced Python scripting experience.

TekWissen Group is an equal opportunity employer supporting workforce diversity.

Position Details

POSTED:

Sep 09, 2024

EMPLOYMENT:

Full-time

SALARY:

85 per hour

SNAPRECRUIT ID:

SD-b6bbd78381e61e01ad9ca85f6273b98a539adbd7fc436395f529f1178862d9bd

CITY:

Atlanta
