Senior Site Reliability Engineer Uscanada

  • Posted on: Nov 06, 2024
  • DataVisor
  • Research Triangle Park, North Carolina
  • Salary: Not Available
  • Full-time

Job Title :

Senior Site Reliability Engineer Uscanada

Job Type :

Full-time

Job Location :

Research Triangle Park North Carolina United States

Remote :

No

Job Description:

DataVisor is a next-generation security company that uses industry-leading unsupervised machine learning to detect fraudulent activity in financial transactions, mobile user acquisition, social networks, and commerce, as well as money laundering. Our solution is used by some of the largest internet properties in the world, including Pinterest, FedEx, AirAsia, Synchrony Financial, Zomato and Ping An, to protect them from the ever-increasing risk of fraud. Our award-winning software is powered by a team of world-class experts in big data, security, and scalable infrastructure. Our culture is open, positive, collaborative, and results-driven. Come join us!

We are seeking a Senior Site Reliability Engineer (SRE) to join our growing team. The ideal candidate will have a passion for building reliable systems, experience with automation, and a solid understanding of large-scale distributed systems. You will work closely with the engineering team to improve reliability, scalability, and performance across our infrastructure.

You will report directly to the CTO and work with a team of seasoned engineers to automate our production environment, increase its reliability, and enhance its security. Projects include scaling our global, multi-cloud footprint, optimizing our large real-time decision platform, and improving the reliability of our global cloud footprint.

Requirements

  • 5+ years of experience with production environments running Linux
  • 3+ years of experience working with cloud solutions such as AWS, Azure, or Aliyun
  • Familiarity with big data technology such as Spark and/or Flink
  • A love of automating tasks through coding and scripting
  • Experience with algorithms, data structures, complexity analysis, and software design
  • Ability to code well in Python, Java, and Bash

Key Responsibilities:

  • Design, implement, and maintain release automation pipelines to streamline the deployment process.
  • Develop systems for proactive monitoring, auto-diagnosis, and incident resolution in production environments.
  • Work with big data platforms such as Apache Spark or Apache Flink, optimizing and scaling our data processing pipelines.
  • Perform maintenance and troubleshooting for databases, with preference for experience in Yugabyte, ClickHouse, and MySQL.
  • Ensure the reliability of cloud infrastructure using Kubernetes on AWS or GCP.
  • Participate in on-call rotation to ensure system reliability, with a focus on automation to minimize manual intervention.
  • Collaborate with engineering teams to improve system performance and manage capacity planning.

Preferred Experience

  • Familiarity with container technology such as Docker and Kubernetes
  • Experience with database best practices on Yugabyte, ClickHouse, MySQL, and similar systems
  • Strong understanding of security best practices
  • Past completion of a SOC 2/PCI certification is a big plus

Benefits

  • Health insurance
  • PTO and sick days
  • 401(k) plan

Position Details

Posted:

Nov 06, 2024

Employment:

Full-time

Salary:

Not Available

Snaprecruit ID:

SD-WOR-29d3ef64e1d8889759e60daa0200d8dfa8b696bb061b4f6c9d234e96b5340b8b

City:

Research Triangle Park

Job Origin:

WORKABLE_ORGANIC_FEED

