Data Engineer (The Data Pipeline Architect)

  • Posted on: Oct 16, 2024
  • Company: Unreal Gigs
  • Location: San Francisco, California
  • Salary: Not Available
  • Job Type: Full-time

Job Title: Data Engineer (The Data Pipeline Architect)
Job Type: Full-time
Job Location: San Francisco, California, United States
Remote: No

Job Description:

Are you passionate about building data infrastructure that powers advanced analytics and machine learning? Do you thrive on transforming raw data into well-organized, accessible, and reliable datasets that fuel data-driven decision-making? If you’re excited about working with cutting-edge data technologies and architecting scalable pipelines, then our client has an exciting opportunity for you. We’re looking for a Data Engineer (aka The Data Pipeline Architect) to design, develop, and optimize the data systems that form the backbone of our products.

As a Data Engineer at our client, you’ll be responsible for constructing efficient, scalable data pipelines, ensuring data is accessible and usable for analysts, data scientists, and business stakeholders. You’ll work with large datasets, implement ETL processes, and build the infrastructure that powers analytics and AI-driven insights.

Key Responsibilities:

  1. Design and Develop Data Pipelines:
    • Build and maintain robust, scalable, and efficient data pipelines to ingest, process, and store data from a variety of sources. You’ll design ETL (Extract, Transform, Load) processes to move and transform data, ensuring data integrity and accuracy. (A minimal sketch of such a pipeline appears after this list.)
  2. Data Warehouse Management:
    • Architect and maintain data warehouses or data lakes using cloud platforms (e.g., Amazon Redshift, Google BigQuery, or Snowflake) to organize and store large-scale datasets. You’ll ensure the infrastructure is optimized for fast querying and scalability.
  3. Collaborate with Data Scientists and Analysts:
    • Work closely with data scientists, analysts, and other stakeholders to understand data requirements and deliver datasets that meet business needs. You’ll provide clean, well-structured data to enable advanced analytics and machine learning projects.
  4. Data Quality and Governance:
    • Implement data quality checks and monitoring systems to ensure the accuracy, completeness, and consistency of data across the pipeline. You’ll help establish data governance standards and policies to ensure compliance and security.
  5. Performance Optimization:
    • Optimize the performance of data systems, ensuring fast and reliable data access. You’ll tune queries, design efficient storage architectures, and implement best practices for data retrieval and processing.
  6. Automation and Monitoring:
    • Automate data workflows, pipeline deployments, and data quality checks to minimize manual intervention. You’ll set up monitoring and alerting systems to detect issues early and ensure smooth operation of data pipelines.
  7. Data Security and Compliance:
    • Implement security protocols to protect sensitive data, ensuring compliance with relevant regulations and standards such as GDPR, HIPAA, or SOC 2. You’ll work with security teams to enforce access controls, encryption, and data privacy best practices.
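
To make the pipeline work above concrete, here is a minimal sketch of a daily ETL job written as an Airflow DAG, one common orchestrator for this kind of work. It is illustrative only: the posting does not specify a stack, and the DAG ID, sample records, and placeholder extract/transform/load logic are all hypothetical.

```python
# Illustrative sketch only: a minimal daily ETL DAG.
# Every identifier here (dag_id, sample data, task logic) is a
# hypothetical placeholder, not part of the posting.
import json
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract() -> str:
    # A real pipeline would pull raw records from an API, queue, or object store.
    return json.dumps([{"user_id": "1", "amount": "42.50"}])


def transform(ti) -> str:
    # Cast types and drop malformed rows to protect data integrity.
    rows = json.loads(ti.xcom_pull(task_ids="extract"))
    clean = [
        {"user_id": int(r["user_id"]), "amount": float(r["amount"])}
        for r in rows
        if r.get("user_id") is not None and r.get("amount") is not None
    ]
    return json.dumps(clean)


def load(ti) -> None:
    # A real pipeline would write to a warehouse table
    # (e.g., Redshift, BigQuery, or Snowflake); here we just print.
    print(ti.xcom_pull(task_ids="transform"))


with DAG(
    dag_id="example_daily_etl",  # hypothetical name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",           # Airflow >= 2.4; older versions use schedule_interval
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    transform_task = PythonOperator(task_id="transform", python_callable=transform)
    load_task = PythonOperator(task_id="load", python_callable=load)

    # Enforce ordering: extract, then transform, then load.
    extract_task >> transform_task >> load_task
```

The same extract/transform/load shape scales from this toy example to the multi-source pipelines the role describes; the orchestrator mainly adds scheduling, retries, and monitoring around it.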

Requirements

Required Skills:

  • Data Engineering Expertise: Strong experience building and maintaining data pipelines, ETL processes, and data warehouses using cloud platforms (AWS, GCP, Azure). You’re skilled at handling large, complex datasets efficiently.
  • Programming and Scripting: Proficiency in languages such as Python, SQL, or Scala, and experience with data engineering tools like Apache Spark, Airflow, or Kafka. You can write efficient code to process and transform large datasets.
  • Data Warehousing and Storage: Expertise in managing and optimizing data warehouses or data lakes (e.g., Redshift, BigQuery, Snowflake). You understand partitioning, indexing, and storage optimization techniques. (A short partitioning sketch appears after this list.)
  • Database and Query Optimization: Strong knowledge of database design and query optimization for performance. You can fine-tune SQL queries and structure databases for fast, reliable access to large volumes of data.
  • Data Governance and Security: Solid understanding of data governance practices, security protocols, and compliance regulations. You can enforce data privacy and implement measures to safeguard sensitive information.
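
As one illustration of the partitioning technique named above, the sketch below writes a dataset to date-partitioned Parquet with PySpark, so downstream queries can prune partitions instead of scanning the whole table. It is a sketch under assumed names: the bucket paths, column names, and app name are hypothetical, not part of this posting.

```python
# Illustrative sketch only: date-partitioned storage with PySpark.
# Paths, column names, and the app name are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("example_partitioned_write").getOrCreate()

# Read raw event data (hypothetical path).
events = spark.read.json("s3://example-bucket/raw/events/")

# Derive a date column and apply a basic quality gate:
# rows without a parseable timestamp are dropped.
daily = (
    events
    .withColumn("event_date", F.to_date("event_ts"))
    .filter(F.col("event_date").isNotNull())
)

# Partitioning by date lets query engines prune files at read time,
# so a query for one day touches one partition, not the whole table.
(
    daily.write
    .mode("overwrite")
    .partitionBy("event_date")
    .parquet("s3://example-bucket/curated/events/")  # hypothetical path
)
```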

Educational Requirements:

  • Bachelor’s or Master’s degree in Computer Science, Data Engineering, Information Technology, or a related field. Equivalent experience in data engineering is also highly valued.
  • Certifications in cloud platforms (AWS, GCP, Azure) or data engineering technologies (e.g., Apache Hadoop, Apache Spark) are a plus.

Experience Requirements:

  • 3+ years of experience in data engineering, with hands-on experience building and managing data pipelines, data warehouses, and cloud-based storage solutions.
  • Proven experience working with big data technologies and distributed systems, optimizing data flows and processes to handle large datasets.
  • Familiarity with data quality, data governance, and data security best practices is highly desirable. (A minimal example of an automated quality check follows this list.)
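
Because data quality comes up in both the responsibilities and the requirements, here is one minimal, hypothetical example of an automated check a pipeline might run before publishing a table. The key column ("user_id") and the 1% null-rate threshold are invented for illustration.

```python
# Illustrative sketch only: simple pre-publish data-quality checks.
# The key column ("user_id") and thresholds are hypothetical.
from dataclasses import dataclass


@dataclass
class CheckResult:
    name: str
    passed: bool
    detail: str


def run_quality_checks(rows: list[dict]) -> list[CheckResult]:
    results = []

    # Completeness: the batch must not be empty.
    results.append(CheckResult("non_empty", len(rows) > 0, f"{len(rows)} rows"))

    # Validity: at most 1% of rows may be missing the key field.
    missing = sum(1 for r in rows if r.get("user_id") is None)
    rate = missing / len(rows) if rows else 1.0
    results.append(CheckResult("user_id_null_rate", rate <= 0.01, f"{rate:.2%} null"))

    return results


if __name__ == "__main__":
    batch = [{"user_id": 1}, {"user_id": None}, {"user_id": 3}]
    for check in run_quality_checks(batch):
        print(("PASS" if check.passed else "FAIL"), check.name, check.detail)
```

In practice these checks would run as a pipeline task (for example, just before the load step in the DAG sketch above) and fail the run with an alert rather than publish bad data.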

Benefits

  • Health and Wellness: Comprehensive medical, dental, and vision insurance plans with low co-pays and premiums.
  • Paid Time Off: Competitive vacation, sick leave, and 20 paid holidays per year.
  • Work-Life Balance: Flexible work schedules and telecommuting options.
  • Professional Development: Opportunities for training, certification reimbursement, and career advancement programs.
  • Wellness Programs: Access to wellness programs, including gym memberships, health screenings, and mental health resources.
  • Life and Disability Insurance: Life insurance and short-term/long-term disability coverage.
  • Employee Assistance Program (EAP): Confidential counseling and support services for personal and professional challenges.
  • Tuition Reimbursement: Financial assistance for continuing education and professional development.
  • Community Engagement: Opportunities to participate in community service and volunteer activities.
  • Recognition Programs: Employee recognition programs to celebrate achievements and milestones.

Position Details

Posted: Oct 16, 2024
Employment: Full-time
Salary: Not Available
Snaprecruit ID: SD-WOR-36d6aed459eb80e95652c59a16984dc3007270cbe2e0f81e57ffd2a18bcff1fe
City: San Francisco
Job Origin: WORKABLE_ORGANIC_FEED
