Data Engineer
Title: Data Engineer
Experience: 4-6 Years
Location: Gurgaon (anywhere in NCR), Hybrid
Mandatory skills: Python (libraries), advanced SQL, good knowledge of data warehouses (Google BigQuery, Redshift, Snowflake) and data lakes (GCS, AWS S3, etc.), GCP preferred, Git source control
Responsibilities:
3+ years of experience.
Strong development skills in Python.
Writing effective and scalable Python code.
Strong experience in processing data and drawing insights from large data sets.
Good familiarity with one or more libraries: pandas, NumPy, SciPy etc.
In-depth knowledge of spaCy and similar NLP libraries like NLTK, textacy etc.
Experience with Python development environments and tools, including but not limited to Jupyter, Google Colab notebooks, Matplotlib, Plotly, and geoplotlib.
Advanced working knowledge of SQL, including query authoring, experience with relational databases, and working familiarity with a variety of other databases.
Experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement.
Strong analytic skills related to working with unstructured datasets.
Good to have (some exposure to):
Experience with setting up and maintaining data warehouses (Google BigQuery, Redshift, Snowflake) and data lakes (GCS, AWS S3, etc.) for an organization.
Experience with relational SQL and NoSQL databases, including Postgres and Cassandra / MongoDB.
Experience with data pipeline and workflow management tools: Airflow, Dataflow, Dataproc, etc.
Exposure to any Business Intelligence (BI) tools like Tableau, Dundas, Power BI etc.
Agile software development methodologies.
Working in multi-functional, multi-location teams.