Data Scientist Engineer
***This position is fully remote at 37.5 hours per week. Local candidates may pick up their equipment in person in Harrisburg, PA 17120.***
Description
The ideal candidate is a gritty, collaborative data engineer who is comfortable with autonomy, ships fast, and iterates. We're connecting and correcting databases to power analytics for executives doing high-impact work. For example, our team has provided the Department with a full-fledged analytics dashboard spanning permitting and HR, which executives have used to drive decision-making and policy planning. We're a small, fast-moving team that collaborates closely and learns from each other.
Requirements
Proven experience in designing and maintaining robust data pipelines that are performant on big data sets
3+ years of experience programming in SQL and Python (including relevant packages like Pandas) and working with Apache Spark
Experience writing clean code that utilizes object-oriented paradigms when appropriate
A strong foundation in data modeling, architecture and data warehousing
Experience working in a CI/CD environment and deploying pipelines
Hands-on experience with a cloud platform, preferably Azure (Azure SQL, Synapse, Databricks, etc.)
Experience building visualizations in Power BI
Communication skills to translate data asks into technical specs for a pipeline and visualization
An understanding of data security best practices and how to implement them
Passion for staying updated on emerging technologies in data engineering
Contributing to a culture of continuous improvement and innovation within the team
Education
A bachelor's degree in Computer Science, Engineering, Statistics, or equivalent professional experience
Desired Skills
Experience in agile development methodologies and version control systems (e.g., Git) is a plus
Experience working with NoSQL databases
Skill | Required / Desired | Amount of Experience | Unit |
3+ years of experience programming in SQL and Python (including relevant packages like Pandas) and working with Apache Spark | Required | 3 | Years |
Proven experience in designing and maintaining robust data pipelines that are performant on big data sets | Required | ||
Experience writing clean code that utilizes object-oriented paradigms when appropriate | Required | ||
A strong foundation in data modeling, architecture and data warehousing | Required | ||
Experience working in a CI/CD environment and deploying pipelines | Required | ||
Hands-on experience with a cloud platform, preferably Azure (Azure SQL, Synapse, Databricks, etc.) | Required | ||
Experience building visualizations in Power BI | Required | ||
Communication skills to translate data asks into technical specs for a pipeline and visualization | Required | ||
An understanding of data security best practices and how to implement them | Required | ||
Passion for staying updated on emerging technologies in data engineering | Required | ||
Contributing to a culture of continuous improvement and innovation within the team | Required | ||
A bachelor's degree in Computer Science, Engineering, Statistics, or equivalent professional experience | Required | ||
Experience in agile development methodologies and version control systems (e.g., Git) is a plus | Desired | ||
Experience working with NoSQL databases | Desired | ||
An ability to work with cross-functional, inter-agency teams | Desired | ||
Experience working on small teams in fast-paced environments | Desired |
This position is 37.5 hours per week. Is this understood? |
Where does your candidate currently reside (e.g. city, state)? |