In this highly visible technical lead position, you will be responsible for providing Data Engineering leadership and support across the business. The focus will be on expanding the model building and deployment capabilities across the teams. The individual will work closely with a team of very talented data scientists and data engineers in driving requirements, development, design, and implementation of advanced platforms, tools, and systems in support of data science and machine learning initiatives.
Requirements include at least 6 years of relevant experience in industry, experience designing and developing end-to-end solutions, large scale data acquisition and transformation, and understanding of data warehouse and data lake technology.
Expertise with PySpark and experience with the Hadoop ecosystem, AWS, Java, and SQL are also required. Prior experience with DataRobot is a plus.
* 6+ years of relevant work experience.
* Bachelor's degree in Computer Science, Engineering, or a related field.
* 2+ years of experience with PySpark.
* 2+ years of production experience building MapReduce jobs, Spark scripts, Oozie workflows, or other Hadoop-based applications.
* Good understanding of database concepts (Oracle, MS SQL, generic SQL).
* Good understanding of distributed data processing.
* Experience creating Spark scripts in either Python or Scala.
* Experience diagnosing and mitigating performance issues in Spark scripts.
* Experience setting up and querying Hive, Presto, or Impala databases.
* Experience diagnosing and mitigating performance issues in Hive, Presto, or Impala queries.
* Experience with open systems and cloud-based applications on AWS (e.g., EC2, S3).
* Experience creating streaming solutions and reporting tools is good to have.
* Prior experience with DataRobot is a plus.
* Build end-to-end ETL pipelines to enable training and operationalization of machine learning models.
* Build code for ingesting data from relational databases, NoSQL databases, flat files, and message queues into big data solutions.
* Integrate code produced by data scientists into data pipelines.
* Build code or configuration to push data from big data solutions into reporting tools and other software.
* Diagnose and mitigate performance issues.
* Communicate with customer IT personnel to clarify technical details.
* Document the implementation.
* Assist with the installation and setup of big data solutions and DataRobot.
Individuals seeking employment at DataRobot are considered without regard to race, color, religion, national origin, age, sex, marital status, ancestry, physical or mental disability, veteran status, gender identity, or sexual orientation.
DataRobot Engineering is a hard-working, fast-moving, fun-loving team of developers who put product before pride. Our team is flexible and adaptable. We genuinely like each other and work hard to make sure that we all succeed, both individually and as a company, because we believe that one doesn't happen without the other.
Interested? Apply now!
DataRobot is a company that offers a machine learning platform for data scientists to build and deploy predictive models.