Our Data Science team is looking for an outstanding Data Engineer who will participate in the design, implementation and ongoing governance of App Annie's data products. You will be responsible for expanding and optimizing our data and data pipeline architecture as well as optimizing data flows for cross-functional teams. You will work closely with Product Managers, Data Scientists, Data Analysts and other Big Data and BI Engineers to develop and sustain our understanding of the behavior of our data products, including conceiving and implementing new insights and features as well as building large-scale data integration solutions. You should be passionate about what you do and excited to do it in the context of an entrepreneurial start-up.
As a Data Engineer, you will be an integral part of a team responsible for the quality of data estimates across a number of Intelligence products, including data cleansing of key inputs to our data science models and root cause analysis of data quality issues. You will also help to expand our reporting and analysis tools to support the Data Science team in building new product features. This includes:
* Ability to assemble large, complex data sets that meet functional and non-functional business requirements
* Building the infrastructure required for optimal extraction, transformation and loading of data from a wide variety of data sources using SQL, Apache Spark or Facebook Presto
* Responsibility for monitoring and diagnosing the quality of data inputs to App Annie's Data Science models, including specifying the metrics required and the requirements for dashboards and other tools to track these over time
* Responsibility for keeping a constant feedback loop with the Data Science, Data Analyst and Big Data teams on identified data issues, to determine the impact of data input quality on our models, with in-depth analysis of model predictions and reports
* Ability to develop complex queries, scripts and data tools to support the Data Science team in researching, building and testing the models that power new product features
* Responsibility for assisting product managers with building analytics that provide actionable insights into key business performance metrics
* The ability to communicate clearly with teams and team members across multiple time zones and countries
You should be a proven contributor with outstanding Data Engineering skills, including:
* A passion for everything to do with data, along with top-level experience working as a data pipeline builder and data wrangler
* An ability to perform root cause analysis on various datasets to answer specific business questions and identify opportunities for improvement
* Strong experience in building and optimizing big data pipelines and datasets, as well as building processes to support various data manipulations
* 3+ years of work experience, including at least 2 years performing quantitative analysis, preferably for Internet or technology companies
* Demonstrated ability to produce results as part of a highly distributed team that crosses cultural and country boundaries
* Excellent data mining skills, including a working knowledge of several of the following tools:
  * Big data tools: Hadoop, Spark, Hive
  * Relational SQL databases
  * Data pipelines and workflow management tools: Luigi, Airflow, etc.
  * AWS cloud services: EC2, Redshift, EMR
  * Scripting languages: Python, Scala, C++, etc.
* A proficiency with Linux OS
* Experience with data visualization and dashboard tools is a plus: Tableau Software, Pentaho, Business Objects
* Strong presentation skills, including being comfortable explaining data solutions in a business context to multiple stakeholders, e.g. product managers and senior management
* BS preferred in Computer Science, Statistics, Informatics or related field
* A passion for digital content and mobile apps is a definite plus
* The authorization to work in the U.S.