Associate Director, Data Engineering Team Lead (Python, Hadoop, hands-on) – Warsaw (R1077714) in Warsaw, PL at IQVIA™

Date Posted: 2/15/2020

Job Description

IQVIA™ is the leading human data science company focused on helping healthcare clients find unparalleled insights and better solutions for patients. Formed through the merger of IMS Health and Quintiles, IQVIA offers a broad range of solutions that harness the power of healthcare data, domain expertise, transformative technology, and advanced analytics to drive healthcare forward.

Predictive Analytics, Real-World Solutions (RWS) Technology Platforms.

We are seeking an experienced and talented Senior Data Engineer to join our new Predictive Analytics (PA) team in Warsaw. You will be key in defining development activities, working closely with our team, who are building innovative machine learning solutions that address some of the most pressing issues in healthcare, such as the under-diagnosis of rare diseases and the identification of patients at high risk of disease progression.

The PA team is currently based in London and Philadelphia. You will be instrumental in building new software development and engineering functions in Warsaw. Initially this will consist of a single scrum team, with the expectation that we will build additional scrum teams over the next one to two years. There will be an ongoing need to collaborate closely with colleagues in London, UK and Philadelphia, US.

Working with petabytes of data, modern distributed systems, advanced data science models and challenging requests in an agile environment, you will help shape the way our team approaches prototyping and development of data-driven analytics products. This crucial role will involve identifying opportunities for better data modelling, including processing, scaling, internal tooling, and other data engineering activities that will help our team maximise our efficiency in delivering data science projects. You will have the opportunity to provide technical mentorship to the team, and to set the standards for code quality and software architecture for the projects you work on.

You will be responsible for creating tools to be used by data scientists as well as non-technical experts. Amongst other things, these tools will include:

  • Data engineering tools to enable complex querying and diverse feature engineering tasks on very large data (hundreds of millions of patients) from a Hadoop environment using Python and PySpark.
  • Analytical pipelines to support a range of analytical functions, from data inspection through to advanced machine learning solutions.
  • Proofs of concept to validate the use of new technologies, followed, where successful, by team-wide solutions that allow us to scale more effectively and tackle more complex problems.


In this role you will:

  • Lead the building of a dedicated, enthusiastic team of software developers and engineers, providing technical leadership to your own and the wider team.
  • Promote best-practice software development in Python, Spark, Hadoop and related technologies and participate in the full software life-cycle.
  • Test your own code comprehensively.
  • Lead agile practices such as daily stand-ups, sprint planning, sprint refinements, and retrospectives; work to fortnightly sprints and organise bi-weekly demos across multiple teams.
  • Deliver development reports, milestones and delivery schedules to the business.
  • Work alongside other team members such as Product Owners and Software QA to manage the development cycle from ‘thought to delivery’.

Our ideal candidate will have:

  • Leadership experience: building, managing and motivating software engineering teams.
  • Substantial (5+ years) experience developing maintained Python applications.
  • At least 3 years of experience with the Hadoop ecosystem, including tools like YARN, Hive, Impala and HDFS, plus some knowledge of Hadoop cluster architecture.
  • Experience in putting machine learning models into production.
  • Familiarity with advanced Python data structures like NumPy arrays and pandas DataFrames.
  • Proficiency with relational databases and SQL.
  • Experience with more than one non-relational database, such as MongoDB and Redis.
  • Strong unit testing and debugging skills.
  • Good understanding of code versioning tools such as Git, and proficiency with Linux.
  • Experience in following Scrum best practices.
  • Fluency in English (spoken and written).

We would also appreciate it if you have some of the following:

  • Experience working as part of, or in support of, a data science team with a machine learning focus.
  • Solid understanding of designing microservices-based applications.
  • Experience with workflow managers like Airflow, Azkaban or Luigi.
  • Experience with deploying code into production through CI/CD tools like Jenkins.

IQVIA is a strong advocate of diversity and inclusion in the workplace. We believe that a work environment that embraces diversity will give us a competitive advantage in the global marketplace and enhance our success. We believe that an inclusive and respectful workplace culture fosters a sense of belonging among our employees, builds a stronger team, and allows individual employees the opportunity to maximize their personal potential.

Join Us

Making a positive impact on human health takes insight, curiosity, and intellectual courage. It takes brave minds, pushing the boundaries to transform healthcare. Regardless of your role, you will have the opportunity to play an important part in helping our clients drive healthcare forward and ultimately improve outcomes for patients.

Forge a career with greater purpose, make an impact, and never stop learning.

Job ID: R1077714

