Data Engineer (Python), Predictive Analytics (R1077523) in Warsaw, PL at IQVIA™

Date Posted: 12/27/2019

Job Snapshot

  • Location:
    Warsaw, PL
  • Experience:
    Not Specified
  • Date Posted:
    12/27/2019
  • Job ID:
    R1077523

Job Description

IQVIA™ is the leading human data science company focused on helping healthcare clients find unparalleled insights and better solutions for patients. Formed through the merger of IMS Health and Quintiles, IQVIA offers a broad range of solutions that harness the power of healthcare data, domain expertise, transformative technology, and advanced analytics to drive healthcare forward.

Data Engineer, Predictive Analytics (Software Development - Python, Hadoop, hands-on) – Warsaw

Real-World Solutions (RWS) Technology Platforms.

We are seeking a Software Developer to join our Predictive Analytics (PA) team in Warsaw. You will play an important role in our development activities, working closely with our team, which is building innovative machine learning solutions that address some of the most pressing issues in healthcare, such as the under-diagnosis of rare diseases and the identification of patients at high risk of disease progression.

The PA team is currently based in London, UK and Philadelphia, US. You will be one of the first hires into a new software development and engineering function in Warsaw. Initially there will be a single Warsaw-based scrum team, with the expectation that we will build additional scrum teams over the next one to two years. There will be an ongoing need to collaborate closely with colleagues in London and Philadelphia.

Working with petabytes of data, modern distributed systems, advanced data science models and challenging requests in an agile environment, you will help to set standards for the development of data-driven analytics products. This crucial role will involve identifying opportunities for better data modelling, including processing, scaling, internal tooling, and other data engineering activities that will help our team maximise its efficiency in delivering data science projects. You will have the opportunity to provide technical guidance to data scientists in the UK and US delivery teams, and with your team you will set the standards for code quality and software architecture in the packages you maintain.

You will be responsible for creating tools to be used by data scientists as well as non-technical experts. Amongst other things, these tools will include:

  • Data engineering tools to enable complex querying and diverse feature engineering tasks on very large datasets (hundreds of millions of patients) from a Hadoop environment using Python and PySpark.
  • Analytical pipelines to support a range of analytical functions, from data inspection through to advanced machine learning solutions.
  • Proofs of concept to validate the use of new technologies and, where these are successful, team-wide solutions that will allow us to scale more effectively and tackle more complex problems.


In this role, you will:

  • Make high-quality contributions to software development activities, exhibiting pragmatism to maximise the impact and value of the output that you produce.
  • Help to establish a positive, open and high-performing culture in the new Warsaw team you will be joining.
  • Promote best-practice software development in Python, Spark, Hadoop and related technologies and participate in the full software life-cycle.
  • Comprehensively test your own code.
  • Engage in the team’s agile practices such as daily stand-ups, sprint planning, sprint refinements, and retrospectives; work to fortnightly sprints and be proactive in suggesting improvements to team processes that will make the team a stronger unit.
  • Work with your team lead to maintain a healthy scrum backlog, engaging in story refinement sessions and contributing your own ideas.

Our ideal candidate will have:

  • Substantial (2+ years) experience developing and maintaining Python packages that are integrated into additional applications.
  • Strong unit testing and debugging skills.
  • Advanced experience in data manipulation using Python data structures like NumPy arrays and pandas DataFrames.
  • Proficiency with relational databases and SQL.
  • Practical experience with the Hadoop ecosystem, including tools like YARN, Hive, Impala and HDFS, and some knowledge of Hadoop cluster architecture.
  • Good understanding of version control tools such as Git, and proficiency with Linux.
  • Experience in following Scrum best practices.
  • Fluency in English (spoken and written).

We would also appreciate it if you have some of the following:

  • Experience working as part of, or in support of, a data science team with a machine learning focus.
  • Solid understanding of designing microservices-based applications.
  • Experience with workflow managers like Airflow, Azkaban or Luigi.
  • Experience with deploying code into production through CI/CD tools like Jenkins.

Join Us

Making a positive impact on human health takes insight, curiosity, and intellectual courage. It takes brave minds, pushing the boundaries to transform healthcare. Regardless of your role, you will have the opportunity to play an important part in helping our clients drive healthcare forward and ultimately improve outcomes for patients.

Forge a career with greater purpose, make an impact, and never stop learning.

Job ID: R1077523

