Lead/Sr. Hadoop Administrator (R1024440) in Plymouth Meeting, PA at IQVIA™

Date Posted: 5/1/2018

Job Description

Lead/Sr. Hadoop Administrator 

IQVIA is developing its next-generation Global Data Lake and Analytics platform to support analytics and insights across hundreds of terabytes to petabytes of healthcare data in near real time.

We are seeking candidates with experience building low-latency, Massively Parallel Processing (MPP) data and analytics systems and taking them to production, ideally on Hadoop and Spark. We encourage interested applicants to apply today!

Required Experience (in order of importance):
∙  Hadoop, Hive, Impala, HBase and related technologies
∙  Spark/Spark2
∙  MPP, shared-nothing database systems, NoSQL systems
∙  Object-oriented and functional programming experience
∙  Excellent knowledge of Linux, AIX, or other Unix flavors
∙  Experience with scripting (e.g., Chef/Puppet, Bash, Linux scripting)
∙  Data warehousing design and concepts
∙  Exposure to Infrastructure as Code (Ansible, Terraform)

Minimum Education, Experience, & Specialized Knowledge Required:
∙  Computer Science Degree/Student
∙  3+ years of strong native SQL experience
∙  3+ years of strong experience with database and data warehousing/data lake concepts and techniques, including relational and dimensional modeling, star/snowflake schema design, BI, data warehouse operating environments and related technologies, ETL, MDM, and data governance practices
∙  2+ years’ experience working in Linux
∙  2+ years’ experience with Hadoop, Hive, Impala, HBase, and related technologies
∙  1+ years of strong experience with low-latency (near-real-time) systems and TB-scale data sets, loading and processing billions of records per day
∙  1+ years’ experience with Chef/Puppet, Bash, Linux scripting, Ansible, and/or Terraform
∙  1+ years’ experience with containerization (Mesosphere, Docker, Kubernetes)
∙  1+ years’ experience with MPP, shared-nothing database systems, and NoSQL systems
∙  1 year of experience with Spark, Scala, Python, Java, and/or R
∙  Ability to work in a fast-paced, team-oriented environment
∙  Ability to complete the full software development lifecycle and deliver in an Agile/Scrum environment, leveraging Continuous Integration/Continuous Deployment
∙  Strong interpersonal skills, including a positive, solution-oriented attitude
∙  Must be passionate, flexible, and innovative in using available tools, personal experience, and other resources to deliver consistently against challenging, ever-changing business requirements
∙  Must be able to interface with various solution/business areas to understand the requirements and support development
∙  Healthcare and/or reference data experience is a plus

Responsibilities:
∙  Manage the operations and architecture of the Hadoop ecosystem, spanning hundreds to thousands of nodes globally
∙  Build out clusters in data centers around the world
∙  Tune the multi-tenant Hadoop ecosystem for operational efficiency, balancing various workloads and optimizing YARN and Impala accordingly
∙  Implement security, encryption, authentication, and authorization controls to adhere to corporate security policies
∙  Support data governance and data lineage on the cluster
∙  Enable high availability and resiliency in the cluster, achieving 99.9999% uptime
∙  Understand network optimization and DR strategies
∙  Support and help drive our hybrid cloud strategy, and develop strategies for compute burst
∙  Work with data architects on logical data models and physical database designs optimized for performance, availability, and reliability
∙  Help tune and optimize backend and frontend data operations
∙  Serve as a technical expert on query tuning and optimization, providing feedback to the team
∙  Script and automate to support development, QA, and production database environments, deployments to production, and management of services and infrastructure
∙  Mentor development team members
∙  Proactively help resolve difficult technical issues
∙  Provide technical knowledge to teams during project discovery and architecture phases
∙  Keep management informed of work activities and schedules
∙  Assess new initiatives to determine the work effort and estimate the necessary time-to-completion
∙  Document new development, procedures or test plans as needed
∙  Participate in data builds and deployment efforts, and help mature our Continuous Integration and Continuous Deployment methodologies
∙  Participate in projects through various phases
∙  Perform other related duties as assigned
∙  Partner with the business units to develop effective solutions that solve business challenges

Total Rewards:
We invest in people through a range of initiatives in compensation, benefits, and learning and development, and we strive to create an environment where our employees are challenged, empowered and can flourish.

IQVIA is an Equal Opportunity Employer. We cultivate a diverse corporate culture across the 100+ countries where we operate, celebrating and rewarding teamwork and inclusiveness. By embracing our differences, we create innovative solutions that are good for IQVIA, our clients, and the advancement of healthcare everywhere.


Job ID: R1024440