Job Details

Senior Data Platform Engineer – R&D Data Platforms

Information Technology

Princeton - NJ - US
Posted 13 days ago, Full-time, R1583282

Working with Us
Challenging. Meaningful. Life-changing. Those aren’t words that are usually associated with a job. But working at Bristol Myers Squibb is anything but usual. Here, uniquely interesting work happens every day, in every department. From optimizing a production line to the latest breakthroughs in cell therapy, this is work that transforms the lives of patients, and the careers of those who do it. You’ll get the chance to grow and thrive through opportunities uncommon in scale and scope, alongside high-achieving teams rich in diversity. Take your career farther than you thought possible.

Bristol Myers Squibb recognizes the importance of balance and flexibility in our work environment. We offer a wide variety of competitive benefits, services and programs that provide our employees with the resources to pursue their goals, both at work and in their personal lives. Read more: careers.bms.com/working-with-us.

Position Summary

As a Senior Data Platform Engineer, you will play a vital role in supporting the broader Data Engineering community to deliver cutting-edge data and analytics platforms for our Research & Development group. We seek a candidate who excels at creating innovative, reliable, secure, and easy-to-use data infrastructure for ingesting, storing, processing, governing and interacting with data. You will collaborate closely with data scientists, analysts, and data engineers to support various data-driven initiatives and enhance our overall data ecosystem. Additionally, you will leverage Generative AI (GenAI) technologies to drive innovation and efficiency within our data platforms. Be part of our exciting Data Platform journey.

Key Responsibilities

  • Define and implement end-to-end R&D data platform capabilities aligned with business objectives, considering data variety, discovery & consumption needs, latency requirements, and access governance.
  • Design and develop data APIs and automate the registration and discovery of data products and their metadata, adhering to open data product specifications.
  • Partner with data product owners, engineers, and data scientists to develop GenAI and LLM-powered applications (Chat with Data), leveraging techniques like RAG, fine-tuning, and vector embeddings to deliver high-quality data discovery & consumption features.
  • Maintain a deep understanding of the GenAI tech stack, including its architecture, components, and capabilities.
  • Develop and maintain CI/CD workflows and tools for efficient code and infrastructure deployment.
  • Ensure accessibility of data products with appropriate security controls across all computational platforms (e.g., Domino, R-Shiny, Tableau, and SuperPOD), fostering a secure and interconnected ecosystem.
  • Collaborate with stakeholders and cross-functional leaders in data engineering, data product teams, and data operations to ensure the effective adoption of our Data Platform.
  • Provide guidance and establish processes to ensure engineering excellence, efficiency, and operational sustainability of our platform.
  • Continuously optimize the data platform for performance, scalability, and cost-effectiveness using techniques such as parallel processing, caching, and partitioning.
  • Design and develop new data solutions to accelerate data usage across R&D, ensuring robustness and scalability.
  • Participate in data product design reviews to ensure efficient and secure data usage.
  • Stay updated with emerging technologies, trends, and best practices in data discovery & consumption to enhance our data infrastructure.

Qualifications & Experience

  • 7-10+ years of hands-on experience implementing and operating data capabilities and cutting-edge solutions in a cloud environment, with expertise in data lakehouses, master/reference data management, data quality, and analytics and AI/ML.
  • In-depth knowledge and hands-on experience with AWS Glue services and the AWS Data Engineering & Analytics ecosystem.
  • Strong understanding of RAG, fine-tuning, vectorization, and prompt engineering techniques for optimizing LLM performance.
  • Hands-on experience developing DataOps and ETL solutions using AWS data services (e.g., Redshift, RDS, Athena, Lake Formation), Cloudera Data Platform, and Tableau Labs.
  • 5+ years of experience in data platform engineering or software development.
  • Proficiency in creating and maintaining optimal data pipeline architecture for large, complex datasets.
  • Strong programming skills in languages and libraries such as Python, R, Scala, PyTorch, PySpark, and Pandas.
  • Experience with SQL and database technologies such as OpenSearch, MySQL, PostgreSQL, Presto, etc.
  • Proficient in implementing, integrating, and managing LLMs, with a deep understanding of their capabilities and applications.
  • Experience with cloud platforms (e.g., Azure, Google Cloud), orchestration engines (e.g., Airflow), and containerization technologies (e.g., Docker, Kubernetes) is advantageous.
  • Excellent communication and collaboration skills, with functional knowledge or prior experience in the Life Sciences Research and Development domain considered a plus.
  • Demonstrated ability to improve processes, structures, and knowledge, providing strong recommendations and executing complex solutions effectively.

#LI-Hybrid

If you come across a role that intrigues you but doesn’t perfectly line up with your resume, we encourage you to apply anyway. You could be one step away from work that will transform your life and career.

Uniquely Interesting Work, Life-changing Careers
With a single vision as inspiring as “Transforming patients’ lives through science™”, every BMS employee plays an integral role in work that goes far beyond ordinary. Each of us is empowered to apply our individual talents and unique perspectives in an inclusive culture, promoting diversity in clinical trials, while our shared values of passion, innovation, urgency, accountability, inclusion and integrity bring out the highest potential of each of our colleagues.

On-site Protocol

BMS has a diverse occupancy structure that determines where an employee is required to conduct their work. This structure includes site-essential, site-by-design, field-based and remote-by-design jobs. The occupancy type that you are assigned is determined by the nature and responsibilities of your role:

Site-essential roles require 100% of shifts onsite at your assigned facility. Site-by-design roles may be eligible for a hybrid work model with at least 50% onsite at your assigned facility. For these roles, onsite presence is considered an essential job function and is critical to collaboration, innovation, productivity, and a positive Company culture. For field-based and remote-by-design roles, the ability to physically travel to visit customers, patients or business partners and to attend meetings on behalf of BMS as directed is an essential job function.

BMS is dedicated to ensuring that people with disabilities can excel through a transparent recruitment process, reasonable workplace accommodations/adjustments and ongoing support in their roles. Applicants can request a reasonable workplace accommodation/adjustment prior to accepting a job offer. If you require reasonable accommodations/adjustments in completing this application, or in any part of the recruitment process, direct your inquiries to adastaffingsupport@bms.com. Visit careers.bms.com/eeo-accessibility to access our complete Equal Employment Opportunity statement.

BMS cares about your well-being and the well-being of our staff, customers, patients, and communities. As a result, the Company strongly recommends that all employees be fully vaccinated for Covid-19 and keep up to date with Covid-19 boosters.

BMS will consider for employment qualified applicants with arrest and conviction records, pursuant to applicable laws in your area.

Any data processed in connection with role applications will be treated in accordance with applicable data privacy policies and regulations.