Senior Data Engineer
Job Details
About the Company
ALLSTARSIT is headquartered in San Francisco, US, with operational hubs across Europe, Asia, and LATAM and a workforce of over 1,000 professionals. Operating in more than 20 countries, the company provides skilled employees across a range of verticals, including AI, cybersecurity, healthcare, fintech, telecom, and media.
About the Project
The client’s company is revolutionizing the way new drugs and vaccines are approved for market by replacing manual processes with deep tech. Their core product, the project, is an AI-powered statistical analysis workspace designed to improve workflows and accelerate regulatory approval processes.
Accurate analysis of clinical research data is critical to helping millions of people receive safe and effective medical treatments faster. The project provides assurance that the data underpinning these analyses are of the highest integrity. By joining the client’s team, you can directly impact the speed at which new medications reach patients, many of whom may be your family or friends. You will also help turn meaningful ideas into beneficial products, with the support of an inclusive and diverse group of highly talented professionals.
We are seeking a highly motivated and experienced Senior Data Engineer to join the client’s team.
Required skills:
- 3+ years of experience in Python development (or 1+ year of Python plus 2+ years in another language)
- 2+ years of experience in production data environments
- Strong experience with modern data engineering tools and libraries (e.g., Pandas, PySpark, SQLAlchemy)
- Hands-on experience with Databricks or other distributed data processing frameworks
- Proficiency in working with structured and semi-structured data from various sources (databases, APIs, files, etc.)
- Experience with Agile methodologies and tools
- Familiarity with data modeling, data warehousing, and performance optimization
- Excellent analytical and problem-solving skills
- Strong English communication skills, both written and verbal
- Team player with strong collaboration and interpersonal abilities
Advantages:
- Experience with Databricks or similar platforms for large-scale data processing and analytics, as well as data orchestration tools and MLOps frameworks
- Experience delivering AI-ready datasets and knowledge of AI/ML frameworks
- Proficiency with regular expressions
- Experience with NoSQL databases (e.g., MongoDB)
- Familiarity with cloud services (AWS and/or Azure)
- Hands-on experience with Python testing and debugging tools such as PyTest
- Experience building and maintaining CI/CD pipelines
- Background in the pharmaceutical or life sciences industry
- Bachelor’s degree in Computer Science, Engineering, a related technical field, or equivalent practical experience
Scope of work:
- Design and implement robust, scalable data pipelines and workflows using modern data engineering frameworks
- Collaborate closely with Data Scientists, Analysts, Developers, and Product Managers to support data needs and model deployment
- Take ownership of data quality, data architecture, and production readiness across systems
- Build and optimize data extraction, transformation, and loading (ETL) processes across structured and unstructured datasets
- Develop reusable data processing components and libraries
- Troubleshoot data issues and performance bottlenecks, and maintain high data integrity across environments
- Contribute to the evolution of the client’s cloud-based data infrastructure
- Document data workflows and participate in the development and execution of test strategies for data systems