Senior / Tech Lead Data Engineer
Job Details
About the Company
ALLSTARSIT is headquartered in San Francisco, US, with operational hubs across Europe, Asia, and LATAM. Its workforce of over 1,000 professionals spans more than 20 countries and covers a range of verticals, including AI, cybersecurity, healthcare, fintech, telecom, and media.
About the Project
Bridgewise is building data platforms that power real-time analytics and decision-making for global clients. With a focus on scalability, performance, and cloud-native architecture, the company delivers end-to-end data engineering solutions. The team works with AWS EMR, Glue, Spark, and Airflow, and uses Python and PySpark for data processing.
Required skills:
- Bachelor’s Degree in Computer Science or a related field
- 5+ years in data engineering or architecture roles, with leadership experience
- Advanced Python and PySpark skills for ETL and data transformation
- Strong hands-on experience with AWS cloud ecosystem, especially EMR and Glue
- Deep understanding of Apache Spark, including performance optimization
- Familiarity with orchestration tools like Airflow
- Solid knowledge of OOD (Object-Oriented Design) and SOLID principles
Must-have confirmations:
✅ Practical experience with Spark job performance optimization
✅ Understanding of SOLID principles and object-oriented design
✅ Familiarity with at least four of the following PySpark topics: window functions, broadcast joins, sort-merge joins, watermarking, UDFs (user-defined functions), lazy evaluation
Scope of work:
- Design and implement ETL pipelines for ingesting and transforming data from multiple sources
- Collaborate with cross-functional teams to ensure reliable deployment and integration of data workflows
- Lead performance tuning and query optimization for high-efficiency data processing
- Model and structure data to support scalable, robust, and maintainable platforms