Job brief.

Job Role: Lead Data Engineer
Location: Pune
Experience: 10-12 years

Requirements.

  • Lead and manage multiple data projects leveraging Databricks, from project initiation to completion, ensuring adherence to project timelines, budget, and quality standards.
  • Collaborate closely with cross-functional teams, including Subject Matter Experts, data scientists, engineers, analysts, and other stakeholders, to define project requirements, scope, and deliverables.
  • Define and implement scalable and robust data lake architectures leveraging Databricks Delta Lake technology.
  • Design data ingestion, transformation, and storage strategies to ensure efficient and reliable data management.
  • Oversee the development of data pipelines to ingest, process, and transform data from various sources into Databricks Delta Lake.
  • Define data models and schemas to support analytical and reporting needs.
  • Optimize data structures, partitioning strategies, and storage formats for efficient query performance.
  • Implement ML pipelines and workflows for model training, validation, and deployment using Databricks MLflow and related tools to support real-time and batch inference.
  • Work closely with BI Analysts and Data Visualization specialists to design and optimize data schemas and structures for BI reporting and analytics.
  • Establish monitoring and alerting mechanisms to proactively detect issues and optimize data lake performance.
  • Stay abreast of industry trends, best practices, and emerging technologies in data engineering and Databricks Delta Lake.
  • Provide technical guidance and leadership on Databricks best practices, methodologies, and implementation strategies.
  • Manage Databricks clusters and resources efficiently to optimize performance, scalability, and cost-effectiveness.
  • Develop and maintain metadata management solutions to capture, organize, and govern data assets across the organization.

About You.

  • Bachelor’s degree in Computer Science, Information Systems, Data Science, or a related field. Advanced degree preferred.
  • Excellent communication, interpersonal, and leadership skills, with the ability to effectively collaborate with diverse teams and stakeholders.
  • Strong analytical and problem-solving abilities, with a focus on delivering innovative and impactful data-driven solutions.
  • 8-12 years of total experience.
  • Deep expertise in Apache Spark, the Databricks runtime environment, and Databricks Delta Lake.
  • Professional/Associate level Databricks certification is required.
  • Strong understanding of master data management principles, metadata management, and data cataloging concepts and best practices.
  • Strong background in data modeling, ETL/ELT development, and data warehousing.
  • Experience with cloud platforms such as AWS, Azure, or Google Cloud Platform and big data technologies.

Assets.

  • Financial product knowledge and knowledge of Hedge Fund Administration.
  • Experience in setting up and managing a Data Center of Excellence (CoE) is highly desirable.
  • Experience creating interactive reports in Qlik/Tableau/Power BI/Alteryx.
  • Experience integrating machine learning models and algorithms into data pipelines (experience with Databricks MLflow is a plus).
  • Experience working in an Agile environment with knowledge of JIRA, Confluence, etc.
