We are seeking a skilled Data Engineer with strong expertise in AWS, Databricks, and Informatica Intelligent Data Management Cloud (IDMC) to design, build, and maintain scalable data platforms. The ideal candidate will have hands-on experience in developing robust data pipelines, managing large-scale data systems, and ensuring data quality, security, and performance.
Key Responsibilities
- Design and implement scalable data storage solutions (data lakes, data warehouses, and databases) using AWS services such as Amazon S3, Amazon RDS, Amazon Redshift, and Amazon DynamoDB, along with Databricks Delta Lake.
- Develop, maintain, and optimize data pipelines for ingestion, processing, and transformation using AWS Glue, AWS Lambda, and Databricks.
- Use Informatica IDMC for data integration, data quality, metadata management, and governance.
- Build and manage ETL/ELT workflows to cleanse, transform, and enrich data for analytics and reporting.
- Integrate structured and unstructured data from multiple internal and external sources while ensuring data consistency and integrity.
- Monitor and enhance system performance across AWS and Databricks environments to meet scalability and efficiency requirements.
- Implement data security, encryption, and governance practices to ensure compliance with regulatory standards.
- Automate workflows using tools such as AWS Step Functions, AWS Lambda, and Databricks Jobs.
- Collaborate with data scientists, analysts, and engineering teams to deliver data-driven solutions.
- Troubleshoot data issues and ensure high availability and reliability of data systems.
- Optimize resource utilization to control costs while maintaining performance.
- Maintain clear, up-to-date documentation of data architecture, pipelines, and processes.
Preferred Skills
- Experience with Apache Spark and Hadoop ecosystems.
- Familiarity with Docker and Kubernetes.
- Exposure to data visualization tools such as Tableau or Power BI.
- Understanding of DevOps and CI/CD practices for data engineering workflows.
- Experience with Git or other version control systems.
- Knowledge of data governance and cataloging tools, especially Informatica IDMC.