Description
Roles & Responsibilities:
- Experience in Data Integration and Big Data is mandatory
- Experience in Talend, Informatica, or any other Big Data ETL tool, and in RDBMS
- Experience in Azure Data Factory and Azure components
- Experience in Hadoop, Hive, and RDBMS
- Mandatory knowledge of SQL, NoSQL, and data warehousing
- Must have led a team as a Tech Lead for a minimum of 4 years.
- 8 to 12 years of relevant experience in Big Data platforms, Hive, and Hadoop
- Must have solid hands-on technical expertise in Big Data platforms, Hive, and Hadoop
- Extensive expertise in Data Lake ETL design and development processes using any ETL tool
- Experience with Big Data components.
- Significant experience working as a Technical Lead, with depth in data integration and data architecture for Enterprise Data Lake implementations
- Experience in CI/CD, DevOps, etc.
- Developing and configuring Big Data batch jobs using the Spark framework
- Experience in Hadoop, Hive, Sqoop, Spark jobs, and SQL
- Should have prior experience with ETL deployment, orchestration, optimization, and automation.
- Should have worked with large volumes of data.
- Strong analytical and communication skills.
- Design, implement, and deploy ETL processes
- Experience in enhancements, fixes, and troubleshooting of production issues
- Hands-on experience in HiveQL and the ability to write MapReduce jobs.
- Knowledge of Datasets, Linked Services, and the Copy Data Activity (Azure Data Factory)
- Must have worked with multiple types of data loads, including batch, real-time (RT), and near-real-time (NRT)
Preferred Skills:
- Must have worked with Big Data technologies such as Cloudera / Hortonworks / HPE Ezmeral / Hadoop
- Must have worked extensively with Spark / Spark Streaming, Hive, Kafka, and Hadoop
- Must have implemented at least 2 end-to-end Big Data projects
- Must have worked with Talend Big Data / Informatica Big Data for ETL processes
- Must have worked on performance optimization and tuning for data loads, data processes, and data transformations in Big Data
- Must have an understanding of Big Data architecture, including configuration details
- Must be flexible to write Big Data code in Java, Scala, Python, etc., as required