Experience: 5 – 7 years
Location: Pune, India (Work from Office)
Notice Period: 0 – 30 Days
Must Have:
- Proficiency in at least one of the following programming languages: Java, Scala, or Python
- Good understanding of SQL
- Experience developing and deploying at least one end-to-end data storage/processing pipeline
- Strong experience in Spark development, both batch and streaming
- Intermediate level expertise in HDFS and Hive
- Experience with PySpark and data engineering
- Experience implementing ETL and migrating ETL workloads to Spark
- Experience working with Hadoop clusters
- Python/PySpark/Databricks development experience, with knowledge of cloud platforms
- Experience with Kafka and Spark streaming (DStream and Structured Streaming)
- Experience with Jupyter notebooks or similar developer tools
- Experience with Airflow or other workflow engines
- Good communication and logical reasoning skills
Good to Have Skills:
- Prior experience writing Spark jobs in Java is highly appreciated
- Prior experience working with Cloudera Data Platform (CDP)
- Hands-on experience with NoSQL databases like HBase, Cassandra, Elasticsearch, etc.
- Experience using Maven and Git
- Familiarity with Agile/Scrum methodologies
- Streaming with Flink and Kudu
- Workflow automation with CI/CD
- NiFi streaming and transformation
