- Monitor Hadoop clusters for performance and capacity planning
- Develop and implement data processing pipelines for different kinds of data sources, formats, and content for our platform
- Develop techniques to analyze and enhance structured, semi-structured, unstructured, and real-time data
- Work on Hadoop distribution platforms such as Cloudera or Hortonworks
- Configure and manage security for Hadoop clusters
- Develop, deploy, maintain, and upgrade Big Data ecosystem components
- Perform stream analytics, preferably using Apache Kafka, Flume, Spark Streaming, and Flink or Storm
Education and Skills
- Bachelor’s Degree in Engineering or MCA
- 2 years of implementation or consulting experience building large-scale big data systems
- Hands-on experience with real-time ingestion of data into HDFS
- Experience with HDFS sharding, replication, recovery, and performance optimization
- Hands-on experience setting up Hadoop clusters and working with the open-source tools used alongside Hadoop
- Experience with big data technologies such as Hadoop, Hive, Sqoop, MapReduce, MongoDB, Cassandra, and SQL-based data stores
- Experience integrating data from multiple sources in both batch and real-time fashion
- Strong programming expertise in Spark, MapReduce, and Pig, using languages such as Scala, Python, or Java
Job Category: Software Development