Job Description:
• Deep knowledge of open-source technologies for distributed data processing
• Writing jobs to process vast amounts of data, including ETL pipelines
• Requires expertise in Apache Hadoop, MapReduce, Pig, Spark, Databricks, and Hive
• Building pipelines that ingest and process millions of streaming events per second
• Requires expertise in Apache Kafka, Apache Storm, and Apache Spark
• Performance analysis and optimization of Apache tools
• Experience with technologies such as Azure Data Factory and SQL Server Integration Services
Requirements:
• Deep knowledge of open-source technologies for distributed data processing
• Deep knowledge of open-source technologies for fast, interactive SQL queries
• Deep knowledge of open-source technologies for ingesting and processing streaming events
• Deep knowledge of performance analysis and optimization
• Deep knowledge of open-source technologies for machine learning
• Strong expertise in ETL/ELT approaches
• Software development skills in Java, Scala (Spark), and Python (PySpark)
Benefits:
• Drug-Free Workplace
• Equal Opportunity Employer (EOE)
• Affirmative Action Employer