Seeking a Hadoop Administrator with expert understanding of the Big Data Hadoop Ecosystem (HDFS, Accumulo, Pig, Spark, Impala, Hive, HBase, Kafka, etc.) and expert development skills to work within a DevOps team. This position is responsible for developing and maintaining Big Data infrastructure, interacting with vendors to resolve problems, and monitoring and troubleshooting issues within the Big Data Ecosystem. The developer will work closely with customer teams to ensure business applications are highly available and performing within agreed-upon service levels.
* Lead troubleshooting and development on Hadoop technologies including HDFS, Hive, Pig, HBase, Accumulo, Tez, Impala, Sqoop, Spark, and Kafka.
* Tune applications and systems for high-volume throughput.
* Develop and maintain security with AD, Kerberos, Knox and Ranger.
* Install and maintain platform level Hadoop infrastructure including additional tools like SQL Workbench, Presto and R.
* Lead analysis of data stores and uncover insights.
* Ensure developed solutions adhere to security and data privacy policies.
* Lead investigations and proof of concepts as Big Data technology evolves.
* Develop and test prototypes and oversee handover to operational teams.
* Lead the design, build, installation, and configuration of applications suited to a continuous integration environment.
* Translate complex functional and technical requirements into detailed design.
* Define best practices/standards.
* Maintain clear documentation to help increase overall team productivity.
Required Skills/Experience:
* Expert (5 years) Java programming with frameworks; experience with Scrum/Agile, SOA, event-based architecture, and containers with Kubernetes.
* Strong expertise (1-2 years) administering the Big Data Hadoop Ecosystem and its components (HDFS, Hive, Pig, Tez, Impala, Sqoop, Spark, Kafka, etc.).
* Expert with Docker containers and Kubernetes
* Expert shell and scripting skills (Bash, Python, PHP, Perl, etc.).
* Experience administering high-performance, very large Hadoop clusters.
* Strong understanding of Hadoop architecture, storage and I/O subsystems, networking, and distributed systems.
* Experience with Kerberized Hadoop clusters
* Experience managing and developing with open source technologies and libraries.
* In-depth understanding of system-level resource consumption (memory, CPU, OS, storage, and networking) and Linux diagnostic commands such as sar and netstat.
* Familiarity with version control, job scheduling, and configuration management tools such as GitHub, Puppet, and UC4.
* Ability to lead and take ownership of projects.
Desired Skills/Experience:
* Experience with RDBMS technologies and SQL; Oracle and MySQL highly preferred.
* Knowledge of NoSQL platforms.
* Hands-on experience with open source data tools (Pig, Hive, Thrift API, etc.), including participation in the community.
* Large-scale data warehousing experience.
* Hadoop Certified
* AWS Certified
Comcast is an EOE/Veterans/Disabled/LGBT employer