4M Research, Inc. (4M) is a woman-owned small business established in 2007 and headquartered in Huntsville, AL. 4M provides Systems and Software Engineering and Analysis, Information Technology, Planning and Analysis, and Field Support Services to a variety of customers, including MDA, NASA, and AMCOM. Our continuous and rapid growth is attributable to the caliber of employees 4M strives to hire and retain.
4M Research is looking for a HADOOP Developer who will be part of an SAP BI/BW team. Responsibilities include providing data resident in HADOOP to reports being developed in BW on HANA using various tools. As new customers are defined, new data sets will have to be brought into HADOOP and provided for reporting purposes.
* Writing complex HADOOP MapReduce programs
* Writing Pig scripts
* Building new HADOOP clusters and maintaining the privacy and security of the HADOOP clusters
* Designing and implementing column family schemas of Hive and HBase within HDFS
* Applying different HDFS file formats and structures, such as Parquet and Avro, to speed up analytics
* Developing efficient Pig and Hive scripts with joins on datasets using various techniques
* Defining and running workflows and scheduling HADOOP jobs using Oozie
* Loading data from different datasets and deciding which file format is most efficient for a task
* Cleaning data as per business requirements using streaming APIs or user-defined functions
* Building distributed, reliable, and scalable data pipelines to ingest and process data in real time
* Fine-tuning HADOOP applications for high performance and throughput
* Reviewing and managing HADOOP log files
* Troubleshooting and debugging any HADOOP ecosystem runtime issues
* 12+ years of experience
* Bachelor's degree or Master's degree plus 10 years of total relevant experience
* Active Security Clearance or ability to obtain Security Clearance (or company may sponsor qualified/suitable candidates)
* Willingness to travel to Aberdeen, MD; Huntsville, AL; and Ft. Lee, VA (25-50%)
* Strong understanding of the requirements of input-to-output transformations
* Strong knowledge of the HADOOP ecosystem and its components (HBase, Pig, Hive, Sqoop, Flume, Oozie, etc.)
* Strong skills in the Java essentials for HADOOP
* Strong know-how in basic Linux administration
* Strong knowledge of scripting languages like Python or Perl
* Excellent analytical and problem-solving skills
* Data modelling experience with OLTP and OLAP
* Good knowledge of concurrency and multi-threading concepts
* Understanding of various data visualization tools like Tableau, QlikView, etc.
* Basic knowledge of SQL, database structures, principles, and theories
* Basic knowledge of popular ETL tools like Pentaho, Informatica, Talend, etc.
Qualified candidates should apply online at www.4mresearch.com
4M Research, Inc. is an Affirmative Action/Equal Opportunity Employer and an active participant in the Employment Eligibility Verification Program (E-Verify).