Knowledge Management (Data Scientist) - SME
Raytheon is seeking experienced data scientist/applications developer SMEs to create, leverage, and apply data science approaches to successfully implement information management solutions.
* Identify, implement and improve methods for duplicate detection, document categorization, entity and information extraction using natural language processing, machine learning, data mining, and statistical algorithms.
* Assist with the design and implementation of visualizations and reports for business intelligence metrics.
* Propose, implement, and evaluate content analytic strategies for characterizing and categorizing large data sets of unstructured files and messages using COTS/GOTS/Open Source tools.
* Develop custom software as required by Sponsor to characterize and categorize large datasets of unstructured files and messages.
* Serve as a Subject Matter Expert (SME) in discussions with analytic tool developers and enterprise IT management.
* Partner with information management SMEs to define and refine the framework, strategies, and actions for collecting and analyzing unstructured file metadata and content stored in Sponsor's automated systems (e.g., email repositories, databases, shared drives).
* Implement collection and analysis actions, such as ingesting, indexing, normalizing, and structuring file content and metadata in preparation for analysis using tools in the big data environment (GOTS, COTS, and open source tools including but not limited to Hadoop, Hive, Tableau, Spark, Visual Studio, TensorFlow, and other emerging technologies).
* Partner with information management SMEs to determine a baseline, analyze patterns and characteristics in file content and metadata, and construct visualizations to share lessons learned and provide recommendations based upon analytic results.
* Lead and/or contribute to discussions with Sponsor and Sponsor partners on collection and analysis framework, strategies, processes, and methodologies.
* Build relationships with stakeholders to negotiate access, security, and storage needs for the unstructured file objects and the features created during the collection and analysis process.
* Provide recommendations and training to Sponsor and Sponsor partners on techniques and tools in the big data environment.
* Write MapReduce jobs, Hive queries, and Python, Java, Scala, and R programs as appropriate to perform various tasks related to machine learning and data science activities, including data cleanup, data transformation, data mashing, data searching, and algorithm parallelization.
* Implement algorithms from various sources (academia, federal labs, or other Government Agencies) as parallelized MapReduce jobs.
* Analyze and correlate large amounts of data.
* Run machine learning workflows using various platforms (such as Python, Spark, and TensorFlow) on large amounts of data.
* Administer, configure, and optimize a distributed cluster ecosystem such as Hadoop or Spark.
* Demonstrated on-the-job experience integrating and analyzing large data sets using big-data technologies such as Hadoop and Spark
* Demonstrated on-the-job experience indexing data sets using Solr and/or Elasticsearch
* Demonstrated on-the-job experience with Java, Python, and Bash scripting
* Demonstrated on-the-job experience proposing, implementing and evaluating strategies for characterizing and categorizing large data sets.
* Demonstrated on-the-job experience performing statistical analysis on large data sets
* Ability to communicate technical concepts to a non-technical audience.
* Understand and implement methodologies that are consistent with standard techniques in the data science field.
* Familiarity with Linux/Windows
* Systems administration of AWS
* Systems administration of distributed systems such as Hadoop and Spark
* Experience using databases such as Oracle and MySQL
* Familiarity with scikit-learn, Gensim, NLTK, spaCy and the application of these tools to Natural Language Processing
* Familiarity with Theano, TensorFlow, Torch, Keras, MXNet, Deeplearning4j and the application of these tools to Natural Language Processing
* Familiarity with classification, clustering, and dimensionality reduction algorithms such as LightGBM, XGBoost, Random Forest, Support Vector Machine, K-means, and t-SNE
127522BR
Business Unit Profile
Raytheon Intelligence, Information and Services (IIS) is a leader in intelligence, surveillance and reconnaissance; advanced cyber solutions; weather and environmental solutions; information-based solutions for law enforcement and homeland security; and training, logistics, engineering, product support, and operational support services and solutions for the mission support, homeland security, space, civil aviation, counter-proliferation and counter-terrorism markets. IIS, which operates at nearly 551 sites in 80 countries, is headquartered in Dulles, VA, and generated $6 billion in 2014 revenues. As a global business, our leaders must have the ability to understand, embrace and operate in a multicultural world, both in the marketplace and in the workplace. We strive to hire individuals who reflect our communities and proactively embrace diversity and inclusion in order to advance our culture, develop our employees and leaders, and grow our market share with our clients.
TS/SCI with Poly - Current
VA - Herndon
Raytheon is an Equal Opportunity/Affirmative Action employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, creed, sex, sexual orientation, gender identity, national origin, disability, or protected Veteran status.
Raytheon is a technology company that specializes in defense and other government markets.