Title: Data Scientist
Location: McLean, VA
Clearance: Active TS/SCI w/ Polygraph needed to apply
Cornerstone Defense is the Employer of Choice within the Intelligence, Defense, and Space communities of the U.S. Government. Realizing early on that our most prized assets are our employees, we continually focus our attention on improving the overall work/life experience they have supporting the mission. Our Team is pushed every day to use their industry-leading knowledge to provide end-to-end solutions to combat our nation’s toughest and most secure problems. If you are looking for a place to not only be professionally challenged, but encouraged and supported by a company that cares, don’t look any further than Cornerstone Defense.
The Sponsor provides data-driven business analysis to support senior organizational leaders. The Sponsor requires support specializing in natural language processing (NLP) and associated data preparation to help identify challenges and opportunities for the Sponsor’s customers. The Sponsor needs a Contractor experienced in SQL and Python to transform the Sponsor’s structured and unstructured data into clear, well-supported analytic insights that help customers make decisions about production, resources, and personnel. The work may be performed independently or within a team environment.
Data Science Support – HRR: NO
The Contractor shall work closely with the Sponsor’s data scientists and technical team to implement requirements; however, the Sponsor’s GTM will manage the priorities.
• The Contractor shall conduct sophisticated analysis using deployed tools and natural language processing.
• The Contractor shall analyze large amounts of raw data, including text data, to provide business insights.
• The Contractor shall preprocess or clean structured and unstructured Sponsor data, including text data.
• The Contractor shall design and implement advanced ETL code and table configurations for complex data sets.
• The Contractor shall use Structured Query Language (SQL) in the Sponsor’s Oracle database to develop and organize relevant information with supporting analytics.
• The Contractor shall independently, or with a team, author analytic publications and produce ad-hoc reports to include data visualizations using the Sponsor’s templates.
• The Contractor shall stay current with the Sponsor’s enterprise metadata collection tools.
• The Contractor shall implement the Sponsor’s existing coordination process.
• The Contractor shall provide technical education to staff on an ad-hoc basis.
• The Contractor shall provide subject matter expertise in NLP to support the Sponsor’s initiatives.
1. Demonstrated professional or academic experience performing NLP tasks, including selecting the best Python libraries for a given task, choosing appropriate pre-processing actions, performing analysis, and assessing model performance.
2. Demonstrated professional or academic experience using Python NLP packages such as spaCy, Gensim, or NLTK to analyze or process collections of documents.
3. Demonstrated professional or academic experience with deep learning frameworks such as PyTorch, TensorFlow, or Keras.
4. Demonstrated professional or academic experience with the Hugging Face Transformers library and Hub.
5. Demonstrated experience creating machine learning models that conduct text classification and topic modeling in Python using standard machine learning (scikit-learn) or deep learning models.
6. Demonstrated academic or professional experience using encoder-decoder and generative language models to perform NLP tasks.
7. Demonstrated academic or professional experience communicating methodological choices and model results.
8. Demonstrated professional or academic experience and proficiency with SQL, to include using common table expressions, set operations, aggregate functions, and nested subqueries.
9. Demonstrated professional or academic experience with version control and continuous integration tools such as GitHub and Jenkins.
10. Demonstrated experience leveraging GPUs for accelerated computing.
1. Demonstrated experience writing Python scripts that pull data from web-based APIs and relational databases.
2. Demonstrated experience with cloud computing development and architecture.
3. Demonstrated experience with web development frameworks such as Flask.
4. Demonstrated experience developing applications for semantic search.
5. Demonstrated experience tuning LLMs on custom data sets and applying results to specific use cases.
6. Demonstrated professional or academic experience and proficiency with Tableau to produce visualizations and dashboards.