Sr. Data Scientist
Location: Candidate local to Raleigh is strongly preferred (hybrid schedule 2x per week)
12+ months contract
PLEASE REVIEW IN DETAIL, AS CANDIDATES WILL NEED TO HAVE THE FOLLOWING SKILLS (NO EXCEPTIONS):
- Master's or PhD Preferred!
- 5-10+ years of experience in AI and machine learning, including model building, with strong coding skills in Python
- 2+ years of working knowledge of applying recent LLMs, including ChatGPT, GPT-3.5, OPT, BLOOM, etc., UTILIZING RAG!
- Experience working directly with large language models and Transformer-based architectures, including BERT, RoBERTa, T5, etc.
- Experience with conversational search / semantic search, reinforcement learning, prompt engineering, and hallucination mitigation
- Experience with DevOps repos: debugging, building APIs, and managing the algorithm flow across multiple workstreams in one repo
- Senior-level experience deploying models in the cloud (AWS primary, Azure as secondary)
Nice to have:
- FANG Experience (Facebook, Amazon, Netflix, Google, or even Microsoft)
- Python Proficiency: Expert-level Python, with experience writing efficient, clean, and modular code and the ability to debug and test new code thoroughly
- RAG Systems
- Experience with and deep understanding of Retrieval-Augmented Generation, including concepts like embedding-based search, document retrieval, and combining retrieved information with LLMs.
- Hands-on experience with advanced RAG platform development and maintenance.
- Familiarity with knowledge base creation, indexing, and retrieval pipelines.
- Knowledge of AI Architectures: Understanding of the end-to-end architecture of generative AI systems, including pre-processing, retrieval, ranking, and post-processing steps.
- Prompt Engineering
- Expertise in crafting effective prompts for LLMs tailored to specific tasks.
- Experience with techniques like zero-shot and few-shot prompting, prompt tuning, and chain-of-thought prompting.
- Content Generation
- Understanding of generative AI applications in content creation, including best practices for producing accurate, coherent, and domain-specific outputs.
- Ability to fine-tune components for custom use cases.
- Debugging and Performance Tuning
- Skills in profiling and optimizing LLM responses for latency and accuracy.
- Experience diagnosing issues in complex multi-component systems.
2. Monorepo and Collaboration Skills
- Working in Monorepo Environments
- Experience managing and contributing to large, centralized codebases (monorepos).
- Understanding of version control workflows suited for monorepos (e.g., Git-based branching strategies).
- Collaboration Tools and Practices
- Proficient with CI/CD pipelines and tools like Jenkins, GitHub Actions, or GitLab CI.
- Ability to work collaboratively with cross-functional teams in Agile settings.
- Proficiency with code review practices and tools.
3. AI and NLP Knowledge
- NLP Expertise
- Solid understanding of transformers, embeddings, and attention mechanisms.
- Familiarity with techniques for handling domain-specific language models.
4. Complementary Skills
- Documentation and Communication
- Ability to write clear technical documentation for processes, workflows, and API usage.
- Strong communication skills for conveying technical insights to stakeholders.
Preferred Experience
- Previous experience working in legal tech or domain-specific generative AI use cases.
- Hands-on experience with deploying AI models in production at scale.
- Familiarity with multilingual generative AI and fine-tuning for specific languages such as French.