Sr AI/ML Engineer
Location: Onsite - Boston, MA
Role:
We are looking for a highly skilled Senior AI/ML Engineer with a strong background in designing, deploying, and operationalizing AI/ML services in production environments. You will be a key contributor in building and maintaining robust, scalable systems that support machine learning workflows, including Large Language Models (LLMs) and AI agent frameworks. This position requires deep expertise in MLOps, distributed systems, cloud infrastructure (particularly AWS), and modern software development practices. The candidate will need to collaborate with cross-functional teams, drive outcomes by working backward from business objectives, and deliver impactful results within set timelines.
Key Responsibilities
- Design & Implement AI/ML Solutions
  o Architect and develop end-to-end ML solutions from data ingestion to model deployment, including LLM-based applications.
  o Evaluate and select appropriate frameworks, libraries, and tools to meet both short-term project goals and long-term scalability.
- LLM & Prompt Engineering
  o Develop and optimize prompts for Large Language Models (e.g., OpenAI, Claude, Llama) to improve the quality and relevance of outputs.
  o Conduct experiments to evaluate LLM performance and apply prompt engineering best practices to ensure high-impact results.
- AI Agent Frameworks
  o Incorporate AI agent frameworks (e.g., LangChain, AgentGPT, or similar) to enable autonomous or semi-autonomous decision-making within applications.
  o Integrate AI agents with existing systems, ensuring robust communication and secure data handling.
- MLOps & Production Operations
  o Set up and optimize CI/CD pipelines for ML models, ensuring continuous integration, testing, and deployment.
  o Monitor, troubleshoot, and refine production ML systems for performance, cost efficiency, and reliability.
- Cloud Development (AWS)
  o Leverage AWS services (e.g., EC2, S3, Lambda, SageMaker, EKS) to design and maintain scalable, secure, and cost-efficient ML infrastructure.
  o Implement best practices for cloud resource allocation, scaling, and maintenance.
- Software Engineering & Distributed Systems
  o Write clean, maintainable, and well-documented code in Python and other modern languages (e.g., Go, Java, or Rust).
  o Develop and maintain distributed systems, focusing on reliability, fault tolerance, and performance.
  o Work with databases (SQL/NoSQL) to handle large-scale data processing and storage.
- Front-End Integration
  o Collaborate on front-end projects using React/Next.js to build user interfaces or internal tools that interact with AI/ML services.
- Cross-Team Collaboration
  o Work closely with product managers, data scientists, DevOps engineers, and other stakeholders to define requirements and deliver high-impact solutions.
  o Communicate technical decisions effectively, balancing trade-offs between short-term needs and long-term product vision.
- Autonomy & Time Management
  o Operate with minimal supervision, proactively identifying issues and taking ownership to drive solutions.
  o Manage multiple priorities in a fast-paced environment and escalate blockers effectively to ensure timely delivery.
- Continuous Learning & Adaptability
  o Stay updated with emerging AI/ML technologies, LLM advancements, and best practices, sharing insights with the team.
  o Adapt quickly to new domains, frameworks, and technologies as project needs evolve.
Qualifications & Requirements
- Experience: professional software engineering experience, including distributed systems and databases.
Technical Skills:
- Required: AWS (or other major cloud provider) with hands-on experience in deploying, monitoring, and scaling production services.
- Python (preferred) and proficiency in at least one other modern programming language (e.g., Go, Java, Rust).
- Strong understanding of MLOps concepts, CI/CD pipelines, containeriz