Job ID: DEV-IND-2266
Department: Innovation & Strategy
Job Type: Permanent
Location: India
Job Details
We are seeking a visionary and highly skilled AI
Architect to join our leadership team. This pivotal role will be
responsible for defining and implementing the end-to-end architecture for
deploying our machine learning models, including advanced Generative AI and LLM
solutions, into production. You will lead and mentor a talented
cross-functional team of Data Scientists, Backend Developers, and DevOps
Engineers, fostering a culture of innovation, technical excellence, and
operational efficiency.
Key responsibilities:
- Architectural Leadership:
  - Design, develop, and own the scalable, secure, and reliable end-to-end architecture for deploying and serving ML models, with a strong focus on real-time inference and high availability.
  - Lead the strategy and implementation of the in-house API wrapper infrastructure for exposing ML models to internal and external customers.
  - Define architectural patterns, best practices, and governance for MLOps, ensuring robust CI/CD pipelines, model versioning, monitoring, and automated retraining.
  - Evaluate and select the optimal technology stack (cloud services, open-source frameworks, tools) for our ML serving infrastructure, balancing performance, cost, and maintainability.
- Team Leadership & Mentorship:
  - Lead, mentor, and inspire a diverse team of Data Scientists, Backend Developers, and DevOps Engineers, guiding them through complex architectural decisions and technical challenges.
  - Foster a collaborative environment that encourages knowledge sharing, continuous learning, and innovation across teams.
  - Drive technical excellence, code quality, and adherence to engineering best practices within the teams.
- Generative AI & LLM Expertise:
  - Architect and implement solutions for deploying Large Language Models (LLMs), including strategies for efficient inference, prompt engineering, and context management.
  - Drive the adoption and integration of techniques such as Retrieval-Augmented Generation (RAG) to enhance LLM capabilities with proprietary and up-to-date information.
  - Develop strategies for fine-tuning LLMs for specific downstream tasks and domain adaptation, ensuring efficient data pipelines and experimentation frameworks.
  - Stay abreast of the latest advancements in AI, particularly Generative AI, foundation models, and emerging MLOps tools, and evaluate their applicability to our business needs.
- Collaboration & Cross-Functional Impact:
  - Collaborate closely with Data Scientists to understand model requirements, optimize models for production, and integrate them seamlessly into the serving infrastructure.
  - Partner with Backend Developers to build robust, secure, and performant APIs that consume and serve ML predictions.
  - Work hand-in-hand with DevOps Engineers to automate deployment, monitoring, scaling, and operational excellence of the AI infrastructure.
  - Communicate complex technical concepts and architectural decisions effectively to both technical and non-technical stakeholders.
Requirements (Qualifications/Experience/Competencies)
- Education: Bachelor's or Master's degree in Computer Science, Machine Learning, Data Science, or a related quantitative field.
- Experience:
  - 10+ years of progressive experience in software engineering, including at least 5 years in an Architect or Lead role.
  - Proven experience leading and mentoring cross-functional engineering teams (Data Scientists, Backend Developers, DevOps Engineers).
  - Demonstrated experience designing, building, and deploying scalable, production-grade ML model serving infrastructure from the ground up.
- Technical Skills:
  - Deep expertise in MLOps principles and practices, including model versioning, serving, monitoring, and CI/CD for ML.
  - Strong proficiency in Python and experience with relevant web frameworks (e.g., FastAPI, Flask) for API development.
  - Expertise in containerization (Docker) and container orchestration (Kubernetes) for large-scale deployments.
  - Hands-on experience with at least one major cloud platform (AWS, Google Cloud, Azure) and its AI/ML services (e.g., SageMaker, Vertex AI, Azure ML).
  - Demonstrable experience with Large Language Models (LLMs), including deployment patterns, prompt engineering, and fine-tuning methodologies.
  - Practical experience implementing and optimizing Retrieval-Augmented Generation (RAG) systems.
  - Familiarity with distributed systems, microservices architecture, and API design best practices.
  - Experience with monitoring and observability tools (e.g., Prometheus, Grafana, ELK stack, Datadog).
  - Knowledge of infrastructure-as-code (IaC) tools (e.g., Terraform, CloudFormation).
- Leadership & Soft Skills:
  - Exceptional leadership, mentorship, and team-building abilities.
  - Strong analytical and problem-solving skills, with a track record of driving complex technical initiatives to completion.
  - Excellent verbal and written communication and interpersonal skills, with the ability to articulate technical concepts to diverse audiences.
  - Strategic thinker with the ability to align technical solutions with business objectives.
  - Proactive, self-driven, and continuously learning mindset.
Bonus Points:
- Experience with ML serving frameworks such as BentoML, KServe (formerly KFServing), TensorFlow Serving, TorchServe, or NVIDIA Triton Inference Server.
- Contributions to open-source MLOps or AI projects.
- Experience with data governance, data security, and compliance in an AI context.