Job ID: DEV-IND-2266

Department: Innovation & Strategy
Job Type: Permanent
Location: India

Job Details

We are seeking a visionary and highly skilled AI Architect to join our leadership team. In this pivotal role, you will define and implement the end-to-end architecture for deploying our machine learning models, including advanced Generative AI and LLM solutions, into production. You will also lead and mentor a talented cross-functional team of Data Scientists, Backend Developers, and DevOps Engineers, fostering a culture of innovation, technical excellence, and operational efficiency.

Key responsibilities:

  • Architectural Leadership:
    • Design, develop, and own the scalable, secure, and reliable end-to-end architecture for deploying and serving ML models, with a strong focus on real-time inference and high availability.
    • Lead the strategy and implementation of the in-house API wrapper infrastructure for exposing ML models to internal and external customers.
    • Define architectural patterns, best practices, and governance for MLOps, ensuring robust CI/CD pipelines, model versioning, monitoring, and automated retraining.
    • Evaluate and select the optimal technology stack (cloud services, open-source frameworks, tools) for our ML serving infrastructure, balancing performance, cost, and maintainability.
  • Team Leadership & Mentorship:
    • Lead, mentor, and inspire a diverse team of Data Scientists, Backend Developers, and DevOps Engineers, guiding them through complex architectural decisions and technical challenges.
    • Foster a collaborative environment that encourages knowledge sharing, continuous learning, and innovation across teams.
    • Drive technical excellence, code quality, and adherence to engineering best practices within the teams.
  • Generative AI & LLM Expertise:
    • Architect and implement solutions for deploying Large Language Models (LLMs), including strategies for efficient inference, prompt engineering, and context management.
    • Drive the adoption and integration of techniques like Retrieval Augmented Generation (RAG) to enhance LLM capabilities with proprietary and up-to-date information.
    • Develop strategies for fine-tuning LLMs for specific downstream tasks and domain adaptation, ensuring efficient data pipelines and experimentation frameworks.
    • Stay abreast of the latest advancements in AI, particularly in Generative AI, foundation models, and emerging MLOps tools, and evaluate their applicability to our business needs.
  • Collaboration & Cross-Functional Impact:
    • Collaborate closely with Data Scientists to understand model requirements, optimize models for production, and integrate them seamlessly into the serving infrastructure.
    • Partner with Backend Developers to build robust, secure, and performant APIs that consume and serve ML predictions.
    • Work hand-in-hand with DevOps Engineers to automate the deployment, monitoring, and scaling of the AI infrastructure and to ensure its operational excellence.
    • Communicate complex technical concepts and architectural decisions effectively to both technical and non-technical stakeholders.

Requirements (Qualifications/Experience/Competencies)

  • Education: Bachelor's or Master's degree in Computer Science, Machine Learning, Data Science, or a related quantitative field.
  • Experience:
    • 10+ years of progressive experience in software engineering, including at least 5 years in an Architect or Lead role.
    • Proven experience leading and mentoring cross-functional engineering teams (Data Scientists, Backend Developers, DevOps).
    • Demonstrated experience in designing, building, and deploying scalable, production-grade ML model serving infrastructure from the ground up.
  • Technical Skills:
    • Deep expertise in MLOps principles and practices, including model versioning, serving, monitoring, and CI/CD for ML.
    • Strong proficiency in Python and experience with relevant web frameworks (e.g., FastAPI, Flask) for API development.
    • Expertise in containerization technologies (Docker) and container orchestration (Kubernetes) for large-scale deployments.
    • Hands-on experience with at least one major cloud platform (AWS, Google Cloud, Azure) and its AI/ML services (e.g., SageMaker, Vertex AI, Azure ML).
    • Demonstrable experience with Large Language Models (LLMs), including deployment patterns, prompt engineering, and fine-tuning methodologies.
    • Practical experience implementing and optimizing Retrieval Augmented Generation (RAG) systems.
    • Familiarity with distributed systems, microservices architecture, and API design best practices.
    • Experience with monitoring and observability tools (e.g., Prometheus, Grafana, ELK stack, Datadog).
    • Knowledge of infrastructure as code (IaC) tools (e.g., Terraform, CloudFormation).
  • Leadership & Soft Skills:
    • Exceptional leadership, mentorship, and team-building abilities.
    • Strong analytical and problem-solving skills, with a track record of driving complex technical initiatives to completion.
    • Excellent communication (verbal and written) and interpersonal skills, with the ability to articulate technical concepts to diverse audiences.
    • Strategic thinker with the ability to align technical solutions with business objectives.
    • Proactive, self-driven, and continuously learning mindset.

Bonus Points:

  • Experience with specific ML serving frameworks like BentoML, KServe/KFServing, TensorFlow Serving, TorchServe, or NVIDIA Triton Inference Server.
  • Contributions to open-source MLOps or AI projects.
  • Experience with data governance, data security, and compliance in an AI context.