LLM Research & Engineering, NVIDIA Partnered

LLM Researcher / Engineer

Sudha Gopalakrishnan Brain Centre, IIT Madras

Contributed to neuroscience research at the Sudha Gopalakrishnan Brain Centre, a facility running on NVIDIA DGX A100 systems and SuperPOD technology. Fine-tuned Vision-Language Models (VLMs) and LLMs on histology data, built internal AI tools, and deployed production-ready inference pipelines as Jenkins plugins that fit into the existing research workflow.

Sudha Gopalakrishnan Brain Centre

IIT Madras

NVIDIA DGX A100 SuperPOD
Duration

December 2025 - February 2026

3-month intensive LLM research and engineering role with hands-on model development

Location

IIT Madras Campus

NAC-1 Building, IIT Madras, Chennai - 600036 (On-site)

Organization

Sudha Gopalakrishnan Brain Centre

World-class neuroscience research facility with NVIDIA-powered HPC infrastructure

Completed

NVIDIA Partnership & Infrastructure

The Sudha Gopalakrishnan Brain Centre is supported by NVIDIA and uses DGX A100 systems and SuperPOD technology to process petabytes of brain imaging data at cellular resolution. This infrastructure supports three workloads: processing massive histology datasets, training large-scale deep learning models, and running Vision-Language Model inference for neuroscience research.

Role Overview

As an LLM Researcher and Engineer at the Brain Centre, I developed AI-powered tools for neuroscience research. My primary focus was fine-tuning Vision-Language Models (VLMs), including OpenCLIP and other multimodal architectures, on high-resolution histology data of human brain tissue. I built internal tools for automated analysis, created production-ready inference pipelines, and deployed them as Jenkins plugins that integrate with the research team's existing workflows. This work ran on the Centre's NVIDIA HPC infrastructure, which made it practical to process and analyze brain imaging data at petabyte scale.

Key Responsibilities

Vision-Language Model Development

Fine-tuned advanced Vision-Language Models including OpenCLIP, BLIP-2, and custom multimodal architectures on histology imaging data. Developed specialized embeddings for brain tissue classification, cell-type identification, and anatomical region segmentation using contrastive learning approaches optimized for microscopy images.

OpenCLIP, BLIP-2, Vision Transformers, Contrastive Learning
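
As a rough illustration of the contrastive objective referred to above, the sketch below computes a CLIP-style symmetric loss over paired image and text embeddings in plain Python. It is not the code used at the Centre: the function names and the temperature of 0.07 are assumptions chosen for illustration, and a real pipeline would use batched tensor operations in PyTorch or OpenCLIP instead.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def log_softmax(row):
    """Numerically stable log-softmax over a list of logits."""
    m = max(row)
    log_sum = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - log_sum for x in row]


def clip_contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric cross-entropy over the image/text similarity matrix.

    Row i of image_embs is assumed to be paired with row i of text_embs.
    The temperature value is an illustrative assumption.
    """
    if len(image_embs) != len(text_embs):
        raise ValueError("need the same number of images and texts")
    images = [l2_normalize(v) for v in image_embs]
    texts = [l2_normalize(v) for v in text_embs]
    n = len(images)
    # logits[i][j] = cosine similarity of image i and text j, scaled by temperature
    logits = [
        [sum(a * b for a, b in zip(img, txt)) / temperature for txt in texts]
        for img in images
    ]
    # Image-to-text direction: each row should put its mass on the diagonal.
    loss_i2t = -sum(log_softmax(logits[i])[i] for i in range(n)) / n
    # Text-to-image direction: same idea over the columns.
    columns = [[logits[i][j] for i in range(n)] for j in range(n)]
    loss_t2i = -sum(log_softmax(columns[j])[j] for j in range(n)) / n
    return 0.5 * (loss_i2t + loss_t2i)
```

The loss is small when each image embedding is closest to its own caption embedding and grows when the pairs are mismatched, which is the property that makes the learned embeddings usable for tissue classification and retrieval.
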
Histology Data Fine-tuning

Curated and preprocessed large-scale histology datasets from whole-brain imaging pipelines. Implemented domain-specific data augmentation strategies for microscopy images, handled gigapixel whole-slide images using efficient tiling approaches, and developed custom training pipelines optimized for the DGX A100 multi-GPU environment.

Histology Analysis, Whole-Slide Imaging, Data Augmentation, Multi-GPU Training
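
The tiling of gigapixel whole-slide images described above can be sketched as a generator of tile bounding boxes. This is a minimal sketch rather than the pipeline's actual code: the names `tile_origins` and `tile_boxes`, the 1024-pixel tile size, and the 128-pixel overlap are all hypothetical, and reading the pixels for each box would be done by a slide-reading library that is not shown here.

```python
def tile_origins(length, tile, stride):
    """Start offsets along one axis so that tiles cover the range [0, length)."""
    if tile <= 0 or stride <= 0:
        raise ValueError("tile and stride must be positive")
    if length <= tile:
        return [0]
    origins = list(range(0, length - tile + 1, stride))
    if origins[-1] + tile < length:
        # Shift the final tile back so it ends exactly at the image border.
        origins.append(length - tile)
    return origins


def tile_boxes(width, height, tile=1024, overlap=128):
    """Yield (x, y, w, h) boxes that cover a width x height image.

    Neighbouring tiles share `overlap` pixels so structures cut by a tile
    border appear whole in at least one tile. Default values are illustrative.
    """
    stride = tile - overlap
    if stride <= 0:
        raise ValueError("overlap must be smaller than the tile size")
    for y in tile_origins(height, tile, stride):
        for x in tile_origins(width, tile, stride):
            yield (x, y, min(tile, width - x), min(tile, height - y))
```

Generating boxes lazily keeps memory use independent of slide size, since only the tile currently being processed needs to be loaded.
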
Internal AI Tools Development

Built various internal tools for the research team including automated image quality assessment systems, intelligent tissue region annotation assistants, and LLM-powered documentation generators. Created REST APIs for model inference, developed interactive visualization dashboards, and implemented batch processing pipelines for large-scale analysis.

FastAPI, REST APIs, Batch Processing, Python
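
A batch processing pipeline of the kind mentioned above usually reduces to grouping inputs into fixed-size batches and passing each batch to the model. The sketch below shows that pattern under stated assumptions: `batched`, `run_batches`, and `predict_fn` are hypothetical names, and `predict_fn` stands in for whatever inference call the real tools made.

```python
from itertools import islice


def batched(items, batch_size):
    """Yield successive lists of at most batch_size items from any iterable."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def run_batches(items, predict_fn, batch_size=32):
    """Apply predict_fn to each batch and return the results in input order.

    predict_fn is a placeholder for a model call: it takes a list of inputs
    and returns one result per input.
    """
    results = []
    for batch in batched(items, batch_size):
        results.extend(predict_fn(batch))
    return results
```

Because `batched` consumes an iterator, the same function works for an in-memory list or a stream of tile paths read from disk.
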
Jenkins Plugin Deployment

Deployed fine-tuned models as production-ready Jenkins plugins for seamless CI/CD integration. Containerized inference pipelines using Docker, implemented model versioning and A/B testing capabilities, and created automated deployment workflows that enabled researchers to run AI-powered analysis as part of their standard processing pipelines.

Jenkins, Docker, CI/CD, MLOps

Technical Environment

NVIDIA HPC Infrastructure

DGX A100 Systems

Multiple NVIDIA DGX A100 nodes, each with 8x A100 80 GB GPUs (640 GB of GPU memory per node), NVLink interconnect, and 2 TB of system memory, used for training large-scale vision-language models.

SuperPOD Technology

NVIDIA SuperPOD architecture enabling petaflop-scale computing for processing massive brain imaging datasets and running distributed training workloads.

Storage Infrastructure

High-performance parallel file systems with petabyte-scale storage for whole-brain imaging data at cellular resolution.

AI/ML Software Stack

Deep Learning Frameworks

PyTorch with DeepSpeed and FSDP for distributed training, Hugging Face Transformers, and OpenCLIP for vision-language model development.

Model Optimization

NVIDIA TensorRT for inference optimization, Triton Inference Server for production deployment, and mixed-precision training with AMP.

MLOps & Deployment

Jenkins for CI/CD pipelines, Docker and Kubernetes for containerized deployments, MLflow for experiment tracking and model registry.

Technology Stack

Python, PyTorch, OpenCLIP, BLIP-2, Vision Transformers, Hugging Face, NVIDIA DGX A100, TensorRT, Triton Server, DeepSpeed, FSDP, Jenkins, Docker, Kubernetes, FastAPI, MLflow, CUDA, Mixed Precision, Histology Analysis, Contrastive Learning

Key Achievements

VLM Fine-tuning Pipeline

Developed and deployed a complete fine-tuning pipeline for Vision-Language Models on histology data, improving tissue classification accuracy over baseline models.

Production Jenkins Integration

Deployed AI inference capabilities as production-ready Jenkins plugins, enabling seamless integration into the research team's existing imaging pipeline workflow.

Multi-GPU Distributed Training

Implemented distributed training with DeepSpeed and FSDP, enabling large VLMs to be trained across multiple A100 GPUs with near-linear scaling.
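
"Near-linear scaling" is commonly quantified as scaling efficiency: the measured multi-GPU throughput divided by the single-GPU throughput times the number of GPUs. The helper below is a small sketch of that calculation; the function name is hypothetical and the numbers in the usage note are made up for illustration, not measurements from this project.

```python
def scaling_efficiency(throughput_single, throughput_multi, num_gpus):
    """Return measured multi-GPU throughput as a fraction of ideal linear scaling.

    A value of 1.0 means perfectly linear scaling; values just below 1.0
    are what "near-linear" usually refers to.
    """
    if throughput_single <= 0 or num_gpus < 1:
        raise ValueError("throughput must be positive and num_gpus at least 1")
    return throughput_multi / (throughput_single * num_gpus)
```

For example, with purely illustrative figures of 100 samples per second on one GPU and 760 samples per second on eight, `scaling_efficiency(100.0, 760.0, 8)` gives 0.95, meaning the run reaches 95% of ideal linear speedup.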

Internal Tools Suite

Built a comprehensive suite of internal AI tools for automated image quality assessment, tissue annotation assistance, and batch processing of whole-slide images.

About the Brain Centre

Sudha Gopalakrishnan Brain Centre

The Sudha Gopalakrishnan Brain Centre at IIT Madras operates a high-throughput, multimodal, whole-brain histology, imaging, and compute pipeline to digitize and study human brains at unprecedented resolution and scale.

The centre leverages expertise across IIT Madras, national, and international collaborations, aiming to become a globally leading R&D centre for human brain research with transformative impact in neuroscience and neurotechnologies.

Their flagship project, DHARANI (Developing Human-brain Atlas Resource to Advance Neuroscience Internationally), creates a comprehensive 3D atlas of the human brain at cellular resolution.

Petabyte-Scale Data

Whole-brain imaging at cellular resolution

NVIDIA-Powered

DGX A100 & SuperPOD infrastructure

Multi-Modal Imaging

High-throughput histology pipeline

Documents & Resources

Internship Certificate

Official completion certificate from the Sudha Gopalakrishnan Brain Centre, IIT Madras, acknowledging LLM research and engineering contributions