LLM Research & Engineering

NVIDIA Partnered

LLM Researcher / Engineer

Sudha Gopalakrishnan Brain Centre, IIT Madras

Contributed to neuroscience research at the Sudha Gopalakrishnan Brain Centre, a facility running on NVIDIA DGX A100 systems and SuperPOD technology. Developed and fine-tuned Vision-Language Models (VLMs) and LLMs on histology data, built internal AI tools, and deployed production-ready inference pipelines as Jenkins plugins that integrate into the research workflow.

Brain Centre Website

DHARANI Brain Atlas

Duration

December 2025 - February 2026

3-month intensive LLM research and engineering role with hands-on model development

Location

IIT Madras Campus

NAC-1 Building, IIT Madras, Chennai - 600036 (On-site)

Organization

Sudha Gopalakrishnan Brain Centre

World-class neuroscience research facility with NVIDIA-powered HPC infrastructure

Completed

NVIDIA Partnership & Infrastructure

The Sudha Gopalakrishnan Brain Centre is supported by NVIDIA and uses DGX A100 systems and SuperPOD technology to process petabytes of brain imaging data at cellular resolution. This high-performance computing infrastructure is used to process large histology datasets, train large-scale deep learning models, and run Vision-Language Model inference for neuroscience research.

Role Overview

As an LLM Researcher and Engineer at the Brain Centre, I developed AI-powered tools for neuroscience research. My primary focus was fine-tuning Vision-Language Models (VLMs), including OpenCLIP and other multimodal architectures, on high-resolution histology data of human brain tissue. I built internal tools for automated analysis, created production-ready inference pipelines, and deployed them as Jenkins plugins that fit into the research team's existing workflows. This work ran on the centre's NVIDIA HPC infrastructure, which made it possible to process and analyze brain imaging data at large scale.

Key Responsibilities

Vision-Language Model Development

Fine-tuned advanced Vision-Language Models including OpenCLIP, BLIP-2, and custom multimodal architectures on histology imaging data. Developed specialized embeddings for brain tissue classification, cell-type identification, and anatomical region segmentation using contrastive learning approaches optimized for microscopy images.

OpenCLIP, BLIP-2, Vision Transformers, Contrastive Learning
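
To make the contrastive learning objective mentioned above concrete, here is a minimal sketch of a CLIP-style symmetric contrastive loss in plain Python. It is an illustration under assumptions, not the training code used at the Brain Centre: the function names are hypothetical, the temperature of 0.07 is only a commonly used starting value, and a real pipeline would compute this on GPU tensors in PyTorch rather than on Python lists.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def log_softmax(row):
    """Numerically stable log-softmax over one row of logits."""
    m = max(row)
    log_sum = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - log_sum for x in row]


def clip_contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric cross-entropy over an image-text similarity matrix.

    Pair i (image_embs[i], text_embs[i]) is treated as the positive;
    every other combination in the batch is treated as a negative.
    """
    if len(image_embs) != len(text_embs) or not image_embs:
        raise ValueError("need the same non-zero number of images and texts")
    images = [l2_normalize(v) for v in image_embs]
    texts = [l2_normalize(v) for v in text_embs]
    n = len(images)

    # logits[i][j] = cosine similarity of image i and text j, scaled by temperature
    logits = [
        [sum(a * b for a, b in zip(img, txt)) / temperature for txt in texts]
        for img in images
    ]
    # Image-to-text direction: row i should put its mass on column i.
    loss_i2t = -sum(log_softmax(logits[i])[i] for i in range(n)) / n
    # Text-to-image direction: same computation on the transposed matrix.
    logits_t = [list(col) for col in zip(*logits)]
    loss_t2i = -sum(log_softmax(logits_t[i])[i] for i in range(n)) / n
    return 0.5 * (loss_i2t + loss_t2i)
```

With correctly paired embeddings the loss approaches zero, while shuffled pairs produce a large loss; that gap is the signal fine-tuning relies on to pull matching tissue images and descriptions together in embedding space.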

Histology Data Fine-tuning

Curated and preprocessed large-scale histology datasets from whole-brain imaging pipelines. Implemented domain-specific data augmentation strategies for microscopy images, handled gigapixel whole-slide images using efficient tiling approaches, and developed custom training pipelines optimized for the DGX A100 multi-GPU environment.

Histology Analysis, Whole-Slide Imaging, Data Augmentation, Multi-GPU Training
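
The tiling approach for gigapixel whole-slide images can be sketched as a generator of tile coordinates. This is a simplified illustration, not the centre's pipeline: the function names, the 512-pixel tile size, and the 64-pixel overlap are assumptions chosen for the example, and reading the actual pixels for each tile (for instance through a slide-reading library) is left out.

```python
def tile_origins(length, tile, stride):
    """Start offsets along one axis so that tiles cover [0, length).

    The last tile is shifted back to end exactly at the image edge,
    which avoids padding at the cost of a slightly larger overlap.
    """
    if length <= tile:
        return [0]
    starts = list(range(0, length - tile + 1, stride))
    if starts[-1] + tile < length:
        starts.append(length - tile)
    return starts


def iter_tiles(width, height, tile=512, overlap=64):
    """Yield (x, y, w, h) boxes covering a width x height image, row by row."""
    if tile <= 0 or not 0 <= overlap < tile:
        raise ValueError("require tile > 0 and 0 <= overlap < tile")
    stride = tile - overlap
    for y in tile_origins(height, tile, stride):
        for x in tile_origins(width, tile, stride):
            # Tiles are clipped only when the image is smaller than one tile.
            yield (x, y, min(tile, width - x), min(tile, height - y))


if __name__ == "__main__":
    for box in iter_tiles(1000, 600):
        print(box)
```

Because the generator yields coordinates lazily, only one tile needs to be held in memory at a time, which is what makes this pattern practical for images far larger than RAM.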

Internal AI Tools Development

Built various internal tools for the research team including automated image quality assessment systems, intelligent tissue region annotation assistants, and LLM-powered documentation generators. Created REST APIs for model inference, developed interactive visualization dashboards, and implemented batch processing pipelines for large-scale analysis.

FastAPI, REST APIs, Batch Processing, Python

Jenkins Plugin Deployment

Deployed fine-tuned models as production-ready Jenkins plugins for seamless CI/CD integration. Containerized inference pipelines using Docker, implemented model versioning and A/B testing capabilities, and created automated deployment workflows that enabled researchers to run AI-powered analysis as part of their standard processing pipelines.

Jenkins, Docker, CI/CD, MLOps
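
One common way to implement A/B testing between model versions is deterministic, hash-based routing, sketched below. This is an assumption about how such a capability could work, not a description of the deployed system: `assign_variant`, the salt string, and the version names are all hypothetical.

```python
import hashlib


def assign_variant(sample_id, variants, salt="ab-test-v1"):
    """Pick a model variant for a sample, deterministically and by weight.

    `variants` maps a variant name to a non-negative weight. The same
    sample_id always lands on the same variant for a given salt, so
    re-running a pipeline job does not change which model handled it.
    """
    total = sum(variants.values())
    if total <= 0:
        raise ValueError("variants must have a positive total weight")

    # Hash the id to a point in [0, 1) using the first 8 bytes of SHA-256.
    digest = hashlib.sha256(f"{salt}:{sample_id}".encode("utf-8")).digest()
    point = int.from_bytes(digest[:8], "big") / 2**64

    cumulative = 0.0
    for name, weight in variants.items():
        cumulative += weight / total
        if point < cumulative:
            return name
    return name  # guard against floating-point rounding at the upper edge


if __name__ == "__main__":
    weights = {"model-v1": 0.9, "model-v2": 0.1}  # hypothetical version names
    print(assign_variant("slide-001", weights))
```

Changing the salt reshuffles the assignment, which is a simple way to start a fresh experiment without touching the sample identifiers.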

Technical Environment

NVIDIA HPC Infrastructure

DGX A100 Systems

Multiple NVIDIA DGX A100 nodes with 8x A100 80GB GPUs each, NVLink interconnect, and 2TB system memory for training large-scale vision-language models.

SuperPOD Technology

NVIDIA SuperPOD architecture enabling petaflop-scale computing for processing massive brain imaging datasets and running distributed training workloads.

Storage Infrastructure

High-performance parallel file systems with petabyte-scale storage for whole-brain imaging data at cellular resolution.

AI/ML Software Stack

Deep Learning Frameworks

PyTorch with DeepSpeed and FSDP for distributed training, Hugging Face Transformers, and OpenCLIP for vision-language model development.

Model Optimization

NVIDIA TensorRT for inference optimization, Triton Inference Server for production deployment, and mixed-precision training with AMP.

MLOps & Deployment

Jenkins for CI/CD pipelines, Docker and Kubernetes for containerized deployments, MLflow for experiment tracking and model registry.

Technology Stack

Python, PyTorch, OpenCLIP, BLIP-2, Vision Transformers, Hugging Face, NVIDIA DGX A100, TensorRT, Triton Server, DeepSpeed, FSDP, Jenkins, Docker, Kubernetes, FastAPI, MLflow, CUDA, Mixed Precision, Histology Analysis, Contrastive Learning

Key Achievements

VLM Fine-tuning Pipeline

Developed and deployed an end-to-end fine-tuning pipeline for Vision-Language Models on histology data, improving tissue classification accuracy over baseline models.

Production Jenkins Integration

Deployed AI inference capabilities as production-ready Jenkins plugins, enabling seamless integration into the research team's existing imaging pipeline workflow.

Multi-GPU Distributed Training

Implemented efficient distributed training strategies using DeepSpeed and FSDP, enabling training of large VLMs across multiple A100 GPUs with near-linear scaling.
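
"Near-linear scaling" is usually quantified as scaling efficiency: measured multi-GPU throughput divided by the ideal of N times the single-GPU throughput. The sketch below shows that arithmetic. The function names are hypothetical and the throughput numbers in the usage section are made up for illustration; they are not measurements from this project.

```python
def speedup(throughput_1gpu, throughput_ngpu):
    """How many times faster the multi-GPU run is than the single-GPU run."""
    if throughput_1gpu <= 0:
        raise ValueError("single-GPU throughput must be positive")
    return throughput_ngpu / throughput_1gpu


def scaling_efficiency(throughput_1gpu, throughput_ngpu, n_gpus):
    """Fraction of ideal linear speedup achieved; 1.0 means perfectly linear."""
    if n_gpus <= 0:
        raise ValueError("n_gpus must be positive")
    return speedup(throughput_1gpu, throughput_ngpu) / n_gpus


if __name__ == "__main__":
    # Hypothetical numbers: 100 samples/s on 1 GPU, 760 samples/s on 8 GPUs.
    print(f"speedup: {speedup(100.0, 760.0):.2f}x")
    print(f"efficiency: {scaling_efficiency(100.0, 760.0, 8):.1%}")
```

An efficiency close to 1.0 indicates that communication and synchronization overhead stay small as GPUs are added.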

Internal Tools Suite

Built a comprehensive suite of internal AI tools for automated image quality assessment, tissue annotation assistance, and batch processing of whole-slide images.

About the Brain Centre

Sudha Gopalakrishnan Brain Centre

The Sudha Gopalakrishnan Brain Centre at IIT Madras operates a high-throughput, multimodal whole-brain histology-imaging-compute pipeline to digitize and study human brains at unprecedented resolution and scale.

The centre leverages expertise across IIT Madras, national, and international collaborations, aiming to become a globally leading R&D centre for human brain research with transformative impact in neuroscience and neurotechnologies.

Its flagship project, DHARANI (Developing Human-brain Atlas Resource to Advance Neuroscience Internationally), creates a comprehensive 3D atlas of the human brain at cellular resolution.

Petabyte-Scale Data

Whole-brain imaging at cellular resolution

NVIDIA-Powered

DGX A100 & SuperPOD infrastructure

Multi-Modal Imaging

High-throughput histology pipeline

Documents & Resources

Internship Certificate

Official completion certificate from the Sudha Gopalakrishnan Brain Centre, IIT Madras, acknowledging LLM research and engineering contributions