Sudha Gopalakrishnan Brain Centre, IIT Madras
Contributed to cutting-edge neuroscience research at the Sudha Gopalakrishnan Brain Centre, a world-class facility powered by NVIDIA DGX A100 systems and SuperPOD technology. Developed and fine-tuned advanced Vision-Language Models (VLMs) and LLMs on histology data, built internal AI tools, and deployed production-ready inference pipelines as Jenkins plugins for seamless integration into the research workflow.
IIT Madras
December 2025 - February 2026
3-month intensive LLM research and engineering role with hands-on model development
IIT Madras Campus
NAC-1 Building, IIT Madras, Chennai - 600036 (On-site)
Sudha Gopalakrishnan Brain Centre
World-class neuroscience research facility with NVIDIA-powered HPC infrastructure
Completed
The Sudha Gopalakrishnan Brain Centre is supported by NVIDIA, utilizing their cutting-edge DGX A100 systems and SuperPOD technology to process petabytes of brain imaging data at cellular resolution. This high-performance computing infrastructure enables processing of massive histology datasets, training large-scale deep learning models, and running complex Vision-Language Model inference for neuroscience research.
As an LLM Researcher and Engineer at the Brain Centre, I worked on developing AI-powered tools for neuroscience research. My primary focus was on fine-tuning Vision-Language Models (VLMs) including OpenCLIP and other multimodal architectures on high-resolution histology data of human brain tissue. I developed internal tools for automated analysis, created production-ready inference pipelines, and deployed these systems as Jenkins plugins for seamless integration into the research team's existing workflows. This work leveraged NVIDIA's world-class HPC infrastructure to process and analyze brain imaging data at unprecedented scale.
Fine-tuned advanced Vision-Language Models including OpenCLIP, BLIP-2, and custom multimodal architectures on histology imaging data. Developed specialized embeddings for brain tissue classification, cell-type identification, and anatomical region segmentation using contrastive learning approaches optimized for microscopy images.
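To illustrate the contrastive learning objective mentioned above, here is a minimal sketch of a CLIP-style symmetric loss in plain Python. It is not the Centre's training code: the function name `clip_contrastive_loss` and the fixed `temperature` value are assumptions made for illustration, and a real pipeline would compute this on GPU tensors with a learnable scale.

```python
import math


def _normalize(vec):
    # Scale a vector to unit length (assumes the vector is non-zero).
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def _log_softmax(row):
    # Numerically stable log-softmax over a list of logits.
    peak = max(row)
    log_sum = peak + math.log(sum(math.exp(x - peak) for x in row))
    return [x - log_sum for x in row]


def clip_contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric cross-entropy over cosine similarities of paired embeddings.

    image_embs[k] and text_embs[k] are assumed to describe the same sample;
    every other pairing in the batch is treated as a negative.
    """
    if len(image_embs) != len(text_embs):
        raise ValueError("image and text batches must have the same size")
    images = [_normalize(v) for v in image_embs]
    texts = [_normalize(v) for v in text_embs]
    n = len(images)

    # logits[i][j] = cosine similarity of image i and text j, scaled by 1/temperature.
    logits = [
        [sum(a * b for a, b in zip(img, txt)) / temperature for txt in texts]
        for img in images
    ]
    columns = [[logits[r][c] for r in range(n)] for c in range(n)]

    # Image-to-text: each row should put its mass on the diagonal entry.
    loss_i2t = -sum(_log_softmax(logits[k])[k] for k in range(n)) / n
    # Text-to-image: the same criterion applied to the columns.
    loss_t2i = -sum(_log_softmax(columns[k])[k] for k in range(n)) / n
    return (loss_i2t + loss_t2i) / 2
```

Correctly paired embeddings give a loss close to zero, while mismatched pairs are penalized, which is what pulls tile embeddings toward their matching text descriptions during fine-tuning.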
Curated and preprocessed large-scale histology datasets from whole-brain imaging pipelines. Implemented domain-specific data augmentation strategies for microscopy images, handled gigapixel whole-slide images using efficient tiling approaches, and developed custom training pipelines optimized for the DGX A100 multi-GPU environment.
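The tiling approach for gigapixel whole-slide images can be sketched as follows. This is an illustrative example only: the helper name `tile_grid`, the default tile size, and the overlap handling are assumptions, not the pipeline actually used at the Centre. It only computes tile coordinates, so the full image never has to be loaded into memory at once.

```python
def tile_grid(width, height, tile=1024, overlap=0):
    """Yield (x, y, w, h) boxes that together cover a width x height image.

    The last tile in each row and column is shifted back so it stays inside
    the image, which keeps tile sizes uniform at the cost of extra overlap.
    """
    if tile <= 0 or not 0 <= overlap < tile:
        raise ValueError("tile must be positive and overlap must be in [0, tile)")
    stride = tile - overlap

    def starts(length):
        # Images smaller than one tile get a single, cropped tile.
        if length <= tile:
            return [0]
        positions = list(range(0, length - tile + 1, stride))
        # Add a final, shifted-back tile if the regular grid leaves a gap.
        if positions[-1] + tile < length:
            positions.append(length - tile)
        return positions

    for y in starts(height):
        for x in starts(width):
            yield (x, y, min(tile, width - x), min(tile, height - y))
```

For example, `list(tile_grid(2500, 1000, tile=1024))` yields three tiles starting at x = 0, 1024, and 1476, each cropped to the 1000-pixel image height. Each box could then be passed to a region reader for the slide format in use.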
Built various internal tools for the research team including automated image quality assessment systems, intelligent tissue region annotation assistants, and LLM-powered documentation generators. Created REST APIs for model inference, developed interactive visualization dashboards, and implemented batch processing pipelines for large-scale analysis.
Deployed fine-tuned models as production-ready Jenkins plugins for seamless CI/CD integration. Containerized inference pipelines using Docker, implemented model versioning and A/B testing capabilities, and created automated deployment workflows that enabled researchers to run AI-powered analysis as part of their standard processing pipelines.
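One common way to implement the A/B testing of model versions described above is deterministic, hash-based assignment. The sketch below is a generic illustration rather than the deployed plugin code: the function name `assign_variant`, the `salt` value, and the version labels are all hypothetical.

```python
import hashlib


def assign_variant(sample_id, variants, salt="ab-test-v1"):
    """Map a sample ID to a model variant, reproducibly and by weight.

    variants is a dict of {variant_name: weight}; weights need not sum to 1.
    The same (salt, sample_id) pair always maps to the same variant.
    """
    total = sum(variants.values())
    if total <= 0:
        raise ValueError("at least one variant needs a positive weight")

    digest = hashlib.sha256(f"{salt}:{sample_id}".encode("utf-8")).digest()
    # First 4 bytes give an exact fraction in [0, 1), scaled to the weight total.
    point = int.from_bytes(digest[:4], "big") / 2**32 * total

    cumulative = 0.0
    for name, weight in variants.items():
        cumulative += weight
        if point < cumulative:
            return name
    return name  # fallback in case of floating-point rounding
```

A call such as `assign_variant("slide-001", {"model-v1": 0.9, "model-v2": 0.1})` would route roughly 10% of slides to the newer version while keeping each slide's assignment stable across reruns. Changing the salt reshuffles the assignment for a new experiment.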
Multiple NVIDIA DGX A100 nodes with 8x A100 80GB GPUs each, NVLink interconnect, and 2TB system memory for training large-scale vision-language models.
NVIDIA SuperPOD architecture enabling petaflop-scale computing for processing massive brain imaging datasets and running distributed training workloads.
High-performance parallel file systems with petabyte-scale storage for whole-brain imaging data at cellular resolution.
PyTorch with DeepSpeed and FSDP for distributed training, Hugging Face Transformers, and OpenCLIP for vision-language model development.
NVIDIA TensorRT for inference optimization, Triton Inference Server for production deployment, and mixed-precision training with AMP.
Jenkins for CI/CD pipelines, Docker and Kubernetes for containerized deployments, MLflow for experiment tracking and model registry.
Successfully developed and deployed a complete fine-tuning pipeline for Vision-Language Models on histology data, achieving significant improvements in tissue classification accuracy over baseline models.
Deployed AI inference capabilities as production-ready Jenkins plugins, enabling seamless integration into the research team's existing imaging pipeline workflow.
Implemented efficient distributed training strategies using DeepSpeed and FSDP, enabling training of large VLMs across multiple A100 GPUs with near-linear scaling.
Built a comprehensive suite of internal AI tools for automated image quality assessment, tissue annotation assistance, and batch processing of whole-slide images.
The Sudha Gopalakrishnan Brain Centre at IIT Madras operates a world-class, high-throughput, multimodal whole-brain pipeline spanning histology, imaging, and compute to digitize and study human brains at unprecedented resolution and scale.
The centre leverages expertise across IIT Madras, national, and international collaborations, aiming to become a globally leading R&D centre for human brain research with transformative impact in neuroscience and neurotechnologies.
Their flagship project, DHARANI (Developing Human-brain Atlas Resource to Advance Neuroscience Internationally), creates a comprehensive 3D atlas of the human brain at cellular resolution.
Petabyte-Scale Data
Whole-brain imaging at cellular resolution
NVIDIA-Powered
DGX A100 & SuperPOD infrastructure
Multi-Modal Imaging
High-throughput histology pipeline
Official completion certificate from the Sudha Gopalakrishnan Brain Centre, IIT Madras, acknowledging LLM research and engineering contributions