About
Building AI that actually works! Currently deep into Vision-Language Models and Agentic Systems, with hands-on experience taking AI projects from wild ideas to real products. Love tinkering with model fine-tuning and cloud deployments. Big open-source enthusiast - you'll find me contributing to projects that make AI more accessible to everyone.
Work Experience
CognitiveLab - On Site
AI Researcher
TurboML - Remote
AI Developer
NeoHumans - Remote
AI Researcher
Indian Institute of Science (IISc) - On Site
Research Intern
Mandelbulb Technologies - Remote
Generative AI Developer
Enable Technologies - On Site
Full Stack Developer
Skills
Open Source
omniparse 5504
Ingest, parse, and optimize any data format, from documents to multimedia, for enhanced compatibility with GenAI frameworks
VARAG 346
Vision-Augmented Retrieval and Generation (VARAG) - a vision-first RAG engine
AI-Engineering.academy 162
Navigating the World of AI, One Step at a Time
indic_eval 32
A lightweight evaluation suite tailored specifically for assessing Indic LLMs across a diverse range of tasks
Indic-llm 10
An open-source framework designed to adapt pre-trained Large Language Models (LLMs), such as Llama, Mistral, and Mixtral, to a wide array of domains and languages.
Research Projects
Indic Eval/Leaderboard
Developed an evaluation framework for Indic Large Language Models that supports multiple translated benchmarks, with a leaderboard built around it for comparison.
Ambari-7b
India's first Kannada bilingual LLM, built on the Llama 2/3 base models, fine-tuned across multiple stages with 1 billion Kannada tokens and improving tokenization efficiency by 85%.
YoloGemma
Testing and evaluating the capabilities of Vision-Language models (PaliGemma) in performing computer vision tasks such as object detection and segmentation.
VARAG
Vision-Augmented Retrieval and Generation: a system integrating textual and visual information, enhancing RAG by 35% and improving contextual precision by 60%.
Mixture of LoRA Experts
A novel architecture facilitating the dynamic serving of multiple fine-tuned LLMs by swapping LoRA adapters during inference.
ViViD
A state-of-the-art Vision-Language model specialized in converting complex PDFs into markdown with high speed and efficiency.
Projects
Cognitune
All-in-one platform for LLMOps, featuring distributed data processing, multi-GPU fine-tuning, dynamic evaluation, and one-click high-throughput API deployment.
Storyblocks
Generate story videos from a prompt: transforms text prompts into dynamic story videos with script generation, synchronized audio, and a consistent visual style.
Marker API
A production-ready, easily deployable server with 400 GitHub ⭐ that converts PDFs, Word documents, etc. into Markdown to aid RAG pipelines.
PyRaft
Python implementation of the Raft consensus algorithm from scratch using FastAPI, achieving a throughput of 50-250 transactions per second.
Tokenizer Arena
A friendly arena for comparing the tokenizers of various LLMs side by side, running entirely in the browser.
Topic2Dataset
Create high-quality instruction fine-tuning datasets for LLMs from a topic or website, enabling large-scale synthetic data generation.