Rishav Upadhaya
Featured Projects
Gemma Fine-tuning for Tool Calling
Fine-tuned Gemma 3 1B with Hugging Face Transformers using QLoRA/PEFT for specialized tool calling. Built an evaluation pipeline measuring tool-call accuracy, precision, and recall, and served the model via FastAPI for production-ready inference.
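The evaluation side of this project can be sketched in a few lines: compare each predicted tool call against the gold calls and compute precision and recall over the call set. The call representation and the `tool_call_metrics` helper below are illustrative assumptions, not the project's actual code.

```python
# Hedged sketch of tool-call evaluation: a call is (tool_name, frozenset of
# (arg, value) pairs), so exact matches are set-comparable.

def tool_call_metrics(predicted, gold):
    """Precision/recall of predicted tool calls against gold tool calls."""
    pred_set, gold_set = set(predicted), set(gold)
    tp = len(pred_set & gold_set)  # calls with matching name AND arguments
    precision = tp / len(pred_set) if pred_set else 0.0
    recall = tp / len(gold_set) if gold_set else 0.0
    return {"precision": precision, "recall": recall}

gold = [("get_weather", frozenset({("city", "Kathmandu")}))]
pred = [("get_weather", frozenset({("city", "Kathmandu")})),
        ("get_time", frozenset())]  # spurious extra call lowers precision
print(tool_call_metrics(pred, gold))  # {'precision': 0.5, 'recall': 1.0}
```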
Sentiment Analysis & NLP Classification System
Built an NLP text classification system with TF-IDF vectorization, feature engineering, and cross-validation using scikit-learn. Implemented end-to-end evaluation tracking precision, recall, F1-score, and ROC-AUC, and served real-time predictions through a Flask REST API.
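The core of a TF-IDF classifier like this fits in one scikit-learn pipeline, cross-validated end to end. The toy data and hyperparameters below are illustrative, not the project's real corpus or settings.

```python
# Hedged sketch: TF-IDF features + logistic regression, scored with
# cross-validation so the vectorizer is fit only on each training fold.
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

texts = ["great product, loved it", "terrible service, very slow",
         "amazing quality", "awful experience", "really good", "really bad"]
labels = [1, 0, 1, 0, 1, 0]  # 1 = positive sentiment

clf = Pipeline([
    ("tfidf", TfidfVectorizer(ngram_range=(1, 2))),  # unigram + bigram features
    ("model", LogisticRegression(max_iter=1000)),
])
scores = cross_val_score(clf, texts, labels, cv=3, scoring="f1")
print(scores.mean())
```

Putting the vectorizer inside the pipeline matters: fitting TF-IDF on the full dataset before splitting would leak test vocabulary into training.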
Data Extractor Agent
Built an agentic extraction pipeline using LangChain and Gemini API to parse structured company data from PDFs, achieving 62% structured extraction accuracy through iterative LLM workflow orchestration. Enforced strict Pydantic schema validation for downstream database integrity.
System Design
Multi-Agent CRM + RAG Architecture (Production)
Designed for low-latency retrieval and reliable agent execution in a study-abroad CRM workflow.
- Ingestion: source documents processed and embedded into pgvector with HNSW indexing for semantic retrieval.
- Serving: FastAPI orchestration layer coordinates LangGraph agents, retrieval, and response synthesis.
- Reliability: strict Pydantic schemas, structured error handling, and OpenAPI contracts for backend consistency.
- Observability: LangSmith traces for token usage, latency tracking, and prompt iteration feedback loops.
- Impact: reduced retrieval latency from ~10s to under 4s with end-to-end response times around 3–5s.
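The ingestion and retrieval pieces of this design can be sketched as pgvector DDL plus a nearest-neighbour query. Table and column names, and the embedding dimension, are illustrative assumptions, not the real schema.

```python
# Hedged sketch of the pgvector setup: HNSW index for approximate
# nearest-neighbour search under cosine distance.

DDL = """
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE documents (
    id bigserial PRIMARY KEY,
    content text NOT NULL,
    embedding vector(1536)   -- dimension must match the embedding model
);
-- HNSW trades exactness for speed; vector_cosine_ops selects cosine distance
CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops);
"""

QUERY = """
SELECT id, content
FROM documents
ORDER BY embedding <=> %(query_embedding)s   -- <=> is pgvector's cosine distance
LIMIT 5;
"""
```

Switching from a sequential scan to an HNSW index is the kind of change that plausibly accounts for the ~10s to sub-4s retrieval improvement cited above.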
Experience
AI Engineer (Part-Time) - AsterGaze Technologies
May 2025 – Present • Built production multi-agent CRM AI systems with 3–5s response times and cut LLM costs by up to 55%.
- Designed and built a multi-agent AI system for a study-abroad consultation CRM platform, orchestrating agent workflows with LangGraph and FastAPI to achieve 3–5 second end-to-end response times.
- Architected a RAG pipeline with hybrid search combining semantic (neural) and metadata retrieval, cutting vector search latency from roughly 10 seconds to under 4 seconds with pgvector HNSW indexing.
- Reduced LLM inference costs by up to 55% through prompt engineering, context compression, and token optimization; monitored traces, token usage, and latency via LangSmith.
- Designed a scalable PostgreSQL schema with the SQLAlchemy ORM, implementing N+1 query optimization and lazy loading, and integrated OpenAI text-embedding models for semantic search.
Backend Developer Intern - Proshore.eu
Nov 2025 – Feb 2026 • Built OCR/document pipelines reducing manual processing from 8 human-hours to under 3 seconds per batch at 96% data fidelity.
- Built a multi-engine OCR pipeline integrating AWS Textract, Azure Document Intelligence, and Google Vision API for automated document data extraction in regulated workflows, reducing manual processing from 8 human-hours to under 3 seconds per batch while maintaining 96% data fidelity.
- Engineered a computer-vision preprocessing pipeline using OpenCV for skew correction, grayscale conversion, and orientation detection, improving OCR accuracy by 35%, measured via Character Error Rate (CER) and Word Error Rate (WER).
- Developed RESTful APIs with Pydantic schema validation, comprehensive error handling, and OpenAPI documentation for document processing workflows.
- Collaborated in Agile sprints with cross-functional teams of 10+ engineers (PM, QA, Frontend, Backend), translating business requirements into scalable technical solutions.
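The CER/WER measurement used above reduces to edit distance: CER over characters, WER over whitespace-split words. Production code might use a library such as jiwer; the plain implementation below is an illustrative sketch.

```python
# Hedged sketch of CER/WER via Levenshtein distance.

def levenshtein(a, b):
    """Minimum edits (insert, delete, substitute) turning sequence a into b."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        curr = [i]
        for j, y in enumerate(b, 1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (x != y)))  # substitution (free if equal)
        prev = curr
    return prev[-1]

def cer(reference, hypothesis):
    """Character Error Rate: edits per reference character."""
    return levenshtein(reference, hypothesis) / len(reference)

def wer(reference, hypothesis):
    """Word Error Rate: edits per reference word."""
    ref_words = reference.split()
    return levenshtein(ref_words, hypothesis.split()) / len(ref_words)

ref, hyp = "invoice total 120", "invoice totai 120"  # one OCR character error
print(cer(ref, hyp), wer(ref, hyp))  # 1/17 chars wrong; 1/3 words wrong
```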
Student Partner - Leapfrog Technology
Apr 2025 – Oct 2025 • Selected from 500+ applicants; led backend development of Reviso.ai with GPT-4, LangChain, and Pinecone.
- Selected from 500+ applicants for a competitive software engineering apprenticeship; led development of Reviso.ai, an LLM-powered exam platform with automated question generation and NLP-based evaluation capabilities.
- Built a modular FastAPI backend integrating GPT-4, LangChain, and the Pinecone vector database with CI/CD pipelines, supporting concurrent evaluations with real-time analytics.
- Applied Agile methodologies, testing practices, and SOLID principles; participated in peer code reviews to maintain code quality and system maintainability.
Skills
Primary Stack
Python, FastAPI, PostgreSQL, REST APIs, Docker, Git, Linux
Backend Engineering
API design, SQLAlchemy ORM, PostgreSQL schema design, Pydantic validation, OpenAPI, CI/CD, performance optimization, error handling
Gen AI Frameworks
LLMs (GPT-4, Gemma), RAG systems, LangChain, LangGraph, Hugging Face Transformers, multi-agent orchestration, LangSmith
ML
scikit-learn, NLP, Text Classification, Sentiment Analysis, Feature Engineering, Cross-Validation, Precision/Recall/F1/ROC-AUC, CER/WER
Database
PostgreSQL, SQL, pgvector (HNSW indexing), Pinecone, query optimization
Cloud Services
AWS Textract, Azure Document Intelligence, Google Vision API
Production Readiness
LLM observability and tracing, latency/token monitoring, technical documentation, Agile/Scrum delivery
Blog
The Model Context Protocol (MCP): The Future of LLM Tool Integration
Exploring how MCP reshapes LLM integrations for real-world AI systems and production environments.
Read on Medium