Retrieval-Augmented Generation Systems, LLMs and Recommendation Systems
I am a Master's student in Computational Data Science at Carnegie Mellon University, with expertise in machine learning, data science, and artificial intelligence. My research interests include Retrieval-Augmented Generation (RAG), LLMs, Natural Language Processing, Recommendation Systems, and link prediction algorithms.
With experience in developing ML models for real-world applications and conducting research in various ML domains, I am passionate about creating solutions that make a significant impact.
Master of Computational Data Science
GPA: 4.0/4.0 | Pittsburgh, PA | 12/2025
B.Tech CSE with spec. in Data Science
GPA: 9.2/10 | Vellore, India | 05/2024
Submitting to ACL 2026
CMU, Pittsburgh | 06/2025 – 08/2025
https://arxiv.org/abs/2510.06534
Under review at "ICLR 2026"
CMU, Pittsburgh | 02/2025 – 05/2025
https://arxiv.org/abs/2505.19253
Under review by "Expert Systems with Applications" Elsevier Journal
VIT, Vellore | 05/2023 – 08/2023
Under review by "Pattern Recognition" Elsevier Journal
VIT, Vellore | 01/2024 – 05/2024
Accepted by "International Conference on Informatics (ICI) 2023" IEEE Conference
VIT, Vellore | 09/2022 – 01/2023
Collaborating with Amazon to design and implement an end-to-end multimodal Retrieval-Augmented Generation (mRAG) pipeline, integrating LLMs with specific image and text retrieval engines for diverse visual question answering datasets, thereby enhancing context accuracy and mitigating hallucinations.
Developed a retrieval-augmented generation (RAG) system using LangChain, FAISS, and few-shot prompting to enhance the performance of the Mistral-7B model on factual Q&A about Pittsburgh and CMU, achieving a 61% F1-score with no fine-tuning.
Improved on the CodeRAG-Bench baseline by adding fixed-size and semantic pre-retrieval chunking to the retrieval pipeline, obtaining a ~31% increase over the baseline values on the ODEX open-domain coding tasks dataset.
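As a rough illustration of the fixed-size pre-retrieval chunking mentioned above, the sketch below splits a document into overlapping character windows before indexing. It is not the code used in the project: the function name `chunk_fixed`, the character-based windows, and the default `size` and `overlap` values are all assumptions chosen for illustration.

```python
def chunk_fixed(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows that overlap.

    Hypothetical helper: the name, units (characters), and defaults
    are illustrative assumptions, not the project's actual settings.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")

    step = size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        # Stop once a window has reached the end of the text,
        # so no chunk is a pure suffix of the previous one.
        if start + size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    for chunk in chunk_fixed("abcdefghij", size=4, overlap=2):
        print(chunk)
```

The overlap keeps content that straddles a window boundary retrievable from at least one chunk, at the cost of indexing some text twice. Semantic chunking would replace the fixed windows with boundaries chosen from the content itself.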