Manas Pathak

Hi, I'm Manas — an ECE student at UT Austin focused on software engineering, AI systems, and applied machine learning.

I build reliable AI infrastructure and agentic systems that turn research ideas into real-world products. My work spans LLM reasoning evaluation, backend engineering, distributed systems, and developer tooling — I like projects where strong engineering, research intuition, and product impact intersect.

Reach out about AI systems, infrastructure, research, or just to connect.

GitHub: github.com/Manas2006  |  Email: manaspathak@utexas.edu

Experience

Software Developer Intern · IBM

Summer 2026 · New York, NY

Building agent infrastructure for LangGraph Deep Agents on OpenShift, enabling skill-based orchestration across AI workflows.

Machine Learning Researcher · HUMAIN Lab, UT Austin

Mar 2025 — present

LLM evaluation and reliability with Prof. Leqi Liu. Built CUDA-parallel PyTorch and vLLM pipelines that cut evaluation runtime 70% across 100+ models, a distributed GPU job scheduler on AWS that saved ~$10k in cloud costs, Chain-of-Thought hallucination detection for Qwen/Gemma-7B, and React/FastAPI dashboards tracking 50+ distributed jobs.

Undergraduate Department Tutor · ECE 312, UT Austin

Jan 2026 — May 2026

Mentoring 40+ students in C/C++ memory management, recursion, and graph traversal, with GDB and Valgrind debugging sessions.

Software Engineering Intern · Graph Neural Networks Lab, UT Austin

Dec 2024 — Mar 2025

Pipeline orchestration and monitoring with TypeScript, React, and PostgreSQL, plus vectorized geometric tooling and performance profiling in Python.

Software Engineering Intern · ModHeader

Jun 2023 — Oct 2023

Full-stack work for a 250k-user browser extension: Svelte UI, Node.js APIs on DynamoDB, and an analytics dashboard in React and TypeScript.

Research

arXiv:2604.11996 · Under review, COLM 2026 · First author

Filtered Reasoning Score: Evaluating Reasoning Quality on a Model's Most-Confident Traces

Manas Pathak, Xingyao Chen, Amy Zhang, Leqi Liu

Accuracy treats every answer the same, but deployed systems act on the outputs a model is most confident about. FRS evaluates reasoning quality on exactly those traces, conditioning the metric on the model's own confidence. The result is a view of reliability that accuracy alone cannot provide: two models with identical accuracy can behave very differently when you only trust what they are sure of, and FRS makes that difference measurable.
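To make the intuition concrete, here is a minimal sketch of a confidence-filtered score. It is an illustration only and not the definition used in the paper: the helper name `filtered_score`, the `top_fraction` cutoff, the per-trace quality labels, and the toy numbers are all assumptions made for this example.

```python
import math


def filtered_score(confidences, quality, top_fraction=0.5):
    """Mean quality over the most-confident traces.

    Hypothetical helper for illustration; not the paper's FRS definition.
    `confidences` and `quality` hold one value per reasoning trace.
    """
    if len(confidences) != len(quality) or not confidences:
        raise ValueError("inputs must be non-empty and of equal length")
    if not 0 < top_fraction <= 1:
        raise ValueError("top_fraction must be in (0, 1]")

    # Keep the top_fraction of traces, ranked by the model's own confidence.
    keep = max(1, math.ceil(top_fraction * len(confidences)))
    ranked = sorted(zip(confidences, quality), key=lambda pair: pair[0], reverse=True)
    kept = [q for _, q in ranked[:keep]]
    return sum(kept) / len(kept)


# Toy data (made up): both models are confident on the same two traces.
confidences = [0.9, 0.8, 0.3, 0.2]
quality_a = [1, 1, 0, 0]  # model A reasons well where it is confident
quality_b = [0, 0, 1, 1]  # model B reasons well only where it is unsure

# Unfiltered, the two models look identical.
overall_a = sum(quality_a) / len(quality_a)  # 0.5
overall_b = sum(quality_b) / len(quality_b)  # 0.5

# Filtered to the most-confident half, they separate completely.
top_a = filtered_score(confidences, quality_a)  # 1.0
top_b = filtered_score(confidences, quality_b)  # 0.0
```

In this toy setup both models average 0.5 across all traces, yet on the traces a deployed system would actually act on, one is always right and the other never is.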

Projects

Education

The University of Texas at Austin

Expected May 2028

B.S. Electrical & Computer Engineering, Software Concentration · GPA 3.96 / 4.00

Relevant coursework

Distributed Systems · Deep Reinforcement Learning · Operating Systems · Machine Learning · Computer Architecture · Data Structures & Algorithms · Discrete Math · Software Design & Implementation I/II

About

I work on AI systems, with a focus on agents. As agents take on longer horizons and more autonomy, the hard problem shifts from making them capable to making them legible: tracing what they did, why they did it, and whether it worked.

That is the thread through my work. My research builds evaluation methods that condition on a model's own confidence, so a score reflects behavior you would actually act on. My engineering builds the observability layer for agent workflows, instrumenting how agents select and compose skills. Capability claims should be verifiable. That is how we trust progress.

Degree: B.S. ECE, May 2028
GPA: 3.96 / 4.00
Honors: Undergraduate Research Fellowship ($10,000) · Tau Beta Pi · Eta Kappa Nu
Stack: Python, C/C++, Java, Rust, Go, TypeScript · PyTorch, CUDA, JAX · Kubernetes, AWS, Terraform