A.K_
Initializing systems · ./build_portfolio
SYS / ONLINE
LAT 24.86° N
LON 67.00° E
NODE / KARACHI-PK
UPTIME · 730d
BUILD 2026.05.08
STACK PY · LLM · SQL
MODEL LLAMA-3 / GPT-4
STATUS ACCEPTING_BRIEFS
Available · Karachi, PK

AMAN KHAN

[ AI Engineer ] · ENGINEERING INTELLIGENCE · ARCHITECTING DATA

Building production-grade LLM systems, retrieval pipelines, and analytics infrastructure that turn raw signal into measurable business intelligence.

Python · LangChain · Snowflake · Spark · Power BI · LLaMA 3 · FastAPI · Airflow
// 02 — About

Operator profile

SUBJ_001 / AKH
Aman Khan
Currently building · RAG pipelines
📍 Karachi, PK
⚡ Open to work · Immediate

I build the connective tissue between language models and the data they reason over.

I'm an AI Engineer and Data Scientist focused on shipping LLM applications that work reliably outside the demo. My work spans the full intelligence stack — from retrieval and prompt orchestration with LangChain & FAISS, through data pipelines on Spark, Airflow and Snowflake, to analytics surfaces in Power BI and Streamlit.

I care about the boring parts: latency budgets, eval harnesses, drift, observability. Currently completing a Data Science degree while interning across AI engineering and data analytics roles.

01
2+
Years building
02
10+
Projects shipped
03
5+
Tech stacks
04
24h
Avg. response
// 03 — Capabilities

The tooling I deploy

/ AI · Large Language Models · 12 tools · primary stack
  • GPT-4: OpenAI
  • LLaMA 3: Meta · local
  • Mistral: open-weights
  • LangChain: orchestration
  • LlamaIndex: retrieval
  • Hugging Face: models · datasets
  • Anthropic API: Claude
  • OpenAI API: chat · embeddings
  • RAG: retrieval-augmented
  • FAISS: vector index
  • ChromaDB: vector store
  • Prompt Eng.: system design
/ Data Science · Modeling · 9 tools · core stack
  • Python: 3.11+
  • Pandas: tabular ops
  • NumPy: linear algebra
  • Scikit-learn: classical ML
  • PyTorch: deep learning
  • Matplotlib: plotting
  • Seaborn: stat. graphics
  • EDA: exploratory
  • Feature Eng.: pipelines
/ Data Engineering · Pipelines · 6 tools · production
  • Apache Spark: distributed
  • Airflow: orchestration
  • Snowflake: warehouse
  • Databricks: lakehouse
  • SQL: Postgres · MySQL
  • ETL Pipelines: batch · streaming
/ MLOps · Infrastructure · 6 tools · ship-ready
  • Docker: containers
  • Kubernetes: orchestration
  • MLflow: experiment tracking
  • Git: version control
  • CI / CD: GitHub Actions
  • FastAPI: inference svc.
/ Visualization · BI · 4 tools · presentation
  • Power BI: DAX · dashboards
  • Streamlit: ML apps
  • Seaborn: statistical
  • Matplotlib: publication
// 04 — Selected work

Things I've shipped

P_001 · AGENT_OK
AI Agent · RAG

AI Appointment Agent

Conversational scheduling agent that books, reschedules and cancels appointments via natural language — backed by a vector store of practitioner availability.

LangChain · Groq · ChromaDB · FastAPI · SQLite
P_002 · VEC_INDEXED
RAG · Q&A

RAG-Based Document Q&A

Drop in any PDF or knowledge base and query it conversationally — embeddings indexed in FAISS, answers grounded with inline citations.

LangChain · FAISS · Hugging Face · Streamlit
P_003 · SIMILARITY
ML · Recommender
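The retrieve-then-cite flow described for this project can be illustrated with a small sketch. This is not the project's code: it swaps FAISS and learned embeddings for a toy bag-of-words cosine similarity so that it runs on the standard library alone, and the corpus, source labels, and function names (`embed`, `retrieve`, `build_prompt`) are all hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical corpus of (source label, passage). Not taken from the project.
CHUNKS = [
    ("handbook.pdf p.2", "Refunds are processed within five business days."),
    ("handbook.pdf p.7", "Support is available by email on weekdays."),
    ("faq.pdf p.1", "Passwords must be reset every ninety days."),
]

def embed(text: str) -> Counter:
    """Toy bag-of-words vector; a real system would use a neural embedding model."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(question: str, k: int = 2) -> list[tuple[str, str, float]]:
    """Return the k passages most similar to the question, best first."""
    q = embed(question)
    scored = [(src, text, cosine(q, embed(text))) for src, text in CHUNKS]
    scored.sort(key=lambda row: row[2], reverse=True)
    return scored[:k]

def build_prompt(question: str, hits: list[tuple[str, str, float]]) -> str:
    """Number each retrieved passage so the model can cite it inline as [n]."""
    context = "\n".join(
        f"[{i}] ({src}) {text}" for i, (src, text, _) in enumerate(hits, 1)
    )
    return (
        "Answer using only the numbered sources and cite them like [1].\n\n"
        f"{context}\n\nQuestion: {question}"
    )

if __name__ == "__main__":
    question = "How long do refunds take?"
    print(build_prompt(question, retrieve(question)))
```

Numbering the passages inside the prompt is one simple way to get citations that can be mapped back to a page; a vector index such as FAISS would replace the linear scan in `retrieve` once the corpus grows.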

Movie Recommendation System

Hybrid recommender combining content-based similarity with sentence-transformer embeddings of plot synopses. Streamlit UI for live exploration.

Scikit-learn · Sentence-Transformers · Streamlit
P_004 · MULTI_TOOL
AI Agent · Tool-use
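One common way to build a hybrid recommender of this kind is a weighted blend of two similarity scores. The sketch below assumes that approach; the source does not say how the project combines its signals. The catalogue, the three-dimensional "synopsis embeddings", and the `alpha` weight are invented for illustration, and Jaccard overlap on genre tags stands in for the content-based side.

```python
import math

# Hypothetical catalogue: genre tags plus a made-up 3-d synopsis embedding.
MOVIES = {
    "Alien": ({"sci-fi", "horror"}, [0.9, 0.1, 0.3]),
    "Aliens": ({"sci-fi", "action"}, [0.8, 0.2, 0.4]),
    "Notting Hill": ({"romance", "comedy"}, [0.1, 0.9, 0.2]),
}

def jaccard(a: set, b: set) -> float:
    """Content-based similarity: overlap of genre tags."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cosine(u: list, v: list) -> float:
    """Embedding similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0

def recommend(title: str, alpha: float = 0.5, k: int = 2) -> list[str]:
    """Rank other titles by alpha * tag overlap + (1 - alpha) * embedding similarity."""
    tags, vec = MOVIES[title]
    scores = {
        other: alpha * jaccard(tags, o_tags) + (1 - alpha) * cosine(vec, o_vec)
        for other, (o_tags, o_vec) in MOVIES.items()
        if other != title
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]

if __name__ == "__main__":
    print(recommend("Alien"))
```

Setting `alpha` closer to 1 favors explicit metadata, while values near 0 lean on the synopsis embeddings; in practice the weight would be tuned against held-out ratings.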

Multi-Tool LangChain Agent

A locally-run agent (Ollama) that decides between web search, calculator, and document QA tools — with visible chain-of-thought traces.

LangChain · Ollama · SerpAPI · Streamlit
P_005 · PIPELINE_WIP
Data Eng · In progress
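To make the tool-selection idea concrete, here is a deliberately simplified sketch. It replaces the LLM's decision with hand-written rules, so the "trace" is a log of routing decisions rather than model reasoning, and the search and document tools are stubs. All function names are hypothetical and nothing here calls LangChain, Ollama, or SerpAPI.

```python
import ast
import operator
import re

_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}

def calculator(expr: str):
    """Evaluate + - * / arithmetic by walking the parsed AST (no eval)."""
    def ev(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -ev(node.operand)
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](ev(node.left), ev(node.right))
        raise ValueError("unsupported expression")
    return ev(ast.parse(expr.strip(), mode="eval").body)

def web_search(query: str) -> str:
    return f"[stub] search results for: {query}"  # placeholder, no network call

def document_qa(query: str) -> str:
    return f"[stub] answer from documents for: {query}"  # placeholder

def route(query: str) -> tuple[str, list[str]]:
    """Pick a tool with simple rules and record each decision in a trace."""
    trace = [f"query: {query!r}"]
    lowered = query.lower()
    if re.fullmatch(r"[\d\s.+\-*/()]+", query):
        try:
            result = str(calculator(query))
            trace.append("arithmetic only -> calculator")
            return result, trace
        except (ValueError, SyntaxError, ZeroDivisionError):
            trace.append("calculator failed, falling through")
    if "document" in lowered or "pdf" in lowered:
        trace.append("mentions a document -> document_qa")
        return document_qa(query), trace
    trace.append("default -> web_search")
    return web_search(query), trace

if __name__ == "__main__":
    for q in ["2 * (3 + 4)", "What does the PDF say about refunds?", "latest LLaMA release"]:
        answer, steps = route(q)
        print(answer, steps)
```

In an agent framework the `if` chain would be replaced by the model choosing from tool descriptions, but returning the trace alongside the answer is the same pattern that makes the decision path visible in a UI.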

End-to-End Data Pipeline

Orchestrated ETL — ingest from S3, transform with Spark, materialize curated marts in Snowflake, scheduled and observed via Airflow DAGs.

Airflow · Snowflake · Spark
P_006 · BI_DASHBOARD
Viz · BI
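The ingest, transform, materialize ordering can be sketched without any of the real infrastructure. The example below is an assumption-laden stand-in: it uses the standard library's `graphlib.TopologicalSorter` in place of an Airflow DAG, in-memory rows in place of S3 and Snowflake, and plain Python in place of Spark. Task names and sample data are hypothetical.

```python
from graphlib import TopologicalSorter

def ingest(ctx: dict) -> None:
    # Stand-in for reading raw files from object storage.
    ctx["raw"] = [
        {"region": "south", "amount": "120.5"},
        {"region": "north", "amount": "bad"},
        {"region": "south", "amount": "30"},
    ]

def transform(ctx: dict) -> None:
    # Stand-in for a Spark job: cast types and drop rows that fail to parse.
    clean = []
    for row in ctx["raw"]:
        try:
            clean.append({"region": row["region"], "amount": float(row["amount"])})
        except ValueError:
            continue
    ctx["clean"] = clean

def load_mart(ctx: dict) -> None:
    # Stand-in for materializing a curated mart: total amount per region.
    mart: dict = {}
    for row in ctx["clean"]:
        mart[row["region"]] = mart.get(row["region"], 0.0) + row["amount"]
    ctx["mart"] = mart

TASKS = {"ingest": ingest, "transform": transform, "load_mart": load_mart}

# Each task maps to the set of tasks that must finish before it runs.
DEPENDS_ON = {"ingest": set(), "transform": {"ingest"}, "load_mart": {"transform"}}

def run_pipeline() -> tuple[list[str], dict]:
    """Run every task in dependency order and return the order plus the mart."""
    ctx: dict = {}
    order = list(TopologicalSorter(DEPENDS_ON).static_order())
    for name in order:
        TASKS[name](ctx)
    return order, ctx["mart"]

if __name__ == "__main__":
    print(run_pipeline())
```

Declaring dependencies as data and letting a scheduler derive the run order is the core idea an orchestrator adds; retries, scheduling, and monitoring are what a tool like Airflow layers on top.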

Power BI Sales Intelligence

Executive dashboard tracking pipeline velocity, region-level cohorts and forecast vs. actual — DAX measures, drill-throughs and row-level security.

Power BI · DAX · SQL
// 05 — Trajectory

Where I've been operating

Freelance Data Scientist · Gen AI / Agentic AI @ Independent

Apr 2026 — Present
  • Currently delivering custom Gen AI and agentic builds for clients — RAG systems, multi-tool agents, fine-tuning, and applied data-science engagements.
  • Owning each engagement end-to-end: scoping, architecture, modeling, deployment, and post-launch tuning.
  • Available for new briefs — contract or project-based, remote-friendly, EN/UR.
Data Science · Gen AI · Agentic AI · Open

AI Engineering Intern @ Out2Soul

Dec 2025 — Mar 2026
  • Designed and shipped a production RAG pipeline over the company knowledge base — embeddings in FAISS, served behind a FastAPI inference layer.
  • Built a prompt-evaluation harness with golden-set scoring; reduced hallucination rate by ~38% across the customer-support workflow.
  • Owned end-to-end deployment: Docker, GitHub Actions CI/CD, observability via structured logs and request-level tracing.
LangChain · RAG · FastAPI · Docker
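Golden-set scoring, as mentioned in the bullets above, can take many forms. The sketch below shows one simple variant under stated assumptions: each golden case lists phrases a grounded answer must contain and phrases that would count as a hallucination, and matching is plain case-insensitive substring search. The cases, the `stub_model` function, and the metric definitions are hypothetical and do not reproduce the harness or the ~38% figure described here.

```python
from dataclasses import dataclass

@dataclass
class GoldenCase:
    question: str
    required_facts: list[str]   # phrases a grounded answer must contain
    forbidden_facts: list[str]  # phrases treated as hallucinations

# Hypothetical golden set for a support workflow.
GOLDEN_SET = [
    GoldenCase("How long do refunds take?", ["five business days"], ["instant"]),
    GoldenCase("What are support hours?", ["weekdays"], ["24/7"]),
]

def score(answer: str, case: GoldenCase) -> dict:
    """Check one answer against one golden case with substring matching."""
    text = answer.lower()
    hallucinated = any(f.lower() in text for f in case.forbidden_facts)
    complete = all(f.lower() in text for f in case.required_facts)
    return {"passed": complete and not hallucinated, "hallucinated": hallucinated}

def evaluate(answer_fn, cases=GOLDEN_SET) -> dict:
    """Run answer_fn over every case and aggregate pass and hallucination rates."""
    results = [score(answer_fn(case.question), case) for case in cases]
    n = len(results)
    return {
        "pass_rate": sum(r["passed"] for r in results) / n,
        "hallucination_rate": sum(r["hallucinated"] for r in results) / n,
    }

def stub_model(question: str) -> str:
    """Placeholder for a real model call; returns canned answers."""
    canned = {
        "How long do refunds take?": "Refunds take five business days.",
        "What are support hours?": "Support is available 24/7 by phone.",
    }
    return canned[question]

if __name__ == "__main__":
    print(evaluate(stub_model))
```

Running the same golden set before and after a prompt change gives a comparable hallucination rate, which is how a relative reduction can be measured; substring checks are crude, and an LLM judge or semantic match is a common upgrade.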

Data Analyst Intern @ Excelerate

Sep 2025 — Nov 2025
  • Authored Power BI dashboards tracking program-level KPIs across 3,000+ participants with daily refresh from a Snowflake warehouse.
  • Wrote DAX measures for cohort retention and activation funnels; translated stakeholder questions into reproducible analytics.
  • Cleaned and modeled raw event data with Pandas; built a lightweight semantic layer for self-serve analytics.
Power BI · DAX · Snowflake · Pandas
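For readers unfamiliar with cohort retention, the calculation behind such a measure can be sketched in a few lines. This example uses plain Python rather than DAX or Pandas, and the event rows, field layout, and function names are hypothetical; it only illustrates the general definition (share of a signup cohort active N months later), not the measures written for this role.

```python
from collections import defaultdict

# Hypothetical activity events: (user_id, signup_month, active_month).
EVENTS = [
    ("u1", "2025-09", "2025-09"),
    ("u1", "2025-09", "2025-10"),
    ("u2", "2025-09", "2025-09"),
    ("u3", "2025-10", "2025-10"),
    ("u3", "2025-10", "2025-11"),
]

def month_index(ym: str) -> int:
    """Convert 'YYYY-MM' to a running month count so months can be subtracted."""
    year, month = map(int, ym.split("-"))
    return year * 12 + month

def retention(events) -> dict:
    """Map (cohort, months_since_signup) to the fraction of the cohort active."""
    cohort_users = defaultdict(set)
    active = defaultdict(set)
    for user, signup, month in events:
        cohort_users[signup].add(user)
        offset = month_index(month) - month_index(signup)
        active[(signup, offset)].add(user)
    return {
        key: len(users) / len(cohort_users[key[0]])
        for key, users in active.items()
    }

if __name__ == "__main__":
    for key, rate in sorted(retention(EVENTS).items()):
        print(key, rate)
```

Using sets of user IDs keeps a user who is active several times in one month from being counted twice, which is the same distinct-count concern a BI measure has to handle.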

Bachelor of Data Science @ SMIU

In Progress
  • Coursework: Statistics, Data Structures & Algorithms, Database Systems, Machine Learning, Linear Algebra, Probability.
  • Independent study tracks: Deep Learning Specialization (Andrew Ng), Hugging Face NLP Course, DeepLearning.AI LangChain.
  • Focus areas: applied ML, retrieval systems, and production data pipelines.
Data Science · BS · In Progress

Diploma in Software Engineering @ Aptech Metro Star Gate

2023 — 2026 · Completed
  • Three-year intensive program covering full-stack software engineering, OOP, databases, and modern web development.
  • Hands-on projects across Python, JavaScript, SQL, and version-controlled team workflows.
  • Capstone: production-style application demonstrating end-to-end engineering practice.
Software Engineering Completed 2023 — '26
// 06 — Credentials

Verified training

Google Cloud · 3 courses

Generative AI · LLMs · Responsible AI

Issued · 2025 · ✓ Verified
DeepLearning.AI · 2 courses

LangChain & Prompt Engineering

Issued · 2025 · ✓ Verified
Hugging Face · cohort

NLP Course · Transformers

Issued · 2025 · ✓ Verified
Coursera · Stanford · specialization

Machine Learning · Andrew Ng

Issued · 2024 · ✓ Verified
// 07 — Get in touch

Let's build something

Open channel.

I'm available for AI engineering, data science, and MLOps roles — full-time or freelance. Send a brief and I'll reply within a day.

Usually responds within 24 hours
  • Job Opportunity
  • Freelance Project
  • Collaboration
  • Other
