Technical Portfolio — Rajdeep Gupta
AI, Engineering & Product | January – April 2026
# Executive Summary
Over the past 4-5 months, I've been building across 11 active repositories spanning real estate SaaS, medical education AI, autonomous trading, a personal AI operating system, a native macOS app, and a Chrome extension. The common thread: using AI not as a feature, but as core infrastructure — from RAG pipelines and multi-agent orchestration to ML signal generation and local model experimentation.
This document covers every project's goals, architecture, AI integration, current state, and future direction. The AI/ML sections go deep — chunking strategies, embedding models, retrieval mechanisms, agentic state machines, memory management, context retrieval, and ML training pipelines.
Key numbers:
- 11 repositories, 7 in production
- 32 Convex tables powering the AI agent system
- 4-tier memory architecture with 5-layer retrieval (Hermes-inspired)
- 98 ML features per coin in the trading pipeline with walk-forward validation
- 14-type semantic chunking in the RAG pipeline with conditional reranking
- XGBoost → ONNX → TypeScript inference (6e-8 max probability diff)
- 6 AI model providers actively used (Anthropic, OpenAI, Google, local MLX, ONNX Runtime, DeepSeek)
- Hetzner VPS running 7+ autonomous agents 24/7
- Smart home integration controlling 15+ IoT devices via a command queue
# Project Portfolio at a Glance
| # | Project | What It Does | Tech Stack | AI Usage | Status |
|---|---|---|---|---|---|
| 1 | 11sqft Platform | Real estate backend + admin | Next.js 15, Neon PostgreSQL | Cron analytics | Production |
| 2 | Broker OS | SaaS for real estate brokers | Next.js 15, Convex, Cloudflare R2 | 36-file Conversation OS, voice parsing | Active Dev |
| 3 | Broker Mobile | Mobile app for brokers | React Native 0.81, Expo 54 | — | Beta |
| 4 | Properties App | Public property discovery | Next.js 16, Mapbox, TanStack Query | — | Production |
| 5 | 11sqft AI | Scrapers + LLM services | Next.js 16, Python Flask, AI SDK v6 | LLM reasoning, web scraping | Active Dev |
| 6 | Academy | Educational content site | Next.js 16, MDX, next-intl | LLM content generation | Production |
| 7 | Entellect | ENT medical education + RAG | Next.js 16, Neon pgvector, Claude | Full RAG pipeline, content generation | Production |
| 8 | Raven | Autonomous crypto trading | Bun, TypeScript, CCXT, ONNX | XGBoost ML + multi-LLM veto | Active Dev |
| 9 | Jarvis | Personal AI second brain | Convex, Claude Code, Telegram | Multi-agent orchestration, memory system | Active Dev |
| 10 | Ticker | macOS menu bar calendar | Swift 5.9, SwiftUI | — | Production |
| 11 | LeadMapsHub | Google Maps lead scraper | Chrome Extension, Manifest V3 | — | Development |
Production URLs: 11sqft.com | broker.11sqft.com | jarvis.rajdeepgupta.in | polymarket.rajdeepgupta.in | gettickerapp.com
# Part 1: The 11sqft Real Estate Ecosystem
The Problem
Indian real estate brokerage is fundamentally broken. Brokers operate through WhatsApp groups, maintain listings in Excel sheets, and have zero digital presence. There's no "Shopify for brokers."
11sqft is a suite of 6 interconnected products solving this.
1.1 Platform (11sqft.com)
Backend API + admin dashboard. 3-layer architecture:
API Layer (public/v1 + admin/v1) — validation, auth, CORS, rate limiting
Service Layer (48+ classes) — business logic, cache invalidation
Repository Layer (30+ models) — SQL via postgres.js tagged templates
PostgreSQL (Neon serverless)
Stack: Next.js 15.5, React 19, TypeScript 5.5, MUI 6.2 + Tailwind, BullMQ + Redis, NextAuth 4.24, Mapbox GL, Sentry, Vercel with crons.
Tables: people, properties, property_groups, leads, favorites, builders, amenities, landmarks, addresses, feedback, region-profiles, media, cache.
1.2 Broker OS (broker.11sqft.com) — The Flagship
"Shopify for Real Estate Brokers" — the product I'm betting everything on.
The AI Story — Conversation OS:
A 36-file multi-turn dialog engine with:
- 27 intents with slot-filling state machine
- Voice-first property entry — brokers speak into WhatsApp/Telegram, system parses and structures
- Client relationship extraction from conversation logs
- Weekly engagement digest — AI-generated broker activity summaries
WhatsApp/Telegram Message → AiSensy / Bot API
→ Intent Classification (27 intents)
→ Slot Filling State Machine
→ Action Execution (property creation, lead capture, CRM)
→ Response Generation → Channel Delivery
Stack: Next.js 15, Convex 1.17, Firebase Phone Auth, Cloudflare R2 + Images, OpenAI API, next-intl (3 languages), Vitest (2,529 tests) + Playwright (156 E2E tests).
Convex Schema (modular): brokerTables, propertyTables, networkingTables, conversationTables, brokerMemoryTables, referralTables, analyticsTables, complianceTables, adminTables, salesTables.
1.3-1.6 Other 11sqft Projects
- Broker Mobile: React Native 0.81 + Expo 54. Camera/photo upload, deep linking. iOS + Android ready.
- Properties App: Next.js 16.1, Mapbox GL, Framer Motion, TanStack Query v5. Consumer property search.
- 11sqft AI: Next.js 16.1 + Python Flask. 99acres scraper (BeautifulSoup), Vercel AI SDK v6.
- Academy: Next.js 16.1 + MDX, next-intl v4.7, recharts, service worker. Custom LLM generation scripts.
Architecture: How They Connect
- Broker OS uses Convex (real-time for messaging/notifications)
- Platform uses PostgreSQL/Neon (relational for complex property search)
- Separate databases by design — different access patterns
- Shared domain (11sqft.com) with Cloudflare DNS
# Part 2: Entellect — AI-Powered Medical Education (DEEP DIVE)
The Problem
ENT postgraduate students study from 9+ textbooks (Dhingra, Scott-Brown, Cummings). No intelligent tool does cross-textbook retrieval, exam/clinical mode adaptation, spaced repetition, or generates practice MCQs from textbook content.
RAG Pipeline Architecture
User Query
→ Rule-Based Query Classifier (ZERO LLM cost)
├── Mode: Clinical vs Exam (keyword signals)
├── Complexity: Reasoning vs Simple ("why", "explain" triggers)
└── Topics: 40+ ENT keywords extracted
→ OpenAI Embedding (text-embedding-3-small, direct API)
→ Neon pgvector Cosine Similarity (top 20 chunks)
→ Topic Boosting (1.2x multiplier for matching topics)
→ Source Tiering (Tier 1: Indian textbooks for exam, Tier 2: international for clinical)
→ Deduplication (near-duplicate removal)
→ Conditional Reranking (GPT-4o-mini, ONLY for "reasoning" queries)
→ Generation (Claude Sonnet 4.6 via Vercel AI Gateway + OIDC)
→ Structured citations at end
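The zero-cost classifier at the top of this pipeline can be sketched with plain keyword matching; the keyword sets below are tiny illustrative stand-ins, not the production lexicon:

```python
# Rule-based query classification: zero LLM cost, pure keyword signals.
# Keyword lists are illustrative stand-ins for the real signal sets.
CLINICAL_HINTS = {"patient", "presents", "management", "treat"}
REASONING_HINTS = {"why", "explain", "how", "compare"}

def classify(query: str) -> dict:
    words = set(query.lower().split())
    return {
        "mode": "clinical" if words & CLINICAL_HINTS else "exam",
        "complexity": "reasoning" if words & REASONING_HINTS else "simple",
    }

c = classify("Why does cholesteatoma erode bone?")
assert c["complexity"] == "reasoning"  # only these queries pay for reranking
assert c["mode"] == "exam"
```

Only queries flagged "reasoning" pay for the GPT-4o-mini reranking stage; everything else skips it.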
Semantic Chunking Strategy
- Max chunk size: ~1,200 tokens with 200-token overlap
- Section-aware parsing: PDF → pdf-parse → heading-based segmentation → token estimation
- 14 content type classifications (detected per chunk via 30-50 keyword indicators each):
Definition | Anatomy | Etiology | Pathology | Clinical Features | Investigations | Differential Diagnosis | Treatment | Complications | Classification | Prognosis | Epidemiology | Surgical Procedures | Pharmacology
No LLM needed for classification — pure keyword matching. This saves significant cost on high-volume indexing.
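The keyword-count approach amounts to scoring each chunk against each type's indicator list and taking the max. A minimal sketch, with three of the 14 types and tiny stand-in indicator lists:

```python
# Per-chunk content-type detection by keyword-indicator counts (no LLM call).
# Indicator lists here are small illustrative stand-ins for the 30-50 real ones.
INDICATORS = {
    "Treatment": ["therapy", "surgery", "antibiotic", "management"],
    "Anatomy": ["nerve", "artery", "muscle", "cartilage"],
    "Clinical Features": ["symptom", "sign", "presents", "pain"],
}

def content_type(chunk: str) -> str:
    text = chunk.lower()
    scores = {t: sum(text.count(k) for k in kws) for t, kws in INDICATORS.items()}
    return max(scores, key=scores.get)  # highest indicator count wins

assert content_type("The facial nerve runs near the stapedius muscle.") == "Anatomy"
assert content_type("First-line therapy is antibiotic coverage before surgery.") == "Treatment"
```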
PDF Extraction Experiments (3 methods compared)
| Method | Text Coverage | Accuracy | Cost | Speed | Verdict |
|---|---|---|---|---|---|
| Gemini Vision batch-35 | 0% (total failure) | 0% | $0.078 | 493s | FAILED |
| Gemini Vision batch-50 | 37% (cascading failure after page 56) | 25% chapter | $0.016 | 133s | FAILED |
| pdftotext (native) | 131% (overcoverage) | 0% structural | $0 | 0.8s (300x faster) | PASSED |
Key finding: Gemini Vision batch extraction hit a degradation wall after page 56 — API context window limits or cumulative token exhaustion. Native pdftotext extracts raw text reliably but loses structural metadata. Decision: Hybrid approach — pdftotext for text, selective LLM refinement for metadata.
Content Generation Pipeline
All generators use Claude Sonnet 4.6 with 20 chunks retrieved per topic:
| Content Type | Per Topic | Total (30 topics) | Key Details |
|---|---|---|---|
| MCQs | 5 | 150 | 30% easy / 40% medium / 30% hard, 6 trap types |
| Flashcards | 5 | 150 | 4 types: definition, concept, clinical, mnemonic |
| Notes | 1 | 30 | 9 required sections (Definition → Mnemonics) |
| Viva | 3 | 90 | question + model_answer + examiner_notes + common_mistakes |
| Total | 14 | 420 | per generation run |
MCQ Trap Engine (Phase B): 6 engineered trap types — conceptual_confusion, similar_options, outdated_concept, overthinking_trap, negative_framing, partial_knowledge. Each MCQ has structured explanation_v2 (JSONB): correct_reasoning + why_not_others for each option.
Database Schema Evolution
Phase A (Source Classification): Added source_type, weight (0.0-3.0), domains to documents. Dhingra = exam_standard, weight=1.0. Enables tier-based retrieval weighting.
Phase B (Content Upgrade, 8 new tables): media (images/audiograms/CTs), pyqs (previous year questions with year/exam/session), topic_weights (pyq_count, trend, yield_tier), cases + case_steps (branching clinical casebook with OSCE scoring), drug_interactions, session_logs, user_annotations.
PYQ Intelligence: 30 topics x 5 concepts = 150 base PYQ questions. Topic weights track pyq_count, pyq_last_5_years, trend (rising/stable/declining), yield_tier (high/medium/low). High-yield topics get more generated content.
Dual Database Architecture
- `CONTENT_DATABASE_URL` (Neon, us-east-1) → MCQs, flashcards, notes, viva, generation_log
- `DATABASE_URL` (Neon, us-east-1) → embeddings, chunks (pgvector), documents
Cost tracking: Every RAG generation logged with tokens_used and cost_usd (Sonnet: $3/M input, $15/M output).
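At those rates, the logged cost_usd for a call is simple token arithmetic:

```python
# cost_usd for one generation call at Sonnet pricing ($3/M input, $15/M output).
def sonnet_cost_usd(input_tokens: int, output_tokens: int) -> float:
    return input_tokens / 1e6 * 3.0 + output_tokens / 1e6 * 15.0

# e.g. a 20-chunk retrieval context (~25k tokens in) yielding ~2k tokens out:
assert round(sonnet_cost_usd(25_000, 2_000), 3) == 0.105
```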
Current State
Shipped: MCQ practice, mock exams (50/100/200 Q), flashcards (SM-2 spaced repetition), viva practice, notes, mistake bank, progress dashboard, RAG Q&A, topic mastery heatmap. 25 API routes.
In Progress: Phase B migration, clinical casebook framework, drug system.
# Part 3: Raven — ML-Powered Autonomous Trading (DEEP DIVE)
The Problem & Evolution
| Version | Approach | Result | Learning |
|---|---|---|---|
| v1 | Rigid AND gates (RSI<30 AND BB<lower AND ADX>25) | 9,607 cycles, 0 trades | Rules too tight |
| v2 | BB mean reversion + intelligence layer | 2 signals/month, $3,335 PnL but Sharpe -0.84 | Signal-starved |
| v3 | ML signals + multi-TF alignment + LLM veto | 4.9 signals/day at P>0.6 | Hybrid approach |
V3 philosophy: "Indicators detect. ML predicts. LLMs veto. Risk protects."
ML Training Pipeline
Historical OHLCV (24 months, 1h candles)
→ Feature Engineering (98 features per coin)
→ Walk-Forward Validation (6mo train / 1mo test, 16 folds, 20-candle purge buffer)
→ XGBoost Training (Python, scikit-learn)
- n_estimators=200, max_depth=4, learning_rate=0.05
- subsample=0.8, colsample_bytree=0.8, min_child_weight=5
- Multi-class softmax (LONG/SHORT/HOLD)
- compute_sample_weight('balanced') for ~70% HOLD class imbalance
- Early stopping (20 rounds)
→ Platt Scaling Calibration (11,680 out-of-fold samples)
→ ONNX Export (skl2onnx)
→ TypeScript Inference (onnxruntime-node, Bun runtime)
- Feature parity verified: 57/57 features match Python↔TS at 1e-11 precision
- ONNX model accuracy: max probability difference 6e-8 across all 3 models
Models: BTC (419KB), ETH (413KB), SOL (421KB) — all ONNX format.
Feature Engineering (98 Total Per Coin)
19 indicators x 5 timeframes (15m, 1h, 4h, 1d, 1w) = 95 per-timeframe features:
| Category | Features |
|---|---|
| Momentum | rsi_14, stoch_k, stoch_d, adx_14 |
| Trend | ema_12_26_diff, ema_50_200_diff, price_vs_ema20, macd_histogram, macd_signal_diff |
| Volatility | bb_position, bb_z_score, bb_bandwidth, atr_pct |
| Volume | vol_ratio (vs SMA20), obv_slope |
| Price Action | price_position (quantile), ret_1bar, ret_5bar, ret_20bar |
3 cross-timeframe features: trend_alignment, momentum_divergence, vol_regime_expanding
Top importances (XGBoost): 4h_vol_ratio, vol_regime_expanding, 4h_ret_1bar
Walk-Forward Validation (Lopez de Prado methodology)
- Train windows: 6 months (~4,380 1h candles)
- Test windows: 1 month (~730 candles)
- Purge buffer: 20 candles between train/test (prevents lookahead bias)
- Min folds: 5 before deployment gate
- Total folds: 16+ per asset
- Calibration: Platt scaling on out-of-fold predictions (11,680 total samples)
Current results: 53.9% accuracy on BTC (target >55%), train-test gap 0.180 (target <0.15, improved from 0.235 via regularization).
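A purged walk-forward splitter under those parameters can be sketched in a few lines; this is an illustrative implementation of the scheme, not the actual Raven code:

```python
# Purged walk-forward splitter: 6-month train / 1-month test windows with a
# purge buffer between them to block lookahead leakage (Lopez de Prado style).
def walk_forward_folds(n_candles, train=4380, test=730, purge=20):
    """Yield (train_range, test_range) index pairs, rolling forward one test window."""
    folds = []
    start = 0
    while start + train + purge + test <= n_candles:
        folds.append((
            (start, start + train),                                 # train slice
            (start + train + purge, start + train + purge + test),  # test slice
        ))
        start += test  # advance one month
    return folds

folds = walk_forward_folds(17_520)  # ~24 months of 1h candles
assert len(folds) >= 16
for (tr0, tr1), (te0, te1) in folds:
    assert te0 - tr1 >= 20  # every test window sits past the purge buffer
```

Out-of-fold predictions from these folds are what feed the Platt-scaling calibration step.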
Multi-LLM Veto Layer (Frank Morales Pattern)
ML Signal (P>0.6 threshold)
→ Multi-Timeframe Alignment Check
- 1h (40%) + 4h (30%) + 15m (20%) + 1d (10%)
- Score ≥0.6 → full position, 0.4-0.6 → half, <0.4 → skip
→ Multi-LLM Ensemble Veto
- Claude Haiku 4.5 (screening) + DeepSeek V3 (ensemble diversity)
- Each returns single float: -1.0 (bearish) to +1.0 (bullish)
- Consensus gate: average scores, either fails → block trade
- Cost: <$5/month total API spend
→ Kelly-Criterion Position Sizing (Platt-calibrated probabilities)
→ Execution (Bybit v5 REST via CCXT)
Context Layer ("World Brain")
8 data sources aggregated via PageRank-weighted synthesis:
- Funding rates (perpetual contract cost)
- Open interest (whale positioning)
- News sentiment (polarity scoring)
- Macro calendar (Fed events)
- Fear & Greed Index
- Polymarket prediction odds
- Long-short ratio (leverage positioning)
- CoinMarketCap spot context
Polymarket Forecaster (Separate Module)
- Two-stage debiased AI estimation — Haiku pre-filter sees NO market price before forming initial estimate (anti-anchoring)
- Kelly sizing: 0.05x fractional (very conservative)
- 3-gate filter: confidence >0.7, edge >10%, articulable hypothesis
- Paper trading verified, running on VPS via pm2
- Dashboard: polymarket.rajdeepgupta.in/status
V2 Backtest Results (Before v3 Fixes)
| Metric | Value | Target |
|---|---|---|
| Total PnL | +$3,335 | Positive |
| Win Rate | 20.16% | >33% (with 2:1 R:R) |
| Max Drawdown | -$2,194 | <$1,500 |
| Total Trades | 191 | — |
| Sharpe Ratio | -0.84 | >1.0 |
Next: CNN-LSTM model (Conv1D 3 layers 64→32 + LSTM(64) + Dense(32) → softmax(3)), ensemble with LightGBM/TabNet.
# Part 4: Jarvis — Personal AI Second Brain (DEEP DIVE)
The Vision
Jarvis is NOT an AI assistant. It's a personal AI operating system — a system that manages engineering projects (6 repos), personal productivity, finances, learning, research, smart home, and life admin. The goal: one interface for everything, powered by specialized AI agents that coordinate, learn, and converge toward my actual decision-making patterns.
What Makes This More Than "Claude with Tools"
- It learns. Every divergence between system suggestion and actual action is a gradient signal. Over 100+ decisions, the system converges toward my real patterns.
- It has memory. 4-tier memory hierarchy with 5-layer retrieval. Knowledge decays, gets resurfaced, gets rated.
- It has a body. OpenClaw daemon runs 24/7 on VPS with 20+ messaging platforms, 100+ skills.
- It coordinates agents. 12+ specialized agents with different models, budgets, and authorities.
- It controls my environment. Smart home integration — lights, fans, AC, cameras via command queue.
- It builds itself. Engineering pipeline dispatches agents that write code, review it, and create PRs autonomously.
Evolution: v1 → v2
v1 (Jan-Feb): Simple orchestration. Polling-based dispatch (Plane every 5 min). Burned 7+ Claude sessions/day with nothing to dispatch. Single agent model.
The Pivot (March): Killed polling. Event-driven dispatch. Zero idle cost. Built full state machine.
v2 (March-April): Full multi-agent system. 32 Convex tables. VPS with autonomous agents. Telegram approval flow. Knowledge engine. Personal productivity layer.
"Make Jarvis Usable" Pivot (April): After 50+ sessions building infrastructure, realized all personal productivity agents were stopped on VPS. 3 sprints to activate what existed before building more.
The 7-Step Core Loop (Intelligence Convergence Engine)
Step 0: CAPTURE → Nexus sensory layer (Chrome ext, CLI, Telegram, Donna auto-scan)
All inputs → knowledge_inbox → /digest → knowledge_items with embeddings
Step 1: DISCOVER → Surface relevant knowledge based on current context
Step 2: CONTEXTUALIZE → Assemble multi-source context for the task
Step 3: EXECUTE → Route to appropriate agent with assembled context
Step 4: EVALUATE → Quality gates, review, human approval
Step 5: LEARN → Feedback signals, decision logging, agent learnings
Step 6: MONITOR → Health checks, cost tracking, drift detection
→ Back to Step 0 (continuous loop)
Memory Architecture (4-Tier, Hermes-Inspired)
┌─────────────────────────────────────────────────────────┐
│ SENSORY MEMORY — Raw captures │
│ knowledge_inbox table, recent context, unprocessed inputs │
│ Retention: hours. Everything enters here first. │
├─────────────────────────────────────────────────────────┤
│ WORKING MEMORY — Current session context │
│ Active conversation, recent decisions, task state │
│ 5-source context builder (multiplier effect) │
├─────────────────────────────────────────────────────────┤
│ EPISODIC MEMORY — Timeline of events │
│ episodic_events table, session_summaries, DecisionLog │
│ "What happened when?" — temporal retrieval │
├─────────────────────────────────────────────────────────┤
│ LONG-TERM MEMORY — Patterns, learnings, knowledge │
│ knowledge_items with embeddings (1536-dim vectors) │
│ FTS5 full-text index (2,321 sections across 7 repos) │
│ Confidence decay: items lose relevance unless accessed │
│ Fields: accessCount, lastAccessed, userRating, decayedScore │
└─────────────────────────────────────────────────────────┘
5-Layer Retrieval Architecture
| Layer | Method | What It Finds | Speed |
|---|---|---|---|
| 1. Keyword/FTS5 | Full-text search | Exact terms, file names, function names | Fast |
| 2. Vector Similarity | Embedding cosine similarity (1536-dim) | Semantically related content | Medium |
| 3. Temporal Recency | Timestamp-based scoring | Recent decisions, fresh context | Fast |
| 4. Episodic Links | Decision history traversal | Past similar situations and outcomes | Medium |
| 5. Topic Clustering | Theme coherence scoring | Related knowledge across domains | Slow |
Context is assembled by combining results from all 5 layers, weighted by task type. Engineering tasks weight keyword/vector higher. Personal tasks weight episodic/temporal higher.
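The weighted combination can be sketched directly; the weight values below are illustrative, not the tuned production weights:

```python
# Combine the five retrieval layers with task-type weights.
# Weight values are illustrative stand-ins for the tuned per-task weighting.
WEIGHTS = {
    "engineering": {"keyword": 0.3, "vector": 0.3, "temporal": 0.15,
                    "episodic": 0.15, "topic": 0.1},
    "personal":    {"keyword": 0.1, "vector": 0.2, "temporal": 0.3,
                    "episodic": 0.3, "topic": 0.1},
}

def combined_score(layer_scores: dict, task_type: str) -> float:
    w = WEIGHTS[task_type]
    return sum(w[layer] * layer_scores.get(layer, 0.0) for layer in w)

scores = {"keyword": 0.9, "vector": 0.8, "temporal": 0.2,
          "episodic": 0.1, "topic": 0.5}
# the same hit ranks higher for an engineering task than a personal one
assert combined_score(scores, "engineering") > combined_score(scores, "personal")
```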
Nexus Knowledge Pipeline
CAPTURE (multiple entry points):
Chrome Extension → URL + highlights + context
CLI /capture → quick note, idea, URL
Telegram forward → messages, links, files
Donna auto-scan → email signals, calendar, repo changes
↓
knowledge_inbox (Convex table — raw, unprocessed)
↓
/digest processor:
→ Summarize content
→ Extract tags and topics
→ Generate embeddings (1536-dim vectors)
→ Score relevance and quality
→ Create knowledge_connections (graph relationships)
↓
knowledge_items (processed, searchable, decayable)
↓
RESURFACING:
Donna queries Nexus at 7:30 AM and 10 PM IST
Surfaces items based on: current context + decay score + topic relevance
Items accessed get accessCount++ and freshness boost
Items ignored decay further
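The decay-plus-boost scoring behind resurfacing can be sketched as exponential decay on time since last access, lifted by access count and user rating. The half-life and boost factors here are illustrative assumptions, not the actual decayedScore formula:

```python
# Confidence decay with access boost: items lose relevance over time unless
# accessed or rated up. Half-life and boost constants are illustrative.
def decayed_score(base: float, days_since_access: float, access_count: int,
                  user_rating: float = 1.0, half_life_days: float = 30.0) -> float:
    decay = 0.5 ** (days_since_access / half_life_days)
    boost = 1.0 + 0.1 * min(access_count, 10)  # capped freshness boost
    return base * decay * boost * user_rating

fresh = decayed_score(1.0, days_since_access=1, access_count=5)
stale = decayed_score(1.0, days_since_access=90, access_count=0)
assert fresh > stale   # recently-accessed items resurface first
assert stale < 0.2     # ~3 half-lives -> mostly decayed
```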
The Gradient Descent Intelligence Model
Core insight: The system improves through user corrections, not architectural perfection.
decision-model.md (initial hypothesis — best guess of my decision patterns)
↓
System proposes action/suggestion
↓
I accept / reject / modify
↓
DecisionLog records: (context, suggestion, actual_action, outcome, delta)
↓
Every divergence between suggestion and actual action = gradient signal
↓
Over 100+ decisions, the model converges toward my ACTUAL patterns
↓
Trust calibration: proactive suggestions ignored >70% initially
→ Feedback loop re-weights categories
→ Auto-demotion of consistently-ignored suggestion categories after 2-3 weeks
Convergence timeline:
| Stage | Timeline | System Behavior |
|---|---|---|
| Smart assistant | Week 1-2 | Follows rules, applies heuristics |
| Pattern learner | Week 3-6 | Feedback loops active, starts adapting |
| Aligned partner | Month 2-3 | 100+ DecisionLog entries, suggestions become genuinely useful |
OpenClaw Integration (Body vs Brain)
| | OpenClaw (Body) | Jarvis (Brain) |
|---|---|---|
| Role | Always-on daemon, personal automation | Engineering pipeline, task dispatch |
| Runtime | 30-min heartbeat on VPS | On-demand (CLI or Telegram trigger) |
| Model | Claude Haiku 4.5 ($5/day cap) | Claude Opus 4.6 (orchestration) |
| Platforms | 20+ messaging platforms, Telegram primary | Claude Code CLI, Plane |
| Skills | 100+ (calendar, gmail, plane-tasks) | Engineering agents, code review |
| Memory | SQLite journals (local) | Convex (32 tables, shared) |
| Bridge | Nexus (shared knowledge base, embeddings, FTS5) | Nexus (same, shared) |
Handoff flow: OpenClaw detects engineering request → writes trigger file → handoff-watcher picks up → launches Claude Code session → Friday agent executes → Groot reviews → Telegram approval → PR created.
Multi-Agent Architecture (12+ Agents)
Core Team (5 VPS agents via pm2):
| Agent | Role | Model | When |
|---|---|---|---|
| Jarvis | Orchestrator | Opus 4.6 | On-demand |
| Friday | Lead Engineer | Sonnet 4.6 | Trigger-based |
| Groot | Code Reviewer | Sonnet 4.6 | After every code change |
| Donna | Executive Intelligence | Haiku 4.5 | Scheduled (7:30 AM, 10 PM) |
| Bran | System Observer | Haiku 4.5 | Always-on (health monitor) |
Specialist Agents: Rocket (research), Vision (architecture), Engineering Agent (autonomous pipeline), Reviewer, Planner, Frontend/Backend/Mobile Dev, Security Auditor, Design Engineer, Test Writer.
Engineering Agent Pipeline (Autonomous)
/work JARVI-42
→ Jarvis reads Plane task → moves to In Progress
→ Dispatches engineering-agent (Sonnet, 60 max tool calls)
→ Agent reads knowledge base + 72 learnings entries
→ Researches codebase → writes plan to temp file
→ Creates feature branch → implements changes
→ Runs quality gates (tsc, lint, build)
→ Self-reviews → commits → pushes branch
→ Returns AWAITING_APPROVAL
→ Jarvis dispatches Groot (reviewer)
→ If APPROVED: sends Telegram message with Approve/Reject buttons
→ I tap Approve on phone
→ Jarvis creates draft PR → updates Plane → cleans up
13-State Task Machine
idle → queued → running → [intermediate states] → completed
↓
blocked → awaiting_approval → approved → completed
↓
rate_limited → (checkpoint saved) → resumed
↓
failed → (retry logic, 5 failure types) → running
Safety: Optimistic concurrency (version field), full audit trail (task_events), dry-run mode, kill switch, local presence flag (pauses VPS when I work locally), mid-task checkpoints, cost caps.
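The transition rules plus the optimistic-concurrency version check can be sketched together; the transition table below is a simplified subset of the 13-state machine:

```python
# Allowed transitions (simplified subset) plus an optimistic-concurrency
# guard on a version field: a stale write is rejected, not silently applied.
TRANSITIONS = {
    "idle": {"queued"},
    "queued": {"running"},
    "running": {"completed", "blocked", "rate_limited", "failed", "awaiting_approval"},
    "blocked": {"awaiting_approval"},
    "awaiting_approval": {"approved"},
    "approved": {"completed"},
    "rate_limited": {"running"},   # resumed from checkpoint
    "failed": {"running"},         # retry path
}

def transition(task: dict, to: str, expected_version: int) -> dict:
    if task["version"] != expected_version:
        raise RuntimeError("stale write: task was modified concurrently")
    if to not in TRANSITIONS.get(task["state"], set()):
        raise ValueError(f"illegal transition {task['state']} -> {to}")
    return {"state": to, "version": task["version"] + 1}

t = {"state": "idle", "version": 0}
t = transition(t, "queued", 0)
t = transition(t, "running", 1)
assert t == {"state": "running", "version": 2}
```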
Convex Data Model (32 Tables)
Knowledge (6): knowledge_inbox, knowledge_items, topics, knowledge_connections, doc_insights, episodic_events
Agents (10): agent_state, agent_messages, agent_messages_dryrun, agent_decisions, agent_metrics, agent_learnings, agent_checkpoints, rate_limit_state, task_runtime_state, task_events
Memory & Scheduling (8): donna_config, donna_engagement, session_summaries, daily_budget, usage_cache, telegram_jobs, telegram_approvals, execution_log
Smart Home (2): smart_home_commands, smart_home_state
Plus: local_presence, openai_usage, and more
Framework Synthesis (3-Framework Hybrid)
| Framework | What We Adopted | What We Didn't |
|---|---|---|
| Hermes Agent | Confidence decay, 5-layer memory abstraction, learning patterns | Full architecture (too coupled) |
| gstack | Artifact chaining (research → plan → code — each output becomes next input) | CLI workflow (not agent-native) |
| Paperclip AI | Heartbeat daemon (30 min), budget tracking, org chart of agents | Monolithic orchestrator |
Continuous Learning System
Agent Learnings (72+ entries): Engineering agent reads agent-learnings.md at start of every task. Updated whenever: reviewer blocks code, I reject approval, quality gates fail. Format: What went wrong → Root cause → Rule. Tags enable context-aware injection.
Memory Write Authority: Only Reviewer/Jarvis writes to persistent memory. All other agents propose learnings via structured output. Jarvis reviews and decides what persists. Prevents multi-agent garbage — discovered this after conflicting, low-quality entries accumulated.
Cost Management
Two Claude accounts tracked (personal Max + team office):
| 5h Utilization | Level | Behavior |
|---|---|---|
| < 50% | Full | 3-5 parallel agents |
| 50-70% | Moderate | Max 2-3 parallel |
| 70-85% | Conservative | Single agent only |
| 85-95% | Emergency | Complete current task only |
| > 95% | Paused | Write handoff, defer all |
Adaptive poller: 1h idle, 5m when agents active, exponential backoff on 429s.
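The adaptive interval reduces to a base rate picked by activity, multiplied by capped exponential backoff on consecutive 429s. A sketch with illustrative constants (jitter added to avoid synchronized retries):

```python
import random

# Adaptive poll interval: 5m when agents are active, 1h idle, exponential
# backoff (capped, jittered) after consecutive 429s. Constants illustrative.
def poll_interval_s(agents_active: bool, consecutive_429s: int) -> float:
    base = 300 if agents_active else 3600
    backoff = min(2 ** consecutive_429s, 16)  # cap the multiplier at 16x
    return base * backoff + random.uniform(0, 5)

assert 300 <= poll_interval_s(True, 0) <= 305
assert poll_interval_s(True, 3) >= 2400   # 300 * 2^3
```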
Jarvis Dashboard (jarvis.rajdeepgupta.in)
Stack: Next.js 16.1, React 19.2, Tailwind CSS 4, shadcn/ui, Convex 1.33
Pages: /personal, /memory, /projects, /agents/[name] (7 agents), /agents/conversations, /docs, /observability, /focus, /nexus, /sessions/[date], /events, /smart-home
Data sources: Plane API (ISR 60s), Convex (real-time), markdown files copied during prebuild (sessions, subscriptions, goals, learning-log, agent-learnings, behavioral-rules).
Smart Home Integration
Command queue architecture:
Dashboard/Jarvis/OpenClaw → queueCommand(source, command, payload) → Convex
↓ Home Hub (Python, pm2) polls every 2-3s via pollCommands()
↓ Executes via device APIs (tinytuya for Tuya, Tapo SDK, LG ThinQ, EZVIZ)
↓ reportResult(id, status, result, error) → Convex
↓ Dashboard shows real-time status (stale detection at >2 min)
15+ devices: 5 Tapo lights, 4 Tuya multi-gang switches, 2 Atomberg fans, 1 LG AC, 2 EZVIZ cameras.
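The queue round-trip above can be sketched with an in-memory stand-in; the function names mirror the queue API described in the flow, but the bodies are illustrative (real execution goes through tinytuya, the Tapo SDK, etc.):

```python
import time

# Command-queue hub loop, simulated in-memory: poll queued commands, execute
# against a device driver, report results back. Implementation is a sketch.
queue = [{"id": 1, "command": "set_light", "payload": {"level": 40},
          "status": "queued"}]

def poll_commands():
    return [c for c in queue if c["status"] == "queued"]

def execute(cmd):
    # stand-in for a device-API call (tinytuya / Tapo SDK / ThinQ)
    return {"ok": True, "applied": cmd["payload"]}

def report_result(cmd_id, status, result):
    for c in queue:
        if c["id"] == cmd_id:
            c.update(status=status, result=result, reported_at=time.time())

for cmd in poll_commands():
    report_result(cmd["id"], "done", execute(cmd))

assert queue[0]["status"] == "done"
assert queue[0]["result"] == {"ok": True, "applied": {"level": 40}}
```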
Smart Scenes:
| Scene | What It Does |
|---|---|
| coding | Blue strips 40%, room light off, fan speed 3 |
| goodnight | All lights off, fans speed 2 |
| movie | Lights off except bed back 20% purple, fan speed 2 |
| wake_up | Warm lights 60%, fan off |
The "Second Brain Intelligence" Plan (Latest, April 2026)
ExecutionUnit abstraction (Tier 0 — must exist before any intelligence): All system operations unified under one tracking model with: input/output artifacts, tools_allowed, budget, state, checkpoint, memory_authority, goal_ancestry, feedback_signals.
5-Phase Timeline:
- Phase 1F (1 week): ExecutionUnit, feedback signals, memory write authority
- Phase 1G (2 weeks): Confidence decay, goal ancestry, failure replay, Nexus feedback loop
- Phase 2A (2 weeks): Compound learning — OpenClaw skill self-improvement, Donna learning loop
- Phase 2B (2 weeks): Rajdeep OS — decision model, role engine, delegation, proactive loop
- Phase 2C (1 week): E2E + ship → then 4-week calibration period
# Part 5: Local AI Experiments — MLX on Apple Silicon
What We Tried (M1 Max 32GB)
| Experiment | Model | Result |
|---|---|---|
| OCR | Qwen2-VL-2B-4bit (mlx-vlm) | Good for printed text |
| Text generation | Qwen3-4B-4bit (mlx-lm) | Good for summarization |
| Multi-model debate | Qwen3-4B (3 agents debating NVDA) | Worked! 1 round ~18s |
| Local embeddings | nomic-modernbert-embed-base-4bit | 44,000 tok/s throughput |
| Image generation | Flux.1-schnell (mflux) | Abandoned — 15GB download |
MLX vs Alternatives
| | MLX | Ollama (MLX backend) | llama.cpp | Cloud API |
|---|---|---|---|---|
| Speed on Apple Silicon | Fastest (native) | Same (uses MLX now) | 20-87% slower | Network-bound |
| VLM support | Full | Limited | Partial | Full |
| Fine-tuning | Yes (LoRA) | No | No | No |
| Cost | Zero | Zero | Zero | Per-token |
Verdict: Local models are for experimentation and privacy-sensitive tasks, not a production replacement at our scale. 4B models are the sweet spot for an M1 Max — interactive speed while leaving room for other apps. 7B saturates resources.
Potential Use Cases
| Use Case | Model | Benefit |
|---|---|---|
| Entellect PDF OCR | DOTS-OCR / DeepSeek-OCR | Zero cost, medical data stays local |
| Local embeddings | nomic-modernbert | 44k tok/s, zero API cost |
| Broker OS doc extraction | DeepSeek-OCR + Qwen3-8B | Privacy for client documents |
| LoRA fine-tuning | Qwen3-8B on medical Q&A | Domain-specific quality boost |
# Part 6: Side Projects
Ticker — macOS Menu Bar Calendar
What: Native macOS app showing live meeting countdown in menu bar ("Standup in 23m" → "Standup NOW"). One-click Zoom/Meet join.
Stack: Swift 5.9, SwiftUI (native macOS, no Electron).
Business model: Free tier + Pro ($4.99 one-time). Competes with Fantastical, Dato, Meeter at the lowest price point.
Status: Production (v0.3.0), DMG available. Website: gettickerapp.com.
LeadMapsHub — Google Maps Lead Scraper
What: Chrome extension that extracts business data (names, phones, addresses, ratings, websites) from Google Maps with auto-scroll, enrichment, and multi-export (CSV, Excel, JSON).
Stack: Vanilla JS, Shadow DOM, Chrome Manifest V3. No server needed — runs entirely in browser.
Status: Development. Competes with Outscraper ($2/1000 leads), Scrap.io ($49/mo) at a lower price.
# Part 7: AI Model Usage Across All Projects
Model Selection Matrix
| Model | Provider | Used In | Purpose | Why This Model |
|---|---|---|---|---|
| Claude Opus 4.6 | Anthropic | Jarvis orchestrator | Complex reasoning, task routing | Highest quality for critical decisions |
| Claude Sonnet 4.6 | Anthropic | Jarvis agents, Entellect RAG, Raven analysis | Code gen, generation, trading | Best quality-cost ratio |
| Claude Haiku 4.5 | Anthropic | OpenClaw, Donna, Raven screening | Briefings, classification, veto | Low cost for high-volume tasks |
| GPT-4o-mini | OpenAI | Entellect reranking | Relevance scoring (conditional) | Good at reranking, cheaper than Claude |
| text-embedding-3-small | OpenAI | Entellect embeddings | Chunk vectors (1536-dim) | Proven quality for medical text |
| Gemini 2.0 Flash | Google | Entellect PDF extraction | Vision-based document parsing | Best vision for complex layouts |
| DeepSeek V3 | DeepSeek | Raven LLM ensemble | Trading veto (diversity) | Reduces single-model bias |
| XGBoost → ONNX | Local | Raven signal generation | Trading signal prediction | Best for tabular data, fast CPU training |
| Qwen3-4B (MLX) | Local | Experiments | Text gen, multi-model debate | Best 4B for Apple Silicon |
| nomic-modernbert (MLX) | Local | Experiments | Local embeddings | 44k tok/s, Matryoshka dims |
Model Selection Philosophy
- Cheapest model that works. Haiku for classification ($0.25/M), Sonnet for generation ($3/M), Opus only for orchestration ($15/M) — 90%+ savings.
- Right provider for each task. OpenAI for embeddings, Gemini for vision, Claude for reasoning, DeepSeek for ensemble diversity.
- Local for privacy. Medical data (Entellect) and client docs (Broker OS) benefit from on-device processing.
- Train your own when the task is specific. XGBoost for trading signals — on tabular data, a specialized model outperforms general LLMs.
- Ensemble for reliability. Raven uses Claude + DeepSeek together — reduces single-model bias.
# Part 8: Infrastructure & Services
Complete Service Map
| Category | Service | Purpose | Used By |
|---|---|---|---|
| Hosting | Vercel | All web apps (8 projects) | All |
| VPS | Hetzner CPX22 (Helsinki) | Agents, trading bot, smart home | Jarvis, Raven |
| DNS/CDN | Cloudflare | DNS, DDoS, R2 storage, Workers | Broker OS |
| Database | Convex | Real-time (agents, messaging) | Jarvis, Broker OS |
| Database | Neon PostgreSQL | Relational + pgvector | Platform, Entellect |
| Database | Supabase | Legacy platform data | Platform |
| Database | SQLite | FTS5 index (2,321 sections) | Jarvis |
| AI | Anthropic (Claude) | Primary LLM | All AI projects |
| AI | OpenAI | Embeddings + reranking | Entellect |
| AI | Google (Gemini) | Vision/PDF extraction | Entellect, 11sqft AI |
| AI | DeepSeek | Trading ensemble diversity | Raven |
| AI | Local MLX | Privacy processing | Experiments |
| AI | ONNX Runtime | ML model inference | Raven |
| Auth | Firebase | Phone OTP | Broker OS, Properties, Mobile |
| Messaging | AiSensy | WhatsApp Business API | Broker OS |
| Messaging | Telegram Bot API | Notifications, approvals | Jarvis |
| Maps | Mapbox GL + Google Maps | Property maps, geocoding | Platform, Properties |
| Project Mgmt | Plane (self-hosted API) | Task tracking (8 projects) | All |
| Monitoring | Sentry | Error tracking | Broker OS, Platform |
| Analytics | GA4 + Mixpanel | User analytics | Broker OS |
| Smart Home | Tapo, Tuya, Atomberg, LG, EZVIZ | IoT device control | Jarvis |
| Testing | Vitest + Playwright | Unit + E2E | Broker OS |
Why Cloudflare + Vercel Together?
- Vercel: application hosting, serverless functions, CI/CD
- Cloudflare: DNS management, DDoS protection, R2 object storage (cheaper than S3), image CDN optimization, Workers for edge logic (short-link redirects)
- Complementary: Vercel handles compute; Cloudflare handles CDN, storage, and DNS
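The Workers edge-logic point can be illustrated with a minimal short-link redirect handler. This is a sketch, not the actual Broker OS Worker: the KV binding name (`LINKS`) and the slug scheme are assumptions.

```typescript
// Sketch of a Cloudflare Worker resolving short links at the edge.
// Assumed: a KV namespace bound as LINKS mapping slug -> target URL.
interface Env {
  LINKS: { get(key: string): Promise<string | null> };
}

const worker = {
  async fetch(request: Request, env: Env): Promise<Response> {
    const slug = new URL(request.url).pathname.slice(1); // "/abc123" -> "abc123"
    const target = await env.LINKS.get(slug);
    if (!target) return new Response("Not found", { status: 404 });
    // Redirect at the edge; the origin (Vercel) is never touched.
    return Response.redirect(target, 302);
  },
};
```

Because the lookup and redirect both run at Cloudflare's edge, short links resolve without a round trip to the Vercel-hosted app, which is the "Vercel handles compute, Cloudflare handles edge" split in practice.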
#Part 9: Key Lessons & Pivots
- Infrastructure spiral is real. 50+ sessions went into building Jarvis infrastructure while all personal agents sat idle. The fix, "Make Jarvis Usable": activate before building more.
- Polling is expensive, events are cheap. Jarvis v1 burned 7+ sessions/day polling with nothing to dispatch. The fix: event-driven dispatch with zero idle cost.
- Rule-based classification saves real money. Entellect's query classifier uses zero LLM tokens, just keyword matching. Not every AI task needs an LLM.
- PDF extraction is harder than expected. Gemini Vision batch extraction fails after page 56. Native pdftotext at 300x the speed, with hybrid LLM refinement, won.
- The cheapest model that works is the best model. Haiku at $0.25/M vs Opus at $15/M: 90%+ savings without quality loss on simple tasks.
- Multi-agent garbage is real. Multiple agents writing to shared memory created conflicts. The fix: only Reviewer/Jarvis writes; other agents propose.
- Walk-forward validation matters. Without purge buffers and proper train/test splits, XGBoost accuracy was artificially inflated. Lopez de Prado's methodology fixed this.
- Always research before building. Weeks went into PineScript indicators before the approach turned out to be fundamentally flawed. Deep research (arXiv, GitHub) saves weeks.
- Gradient descent as system design. Don't optimize the architecture; optimize the feedback loop. Every feature must answer: "What signal does this create?"
- Signal generation was the real bottleneck. Raven v1-v2 had good intelligence layers but produced 0-2 signals/month. ML-generated signals (4.9/day) solved this.
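The walk-forward lesson can be sketched as a split generator that leaves a purge buffer between each train and test window, in the spirit of the Lopez de Prado methodology mentioned above. The window sizes and the `walkForwardSplits` helper are illustrative, not Raven's actual pipeline parameters.

```typescript
// Walk-forward splits with a purge buffer. Bars whose labels could
// overlap the test window are dropped from training to avoid leakage.
interface Split {
  train: [number, number]; // inclusive start, exclusive end (bar indices)
  test: [number, number];
}

function walkForwardSplits(
  nBars: number,
  trainSize: number,
  testSize: number,
  purge: number, // bars discarded between train end and test start
): Split[] {
  const splits: Split[] = [];
  let start = 0;
  while (start + trainSize + purge + testSize <= nBars) {
    splits.push({
      train: [start, start + trainSize],
      test: [start + trainSize + purge, start + trainSize + purge + testSize],
    });
    start += testSize; // roll the whole window forward by one test block
  }
  return splits;
}

// Illustrative run: 1000 bars, 500-bar train, 100-bar test, 20-bar purge.
const splits = walkForwardSplits(1000, 500, 100, 20);
```

Every test window starts `purge` bars after its train window ends, so a model never trains on bars whose forward-looking labels reach into the period it is evaluated on.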
#Part 10: What's Next
| Project | Next Milestone |
|---|---|
| Broker OS | THE BET — distribution > features. WhatsApp-first onboarding. |
| Entellect | Phase B completion, clinical casebook, drug system, LoRA fine-tuning experiments |
| Raven | CNN-LSTM model (Conv1D + LSTM → softmax), LightGBM/TabNet ensemble, testnet live trading |
| Jarvis | ExecutionUnit table, DecisionLog corpus to 100+, trust calibration at >50% acceptance |
| Ticker | App Store launch, Pro tier activation |
Long-term vision:
- Jarvis as Rajdeep OS: morning briefing → task triage → engineering dispatch → email → learning → evening wrap-up, all autonomous
- Entellect as a platform: expand beyond ENT to other medical specialties (same RAG, different knowledge)
- Raven live trading: graduate from testnet after calibration
- 11sqft as THE broker SaaS: "Shopify for Indian real estate brokers"
#Part 11: Links & References
Production URLs
| Project | URL |
|---|---|
| 11sqft Platform | 11sqft.com |
| Broker OS | broker.11sqft.com |
| Properties | 11sqft.com/properties |
| Jarvis Dashboard | jarvis.rajdeepgupta.in |
| Raven Polymarket | polymarket.rajdeepgupta.in |
| Ticker | gettickerapp.com |
GitHub Repositories
| Project | Repository |
|---|---|
| Platform | github.com/sethraj14/11sqft |
| Broker OS | github.com/sethraj14/11sqft-broker-os |
| Broker Mobile | github.com/sethraj14/11sqft-broker-app |
| Properties | github.com/sethraj14/11sqft-properties |
| 11sqft AI | github.com/sethraj14/11sqft-ai |
| Academy | github.com/sethraj14/11sqft-academy |
| Raven | github.com/sethraj14/raven |
| Ticker | github.com/sethraj14/ticker |
Document generated April 4, 2026. Covers work from January to April 2026 across 11 active repositories.