available for new opportunities

Craig Stueber

Applied AI Engineer

Builds and ships production LLM systems end to end. Doctoral researcher in AI safety.

“Work hard and be nice to people.”

current role · BHE GT&S · since 2025

research · Doctoral Candidate · National University

[email protected] · linkedin.com/in/craigstueber

experience

Work History

Senior Full Stack Engineer

AI Systems Integration

Berkshire Hathaway Energy (BHE GT&S) · 2025 – Present · Richmond, VA

Tech lead and people lead for a team of 6 engineers building Dekaflow 2.0, transitioning $100B+ in annual energy movement from on-prem to cloud.
Led early-stage AI agent R&D, designing a six-agent LangGraph pipeline for enterprise data understanding that cut business stakeholder analysis time by 90%.
Built and owned full-stack features end to end across Next.js, Java, MongoDB, and Azure, supporting gas flow scheduling, hourly quantity tracking, and a cross-cutting user preferences system, saving 10K+ hours monthly on operations.
Led enterprise-wide GitHub Copilot deployment across 200+ engineers, establishing behavioral guardrails and governance practices, reducing AI-introduced defects in production codebases.

LangGraph · LangSmith · Next.js · Java · MongoDB · Azure · React · TypeScript

Senior Full Stack Developer

LLM-Integrated Systems

Sauer Brands Inc · 2021 – 2025 · Richmond, VA

Sole engineer across 6 independent brand teams, building all customer-facing applications from 0 to 1 and delivering 140+ features across all brands.
Developed a customer service tool used daily by 10 reps, reducing critical issue resolution time from 36+ hours to 6–8 hours.
Automated priority classification of service requests, routing emergency-tier messages without manual triage and eliminating bottlenecks across 100+ daily incoming requests.
Built LLM-integrated pipelines for classification, summarization, and automated routing with controlled prompt A/B evaluations, saving the IT team 180+ hours monthly on customer processing.

React · Supabase · PostgreSQL · Redis · LLM Integration · Power Automate · TypeScript

Full Stack Engineer

ML-Enhanced IoT Systems

Talos IoT · 2021 · Glen Allen, VA

Integrated ML models for time-series anomaly detection and classification into backend services to identify sensor abnormalities and operational risks.
Built real-time IoT monitoring dashboards using React, Python, and WebSockets, translating ML outputs into actionable insights for field operators.

React · Python · WebSockets · ML Integration

Earlier Experience

2017 – 2021

Kurb Media · 2019 – 2021

Delivered 50+ client projects across React, PHP, WordPress, Shopify, and early AR prototypes. Managed requirements and timelines directly with clients, with 100% of projects delivered on time.

PresenceLearning · 2020 – 2021

Frontend modernization, email system consolidation, mobile UX improvements, and accessibility remediation.

Soar365 · 2021

Full WCAG and ADA audit and remediation, Wix platform migration, and staff accessibility training.

Freelance · 2017 – 2019

Built and delivered 10+ full-stack web applications for clients in publishing, real estate, and nonprofit industries. Owned requirements, scoping, and delivery; 100% of clients came through word-of-mouth referrals.

React · WordPress · Shopify · PHP · Accessibility · WCAG

projects

Notable Work

// featured

CodeRisk Advisor

AI Safety / LLM Systems

live → coderisk.craigstueber.com

Multi-agent AI security review system for Python, JavaScript, and TypeScript code. Combines OWASP Top 10 vulnerability scanning with AI-specific behavioral risk detection using a panel of specialized LLM agents that synthesize findings into conversational developer guidance.

LangGraph pipeline orchestrating five specialized agents: VulnScanner, BehavioralRisk, Skeptic, Remediation, Synthesizer
Skeptic agent actively disputes low-confidence findings to reduce false positives
Token-by-token SSE streaming with real-time agent status updates in the UI
Deployed on Google Cloud Run with LangSmith tracing for full observability
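
As a rough illustration of the Skeptic step described above, the following is a minimal sketch in plain Python, not the CodeRisk Advisor implementation. The `Finding` type, the `0.6` threshold, and the "has supporting evidence" check are all hypothetical stand-ins for whatever the real agent does with an LLM.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    rule: str          # hypothetical identifier for the reported issue
    confidence: float  # 0.0-1.0, as reported by an upstream scanner agent
    evidence: str      # code excerpt or reasoning backing the finding


def has_evidence(finding: Finding) -> bool:
    """Stand-in for the skeptic's challenge: is there anything backing the claim?"""
    return bool(finding.evidence.strip())


def skeptic_review(findings, threshold=0.6):
    """Split findings into kept and disputed lists.

    High-confidence findings pass through untouched. Low-confidence findings
    are challenged and survive only if they carry supporting evidence.
    """
    kept, disputed = [], []
    for finding in findings:
        if finding.confidence >= threshold or has_evidence(finding):
            kept.append(finding)
        else:
            disputed.append(finding)
    return kept, disputed
```

In a real pipeline the challenge would likely be another model call rather than a string check; the point of the sketch is only the shape of the step: low-confidence findings must survive a second look before they reach the developer.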

LangGraph · FastAPI · Python · OpenAI · Anthropic · Next.js · TypeScript · Google Cloud Run · LangSmith · SSE

// notable

DanceCard

Agentic System / Mobile Application

Co-founded and led all engineering for a cross-platform social mobile application. Built an agentic onboarding system using CrewAI alongside a full React Native application. Owned architecture, data modeling, and delivery independently from concept to App Store.

Agentic onboarding system using CrewAI with constrained generation patterns to maintain consistent, safe outputs in a consumer-facing context
Full cross-platform React Native application with real-time chat, event scheduling, and location-aware discovery across iOS and Android
Full App Store and Google Play submission including TestFlight and Play Console policy compliance

CrewAI · React Native · Supabase · Expo · TypeScript

Enterprise Platform

Dekaflow 2.0

High-stakes enterprise platform managing natural gas scheduling workflows that support billions in annual East Coast energy movement. Built on a modern React and cloud stack integrating with a 25-year-old Java and SQL legacy system.

React · Next.js · Java · MongoDB · Azure · LangGraph · TypeScript

High-Traffic Platform

Hot Tomato Summer

Multi-city restaurant voting platform reaching 30,000+ users in two weeks with rule-based fraud detection and voting anomaly dashboards.

React · Redux · Supabase · Python · Fingerprinting

skills

Technical Skills

Languages

html / css · 10yr

javascript · 10yr

react · 8yr

php · 8yr

python · 6yr

typescript · 6yr

java · 6yr

node.js · 6yr

AI & LLM Systems

LangGraph · LangChain · LangSmith · CrewAI · LlamaIndex · PydanticAI · DSPy · OpenAI API · Anthropic API · Prompt Engineering · RAG · Behavioral Evaluation · Guardrails & Output Control · LLM Observability · Agentic Workflow Design · Weights & Biases

Frameworks & Libraries

Next.js · React Native · FastAPI · Material UI · Tailwind CSS · Expo · Redux · Jotai · React Query

Infrastructure & Cloud

Google Cloud Run · Cloudflare Workers · Cloudflare Pages · Cloudflare Vectorize · Azure OpenAI Service · Azure AI Search · Azure DevOps · AWS (EC2, Lambda, S3) · Docker · Linux & Bash

Data & Backend

PostgreSQL · MongoDB · Supabase · Redis · SQL Server · MySQL · REST APIs · WebSockets

Testing & Quality

Jest · Test-Driven Development · Integration Testing · Prompt Regression Testing · Behavioral Consistency Checks · Multi-run Variance Analysis

Accessibility

WCAG 2.1 · ADA Remediation · Semantic HTML · Screen Reader Testing · Color Contrast Audits

Enterprise Tooling

Power Automate · GitHub Copilot Governance · Microsoft 365 · Git · Postman · Swagger / OpenAPI

research

Doctoral Research

// dissertation

Evaluating the Security of AI-Generated Code: A Quantitative Study Using a Custom Scoring Framework

National University · Doctoral Candidate · In Progress

Designs and validates a reproducible hybrid vulnerability scoring framework to detect and measure security risks in AI-generated code before deployment. Addresses a validated gap in the literature: no systematic evaluation framework existed for assessing AI-generated code security across diverse programming tasks and contexts.

framework layers

// 01

OWASP Top 10 Classification

Classifies vulnerabilities using the OWASP Top 10 taxonomy of critical application security risks

// 02

AI-Specific Vulnerability Pattern Layer

Identifies vulnerability characteristics arising from the probabilistic and training-data-dependent nature of LLM code generation

// 03

CVSS v3.1 Severity Quantification

Scores vulnerability severity using CVSS metrics modified to reflect elevated risk characteristics of AI-generated outputs
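
For readers unfamiliar with the third layer's starting point, here is a small sketch of the standard CVSS v3.1 base score arithmetic for scope-unchanged vectors, using the metric weights from the public specification. It does not model the dissertation's AI-specific modifications, which are not described on this page, and the function and variable names are illustrative only.

```python
import math

# CVSS v3.1 base metric weights for scope-unchanged vectors.
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "CIA": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value: float) -> float:
    """Round up to one decimal place, following the v3.1 spec's integer method."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(av: str, ac: str, pr: str, ui: str, c: str, i: str, a: str) -> float:
    """Standard base score for a scope-unchanged (S:U) vector."""
    cia = WEIGHTS["CIA"]
    iss = 1 - (1 - cia[c]) * (1 - cia[i]) * (1 - cia[a])
    impact = 6.42 * iss
    exploitability = (
        8.22 * WEIGHTS["AV"][av] * WEIGHTS["AC"][ac]
        * WEIGHTS["PR"][pr] * WEIGHTS["UI"][ui]
    )
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))
```

For example, `base_score("N", "L", "N", "N", "H", "H", "H")` corresponds to the vector AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H and evaluates to 9.8.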

writings

Published Work

// book

The Comfortable Apocalypse

When Survival Isn't the Problem — Irrelevance Is

Forthcoming · Nonfiction

The central risk of the AI age is not domination or rebellion, but displacement. As automation removes friction from daily life, it quietly erodes the cognitive and emotional capacities that effort once built — memory, judgment, curiosity, creativity, identity, and agency. The danger is not hostile AI, but a world where thinking becomes optional and human participation fades without resistance.

Medium · Applied AI Essays

Why Most AI Failures Aren't Model Failures — They're Integration Failures

Why production breaks assumptions, not models.

2026-01

Security Reviews Don't Catch AI Failures. Here's Why.

The review passed. The system failed. Those two things can both be true.

2026-02

How to Actually Implement AI Agents in the Real World

What the tutorials skip and production demands.

2026-03

education

Academic Background

Doctor of Philosophy

Computer Science

National University · 2022 – 2026 · Doctoral Candidate · In Progress

AI safety and behavioral reliability
Security risks in AI-generated code
Hybrid vulnerability scoring framework combining OWASP, CVSS, and AI-specific pattern detection

Master of Science

Information Technology

Strayer University · 2020 – 2022

IT management and information security management
System design and architecture

Bachelor of Science

Information Technology · Major · Software Development

Strayer University · 2017 – 2020