AI Engineer

Engineering · Hybrid (IL/US hours) · Full-time

About the role

We're looking for an AI-native engineer to build the autonomous agents that power Yolk SalesOS. You'll work directly under the engineering head and the founders, and you'll be the one bringing the agents to life. The architecture gets set above you; the agents get built by you. Do this well and you grow fast.

We're now bringing in the engineer who will turn agent designs into shipped, reliable systems — the discovery prompts, the objection-handling logic, the orchestration that decides what the coach says and when.

This role sits close to the heart of the product. The person who gets this right will write code that real reps lean on during real deals.

A word of honesty

This is an early-team role at a fast-moving company, which means it is hands-on and unglamorous in the way that matters. You will write the agent code, not just diagram it. You will build the eval harness yourself. You will chase down why a coaching response was wrong at 9pm because a customer demo is tomorrow. You will work from a product requirement, figure out the right agent design, and live with the calls you make. The senior team sets the architecture, but inside it you own the implementation end to end — and "it works on my machine" is not the bar. If you want a research seat where you read papers and prototype in notebooks, this is the wrong role. If you want to ship autonomous agents into production and watch them earn their keep on live sales calls, read on.

What you'll do

Build and iterate on the agents themselves, from design to production: coaching reps, analyzing deals, surfacing insight in real time.
Build tool-use and integrations that let agents reach CRMs, transcripts, calendars, and internal data (Gong, Fireflies, Stripe, Apollo, Resend).
Build prompt pipelines for structured outputs, classification, summarization, and coaching feedback — and the eval harnesses that prove they're good.
Write orchestration code that manages agent state, retries, fallbacks, and human-in-the-loop checkpoints — including the failure paths most people skip.
Own the database operations, FastAPI endpoints, and background workers that the agent workflows run on.

What we're looking for

A bachelor's in Computer Science or equivalent, plus at least 3 years building software in production — strong fundamentals, proven track record.
Obsessed with agents and LLMs. You've built something real with OpenAI, Anthropic, or open models — shipped at work or on your own — not just read about them.
You think in planning loops, tool selection, memory, and reliability, not prompt-in / text-out.
Strong Python, comfortable in async/await, and you reason about error handling, edge cases, and what happens when a model call fails.
AI-native in how you work: Cursor, Claude Code, Copilot daily — and you understand every line you ship, not just the ones you typed.
A fast learner who dives into unfamiliar code, reads the source, and asks sharp questions instead of waiting to be unblocked.
You take an architecture you didn't design, extend it, and stress-test it — comfortable implementing patterns rather than setting them, for now.

Nice to have

You've shipped multi-step LLM pipelines or RAG before.
You know FastAPI, async Python, PostgreSQL and SQLAlchemy, or message queues.
You've worked in sales tech or RevOps.
You've contributed to an open-source AI or agent project.

What success looks like — first weeks

Ship your first agents inside the established patterns and architecture, and have them hold up on real workflows, not just the happy path.
Build a tool integration and a prompt pipeline end-to-end, with the eval harness to show they work.
Learn the codebase, the agent framework, and enough of the sales domain to design the right thing without hand-holding.

What success looks like — first 6–12 months

Own major agent capabilities in SalesOS — designed with the senior team, built and stress-tested by you.
Strengthen the evaluation and reliability story: accuracy, latency, cost, and the guardrails that keep outputs on-brand.
Take end-to-end ownership of agent features, and start contributing to the architectural decisions you used to just implement.

Why this role matters

In a category built on trust, the agent code is the product. Whether the coach says the right thing at the right moment, whether a session holds, whether reps act on what it tells them — that's what earns the trust everything else depends on. As an early-team engineer you build the agents Yolk runs on, at the moment they matter most. This is a rare chance to own the part of the product customers actually feel.

How we work

AI-accelerated development with Cursor and Claude Code. Pragmatic architecture — simple endpoints for simple things, sophisticated agent logic where it counts. Integration-first testing that exercises real agent workflows. Modern Python tooling: uv, ruff, strict type checking. We ship weekly and iterate on feedback, and you'll work closely with senior engineers who will help you grow.

Tech stack: Python 3.12+, FastAPI, SQLAlchemy 2.0 (async), PostgreSQL, RabbitMQ + FastStream, WebSockets, Redis, Kubernetes (EKS), Helm, Docker, AWS, OpenAI / Anthropic APIs, LangChain and custom agent frameworks, Pipecat for real-time audio, OpenTelemetry, Prometheus, Sentry.

What we offer

A seat building the technical heart of a VC-backed startup (investors in Anthropic, Groq, Tensornet), working directly with the engineering head and founders. Genuinely hard problems — autonomous agents shipping to real enterprise customers — and a product that's already built, not a blank page. Real mentorship, a clear growth path as the team and platform scale, and direct impact on features that go to clients. Competitive compensation, equity, and benefits.

About Yolk

A sales rep tracks nine things at once on a live call. Human working memory holds four. Every AI sales tool tries to add more dashboards on top. We subtract — Yolk's AI coach works inside the live call, surfacing the one right thing to say at the moment it matters, then turns what broke down on the call into targeted AI roleplay practice afterward.

The facts: launched end of May 2026. Hundreds of salespeople coached in the first weeks, across several live pilots and paying clients. Thousands of real sales calls already analyzed. Backed by investors in Anthropic and Groq. SOC 2 compliant, 4 patents in core AI methods.

Ready to build with us?

Apply in a few minutes. We read every application and reply to all of them.


