Your first job with retries
You’ve got a job round-tripping. Now make it fail, watch ChasquiMQ retry it, and learn how to short-circuit retries when the job is unsalvageable.
What you’ll build
A queue that processes welcome emails. The handler fails on the first two attempts and succeeds on the third. Then a poison-pill payload triggers UnrecoverableError and goes straight to the DLQ.
1. Set up retries on the worker
Pass attempts and backoff either at queue level (default for every add) or per-job. We’ll use per-job overrides here so the example is self-contained.
```ts
import { Queue, Worker, UnrecoverableError } from "chasquimq";

const connection = { host: "127.0.0.1", port: 6379 };
const queue = new Queue("welcome", { connection });

let calls = 0;

const worker = new Worker(
  "welcome",
  async (job) => {
    calls += 1;
    console.log(`call #${calls} for job ${job.id} (attempt ${job.attemptsMade})`);

    // Poison-pill payload: skip retries and go straight to the DLQ.
    if (job.data.to === "poison@example.com") {
      throw new UnrecoverableError("blocked address");
    }
    // Fail the first two deliveries, succeed on the third.
    if (calls < 3) {
      throw new Error("transient SMTP failure");
    }
    return { delivered: true };
  },
  { connection },
);

await queue.add(
  "welcome",
  { to: "ada@example.com" },
  { attempts: 5, backoff: { type: "exponential", delay: 100 } },
);

await queue.add(
  "welcome",
  { to: "poison@example.com" },
  { attempts: 5 },
);
```

```python
import asyncio

from chasquimq import Queue, Worker, BackoffSpec, UnrecoverableError

calls = 0


async def handler(job):
    global calls
    calls += 1
    print(f"call #{calls} for job {job.id} (attempt {job.attempt})")

    # Poison-pill payload: skip retries and go straight to the DLQ.
    if job.data["to"] == "poison@example.com":
        raise UnrecoverableError("blocked address")
    # Fail the first two deliveries, succeed on the third.
    if calls < 3:
        raise RuntimeError("transient SMTP failure")
    return {"delivered": True}


async def main():
    queue = Queue("welcome")
    worker = Worker("welcome", handler)
    asyncio.create_task(worker.run())

    await queue.add(
        "welcome",
        {"to": "ada@example.com"},
        attempts=5,
        backoff=BackoffSpec.exponential(initial_ms=100),
    )
    await queue.add(
        "welcome",
        {"to": "poison@example.com"},
        attempts=5,
    )

    # Give the worker time to process both jobs before shutting down.
    await asyncio.sleep(2.0)
    await worker.close()
    await queue.close()


asyncio.run(main())
```
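Both snippets set the retry policy per add call. As noted above, the same attempts and backoff can also be set once at queue level so every add inherits them. Here is a hedged sketch of what that could look like; the `defaultJobOptions` field is an assumption borrowed from the BullMQ convention, not a confirmed ChasquiMQ option, so check the Queue reference for the real name:

```ts
import { Queue } from "chasquimq";

const connection = { host: "127.0.0.1", port: 6379 };

// Hypothetical queue-level defaults, assuming a BullMQ-style `defaultJobOptions`
// field on the Queue options. Verify the actual option name in ChasquiMQ's
// Queue reference before relying on this.
const queue = new Queue("welcome", {
  connection,
  defaultJobOptions: {
    attempts: 5,
    backoff: { type: "exponential", delay: 100 },
  },
});

// With defaults in place, plain adds inherit the retry policy.
await queue.add("welcome", { to: "ada@example.com" });
```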
2. Run it

You should see something like:
```
call #1 for job 01HV... (attempt 1)
call #2 for job 01HV... (attempt 2)
call #3 for job 01HV... (attempt 3)
call #4 for job 01HW... (attempt 1)
```

Three calls for the first job — two failures and a success. One call for the poison job — the `UnrecoverableError` skipped retries entirely. Job IDs are ULIDs (timestamp-prefixed, sortable).
3. What ChasquiMQ did
When your handler returned Err / threw / rejected:
- The worker re-encoded the job with `attempt += 1`.
- A single Lua script atomically `XACKDEL`’d the original stream entry and `ZADD`’d the new copy onto the delayed sorted set with a fire-time computed from the backoff (see the sketch below).
- The promoter (embedded in the consumer) moved the entry back into the stream when its delay was up.
- Your handler ran again with `job.attemptsMade` (Node) / `job.attempt` (Python) incremented.
When the poison job threw `UnrecoverableError`:

- The engine bypassed the retry budget and routed the entry directly to the DLQ stream (`{chasqui:welcome}:dlq`) with `DlqReason::Unrecoverable`.
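To make the scheduling concrete, here is a small sketch of the backoff arithmetic only. The atomic `XACKDEL`/`ZADD` work happens inside the engine’s Lua script, and the `computeFireTime` helper and its doubling curve below are invented for illustration — the engine’s exact growth and jitter behaviour may differ:

```ts
// Illustrative only: a fire time a retry could be scheduled for, given the
// backoff spec used in the example ({ type: "exponential", delay: 100 }).
type Backoff = { type: "fixed" | "exponential"; delay: number };

function computeFireTime(backoff: Backoff, attempt: number, nowMs: number): number {
  const wait =
    backoff.type === "exponential"
      ? backoff.delay * 2 ** (attempt - 1) // 100ms, 200ms, 400ms, ...
      : backoff.delay;
  // The result would be the score used when the copy lands on the delayed sorted set.
  return nowMs + wait;
}

// After the first failure (attempt 1): retry roughly 100ms later.
console.log(computeFireTime({ type: "exponential", delay: 100 }, 1, Date.now()));
```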
4. Inspect the DLQ
```sh
chasqui dlq peek welcome
```

You’ll see the poison job with its `reason: unrecoverable` and the original payload. Once you’ve fixed the bug, put the job back into the main stream:
```sh
chasqui dlq replay welcome --limit 50
```

Replayed jobs get a fresh retry budget — `attempt` resets to zero before the re-`XADD`.
Things to know
- Attempt count is 1-indexed. The first delivery is attempt 1.
- `UnrecoverableError` is a name match. Any error whose `name === "UnrecoverableError"` (Node) or whose class name is `UnrecoverableError` (Python) maps to `HandlerError::unrecoverable(...)` on the Rust side. Subclassing works (see the sketch after this list).
- Panics also go to the DLQ. A handler that throws an uncaught exception does not retry — it routes to the DLQ with `DlqReason::Panic`. Treat panics as code bugs, not transient failures.
- CLAIM is the safety net. If a worker crashes mid-handler before the retry path runs, the engine’s idle-pending claim path re-delivers the entry on the next read. You don’t need to handle that case yourself.
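Because the match is on the error name, a domain-specific subclass can still short-circuit retries. A minimal sketch — `BlockedRecipientError` is made up here, and pinning `name` is only a defensive choice in case something in your error hierarchy or build tooling would otherwise rename it:

```ts
import { UnrecoverableError } from "chasquimq";

// Hypothetical subclass for illustration: per the note above, subclassing
// works, and the engine keys off the "UnrecoverableError" name.
class BlockedRecipientError extends UnrecoverableError {
  constructor(address: string) {
    super(`recipient is on the block list: ${address}`);
    // Defensive: keep the name the engine matches on.
    this.name = "UnrecoverableError";
  }
}

// In the handler:
//   if (isBlocked(job.data.to)) throw new BlockedRecipientError(job.data.to);
```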
Next steps
- Delayed and repeatable jobs — `addIn`, cron specs, `MissedFiresPolicy`.
- Configure retries — backoff types, jitter, queue-wide vs per-job.
- Route to DLQ — every `DlqReason` and when each fires.
- For the underlying mechanics: Retry and backoff.