Replacing cron jobs with a centralized task scheduler

I find the best comments here to be ones where people use their knowledge and experience to discuss the relative strengths and weaknesses of the technology in the post. I see a bunch of short single-sentence comments here that add no value.

For my part, I see this pattern repeatedly at different places. The raw tools in the platforms are too codey and the third-party frameworks like Temporal seem overkill, so you build a scheduler and need to solve the problems OP did: only run once, know if it errored, etc.

But it's amazing how "it's firing off a basic action!" becomes a script, then becomes a script composed of reusable actions that can pick up where they left off in case of errors ... Over time your "it's just enough for us!" feature creeps towards the framework's functionality.

I'd be curious to know how long the OP's solution stays simple before it submits to the feature creep demands. (Long may complexity be fought off, though! Every day you can live without the complexity of full workflows is a blessing)

I see that the author took a heuristic approach to retrying tasks (each task has a predetermined expected duration and is considered failed if it wasn't updated in time) and uses SQS. If the solution is homemade anyway, I can only recommend leveraging your database's transactionality for this, a common pattern I have often seen recommended and have also used successfully myself:

- At processing start, update the schedule entry to 'executing', then open a new transaction and lock it, while skipping already locked tasks (`SELECT ... FOR UPDATE SKIP LOCKED`).

- At the end of processing, set it to 'COMPLETED' and commit. This also releases the lock.

This has the following nice characteristics:

- You can have parallel processors polling tasks directly from the database without another queueing mechanism like SQS, and have no risk of them picking the same task.

- If you find an unlocked task in 'executing', you know the processor died for sure. No heuristic needed
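The two steps above could be sketched roughly as follows. This is only an illustration under assumptions: a Postgres table named `scheduled_tasks` with `id`, `payload`, `status` and `run_at` columns (all hypothetical names), and a DB-API connection in its default non-autocommit mode with psycopg-style `%s` placeholders.

```python
# Hypothetical schema: scheduled_tasks(id, payload, status, run_at).
# Unlocked 'pending' rows are new work; unlocked 'executing' rows belong
# to a processor that died, so both are eligible to be picked.
PICK_SQL = """
    SELECT id, payload FROM scheduled_tasks
     WHERE status IN ('pending', 'executing') AND run_at <= now()
     ORDER BY run_at
     LIMIT 1
       FOR UPDATE SKIP LOCKED
"""
MARK_EXECUTING_SQL = "UPDATE scheduled_tasks SET status = 'executing' WHERE id = %s"
LOCK_SQL = """
    SELECT id FROM scheduled_tasks
     WHERE id = %s AND status = 'executing'
       FOR UPDATE SKIP LOCKED
"""
MARK_COMPLETED_SQL = "UPDATE scheduled_tasks SET status = 'COMPLETED' WHERE id = %s"


def run_one(conn, handler):
    """Process at most one task; return True if a task was completed."""
    cur = conn.cursor()

    # Short transaction: pick an unlocked task and publish 'executing'.
    cur.execute(PICK_SQL)
    row = cur.fetchone()
    if row is None:
        conn.rollback()
        return False
    task_id, payload = row
    cur.execute(MARK_EXECUTING_SQL, (task_id,))
    conn.commit()

    # Long transaction: hold the row lock for the whole run.
    cur.execute(LOCK_SQL, (task_id,))
    if cur.fetchone() is None:
        conn.rollback()  # another processor locked the row first
        return False
    handler(payload)
    cur.execute(MARK_COMPLETED_SQL, (task_id,))
    conn.commit()  # committing also releases the row lock
    return True
```

If `handler` raises, the caller is expected to roll back (or the connection simply dies with the process). Either way the row is left in 'executing' with no lock held, so the next poll picks it up again. The price is a transaction that stays open for as long as the task runs.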

Jobs that need retries, atomicity, monitoring, rescheduling, ad hoc scheduling, and flexibility probably aren't suited to most cron servers.

Beanstalkd, cronicle, agenda, sidekiq, faktory, celery, etc. are the usual suspects.

What is often missing is HA of the controller service process.

Is there a cool lightweight alternative to cron for (at least) a single host?

To illustrate what I am looking for, I often end up using supervisord [0] (but I also like immortal [1]) for process control when not on a systemd enabled system. In my experience they are reliable, lightweight and a pleasure to work with.

I am looking for something similar for scheduled jobs.

- [0] https://supervisord.org/

- [1] https://immortal.run/

Aka workflow orchestrator, pipeline manager, process runner, automation tool.

It's not clear if they used a product or DIY solution. The nice thing many existing products offer is a web UI and a database.

On my current team we run a centralized task scheduler, used by other products in our company, that manages on the order of 30M schedules. To that end, it's a home-grown distributed system built on top of Postgres and Cassandra, with a whole control plane and data plane. It's been pretty fun to work on.

There are two main differences between our system and the one in the post:

- In our scheduler, the actual cron (aka recurrence rule) is stored along with the task information. That is, you specify a period (like "every 5 minutes" or "every second Tuesday at 2am") and the task will run according to that schedule. We try to support most of the RRule specification. [1] If you want a task to just run one time in the future, you can totally do that too, but that's not our most common use case internally.

- Our scheduler doesn't perform a wide variety of tasks. To maximize flexibility and system throughput, it does just one thing: when a schedule is "due", it puts a message onto a queue. (Internally we have two queueing systems it interops with: an older one built on top of Redis, and a newer one built on PG + S3.) Other teams consume from those queues and do the real work (sending emails, generating reports, etc). The queueing systems offer a number of delivery options (delayed messages, TTLs, retries, dead-letter queues) so the scheduling system doesn't have to handle them.
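To make those two points concrete, here is a minimal sketch of a scheduler that stores the recurrence with the task and does nothing but enqueue a message when a schedule is due. It is not the system described above: a fixed `interval` stands in for a full RRule, and `Schedule`, `tick` and the message shape are names made up for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, Iterable


@dataclass
class Schedule:
    task_id: str
    interval: timedelta   # stand-in for a stored recurrence rule
    next_run: datetime


def tick(schedules: Iterable[Schedule], now: datetime,
         enqueue: Callable[[dict], None]) -> None:
    """Enqueue one message per due schedule, then advance that schedule."""
    for s in schedules:
        if s.next_run > now:
            continue
        enqueue({"task_id": s.task_id, "scheduled_for": s.next_run.isoformat()})
        # Skip occurrences that were missed entirely instead of replaying them.
        while s.next_run <= now:
            s.next_run += s.interval
```

Passing `enqueue` in as a callable keeps the scheduler ignorant of what the consumers do with the message, which is the separation the second bullet describes.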

Ironically, because supporting a high throughput of scheduled jobs has been our biggest priority, visibility into individual task executions is a bit limited in our system today. For example, our API doesn't expose data about when a schedule last ran, but it's something on our longer term roadmap.

[1] https://icalendar.org/iCalendar-RFC-5545/3-8-5-3-recurrence-...

I love this solution, I've implemented a very similar task scheduler at many companies.

I do think the best solution for this is still RabbitMQ. With its "Delayed Messages" feature you can push a task onto the queue and tell it to run at a very specific time, and it then just processes the task at that time.

I think BMW used to use a paid product named Control-M to handle this (from BMC, still exists).

It contained what people quickly need to reach for:

- schedule a job in UTC or local time zone for a particular place;

- schedule a job but only if another job ran beforehand;

- semaphore-like resource limits on jobs.

It did this with job generating resource tokens and other jobs stating a token as a condition for being scheduled.

It ended up being a not so nice system to debug to be honest, but worked fine.

For a simple job, I’d reach for systemd timers on a single machine, a kubernetes cronjob on a given platform, or something external altogether otherwise (for geo-distributed scheduled jobs).

The Windows Task Scheduler is actually very nice and powerful. One cool trick is to have a task triggered by a windows event.

Great work. Did you consider buying instead of building? I’ve worked at organizations that built similar systems, but what was often lacking was developer experience, observability, and scalability, basically everything outside of core functionality; essentially the stuff that you're trying to tack on as you improve your system.

Now that I'm building on my own, I’ve thought about building as well, but I’ve found that off-the-shelf systems handle all of this far better (and they are open source too), e.g. trigger-dot-dev and many others.

> We had createScheduledPosts.ts that would run every 15 minutes, scan our table of scheduled posts and create any that needed to be published.

Why not set the publication_date when you create a post and have a function getPublishedPosts that fetches a list of posts, filtering out those with a publication_date later than the current date? With this approach, you don't need cron jobs at all.
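A rough sketch of that read-time filter, assuming a `posts` table with `id`, `title` and an ISO 8601 `publication_date` column (the schema is hypothetical, and SQLite is used only so the example is self-contained):

```python
import sqlite3


def get_published_posts(conn: sqlite3.Connection, now: str) -> list:
    """Return (id, title) for posts whose publication_date has arrived."""
    # ISO 8601 timestamps in the same timezone compare correctly as strings.
    return conn.execute(
        "SELECT id, title FROM posts "
        "WHERE publication_date <= ? "
        "ORDER BY publication_date DESC",
        (now,),
    ).fetchall()
```

A scheduled post simply becomes visible once the clock passes its date. This covers visibility only, not side effects that must happen at publication time.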

Why use a 1 minute cron job to run the tasks, instead of a continuously-running queue worker (or several)?

One gotcha with a roll-your-own task scheduler is running it across multiple machines. If you need 5 machines running different scheduled tasks, you need a locking mechanism to ensure only one machine is processing a given task. In the author’s approach this is handled by the queue, but in my read the scheduler itself can only run on one machine or you get multiple copies of the same task in the queue. Retries can get more complicated: depending on the failure, you may want exponential backoff, retrying N times and waiting longer between attempts. A nice dashboard to see the status of everything is helpful also.
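For the retry part, a minimal sketch of "retry N times with exponential backoff" might look like this. The function name and default values are arbitrary choices for illustration, not taken from the article.

```python
import time


def run_with_retries(task, max_attempts=5, base_delay=1.0, max_delay=60.0,
                     sleep=time.sleep):
    """Call task() until it succeeds, doubling the wait after each failure."""
    for attempt in range(max_attempts):
        try:
            return task()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # Waits are 1s, 2s, 4s, ... capped at max_delay.
            sleep(min(max_delay, base_delay * 2 ** attempt))
```

The `sleep` parameter exists only so the waits can be replaced in tests. A real scheduler would more likely persist the next attempt time than block a worker, and would probably add jitter.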

In .NET world I use Hangfire for this. In Node (I assume what this is) I tinkered with Bull, but not sure what best in class is there.

Unmeshed.io is a newer startup in the space, and it works like a charm. Temporal seems more targeted at durable execution, but scheduling is a different game. It starts with crons, but soon you have to deal with holidays, ad hoc skips, holds and more, especially during maintenance and upgrades.

Unmeshed has all of these, managing holiday calendars etc and makes it super easy. It even has agents for AS400 server commands if that is still a thing you need.

What happens when the DB gets large? How do you handle idempotency? (What if SQS delivers twice?) The cron job is still a single point of failure...

Isn't a "centralized task scheduler" pretty much what cron is?

I find Rundeck is great for this. I have been using it with hundreds of jobs for a decade, with a bunch of users accessing it and checking logs, and with retries, notifications and all the enterprise thingies for free. It also provides an easy way to put a GUI on scripts.

If they are using AWS, why not use what AWS already has, battle tested for task scheduling functions?

I looked around years ago and found Rundeck to be a good system for scheduled tasks.

HTCondor is always an option. Lacks shiny tinfoil, but works like a tank.

Temporal.io is made for this

Next thing you know you'll have systemd.

Maybe I'm just lucky to work at a place with good tools, but in my experience Temporal isn't super heavyweight to use compared to building your own even-very-simple scheduler.

And it's worth it because now you have Temporal, which is the bees knees as far as I'm concerned. I will gladly sing praises of any tool that saves me getting paged, and Temporal has that in spades.

Cloud companies also provide globe-scale cronjobs that work a lot like a Unix cronjob. Arguably less mental overhead than adopting a separate framework.

And such a service provides reliability guarantees.

If I have to do a reliable periodic service, my go-to is a kubernetes cronjob, which is like a baby version of a cloud cronjob. I'd be reluctant to adopt some sort of task queue framework because of the complexity of the mental model plus the complexity of keeping one more thing running reliably. K8s is already running reliably, I might as well use that.

The pragmatic answer is Jenkins. Always has been.

- At processing start, update the schedule entry to 'executing', then open a new transaction and lock it, while skipping already locked tasks (`SELECT ... FOR UPDATE SKIP LOCKED`).

- At the end of processing, set it to 'COMPLETED' and commit. This also releases the lock.

This has the following nice characteristics:

- You can have parallel processors polling tasks directly from the database without another queueing mechanism like SQS, and have no risk of them picking the same task.

- If you find an unlocked task in 'executing', you know the processor died for sure. No heuristic needed

This introduces long-running transactions, which at least in Postgres should be avoided.

Don't have to keep transaction open. What I do is:

1. Select next job

2. Update status to executing where jobId = thatJob and status is pending

3. If previous affected 0 rows, you didn't get the job, go back to select next job

If "time to select" <<< "time to do", this works great. But if the two are closer together, you can see how this will mostly produce contention, and you shouldn't do it.
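The three steps above can be sketched as a conditional update whose affected-row count tells you whether you won the job. The `jobs` table and its columns are assumptions, and SQLite is used here only to keep the example self-contained; the same idea applies to any database that reports affected rows.

```python
import sqlite3


def claim_next_job(conn: sqlite3.Connection):
    """Return the id of a job this worker now owns, or None if none are pending."""
    while True:
        # 1. Select the next candidate without holding any lock.
        row = conn.execute(
            "SELECT id FROM jobs WHERE status = 'pending' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        # 2. Claim it only if it is still pending.
        cur = conn.execute(
            "UPDATE jobs SET status = 'executing' "
            "WHERE id = ? AND status = 'pending'",
            (row[0],),
        )
        conn.commit()
        # 3. Zero affected rows means another worker got there first.
        if cur.rowcount == 1:
            return row[0]
```

No transaction stays open while the job runs, which is why a worker that dies after claiming leaves the row stuck in 'executing' until something else notices.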

This is exactly what we're doing. Works like a charm.

Jobs that need retries, atomicity, monitoring, rescheduling, ad hoc scheduling, and flexibility probably aren't suited to most cron servers.

Beanstalkd, cronicle, agenda, sidekiq, faktory, celery, etc. are the usual suspects.

What is often missing is HA of the controller service process.

Cronicle is a lifesaver. HA, clustering, API, clean UI, it's doing everything right. I'm also using it as an API wrapper for Bash and Python scripts.

https://github.com/jhuckaby/Cronicle/blob/master/docs/Setup....

I'd probably even add systemd timers to that list. It does most of what you list, minus the retries (but I think you could handle that in the service definition)

Is there a cool lightweight alternative to cron for (at least) a single host?

I am looking for something similar for scheduled jobs.

- [0] https://supervisord.org/

- [1] https://immortal.run/

Supercronic: https://github.com/aptible/supercronic

Designed to run in a container, but should equally well work on a single host. However, no option for "high availability" running, where multiple hosts coordinate.

Take a look at this comment for some options: https://news.ycombinator.com/item?id=44752548

> We had createScheduledPosts.ts that would run every 15 minutes, scan our table of scheduled posts and create any that needed to be published.

Maybe there's a bunch of other actions that need to take place when a post is published, such as sending notification emails, or posting stuff to social media. They could of course be scheduled jobs in their own right, but you haven't really saved yourself any effort there, and now if the publishing time changes you've got to reschedule all those individual jobs.

Why use a 1 minute cron job to run the tasks, instead of a continuously-running queue worker (or several)?

It's folk wisdom, generated by a long line of people who did not have proper dæmon management despite such tooling having been available since the 1990s. Any sort of service management, from running things once at bootstrap to having a long-running service, becomes hammered into the shape of a cron job.

There are loads of people over the years who have reached for cron instead of reaching for proper general-purpose dæmon management (SRC, SMF, daemontools, runit, daemontools-encore, perp, s6, ...). It is on Stack Exchange answers and in people's personal "How I did this" articles on WWW sites. (Although the idea goes back to the Usenet era.) It became one of those practices perpetuated because other people did it.

The next step is always discovering that cron's error handling and logging are aimed at an era when the system operator sat in the console room, and received "You have new mail" notifications at the console shell prompt.

And the step after that is (re-)discovering that the anacron approach does not fully cut the mustard. (-:

Single scripts are easier to code and can be written more loosely, as you don't have to look out for sneaky memory leaks and other problems that might emerge in long-running tasks. There is also no need to build and maintain a bespoke framework for managing your multiple jobs. This avoids mental debt for the devs. If you have many jobs, from multiple devs, it's the more pragmatic solution.

Back in the day, the reason I had 1-minute cron jobs (with flock of course) was because "what if the bespoke daemon gets killed somehow?" We also used screen/tmux a lot, but only for stuff that could afford to wait until somebody poked it (often, because if it repeatedly crashed the cause was likely novel and would need investigation).

Systemd has been a game-changer for small-scale deployments.

In .NET world I use Hangfire for this. In Node (I assume what this is) I tinkered with Bull, but not sure what best in class is there.

Oban enters the chat… :)

Isn't a "centralized task scheduler" pretty much what cron is?

It’s not even a centralized task scheduler on its native UNIX: it’s a centralized *userspace* task scheduler.

Mainframe and minicomputer operating systems support scheduling in the operating system itself, as part of their process/thread scheduler; their native queuing systems are built on top of the primitives their scheduler offers, for proper accounting and maximum resource utilization (including prioritization).

Only UNIX would just provide a way to run processes at a specified time or interval and call the job done.

I was going to guess the author needed something that unified the task scheduling across a distributed system of computers. But that requirement is never mentioned in the article. And they still use cron to call their new scheduler... So unless I am missing something, they did not replace cron at all; they just rewrote their scheduled jobs to use a common library and have more robust error handling.

It's lacking a convenient way to queue a task and inspect the task queue, but "at" (at/atq/atrm) provides exactly the "single cron job responsible for executing scheduled tasks that runs once every minute" that the author was looking for.

centralized for many computers.

What happens when the DB gets large? How do you handle idempotency? (What if SQS delivers twice?) The cron job is still a single point of failure...

Managing complex scheduled workflows at scale comes with a lot of nuances. This is exactly why we're building DBOS (shameless plug! https://github.com/dbos-inc), which provides durable cron jobs and exactly-once workflow triggering. Since it's just a library on top of Postgres, it doesn't require a centralized scheduler (well, think of Postgres as the coordinator).

One challenge is to guarantee exactly-once processing across software upgrades. DBOS uses the cron-scheduled time as an idempotency key, and tags each workflow execution with a version. We also use the database transactions to guard against conflicting concurrent updates.
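The idea of using the cron-scheduled time as an idempotency key can be illustrated with a uniqueness constraint. This is not DBOS code, just a sketch of the general technique: the table, column and function names are invented, and SQLite stands in for Postgres to keep it self-contained.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS workflow_runs (
    workflow_name TEXT NOT NULL,
    scheduled_for TEXT NOT NULL,  -- the cron-scheduled time, not "now"
    PRIMARY KEY (workflow_name, scheduled_for)
)
"""


def try_trigger(conn: sqlite3.Connection, workflow_name: str,
                scheduled_for: str) -> bool:
    """Record a run for this slot; True only for the first caller."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO workflow_runs (workflow_name, scheduled_for) "
        "VALUES (?, ?)",
        (workflow_name, scheduled_for),
    )
    conn.commit()
    # A duplicate key is silently ignored and reports zero inserted rows.
    return cur.rowcount == 1
```

Because the key is the scheduled slot rather than the wall-clock time of the attempt, a duplicate delivery or a second scheduler instance firing late for the same slot is rejected by the database.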

If they are using AWS, why not use what AWS already has, battle tested for task scheduling functions?

I've built something similar as a service to be used by developers at a large-ish enterprise. Granted, it was based on functionality offered by AWS, but the users didn't really know that.

The reason we built it, despite the fact that developers could very well have deployed a CloudWatch EventBridge schedule + SQS + lambda or similar, is because they never did. They would consistently choose to build it into their existing services, which were rarely if ever handling things like limiting concurrency if a task took too long, emitting metrics on success/failure/duration, audit logging for when a task had to be manually triggered for some reason. If I had to guess, I think the reason was because it allowed them to piggyback on existing change controls and "just write application code" instead of having to think about additional pieces of infrastructure.

If I could do it again, I would probably have reached for something like Temporal, even though it seemed overkill for what we initially set out to do. It took about a week before people started asking for locking and retries.

So that they can drop AWS

Temporal.io is made for this

Paying $500 a month for cron just seems wrong.

And adds an external dependency for something very essential.

Unmeshed.io is another alternative. You don’t even need to write code for your schedules

Systemd has been a game-changer for small-scale deployments.

Which is kind of ironic, given that systemd basically brings into Linux system services management from other UNIXes, Windows, mainframes and micros, but still gets plenty of hate.

> Systemd has been a game-changer for small-scale deployments.

Why is this? My only memory of systemd was slightly better configurations for sequencing the start of processes that depended on the completion of earlier processes so I'm a bit rusty.

> Systemd has been a game-changer for small-scale deployments.

The deep integration into NixOS made me feel the same. You sound like you could enjoy a bit of Nix too.

It’s not even a centralized task scheduler on its native UNIX: it’s a centralized *userspace* task scheduler.

Only UNIX would just provide a way to run processes at a specified time or interval and call the job done.

Although you're right that Unix never really reached having the full three-level scheduling mechanisms of the mainframe operating systems, cron is not the actual Unix parallel of the high-level scheduler that keeps the running jobs list fed.

That is in fact batch (and atrun, although that's considered an implementation detail).

* https://pubs.opengroup.org/onlinepubs/9799919799/utilities/b...

Most implementations flesh out the "implementation-defined algorithms" stuff to be calculations based upon load averages, as on NetBSD.

* https://man.netbsd.org/batch.1

* https://man.netbsd.org/atrun.8

Or fairly primitive parallelism limits as on Illumos.

* https://illumos.org/man/1/batch

* https://illumos.org/man/5/queuedefs

Not quite JECL, is it? (-:

So that they can drop AWS

It is a bit hard when they rely on AWS message queues for the implementation.

Paying $500 a month for cron just seems wrong.

And adds an external dependency for something very essential.

You can run it yourself for free

Unmeshed.io is another alternative. You don’t even need to write code for your schedules

"Please don't post shallow dismissals, especially of other people's work. A good critical comment teaches us something."

https://news.ycombinator.com/newsguidelines.html

Maybe I'm just lucky to work at a place with good tools, but in my experience Temporal isn't super heavyweight to use compared to building your own even-very-simple scheduler.

And it's worth it because now you have Temporal, which is the bees knees as far as I'm concerned. I will gladly sing praises of any tool that saves me getting paged, and Temporal has that in spades.

Temporal is awful. Difficult to test, difficult to decouple from your domain code. At least that’s what I have seen in organizations. OP’s solution is rather understandable: with a couple of interfaces, you make the code easily testable.

second temporal. plus it gives you more freedom to write jobs in different languages... not that you would or should in most cases but there's definitely good reasons

The pragmatic answer is Jenkins. Always has been.

Jenkins is a place where you can be safe for a long time, however, it starts to break down at scale. I see it time after time for these batch workflow jobs. At the start, jobs run in seconds and everyone is happy.

Over time, jobs start taking long enough to the point where you need to split them. Separate jobs are assigned slices of the original batch. Eventually, there are so many slices that you make a Jenkins job where the sole responsibility is firing off these individual jobs.

Then you start hitting the real painpoints in Jenkins. Poor allocation of jobs across your nodes/agents, often overloading CPU/Mem on machines, and you struggle to manage the ungodly interface that is the Jenkins REST endpoint. You install many Jenkins addons to try and address the scheduling problems, and end up with a team dedicated to managing this Jenkins infrastructure.

The scaling struggles continue to amass and you end up needing separate Jenkins instances to battle the load. Any attempt at replacing the Jenkins infrastructure goes on standstill, as the amount of random scripts found in Jenkinsfiles has created an insurmountable vendor lock-in.

You read a post about a select-for-update job scheduler and reflect on simpler times. You cry as you refactor your Jenkins Groovy DSL.

Jenkins is terrible for just about everything. Cron has real problems but at least you can version control the crontab. Jenkins is fat, hard to work with since you'll just have one shared instance, and everything is buried in special objects hidden behind a very unergonomic and undiscoverable web GUI.

Ugh no. It was good enough for its time, but times have moved on.

The danger is that it's so easy to start with and it's decent for small and simple applications. Once your jobs start growing, both in number of contributors and in workload, the problems start. The DSL is difficult to debug, plugins are buggy, and the brittle master node becomes your most precious pet, needing constant supervision so it doesn't grind the whole system to a stop. By the time you realize this, you have a hard time getting out of the lock-in.

This introduces long-running transactions, which at least in Postgres should be avoided.

Depends what else you’re running on it; it’s a little expensive, but not prohibitively so.

I read too many "use Postgres as your queue (pgkitchensink is in beta)", now I'm learning listen/notify is a strain, and so are long transactions. Is there a happy medium?

t1: select for update where status=pending, set status=processing

t2: update, set status=completed|error

these are two independent, very short transactions? or am i misunderstanding something here?

edit:

i think i'm not seeing what the 'transaction at start of processor' logic is; i'm thinking more of a polling logic

    while true:
      r := select for update
      if r is None:
        return
      sleep a bit

this obviously has the drawback of knowing how long to sleep for; and tasks not getting "instantly" picked up, but eh, tradeoffs.

I'd probably even add systemd timers to that list. It does most of what you list, minus the retries (but I think you could handle that in the service definition)

systemd doesn't scale beyond one system or have high availability.

It is a bit hard when they rely on AWS message queues for the implementation.

If you're running on AWS and not designing a system that locks you in to the AWS platform, then you're going to be overpaying by a lot.

second temporal. plus it gives you more freedom to write jobs in different languages... not that you would or should in most cases but there's definitely good reasons

Don’t do it onprem unless you want to spend six figures monthly on cassandra database nodes for pretty shit performance and face constant saas upselling and then discover how hard it is to migrate off of.

Write your own scheduler.

Oracle is cheaper in the long run.

? You can (and should) version control your Jenkins config as well, including the pipeline codes.

You read a post about a select-for-update job scheduler and reflect on simpler times. You cry as you refactor your Jenkins Groovy DSL.

it’s actually much more common than you think for people to reuse CI systems for cron tasking.

It’s always a mistake, but it’s easy in the moment and sticks around longer than I’d like.

What's the thing you should replace Jenkins with at scale?

I read too many "use Postgres as your queue (pgkitchensink is in beta)", now I'm learning listen/notify is a strain, and so are long transactions. Is there a happy medium?

Just stop worrying and use it. If and when you actually bump into the limitations, then it's time to sit down and think and find a supplement or replacement for the offending part.

Depends what else you’re running on it; it’s a little expensive, but not prohibitively so.

Long-running transactions interfere with vacuuming and increase contention for locks. Everything depends on your workload, but a long-running transaction holding an important lock is an easy way to bring down production.

systemd doesn't scale beyond one system or have high availability.

Do you know how many timers you could run on a single instance? An absurd amount.

t1: select for update where status=pending, set status=processing

t2: update, set status=completed|error

these are two independent, very short transactions? or am i misunderstanding something here?

edit:

i think i'm not seeing what the 'transaction at start of processor' logic is; i'm thinking more of a polling logic

    while true:
      r := select for update
      if r is None:
        return
      sleep a bit

this obviously has the drawback of knowing how long to sleep for; and tasks not getting "instantly" picked up, but eh, tradeoffs.

Your version makes sense. I understood the OP's approach as being different.

Two (very, if indexed properly) short transactions at start and end are a good solution. One caveat is that the worker can die after t1, but before t2 - hence jobs need a timeout concept and should be idempotent for safe retrying.

This gets you "at least once" processing.
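The timeout half of that caveat could look something like the sketch below: a periodic sweep that puts jobs back to 'pending' when they have sat in 'processing' for too long. The `jobs` table, the `claimed_at` column (seconds since the epoch) and the function name are assumptions, and SQLite is used only for a self-contained example.

```python
import sqlite3


def requeue_stale_jobs(conn: sqlite3.Connection, now: float,
                       timeout_s: float) -> int:
    """Return timed-out 'processing' jobs to 'pending'; report how many."""
    cur = conn.execute(
        "UPDATE jobs SET status = 'pending', claimed_at = NULL "
        "WHERE status = 'processing' AND claimed_at < ?",
        (now - timeout_s,),
    )
    conn.commit()
    return cur.rowcount
```

A slow but still-alive worker can be requeued this way and its job run twice, which is exactly why the jobs need to be idempotent.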

> this obviously has the drawback of knowing how long to sleep for; and tasks not getting "instantly" picked up, but eh, tradeoffs.

Right. I've had success with exponential backoff sleep. In a busy system, this means sleeps remain either 0 or very short.
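
A minimal sketch of that backoff polling, in TypeScript. It assumes a hypothetical `claimNextTask` callback that wraps the short "select for update, set status=processing" transaction and resolves to `null` when nothing is pending; `handle`, `shouldStop`, and the delay bounds are likewise illustrative names and values, not anything from this thread.

```typescript
// Hypothetical task shape; only the id matters for this sketch.
type Task = { id: string };

// Assumed bounds, tune them for your workload.
const INITIAL_DELAY_MS = 100;
const MAX_DELAY_MS = 30000;

// Next sleep: no sleep after finding work, otherwise double up to the cap.
function nextDelayMs(currentMs: number, foundWork: boolean): number {
	if (foundWork) return 0;
	return Math.min(MAX_DELAY_MS, Math.max(INITIAL_DELAY_MS, currentMs * 2));
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// claimNextTask, handle and shouldStop are supplied by the caller (assumptions).
async function pollLoop(
	claimNextTask: () => Promise<Task | null>,
	handle: (task: Task) => Promise<void>,
	shouldStop: () => boolean,
): Promise<void> {
	let delayMs = 0;
	while (!shouldStop()) {
		const task = await claimNextTask();
		if (task !== null) await handle(task);
		delayMs = nextDelayMs(delayMs, task !== null);
		if (delayMs > 0) await sleep(delayMs);
	}
}
```

While the queue has work, the loop never sleeps; once it drains, the delay grows 100 ms, 200 ms, 400 ms and so on until it hits the cap.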

Another solution is Postgres LISTEN/NOTIFY: workers listen for events and PG wakes them up. On the happy path, this gets instant job pickup. This should be allowed to fail open and understood as a happy path optimization.

As delivery can fail, this gets you "at most once" processing (which is why this approach by itself is not enough to drive a persistent job queue).

A caveat with LISTEN/NOTIFY is that it doesn't scale due to locking [1].

[1]: https://www.recall.ai/blog/postgres-listen-notify-does-not-s...

They're proposing doing it in one transaction as a heartbeat.

> - If you find an unlocked task in 'executing', you know the processor died for sure. No heuristic needed

If you're running on AWS and not designing a system that locks you in to the AWS platform, then you're going to be overpaying by a lot.

Write your own scheduler.

Oracle is cheaper in the long run.

? You can (and should) version control your Jenkins config as well, including the pipeline codes.

Systemd has timers now which have way better error handling.

I dabbled a little with NixOS a while back (e.g. I think I reported the bug that broke the entire point of /etc/os-release for chroots, as well as commented on how to do a container install from scratch at a point when nobody documented it), but there were 3 things that really pushed me away:

  1. Nix has clear advantages for *deployment* (including end-user deployment) but really gets in the way for new *development*. Maybe flakes fix this? Maybe not though.
  2. The "Nix on other Linux" install scripts were hostile in attacking startup scripts, rather than allowing opt-in isolation.
  3. The Nix language (and library?) is not sane. Nobody actually understands it, only copy-pastes pieces of existing package scripts and hopes the changes work.

Airflow can be frustrating but when it works it is so satisfying.

"Please don't post shallow dismissals, especially of other people's work. A good critical comment teaches us something."

https://news.ycombinator.com/newsguidelines.html

What's the thing you should replace Jenkins with at scale?

I'm a firm believer that there will never be a perfect general purpose job scheduler. The priority for how jobs are scheduled is always deeply coupled to your business needs. General purpose schedulers always end up as a jack of all trades but master of none. With a custom built scheduler you get that control, but do have to re-invent the wheel for a lot of features. Jenkins, Argo, Airflow, Cron, etc, all have their own pros and cons.

it’s actually much more common than you think for people to reuse CI systems for cron tasking.

It’s always a mistake, but it’s easy in the moment and sticks around longer than I’d like.

CI systems like Jenkins are there and they're corp-approved.

Getting a weird 3rd party scheduling system with access to internal stuff approved is HARD in big corps.

So we (ab)use the CI system we have. It has scheduling and it already accesses internal resources.

What about Camunda? It’s a corporate workflow engine.

Just stop worrying and use it. If and when you actually bump into the limitations, then it's time to sit down and think and find a supplement or replacement for the offending part.

Excellent advice across many domains/techs here.

Airflow can be frustrating but when it works it is so satisfying.

I think mistaking Airflow for a mere "task scheduler" is part of that frustration.

After using Argo Workflows, I don't think I will ever return to Airflow. Kubernetes is not an easy system to manage, but managing an Airflow setup is somehow worse. The story around disaster recovery and scheduler redundancy was an absolute nightmare for me.

If the system is already using SQS, DynamoDB has this locking library which is lighter weight for this use case

https://github.com/awslabs/amazon-dynamodb-lock-client

> The AmazonDynamoDBLockClient is a general purpose distributed locking library built on top of DynamoDB. It supports both coarse-grained and fine-grained locking.

What are your thoughts on using Redis Streams or using a table instead of LISTEN/NOTIFY (either a table per topic or a table with a compound primary key that includes a topic - possibly a temporary table)?

Yes, and that cannot work: if a task is unlocked but in 'executing' state, how was it unlocked but its state not updated?

If a worker/processor dies abruptly, it will neither unlock nor set the state appropriately. It won't have the opportunity. Conceptually, this failure mode can always occur (think, power loss).

If such a disruption happened, yet you later find tasks unlocked, they must have been unlocked by another system. Perhaps Postgres itself, with a killer daemon to kill long-running transactions/locks. At which point we are back to square one: the job scheduling should be robust against this in the first place.

> 3. The Nix language (and library?) is not sane. Nobody actually understands it, only copy-pastes pieces of existing package scripts and hopes the changes work.

Perhaps Nix is "Wonko the Sane" and it is in fact the rest of us who are in the asylum?

Nix, the language, is a little strange at first but really does make sense. Nixpkgs, the "standard library", is a little stranger and sometimes makes an odd default choice. The nice thing though is that using Nix you can coerce Nixpkgs into just about any shape that suits you.

Argo workflows is much more painful for data processing than Airflow in my experience.

It’s a tradeoff. Ease of modeling the pipelines vs ease of managing the infrastructure. I'm not really a fan of either syntax for defining DAGs, but they're the best options out there imo.

I've not used Redis Streams, but it might work. I've seen folks advise against PG, in favor of Redis for job queues.

> using a table instead of LISTEN/NOTIFY

What do you mean? The job queue is backed by a PG table. You could optionally layer LISTEN/NOTIFY on top.

I've had success with a table with compound, even natural primary keys, yes. Think "(topic, user_id)". The idea is to allow for PARTITION BY should the physical tables become prohibitively large. The downsides of PARTITION BY don't apply for this use case, the upsides do (in theory - I've not actually executed on this bit!).

Per "topic", there's a set of workers which can run under different settings (e.g. number of workers to allow horizontal scaling - under k8s, this can be automatic via HorizontalPodAutoscaler and dispatching on queue depth!).

At Heartbeat, we have a lot of different tasks that need to run at a particular time. Users can create draft posts or events that get published at a certain time. Event reminders need to be sent at a certain number of hours before an event. Automated workflows can be set up that send emails or direct messages after a delay.

For the longest time, all of these tasks were managed by a variety of cron scripts. We had createScheduledPosts.ts that would run every 15 minutes, scan our table of scheduled posts and create any that needed to be published. sendEventReminders.ts would run every single minute, scan our table of events, and send any notifications that needed to be sent out. And so on.

Each of these cron jobs would need to be managed independently. Whenever a new feature was added that involved running tasks in the future, a new cron job would be created. If one of the scripts started erroring, I’d need to figure out why, fix it and then figure out a way to retroactively run the tasks that were missed while the script was broken. Sometimes, we’d get reports from customers that a certain task that was supposed to run did not. I’d painstakingly dig into the logs & code, trying to figure out why a particular event reminder did not get sent on time. The first couple times this happened, I’d usually discover that we lacked the logs to even properly diagnose the issue. All I would be able to do is add some more logs and hope that I’d find the problem the next time. Once the logs were in place, I’d uncover some bug caused by timezones, improper error handling or who knows what else.

Eventually, I came to my senses and realized that all of these various cron jobs were doing the same thing. And rather than have 10 different cron jobs each implementing their own half-baked version of a task scheduler, we should just have a robust, centralized system for scheduling tasks.

The way it works is we have a single database table called ScheduledTasks with the following schema:

enum ScheduledTaskStatus {
	QUEUED
	EXECUTING
	COMPLETED
}

model ScheduledTask {
	id                               String @id
	communityID                      String
	createdAt                        DateTime
	lastStatusUpdate                 DateTime
	timestamp                        DateTime
	status                           ScheduledTaskStatus
	expectedExecutionTimeInMinutes   Int
	expirationInMinutes              Int?
	priority                         Int
	payload                          Json
	message                          String?

	@@index([status, timestamp])
}

payload is a discriminated union that contains each type of task we have. For example:

type ScheduledTaskPayload =
	| {
			type: "PUBLISH_EVENT";
			eventID: EventID;
	  }
	| {
			type: "PUBLISH_SCHEDULED_POST";
			scheduledPostID: ScheduledPostID;
	  }
	| {
			type: "SEND_EVENT_REMINDER";
			eventID: EventID;
	  }
	| {
			type: "SEND_EMAIL";
			email: string;
			subject: string;
			body: string;
	  };

Now, whenever we have a task that needs to be scheduled for the future, all we need to do is insert a new ScheduledTask into the database. We have a single cron job responsible for executing scheduled tasks that runs once every minute.

The cron job works as follows:

1. Get all tasks that meet the following criteria:
   - Status is not Completed
   - timestamp is less than now + 30 seconds
   - The task has not expired (now is less than timestamp + expirationInMinutes or expirationInMinutes is null)
   - If status is Executing, now > timestamp + expectedExecutionTimeInMinutes
2. Sort all of the tasks by priority
3. Update all of the tasks as Executing in the database
4. Create an AWS SQS message for each task
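
The selection criteria above can be sketched as a pure function. This is an illustration rather than the production query: the `ScheduledTaskRow` interface, `isDue`, and `selectDueTasks` are hypothetical names, the filtering would normally happen in the database, and the sort direction (larger priority first) is an assumption, since the steps only say "by priority".

```typescript
type ScheduledTaskStatus = "QUEUED" | "EXECUTING" | "COMPLETED";

// Only the fields the selection step reads; mirrors the schema above.
interface ScheduledTaskRow {
	id: string;
	timestamp: Date;
	status: ScheduledTaskStatus;
	expectedExecutionTimeInMinutes: number;
	expirationInMinutes: number | null;
	priority: number;
}

const MINUTE_MS = 60000;
const LOOKAHEAD_MS = 30000;

function isDue(task: ScheduledTaskRow, now: Date): boolean {
	const nowMs = now.getTime();
	const scheduledMs = task.timestamp.getTime();
	if (task.status === "COMPLETED") return false;
	// timestamp must be less than now + 30 seconds.
	if (scheduledMs >= nowMs + LOOKAHEAD_MS) return false;
	// Skip expired tasks; a null expiration never expires.
	if (
		task.expirationInMinutes !== null &&
		nowMs >= scheduledMs + task.expirationInMinutes * MINUTE_MS
	) {
		return false;
	}
	// An Executing task is retried only once its expected run time has passed.
	if (
		task.status === "EXECUTING" &&
		nowMs <= scheduledMs + task.expectedExecutionTimeInMinutes * MINUTE_MS
	) {
		return false;
	}
	return true;
}

// Assumption: a larger priority number is dispatched first.
function selectDueTasks(tasks: ScheduledTaskRow[], now: Date): ScheduledTaskRow[] {
	return tasks.filter((task) => isDue(task, now)).sort((a, b) => b.priority - a.priority);
}
```

With an assumed expectedExecutionTimeInMinutes of 5, a task scheduled for 10:00 that is still Executing is skipped at 10:01 and picked up again at 10:08.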

Separately, we have an SQS consumer that listens for the SQS messages. The consumer reads the payload discriminated union and calls the corresponding function responsible for executing the task.

async function processTask(taskPayload: ScheduledTaskPayload) {
	if (taskPayload.type === "PUBLISH_EVENT") {
		await publishEvent(taskPayload.eventID);
	} else if (taskPayload.type === "PUBLISH_SCHEDULED_POST") {
		await publishScheduledPost(taskPayload.scheduledPostID);
	}
	//...
}

After the task runs, mark it as completed in the database. Some of our tasks will return a new scheduled task. If they do, insert the new scheduled task into the database. For example, after sending an event reminder for an instance of a recurring event, the next reminder is scheduled.
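
As a rough sketch of that completion step: the handler resolves to either a follow-up task or `null`, and the consumer records both outcomes. An in-memory `Map` stands in for the ScheduledTasks table here, and `TaskRow`, `TaskHandler`, `recordCompletion`, and `runAndComplete` are hypothetical names chosen for illustration.

```typescript
type Status = "QUEUED" | "EXECUTING" | "COMPLETED";

// Trimmed-down row; the real table has more columns.
interface TaskRow {
	id: string;
	status: Status;
	timestamp: Date;
}

// A handler does the work and may return the next task to schedule.
type TaskHandler = (task: TaskRow) => Promise<TaskRow | null>;

// In-memory stand-in for the ScheduledTasks table (an assumption of this sketch).
type TaskTable = Map<string, TaskRow>;

function recordCompletion(table: TaskTable, task: TaskRow, followUp: TaskRow | null): void {
	// Mark the finished task first, then store the follow-up,
	// so a follow-up that reuses the same id ends up QUEUED.
	table.set(task.id, { ...task, status: "COMPLETED" });
	if (followUp !== null) {
		table.set(followUp.id, followUp);
	}
}

async function runAndComplete(table: TaskTable, task: TaskRow, handler: TaskHandler): Promise<void> {
	const followUp = await handler(task);
	recordCompletion(table, task, followUp);
}
```

If the handler throws, nothing is recorded and the task stays in Executing, which leaves it to the retry logic to pick it up again.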

The system has retry logic built in. If for some reason the script does not run for some amount of time due to an outage or error, the scheduled tasks will still exist in the database. Once the script is running again, any tasks that were not executed when they were originally supposed to will be run.

The expirationInMinutes enables us to control which tasks are run at a later time. Some tasks, such as event reminders, don’t make sense to be run after a certain point. Others, like publishing scheduled posts, fall into the “better late than never” bucket, in which case expirationInMinutes will be set to null.

The expectedExecutionTimeInMinutes field lets us handle retry for tasks that get stuck in Executing. If a task that was scheduled for 10:00am is still marked as Executing at 10:01am, we probably don’t want to run the task again because the first run might still be in progress. However, by 10:08am, if the task is still stuck in Executing, it probably ran into an error and we can try running it again. expectedExecutionTimeInMinutes tells the system how long to wait until rerunning a task stuck in Executing.

To ensure tasks run at the right time, we need to make sure that whenever a change is made to an entity, the corresponding scheduled task is also updated. For example, when a user creates an event that starts at 3pm, we create a scheduled task for the reminder to be sent at 2pm. If the user later updates the event to be at 6pm, we need to update the same scheduled task to send the reminder at 5pm instead. We enable this by using consistent ids for editable tasks.

type ScheduledTaskPayload =
	| {
			type: "PUBLISH_EVENT";
			eventID: EventID;
	  }
	| {
			type: "PUBLISH_SCHEDULED_POST";
			scheduledPostID: ScheduledPostID;
	  }
	| {
			type: "SEND_EVENT_REMINDER";
			eventID: EventID;
	  }
	| {
			type: "SEND_EMAIL";
			email: string;
			subject: string;
			body: string;
	  };

function getTaskID(payload: ScheduledTaskPayload) {
	if (payload.type === "PUBLISH_EVENT") {
		return `task-${payload.type}-${payload.eventID}`;
	} else if (payload.type === "PUBLISH_SCHEDULED_POST") {
		return `task-${payload.type}-${payload.scheduledPostID}`;
	} else if (payload.type === "SEND_EVENT_REMINDER") {
		return `task-${payload.type}-${payload.eventID}`;
	} else if (payload.type === "SEND_EMAIL") {
		return generateUUID();
	} else {
		assertNever(payload);
	}
}

Whenever we create a task, we use getTaskID to get the id for the scheduled task. And rather than just creating the task, we do an upsert. So when a user creates an event for the first time, the scheduled task does not exist, so it will be created. When edits are made to the event, the eventID remains the same, so the id for the corresponding scheduled task will be the same. As a result, the previous scheduled task will be updated rather than creating a new one. Other tasks, such as SEND_EMAIL, can be triggered from a variety of sources and are not directly editable by the user, so those tasks just use a uuid instead.
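
A small sketch of the upsert behaviour for the event reminder example, with an in-memory `Map` standing in for the table. `ReminderTask`, `reminderTaskID`, and `upsertReminder` are hypothetical names, and the one-hour lead time is taken from the 3pm/2pm example above.

```typescript
interface ReminderTask {
	id: string;
	eventID: string;
	timestamp: Date;
	status: "QUEUED" | "EXECUTING" | "COMPLETED";
}

// In-memory stand-in for the ScheduledTasks table (an assumption of this sketch).
const tasks = new Map<string, ReminderTask>();

const HOUR_MS = 60 * 60 * 1000;

// Same id format as getTaskID uses for SEND_EVENT_REMINDER.
const reminderTaskID = (eventID: string) => `task-SEND_EVENT_REMINDER-${eventID}`;

// Upsert: creates the row when the id is new, overwrites it otherwise.
function upsertReminder(eventID: string, eventStart: Date): ReminderTask {
	const task: ReminderTask = {
		id: reminderTaskID(eventID),
		eventID,
		timestamp: new Date(eventStart.getTime() - HOUR_MS),
		status: "QUEUED",
	};
	tasks.set(task.id, task);
	return task;
}

// Event created for 3pm: the reminder is scheduled for 2pm.
upsertReminder("evt_1", new Date("2024-05-01T15:00:00Z"));
// Event moved to 6pm: same id, so the existing reminder moves to 5pm.
upsertReminder("evt_1", new Date("2024-05-01T18:00:00Z"));
```

If the table is accessed through Prisma Client (the schema syntax suggests it, but that is an assumption), the equivalent call would be `upsert` with `where`, `create`, and `update` arguments keyed on the id.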

Overall, creating the scheduled tasks system has come with enormous benefits:

- We can centralize the work into one place. Our scheduled tasks cron job has a bunch of logging, error handling and monitoring built into it. We can implement all of this logic once instead of having to constantly reimplement it across various scripts.
- Retries are handled naturally. If we have an outage or if one of our task handlers starts running into errors, it’s easy for us to catch up on tasks that we missed after we fix the problem.
- There’s a lasting record of every task that gets executed. If a user asks about why a certain action didn’t happen, we can check the database & logs to see whether the task got created. If it did get created, we can see if it ran into a particular error.
- Implementing new features that involve task scheduling is now trivial. Instead of having to create a new cron script, all we need to do is add a new type to the ScheduledTaskPayload and a new handler to the processTask function.

Many of you probably already had the foresight to centralize your scheduled tasks into one place. But I haven’t seen too many people talking about this problem, so hopefully this was helpful for anyone that’s currently stuck maintaining a sea of scattered cron jobs.
