FYI if anyone is looking for a production ready job runner in Elixir, I strongly recommend Oban. They have a paid pro version but the open source version is still absolutely fantastic.
Just to add to the parent: the open source version and the Pro version differ only in that Pro has a few plugins the normal version lacks (for example, workflows, where you can have multiple workers working together), but that is not something you need for most use cases.
It's more than just a few - even more basic things like rate limiting or concurrency controls are gated behind Pro. It works extremely well, but I've been reluctant to use it in open source projects because there's quite a bit in there I'd need to rebuild.
I'm curious about your use case. It seems weird (to me) to use it in an open source project unless it's some kind of turnkey full app. Is there a way to just release it and encourage people to bring their own Oban keys? That way it looks good for the Elixir ecosystem that it has found a way to support hybrid open source libraries, and it expands the Oban ecosystem.
I wrote this up after having written a tutorial on doing this in GenStage around a decade ago, and thought it was interesting to have the two of them side by side to consider. I linked the original in the document linked to here.
Overall, I'm wildly impressed at how this Elixir code held up, and it was a joy to revisit this.
Great write up, thanks for sharing! Nothing against Oban, but it is nice seeing someone in the Elixir community not just say, "run Oban" and drop the mic
Years ago I built something similar, a Pub/Sub notification system: https://github.com/huqedato/qnotix. It is still running (with small modifications), in production, at my ex-customer.
Such a pity the industry (customers) reacts with skepticism every time I propose solutions based on Elixir/Erlang. I always hear: "Elixir, what? We want Java/.Net/Python/php"
I've run into this a few times before too. Sometimes discussing maintenance/ongoing costs and how minimal they are with an Elixir/Phoenix app can be compelling, though not always. I prefer to be over-transparent with clients so I'll usually add that if they plan to switch contractors often then it's probably not (tho sometimes still is) a good idea to go with Elixir. When I compare the amount of ongoing effort to keep a PHP, Node, or Python app running securely in prod with the ongoing effort with a Phoenix app, it's usually a better deal for them.
The one Danger Zone area though is when they want something between WordPress and a "real" app. A lot of the clients I've talked to already have WordPress and are really trying to stretch the framework to places it isn't really made to go. Those customers often think "I should be able to get a full app built for <=$10,000" and often shop around until they find someone who will accept it, and then they usually end up getting a bad/buggy product that is slow as dirt. At that point it's often too late as well because nobody wants to pay to start over. For those clients though I wouldn't recommend Elixir because they would almost rather spend more in maintenance costs than in initial development because the numbers feel smaller when spread out, even if they add up to more in the long run and with a much worse product.
That is not to say of course that any Elixir app is going to be better - there are bad Elixir devs just like any stack, though I think Elixir's relative lack of popularity actually leads to an overall more skilled dev pool.
It means they don't trust your judgement, aka you're just there to execute their plans. You're the floor guy, not the architect, owner, or developer of a house.
Not really. I am usually the one they pay to solve their problems. It's not a matter of trusting my judgement, rather they are trapped in an institutional/corporate mindset (old patterns are most suitable, 'best practice', 'reliable, proven tech' and such)
> I am usually the one they pay to solve their problems.
In the context that doesn't really say a lot, they pay the floor guy and the architect to solve problems, just different problems.
>they are trapped in an institutional/corporate mindset (old patterns are most suitable, 'best practice', 'reliable, proven tech' and such)
While true, it is also correct. From their perspective Elixir is an exotic material: the next "one they pay to solve their problems" is not going to know how to work with it, will scrap it, and will start over. Therefore the Elixir solution isn't a good solution for them.
As a business man once told me, "I never saw a feature I liked so much that I was willing to pay for it twice."
Yup I would be concerned about how easy it is to find people.
And I say this as an Elixir employer whose company has done only Elixir projects for 8 years.
In all cases though, our client just wanted us to solve a problem and didn’t care how we did it. So we’ve been very lucky. But have also earned their trust over the years by delivering.
Minor criticism, you can't claim this as a problem "Unfair distribution - Fast workers might grab all the work" and then turn around and claim it as a benefit in your own solution "Natural load balancing - Fast consumers get more work".
Fair point. Can you think of a better way to weave that into what I’m trying to illustrate as a mental model? I struggled with that. Happy to read a PR that tries to capture it better.
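One possible way to frame the distinction, as a rough sketch rather than the article's own model: the same outcome ("fast consumers get more work") looks unfair when work is handed out blindly, and looks like load balancing when each consumer only asks for work once it is free. The toy simulation below compares the two. The consumer names, speeds, and function names are all made up for illustration.

```python
import heapq


def simulate_pull(job_count, seconds_per_job):
    """Pull model: a consumer takes the next job the moment it becomes free."""
    # Heap of (time the consumer becomes free, consumer name).
    free_at = [(0.0, name) for name in seconds_per_job]
    heapq.heapify(free_at)
    handled = {name: 0 for name in seconds_per_job}
    finish = 0.0
    for _ in range(job_count):
        now, name = heapq.heappop(free_at)
        done = now + seconds_per_job[name]
        handled[name] += 1
        finish = max(finish, done)
        heapq.heappush(free_at, (done, name))
    return handled, finish


def simulate_round_robin(job_count, seconds_per_job):
    """Push model: jobs are dealt out evenly, ignoring consumer speed."""
    names = list(seconds_per_job)
    handled = {name: 0 for name in names}
    for i in range(job_count):
        handled[names[i % len(names)]] += 1
    # Each consumer works through its own pile sequentially.
    finish = max(handled[n] * seconds_per_job[n] for n in names)
    return handled, finish


# Hypothetical consumers: one takes 1s per job, the other 4s per job.
speeds = {"fast": 1.0, "slow": 4.0}
pull_counts, pull_finish = simulate_pull(10, speeds)
push_counts, push_finish = simulate_round_robin(10, speeds)

print("pull:", pull_counts, "finished at", pull_finish)
print("push:", push_counts, "finished at", push_finish)
```

In the pull run the fast consumer does end up with most of the jobs, but the whole batch finishes sooner because no job sits waiting behind the slow consumer.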
As someone who has been learning Elixir on and off for over a year, this looks really exciting on first skim through. Excited to give this a deep read this weekend!
If you have any feedback or anything is unclear feel free to open an issue. I am thinking I am going to take this and expand the concepts to start as a beginners primer going through the primary concepts in brief akin to Elixir School, and then expand it into building this and a web service that is using it and offering some real time features.
pgflow can be used with any language/runtime, I just started with TypeScript and Supabase, as that's what I'm using.
The worker is stateless and "dumb" by design (currently it runs on serverless functions) - it just calls SQL functions: "poll_for_tasks" to get some tasks from the queue and then either "complete_task" or "fail_task" after executing user code - that's it, nothing more, so it should be relatively easy to adapt it to other runtimes.
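To make the shape of that loop concrete, here is a minimal sketch in Python. Only the three names poll_for_tasks, complete_task and fail_task come from the comment above; their arguments, the in-memory queue standing in for the database, and the handler are assumptions for illustration, not pgflow's actual signatures.

```python
class InMemoryQueue:
    """Hypothetical stand-in for the database.

    In pgflow these three operations are SQL functions; the argument
    lists used here are assumed for the sake of the sketch.
    """

    def __init__(self, payloads):
        self.pending = list(enumerate(payloads))  # (task_id, payload) pairs
        self.completed = {}
        self.failed = {}

    def poll_for_tasks(self, batch_size):
        batch = self.pending[:batch_size]
        self.pending = self.pending[batch_size:]
        return batch

    def complete_task(self, task_id, output):
        self.completed[task_id] = output

    def fail_task(self, task_id, error):
        self.failed[task_id] = error


def run_worker(queue, handler, batch_size=2):
    """Poll, run user code, report the outcome. The worker keeps no state."""
    while True:
        tasks = queue.poll_for_tasks(batch_size)
        if not tasks:
            break  # a real worker would sleep and poll again
        for task_id, payload in tasks:
            try:
                queue.complete_task(task_id, handler(payload))
            except Exception as exc:
                queue.fail_task(task_id, str(exc))


# The second payload makes the handler raise, so it ends up in `failed`.
queue = InMemoryQueue([4, 0, 5])
run_worker(queue, lambda n: 10 // n)
print("completed:", queue.completed)
print("failed task ids:", list(queue.failed))
```

Because all state lives behind those three calls, porting the worker to another runtime mostly means reimplementing `run_worker`.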
The absolute facility with which you can do distributed applications in Elixir is the big thing that sold me in the first place. Nice to see this tutorial!
> The Beauty of Modeling Everything as Events - When you model work as events, powerful patterns emerge
Wouldn't you say these patterns aren't unique to event based systems, since functional composition can also lead to these patterns? For example, see the Streams API in Java.
=========
- Where would push event systems make sense over pull based systems?
You also could use an exclusive advisory lock, which would serialize the order of input and force everyone to wait in line, but the advantage here is that we are allowed shared concurrent reads while maintaining our lock. Thank you for the feedback. It was fun to revisit after 10 years. The initial piece came after a Columbus, OH Ruby meetup that Jose attended. It was quite fun to see him put it together as a whole and I ran with things a bit further than his first ideas with this.
SKIP LOCKED is amazing for this kind of thing. I used it to build a transaction outbox a few years ago.
One thing worth looking into if you do this in production is adding a way to add partitions such that each partition is single threaded. It’s the only way to guarantee ordering if your jobs are doing anything non-deterministic.
We have a system where each pod spins up around 30 scheduled job instances of a single job, each processing one "partition"; the transaction outbox is then queried by a hash of the identifier, which maps each row to its partition.
We increased partition counts on sale days and it works well for us.
A couple of gotchas we had:
1) Using hashtext from postgres is sketchy.
2) Increasing the partition count is an orchestra which requires stopping the partition.
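A rough illustration of both gotchas, under stated assumptions. The comment doesn't say why hashtext is sketchy; one commonly raised concern is that it is an internal Postgres function with no documented stability guarantee, so the sketch below computes the partition in application code from a fixed hash instead. That is a swapped-in technique, not what the commenter's system does. The function name, the `order-N` identifiers, and the counts 30 and 40 are made up.

```python
import hashlib


def partition_for(identifier: str, partition_count: int) -> int:
    """Map an identifier to a partition using a hash that never changes."""
    digest = hashlib.sha256(identifier.encode("utf-8")).digest()
    # Use the first 8 bytes as an unsigned integer, then reduce modulo the count.
    return int.from_bytes(digest[:8], "big") % partition_count


# Hypothetical identifiers, for illustration only.
ids = [f"order-{n}" for n in range(1000)]

# Same identifier, same count: always the same partition, so one
# single-threaded consumer sees all of that identifier's jobs in order.
assert partition_for("order-1", 30) == partition_for("order-1", 30)

# Changing the count remaps many identifiers to a different partition.
moved = sum(1 for i in ids if partition_for(i, 30) != partition_for(i, 40))
print(f"{moved} of {len(ids)} identifiers change partition going from 30 to 40")
```

The remapping is why resizing needs coordination: while old and new counts are both live, two consumers can hold jobs for the same identifier and per-identifier ordering is lost.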
FOR UPDATE SKIP LOCKED is great, but it needs to be in a transaction. In the example code it won't "do" anything because it selects for update then immediately loses the lock.
Claude says you can use a CTE to select and then run your update with the locked rows, but I have only ever used transactions.
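For what it's worth, here is a sketch of what that CTE form can look like. The select and the update are one statement, so the row locks taken by FOR UPDATE SKIP LOCKED are still held when the update runs. The `jobs` table, its `id` and `status` columns, and the `%s` placeholder style (as used by psycopg-like drivers) are assumptions; this is not the article's code. The `RecordingCursor` is only a test double so the snippet runs without a database.

```python
# Single statement: pick unlocked pending rows, then mark them as running.
# Table and column names are hypothetical.
CLAIM_SQL = """
WITH picked AS (
    SELECT id
    FROM jobs
    WHERE status = 'pending'
    ORDER BY id
    LIMIT %s
    FOR UPDATE SKIP LOCKED
)
UPDATE jobs
SET status = 'running'
FROM picked
WHERE jobs.id = picked.id
RETURNING jobs.id
"""


def claim_jobs(cursor, limit):
    """Claim up to `limit` jobs using any DB-API style cursor."""
    cursor.execute(CLAIM_SQL, (limit,))
    return [row[0] for row in cursor.fetchall()]


class RecordingCursor:
    """Test double: records the statement instead of talking to Postgres."""

    def __init__(self, rows):
        self.rows = rows
        self.executed = []

    def execute(self, sql, params=None):
        self.executed.append((sql, params))

    def fetchall(self):
        return self.rows


cursor = RecordingCursor([(1,), (2,)])
claimed = claim_jobs(cursor, 2)
print("claimed job ids:", claimed)
```

The trade-off versus an explicit transaction: the claim is atomic, but the lock is released as soon as the statement commits, so the status change (here `'running'`) is what keeps other workers away while the job executes.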
I have not seen a single Elixir project that doesn't have a spaghetti mess made out of macros.
You can rant all you want about the disadvantages of OOP and other paradigms. But after your rant, do not come up with an even more absurd type system that gives you even more work to do without any upside.
Building a project out of constructs such as macros does not scale with project size or team size. The only aspect that scales is the resulting size of the unmanageable spaghetti. Tolerating that as a widely used feature in your language and then praising it as fantastic truly makes one wonder what you are comparing the Elixir experience with. Elixir is fantastic compared to... fighting 100 gorillas blindfolded.
There are good choices built into the language, like railroad oriented programming. But those will never undo the infinitely awful and inadequate insanity of having to deal with macros to survive.
And to be frank, even if there are use-cases where the language shines and can be an excellent tool for the job, most of the time those are not your use-case, and most of the time your team is not familiar enough with the language to leverage its advantages or build something production-worthy.
And if you ever go through the pain of mastering Elixir, 99.99% of the time your next job will not use it. If you are at a startup celebrating the virtues of Elixir before having enough revenue, your next non-Elixir job is closer than you think because your company will run out of money. And if your plan is having an LLM help you write Elixir faster before you burn through your budget? Good luck debugging a labyrinth made out of AI generated macros, your on-call rotation will go great. Your stakeholders will be eager to learn more about how great Elixir is when they ask for an explanation of why nothing works, an estimate of when it will be fixed, why the project is late, or how they can add more team members to make it go faster.
pure HN comedy that the article is about the disgust of running acme tools that cram all buzzwords just for the lulz, and most comments are pushing even worse tools with even more buzzwords ("still the torment Nexus, but in elixir!")
sigh. anyway. i had the same reaction to acme. but I'm lazier. i just ran nginx and the acme client in qemu and picked the certs out. i applaud the author for her tenacity.
I have written a small architecture primer on pgflow if anyone is interested in its simple but flexible design https://www.pgflow.dev/concepts/how-pgflow-works/
https://viewer.diagrams.net/?tags=%7B%7D&highlight=0000ff&ed...
Also FOR UPDATE SKIP LOCKED is interesting.