PostgreSQL reconsiders its process-based model

By Jonathan Corbet
June 19, 2023
In the fast-moving open-source world, programs can come and go quickly; a tool that has many users today can easily be eclipsed by something better next week. Even in this environment, though, some programs endure for a long time. As an example, consider the PostgreSQL database system, which traces its history back to 1986. Making fundamental changes to a large code base with that much history is never an easy task. As fundamental changes go, moving PostgreSQL away from its process-oriented model is not a small one, but it is one that the project is considering seriously.

A PostgreSQL instance runs as a large set of cooperating processes, including one for each connected client. These processes communicate through a number of shared-memory regions using an elaborate library that enables the creation of complex data structures in a setting where not all processes have the same memory mapped at the same address. This model has served the project well for many years, but the world has changed a lot over the history of this project. As a result, PostgreSQL developers are increasingly thinking that it may be time to make a change.
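To see why shared structures need special handling when processes map the same region at different addresses, consider a linked list that stores offsets from the start of the region rather than raw pointers. The Python sketch below is only an analogy: PostgreSQL is written in C, and the node layout and the names `write_node` and `walk` are hypothetical, not part of PostgreSQL's shared-memory library.

```python
import struct

# Hypothetical node layout: a 4-byte signed value followed by a 4-byte
# offset to the next node, measured from the start of the region.
NODE = struct.Struct("<iI")
NIL = 0xFFFFFFFF  # sentinel offset meaning "no next node"


def write_node(region, offset, value, next_offset):
    """Store one list node at `offset` inside the region."""
    NODE.pack_into(region, offset, value, next_offset)


def walk(region, head_offset):
    """Follow the offset links and collect the stored values."""
    values = []
    offset = head_offset
    while offset != NIL:
        value, offset = NODE.unpack_from(region, offset)
        values.append(value)
    return values


# Build a three-node list; the nodes are deliberately not laid out in order.
region = bytearray(64)
write_node(region, 0, 10, 32)
write_node(region, 32, 20, 16)
write_node(region, 16, 30, NIL)

# A byte-for-byte copy stands in for the same region mapped at a different
# address in another process: the offsets remain valid there.
other_mapping = bytearray(region)
```

Calling `walk(region, 0)` and `walk(other_mapping, 0)` yields the same sequence of values, because nothing in the structure depends on where the buffer lives. Absolute addresses would not survive that relocation, which is the kind of bookkeeping a single address space would make unnecessary.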

A proposal

At the beginning of June, Heikki Linnakangas, seemingly following up on some in-person conference discussions, posted a proposal to move PostgreSQL to a threaded model.

I feel that there is now pretty strong consensus that it would be a good thing, more so than before. Lots of work to get there, and lots of details to be hashed out, but no objections to the idea at a high level.

The purpose of this email is to make that silent consensus explicit.

The message gave a quick overview of some of the challenges involved in making such a move, and acknowledged, in an understated way, that this transition "surely cannot be done fully in one release". One thing that was missing was a discussion of why this big change would be desirable, but that was filled in as the discussion went on. As Andres Freund put it:

I think we're starting to hit quite a few limits related to the process model, particularly on bigger machines. The overhead of cross-process context switches is inherently higher than switching between threads in the same process - and my suspicion is that that overhead will continue to increase. Once you have a significant number of connections we end up spending a *lot* of time in TLB misses, and that's inherent to the process model, because you can't share the TLB across processes.

He also pointed out that the process model imposes costs on development, forcing the project to maintain a lot of duplicated code, including several memory-management mechanisms that would be unneeded in a single address space. In a later message he also added that it would be possible to share state more efficiently between threads, since they all run within the same address space.

The reaction of some developers, though, made it clear that the "pretty strong consensus" cited by Linnakangas might not be quite that strong after all. Tom Lane said: "I think this will be a disaster. There is far too much code that will get broken". He added later that the cost of this change would be "enormous", it would create "more than one security-grade bug", and that the benefits would not justify the cost. Jonathan Katz suggested that there might be other work that should have a higher priority. Others worried that losing the isolation provided by separate processes could make the system less robust overall.

Still, many PostgreSQL developers seem to be cautiously in favor of at least exploring this change. Robert Haas said that PostgreSQL does not scale well on larger systems, mostly as a result of the resources consumed by all of those processes. "Not all databases have this problem, and PostgreSQL isn't going to be able to stop having it without some kind of major architectural change". Just switching to threads might not be enough, he said, but he suggested that this change would enable a number of other improvements.

How to get there

Moving the core of the PostgreSQL server into a single address space will certainly present a number of challenges. The biggest one, as pointed out by Haas and others, would appear to be the server's "widespread and often gratuitous use of global variables". Globals work well enough when each server process has its own set, but that approach clearly falls apart when threads are used instead. According to Konstantin Knizhnik, there are about 2,000 such variables currently used by the PostgreSQL server.

A couple of approaches to this problem were discussed. One was pulling all of the global variables into a big "session state" structure that would be thread-local. That idea quickly loses its appeal, though, when one considers trying to create and maintain a 2,000-member structure, so the project is unlikely to go this way. The alternative is to simply throw all of the globals into thread-local storage, an approach that is easy and would work, but heavy use of thread-local storage would exact a performance penalty that would reduce the benefits of the switch to threads in the first place. Haas said that marking globals specially (to put them into thread-local storage, among other things) would be a beneficial project in its own right, as that would be a good first step in reducing their use. Freund agreed, saying that this effort would pay off even if the switch to threads never happens.
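The difference between a plain global and thread-local storage can be shown with a small sketch. This is a Python analogy, not PostgreSQL code: the names `current_user`, `session`, and `run_sessions` are hypothetical, and Python's `threading.local` merely stands in for C thread-local storage. Each "session" thread stores a value, waits until all threads have done the same, and then reads the value back.

```python
import threading

N_THREADS = 4

# A plain module-level global: every thread reads and writes the same slot.
current_user = None

# Thread-local storage: each thread sees its own `user` attribute.
session = threading.local()


def run_sessions(use_thread_local):
    """Start N_THREADS "sessions" and return what each one reads back."""
    barrier = threading.Barrier(N_THREADS)
    seen = [None] * N_THREADS

    def worker(i):
        global current_user
        name = f"user{i}"
        if use_thread_local:
            session.user = name
        else:
            current_user = name
        # Wait until every thread has stored its value before reading.
        barrier.wait()
        seen[i] = session.user if use_thread_local else current_user

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(N_THREADS)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return seen


shared_result = run_sessions(use_thread_local=False)
local_result = run_sessions(use_thread_local=True)
```

With the shared global, every thread reads back whichever value was written last, so all entries of `shared_result` are identical. With thread-local storage, each thread reads back its own value. The sketch says nothing about the performance cost of thread-local access that the developers were concerned about.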

But, Freund cautioned, moving global variables to thread-local storage is the easiest part of the job:

Redesigning postmaster, defining how to deal with extension libraries, extension compatibility, developing tools to make developing a threaded postgres feasible, dealing with freeing session lifetime memory allocations that previously were freed via process exit, making the change realistically reviewable, portability are all much harder.

An interesting point that received surprisingly little attention in the discussion is that Knizhnik has already done a threads port of PostgreSQL. The global-variable problem, he said, was not that difficult. He had more trouble with configuration data, error handling, signals, and the like. Support for externally maintained extensions will be a challenge. Still, he saw some significant benefits in working in the threaded environment. Anybody who is thinking about taking on this project would be well advised to look closely at this work as a first step.

Another complication that the PostgreSQL developers have in mind is that of supporting both the process-based and thread-based modes, perhaps indefinitely. The need to continue to support running in the process-based mode would make it harder to take advantage of some of the benefits offered by threads, and would significantly increase the maintenance burden overall. Haas, though, is not convinced that it would ever be possible to remove support for the process-based mode. Threads might not perform better for all use cases, or some important extensions may never gain support for running in threads. The removal of process support is, as he noted, a question that can only really be considered once threads are working well.

That point is, obviously, a long way into the future, assuming it arrives at all. While the outcome of the discussion suggests that most PostgreSQL developers think that this change is good in the abstract, there are also clearly concerns about how it would work in practice. And, perhaps more importantly, nobody has, yet, stepped up to say that they would be willing to put in the time to push this effort forward. Without that crucial ingredient, there will be no switch to threads in any sort of foreseeable future.



Aim for the stars

Posted Jun 19, 2023 16:11 UTC (Mon) by Wol (subscriber, #4433) [Link]

> While the outcome of the discussion suggests that most PostgreSQL developers think that this change is good in the abstract, there are also clearly concerns about how it would work in practice.

And you might hit the moon. Aim nowhere and you're going nowhere.

Look at the GIL (was that Python?) and the Big Kernel Lock in linux. Whether you get there or not, a lot of the work on the way sounds like it's worth it in its own right. Like getting rid of all those global variables!

Even being able to break up each process into a bunch of threads for the easy stuff could lead to massive benefits - threading where it works well, processes where they work well.

I wish you all God Speed on the voyage!

Cheers,
Wol

Aim for the stars

Posted Jun 19, 2023 18:18 UTC (Mon) by zoobab (guest, #9945) [Link]

Maybe use zeromq ipc messages between threads?

Aim for the stars

Posted Jun 20, 2023 4:44 UTC (Tue) by j16sdiz (subscriber, #57302) [Link]

ZeroMQ is a big mess when it comes to threading model and error recovery.

It does too much magic behind your back. When it comes to databases, we need more explicit (or flexible) error handling.

Aim for the stars

Posted Jun 19, 2023 20:19 UTC (Mon) by nevyn (subscriber, #33129) [Link]

Python GIL and Linux Big kernel lock seem like very bad comparisons. In those cases there is/was no Parallelism, here there is Parallelism but _maybe_ the scaling is better if you change "everything" and _maybe_ the security/robustness is the same.

This is "closer" to the apache-httpd move, the main difference being I don't know enough about PostgreSQL and the plans to move to imply the outcome will be that bad.

Aim for the stars

Posted Jun 19, 2023 22:22 UTC (Mon) by Wol (subscriber, #4433) [Link]

It wasn't meant as a comparison. The Big Kernel Lock and the GIL enforced "single process". PostgreSQL *is* a single process?

Linux and Python decided that removing that restriction was worthwhile. Whether PostgreSQL succeeds or not, the effort they make towards removing that restriction may well be worthwhile.

Cheers,
Wol

Aim for the stars

Posted Jun 19, 2023 23:18 UTC (Mon) by michaelmior (guest, #165680) [Link]

Postgres scales by coordinating among multiple processes on a single machine. The proposal is to use multiple threads instead of multiple processes.

This is similar to the CPython GIL, but the GIL doesn't enforce a single process. It prevents multiple threads from running concurrently in the same process. In CPython with the GIL, multiple processes are *necessary* to scale CPU-bound code.

Aim for the stars

Posted Jun 22, 2023 10:46 UTC (Thu) by khim (subscriber, #9252) [Link]

> PostgreSQL *is* a single process?

Have you actually read the article? No, it's most definitely not a single process. They are using multiple processes, shared memory and, obviously, some locks to ensure consistency.

Which means they already have locks and don't need GIL or BKL.

Aim for the stars

Posted Jun 20, 2023 4:44 UTC (Tue) by rtpg (subscriber, #114619) [Link]

I would go even further, there are a good amount of people who argue for the GIL to stay in Python ~forever, mostly because the mental model is easier and it rules out entire classes of bugs.

The GIL stuck around long enough to allow for async, and so you have async for lots of parallelism in one direction, stuff like multiprocessing in the other. Even heavy calculation stuff is pretty "eh whatever" because in practice it often calls into other libraries which release the GIL.

GILectomy work has been many many many many false starts, and I think we're learning stuff from it (and it might still be the right way to go in the end!), but it's been tough to find work from those projects that end up being usable (namely because of new locking patterns needing to be figured out in the alternative)

Aim for the stars

Posted Jun 20, 2023 8:13 UTC (Tue) by NYKevin (subscriber, #129325) [Link]

Another part of the problem is the fact that CPython is "good enough."

Anyone who wants to get rid of the GIL can transpile to C with Cython, annotate any objects that need to be accessed outside the GIL as C types, and then write "with nogil:" to release the GIL. It will run much faster than CPython even if you're single-threaded, and can be done incrementally on a module-by-module basis in most cases.

The main downsides of this strategy are:

* CPython is more mature than Cython.
* CPython has a (slightly) more straightforward build process, especially if you have zero non-stdlib dependencies.
* Cython specifically requires a C compiler.
* C types are not Python types. There are semantic differences. You have to do additional testing if you're converting an existing codebase.
* C is not a terribly complicated language, but if you don't know it at all, then you probably need to learn it first.

But none of those are hard blockers. They're just friction. If you really strongly need to drop the GIL, this is a perfectly reasonable way of doing it. The fact is, most people asking for a GILectomy either haven't looked into alternatives like Cython, don't want free threading badly enough to overcome the activation energy of this strategy, or have already built a large CPU-bound multithreaded application in Python which is too big to annotate, despite the threading docs explicitly saying not to do that.

Aim for the stars

Posted Jun 20, 2023 11:56 UTC (Tue) by eru (subscriber, #2753) [Link]

> and it rules out entire classes of bugs.

Seems to me this applies nicely also to PostgreSQL processes vs threads, because of the address-space separation, and the automatic memory cleanup you get when a sub-process exits. With threads, a bug in one thread may trash the memory of any other thread.

PostgreSQL reconsiders its process-based model

Posted Jun 19, 2023 19:26 UTC (Mon) by raven667 (subscriber, #5198) [Link]

I know nothing of the PostgreSQL internals or the relevant engineering but throwing an opinion out there anyway; is there a way to make a minimal threaded implementation that just covers the necessary features needed for the most extreme large servers where threading could help? If you made a ton of caveats about what features are supportable, i.e. anything not used by the large instances you want to test with, can you reduce the scope of what work is needed to something more manageable that can be iterated on? Steady improvement without taking on a big chunk of risk to rework the whole internal architecture, even if it takes longer, is probably the way to go for an old mature software project like this, right?

PostgreSQL reconsiders its process-based model

Posted Jun 19, 2023 19:45 UTC (Mon) by jhoblitt (subscriber, #77733) [Link]

Semi-seriously, why not port the postgresql sql dialect to use mariadb as the backend? Mariadb (mysql...) has had a robust threaded model and binary redo logs for literally decades.

PostgreSQL reconsiders its process-based model

Posted Jun 19, 2023 19:48 UTC (Mon) by pizza (subscriber, #46) [Link]

> Semi-seriously, why not port the postgresql sql dialect to use mariadb as the backend? Mariadb (mysql...) has had a robust threaded model and binary redo logs for literally decades.

Because it's not Postgresql's "dialect" that matters here, but rather the features and robustness that dialect exposes.

...Mariadb might as well be on another planet in comparison.

PostgreSQL reconsiders its process-based model

Posted Jun 19, 2023 23:19 UTC (Mon) by butlerm (subscriber, #13312) [Link]

I believe the short answer is doing that would be tantamount to the PostgreSQL project throwing away nearly everything they have done for the past couple of decades. In addition, unless MariaDB has made remarkable progress in the past few years it isn't anywhere close to implementing PostgreSQL's full feature set or in particular being able to implement those features in a backward compatible manner with PostgreSQL.

When you get down into the details relational database implementations tend to be remarkably different from each other in terms of more user level aspects (functions, data types, options, apis) than you can count. I think it is safe to say the PostgreSQL developers have not reached quite that level of desperation yet. But if someone wanted to take that on as a software engineering challenge the results would certainly be interesting to read about.

PostgreSQL reconsiders its process-based model

Posted Jun 21, 2023 13:01 UTC (Wed) by Sesse (subscriber, #53779) [Link]

Why do you believe that the primary thing that keeps Postgres users from moving to MariaDB is compatibility with existing code?

PostgreSQL reconsiders its process-based model

Posted Jun 21, 2023 13:54 UTC (Wed) by jhoblitt (subscriber, #77733) [Link]

Because I've been using both for decades, and it is very common for applications to be tied to psql-specific SQL features or extensions such as pgsphere. Mariadb (MySQL) has long had working binary log replication, no need to fiddle with vacuuming, and doesn't require a connection pooler for non-trivial use cases. Many of these administrative concerns have gotten better on the psql side and replication is mercifully no longer a roll-your-own situation but it is still easier to scale Mariadb. The difficulty with deploying psql for non-trivial use cases is why yugabyte/etc. are gaining popularity.

PostgreSQL reconsiders its process-based model

Posted Jun 21, 2023 13:59 UTC (Wed) by Sesse (subscriber, #53779) [Link]

But then you're surely aware that the SQL protocol and dialect is only a small fraction of that? There's no way MariaDB could ever run a Postgres extension, and it wouldn't gain any of the desired SQL features just by having a translation layer.

PostgreSQL reconsiders its process-based model

Posted Jun 21, 2023 14:43 UTC (Wed) by jhoblitt (subscriber, #77733) [Link]

Of course not. I did not suggest a SQL to SQL level translational layer. It is obviously possible to reuse the psql query parser with a different backend storage engine as yugabyte has done this. See https://en.m.wikipedia.org/wiki/YugabyteDB#/media/File%3A...

PostgreSQL reconsiders its process-based model

Posted Jun 21, 2023 14:49 UTC (Wed) by Sesse (subscriber, #53779) [Link]

But there's a _huge_ amount of code between the parser and the storage layer. I could understand your proposal if it were “use PostgreSQL with InnoDB as the storage backend” (there are significant issues with it and it would be a large project, but at least I could understand the proposal), but not “use MariaDB with the PostgreSQL parser” (which sounds basically impossible if you ever want compatibility with real, nontrivial applications).

PostgreSQL reconsiders its process-based model

Posted Jun 27, 2023 1:10 UTC (Tue) by c5h5n5o (guest, #128645) [Link]

> binary redo logs

You probably meant undo logs?

PostgreSQL reconsiders its process-based model

Posted Jun 27, 2023 3:15 UTC (Tue) by jhoblitt (subscriber, #77733) [Link]

PostgreSQL reconsiders its process-based model

Posted Jun 27, 2023 13:59 UTC (Tue) by kleptog (subscriber, #1183) [Link]

In case anyone else was briefly confused, MariaDB has both undo and redo logs. Redo logs are for error recovery, undo logs are for handling rollbacks. PostgreSQL has redo logs (aka Write Ahead Logs aka WAL), but no undo logs. It handles rollbacks via MVCC.

Advantages of undo logs are that outdated data take no space in data files, but accessing outdated data requires special actions and can be a bottleneck for concurrency. MVCC means outdated data stays in place, so no limits on transaction size. But you need something like VACUUM to maintain performance over time.

Actually, one of the most useful features I find with PostgreSQL is that schema changes are transactional. That makes migrations so much easier to manage since you don't have to worry about partial failure. You can run entire scripts changing tables, migrating data, altering foreign keys and if halfway something goes wrong, rollback and you're back in business. Talking to colleagues using MariaDB, schema changes always seem to be extremely painful. (Oracle doesn't support DDL in transactions either, helpfully autocommitting the transaction you were in.)

A PostgreSQL parser on a MariaDB database feels like some kind of frankenmonster I wouldn't touch with a very long barge-pole.

PostgreSQL reconsiders its process-based model

Posted Jun 19, 2023 20:29 UTC (Mon) by flussence (subscriber, #85566) [Link]

Oh this is quite some news. I don't mind early adopting performance features, but…

In Apache httpd I've been using every experimental threaded/event mpm as it becomes available, because the forking model always felt a bit gross to me. But that's software that has had pluggable backends for decades, and even so it's still a bit rough around the edges. I generally trust the Postgres developers to not screw up but I think this kind of change would need two or three major release cycles before I'd feel comfortable turning it on in production.

PostgreSQL reconsiders its process-based model

Posted Jun 20, 2023 11:13 UTC (Tue) by ctg (guest, #3459) [Link]

This is all very deja vu.

Back in the day, University Ingres (from which postgres, then postgresql is derived) went commercial with RTI. Version 6 was a major rewrite - going from the multi-process architecture to a multi-threaded one (and also switched to SQL as the "core" language). It wasn't that pretty. RTI didn't survive. Not saying the two things are linked.

One of the things I like(d) about postgresql was that it still had the original multiprocess model, still recognisable from ingres of the early 1980s.

PostgreSQL reconsiders its process-based model

Posted Jun 23, 2023 2:18 UTC (Fri) by kschendel (subscriber, #20465) [Link]

RTI the company didn't survive, but Ingres certainly did. It moved from a userland, internal-threads model to an OS/posix thread model in the late 90's, and picked up a column store execution engine alternative in the late '00's. Ingres in various incarnations is doing very well, thank you.

PostgreSQL reconsiders its process-based model

Posted Jun 20, 2023 12:00 UTC (Tue) by rrolls (subscriber, #151126) [Link]

A process-per-client model makes sense when you have under a thousand connected clients and they're all coming from goodness knows where: i.e. when you definitely don't want any security bugs that expose state from one client to another, or indeed allow one client to (intentionally or otherwise) cause a denial-of-service to another.

But if you have a large number of connections coming from what is essentially the _same_ client, as we often seem to do in web services for even the simple purpose of running multiple queries at the same time, then that really shouldn't be using multiple processes.

A threaded model works, I suppose, but an event-driven model would be far more ideal. Allow each client to connect once, and give each client its own process - but then allow that client to spawn however many asynchronous tasks it wishes and receive the results incrementally, rather than blocking the whole connection for every operation and thus requiring multiple connections. IIRC, IMAP works like this.

PostgreSQL reconsiders its process-based model

Posted Jun 20, 2023 15:10 UTC (Tue) by atnot (subscriber, #124910) [Link]

This is generally why you'll see these sorts of places run an instance of e.g. pgbouncer to pool many requests over a single process. That makes significantly more effective use of your processes, but it really doesn't solve the scaling issues.

PostgreSQL reconsiders its process-based model

Posted Jun 21, 2023 9:30 UTC (Wed) by ehiggs (subscriber, #90713) [Link]

There will of course be backpressure as you're bound by storage and cpu. An event-driven model with backpressure is basically an actor model. So I'd take your event-driven model and raise you an actor-driven model. :)

PostgreSQL reconsiders its process-based model

Posted Jun 22, 2023 3:48 UTC (Thu) by eklitzke (subscriber, #36426) [Link]

Using an event driven model doesn't really make sense for a database that needs to be portable and is doing blocking file I/O.

PostgreSQL reconsiders its process-based model

Posted Jun 29, 2023 20:48 UTC (Thu) by kevincox (guest, #93938) [Link]

Doesn't io_uring support true async IO for filesystem access now? Maybe it would make sense to transition to that. Definitely a bit early and a risky move but it is an interesting path forward.

Of course in a database you are hoping that a lot of your data is in memory, so maybe the gains wouldn't be nearly as much as with a network service.

PostgreSQL reconsiders its process-based model

Posted Jun 30, 2023 1:30 UTC (Fri) by andresfreund (subscriber, #69562) [Link]

> Doesn't io_uring support true async IO for filesystem access now? Maybe it would make sense to transition to that. Definitely a bit early and a risky move but it is an interesting path forward.

We are working on that....

> Of course in a database you are hoping that a lot of your data is in memory, so maybe the gains wouldn't be nearly as much as with a network service.

You do also need to write data as a database and sometimes that needs to happen in the critical path (e.g. journal commits) of returning to the user. So far it doesn't seem to help a lot on high end local NVMe, but seems quite promising for typical cloud storage.

PostgreSQL reconsiders its process-based model

Posted Jun 30, 2023 4:24 UTC (Fri) by andresfreund (subscriber, #69562) [Link]

( "we" being the postgres project)

PostgreSQL reconsiders its process-based model

Posted Jun 20, 2023 20:50 UTC (Tue) by mokki (subscriber, #33200) [Link]

If TLB overhead with shared memory and locks between co-operating processes is too high, why not try to fix it in kernel?

For example, would something like opt-in sharing of pages between processes that oracle has been trying to get into kernel be the correct option: https://lwn.net/ml/linux-kernel/cover.1682453344.git.khal...

Postmaster would just share the already shared memory between processes (containing also the locks). That explicit part of memory would opt in to thread-like sharing and thus get faster/less TLB switching and lower memory usage, while all the rest of the state would still be per-process and safe.

tl;dr super share the existing shared memory area with kernel patch

All operating systems not supporting it would keep working as is.

PostgreSQL reconsiders its process-based model

Posted Jun 20, 2023 21:19 UTC (Tue) by andresfreund (subscriber, #69562) [Link]

> If TLB overhead with shared memory and locks between co-operating processes is too high, why not try to fix it in kernel?

I think it's not really an OS issue, but a hardware one. To avoid having to flush the TLB during context switches, Linux uses PCIDs on x86-64. During context switches the current logical CPU's PCID is updated to the PCID of the relevant process. But a logical CPU just has a single "active" PCID. I think it's similar on ARM.

But this is a bit outside the area I normally dabble in, so I might be misunderstanding. Or just not know about some newer hardware features linux could utilize.

> For example, would something like opt-in sharing of pages between processes that oracle has been trying to get into kernel be the correct option: https://lwn.net/ml/linux-kernel/cover.1682453344.git.khal...

It'd be nice to have that, to save memory on redundant page table entries for the range of mappings that is going to be the same between all the processes. But I don't think it'd meaningfully improve the TLB hit rate.

PostgreSQL reconsiders its process-based model

Posted Jun 22, 2023 14:42 UTC (Thu) by kleptog (subscriber, #1183) [Link]

This is a big change, I wonder how this will go. For me the process model makes it easier to convince people that when a client connects to a single database then they cannot, even if there are bugs, accidentally access data from another database. Corruption cannot "leak". In some contexts this separation is quite important. I can also imagine that hosting providers are not enthused by different customers connecting to the same cluster being mixed in a single process.

But nothing is final of course. There are of course benefits to be had too. But if this means I need to start managing more clusters then of course it's not helping at all. I guess it's a question of trust though in the end. And I do trust that if the postgres developers release a threaded version, then it will work as advertised.

That said, I wonder if a hybrid scheme is possible, where you can run multiple sessions threaded in parallel in a single process but limited to a single database. Then something like pgbouncer in front of it can handle the multiplexing. You could even add restrictions like: within a single process the GUCs must be the same and they all use the same loadable objects. I feel this would solve a lot of the use-cases where they're worried about the number of simultaneous processes. OTOH might be even more complex.

But whatever they do, the very best of luck.

PostgreSQL reconsiders its process-based model

Posted Jun 22, 2023 16:15 UTC (Thu) by Lennie (subscriber, #49641) [Link]

You do know this process can just connect to an other database ?

Same connection to PostgreSQL:

postgres=# \c testdb
You are now connected to database "testdb" as user "postgres".
testdb=#

PostgreSQL reconsiders its process-based model

Posted Jun 23, 2023 0:18 UTC (Fri) by andresfreund (subscriber, #69562) [Link]

> You do know this process can just connect to an other database ?
> ...
> postgres=# \c testdb

That actually establishes a new connection from within psql and thus connects to a new backend process. You can observe that with SELECT pg_backend_pid();.

PostgreSQL reconsiders its process-based model

Posted Jun 24, 2023 11:26 UTC (Sat) by Lennie (subscriber, #49641) [Link]

Seems I checked it wrong, it's the same TCP connection.

PostgreSQL reconsiders its process-based model

Posted Jun 23, 2023 3:01 UTC (Fri) by willmo (subscriber, #82093) [Link]

> when a client connects to a single database then they cannot, even if there are bugs, accidentally access data from other database.

I think it depends on the bugs, and what you mean by “accidentally”. :-) I don’t think Postgres sandboxes the backend processes to protect against an attacker who gains arbitrary code execution? Still, it seems fair to assume that the practical impact of some bugs in some circumstances would be greater in a threaded model.

PostgreSQL reconsiders its process-based model

Posted Jun 29, 2023 9:04 UTC (Thu) by jmscott (guest, #57432) [Link]

my long answer is NO to threads.

are we actually debating if the threaded model simplifies programming of custom data types?

to paraphrase my dad, threaded programming is a step backwards to the days before virtual memory and not a step forward. hardware support for virtual memory revolutionized management of linear memory ... and i expect the same, new revolution when merging vector processors/GPU with postgresql.


Copyright © 2023, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds