Replication Conflicts and How to Deal with Them — How can replication conflicts occur if the standby server is read-only? Enter vacuums, locks, buffer pins, and more to make life difficult! Luckily, there are various ways to mitigate the problems and Laurenz shares some tips to do so.
Laurenz Albe
phpPgAdmin 7.13.0 Released — Popular Web-based Postgres administration tool. Drops PHP 7.1 support, adds support for Postgres 13 (and 14, provisionally).
Robert Treat
NoisePage: A 'Self Driving' Database Management System — NoisePage is an open source, relational database (that uses a Postgres-compatible wire protocol) out of CMU entirely focused on testing out autonomous deployment and machine learning optimization techniques.
Carnegie Mellon University Database Group
💡 Tip of the Week
Even Faster \COPY by Ryan Booz
Postgres and psql have built-in support for copying (i.e. bulk loading) data TO and FROM files in various formats. It's billed as an easy way to shuffle data around between tables and databases, and rightly so!
With psql, the command is called \COPY, and it makes it easy to load data from both text and binary files that reside on your local workstation.
Given a table named readings with the following columns:
id | time | reading
----|------------------------|---------
1 | 2020-10-01 00:00:00-00 | 12.3
And a CSV file named readings.csv with the following data:
1,2020-10-01 01:00:00-00,15.4
1,2020-10-01 01:30:00-00,18.1
2,2020-10-01 01:00:00-00,20.2
...
We could bulk load thousands (or millions) of records with this \COPY command at the psql prompt:
\COPY readings FROM readings.csv CSV;
One disadvantage of \COPY is that it is single-threaded and doesn't (yet) support batched or parallel copying, so loading a large file can take a significant amount of time.
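To make the batching idea concrete, here is a minimal, illustrative sketch in Python (not Parallel Copy's actual Go source) of how a loader might chop a CSV stream into fixed-size batches before handing each batch to a worker connection. The names (batch_rows, BATCH_SIZE) are our own, and the database step is only described in comments:

```python
# Illustrative sketch of batched loading (assumed names, not from the tool):
# read the CSV in fixed-size batches so each batch can be handed to one of
# several worker connections, each issuing its own COPY ... FROM STDIN.
import csv
import io
from itertools import islice

BATCH_SIZE = 5000  # Parallel Copy's default batch size

def batch_rows(rows, batch_size=BATCH_SIZE):
    """Yield successive lists of at most batch_size rows."""
    it = iter(rows)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch

# In a real loader, each batch would be sent over its own connection with
# COPY readings FROM STDIN. Here we just demonstrate the batching itself,
# using the sample data from above and a tiny batch size of 2.
sample = io.StringIO(
    "1,2020-10-01 01:00:00-00,15.4\n"
    "1,2020-10-01 01:30:00-00,18.1\n"
    "2,2020-10-01 01:00:00-00,20.2\n"
)
batches = list(batch_rows(csv.reader(sample), batch_size=2))
print([len(b) for b in batches])  # → [2, 1]
```

Keeping batches bounded is the key trade-off: each worker commits a modest amount of work per round trip, which lowers per-row transactional overhead without building the whole file in memory.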
Enter Parallel Copy, a free and open source tool written in Go that wraps COPY and performs parallel inserts of batched data. This means that data can be loaded at higher rates with lower transactional overhead.
Copying the same data with Parallel Copy would look like this, including the helpful option of periodic progress updates:
timescaledb-parallel-copy --connection {CONNECTION STRING} --db-name {DATABASE NAME} --table readings --file {PATH TO `readings.csv`} --workers 4 --reporting-period 30s
By default, this copies rows in batches of 5,000.
Note that Parallel Copy was developed primarily for use with TimescaleDB (a time-series database built on top of Postgres), but it works with standard Postgres too.
This week’s tip is not sponsored, but we thank Ryan Booz of Timescale for supplying it to us. If you have a tip we can use, let us know.