PostgreSQL Streaming Replication: A Comprehensive Guide

This blog was originally published in September 2018 and was updated in Feburary of 2024

Configuring replication between two databases is considered to be the best strategy for achieving high availability during disasters and provides fault tolerance against unexpected failures. PostgreSQL satisfies this requirement through streaming replication. We shall talk about another option called logical replication and logical decoding in our future blog post.

What is PostgreSQL streaming replication?

PostgreSQL streaming replication is a process that replicates data from a primary PostgreSQL database to one or more standby databases in real-time, ensuring that these standby databases mirror the primary database accurately. This replication method ensures high availability and redundancy by keeping the standby servers synchronized with the primary server. It plays a vital role in disaster recovery, load balancing, and reducing downtime for maintenance activities. By continuously replicating data with minimal delay, streaming replication allows for a smooth transition to a standby server with virtually no interruption in case the primary server encounters a failure.

What are the benefits of PostgreSQL streaming replication?

PostgreSQL streaming replication offers a framework for enhancing database resilience, performance, and data safety. This technology underpins critical capabilities businesses rely on for continuous operation and data management.

High availability

Streaming replication ensures that your PostgreSQL databases remain accessible, minimizing downtime and maintaining business continuity. Replicating data to standby servers guarantees that a backup is always ready to take over in case the primary server fails, thus enhancing system reliability. Learn more about deploying PostgreSQL for high availability in our blog post Setting Up and Deploying PostgreSQL for High Availability.

Load balancing

One of the key advantages of streaming replication is its ability to distribute read queries across multiple servers. This maximizes the utilization of your hardware resources and improves application response times, leading to a more efficient and scalable system architecture.

Disaster recovery

In the event of a catastrophic failure, streaming replication is your first line of defense to ensure data is not permanently lost by providing a mechanism for fast recovery, minimizing data loss and operational downtime. Dive deeper into strategies in our eBook PostgreSQL Disaster Recovery.

Real-time Data Warehousing

Streaming replication facilitates real-time data integration into data warehouses, enabling timely insights and decision-making. It supports the replication of live data to a warehouse, where it can be analyzed without impacting the performance of the primary database, thus serving both operational and analytical purposes seamlessly.

Download eBook: Achieving High Availability on PostgreSQL With Open Source Tools

Understanding how streaming replication in PostgreSQL works

Streaming replication in PostgreSQL works on log shipping. Every transaction in postgres is written to a transaction log called WAL (write-ahead log) to achieve durability. A slave uses these WAL segments to continuously replicate changes from its master.

There exists three mandatory processes – wal sender , wal receiver and startup process, these play a major role in achieving streaming replication in postgres.

A wal sender process runs on a master, whereas the wal receiver and startup processes runs on its slave. When you start the replication, a wal receiver process sends the LSN (Log Sequence Number) up until when the WAL data has been replayed on a slave, to the master. And then the wal sender process on master sends the WAL data until the latest LSN starting from the LSN sent by the wal receiver, to the slave. Wal receiver writes the WAL data sent by wal sender to WAL segments. It is the startup process on slave that replays the data written to WAL segment. And then the streaming replication begins.

Note: Log Sequence Number, or LSN, is a pointer to a location in the WAL.

Streaming replication in PostgreSQL between a master and one slave

Step 1:

Create the user in master using whichever slave should connect for streaming the WALs. This user must have REPLICATION ROLE.

CREATE USER replicator
WITH REPLICATION
ENCRYPTED PASSWORD 'replicator';

CREATE USER replicator

WITH REPLICATION

ENCRYPTED PASSWORD 'replicator';

Step 2:

The following parameters on the master are considered as mandatory when setting up streaming replication.

archive_mode : Must be set to ON to enable archiving of WALs.
wal_level : Must be at least set to hot_standby until version 9.5 or replica in the later versions.
max_wal_senders : Must be set to 3 if you are starting with one slave. For every slave, you may add 2 wal senders.
wal_keep_segments : Set the WAL retention in pg_xlog (until PostgreSQL 9.x) and pg_wal (from PostgreSQL 10). Every WAL requires 16MB of space unless you have explicitly modified the WAL segment size. You may start with 100 or more depending on the space and the amount of WAL that could be generated during a backup.
archive_command : This parameter takes a shell command or external programs. It can be a simple copy command to copy the WAL segments to another location or a script that has the logic to archive the WALs to S3 or a remote backup server.
listen_addresses : Specifies which IP interfaces could accept connections. You could specify all the TCP/IP addresses on which the server could listen to connections from client. ‘*’ means all available IP interfaces. The default : localhost allows only local TCP/IP connections to be made to the postgres server.
hot_standby : Must be set to ON on standby/replica and has no effect on the master. However, when you setup your replication, parameters set on the master are automatically copied. This parameter is important to enable READS on slave. Otherwise, you cannot run your SELECT queries against slave.

The above parameters can be set on the master using these commands followed by a restart:

ALTER SYSTEM SET wal_level TO 'hot_standby';
ALTER SYSTEM SET archive_mode TO 'ON';
ALTER SYSTEM SET max_wal_senders TO '5';
ALTER SYSTEM SET wal_keep_segments TO '10';
ALTER SYSTEM SET listen_addresses TO '*';
ALTER SYSTEM SET hot_standby TO 'ON';
ALTER SYSTEM SET archive_command TO 'test ! -f /mnt/server/archivedir/%f && cp %p /mnt/server/archivedir/%f';

ALTER SYSTEM SET wal_level TO 'hot_standby';

ALTER SYSTEM SET archive_mode TO 'ON';

ALTER SYSTEM SET max_wal_senders TO '5';

ALTER SYSTEM SET wal_keep_segments TO '10';

ALTER SYSTEM SET listen_addresses TO '*';

ALTER SYSTEM SET hot_standby TO 'ON';

ALTER SYSTEM SET archive_command TO 'test ! -f /mnt/server/archivedir/%f && cp %p /mnt/server/archivedir/%f';

$ pg_ctl -D $PGDATA restart -mf

1	$ pg_ctl -D $PGDATA restart -mf

Step 3:

Add an entry to pg_hba.conf of the master to allow replication connections from the slave. The default location of pg_hba.conf is the data directory. However, you may modify the location of this file in the file postgresql.conf. In Ubuntu/Debian, pg_hba.conf may be located in the same directory as the postgresql.conf file by default. You can get the location of postgresql.conf in Ubuntu/Debian by calling an OS command => pg_lsclusters.

host replication replicator 192.168.0.28/32 md5

1	host replication replicator 192.168.0.28/32 md5

The IP address mentioned in this line must match the IP address of your slave server. Please change the IP accordingly.

In order to get the changes into effect, issue a SIGHUP:

$ pg_ctl -D $PGDATA reload
Or
$ psql -U postgres -p 5432 -c "select pg_reload_conf()"

$ pg_ctl -D $PGDATA reload

$ psql -U postgres -p 5432 -c "select pg_reload_conf()"

Step 4:

pg_basebackup helps us to stream the data through the wal sender process from the master to a slave to set up replication. You can also take a tar format backup from master and copy that to the slave server. You can read more about tar format pg_basebackup here

The following step can be used to stream data directory from master to slave. This step can be performed on a slave.

$ pg_basebackup -h 192.168.0.28 -U replicator -p 5432 -D $PGDATA -P -Xs -R

1	$ pg_basebackup -h 192.168.0.28 -U replicator -p 5432 -D $PGDATA -P -Xs -R

Please replace the IP address with your master’s IP address.

In the above command, you see an optional argument -R. When you pass -R, it automatically creates a recovery.conf file that contains the role of the DB instance and the details of its master. It is mandatory to create the recovery.conf file on the slave in order to set up a streaming replication. If you are not using the backup type mentioned above, and choose to take a tar format backup on master that can be copied to slave, you must create this recovery.conf file manually. Here are the contents of the recovery.conf file:

$ cat $PGDATA/recovery.conf

standby_mode = 'on'
primary_conninfo = 'host=192.168.0.28 port=5432 user=replicator password=replicator'
restore_command = 'cp /path/to/archive/%f %p'
archive_cleanup_command = 'pg_archivecleanup /path/to/archive %r'

$ cat $PGDATA/recovery.conf

standby_mode = 'on'

primary_conninfo = 'host=192.168.0.28 port=5432 user=replicator password=replicator'

restore_command = 'cp /path/to/archive/%f %p'

archive_cleanup_command = 'pg_archivecleanup /path/to/archive %r'

In the above file, the role of the server is defined by standby_mode. standby_mode must be set to ON for slaves in postgres.
And to stream WAL data, details of the master server are configured using the parameter primary_conninfo .

The two parameters standby_mode and primary_conninfo are automatically created when you use the optional argument -R while taking a pg_basebackup. This recovery.conf file must exist in the data directory($PGDATA) of Slave.

Step 5:

Start your slave once the backup and restore are completed.

If you have configured a backup (remotely) using the streaming method mentioned in Step 4, it just copies all the files and directories to the data directory of the slave. Which means it is both a back up of the master data directory and also provides for restore in a single step.

If you have taken a tar back up from the master and shipped it to the slave, you must unzip/untar the back up to the slave data directory, followed by creating a recovery.conf as mentioned in the previous step. Once done, you may proceed to start your PostgreSQL instance on the slave using the following command.

$ pg_ctl -D $PGDATA start

1	$ pg_ctl -D $PGDATA start

Step 6:

In a production environment, it is always advisable to have the parameter restore_command set appropriately. This parameter takes a shell command (or a script) that can be used to fetch the WAL needed by a slave if the WAL is not available on the master.

For example:

If a network issue has caused a slave to lag behind the master for a substantial time, it is less likely to have those WALs required by the slave available on the master’s pg_xlog or pg_wal location. Hence, it is sensible to archive the WALs to a safe location, and to have the commands that are needed to restore the WAL set to restore_command parameter in the recovery.conf file of your slave. To achieve that, you have to add a line similar to the next example to your recovery.conf file in slave. You may substitute the cp command with a shell command/script or a copy command that helps the slave get the appropriate WALs from the archive location.

restore_command = 'cp /mnt/server/archivedir/%f "%p"'

1	restore_command = 'cp /mnt/server/archivedir/%f "%p"'

Setting the above parameter requires a restart and cannot be done online.

Final step: validate that postgresql replication is setup

As discussed earlier, a wal sender and a wal receiver process are started on the master and the slave after setting up replication. Check for these processes on both master and slave using the following commands.

On Master
==========
$ ps -eaf | grep sender

On Slave
==========
$ ps -eaf | grep receiver
$ ps -eaf | grep startup

On Master

==========

$ ps -eaf | grep sender

On Slave

==========

$ ps -eaf | grep receiver

$ ps -eaf | grep startup

You must see those all three processes running on master and slave as you see in the following example log.

On Master
=========
$ ps -eaf | grep sender
postgres  1287  1268  0 10:40 ?        00:00:00 postgres: wal sender process replicator 192.168.0.28(36924) streaming 0/50000D68

On Slave
=========
$ ps -eaf | egrep "receiver|startup"
postgres  1251  1249  0 10:40 ?        00:00:00 postgres: startup process   recovering 000000010000000000000050
postgres  1255  1249  0 10:40 ?        00:00:04 postgres: wal receiver process   streaming 0/50000D68

On Master

=========

$ ps -eaf | grep sender

postgres 1287 1268 0 10:40 ? 00:00:00 postgres: wal sender process replicator 192.168.0.28(36924) streaming 0/50000D68

On Slave

=========

$ ps -eaf | egrep "receiver|startup"

postgres 1251 1249 0 10:40 ? 00:00:00 postgres: startup process recovering 000000010000000000000050

postgres 1255 1249 0 10:40 ? 00:00:04 postgres: wal receiver process streaming 0/50000D68

You can see more details by querying the master’s pg_stat_replication view.

$ psql
postgres=# x
Expanded display is on.
postgres=# select * from pg_stat_replication;
-[ RECORD 1 ]----+------------------------------
pid              | 1287
usesysid         | 24615
usename          | replicator
application_name | walreceiver
client_addr      | 192.168.0.28
client_hostname  | 
client_port      | 36924
backend_start    | 2018-09-07 10:40:48.074496-04
backend_xmin     | 
state            | streaming
sent_lsn         | 0/50000D68
write_lsn        | 0/50000D68
flush_lsn        | 0/50000D68
replay_lsn       | 0/50000D68
write_lag        | 
flush_lag        | 
replay_lag       | 
sync_priority    | 0
sync_state       | async

$ psql

postgres=# x

Expanded display is on.

postgres=# select * from pg_stat_replication;

-[ RECORD 1 ]----+------------------------------

pid | 1287

usesysid | 24615

usename | replicator

application_name | walreceiver

client_addr | 192.168.0.28

client_hostname |

client_port | 36924

backend_start | 2018-09-07 10:40:48.074496-04

backend_xmin |

state | streaming

sent_lsn | 0/50000D68

write_lsn | 0/50000D68

flush_lsn | 0/50000D68

replay_lsn | 0/50000D68

write_lag |

flush_lag |

replay_lag |

sync_priority | 0

sync_state | async

Reference: https://www.postgresql.org/docs/10/static/warm-standby.html#STANDBY-SERVER-SETUP

Performance considerations for PostgreSQL streaming replication

When implementing PostgreSQL streaming replication, it is critical to understand the performance consequences and optimization options. While this replication strategy improves availability and disaster recovery, it also adds challenges that must be addressed in order to maintain system efficiency.

Streaming replication has a low impact on the primary server’s performance but is not negligible. The primary server must handle the increased load of sending write-ahead logs (WAL) to standby servers. The process is efficient but does require network bandwidth and resources, which may compromise the primary’s speed, particularly under heavy write loads.

To address these challenges and enhance replication efficiency, it’s crucial to fine-tune the replication configuration. Making precise adjustments, including setting the WAL segment size, optimizing the WAL buffer, and configuring suitable replication timeouts, can greatly enhance the efficiency of replication and lighten the workload on the primary server. In addition, employing features such as delayed replication and replication slots significantly optimize resource consumption and maintain consistent data across servers.

Distributing read and write operations across servers is a fundamental strategy for managing a replicated PostgreSQL setup. By directing read queries to standby servers through streaming replication, the primary server’s load is reduced, enhancing the system’s performance. It necessitates meticulous planning and setup to route read operations to the most suitable server, taking into account variables such as replication delay and the capacity of each server. Achieving this balance is crucial for maximizing performance and scalability within a replicated PostgreSQL environment, enabling the system to accommodate increasing volumes of data and user requests efficiently.

5 tips for optimizing PostgreSQL streaming replication

To ensure your PostgreSQL streaming replication runs smoothly and efficiently, incorporating specific optimization strategies is essential. These tips can help you enhance performance, reduce replication lag, and ensure the high availability and reliability of your database system.

1. Optimize WAL configuration

Optimizing the Write-Ahead Logging (WAL) system, vital for ensuring data durability and facilitating replication in PostgreSQL, requires fine-tuning the configuration to balance performance efficiency and disk space utilization. By adjusting settings like wal_buffers, wal_writer_delay, and max_wal_size, you can enhance the efficiency of write operations and the speed of replication. Allocating a dedicated disk for WAL files can also decrease I/O contention, thereby boosting overall system performance.

2. Network optimization

For reduced latency and enhanced throughput, it’s crucial to configure and optimize your network specifically for replication traffic. Strategies may involve dedicating network interfaces to handle replication data, tweaking network configurations to support bulk data transfers efficiently, and maintaining dependable, high-speed connections between your primary and standby servers.

3. Monitoring and adjusting replication slots

Replication slots are used to safeguard against the premature deletion of WAL segments before they have been applied by the standby server, thus preventing data loss. By keeping an eye on and fine-tuning these slots, you can better control disk space usage on the primary server and avert replication delays. It’s advisable to periodically assess and modify the quantity of replication slots and their configurations, tailoring them to your specific replication requirements and the capabilities of your standby servers.

4. Manage replication lag

Replication delay can influence the timeliness of data on standby servers, potentially affecting read scalability and failover operations. To control replication delay, tracking delay indicators using resources like pg_stat_replication is critical. Implement strategies such as using faster hardware, optimizing queries, and adjusting WAL and network configurations to minimize lag. Consider using delayed replication to balance data freshness and availability in scenarios where lag is unavoidable.

5. Load balancing and read scaling

Efficiently routing read queries to standby servers can decrease the workload on the primary server and elevate the system’s overall efficiency. By incorporating load balancing through modifications at the application level or utilizing external utilities, you can ensure that standby servers process read-only queries. This strategy boosts system performance and facilitates read scaling, making it possible to manage an increased volume of queries by expanding the number of standby servers.

Get highly available PostgreSQL from Percona

Did you know that Percona now provides PostgreSQL support services? We’re here to help. Ensuring PostgreSQL is ready for enterprise deployment extends beyond mere software installation. Our comprehensive services range from daily support to specialized consulting for intricate performance issues and design obstacles, providing the professional know-how necessary for operating PostgreSQL in mission-critical and production settings.

Want to learn more about PostgreSQL streaming replication? Check out this eBook from Percona that covers high availability solutions, technical details, and practical deployment strategies to enhance database performance and reliability. This resource is invaluable for ensuring your PostgreSQL environments are robust and efficient. Download the eBook today for FREE.

Download eBook

FAQs

What is PostgreSQL streaming replication?

PostgreSQL streaming replication is a method for copying and synchronizing data from a primary database to one or more secondary databases in real-time, allowing for continuous data replication and ensuring high availability and data redundancy.

What are the benefits of using streaming replication in PostgreSQL?

Benefits include improved data availability and disaster recovery, load balancing for read operations across multiple servers, and minimal downtime during maintenance or unexpected failures.

How do you set up streaming replication in PostgreSQL?

Streaming replication requires setting up a primary PostgreSQL server along with one or more standby servers. Essential steps include modifying the primary server’s settings to permit connections from the standby servers, establishing authentication measures, preparing the standby servers with a base backup from the primary, and initiating the replication process.

What is the difference between synchronous and asynchronous replication in PostgreSQL?

In synchronous replication, transactions must be confirmed by both the primary and a specified number of standby servers before they are considered committed, ensuring data consistency but potentially impacting performance. Asynchronous replication allows transactions to be completed without waiting for standby servers, offering better performance but at the risk of data loss if the primary server fails before the standby is updated.

How can I monitor the performance of streaming replication in PostgreSQL?

Streaming replication performance can be monitored using various tools like Percona Monitoring and Management and PostgreSQL’s built-in functions. Key metrics include replication lag, transaction throughput, and resource usage on both primary and standby servers. Tools like pg_stat_replication provide real-time insights into the replication process.

22 Comments

Oldest

Newest Most Voted

Inline Feedbacks

View all comments

Douglas Hunley

5 years ago

You’re conflating two different, albeit related, things. WAL shipping is not strictly needed for streaming replication. Which means that ‘archive_mode’ , ‘archive_command’ , and ‘restore_command’ are not necessary. Yes, they are good practice and *should* be used, but they are not necessary for streaming replication. Your comments about needing 3 wal_sender processes is also misleading. You need two because you are using ‘-Xs’ in the pg_basebackup command. The third won’t be used during that process. If you used ‘-Xf’ (not recommended) you’d only need 1 wal_sender. And once the slave is up and running, it will only ever use 1.

avivallarapu

Reply to Douglas Hunley

5 years ago

Hi Douglas. Yes. Thats true. You may not need archive_mode or archive_command or restore_command for streaming replication. But, when you setup streaming replication and due to a network lag or whatever reason, if the Slave is falling behind, also if the WALs in pg_xlog or pg_wal are recycled, without archiving the WALs the slave can never get back to sync with Master. Thus, we made sure to ensure we suggest the best practices while building streaming replication, rather than just building it. In every Production environment, these parameters are a must. Also archiving of WALs ensure a better backup strategy. So, you consider all these factors while setting up a replica. And regarding the wal sender process being set to 3, while building a slave, if you are using ‘-Xs’ to stream WALs while streaming data directory, you need 2 wal sender processes. At the same time, if there is a backup job running on Master due to whatever reason, you need another wal sender process. It should not in fact hurt setting additional wal sender processes. I usually recommend 2 wal sender processes per each slave. And 1 dedicated to Master for many obvious reasons.

Chk Pcs

Reply to avivallarapu

5 years ago

Hi Avivallarapu
I’m really newbie to PostgreSQL, I’m confusing
Do we need to enabled archive_mode = on for streaming replication?

https://www.postgresql.org/docs/9.5/warm-standby.html

If you use streaming replication without file-based continuous archiving, the server might recycle old WAL segments before the standby has received them. If this occurs, the standby will need to be reinitialized from a new base backup. You can avoid this by setting wal_keep_segments to a value large enough to ensure that WAL segments are not recycled too early, or by configuring a replication slot for the standby. If you set up a WAL archive that’s accessible from the standby, these solutions are not required, since the standby can always use the archive to catch up provided it retains enough segments.

cmennens

5 years ago

When you create the replicator role, you MUST specify WITH LOGIN REPLICATION or you can’t do a pg_basebackup from the slave…

Avinash Vallarapu

Author

Reply to cmennens

5 years ago

You need to specify “WITH LOGIN” only when you use CREATE ROLE. As we have used CREATE USER … , it automatically assign LOGIN role to the replication user -> replicator (here in this post).

Krishna

5 years ago

HI Avinash,

Thanks for the detailed documentation, it helped me alot in my academic project.

I have tried the same on Windows environment and i am struggling to start SLAVE node (as i have moved MASTER node data to SLAVE). When I use the original data folder, able to start the SLAVE node, but when I do all the configurations, I am facing a challenge as shown below.

Error:
2018-12-24 11:17:31.235 IST [13104] FATAL: database system identifier differs between the primary and standby
2018-12-24 11:17:31.235 IST [13104] DETAIL: The primary’s identifier is 6637748246234064208, the standby’s identifier is 6637788080756493616.

Avinash Vallarapu

Author

Reply to Krishna

5 years ago

Hi Krishna, Hope you issue is resolved. You may not have used the data directory that is copied from Master. initdb generates a system identifier and the same gets copied to Slave. So, something went wrong with the copy.

Priit

5 years ago

Hi, thanks for the great guide. But I have question regarding to physical replication slot. Is it useful to create this slot also or what are you thoughts ? I see in earlier versions it was used and the Definition itself says also, ” Replication slots are a crash-safe data structure which can be created
on either a master or a standby to prevent premature removal of
write-ahead log segments needed by a standby, as well as (with
hot_standby_feedback=on) pruning of tuples whose removal would cause
replication conflicts. “

Avinash Vallarapu

Author

Reply to Priit

5 years ago

It would help too. You may read more on it in this blog post : https://www.percona.com/blog/2018/11/30/postgresql-streaming-physical-replication-with-slots/

Sergey Gavrilov

5 years ago

Thank you Avinash Vallarapu! Nice explanation! It really save my day! Keep going in that way!))

Avinash Vallarapu

Author

Reply to Sergey Gavrilov

5 years ago

Thank you Sergey. Good to know.

Kavita

4 years ago

Hello Avinash, have you setup streaming replication in postgreSQL running in docker environment?

avivallarapu

Reply to Kavita

4 years ago

Should not be much different but i can surely think of a post on it Kavita.

Balasubramanian M P

4 years ago

Hi , i have configured for Data replication but i dont see the process from select * from pg_stat_replication; , below is the output of the command,

My database name is na but recovery.conf file got created under -P directory from /opt/postgres/data/-P

[root@meylvhpnaap02 data]# pwd
/opt/postgres/data

Please help on this

avivallarapu

Reply to Balasubramanian M P

4 years ago

You see any errors in the PG logs of both Master and Standby ?

geno

4 years ago

“listen_addresses : Set it to * or the range of IP Addresses that need to be whitelisted to connect to your master PostgreSQL server. Your slave IP should be whitelisted too, else, the slave cannot connect to the master to replicate/replay WALs.”

1.
I dont think the word “whitelisted” is appropriate here. It implies something will be blocking the IP address unless specified.
Instead, your telling postgres which addresses to listen on for incoming connections.

2.
You are definately NOT adding the slave’s IP address in the ‘listen_addresses’ list on the master.
Go read the documentation again.

Otherwise, thanks for the article

Shubha

4 years ago

Hi Avinash, This is an excellent write up!
I have a question on creating a new user for replication. Why cant we use an existing user for replication job as well?
Every blog I have come across suggests creating a new user for Replication.
I would really appreciate your insight on this.

Thanks!

avivallarapu

Reply to Shubha

4 years ago

You can definitely use an existing user as well. The user just needs to have the REPLICATION ROLE. Sometime we see user using postgres, the superuser to setup replication. It works, but not recommended as you do not really a superuser that has privileges to perform any possible action on the database, just to setup replication.

Tony Libbrecht

4 years ago

Hi
A number of months ago, I have been setting up streaming replication between 2 locations.
Following your blog, setup went well and streaming worked well all the time. (Postgre 9.5).

Now an issue happened on master, and streaming stopped.
Therefore I want to take base backup on slave again and reinitiate streaming from scratch.
It is not important that I might lose some data. Just want to start over.

Question : on master, currently there are a number of wal files remaining in pg_xlog folder.
Do I need to remove the pg_xlog files on master, before taking pg_basebackup on slave ?

Thank you

-1

avivallarapu

Reply to Tony Libbrecht

4 years ago

Hi Tony, Great to hear that the blog post helped you. You don’t have to remove those WAL segments from the pg_xlog of the Master.

Push

4 years ago

Does it fail over in case master goes down ?

zifnab

3 years ago

Hi Balasubramanian M P,
i guess $PGDATA is empty. You can check it with: echo $PGDATA.
Your screenshot shows you did the command with user root. Try user postgres.

MySQL 5.7 End of Life

Compare Percona to Leading Database Solutions

Software Downloads

Product Documentation

Resource Hub

Financial Services

Driving Database Success

Percona Blog

Percona Community Hub

Percona Events Hub

About Percona

Percona in the News

Our Customers

Our Partners

Careers

Contact Us

Understanding Replication in PostgreSQL – How to Set Up PostgreSQL Streaming Replication

What is PostgreSQL streaming replication?

What are the benefits of PostgreSQL streaming replication?

High availability

Load balancing

Disaster recovery

Real-time Data Warehousing

Understanding how streaming replication in PostgreSQL works

Streaming replication in PostgreSQL between a master and one slave

Step 1:

Step 2:

Step 3:

Step 4:

Step 5:

Step 6:

Final step: validate that postgresql replication is setup

Performance considerations for PostgreSQL streaming replication

5 tips for optimizing PostgreSQL streaming replication

1. Optimize WAL configuration

2. Network optimization

3. Monitoring and adjusting replication slots

4. Manage replication lag

5. Load balancing and read scaling

Get highly available PostgreSQL from Percona

FAQs

What is PostgreSQL streaming replication?

What are the benefits of using streaming replication in PostgreSQL?

How do you set up streaming replication in PostgreSQL?

What is the difference between synchronous and asynchronous replication in PostgreSQL?

How can I monitor the performance of streaming replication in PostgreSQL?

Related

Share This Post!

Want to get weekly updates listing the latest blog posts?

Related Blog Articles

RECOMMENDED ARTICLES

Why MariaDB Is “Better” Than MySQL

PostgreSQL Database Security Best Practices

Did MyDumper LIKE Triggers?

MOST POPULAR ARTICLES

Auditing login attempts in MySQL

Deploy Django on Kubernetes With Percona Operator for PostgreSQL

MySQL “Got an error reading communication packet”

MySQL 5.7
End of Life

Software
Downloads

Product
Documentation