PgPool – II Installation & Configuration – (Part-I)

PGPool-II is an open-source proxy software that can be used for the PostgreSQL servers(ProxySQL for MySQL) environment. It lies between the application server/PostgreSQL client and the PostgreSQL database servers and can provide much-needed features like Load Balancing, Query Cache, Connection Pooling, and Watchdog. Each feature requires a detailed explanation with supporting examples. So, in this part let’s discuss the basic setup of PGPool-II with PostgreSQL streaming replication and in the upcoming parts, we will explore more about its features.

History of PGPool-II:

PGPool-II started its life as PGPool in 2003 just as a connection pooling software. Later in 2006, it was released as PGPool-II with many features and with the elimination of many limitations in PGPool. Still, PGPool-II is one of the most popular proxy available for the PostgreSQL servers.

The necessity of PGPool-II:

Apart from the advanced features, PGPool needs for few cases like automatic failover of connections to the backend healthy DB server and maximum connection errors(Too many connections) in the applications. Also, connection pooling reduces the overhead of creating new connections on the database.

Now, let’s start with the setup of PGPool-II for the two PostgreSQL servers(Primary and Secondary) configured with streaming replication. Considering as having the primary and the secondary already which setup has not been explained here.

Server A192.168.33.11PGPool-II
Server B192.168.33.12Primary PG 
Server C192.168.33.13Secondary PG

Below is the architecture of this setup which is planned to achieve.

Installation of PGPool-II :

The following steps have been performed on server A.

  1. Installing the required repo for CentOS 7
yum install -y

2. Installing the PGPool-II

yum install -y pgpool-II-pg12-*

The PostgreSQL version used in this setup is 12, so the pg12* of PGPool-II is used for libraries and extension directories of PostgreSQL 12. Replace step 2 pg12* with the appropriate version of PostgreSQL.

In case of any difficulties, while installing using the package manager or unable to install the version required, it is good to go with source installation. 

  1. Downloading the source code of the version required

2. Extract the downloaded file

tar xf pgpool-II-4.2.1.tar.gz

It will create the directory pgpool-II-4.2.1

3. Install the below packages that need to compile the code

yum install -y

yum install gcc make libpq -y

4. Post successful installation, configure, compile and install it using the below commands



make install

Configuration of PGPool-II:

The configuration file of PGPool-II is /etc/pgpool-II/pgpool.conf

First, creating two users on the primary server 

  1. Monitor user – To perform a health check for the backend databases (only login  is required)
  2. App user – For the application use case (Provide the required privileges based on the application purpose, read or write)


And creating a database for the monitor user

create database monitor;

Make sure these users are replicated to the secondary server and also add an entry on the primary and secondary servers pg_hba.conf file to access the database from pgpool server and reload it.

host    all     all        md5

And reload it to take effect.

select pg_reload_conf();

Preparing the configuration file for the pgpool-II on server A

## Connection Details

socket_dir = '/var/run/postgresql'

## Backend Server Details

# Primary

backend_hostname0 = ''
backend_port0 = 5432
backend_weight0 = 1

# Secondary

backend_hostname1 = ''
backend_port1 = 5432
backend_weight1 = 1

## Load Balancing

load_balance_mode = 'ON'

## Replication Responsibilty

master_slave_mode = 'ON'
master_slave_sub_mode = 'stream'

## Streaming checks

sr_check_period = 10
sr_check_user = 'monitor'
sr_check_password = 'Monitor@321'
sr_check_database = 'monitor'
delay_threshold = 10240

## Client Authentication

allow_clear_text_frontend_auth = ‘ON’

Here explaining the purpose of each variable on the config, it can be modified based on the requirements.

Listen_addresses – From which IP addresses pgpool accepts incoming requests. * means it accepts all requests.

Port – On which port, pgpool wants to listen for client requests. The port here using needs to be used on the application or the client to connect to the database.

Socket_dir – Directory for socket accepting connections 

backend_hostname*, backend_port*, backend_weight* – hostname specifies the PostgreSQL backend server, port specifies the port number of the backend server, weight defines the load balance ratio of the backend server

Here we are having two servers, so the backend variable has id 0 and 1.

backend_hostname0, backend_hostname1. 

In case, if we are adding one more server as the secondary in the future, it needs to be added as backend_hostname2, backend_port2, backend_weight2.


To enable load balancing on incoming select queries to both the backend servers.

master_slave_mode, master_slave_sub_mode

To determine the replication system used for data replication. The value used is stream as it is a streaming replication.

 Streaming checks variables are used to monitor the status of streaming replication using the created monitor user. The variable delay_threshold defines the maximum lag (in bytes) in the secondary server. If the lag exceeds, pgpool won’t serve any select queries to the secondary to avoid reading stale data on the application.

allow_clear_text_frontend_auth – It allows the pgpool to get the password in plain text from the frontend and use it for backend authentication.

Once all the above settings are done, restarting the pgpool takes all the settings into effect.

[root@server1 ~]# service pgpool restart
Redirecting to /bin/systemctl restart pgpool.service
[root@server1 ~]#

Checking the connection using the pgpool from pgpool server

psql -h<pgpool_ip>  -U<appuser> -p<pgpool_port> <db_name>  -c "<query>"
[root@server1 data]# psql -h -Uappuser -p 9999 postgres  -c "select datname from pg_database"
Password for user appuser:
(7 rows)

Now doing some tests and checking the status of the node and the queries distribution on the pgpool.

[root@server1 ~]# watch -n1 "psql -h -Uappuser -p 9999 postgres  -c 'select datname from pg_database'"
[root@aakash2 ~]# psql -h -Uappuser -p 9999 postgres  -c "show pool_nodes"
 node_id |   hostname    | port | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change

 0       | | 5432 | up     | 0.500000  | primary | 21         | false             | 0                 |                   |                        | 2021-01-14 16:00:4
 1       | | 5432 | up     | 0.500000  | standby | 7          | true              | 0                 |                   |                        | 2021-01-14 16:00:4
(2 rows)

From the above output, the node status is online and queries are distributed.

That’s it on the basic pgpool setup with streaming replication. It is easy to add more replicas under it by adding backend_* variables.

Current architecture will be a single point of failure(SPOF) in case server1 (pgpool) goes down. So, in the next blog, we will look on how to achieve high availability on pgpool servers (PGPool Cluster) using watchdog and Virtual IP.

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s