Postgres 14 Sharding with Citus

Postgres sharding with Citus is designed to horizontally scale PostgreSQL across multiple nodes. Citus extends PostgreSQL by adding the ability to distribute tables and queries across a cluster of servers.

Tables are horizontally partitioned into smaller, manageable shards that reside on different nodes. Each node contains a subset of the data and Citus intelligently routes queries to the appropriate nodes.

This sharding architecture enhances both read and write scalability, making it well-suited for applications with growing data volumes and demanding workloads.
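For a rough sense of what this looks like in practice (using the orders table and shard_key column that are created later in this guide), a table is distributed on a chosen column and then queried as usual; Citus routes a query that filters on that column to the single worker holding the matching shard:

-- Distribute a table on a chosen column (full walkthrough in Step 3 below)
SELECT create_distributed_table('orders', 'shard_key');

-- Citus routes this query to the one shard (and worker) that owns key 42
SELECT count(*) FROM orders WHERE shard_key = 42;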

________________________ Step-by-Step Instructions to Set Up Postgres Sharding ______________________________

  1. Create OL8 or RHEL8 instances and run the below commands on all nodes:

a. SSH into all the instances and configure them as below:

sudo dnf module list postgresql

sudo yum -y install gnupg2 wget vim tar zlib openssl

sudo dnf install https://download.postgresql.org/pub/repos/yum/reporpms/EL-8-x86_64/pgdg-redhat-repo-latest.noarch.rpm

sudo yum -qy module disable postgresql

sudo yum install postgresql14-server -y

sudo yum install postgresql14-contrib -y

## Due to policies for Red Hat family distributions, the PostgreSQL installation will not be enabled for automatic start or have the database initialized automatically

sudo systemctl enable postgresql-14

sudo postgresql-14-setup initdb

sudo systemctl start postgresql-14

sudo systemctl status postgresql-14 
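
At this point it can be worth confirming that the service is running PostgreSQL 14 as expected; a quick check along these lines should print the server version (exact output will vary):

sudo -i -u postgres psql -c "SELECT version();"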

b. Enable the postgres user and set the superuser password

sudo -iu postgres

psql -c "ALTER USER postgres WITH PASSWORD 'RAbbithole1234#_';"

exit

c. Install the Citus community edition binaries and create the extension

# Add Citus repository for package manager

curl https://install.citusdata.com/community/rpm.sh | sudo bash

sudo yum install -y citus121_14
#Preload Citus and pg_stat_statements extensions on all Nodes

sudo -iu postgres

psql -U postgres -c 'SHOW config_file'

              config_file
----------------------------------------
 /var/lib/pgsql/14/data/postgresql.conf
(1 row)

vim /var/lib/pgsql/14/data/postgresql.conf

## Add below entry and uncomment 'shared_preload_libraries'

shared_preload_libraries = 'citus,pg_stat_statements'

## Note that “citus” has to be the first extension in the list. Otherwise, the server won’t start.

exit

sudo systemctl restart postgresql-14

sudo systemctl status postgresql-14


# Enable auto-start of Postgres 14 server when the server reboots

sudo chkconfig postgresql-14 on

sudo -i -u postgres psql -c "CREATE EXTENSION citus;"
sudo -i -u postgres psql -c "CREATE EXTENSION pg_stat_statements;"

d. Configure connection and authentication

sudo -iu postgres

vim /var/lib/pgsql/14/data/postgresql.conf

# Uncomment listen_addresses and set it as below
listen_addresses = '*'

# Uncomment and change wal_level = 'logical'
wal_level = 'logical'
vim /var/lib/pgsql/14/data/pg_hba.conf

# Change this line to allow all hosts in 10.180.2.0/24 with trust
## Important Note: 10.180.2.0/24 is the subnet in which the instances reside. The subnet should have egress and ingress rules for the Postgres port. Alternatively, instead of a password-less (trust) setup, you can store the password in a .pgpass file on all nodes and use a normal authentication method. ##

# IPv4 local connections:
host    all             all             10.180.2.0/24           trust

exit

sudo systemctl restart postgresql-14

sudo systemctl status postgresql-14

## Whitelist Postgres Port##

sudo firewall-cmd --list-ports
sudo firewall-cmd --zone=public --permanent --add-port=5432/tcp
sudo firewall-cmd --reload
sudo firewall-cmd --list-ports
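
To sanity-check that the port is open and the trust entry works, you can try connecting from any other node in the same subnet (replace 10.180.2.45 with whichever node you are testing against):

psql -h 10.180.2.45 -p 5432 -U postgres -c "SELECT 1;"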

I’ve created a small automation script to perform the above steps. Save it as a .sh file, change the parameters according to your Postgres and Citus versions, and simply execute it on all the nodes:

#!/bin/bash

# Function to print commands and exit on failure
function run_command() {
    echo "$ $1"
    eval "$1"
    if [ $? -ne 0 ]; then
        echo "Error executing command. Exiting."
        exit 1
    fi
}

# Step 1: Install Postgres 14 Server on all Nodes
run_command "sudo dnf module list postgresql"
run_command "sudo yum -y install gnupg2 wget vim tar zlib openssl"
run_command "sudo dnf install -y https://download.postgresql.org/pub/repos/yum/reporpms/EL-8-x86_64/pgdg-redhat-repo-latest.noarch.rpm"
run_command "sudo yum -qy module disable postgresql"
run_command "sudo yum install postgresql14-server -y"
run_command "sudo yum install postgresql14-contrib -y"
run_command "sudo systemctl enable postgresql-14"

# Check if the data directory is empty
if [ -z "$(sudo -i -u postgres ls -A /var/lib/pgsql/14/data)" ]; then
    run_command "sudo postgresql-14-setup initdb"
else
    echo "Data directory is not empty. Skipping initialization."
fi

run_command "sudo systemctl start postgresql-14"
run_command "sudo chkconfig postgresql-14 on"

# Step 2: Enable Postgres user on all Nodes and set superuser password
run_command "sudo -i -u postgres psql -c \"ALTER USER postgres WITH PASSWORD 'YOurPassword1234#_';\""

# Step 3: Install Citus on all Nodes
run_command "curl https://install.citusdata.com/community/rpm.sh | sudo bash"
run_command "sudo yum install -y citus121_14"

# Step 4: Preload Citus and pg_stat_statements extensions on all Nodes
run_command "sudo -i -u postgres psql -U postgres -c 'SHOW config_file'"
run_command "sudo -i -u postgres sed -i -E 's/^#?(listen_addresses[ \t]*=[ \t]*).*/\1'\''*'\''/' /var/lib/pgsql/14/data/postgresql.conf"
run_command "sudo -i -u postgres sed -i -E 's/^#?(shared_preload_libraries[ \t]*=[ \t]*).*/\1'\''citus,pg_stat_statements'\''/' /var/lib/pgsql/14/data/postgresql.conf"
run_command "sudo -i -u postgres sed -i -E 's/^#?(wal_level[ \t]*=[ \t]*).*/\1'\''logical'\''/' /var/lib/pgsql/14/data/postgresql.conf"
run_command "sudo -i -u postgres sed -i -E '/^# IPv4 local connections:/ { n; s/^(host[ \t]*all[ \t]*all[ \t]*)127.0.0.1\/32[ \t]*scram-sha-256$/\10.0.0.0\/0           trust/ }' /var/lib/pgsql/14/data/pg_hba.conf"

# Step 5: Configure connection and authentication on all Nodes
run_command "sudo systemctl restart postgresql-14"
run_command "sudo firewall-cmd --list-ports"
run_command "sudo firewall-cmd --zone=public --permanent --add-port=5432/tcp"
run_command "sudo firewall-cmd --reload"
run_command "sudo firewall-cmd --list-ports"

# Step 6: Create Citus extension on all Nodes
run_command "sudo -i -u postgres psql -c \"CREATE EXTENSION citus;\""
run_command "sudo -i -u postgres psql -c \"CREATE EXTENSION pg_stat_statements;\""

echo "Script execution completed successfully."



2. Create Co-ordinator and Worker nodes

We have now prepared 3 instances in total for sharding. Step 1 should have been performed on all of the below instances:

IP              HOSTNAME                        ROLE
10.180.2.45     Postgres-Citus-Coordinator      Coordinator Node
10.180.2.198    Postgres-Citus-Worker-Node-1    Worker Node
10.180.2.86     Postgres-Citus-Worker-Node-2    Worker Node

SSH into the Co-ordinator node and run the below commands on it:

ssh opc@10.180.2.45

# Add co-ordinator node
sudo -i -u postgres psql -c "SELECT citus_set_coordinator_host('10.180.2.45', 5432);"
 
# Add Worker Nodes
sudo -i -u postgres psql -c "SELECT * from citus_add_node('10.180.2.198', 5432);"
sudo -i -u postgres psql -c "SELECT * from citus_add_node('10.180.2.86', 5432);"


# Check Active Worker Nodes 
sudo -i -u postgres psql -c "SELECT * FROM citus_get_active_worker_nodes();"

  node_name   | node_port
--------------+-----------
 10.180.2.198 |      5432
 10.180.2.86  |      5432
(2 rows)
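
The coordinator itself is not listed by citus_get_active_worker_nodes(); to see every node in the cluster along with its role, pg_dist_node can be queried directly:

sudo -i -u postgres psql -c "SELECT nodeid, nodename, nodeport, noderole, isactive FROM pg_dist_node;"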



3. Create a Distributed table

All steps below are to be executed from the Co-ordinator node:

CREATE TABLE orders (
order_id    bigserial, 
shard_key   int PRIMARY KEY, 
n           int, 
description char(100) DEFAULT 'x');

# Create Index to further optimize the SQL performance 
CREATE UNIQUE INDEX shard_key_idx on orders (shard_key);

# Add Distributed table
SELECT create_distributed_table('orders', 'shard_key');
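
The call above splits orders into the default number of shards (32, unless citus.shard_count has been changed). To check the current setting, or to pick a different shard count for another table, something like the following works (orders_v2 is just a hypothetical example table):

SHOW citus.shard_count;
-- Optionally choose the shard count per table when distributing (Citus 11+); 'orders_v2' is hypothetical
SELECT create_distributed_table('orders_v2', 'shard_key', shard_count := 48);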

\timing

# Generate 5 Million rows
INSERT INTO orders (shard_key, n, description)
SELECT 
    id AS shard_key,
    (random() * 1000000)::int AS n,
    'x' AS description
FROM generate_series(1, 5000000) AS id
ON CONFLICT DO NOTHING;


# Check the size of the table using the Citus view (citus_tables) and not the standard Postgres commands
\x
SELECT * FROM citus_tables ;

# Check Explain plan of Query
\x
explain (analyze, buffers, timing) SELECT count(*) from orders;
explain (analyze, buffers, timing) SELECT count(*) from orders where shard_key=2 ;
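
To see exactly which shard, and therefore which worker, a particular shard_key value lands on, the shard id can be looked up and matched against citus_shards (the value 2 here just mirrors the query above):

SELECT get_shard_id_for_distribution_column('orders', 2);
SELECT shardid, nodename, nodeport FROM citus_shards WHERE table_name = 'orders'::regclass;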

4. Add another node by performing all commands in Step 1 on it, then add it to the cluster

IP : 10.180.2.17

Run from the Co-ordinator node

sudo -i -u postgres psql -c "SELECT * from citus_add_node('10.180.2.17', 5432);"

sudo -i -u postgres psql -c "SELECT * FROM citus_get_active_worker_nodes();"

  node_name   | node_port
--------------+-----------
 10.180.2.198 |      5432
 10.180.2.86  |      5432
 10.180.2.17  |      5432
(3 rows)

# Add a .pgpass file on the co-ordinator node with the DB details in the format hostname:port:database:username:password
vim /var/lib/pgsql/.pgpass

localhost:5432:postgres:postgres:YOurPassword1234#_

chmod 600 /var/lib/pgsql/.pgpass

# Re-balance the shards without downtime
psql -U postgres -h localhost

ALTER SYSTEM SET citus.max_background_task_executors_per_node = 2;
SELECT pg_reload_conf();
SELECT citus_rebalance_start();

NOTICE:  Scheduled 10 moves as job 1
DETAIL:  Rebalance scheduled as background job
HINT:  To monitor progress, run: SELECT * FROM citus_rebalance_status();
 citus_rebalance_start
-----------------------
                     1

#Check Status of rebalancing
SELECT * FROM citus_rebalance_status();

 job_id |  state  | job_type  |           description           |          started_at           | finished_at |                     details
--------+---------+-----------+---------------------------------+-------------------------------+-------------+--------------------------------------------------
      1 | running | rebalance | Rebalance all colocation groups | 2023-12-24 09:44:16.813663+00 |             | {"tasks": [{"LSN": {"lag": null, "source": "0/371A5128", "target": null}, "size": {"source": "29 MB", "target": "26 MB"}, "hosts": {"source": "10.180.2.198:5432", "target": "10.180.2.17:5432"}, "phase": "Catching Up", "state": "running", "command": "SELECT pg_catalog.citus_move_shard_placement(102012,2,4,'auto')", "message": "", "retried": 0, "task_id": 4}], "task_state_counts": {"done": 3, "blocked": 6, "running": 1}}
(1 row)

#Once completed the output will be as below :

SELECT * FROM citus_rebalance_status();

 job_id |  state   | job_type  |           description           |          started_at           |          finished_at          |                     details
--------+----------+-----------+---------------------------------+-------------------------------+-------------------------------+--------------------------------------------------
      1 | finished | rebalance | Rebalance all colocation groups | 2023-12-24 09:44:16.813663+00 | 2023-12-24 10:18:24.886163+00 | {"tasks": [], "task_state_counts": {"done": 10}}
(1 row)

# Check the Shard views
SELECT * from pg_dist_shard;
SELECT * FROM citus_shards;

#Misc rebalancing SQL queries
select get_rebalance_table_shards_plan();
SELECT citus_set_default_rebalance_strategy('by_disk_size');
SELECT * from citus_remote_connection_stats();
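
Two related helpers available in recent Citus releases (11.2 and later) can also be handy when rebalancing from the co-ordinator:

SELECT citus_rebalance_wait();   -- block until the background rebalance job finishes
SELECT citus_rebalance_stop();   -- cancel a running rebalance job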


PostgreSQL 14 Streaming Replication on Oracle Cloud Infrastructure (OCI) VM

Introduction

PostgreSQL 14 is a powerful and feature-rich open-source relational database management system.

In this guide, we’ll walk through the process of installing and configuring PostgreSQL 14 on Oracle Cloud Infrastructure (OCI) with Oracle Linux 8. The setup includes one master node and two slave nodes, forming a streaming replication setup.

Note: The words slave and replica are used interchangeably in this article when referring to any node that is not the master node.

OS– Oracle Linux 8

PostgreSQL Version – 14.10

1 Master Node – IP- 10.180.2.102

2 Slave Nodes – IPs- 10.180.2.152, 10.180.2.58

3 Node PostgreSQL 14 Cluster on OCI

You can create a DR architecture using streaming replication. Put 1 replica in the same region and 2 additional replicas in another OCI region. The VCNs in both OCI regions have to be remotely peered using a DRG, and all routes should permit traffic between the different subnets and allow communication over port 5432. You can refer to this article on how to configure VCN remote peering on OCI: https://docs.oracle.com/en-us/iaas/Content/Network/Tasks/scenario_e.htm

4-Node Cross-Region PostgreSQL 14 Cluster on OCI

Step 1: Installing PostgreSQL 14 on Master and Slave Nodes

Start by updating the system and installing necessary dependencies on both the master and slave nodes:

sudo dnf update -y

sudo dnf module list postgresql

sudo yum -y install gnupg2 wget vim tar zlib openssl

sudo dnf install https://download.postgresql.org/pub/repos/yum/reporpms/EL-8-x86_64/pgdg-redhat-repo-latest.noarch.rpm

sudo yum -qy module disable postgresql

sudo yum install postgresql14-server -y

sudo yum install postgresql14-contrib -y

sudo systemctl enable postgresql-14

sudo postgresql-14-setup initdb

sudo systemctl start postgresql-14

sudo systemctl status postgresql-14

Step 2: Enabling Postgres User and Streaming Replication

Enable the Postgres user and configure streaming replication on both the master and slave nodes:

sudo -iu postgres

psql -c "ALTER USER postgres WITH PASSWORD 'RAbbithole1234#_';"

tree -L 1 /var/lib/pgsql/14/data

psql -U postgres -c 'SHOW config_file'

              config_file
----------------------------------------
 /var/lib/pgsql/14/data/postgresql.conf
(1 row)

Step 3: Configuring pg_hba.conf and Firewall Settings

Update the pg_hba.conf file on both the master and slave nodes to allow connections and adjust firewall settings:

sudo -iu postgres

vim /var/lib/pgsql/14/data/pg_hba.conf

# If 'ident' is present in the file, replace it with 'md5' or 'scram-sha-256'

# Change this line to allow all hosts 0.0.0.0/0 

# IPv4 local connections:
host    all             all             0.0.0.0/0               scram-sha-256

exit

sudo systemctl restart postgresql-14

#Whitelist Ports on Instance

sudo firewall-cmd --list-ports

sudo firewall-cmd --zone=public --permanent --add-port=5432/tcp

sudo firewall-cmd --reload

sudo firewall-cmd --list-ports

Step 4: Configuring Master Node for Streaming Replication

On the master node (10.180.2.102), configure streaming replication:

sudo -iu postgres

mkdir -p /var/lib/pgsql/14/data/archive

vim /var/lib/pgsql/14/data/postgresql.conf

## Uncomment and set below parameters
listen_addresses = '*'
archive_mode = on    # enables archiving; off, on, or always
archive_command = 'cp %p /var/lib/pgsql/14/data/archive/%f' 
max_wal_senders = 10            # max number of walsender processes
max_replication_slots = 10      # max number of replication slots
wal_keep_size = 50000           # Size of WAL in megabytes; 0 disables
wal_level = replica             # minimal, replica, or logical
wal_log_hints = on               # also do full page writes of non-critical updates

## Only set below if you want to create synchronous replication##
synchronous_commit = remote_apply
synchronous_standby_names = '*'

exit

sudo systemctl restart postgresql-14

netstat -an | grep 5432
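
A quick way to confirm the replication-related settings took effect after the restart (the values should match what was set above):

sudo -i -u postgres psql -c "SELECT name, setting FROM pg_settings WHERE name IN ('wal_level','archive_mode','max_wal_senders','wal_keep_size');"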

Update pg_hba.conf on the master node:

sudo -iu postgres

vim /var/lib/pgsql/14/data/pg_hba.conf

#Add below entry to end of file

host    replication     all             10.180.2.152/32         scram-sha-256
host    replication     all             10.180.2.58/32         scram-sha-256

exit

sudo systemctl restart postgresql-14

Step 5: Configuring Slave Nodes for Streaming Replication

On the slave nodes (10.180.2.152 and 10.180.2.58), configure streaming replication:

sudo -iu postgres

mkdir -p /var/lib/pgsql/14/data/backup

vim /var/lib/pgsql/14/data/pg_hba.conf

exit

sudo systemctl restart postgresql-14

sudo chmod 0700 /var/lib/pgsql/14/data/backup

sudo -iu postgres

# Back up and clone the database on the slave node, pulling from the master node's IP

pg_basebackup -D /var/lib/pgsql/14/data/backup -X fetch -p 5432 -U postgres -h 10.180.2.102 -R

cd /var/lib/pgsql/14/data/backup

cat postgresql.auto.conf
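
Because pg_basebackup was run with -R, the backup directory should contain a standby.signal file and postgresql.auto.conf should hold the connection string back to the master; expect output roughly along these lines (password and minor options will differ):

primary_conninfo = 'user=postgres password=... host=10.180.2.102 port=5432 sslmode=prefer'

ls /var/lib/pgsql/14/data/backup/standby.signal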

#Stop the Instance and Restart using Data in New location

/usr/pgsql-14/bin/pg_ctl stop

/usr/pgsql-14/bin/pg_ctl start -D /var/lib/pgsql/14/data/backup

waiting for server to start....2023-11-27 03:36:48.205 GMT [169621] LOG:  redirecting log output to logging collector process
2023-11-27 03:36:48.205 GMT [169621] HINT:  Future log output will appear in directory "log".
 done
server started

Step 6: Checking Replication Status from Slave Nodes

Check the status of streaming replication from slave nodes using psql:

psql -h localhost -p 5432 -U postgres -d postgres

postgres# select pg_is_wal_replay_paused();

 pg_is_wal_replay_paused
-------------------------
 f
(1 row)

Note: f means WAL replay is running (recovery is progressing normally); t means replay is paused.
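
Another useful check on a replica is the approximate replay delay; note that on an idle master this value can grow even when replication is healthy, so read it together with pg_stat_wal_receiver below:

postgres# select now() - pg_last_xact_replay_timestamp() AS replay_delay;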



postgres# select * from pg_stat_wal_receiver;

-[ RECORD 1 ]---------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
pid                   | 414090
status                | streaming
receive_start_lsn     | 0/A000000
receive_start_tli     | 1
written_lsn           | 0/A002240
flushed_lsn           | 0/A002240
received_tli          | 1
last_msg_send_time    | 2023-12-04 11:40:51.853918+00
last_msg_receipt_time | 2023-12-04 11:40:51.853988+00
latest_end_lsn        | 0/A002240
latest_end_time       | 2023-11-30 08:16:43.217865+00
slot_name             |
sender_host           | 10.180.2.102
sender_port           | 5432
conninfo              | user=postgres password=******** channel_binding=prefer dbname=replication host=10.180.2.102 port=5432 fallback_application_name=walreceiver sslmode=prefer sslcompression=0 sslsni=1 ssl_min_protocol_version=TLSv1.2 gssencmode=prefer krbsrvname=postgres target_session_attrs=any

Step 7: Checking Replication Status from Master Node

On the master node, check the status of replication:

psql -h localhost -p 5432 -U postgres -d postgres

postgres# select * from pg_stat_replication;

-[ RECORD 1 ]----+------------------------------
pid              | 382513
usesysid         | 10
usename          | postgres
application_name | walreceiver
client_addr      | 10.180.2.152
client_hostname  |
client_port      | 47312
backend_start    | 2023-11-30 08:11:42.536364+00
backend_xmin     |
state            | streaming
sent_lsn         | 0/A002240
write_lsn        | 0/A002240
flush_lsn        | 0/A002240
replay_lsn       | 0/A002240
write_lag        |
flush_lag        |
replay_lag       |
sync_priority    | 0
sync_state       | async
reply_time       | 2023-12-04 11:43:12.033364+00

-[ RECORD 2 ]----+------------------------------
pid              | 382514
usesysid         | 10
usename          | postgres
application_name | walreceiver
client_addr      | 10.180.2.58
client_hostname  |
client_port      | 35294
backend_start    | 2023-11-30 08:11:42.542539+00
backend_xmin     |
state            | streaming
sent_lsn         | 0/A002240
write_lsn        | 0/A002240
flush_lsn        | 0/A002240
replay_lsn       | 0/A002240
write_lag        |
flush_lag        |
replay_lag       |
sync_priority    | 0
sync_state       | async
reply_time       | 2023-12-04 11:43:10.113253+00
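
To express how far each replica is behind in bytes rather than comparing LSNs by eye, pg_wal_lsn_diff can be applied to the same view:

postgres# select client_addr, pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS replay_lag_bytes from pg_stat_replication;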

Step 8: Additional Notes

To restart slave nodes, use the following commands:

/usr/pgsql-14/bin/pg_ctl stop

sudo rm -rf /var/lib/pgsql/14/data/backup/postmaster.pid

/usr/pgsql-14/bin/pg_ctl start -D /var/lib/pgsql/14/data/backup


Follow this comprehensive guide to set up PostgreSQL 14 streaming replication on Oracle Cloud Infrastructure with Oracle Linux 8, and to ensure high availability and robust backup capabilities for your PostgreSQL database.

Run Multiple Commands on Remote Host using SSH & Sync Directories to Remote Host

#!/bin/bash

sourcefile='/var/www/html/index.html'
remotedir='/var/www/html'
rsyncremotedir='/var/www/'
rsynclocaldir='/var/www/'

### Run Multiple Commands on Remote Host ###
while IFS= read -r dest; do
   ssh -i "mynew_key.pem" "$dest" '
        date
        sudo yum install -y rsync
        sudo chown -R ec2-user:root /var/www
        sudo rm -rf /home/ec2-user/ourfile.txt
        ls -ltrh /var/www/html
        hostname' </dev/null
done < hosts.txt

### Transfer File to All Remote Hosts ###
#while IFS= read -r dest; do
#  scp -i "mynew_key.pem" $sourcefile "$dest:$remotedir"
#done <hosts.txt

# Sync Push Local Directory to Remote Hosts Using RSYNC ###
while IFS= read -r dest; do
  rsync -avz --delete --exclude 'stats' -e "ssh -i /home/ec2-user/mynew_key.pem" "$rsynclocaldir" "$dest:$rsyncremotedir"
done <hosts.txt

# Sync Pull Remote Directory to Local Directory Using RSYNC ###
#while IFS= read -r dest; do
#  rsync -avz --dry-run --delete --exclude 'stats' -e "ssh -i /home/ec2-user/mynew_key.pem" $dest:/var/www/ /var/www/
#done <hosts.txt

This is very useful for syncing Apache web server directories to multiple hosts running behind an AWS ELB.
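
For reference, hosts.txt is simply a list of SSH destinations, one per line, and the script is then run as an ordinary shell script; the user, IPs and script name below are placeholders only:

# hosts.txt
ec2-user@10.0.1.11
ec2-user@10.0.1.12

# run the script
bash sync_and_run.sh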