Recently I have evaluated different tools to Cluster PostgreSQL Database.
1) PgPool-II with Streaming Replication
2) PostgreSQL-XC project :
Key Criteria of Evaluation
- Open source
- Impact on Application
- Performance gain, connection pooling support, High Availability, Support to advance feature such as Sharding.
- Documentation and Community support
Available Platform and Features Matrix
Program | License | Platform supported | Maturity | Replication Method | Sync | scalability | Connection Pooling | Load Balancing | Query Partitioning | Architecture | Support | |
Postgres-xc | BSD | All Linux | Stable | Master-Master | Synchronous | read and write | No | Yes | Yes | shared-nothing | NTT Open Source Software Center (Japanese) EnterpriseDB | |
Pgpool-II | BSD | CentOS | Stable | Statement-Based Middleware | Synchronous | read | Yes | Yes | Yes | Proxy between application and database | SRA OSS, Inc. Japan | |
PgCluster | BSD | Died | Master-Master | Synchronous | No | Yes | No | Its not more active project | ||||
slony | BSD | Stable | Master-Slave | Asynchronous | No | No | No | Its replication based tool | Slony | |||
Bucardo | BSD | Stable | Master-Master, Master-Slave | Asynchronous | No | No | No | |||||
Londiste | BSD | Stable | Master-Slave | Asynchronous | No | No | No |
List of Platform: http://www.postgresql.org/download/products/3-clusteringreplication/
I found PostgreSQL-XC & PgPool-II much closer to our requirement.Criteria | PostgreSQL-XC | PgPool-II |
Replication Method | All Node act as master and accept both Read-Write operation | 1) Write on Master and read from Slave nodes. 2) Need to use Streaming Replication to replicate writes on Slave nodes |
Architecture | shared-nothing | Act as middleware between Application and PostgreSQL database. |
PostgreSQL Version | Postgre-XC 1.1 comes with PostgreSQL 9.2 | PgPool-II 3.3.2 comes with PostgreSQL 9.2 & 9.3 both |
Optimizer | Since its shared nothing distributed database, PostgreSQL-XC has to analyze and optimize the SQL. This optimizer is different than built-in PostgreSQL optimizer. So Much more scope of improvement on SQL Optimizer side | PgPool analyze SQL for Read (SELECT) and Write (INSERT/UPDATE/DELETE/TRUNCATE/COMMIT/ROLLBACK) type of operation and forwards SQL either to Master (write) or slave(read). Actual SQL Optimization is done by PostgreSQL Database. So all optimization features of PostgreSQL optimizer holds true in this case. |
Advantages | 1. Most active community than PgPool-II 2. Provides both Read-Write Scalability 3. Replication can be done at table level, No need of additional framework | 1. Act as middleware, No Changes to Database design 2. Act as Connection pool, Load balancer for read operations, Failover for HA 3. Easy to implement than PostgreSQL-XC 4. Even if the PgPool Node goes down, application can still bypass the PgPool and connect to database. 5. Provides result and table cache features. |
Disadvantages | 1. Impact on Database design, for every table will have to design distribution key 2. Does not support Join operation across the Data node 3. Additional node will be required for Global Transaction Management 4. Table level constraints are not supported across cluster 5. Foreign Keys are not supported 6. Failure of 1 node will lead to data unavailability. | Streaming Replication must be Synchronous |
Documentation | Well explained | Adequate |
Application Compatibility | if we want to use Sharding then PostgreSQL-XC could be considered. However due to above limitation and impact on database design, this will not be good choice. | At present we are using PGbouncer for Connection pooling. PgPool-II can be used to simply replace PGbouncer and use to separate out read-write operations. This is completely transparent to Application and has no impact on Database design. In future once all restriction on Parallel SQL execution are resolved, even the read operation can be executed in parallel on all the slave node. http://girders.org/blog/2012/09/29/scaling-postgresql-with-pgpool-and-pgbouncer/ explains benefit of PgPool over PgBouncer |
1) PgPool-II with Streaming Replication
2) PostgreSQL-XC project :
Reference:
1. PostgreSQL-XC project :
- http://sourceforge.net/apps/mediawiki/postgres-xc/index.php?title=Main_Page
- http://postgres-xc.sourceforge.net/docs/1_2_beta/
- http://postgres-xc.sourceforge.net/misc-docs/PG-XC_Architecture.pdf
2. PgPool-II:
- http://wiki.postgresql.org/wiki/Pgpool-II
- http://www.pgpool.net/docs/latest/pgpool-en.html#restriction
- PgPool Vs PgBouncer : http://girders.org/blog/2012/09/29/scaling-postgresql-with-pgpool-and-pgbouncer/
- http://www.dalibo.org/_media/pgpool.pdf
PgPool-II installation process is explained here : http://www.pgpool.net/docs/latest/pgpool-en.html#install
Comments
Post a Comment