Data Ownership Models
In asynchronous replicated databases it is important to establish which site can update the data. Many data ownership models can be defined.
Master/Slave: one site (master) can modify data, the other sites (slaves) can read only access to them. This is used when data stored in a central database are to be distributed with a read-only access to other different geographical locations.
Update Anywhere: all involved sites can modify replicated data, so each site is considered a master site. Some update conflict may arise when using this model.
Workload Partitioning: if you are able to identify where (in which site) a record in a replicated table has been originated, then you can split that table into partitions (one for each site) assigning update rights at the partition level. This means that you can set up a sort of master/slave model at row level.
pgReplicator can be configured to obtain different data-models at table level, so you are able to set up, in the same database, different data-models for different replicated tables. This is obtained assigning the correct value to permissions attribute of pgr_tables replica system table.
ROT |
Read Only Table |
users can only read data, despite of their database privileges |
RWT |
Read Write Table |
does not interfere with users' database privileges |
RWLP |
Read Write on Local Partitions |
acts as RWT on local partitions (local generated rows) and as ROT on remote ones |
Many hybrid data ownership models can be obtained too, as described in the following schema in which 3 sites are considered:
Site 1 |
Site 2 |
Site 3 |
Table-level data ownership model |
Requires conflict detection |
RWT |
ROT |
ROT |
Master / slave (data dissemination) |
No |
RWT |
RWT |
RWT |
Update anywhere |
Yes |
RWLP |
RWLP |
RWLP |
Workload partitioning |
No |
RWT |
RWT |
ROT |
Multimaster |
Yes |
RWLP |
RWLP |
ROT |
Hybrid |
No |
RWT |
RWLP |
ROT |
Hybrid |
Yes |
EXAMPLE
Let us replicate 2 tables (Table1, Table2) of a database over 3 sites (Site1, Site2, Site3 on Host1, Host2, Host3).
We want a Master/Slave data model for Table1 with Site1 as master site; a Workload Partitioning data model for Table2 on Site2 and Site3. Let us configure pgReplicator:
(1) on site N=1,2,3 set the siteID = N in pgr_lsite:
pgr_sid |
---|
N |
(2) insert values in pgr_hosts
host |
domain |
site |
port |
---|---|---|---|
Host1 |
Host1.domain |
1 |
5432 |
Host2 |
Host2.domain |
2 |
5432 |
Host3 |
Host3.domain |
3 |
5432 |
Suppose we want to replicate Table1 on Site1, Site2 and Site3; Table2 on Site2 and Site3
|
Site1 |
Site2 |
Site3 |
---|---|---|---|
Table1 |
RWT |
ROT |
ROT |
Table2 |
|
RWLP |
RWLP |
(3) so pgr_distrib in each site must be:
site |
tablename |
---|---|
1 |
Table1 |
2 |
Table1 |
2 |
Table2 |
3 |
Table1 |
3 |
Table2 |
(4) Now let's configure pgr_tables on Site 1:
tablename |
permissions |
resolv_site |
resolver |
... |
---|---|---|---|---|
Table1 |
RWT |
1 |
NO |
... |
on Site 2:
tablename |
permissions |
resolv_site |
resolver |
... |
---|---|---|---|---|
Table1 |
ROT |
1 |
NO |
... |
Table2 |
RWLP |
3 |
NO |
... |
on Site 3:
tablename |
permissions |
resolv_site |
resolver |
... |
---|---|---|---|---|
Table1 |
ROT |
1 |
NO |
... |
Table2 |
RWLP |
3 |
NO |
... |
No conflict resolution algorithm is required for Master/Slave and Workload partitions data model (resolver=NO). We selected Site1 as resolv-site on Table1 and Site3 as resolv-site on Table2. The resolv-site is the site that collect all the operations submitted on a table, detects and resolv replica conflicts and propagate resolved operations to all the other involved sites.
In this example Site1 collects operations on Table1 coming from Site2 and Site3 adding them to that originated on Site1, accepts all the operations (resolver=NO) and redistributes resolved operations on Site2 and Site3.
Site3 collects operations on Table2 coming from Site2 adding them to that originated on Site2, accepts all the operations (resolver=NO) and redistributes resolved operations on Site2.
Remember that a site can be a resolv-site for a specified table only if that table is replicated on the selected site: Site1 cannot be a resolv-site for Table2!
Having different resolv-sites permits the system to be more scalable when replicated tables and sites increase.
NEXT SECTION: Conflict Resolution Algorithms |