Data Ownership Models



In asynchronous replicated databases it is important to establish which site can update the data. Many data ownership models can be defined.

Master/Slave: one site (master) can modify data, the other sites (slaves) can read only access to them. This is used when data stored in a central database are to be distributed with a read-only access to other different geographical locations.

Update Anywhere: all involved sites can modify replicated data, so each site is considered a master site. Some update conflict may arise when using this model.

Workload Partitioning: if you are able to identify where (in which site) a record in a replicated table has been originated, then you can split that table into partitions (one for each site) assigning update rights at the partition level. This means that you can set up a sort of master/slave model at row level.

pgReplicator can be configured to obtain different data-models at table level, so you are able to set up, in the same database, different data-models for different replicated tables. This is obtained assigning the correct value to permissions attribute of pgr_tables replica system table.

ROT

Read Only Table

users can only read data, despite of their database privileges

RWT

Read Write Table

does not interfere with users' database privileges

RWLP

Read Write on Local Partitions

acts as RWT on local partitions (local generated rows) and as ROT on remote ones



Many hybrid data ownership models can be obtained too, as described in the following schema in which 3 sites are considered:

Site 1

Site 2

Site 3

Table-level data ownership model

Requires conflict detection

RWT

ROT

ROT

Master / slave (data dissemination)

No

RWT

RWT

RWT

Update anywhere

Yes

RWLP

RWLP

RWLP

Workload partitioning

No

RWT

RWT

ROT

Multimaster

Yes

RWLP

RWLP

ROT

Hybrid

No

RWT

RWLP

ROT

Hybrid

Yes





EXAMPLE

Let us replicate 2 tables (Table1, Table2) of a database over 3 sites (Site1, Site2, Site3 on Host1, Host2, Host3).

We want a Master/Slave data model for Table1 with Site1 as master site; a Workload Partitioning data model for Table2 on Site2 and Site3. Let us configure pgReplicator:

(1) on site N=1,2,3 set the siteID = N in pgr_lsite:

pgr_sid

N



(2) insert values in pgr_hosts

host

domain

site

port

Host1

Host1.domain

1

5432

Host2

Host2.domain

2

5432

Host3

Host3.domain

3

5432



Suppose we want to replicate Table1 on Site1, Site2 and Site3; Table2 on Site2 and Site3


Site1

Site2

Site3

Table1

RWT

ROT

ROT

Table2


RWLP

RWLP



(3) so pgr_distrib in each site must be:

site

tablename

1

Table1

2

Table1

2

Table2

3

Table1

3

Table2



(4) Now let's configure pgr_tables on Site 1:

tablename

permissions

resolv_site

resolver

...

Table1

RWT

1

NO

...


on Site 2:

tablename

permissions

resolv_site

resolver

...

Table1

ROT

1

NO

...

Table2

RWLP

3

NO

...


on Site 3:

tablename

permissions

resolv_site

resolver

...

Table1

ROT

1

NO

...

Table2

RWLP

3

NO

...




No conflict resolution algorithm is required for Master/Slave and Workload partitions data model (resolver=NO). We selected Site1 as resolv-site on Table1 and Site3 as resolv-site on Table2. The resolv-site is the site that collect all the operations submitted on a table, detects and resolv replica conflicts and propagate resolved operations to all the other involved sites.

In this example Site1 collects operations on Table1 coming from Site2 and Site3 adding them to that originated on Site1, accepts all the operations (resolver=NO) and redistributes resolved operations on Site2 and Site3.

Site3 collects operations on Table2 coming from Site2 adding them to that originated on Site2, accepts all the operations (resolver=NO) and redistributes resolved operations on Site2.

Remember that a site can be a resolv-site for a specified table only if that table is replicated on the selected site: Site1 cannot be a resolv-site for Table2!

Having different resolv-sites permits the system to be more scalable when replicated tables and sites increase.

NEXT SECTION: Conflict Resolution Algorithms