Chapter 5. Basic Configuration

Table of Contents

5.1. Basic Properties
5.2. Node Group
5.3. Node Group Link
5.4. Node
5.5. Channel
5.6. Trigger
5.7. Using Sync Conditions
5.8. Using Node Select
5.9. Using Initial Load Select

To get an instance of SymmetricDS running, it needs to be given an identity and it needs to know how to connect to the database it will manage. A basic way to specify this is to place properties in the symmetric.properties file. When started up, the SymmetricDS reads the configuration and state from the database. If the configuration tables are missing, they are created automatically (auto creation can be disabled). A basic configuration describes the following:

During start up, the Triggers are verified against the database, and database triggers are installed on tables that require data changes to be captured. The PullJob and PushJob begin running as required to synchronize changes with other Nodes.

5.1. Basic Properties

Each Node requires properties that will connect it to the database and register it with a parent Node. To give a Node its identity, the following properties are used:

group.id

The Node Group that this Node is a member of. Synchronization is specified between Node Groups, which means you only need to specify it once for multiple Nodes in the same group. For example, you might have groups of "STORE", "REGION", and "CENTRAL" that synchronize.

external.id

The External ID for this Node has meaning to the user and provides integration into the system where it is deployed. For example, it might be a retail store number or a region number. The External ID can be used in expressions for conditional and subset data synchronization. Behind the scenes, each Node has a unique sequence number for tracking synchronization events. That makes it possible to assign the same External ID to multiple Nodes, if desired.

my.url

The URL where this Node can be contacted for synchronization. At startup and during each heartbeat, the Node updates its entry in the database with this URL.

When a new Node is first started, it is has no information about synchronizing. It contacts the registration server in order to join the network and receive its configuration. The configuration for all Nodes is stored on the registration server, and the URL must be specified in the following property:

registration.url

The URL where this Node can connect for registration to receive its configuration. The registration server is part of SymmetricDS and is enabled as part of the deployment.

For a deployment to an application server, it is common for database connection pools to be found in the Java naming directory (JNDI). In this case, set the following property:

db.jndi.name

The name of the database connection pool to use, which is registered in the JNDI directory tree of the application server. It is recommended that this DataSource is NOT transactional, because SymmetricDS will handle its own transactions.

For a deployment where the database connection pool should be created using a JDBC driver, set the following properties:

db.driver

The class name of the JDBC driver.

db.url

The JDBC URL used to connect to the database.

db.user

The database username, which is used to login, create, and update SymmetricDS tables.

db.password

The password for the database user.

5.2. Node Group

Each Node must belong to a Node Group, a collection of one or more Nodes. A common use of Node Groups is to describe a level in a hierarchy of data synchronization. For example, at a retail store chain, there might be a few Nodes that belong to "corp", which sync with hundreds of Nodes that belong to "store", which sync with thousands of Nodes that belong to "register".

The following SQL statements would create Node Groups for "corp" and "store".

insert into SYM_NODE_GROUP 
  (node_group_id, description)
values
  ('store', 'A retail store node');

insert into SYM_NODE_GROUP 
  (node_group_id, description)
values
  ('corp', 'A corporate node');

5.3. Node Group Link

To establish synchronization between Nodes, two Node Groups are linked together. The direction of synchronization is determined by specifying a source and target Node Group. If synchronization should occur in both directions, then two links are created in opposite directions. The target Node Group receives data changes by either push or pull methods. A push method causes the source Node Group to connect to the target, while a pull method causes it to wait for the target to connect to it.

The following SQL statements links the "corp" and "store" Node Groups for synchronization. It configures the "store" Nodes to push their data changes to the "corp" Nodes, and the "corp" Nodes to send changes to "store" Nodes by waiting for a pull.

insert into SYM_NODE_GROUP_LINK
  (source_node_group, target_node_group, data_event_action)
values
  ('store', 'corp', 'P');

insert into SYM_NODE_GROUP_LINK
  (source_node_group, target_node_group, data_event_action)
values
  ('corp', 'store', 'W');

5.4. Node

Each instance of SymmetricDS is a Node that can be uniquely identified. The Node has a unique identifier used by the system, and the user provides an external identifier for context in the local system. For most common use, the two identifiers are the same. The registration process generates and sends the identity and password to the Node, along with its synchronization configuration. The top-level registration server must have its identity provided by the user since it has no parent to contact.

The following SQL statements setup a top-level registration server as a Node identified as "00000" in the "corp" Node Group.

insert into SYM_NODE 
  (node_id, node_group_id, external_id, sync_enabled)
values
  ('00000', 'corp', '00000', 1);

insert into SYM_NODE_IDENTITY values ('00000');

5.5. Channel

Data changes in the database are captured in the order that they occur, which is preserved when synchronizing to other Nodes. Some data may need priority for synchronization despite the normal order of events. Channels provide a higher-level processing order of data, a limit on the amount of data, and isolation from errors in other channels. By categorizing data into channels and assigning them to Triggers, the user gains more control and visibility into the flow of data.

The following SQL statements setup channels for a retail store. An "item" channel includes data for items and their prices, while a "sale_transaction" channel includes data for ringing sales at a register.

insert into SYM_CHANNEL 
  (channel_id, processing_order, max_batch_size, enabled, description)
values
  ('item', 10, 1000, 1, 'Item and pricing data');

insert into SYM_CHANNEL 
  (channel_id, processing_order, max_batch_size, enabled, description)
values
  ('sale_transaction', 1, 1000, 1, 'retail sale transactions from register');

5.6. Trigger

At the heart of SymmetricDS are Triggers that define what data to capture. Nodes in the source Node Group will capture changes for a table and send them to a target Node Group. Changes can include inserts, updates, or deletes to the table, and it is even possible to filter data by a conditional expression. An entry in Trigger results in a database trigger being installed on the table. Whenever the Trigger entry is updated, the last_updated_time should be updated to indicate that the database trigger should also be updated.

The following SQL statement defines a Trigger that will capture data for a table named "item" whenever data is inserted, updated, or deleted. Data will be captured on Nodes in the "corp" Node Group and sent to Nodes in the "store" Node Group.

insert into SYM_TRIGGER 
  (source_table_name, source_node_group_id, target_node_group_id, channel_id, 
   sync_on_insert, sync_on_update, sync_on_delete, 
   initial_load_order, last_updated_by, last_updated_time, create_time)
values
  ('item', 'corp', 'store', 'item', 
   1, 1, 1, 
   105, 'demo', current_timestamp, current_timestamp);

5.7. Using Sync Conditions

Sync Conditions can be specified for capturing the changes to rows in a table. If the condition is met, the data change is captured for sending to other Nodes. If the condition is not met, the data change is allowed, but it will not be captured and sent to other Nodes. The condition runs within the context of the database trigger event, and it uses the language specific to the database platform. A condition can be specified for insert, update, and delete events, and it is run for each row of that event.

Because a condition makes a decision about whether or not to capture a data change, it effectively creates a subset of the table for synchronization to all Nodes in the Node Group. If a different subset of the table is required for each Node, then see the section on Using Node Select.

The following SQL statement defines a Trigger that will capture data for a table named "sale_transaction" only when the workstation column has a value of "001". All Nodes in the "store" Node Group will receive a subset of the table for only that workstation.

insert into SYM_TRIGGER 
  (source_table_name, source_node_group_id, target_node_group_id, channel_id, 
   sync_on_insert, sync_on_update, sync_on_delete,
   sync_on_insert_condition, sync_on_update_condition, sync_on_delete_condition, 
   initial_load_order, last_updated_by, last_updated_time, create_time)
values
  ('sale_transaction', 'corp', 'store', 'sale_transaction', 
   1, 1, 1, 
   'workstation = ''001''', 'workstation = ''001''', 'workstation = ''001''', 
   105, 'demo', current_timestamp, current_timestamp);

5.8. Using Node Select

After a data change is captured, the Nodes which will receive the change are selected. By default, all Nodes in the target Node Group for the Trigger will be selected. To limit which Nodes are selected, a condition is specified for the node_select of the Trigger. The condition is written in SQL syntax that is appended to the "where" clause of a "select" statement on the Node table. The table is aliased as "c" in order to allow a sub-select to correlate a join. The condition is run within the context of the database trigger event, so it can access the row being changed.

Because the node_select makes a decision about which Nodes receive the data change, it effectively creates a subset of the table for different Nodes. If the same subset of the table is required for all Nodes in the Node Group, then see the section on Using Sync Conditions.

The following variables can be used in the node_select, which will be replaced when the database trigger is built:

  • $(oldTriggerValue) - alias for columns on the "before" row that was updated or deleted

  • $(newTriggerValue) - alias for columns on the "after" row that was updated or inserted

  • $(curTriggerValue) - alias for columns on the current row that was changed

These variables are replaced with syntax in the language of the database being used. For example, on Oracle the "$(newTriggerValue)" is replaced with ":new". The curTriggerValue will use the oldTriggerValue on a delete change, otherwise use newTriggerValue.

The following SQL statement defines a Trigger that will capture all data changes for a table named "sale_transaction", and select Nodes to receive changes by using the value of the "store" column. The Nodes that have been set with an external ID that matches the "store" column will receive only those rows.

insert into SYM_TRIGGER 
  (source_table_name, source_node_group_id, target_node_group_id, channel_id, 
   sync_on_insert, sync_on_update, sync_on_delete,
   node_select, 
   initial_load_order, last_updated_by, last_updated_time, create_time)
values
  ('sale_transaction', 'corp', 'store', 'sale_transaction', 
   1, 1, 1, 
   'and external_id = $(curTriggerValue).store', 
   105, 'demo', current_timestamp, current_timestamp);

5.9. Using Initial Load Select

The Initial Load sends all required rows for each table to a Node, which prepares its database with an initial synchronized state. Afterwards, SymmetricDS begins sending incremental changes as data is changed. By default, the initial load will send all rows of a table. To send a subset of data, a condition is specified for the initial_load_select of the Trigger. The condition is written in SQL syntax that is appended to the "where" clause of a "select" statement on the table. The table is aliased as "t" in order to allow a sub-select to correlate a join.

It's important that the condition used for the initial state of a table matches the condition for capturing and sending changes. In other words, the condition used for the initial_load_select should be consistent with any Sync Conditions or Node Select Conditions, in order to guarantee consistent data in the table.

The following variables can be used in the initial_load_select, which will be replaced at runtime:

  • $(nodeId) - the node ID of the target node

  • $(groupId) - the node group ID of the target node

  • $(externalId) - the external ID of the target node

The following SQL statement defines a Trigger for the "sale_transaction" table that will only synchronize the subset of data where the workstation is "001".

insert into SYM_TRIGGER 
  (source_table_name, source_node_group_id, target_node_group_id, channel_id, 
   sync_on_insert, sync_on_update, sync_on_delete,
   sync_on_insert_condition, sync_on_update_condition, sync_on_delete_condition, 
   initial_load_select,
   initial_load_order, last_updated_by, last_updated_time, create_time)
values
  ('sale_transaction', 'corp', 'store', 'sale_transaction', 
   1, 1, 1, 
   'workstation = ''001''', 'workstation = ''001''', 'workstation = ''001''',
   'workstation = ''001''',
   105, 'demo', current_timestamp, current_timestamp);