By design, whenever SymmetricDS encounters an issue with a synchronization, the batch containing the error is marked as being in an error state, and all subsequent batches for that particular channel to that particular node are held and not synchronized until the error batch is resolved. SymmetricDS will retry the batch in error until the situation creating the error is resolved (or the data for the batch itself is changed).
Analyzing and resolving issues can take place on the outgoing or incoming side. The techniques for analysis differ slightly between the two, because the node with outgoing batch data also has the data and data events associated with the batch in its database. On the incoming node, all that is available is the incoming batch header and the data present in an incoming error table.
The first step in analyzing the cause of a failed batch is to locate information about the data in the batch, starting with OUTGOING_BATCH. To locate batches in error, use:
select * from sym_outgoing_batch where error_flag=1;
Several useful pieces of information are available from this query:
CHANNEL_ID. All subsequent batches on this channel to this node will be held until the error condition is resolved.
The error_flag column on the batch table, as shown above, is more reliable than the status column. The status column can change from 'ER' to a different status temporarily as the batch is retried.
select * from sym_data where data_id in (select data_id from sym_data_event where batch_id='XXXXXX');
where XXXXXX is the batch id of the failing batch.
This query returns a wealth of information about each data change in a batch, including:
More importantly, if you narrow your query to just the failed data id you can determine the exact data change that is causing the failure:
select * from sym_data where data_id in (select failed_data_id from sym_outgoing_batch where batch_id='XXXXXX' and node_id='YYYYY');
where XXXXXX is the batch id and YYYYY is the node id of the batch that is failing.
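The lookups above can also be run from a script. The following is a minimal sketch only: it uses Python's built-in sqlite3 module with an in-memory database and heavily simplified stand-ins for sym_outgoing_batch, sym_data_event, and sym_data (only a few columns; the real tables have many more). The function names and the sample rows are hypothetical and exist only for illustration.

```python
import sqlite3

# Simplified stand-in schema and sample rows: illustration only,
# not the real SymmetricDS table definitions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table sym_outgoing_batch (batch_id integer, node_id text, channel_id text,
                                     status text, error_flag integer, failed_data_id integer);
    create table sym_data_event (data_id integer, batch_id integer);
    create table sym_data (data_id integer, table_name text, event_type text);
    insert into sym_outgoing_batch values (10, '001', 'sale_transaction', 'OK', 0, 0);
    insert into sym_outgoing_batch values (11, '001', 'sale_transaction', 'ER', 1, 102);
    insert into sym_data values (101, 'sale_transaction', 'I');
    insert into sym_data values (102, 'sale_return_line_item', 'I');
    insert into sym_data_event values (101, 11);
    insert into sym_data_event values (102, 11);
""")

def batches_in_error(conn):
    """List (batch_id, node_id) for every outgoing batch flagged as in error."""
    sql = ("select batch_id, node_id from sym_outgoing_batch "
           "where error_flag=1 order by batch_id")
    return conn.execute(sql).fetchall()

def data_ids_in_batch(conn, batch_id):
    """List the data ids of all data changes that belong to the batch."""
    sql = ("select data_id from sym_data where data_id in "
           "(select data_id from sym_data_event where batch_id=?) order by data_id")
    return [row[0] for row in conn.execute(sql, (batch_id,))]

def failed_data(conn, batch_id, node_id):
    """Return (data_id, table_name) of the data change the batch failed on."""
    sql = ("select data_id, table_name from sym_data where data_id in "
           "(select failed_data_id from sym_outgoing_batch "
           "where batch_id=? and node_id=?)")
    return conn.execute(sql, (batch_id, node_id)).fetchall()

for batch_id, node_id in batches_in_error(conn):
    print(batch_id, node_id, data_ids_in_batch(conn, batch_id),
          failed_data(conn, batch_id, node_id))
```

Against a real installation the same SQL would be issued through whatever database driver is in use; only the connection setup would differ.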
The queries above usually yield enough information to be able to determine why a particular batch is failing. Common reasons a batch might be failing include:
Analysis of an incoming batch differs from that of outgoing batches. For incoming batches, you will rely on two tables, INCOMING_BATCH and INCOMING_ERROR. The first step in analyzing the cause of an incoming failed batch is to locate information about the batch, starting with INCOMING_BATCH. To locate batches in error, use:
select * from sym_incoming_batch where error_flag=1;
Several useful pieces of information are available from this query:
BATCH_ID. Note that this is the batch number of the outgoing batch on the outgoing node.
CHANNEL_ID. All subsequent batches on this channel from this node will be held until the error condition is resolved.
For incoming batches, we do not have data and data event entries in the database we can query. We do, however, have a table, INCOMING_ERROR, which provides some information about the batch.
select * from sym_incoming_error where batch_id='XXXXXX' and node_id='YYYYY';
where XXXXXX is the batch id and YYYYY is the node id of the failing batch.
This query returns a wealth of information about each data change in a batch, including:
Once you have decided upon the cause of the issue, you'll have to decide the best course of action to fix the issue. If, for example, the problem is due to a database schema mismatch, one possible solution would be to alter the destination database in such a way that the SQL error no longer occurs. Whatever approach you take to remedy the issue, once you have made the change, on the next push or pull SymmetricDS will retry the batch and the channel's data will start flowing again.
If you have instead decided that the batch itself is wrong, or does not need to be synchronized, or you wish to remove a particular data change from a batch, you do have the option of changing the data associated with the batch directly.
Now that you've read the warning, if you still want to change the batch data itself, you do have several options, including:
update sym_outgoing_batch set status='OK' where batch_id='XXXXXX';
where XXXXXX is the failing batch. On the next pull or push, SymmetricDS will skip this batch since it now thinks the batch has already been synchronized. Note that you can still distinguish between successful batches and ones that you've artificially marked as 'OK', since the error_flag column on the failed batch will still be set to '1' (in error).
delete from sym_data_event where batch_id='XXXXXX' and data_id='YYYYYY';
where XXXXXX is the failing batch and YYYYYY is the data id that should no longer be included in the batch.
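As a rough illustration of the first option, the sketch below marks a failing batch as 'OK' and then lists the batches that were skipped by hand, relying on the statement above that error_flag remains '1'. It runs against an in-memory SQLite table that is a simplified stand-in for sym_outgoing_batch; the function names skip_batch and manually_skipped and the sample rows are hypothetical.

```python
import sqlite3

# Simplified stand-in for sym_outgoing_batch: illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table sym_outgoing_batch (batch_id integer, status text, error_flag integer);
    insert into sym_outgoing_batch values (10, 'OK', 0);  -- synchronized normally
    insert into sym_outgoing_batch values (11, 'ER', 1);  -- failing batch
""")

def skip_batch(conn, batch_id):
    """Mark a batch as 'OK' by hand; error_flag is deliberately left untouched."""
    conn.execute("update sym_outgoing_batch set status='OK' where batch_id=?",
                 (batch_id,))

def manually_skipped(conn):
    """List batches whose status is 'OK' although error_flag is still 1."""
    sql = ("select batch_id from sym_outgoing_batch "
           "where status='OK' and error_flag=1 order by batch_id")
    return [row[0] for row in conn.execute(sql)]

skip_batch(conn, 11)
print(manually_skipped(conn))
```

Keeping a query like the second one at hand makes it easier to audit later which batches never actually reached their destination.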
For batches in error, from the incoming side you'll also have to decide the best course of action to fix the issue.
Incoming batch errors that are in conflict can be fixed by taking advantage of two columns in INCOMING_ERROR which are examined each time batches are processed. The first column, resolve_data, if filled in, will be used in place of the original row data. The second column, resolve_ignore, if set, will cause this particular data item to be ignored and batch processing to continue. These are the same two columns used when a manual conflict resolution strategy is chosen, as discussed in Section 4.10, “Conflict Detection and Resolution”.
A trigger row may be updated using SQL to change a synchronization definition.
SymmetricDS will look for changes each night or whenever the Sync Triggers Job
is run (see below). For example, a change to place the price_changes table
into the price channel would be accomplished with the following statement:
update SYM_TRIGGER set channel_id = 'price', last_update_by = 'jsmith', last_update_time = current_timestamp where source_table_name = 'price_changes';
All configuration should be managed centrally at the registration node. If enabled, configuration changes will be synchronized out to client nodes. When trigger changes reach the client nodes the Sync Triggers Job will run automatically.
Centrally, the trigger changes will not take effect until the Sync Triggers Job runs. Instead of waiting for the Sync Triggers Job to run overnight after making a Trigger change, you can invoke the syncTriggers() method over JMX or simply restart the SymmetricDS server. A complete record of trigger changes is kept in the table TRIGGER_HIST, which was discussed in Section 5.2.3, “Sync Triggers Job”.
As you probably know by now, SymmetricDS stores its single configuration centrally and distributes it to all nodes. By default, a trigger-router is in effect for all nodes in the source node group or target node group. Triggers will be established on each node that is a member of the source node group, and changes will be routed to all relevant nodes that are members of the target node group. If, for example, the router routes to "all" nodes, "all" means every node that is in the target node group. This is the default behavior of SymmetricDS.
Once in production, however, you will likely find you need or want to make configuration changes to triggers and routers as new features are rolled out to your network of SymmetricDS nodes. You may, for example, wish to "pilot" a new configuration, containing new synchronizations, only on specific nodes initially, and then increase the size of the pilot over time. SymmetricDS provides the ability to specify that only particular trigger-router combinations are applicable to particular nodes for this purpose. It does this by allowing you to define an arbitrary collection of nodes, called a "grouplet", and then choosing which trigger-routers apply to the normal set of nodes (the default behavior) and which apply just to nodes in one or more "grouplets". This allows you, essentially, to filter the list of nodes that would otherwise be included as source nodes and/or target nodes. Through the use of grouplets, you can, for example, specify a subset of nodes on which a given trigger would be created. It also allows you to specify a subset of the normal set of nodes a change would be routed to. This behavior is in addition to, and occurs before, any subsetting or filtering the router might otherwise do.
In its simplest form, a grouplet is just an arbitrary collection of nodes. To define a grouplet, you start by creating a grouplet with a unique id, a description, and a link policy,
as defined in GROUPLET. To define which nodes are members of (or are not members of) a grouplet, you provide a list of external ids of the nodes
in GROUPLET_LINK. How those external ids are used varies based on the grouplet link policy.
grouplet_link_policy can be either I or E, representing an "inclusive" list of nodes or an "exclusive" list of
nodes, respectively. In the case of "inclusive", you'll be listing each external id to be included in the grouplet. In the case of exclusive, all nodes will be included in
the grouplet except ones which have an external id in the list of external ids.
Once you have defined your grouplet and which nodes are members of a grouplet, you can tie a grouplet to a given trigger-router through
the use of TRIGGER_ROUTER_GROUPLET.
If a particular trigger-router does not appear in this table, SymmetricDS behaves as normal.
If, however, an entry for a particular trigger-router appears in this table, the default behavior is overridden based on the grouplet id and the applies_when setting. The grouplet id provides the node list, and applies_when indicates whether the grouplet nodes are to be used to filter the source node list, the target node list, or both (settings are "S", "T", and "B", respectively). Nodes that survive the filtering process as a source will have a trigger defined, and nodes that survive the filtering process as a target are eligible nodes that can be routed to.
At this point, an example would probably be useful. Picture the case where you have 100 retail stores (each containing one database, and each a member of the "store" node group) and a central office database (external id of corp, and a member of the "corp" node group). You wish to pilot two new trigger-routers for a new feature on your point-of-sale software (one which moves data from corp to store, and one which moves data from store to corp), but you only want the triggers to be installed on 10 specific stores that represent your "pilot" stores. In this case, the simplest approach would be to define a grouplet with, say, a grouplet id of "pilot". We'd use a grouplet link policy of "inclusive", and list each of the 10 external ids in the GROUPLET_LINK table.
For the trigger-router meant to send data from corp to store, we'd create an entry in TRIGGER_ROUTER_GROUPLET for
our grouplet id of "pilot", and we'd specify "T" (target) as the applies-when setting. In this way, the source node list is not filtered, but the target node list used during routing
will filter the potential target nodes to just our pilot stores. For the trigger-router meant to send data from a pilot store back to corp, we would have the grouplet apply when
the node is in the source node list (i.e.,
applies_when will be "S"). This will cause the trigger to only be created for stores in the pilot list and not other stores.
An important thing to mention in this example: Since your grouplet only included the store nodes, you can't simply specify "both" for the applies when setting. For the corp-to-store trigger, for example, if you had said "both", no trigger would have been installed in corp since the grouplet nodes represent all possible source nodes as well as target nodes, and "corp" is not in the list! The same is true for the store to corp trigger-router as well. You could, however, use "both" as the applies when if you had included the "corp" external id in with the list of the 10 pilot store external ids.
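The filtering described in this section can be modeled in a few lines. The sketch below is only an illustration of the rules as stated here (link policy "I" or "E", applies-when "S", "T", or "B"); it is not SymmetricDS code, and the function names grouplet_members and filter_nodes as well as the store external ids are hypothetical.

```python
def grouplet_members(all_external_ids, link_policy, linked_ids):
    """Resolve grouplet membership from its link policy.

    'I' (inclusive): only the listed external ids are members.
    'E' (exclusive): every node except the listed external ids is a member.
    """
    linked = set(linked_ids)
    if link_policy == "I":
        return {n for n in all_external_ids if n in linked}
    if link_policy == "E":
        return {n for n in all_external_ids if n not in linked}
    raise ValueError("link policy must be 'I' or 'E'")

def filter_nodes(source_nodes, target_nodes, members, applies_when):
    """Filter the source list ('S'), the target list ('T'), or both ('B')."""
    if applies_when not in ("S", "T", "B"):
        raise ValueError("applies_when must be 'S', 'T', or 'B'")
    sources = list(source_nodes)
    targets = list(target_nodes)
    if applies_when in ("S", "B"):
        sources = [n for n in sources if n in members]
    if applies_when in ("T", "B"):
        targets = [n for n in targets if n in members]
    return sources, targets

# The pilot example: 100 stores, one corp node, 10 pilot stores.
stores = ["store%03d" % i for i in range(1, 101)]
corp = ["corp"]
pilot = stores[:10]
members = grouplet_members(stores + corp, "I", pilot)

# corp-to-store with "T": corp keeps its trigger, only pilot stores are targets.
print(filter_nodes(corp, stores, members, "T"))
# corp-to-store with "B": the source list is empty, so no trigger on corp.
print(filter_nodes(corp, stores, members, "B"))
```

The second call shows the pitfall described above: because "corp" is not in the grouplet, choosing "both" filters it out of the source list.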
There may be times where you find you need to re-send or re-synchronize data when the change itself was not captured. This could be needed, for example, if the data changes occurred prior to SymmetricDS placing triggers on the data tables themselves, or if the data at the destination was accidentally deleted, or for some other reason. Two approaches are commonly taken to re-send the data, both of which are discussed below.
Be careful when re-sending data using either of these two techniques. Be sure you are only sending the rows you intend to send and, more importantly, be sure to re-send the data in a way that won't cause foreign key constraint issues at the destination. In other words, if more than one table is involved, be sure to send any tables which are referred to by other tables by foreign keys first. Otherwise, the channel's synchronization will block because SymmetricDS is unable to insert or update the row because the foreign key relationship refers to a non-existent row in the destination!
One possible approach would be to "touch" the rows in individual tables that need to be re-sent. By "touch", we mean to alter the row data in such a way that SymmetricDS detects a data change and therefore includes the data change in the batching and synchronizing steps. Note that you have to change the data in some meaningful way (e.g., update a time stamp); setting a column to its current value is not sufficient (by default, if there's not an actual data value change, SymmetricDS won't treat the change as something which needs to be synchronized).
A second approach would be to take advantage of SymmetricDS's built-in functionality by simulating a partial "initial load" of the data. The approach is to manually create "reload" events in DATA for the necessary tables, thereby resending the desired rows for the given tables. Again, foreign key constraints must be kept in mind when creating these reload events. These reload events are created in the source database itself, and the necessary table, trigger-router combination, and channel are included to indicate the direction of synchronization.
To create a reload event, you create a DATA row, using:
By way of example, take our retail hands-on tutorial covered in Chapter 2, Quick Start Tutorial. Let's say we need to re-send a particular sales transaction from the store to corp over again because we lost the data in corp due to an overzealous delete. For the tutorial, all transaction-related tables start with sale_, are on the sale_transaction channel, and are routed using the store_corp_identity router. In addition, the trigger-routers have been set up with an initial load order based on the necessary foreign key relationships (i.e., transaction tables which are "parents" have a lower initial load order than those of their "children"). An insert statement that would create the necessary "reload" events (three in this case, one for each table) would be as follows (where MISSING-ID is changed to the needed transaction id):
insert into sym_data ( select null, t.source_table_name, 'R', 'tran_id=''MISSING-ID''', null, null, h.trigger_hist_id, t.channel_id, '1', null, null, current_timestamp from sym_trigger t inner join sym_trigger_router tr on t.trigger_id=tr.trigger_id inner join sym_trigger_hist h on h.trigger_hist_id=(select max(trigger_hist_id) from sym_trigger_hist where trigger_id=t.trigger_id) where channel_id='sale_transaction' and tr.router_id like 'store_corp_identity' and (t.source_table_name like 'sale_%') order by tr.initial_load_order asc);
This insert statement generates three rows, one for each configured sale table. It uses the most recent trigger history id for the corresponding table. Finally, it takes advantage of the initial load order for each trigger-router to create the three rows in the correct order (the order corresponding to the order in which the tables would have been initial loaded).
The configuration of your system as defined in the
sym_* tables may be modified at runtime. By default, any changes made to
sym_* tables (with the exception of
sym_node) should be made at the registration server. The changes will
be synchronized out to the leaf nodes by SymmetricDS triggers that are automatically created on the tables.
If this behavior is not desired, the feature can be turned off using a parameter. Custom triggers may be added to the
sym_* tables when the auto syncing feature is disabled.
The standalone SymmetricDS installation uses Log4J for logging. The configuration file is conf/log4j.xml. The
log4j.xml file has hints as to what logging can be enabled for useful, finer-grained logging.
There is a command line option to turn on preconfigured debugging levels. When the --debug option is used, conf/debug-log4j.xml is used instead of log4j.xml.
SymmetricDS proxies all of its logging through SLF4J. When deploying to an application server, or if Log4J is not being leveraged, the general rules for SLF4J logging apply.
Monitoring and administrative operations can be performed using Java Management Extensions (JMX). SymmetricDS uses MX4J to expose JMX attributes and operations that can be accessed from the built-in web console, Java's jconsole, or an application server. By default, the web management console can be opened from the following address:
In order to use JConsole, you must enable JMX remote management in the JVM. You can edit the startup scripts to set the following system properties.
-Dcom.sun.management.jmxremote.port=31417 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false
More details about enabling JMX for JConsole can be found here.
Using the Java jconsole command, SymmetricDS is listed as a local process named SymmetricLauncher.
In jconsole, SymmetricDS appears under the MBeans tab under the name defined by the
property. The default value is SymmetricDS.
The management interfaces under SymmetricDS are organized as follows:
Node - administrative operations
Parameters - access to properties set through the parameter service
SymmetricDS creates temporary extraction and data load files with the CSV payload of a synchronization when
the value of the
stream.to.file.threshold.bytes SymmetricDS property has been reached. Before reaching the threshold, files
are streamed to/from memory. The default threshold value is 32,767 bytes. This feature may be turned off by setting the
property to false.
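The threshold behavior can be pictured with a small model. The sketch below is not SymmetricDS code: it is a hypothetical ThresholdBuffer class that keeps a payload in memory until the total size reaches a threshold and then moves it to a temporary file. The 32,767 default mirrors the value quoted above; the exact comparison SymmetricDS uses at the boundary is an assumption.

```python
import io
import tempfile

class ThresholdBuffer:
    """Hold a payload in memory, spilling to a temp file at a size threshold."""

    def __init__(self, threshold_bytes=32767, tmp_dir=None):
        self.threshold_bytes = threshold_bytes
        self.tmp_dir = tmp_dir          # None means the platform default temp dir
        self.size = 0
        self.on_disk = False
        self._buf = io.BytesIO()

    def write(self, data):
        self.size += len(data)
        if not self.on_disk and self.size >= self.threshold_bytes:
            # Threshold reached: copy what is buffered so far into a temp file.
            spill = tempfile.TemporaryFile(dir=self.tmp_dir)
            spill.write(self._buf.getvalue())
            self._buf.close()
            self._buf = spill
            self.on_disk = True
        self._buf.write(data)

    def read_all(self):
        """Return the full payload, whether it lives in memory or on disk."""
        self._buf.seek(0)
        return self._buf.read()

    def close(self):
        self._buf.close()

small = ThresholdBuffer()
small.write(b"a" * 100)
large = ThresholdBuffer()
large.write(b"a" * 20000)
large.write(b"b" * 20000)
print(small.on_disk, large.on_disk)
```

Python's standard library offers tempfile.SpooledTemporaryFile for the same pattern; the hand-written class is used here only to make the switch-over point explicit.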
SymmetricDS creates these temporary files in the directory specified by the
java.io.tmpdir Java System property.
The location of the temporary directory may be changed by setting the Java System property passed into the Java program at startup. For example,
Purging is the act of cleaning up captured data that is no longer needed in SymmetricDS's runtime tables. Data is purged through delete statements by the Purge Job. Only data that has been successfully synchronized will be purged. Purged tables include:
The purge job is enabled by the
start.purge.job SymmetricDS property. The timing of the three purge jobs (incoming, outgoing, and data gaps) is controlled
by a cron expression as specified by the following properties:
job.purge.datagaps.cron. The default is
0 0 0 * * *, or once per day at midnight.
SymmetricDS utilizes Spring's CRON support, which includes seconds as the first parameter. This differs from the typical Unix-based
implementation, where the first parameter is usually minutes. For example,
*/15 * * * * * means every 15 seconds, not every 15 minutes.
See Spring's documentation
for more details.
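To make the field order concrete, the sketch below labels the six fields of a Spring-style cron expression. It does not evaluate schedules; it only splits the expression and names each position. The function name label_cron is hypothetical.

```python
# Field order used by Spring's six-field cron expressions.
SPRING_CRON_FIELDS = ("second", "minute", "hour",
                      "day_of_month", "month", "day_of_week")

def label_cron(expression):
    """Map each field of a six-field cron expression to its name."""
    parts = expression.split()
    if len(parts) != len(SPRING_CRON_FIELDS):
        raise ValueError("expected 6 fields, got %d" % len(parts))
    return dict(zip(SPRING_CRON_FIELDS, parts))

print(label_cron("0 0 0 * * *"))     # the default purge schedule quoted above
print(label_cron("*/15 * * * * *"))  # the step applies to the seconds field
```

A five-field Unix-style expression is rejected by this helper, which is a cheap way to catch a schedule copied from a crontab without the leading seconds field.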
Two retention period properties indicate how much history SymmetricDS will retain before purging. The purge.retention.minutes property indicates the period of history to keep for synchronization tables. The default value is 5 days. The second retention property indicates the period of history to keep for statistics. The default value is also 5 days.
The purge properties should be adjusted according to how much data is flowing through the system and the amount of storage space the database has. For an initial deployment it is recommended that the purge properties be kept at the defaults, since it is often helpful to be able to look at the captured data in order to triage problems and profile the synchronization patterns. When scaling up to more nodes, it is recommended that the purge parameters be scaled back to 24 hours or less.
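Because the retention period is expressed in minutes, it helps to convert the figures mentioned above before editing the property. The sketch below does that arithmetic; the helper names are hypothetical, and purge_cutoff is only an assumed illustration of what "history older than the retention period" means, not a description of how the Purge Job is implemented.

```python
from datetime import datetime, timedelta

def retention_minutes(days=0, hours=0):
    """Convert a retention period into the minutes value used by the property."""
    return int(timedelta(days=days, hours=hours).total_seconds() // 60)

def purge_cutoff(now, minutes):
    """Timestamp before which history falls outside the retention period."""
    return now - timedelta(minutes=minutes)

print(retention_minutes(days=5))    # the 5-day default
print(retention_minutes(hours=24))  # the scaled-back recommendation
print(purge_cutoff(datetime(2020, 1, 6), retention_minutes(days=5)))
```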