Table of Contents
By design, whenever SymmetricDS encounters an issue with a synchronization, the batch containing the error is marked as being in an error state, and all subsequent batches for that particular channel to that particular node are held and not synchronized until the error batch is resolved. SymmetricDS will retry the batch in error until the situation creating the error is resolved (or the data for the batch itself is changed).
The first step in analyzing the cause of a failed batch is to locate information about the data in the batch, starting with either OUTGOING_BATCH or INCOMING_BATCH. We'll use outgoing batches for the examples below. To locate batches in error, use:
select * from sym_outgoing_batch where error_flag=1;
Several useful pieces of information are available from this query:
BATCH_ID
.
NODE_ID
.
CHANNEL_ID
.
All subsequent batches on this channel to this node will be held until the error condition is resolved.
FAILED_DATA_ID
.
SQL_MESSAGE
,
SQL_STATE
, and SQL_CODE
, respectively.
error_flag
on the batch table, as shown above, is more reliable than using the
status
column. The status column can change from 'ER' to a different status temporarily as
the batch is retried.
To get a full picture of the batch, you can query for information representing the complete list of all data changes associated with the failed batch by joining DATA and DATA_EVENT, such as:
select * from sym_data where data_id in (select data_id from sym_data_event where batch_id='XXXXXX');
where XXXXXX is the batch id of the failing batch.
This query returns a wealth of information about each data change in a batch, including:
TABLE_NAME
,EVENT_TYPE
,
ROW_DATA
and
OLD_DATA
, respectively.
PK_DATA
More importantly, if you narrow your query to just the failed data id you can determine the exact data change that is causing the failure:
select * from sym_data where data_id in (select failed_data_id from sym_outgoing_batch where batch_id='XXXXX');
where XXXXXX is the batch that is failing.
The queries above usually yield enough information to be able to determine why a particular batch is failing. Common reasons a batch might be failing include:
Once you have decided upon the cause of the issue, you'll have to decide the best course of action to fix the issue. If, for example, the problem is due to a database schema mismatch, one possible solution would be to alter the destination database in such a way that the SQL error no longer occurs. Whatever approach you take to remedy the issue, once you have made the change, on the next push or pull SymmetricDS will retry the batch and the channel's data will start flowing again.
If you have instead decided that the batch itself is wrong, or does not need synchronized, or you wish to remove a particular data change from a batch, you do have the option of changing the data associated with the batch directly.
Now that you've read the warning, if you still want to change the batch data itself, you do have several options, including:
update sym_outgoing_batch set status='OK' where batch_id='XXXXXX'where XXXXXX is the failing batch. On the next pull or push, SymmetricDS will skip this batch since it now thinks the batch has already been synchronized. Note that you can still distinguish between successful batches and ones that you've artificially marked as 'OK', since the
error_flag
column on
the failed batch will still be set to '1' (in error).
delete from sym_data_event where batch_id='XXXXXX' and data_id='YYYYYY'where XXXXXX is the failing batch and YYYYYY is the data id to longer be included in the batch.
A trigger row may be updated using SQL to change a synchronization definition.
SymmetricDS will look for changes each night or whenever the Sync Triggers Job
is run (see below). For example, a change to place the table price_changes
into the price channel would be accomplished with the following statement:
update SYM_TRIGGER set channel_id = 'price', last_update_by = 'jsmith', last_update_time = current_timestamp where source_table_name = 'price_changes';
All configuration should be managed centrally at the registration node. If enabled, configuration changes will be synchronized out to client nodes. When trigger changes reach the client nodes the Sync Triggers Job will run automatically.
Centrally, the trigger changes will not take effect until the Sync Triggers Job runs. Instead of waiting for the Sync Triggers Job to run overnight after making a Trigger change, you can invoke the syncTriggers() method over JMX or simply restart the SymmetricDS server. A complete record of trigger changes is kept in the table TRIGGER_HIST, which was discussed in Section 5.2.3, “Sync Triggers Job”.
There may be times where you find you need to re-send or re-synchronize data when the change itself was not captured. This could be needed, for example, if the data changes occurred prior to SymmetricDS placing triggers on the data tables themselves, or if the data at the destination was accidentally deleted, or for some other reason. Two approaches are commonly taken to re-send the data, both of which are discussed below.
Be careful when re-sending data using either of these two techniques. Be sure you are only sending the rows you intend to send and, more importantly, be sure to re-send the data in a way that won't cause foreign key constraint issues at the destination. In other words, if more than one table is involved, be sure to send any tables which are referred to by other tables by foreign keys first. Otherwise, the channel's synchronization will block because SymmetricDS is unable to insert or update the row because the foreign key relationship refers to a non-existent row in the destination!
One possible approach would be to "touch" the rows in individual tables that need re-sent. By "touch", we mean to alter the row data in such a way that SymmetricDS detects a data change and therefore includes the data change in the batching and synchronizing steps. Note that you have to change the data in some meaningful way (e.g., update a time stamp); setting a column to its current value is not sufficient (by default, if there's not an actual data value change SymmetricDS won't treat the change as something which needs synched.
A second approach would be to take advantage of SymmetricDS built-in functionality by simulating a partial "initial load" of the data. The approach is to manually create "reload" events in DATA for the necessary tables, thereby resending the desired rows for the given tables. Again, foreign key constraints must be kept in mind when creating these reload events. These reload events are created in the source database itself, and the necessary table, trigger-router combination, and channel are included to indicate the direction of synchronization.
To create a reload event, you create a DATA row, using:
By way of example, take our retail hands-on tutorial covered in Chapter 2, Hands-on Tutorial. Let's say
we need to re-send a particular sales transaction from the store to corp over again because we lost the data in corp due to
an overzealous delete. For the tutorial, all transaction-related tables start with sale_
,
use the sale_transaction
channel, and are routed using the store_corp_identity
router. In addition, the trigger-routers have been set up with an initial load order based on the necessary
foreign key relationships (i.e., transaction tables which are "parents" have a lower initial load order than those of their
"children"). An insert statement that would create the necessary "reload" events (three in this case, one for each table) would be as follows
(where MISSING_ID is changed to the needed transaciton id):
insert into sym_data ( select null, t.source_table_name, 'R', 'tran_id=''MISSING-ID''', null, null, h.trigger_hist_id, t.channel_id, '1', null, null, current_timestamp from sym_trigger t inner join sym_trigger_router tr on t.trigger_id=tr.trigger_id inner join sym_trigger_hist h on h.trigger_hist_id=(select max(trigger_hist_id) from sym_trigger_hist where trigger_id=t.trigger_id) where channel_id='sale_transaction' and tr.router_id like 'store_corp_identity' and (t.source_table_name like 'sale_%') order by tr.initial_load_order asc);
This insert statement generates three rows, one for each configured sale table. It uses the most recent trigger history id for the corresponding table. Finally, it takes advantage of the initial load order for each trigger-router to create the three rows in the correct order (the order corresponding to the order in which the tables would have been initial loaded).
The configuration of your system as defined in the sym_*
tables may be modified at runtime. By default, any changes made to
the sym_*
tables (with the exception of sym_node
) should be made at the registration server. The changes will
be synchronized out to the leaf nodes by SymmetricDS triggers that are automatically created on the tables.
If this behavior is not desired, the feature can be turned off using a parameter. Custom triggers may be added
to the sym_*
tables when the auto syncing feature is disabled.
The standalone SymmetricDS installation uses Log4J for logging. The configuration file is conf/log4j.xml
.
The log4j.xml
file has hints as to what logging can be enabled for useful, finer-grained logging.
SymmetricDS proxies all of its logging through Commons Logging. When deploying to an application server, if Log4J is not being leveraged, then the general rules for for Commons Logging apply.
Monitoring and administrative operations can be performed using Java Management Extensions (JMX). SymmetricDS uses MX4J to expose JMX attributes and operations that can be accessed from the built-in web console, Java's jconsole, or an application server. By default, the web management console can be opened from the following address:
http://localhost:31416/
Using the Java jconsole command, SymmetricDS is listed as a local process named SymmetricLauncher.
In jconsole, SymmetricDS appears under the MBeans tab under then name defined by the engine.name
property. The default value is SymmetricDS.
The management interfaces under SymmetricDS are organized as follows:
Node - administrative operations
Incoming - statistics about incoming batches
Outgoing - statistics about outgoing batches
Parameters - access to properties set through the parameter service
Notifications - setting thresholds and receiving notifications
SymmetricDS creates temporary extraction and data load files with the CSV payload of a synchronization when
the value of the stream.to.file.threshold.bytes
SymmetricDS property has been reached. Before reaching the threshold, files
are streamed to/from memory. The default threshold value is 32,767 bytes. This feature may be turned off by setting the stream.to.file.enabled
property to false.
SymmetricDS creates these temporary files in the directory specified by the java.io.tmpdir
Java System property. When
SymmmetricDS starts up, stranded temporary files are aways cleaned up. Files will only be stranded if the SymmetricDS engine is force killed.
The location of the temporary directory may be changed by setting the Java System property passed into the Java program at startup. For example,
-Djava.io.tmpdir=/home/.symmetricds/tmp
Purging is the act of cleaning up captured data that is no longer needed in SymmetricDS's runtime tables. Data is purged through delete statements by the Purge Job. Only data that has been successfully synchronized will be purged. Purged tables include:
The purge job is enabled by the start.purge.job
SymmetricDS property. The job runs periodically according to the
job.purge.period.time.ms
property. The default period is to run every ten minutes.
Two retention period properties
indicate how much history SymmetricDS will retain before purging. The purge.retention.minutes
property indicates the period
of history to keep for synchronization tables. The default value is 5 days.
The statistic.retention.minutes
property
indicates the period of history to keep for statistics. The default value is also 5 days.
The purge properties should be adjusted according to how much data is flowing through the system and the amount of storage space the database has. For an initial deployment it is recommended that the purge properties be kept at the defaults, since it is often helpful to be able to look at the captured data in order to triage problems and profile the synchronization patterns. When scaling up to more nodes, it is recomended that the purge parameters be scaled back to 24 hours or less.