Chapter 7. Administration

Table of Contents

7.1. Solving Synchronization Issues
7.1.1. Analyzing the Issue
7.1.2. Resolving the Issue
7.2. Changing Triggers
7.3. Changing Configuration
7.4. Logging Configuration
7.5. Java Management Extensions
7.6. Temporary Files
7.7. Database Purging

7.1. Solving Synchronization Issues

By design, whenever SymmetricDS encounters an issue with a synchronization, the batch containing the error is marked as being in an error state, and all subsequent batches for that particular channel to that particular node are held and not synchronized until the error batch is resolved. SymmetricDS will retry the batch in error until the situation creating the error is resolved (or the data for the batch itself is changed).

7.1.1. Analyzing the Issue

The first step in analyzing the cause of a failed batch is to locate information about the data in the batch, starting with either OUTGOING_BATCH or INCOMING_BATCH. We'll use outgoing batches for the examples below. To locate batches in error, use:

select * from sym_outgoing_batch where error_flag=1;

Several useful pieces of information are available from this query:

  • The batch number of the failed batch, available in column BATCH_ID.
  • The node to which the batch is being sent, available in column NODE_ID.
  • The channel to which the batch belongs, available in column CHANNEL_ID. All subsequent batches on this channel to this node will be held until the error condition is resolved.
  • The specific data id in the batch which is causing the failure, available in column FAILED_DATA_ID.
  • Any SQL message, SQL State, and SQL Codes being returned during the synchronization attempt, available in columns SQL_MESSAGE, SQL_STATE, and SQL_CODE, respectively.

Note

Using the error_flag on the batch table, as shown above, is more reliable than using the status column. The status column can change from 'ER' to a different status temporarily as the batch is retried.

Note

The query above will also show you any recent batches that were originally in error and were changed to be manually skipped. See the end of Section 7.1.2, “Resolving the Issue” for more details.

To get a full picture of the batch, you can query for information representing the complete list of all data changes associated with the failed batch by joining DATA and DATA_EVENT, such as:

select * from sym_data where data_id in 
        (select data_id from sym_data_event where batch_id='XXXXXX');

where XXXXXX is the batch id of the failing batch.

This query returns a wealth of information about each data change in a batch, including:

  • The table involved in each data change, available in column TABLE_NAME,
  • The event type (Update [U], Insert [I], or Delete [D]), available in column EVENT_TYPE,
  • A comma separated list of the new data and (optionally) the old data, available in columns ROW_DATA and OLD_DATA, respectively.
  • The primary key data, available in column PK_DATA
  • The channel id, trigger history information, transaction id if available, and other information.

More importantly, if you narrow your query to just the failed data id you can determine the exact data change that is causing the failure:

select * from sym_data where data_id in 
        (select failed_data_id from sym_outgoing_batch where batch_id='XXXXX');

where XXXXXX is the batch that is failing.

The queries above usually yield enough information to be able to determine why a particular batch is failing. Common reasons a batch might be failing include:

  • The schema at the destination has a column that is not nullable yet the source has the column defined as nullable and a data change was sent with the column as null.
  • A foreign key constraint at the destination is preventing an insertion or update, which could be caused from data being deleted at the destination or the foreign key constraint is not in place at the source.
  • The data size of a column on the destination is smaller than the data size in the source, and data that is too large for the destination has been synced.

7.1.2. Resolving the Issue

Once you have decided upon the cause of the issue, you'll have to decide the best course of action to fix the issue. If, for example, the problem is due to a database schema mismatch, one possible solution would be to alter the destination database in such a way that the SQL error no longer occurs. Whatever approach you take to remedy the issue, once you have made the change, on the next push or pull SymmetricDS will retry the batch and the channel's data will start flowing again.

If you have instead decided that the batch itself is wrong, or does not need synchronized, or you wish to remove a particular data change from a batch, you do have the option of changing the data associated with the batch directly.

Warning

Be cautious when using the following two approaches to resolve synchronization issues. By far, the best approach to solving a synchronization error is to resolve what is truly causing the error at the destination database. Skipping a batch or removing a data id as discussed below should be your solution of last resort, since doing so results in differences between the source and destination databases.

Now that you've read the warning, if you still want to change the batch data itself, you do have several options, including:

  • Causing SymmetricDS to skip the batch completely. This is accomplished by setting the batch's status to 'OK', as in:
    update sym_outgoing_batch set status='OK' where batch_id='XXXXXX'
    where XXXXXX is the failing batch. On the next pull or push, SymmetricDS will skip this batch since it now thinks the batch has already been synchronized. Note that you can still distinguish between successful batches and ones that you've artificially marked as 'OK', since the error_flag column on the failed batch will still be set to '1' (in error).
  • Removing the failing data id from the batch by deleting the corresponding row in DATA_EVENT. Eliminating the data id from the list of data ids in the batch will cause future synchronization attempts of the batch to no longer include that particular data change as part of the batch. For example:
    delete from sym_data_event where batch_id='XXXXXX' and data_id='YYYYYY'
    where XXXXXX is the failing batch and YYYYYY is the data id to longer be included in the batch.

7.2. Changing Triggers

A trigger row may be updated using SQL to change a synchronization definition. SymmetricDS will look for changes each night or whenever the Sync Triggers Job is run (see below). For example, a change to place the table price_changes into the price channel would be accomplished with the following statement:

update SYM_TRIGGER
set channel_id = 'price',
    last_update_by = 'jsmith',
    last_update_time = current_timestamp
where source_table_name = 'price_changes';

All configuration should be managed centrally at the registration node. If enabled, configuration changes will be synchronized out to client nodes. When trigger changes reach the client nodes the Sync Triggers Job will run automatically.

Centrally, the trigger changes will not take effect until the Sync Triggers Job runs. Instead of waiting for the Sync Triggers Job to run overnight after making a Trigger change, you can invoke the syncTriggers() method over JMX or simply restart the SymmetricDS server.

7.3. Changing Configuration

The configuration of your system as defined in the sym_* tables may be modified at runtime. By default, any changes made to the sym_* tables (with the exception of sym_node) should be made at the registration server. The changes will be synchronized out to the leaf nodes by SymmetricDS triggers that are automatically created on the tables.

If this behavior is not desired, the feature can be turned off using a parameter. Custom triggers may be added to the sym_* tables when the auto syncing feature is disabled.

7.4. Logging Configuration

The standalone SymmetricDS installation uses Log4J for logging. The configuration file is conf/log4j.xml. The log4j.xml file has hints as to what logging can be enabled for useful, finer-grained logging.

SymmetricDS proxies all of its logging through Commons Logging. When deploying to an application server, if Log4J is not being leveraged, then the general rules for for Commons Logging apply.

7.5. Java Management Extensions

Monitoring and administrative operations can be performed using Java Management Extensions (JMX). SymmetricDS uses MX4J to expose JMX attributes and operations that can be accessed from the built-in web console, Java's jconsole, or an application server. By default, the web management console can be opened from the following address:

http://localhost:31416/

Using the Java jconsole command, SymmetricDS is listed as a local process named SymmetricLauncher. In jconsole, SymmetricDS appears under the MBeans tab under then name defined by the engine.name property. The default value is SymmetricDS.

The management interfaces under SymmetricDS are organized as follows:

  • Node - administrative operations

  • Incoming - statistics about incoming batches

  • Outgoing - statistics about outgoing batches

  • Parameters - access to properties set through the parameter service

  • Notifications - setting thresholds and receiving notifications

7.6. Temporary Files

SymmetricDS creates temporary extraction and data load files with the CSV payload of a synchronization when the value of the stream.to.file.threshold.bytes SymmetricDS property has been reached. Before reaching the threshold, files are streamed to/from memory. The default threshold value is 32,767 bytes. This feature may be turned off by setting the stream.to.file.enabled property to false.

SymmetricDS creates these temporary files in the directory specified by the java.io.tmpdir Java System property. When SymmmetricDS starts up, stranded temporary files are aways cleaned up. Files will only be stranded if the SymmetricDS engine is force killed.

The location of the temporary directory may be changed by setting the Java System property passed into the Java program at startup. For example,

  -Djava.io.tmpdir=/home/.symmetricds/tmp
        

7.7. Database Purging

Purging is the act of cleaning up captured data that is no longer needed in SymmetricDS's runtime tables. Data is purged through delete statements by the Purge Job. Only data that has been successfully synchronized will be purged. Purged tables include:

The purge job is enabled by the start.purge.job SymmetricDS property. The job runs periodically according to the job.purge.period.time.ms property. The default period is to run every ten minutes.

Two retention period properties indicate how much history SymmetricDS will retain before purging. The purge.retention.minutes property indicates the period of history to keep for synchronization tables. The default value is 5 days. The statistic.retention.minutes property indicates the period of history to keep for statistics. The default value is also 5 days.

The purge properties should be adjusted according to how much data is flowing through the system and the amount of storage space the database has. For an initial deployment it is recommended that the purge properties be kept at the defaults, since it is often helpful to be able to look at the captured data in order to triage problems and profile the synchronization patterns. When scaling up to more nodes, it is recomended that the purge parameters be scaled back to 24 hours or less.