View Revisions: Issue #3514

Summary 0003514: Improve purge performance of sym_data and sym_data_event
Revision 2018-04-09 19:25 by elong
Description The delete statements used on sym_data and sym_data_event use joins to verify the rows are part of a batch in OK status, which can be slow on some systems. Use a first pass on the tables that looks at all rows before the smallest data_id (for sym_data) or batch_id (for sym_data_event) associated with outstanding (not OK) batches. Once the safe range for purging is determined, it can delete without any joins. Afterwards, the normal purge routines run, but on a healthy system (one with few or no outstanding batches), there shouldn't be any data left to purge.

This feature is disabled by default for now. Enable it by setting the parameter job.purge.first.pass=true.
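
For illustration only, the first pass described above could be sketched as follows. This is not the actual SymmetricDS code: it uses Python with an in-memory SQLite database, the table columns are cut down to the bare minimum, and the function names first_pass_purge and make_demo_db are hypothetical. It also ignores the retention period and any other safety checks the real purge may apply.

```python
import sqlite3


def make_demo_db():
    """Build a tiny in-memory database with a simplified, assumed schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE sym_outgoing_batch (batch_id INTEGER PRIMARY KEY, status TEXT);
        CREATE TABLE sym_data_event (data_id INTEGER, batch_id INTEGER);
        CREATE TABLE sym_data (data_id INTEGER PRIMARY KEY);
        """
    )
    # Batch 3 is still outstanding (not OK); batches 1, 2 and 4 are OK.
    conn.executemany(
        "INSERT INTO sym_outgoing_batch VALUES (?, ?)",
        [(1, "OK"), (2, "OK"), (3, "NE"), (4, "OK")],
    )
    conn.executemany(
        "INSERT INTO sym_data_event VALUES (?, ?)",
        [(10, 1), (11, 1), (12, 2), (13, 3), (14, 4)],
    )
    conn.executemany(
        "INSERT INTO sym_data VALUES (?)", [(i,) for i in range(10, 15)]
    )
    conn.commit()
    return conn


def first_pass_purge(conn):
    """Delete rows below the smallest ids tied to outstanding batches.

    Returns (deleted_event_rows, deleted_data_rows). If there is no
    outstanding batch, nothing is deleted here and the normal purge
    routines are expected to handle the tables.
    """
    cur = conn.cursor()

    # Smallest batch_id that is not yet in OK status.
    min_batch_id = cur.execute(
        "SELECT MIN(batch_id) FROM sym_outgoing_batch WHERE status <> 'OK'"
    ).fetchone()[0]

    # Smallest data_id that belongs to an outstanding batch. The join is
    # used once to find the boundary, not once per deleted row.
    min_data_id = cur.execute(
        """
        SELECT MIN(e.data_id)
        FROM sym_data_event e
        JOIN sym_outgoing_batch b ON b.batch_id = e.batch_id
        WHERE b.status <> 'OK'
        """
    ).fetchone()[0]

    deleted_events = 0
    deleted_data = 0

    # Range deletes without joins: everything below the boundary is safe.
    if min_batch_id is not None:
        deleted_events = cur.execute(
            "DELETE FROM sym_data_event WHERE batch_id < ?", (min_batch_id,)
        ).rowcount
    if min_data_id is not None:
        deleted_data = cur.execute(
            "DELETE FROM sym_data WHERE data_id < ?", (min_data_id,)
        ).rowcount

    conn.commit()
    return deleted_events, deleted_data
```

In the demo data, rows belonging to OK batch 4 sit above the boundary set by outstanding batch 3, so the first pass leaves them alone and the normal join-based purge would still have to remove them.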
Revision 2018-04-09 19:22 by elong
Description The delete statements used on sym_data and sym_data_event use joins to verify the rows are part of a batch in OK status, which can be slow on some systems. Use a first pass on the tables that looks at all rows before the smallest data_id (for sym_data) or batch_id (for sym_data_event) associated with outstanding (not OK) batches. Once the safe range for purging is determined, it can delete without any joins. Afterwards, the normal purge routines run, but on a healthy system (one with few or no outstanding batches), there shouldn't be any data left to purge.