View Issue Details

IDProjectCategoryView StatusLast Update
0004142SymmetricDSBugpublic2019-11-10 06:48
Reporterei99045Assigned To 
Prioritynormal 
Status newResolutionopen 
Product Version3.10.5 
Target VersionFixed in Version 
Summary0004142: DataExtractorService level queries don't handle initial_load_select conditions properly
DescriptionDataExtractorService attempts to handle self-referencing tables by performing multi level queries. These queries attempt to navigate the acyclical graph of references by starting with those rows that don't reference any other rows, then moving on to the rows that reference those "level 0" rows, and so on.
This is implemented by adding an "AND" clause to whichever WHERE clause is already being used in the extraction for the given table.

In my scenario, we have set the initial_load_select in the sym_trigger_router table to be "column_a = ${externalId} OR column_b = ${externalId}, and this results in a level 0 query that looks like this:
column_a='$(externalId)' OR column_b='$(externalId)' and reference_column is null or reference_column = uuid

This won't work as intended, because the AND operator takes precedence over the initial OR operator. The query will be interpreted as if it had been written:
column_a='$(externalId)'
OR
(column_b='$(externalId)' and reference_column is null)
OR
reference_column = uuid

This is clearly not what was intended and results is unwanted records being extracted. The initial_load_select clause should be included in parentheses to both sides of that final OR operator, something like this:
((column_a='$(externalId)' OR column_b='$(externalId)') AND reference_column is null)
OR
((column_a='$(externalId)' OR column_b='$(externalId)') AND reference_column = uuid)

The queries that are produced for levels 1 and beyond also have the same problem.
Steps To ReproduceTry to do an initial load for a table that has a reload enabled trigger configured with an initial_load_select that includes an OR expression.
The initial load will never finish because of duplicate key errors at the target node.
Additional InformationThis happens irrespective of the initial.load.use.extract.job.enabled value.
Tagsdialect: postgresql, extract, initial/partial load

Activities

There are no notes attached to this issue.

Issue History

Date Modified Username Field Change
2019-11-10 06:48 ei99045 New Issue
2019-11-10 06:48 ei99045 Tag Attached: dialect: postgresql
2019-11-10 06:48 ei99045 Tag Attached: extract
2019-11-10 06:48 ei99045 Tag Attached: initial/partial load