View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
0004555 | SymmetricDS | Bug | public | 2020-09-17 15:43 | 2022-08-01 20:19 |
Reporter | myg388 | Assigned To | elong | ||
Priority | high | ||||
Status | closed | Resolution | no change required | ||
Product Version | 3.12.2 | ||||
Summary | 0004555: poor performance when update table with to many rows | ||||
Description | Hello I am testing the symmetricds to see if it is useful for the company where I work. Test with a bi-directionally replicated (BENEFICIARIES_DEL) table of 7184684 million records. Initially, I did the initial load and I was able to load it without problems. When performing an update on a complete column, the NAME column, the update of 50 thousand records was performed on the other side and then no more records were updated. When checking the log I see the following 2020-09-17 12:44:45,615 INFO [desa19c] [DataProcessor] [desa19c-dataloader-1] Batch '89', for node '002', for process 'data load from stage' has been processing for 1441 seconds. The following stats have been gathered: {STATEMENTCOUNT=177, LOADMILLIS=958, STARTTIME=1600356044195, INSERTCOUNT=177, FILTERMILLIS=41, BENEFICIARIOS_DEL UPDATECOUNT=0, LINENUMBER=177, FALLBACKINSERTCOUNT=177, UPDATECOUNT=0, BENEFICIARIOS_DEL INSERTCOUNT=177} 2020-09-17 12:45:52,929 INFO [desa19c] [DataProcessor] [desa19c-dataloader-1] Batch '89', for node '002', for process 'data load from stage' has been processing for 1508 seconds. The following stats have been gathered: {STATEMENTCOUNT=186, LOADMILLIS=1001, STARTTIME=1600356044195, INSERTCOUNT=186, FILTERMILLIS=43, BENEFICIARIOS_DEL UPDATECOUNT=0, LINENUMBER=186, FALLBACKINSERTCOUNT=186, UPDATECOUNT=0, BENEFICIARIOS_DEL INSERTCOUNT=186} 2020-09-17 12:46:53,715 INFO [desa19c] [DataProcessor] [desa19c-dataloader-1] Batch '89', for node '002', for process 'data load from stage' has been processing for 1569 seconds. The following stats have been gathered: {STATEMENTCOUNT=194, LOADMILLIS=1042, STARTTIME=1600356044195, INSERTCOUNT=194, FILTERMILLIS=43, BENEFICIARIOS_DEL UPDATECOUNT=0, LINENUMBER=194, FALLBACKINSERTCOUNT=194, UPDATECOUNT=0, BENEFICIARIOS_DEL INSERTCOUNT=194} 2020-09-17 12:47:53,983 INFO [desa19c] [DataProcessor] [desa19c-dataloader-1] Batch '89', for node '002', for process 'data load from stage' has been processing for 1629 seconds. The following stats have been gathered: {STATEMENTCOUNT=202, LOADMILLIS=1075, STARTTIME=1600356044195, INSERTCOUNT=202, FILTERMILLIS=44, BENEFICIARIOS_DEL UPDATECOUNT=0, LINENUMBER=202, FALLBACKINSERTCOUNT=202, UPDATECOUNT=0, BENEFICIARIOS_DEL INSERTCOUNT=202} 2020-09-17 12:48:58,945 INFO [desa19c] [DataProcessor] [desa19c-dataloader-1] Batch '89', for node '002', for process 'data load from stage' has been processing for 1694 seconds. The following stats have been gathered: {STATEMENTCOUNT=210, LOADMILLIS=1120, STARTTIME=1600356044195, INSERTCOUNT=210, FILTERMILLIS=46, BENEFICIARIOS_DEL UPDATECOUNT=0, LINENUMBER=210, FALLBACKINSERTCOUNT=210, UPDATECOUNT=0, BENEFICIARIOS_DEL INSERTCOUNT=210} I am not an expert in symmetrics but if the STATEMENTCOUNT that I see that it is increasing, it means that it is setting up a batch with the data of the sym_data, it runs too slow. When verifying in the DB what the symmetricsds is doing, I see that it is making the following query: SELECT MAX(create_time) FROM sym_data WHERE table_name = :1 AND ( ( event_type = 'I' AND row_data LIKE :2 ) OR ( event_type IN ( 'U', 'D' ) AND pk_data LIKE :3 ) ); i attach de explain plan to. Verifying the execution plan I see that it is performing a full scan of the sym data, it currently does not have too many 11863 records. It is verified that the sym_data table has an index on the sym_id column but this query does not use this column and that is why the delay may be generated. Statistics of the complete DB were updated, an attempt was made to create indexes in the sym_data table but it was not possible to optimize this query. | ||||
Tags | No tags attached. | ||||
|
|
|
That is the conflict manager resolving all the rows you put into conflict at the same. Is it a real world scenario for two databases to experience mass updates to the same set of rows? Why not update just one side and let the replication update the other side? It detects the conflict when the incoming row does apply the same way on the target as it did on the source. To resolve the conflict, it searches sym_data to get the last update time for the row. The newer timestamp between the source and target is the winning row. It was designed for occasional conflicts when the same row is updated on multiple databases at the same time. |
|
Thanks for your answer, please update a whole column on one side. I don't understand what conflicts would I be solving? The initial load had already been carried out and both tables were in sync before updating a complete column. |
|
I was doing more tests with update with the symmetricds. And I have the same drawbacks, any update I do for example: update MMENENDEZ.BENEFICIARIOS_DEL set ADDRESS = 'PEPE1234' where ID_LUGARALTABENEF = 1000; It takes a long time, when verifying how symmetricds passes this update to the other side I see the following: update "MMENENDEZ". "BENEFICIARIOS_DEL" set "ADDRESS" =: 1 where "ID_LUGARALTABENEF" =: 2 and "ID_BENEFICIARIO" =: 3 and "OS_ADHE" =: 4 and "NAME" =: 5 and "ID_TIPODOC" =: 6 and "NUMDOCUMENTO" =: 7 and "SEX" =: 8 and "DATE" =: 9 and "TELEPHONE" =: 10 and "ID_LUGARATENCIONMEDICA" =: 11 and "ID_LUGARALTATITULAR" is NULL and "ID_TITULAR" is NULL and "CUIL "=: 12 and" FECHASITUACION "=: 13 and" ESTADO "=: 14 and" ID_SITUACIONBENEF "=: 15 and" ID_PLAN "=: 16 and" DOMICILIO "=: 17 and" ID_LOCATENCIONMEDICA "=: 18 and" DATE " =: 19 and "ID_CARACTER" =: 20 and "ID_TIPOCARACTER" =: 21 and "ID_TIPOBENEFICIARIO" =: 22 and "ID_LOCALIDAD" =: 23 and "EXPIRATION_DATE" is NULL and "ID_CAJAAPORTE" =: 24 and "NROBENEFICIO" =: 25 and "NROEXPEDIENTE" =: 26 Therefore, if it works this way, everything that is update delete will be very few performant since it filters by absolutely all the columns to make each update generating a cost of 43000. The inserts worked ok. |
|
i attach capture of log file, |
|
I could see what was happening, a pk was missing in the table close the issue plz |
Date Modified | Username | Field | Change |
---|---|---|---|
2020-09-17 15:43 | myg388 | New Issue | |
2020-09-17 15:43 | myg388 | File Added: image.png | |
2020-09-17 15:43 | myg388 | File Added: image-2.png | |
2020-09-17 15:43 | myg388 | File Added: image-3.png | |
2020-09-17 16:30 | elong | Note Added: 0001772 | |
2020-09-17 16:30 | elong | Status | new => feedback |
2020-09-17 16:58 | myg388 | Note Added: 0001773 | |
2020-09-17 16:58 | myg388 | Status | feedback => new |
2020-09-23 18:20 | myg388 | Note Added: 0001782 | |
2020-09-23 18:21 | myg388 | Note Added: 0001783 | |
2020-09-23 18:21 | myg388 | File Added: image-4.png | |
2020-09-25 20:51 | myg388 | Note Added: 0001789 | |
2022-08-01 20:19 | elong | Assigned To | => elong |
2022-08-01 20:19 | elong | Status | new => closed |
2022-08-01 20:19 | elong | Resolution | open => no change required |