View Issue Details

IDProjectCategoryView StatusLast Update
0001960SymmetricDSImprovementpublic2022-11-30 19:49
Reportervilius Assigned Toelong  
Prioritynormal 
Status closedResolutionfixed 
Product Version3.6.7 
Target Version3.14.3Fixed in Version3.14.3 
Summary0001960: Service start script always waits for 5s
DescriptionStart script does not exit for more than 5s.

"""
# time /etc/init.d/symmetricds start; date
Waiting for server to start
......
Started

real 0m6.634s
user 0m0.354s
sys 0m0.037s
Tue Sep 9 10:27:40 EEST 2014
"""

Process start time show that the process has already been running 7s before the start script exited.

"""
ps -p 19451 -o lstart
                 STARTED
Tue Sep 9 10:27:33 2014
"""

The delay does not seem to serve any purpose and prevents the user from using the shell for several seconds, for example, to start monitoring SymmetricDS logs from the very startup.

The problem is that the condition in WrapperService.waitForPid is wrong in case the process is starting up - it waits until the PID does not exist, which does not make sense when starting up.

Please find the patch correcting the problem attached.

With the patch the start script exits as soon as the process is launched

"""
# time /etc/init.d/symmetricds start; date
Waiting for server to start
.
Started

real 0m0.631s
user 0m0.339s
sys 0m0.044s
Tue Sep 9 13:34:00 EEST 2014
"""

Process start time shows the start script exited only 1s after the process has started, which is expected.

"""
# ps -p 21525 -o lstart
                 STARTED
Tue Sep 9 13:33:59 2014
"""

This has only been tested on GNU/Linux, but according to implementation this problem also affects Windows Service startup and the patch should fix it for both platforms.
Additional InformationHave server output a message after checking configuration, binding to network port, and connecting to the database. After seeing message, the start command can return successful.
Tagssym/sym_service

Relationships

related to 0005583 closedelong Service wrapper wait for "started" message instead of 5 second timeout 

Activities

vilius

2014-09-09 11:06

reporter  

service_start_delay.patch (1,599 bytes)   
Index: symmetric-wrapper/src/main/java/org/jumpmind/symmetric/wrapper/WrapperService.java
===================================================================
--- symmetric-wrapper/src/main/java/org/jumpmind/symmetric/wrapper/WrapperService.java	(revision 8670)
+++ symmetric-wrapper/src/main/java/org/jumpmind/symmetric/wrapper/WrapperService.java	(working copy)
@@ -75,7 +75,7 @@
         try {
             ProcessBuilder pb = new ProcessBuilder(getWrapperCommand("exec"));
             Process process = pb.start();
-            if (!(success = waitForPid(getProcessPid(process)))) {
+            if (!(success = waitForPid(getProcessPid(process), true))) {
                 rc = process.exitValue();
             }
         } catch (IOException e) {
@@ -211,9 +211,9 @@
     
     protected boolean stopProcess(int pid, String name) {
         killProcess(pid, false);
-        if (waitForPid(pid)) {
+        if (waitForPid(pid, false)) {
             killProcess(pid, true);
-            if (waitForPid(pid)) {
+            if (waitForPid(pid, false)) {
                 System.out.println("ERROR: '" + name + "' did not stop");
                 return false;
             }
@@ -317,11 +317,11 @@
         new File(filename).delete();
     }
 
-    protected boolean waitForPid(int pid) {
+    protected boolean waitForPid(int pid, boolean starting) {
         int seconds = 0;
         while (seconds <= 5) {
             System.out.print(".");
-            if (!isPidRunning(pid)) {
+            if (starting ^ !isPidRunning(pid)) {
                 break;
             }
             try {
service_start_delay.patch (1,599 bytes)   

elong

2014-09-09 13:21

developer   ~0000570

The "sym_service start" command runs the SymmetricDS processes in the background and waits to make sure it runs without exiting. Instead of detecting or predicting when SymmetricDS has initialized, it waits for 5 seconds. If there is an initialization problem, SymmetricDS may exit with an error. Then the command line will get back an error code and let you know the server didn't start. If it didn't wait, you would run the command and get success even when the server fails a few seconds later. This is the same technique used by other wrappers, like Java Service Wrapper. If you don't want to wait, you can put the start command in the background also. Another way to make it work faster is to have the start command watch the output or log file of the process and look for a line that indicates initialization was successful. I guess as soon as the server can bind to the port and connect to the database, we could output success for that stage of initialization, which might be quickly than the 5 seconds.

vilius

2014-09-09 14:21

reporter   ~0000571

Thank you for the explanation. The aforementioned condition did not seem to make sense in case of startup and the delay was annoying (I had to reconfigure and restart many times during the initial setup and troubleshooting :) ), but now I see that in case the process exits within 5 seconds an error would be raised.

However, arbitrary timeouts are always fragile (what if the system is overloaded and the process exits after 6s?). I agree that verifying the process has started successfully is an important feature of a start script and I would prefer a more sophisticated verification criteria to detect other startup issues not necessarily causing the process to exit.

Feel free to either change this to an improvement (better verification if the started process actually started serving) or close.

elong

2014-09-09 14:26

developer   ~0000572

That's good feedback, thanks. I think we should remove the timing like you suggest, and wait until the server has checked its configuration, bound to its network port, and connected to the database.

Issue History

Date Modified Username Field Change
2014-09-09 11:06 vilius New Issue
2014-09-09 11:06 vilius File Added: service_start_delay.patch
2014-09-09 13:21 elong Note Added: 0000570
2014-09-09 14:21 vilius Note Added: 0000571
2014-09-09 14:26 elong Note Added: 0000572
2014-09-09 14:28 elong Assigned To => elong
2014-09-09 14:28 elong Status new => assigned
2014-09-09 14:28 elong Category Bug => Improvement
2014-09-09 14:28 elong Additional Information Updated View Revisions
2019-04-23 12:04 elong Tag Attached: sym/sym_service
2022-11-14 18:45 elong Relationship added related to 0005583
2022-11-14 18:51 elong Status assigned => resolved
2022-11-14 18:51 elong Resolution open => fixed
2022-11-14 18:51 elong Fixed in Version => 3.14.4
2022-11-14 18:51 elong Target Version => 3.14.4
2022-11-28 13:04 elong Fixed in Version 3.14.4 => 3.14.3
2022-11-28 13:04 elong Target Version 3.14.4 => 3.14.3
2022-11-30 19:49 admin Status resolved => closed