It is impossible to shut down the server safely, concurrently and cleanly #1645

OskiKervinen-MF · 2023-08-21T12:45:43Z

Hello,

There seems to be no way to call Server.stop() from another thread without the possibility that the server will never shut down. The root of the problem is that Server.stop() does nothing if the server is not running. Because of the nature of concurrency you can not easily guarantee the server has started if the server runs in a different thread.

Here is code to illustrate the problem. This program hangs forever:

#include "httplib.h"
#include <thread>
#include <iostream>

int main()
{
	std::cout << "Main: Start." << std::endl;

	httplib::Server svr1;
	svr1.Get( "/1", [ & ]( const httplib::Request& /*req*/, httplib::Response& res ) {
		res.set_content( "Hello World!", "text/plain" );
	} );

	int svr1_port = 0;
	auto thread1 = std::thread( [ & ]()
			{
				std::cout << "Server: Initializing..." << std::endl;

				// Simulate doing some slightly time consuming processing to prepare the server.
				std::this_thread::sleep_for( std::chrono::milliseconds( 100 ) );

				svr1_port = svr1.bind_to_any_port( "localhost" );

				std::cout << "Server: Starting the server..." << std::endl;
				svr1.listen_after_bind();
				std::cout << "Server: exiting." << std::endl;
			} );

	std::cout << "Main: Server thread started." << std::endl;

	// Simulate trying to start up and failing, taking some, but not much time.
	std::this_thread::sleep_for( std::chrono::milliseconds( 10 ) );

	svr1.stop();
	std::cout << "Main: Told server to stop." << std::endl;

	thread1.join();
	std::cout << "Main: Done" << std::endl;
}

The real use case is a program that actually used the server, but sometimes there is an error in start-up. The error will cause the program to shut down immediately, before the server is running, causing the stop to be ignored because of the race condition.

All work-arounds boil down to the main thread waiting in a loop until the server thread has initialized the Server. That works, I guess, but it's pretty ugly.

This is essentially the problem mentioned in #1371, but that was closed only with a few work-arounds mentioned. Also commenters in #43 seem to suffer from the same problem.

I think the correct solution would be for Server.stop() to set the server to a "shutting down" state, from which any listen_* calls would return immediately. This would automatically just work. The downside is that this would make a Server object usable only once. You couldn't call start and listen multiple times on the same object. So the change may not be backwards-compatible. Perhaps instead add a separate method named decommission that moves the server to a state whence it can never be started again.

The text was updated successfully, but these errors were encountered:

jimmy-park · 2023-08-21T17:44:46Z

Use Server::wait_until_ready before stopping the server. (#1548)

svr1.wait_until_ready();
svr1.stop();

Fixes yhirose#1645. `Server::stop` does nothing if the server is not running. This can cause a program to hang it it calls stop before the server processing thread has had time to initialize. It would make sense for a server that has been stopped to refuse to start, but that would be backwards incompatible. Instead, we introduce a new method: `decommission`. It sets the server object to stop and refuse to start. This means it does not matter in which order concurrent threads end up calling `listen` and `decommission`, the result is always that the listening thread will exit.

OskiKervinen-MF · 2023-08-22T05:07:18Z

Use Server::wait_until_ready before stopping the server. (#1548)
svr1.wait_until_ready();
svr1.stop();

That is handy. But it has two problems:

It wastes time. Not much in practice, but still.
It only works if it is inevitable that the server thread will start the server. If the server thread has any operations that could throw an exception, the server thread may never start the server, causing the main thread to hang.

Number 2 is the big issue here. Yes, this can be worked around by moving all initializations that could have errors to the main thread, and making the server thread only contain the call to listen but that is a pretty significant caveat for users to keep in mind and the refactoring required can be impractical.

Please consider my implementation for the proposed decommission in #1646.

jimmy-park · 2023-08-22T10:24:27Z

How about this way?

Server svr;
std::thread t;

try {
    t = std::thread { [&] { svr.listen(HOST, PORT); } }
    svr.wait_until_ready();
} catch (...) {}

svr.stop();

if (t.joinable())
    t.join()

OskiKervinen-MF · 2023-08-22T11:52:50Z

I was referring to a case where an exception occurs in the server thread. Like this:

#include "httplib.h"
#include <thread>
#include <iostream>

int main()
{
    std::cout << "Starting..." << std::endl;

    httplib::Server svr;
    std::thread t { [&] {
        try {
            throw std::runtime_error("Some thing that happens to go wrong.");
            svr.listen("localhost", 3000 );
        } catch (...) {
            std::cout << "[Thread] Exception. Aborting." << std::endl;
        }
    } };
    std::cout << "Before wait" << std::endl;
    svr.wait_until_ready();
    std::cout << "After wait" << std::endl;

    svr.stop();

    if (t.joinable())
        t.join();

    std::cout << "Reached the end successfully." << std::endl;
    return 0;
}

This program will hang forever.

yhirose · 2023-08-28T00:28:56Z

How about this?

int main() {
  std::cout << "Starting..." << std::endl;

  httplib::Server svr;
  auto server_started = std::atomic<bool>{false};

  std::thread t{[&] {
    try {
      throw std::runtime_error("Some thing that happens to go wrong.");
      server_started = true;
      svr.listen("localhost", 3000);
    } catch (...) { std::cout << "[Thread] Exception. Aborting." << std::endl; }
  }};

  if (server_started) {
    std::cout << "Before wait" << std::endl;
    svr.wait_until_ready();
    std::cout << "After wait" << std::endl;

    svr.stop();
  }

  if (t.joinable()) t.join();

  std::cout << "Reached the end successfully." << std::endl;
  return 0;
}

OskiKervinen-MF · 2023-08-28T06:29:22Z

Nothing in your example stops the main thread from reading server_started before the server thread writes it, thereby never calling wait_until_ready() or stop(). So it can leave the server thread living forever in the cases where the exception does not happen.

That being said, you are correct that the problem can probably be solved by a sufficiently precise combination of at least two signal booleans. One for the server thread to signal its intent to inevitably start the server (making waiting for it safe), the other for the main thread to signal it already decided to shut down the server (allowing the server thread to not start when the call to stop is in the past). I'm not sure even this would work, I'd have to write the code and perform a thorough analysis.

But if you merge #1646, a simple call to svr.decommission() will always work.
No wait loops, no signal variables, no locking required.
I would argue that removing a need for subtle synchronization almost nobody will get right on the first try (and where failure appears as rare non-deterministic jams) is worth the addition of the new method.

yhirose · 2023-08-29T00:03:46Z

@OskiKervinen-MF thanks for the comment and identifying the bug in my example. I corrected it, and the following code should work either with throw or not.

int main() {
  std::cout << "Starting..." << std::endl;

  auto ok = std::atomic<bool>{true};

  httplib::Server svr;
  svr.Get("/hi", [&](const httplib::Request &, httplib::Response &res) {
    res.body = "hi...";
  });

  std::thread t{[&] {
    try {
      std::this_thread::sleep_for(std::chrono::milliseconds(100));
      // throw std::runtime_error("Some thing that happens to go wrong.");
      svr.listen("localhost", 3000);
    } catch (...) {
      std::cout << "[Thread] Exception. Aborting." << std::endl;
      ok = false;
    }
  }};

  std::cout << "Before wait" << std::endl;
  while (ok && !svr.is_running())
    ;
  std::cout << "After wait" << std::endl;

  if (ok) {
    std::cout << "Ready to make HTTP requests." << std::endl;
    httplib::Client cli("localhost", 3000);
    auto res = cli.Get("/hi");
    std::cout << res->body << std::endl;

    svr.stop();
  }

  if (t.joinable()) t.join();

  std::cout << "Reached the end successfully." << std::endl;
  return 0;
}

Regarding svr.decommission(), how can you apply the method in this example? Can it make the example any simpler?

OskiKervinen-MF · 2023-08-29T07:32:27Z

Okay, I'm impressed. You got it working with a single signal boolean.
I didn't think that was possible. But I think it is worth noting that it took
two people analyzing the problem over several iterations to get there.
With decommission(), it simplifies to:

int main(int argc, char** argv) {
  std::cout << "Starting..." << std::endl;

  httplib::Server svr;
  svr.Get("/hi", [&](const httplib::Request &, httplib::Response &res) {
    res.body = "hi...";
  });

  std::thread t{[&] {
    try {
      std::this_thread::sleep_for(std::chrono::milliseconds(1000));
      std::cout << "Starting server." << std::endl;
      if( argc > 1 )
        throw std::runtime_error("Some thing that happens to go wrong.");
      svr.listen("localhost", 3000);
    } catch (...) {
      std::cout << "[Thread] Exception. Aborting." << std::endl;
    }
  }};

  std::cout << "Press enter to stop." << std::endl;
  std::cin.get();
  std::cout << "Stopping server" << std::endl;
  svr.decommission();
  t.join();

  std::cout << "Reached the end successfully." << std::endl;
  return 0;
}

For a server you start once and stop once, this gives you an entirely worry-free way to shut it down. After this discussion, I think I can solve my specific problem without the change, so it's no longer vital for us, but I still think it would be useful for users of the library. In the end, it's your call. Thanks for the discussion in any case.

yhirose · 2023-08-29T22:46:23Z

Thanks for the update. I'll close this issue.

By the way, I tested the code with your branch, but I didn't get "Reached the end successfully.". Also I still don't fully understand how decommission should be used, and hard me to explain it on README....

[topic/1645-decommission-server] ~/Projects/cpp-httplib-fork/example$ make repro && ./repro
make: `repro' is up to date.
Starting...
Press enter to stop.
Starting server.
^C

[topic/1645-decommission-server] ~/Projects/cpp-httplib-fork/example$ make repro && ./repro 1
make: `repro' is up to date.
Starting...
Press enter to stop.
Starting server.
[Thread] Exception. Aborting.
^C

OskiKervinen-MF · 2023-08-30T05:56:19Z

Did you press enter to stop? I was trying to simulate the arbitrary timing of the shutdown by letting it depend on user input.

How to use decommission? You use it like stop. If this library was just being published, stop() should do what decommission does. But with the current design there could be users that start and stop the same server object multiple times and that won't work with decommission. So fixing stop to be safe would be a breaking change. Therefore the safe stoppage needs its own method. In other words, you use decommission when you are done with a server object.

yhirose · 2023-08-30T10:07:54Z

@OskiKervinen-MF that was my silly mistake... Yes, I confirmed that your pull request worked. I'll put my comment regarding the decommission method in #1646. Thanks!

Nitecon · 2023-12-11T16:34:37Z

I'm running into a similar problem now and it does not seem like the server is currently honoring stop(). When I call stop() on the svr it just hangs until systemd finally kills that process. Is this in the latest version?

yhirose · 2023-12-11T18:48:01Z

@Nitecon at least I know, all the unit tests for client stop and server stop in the latest version are working file on GitHub Action CI.
https://github.com/yhirose/cpp-httplib/actions

OskiKervinen-MF · 2023-12-12T07:05:06Z

@Nitecon You can make stop work by calling it with the very specific pattern yhirose shows in a comment above. They did not merge the convenient method to do this, because they considered it enough that's it's possible to work around the problem.

@yhirose Well, the the tests work because there is no test for this case. If you replace decommission with the current stop in the test #1646 tried to add, it will deadlock. That case is not supported without the work around presented in the comment I linked.

Nitecon · 2023-12-12T14:42:46Z

The thing that tends to show up for me is this kills a single thread from a stop perspective but not all threads int he pool so it's not really a "stop" for the server which should effectively handle all of the child threads appropriately by killing them. I'm still testing to make sure it's not anything else on my side though.

OskiKervinen-MF mentioned this issue Aug 22, 2023

Fixed: #1645 Impossible to shut down server before it has started #1646

Closed

yhirose closed this as completed Aug 29, 2023

yhirose reopened this Aug 30, 2023

yhirose added the information label Sep 15, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

It is impossible to shut down the server safely, concurrently and cleanly #1645

It is impossible to shut down the server safely, concurrently and cleanly #1645

OskiKervinen-MF commented Aug 21, 2023

jimmy-park commented Aug 21, 2023

OskiKervinen-MF commented Aug 22, 2023

jimmy-park commented Aug 22, 2023

OskiKervinen-MF commented Aug 22, 2023

yhirose commented Aug 28, 2023

OskiKervinen-MF commented Aug 28, 2023 •

edited

yhirose commented Aug 29, 2023

OskiKervinen-MF commented Aug 29, 2023

yhirose commented Aug 29, 2023

OskiKervinen-MF commented Aug 30, 2023

yhirose commented Aug 30, 2023

Nitecon commented Dec 11, 2023

yhirose commented Dec 11, 2023

OskiKervinen-MF commented Dec 12, 2023

Nitecon commented Dec 12, 2023

It is impossible to shut down the server safely, concurrently and cleanly #1645

It is impossible to shut down the server safely, concurrently and cleanly #1645

Comments

OskiKervinen-MF commented Aug 21, 2023

jimmy-park commented Aug 21, 2023

OskiKervinen-MF commented Aug 22, 2023

jimmy-park commented Aug 22, 2023

OskiKervinen-MF commented Aug 22, 2023

yhirose commented Aug 28, 2023

OskiKervinen-MF commented Aug 28, 2023 • edited

yhirose commented Aug 29, 2023

OskiKervinen-MF commented Aug 29, 2023

yhirose commented Aug 29, 2023

OskiKervinen-MF commented Aug 30, 2023

yhirose commented Aug 30, 2023

Nitecon commented Dec 11, 2023

yhirose commented Dec 11, 2023

OskiKervinen-MF commented Dec 12, 2023

Nitecon commented Dec 12, 2023

OskiKervinen-MF commented Aug 28, 2023 •

edited