Either the servers are going to have to talk to each other to agree on a version for the message or some other trick is going to have to be used.
The simplest approach is to send a request to one of the servers to get a version number and then send the version out with the message to all of the servers in parallel. This is nice because the message can be serialized to the network in parallel and there are mechanisms in place that will do this very quickly.
Here's a diagram of the first scenario. If any of the diagrams are hard to read, you can click on it to see a larger image.
Instead of sending a request for a version to the first server, the client could send the message and expect a version number in response. It could then add this to the message and send it to the other servers. This would require multiple serialization of the message to on-wire format, but might be faster for small messages.
Another approach is to delegate the sending of the message to one of the servers ("one hop messaging"). The client sends the message to one of the servers and it, in turn, sends the message to the other servers after adding a version stamp to it. This also requires some level of extra serialization since the client must write the message to the network and then the selected server must forward the message to the other servers over the network.
We can simplify the acknowledgment scheme by having each server send an ack directly back to the client instead of piping them through Server1. This would cause context switching in the client, but might be better than funneling the acks through Server1.
Yet another approach is to have the client send the message to all of the servers and include a tag that selects one of the servers to supply the version tag. After the selected server (Server1 in the diagram below) receives the message it stamps it with a version and then sends that version to the other servers. The other servers wait for the version number before accepting the message. This, too, requires extra context switching because the other servers have to wait for a signal that the version number has been received.
The payload column shows the size of message that was used in each test run, and each column shows how many seconds it took the approch to handle 1 million messages. The "no versioning" column is a base-line that shows the performance of simple send-with-acknowledgement messaging.
The results were a little surprising to me. I had expected the last approach, where the client sends the message to all servers and they wait for one to send a version number, to have the best performance. Instead, all but one of them converge to the same performance level when messages reach 10,000 bytes in size. At this level they are only moving about 70mb of data through the system per second, so they aren't being throttled by the network, but with the extra synchronization points and context switches CPU was becoming a limiting factor.
The "version request" and "message returns version" scenarios (the first two diagrams above) are clear losers because server2 and server3 do not even see the message until a complete send and response cycle is performed with server1.
The "one hop" scenario had a poor showing because of the long acknowledgement chain, with both server2 and server3 sending their acks to server1. Server1 has to wait for both acks before it finally sends its own ack back to the client.
The clear winner is the "one hop, ack client" algorithm with servers sending acknowledgements directly to the client. It even converged with and then passed the base-line "no versioning" scenario at about 3000 bytes/message.
No comments:
Post a Comment