Network Working Group                                       J. Burchfiel
Request for Comments: 467                                   R. Tomlinson
NIC: 14741                                       Bolt Beranek and Newman
                                                        20 February 1973

                 Proposed Change To Host-Host Protocol
                Resynchronization Of Connection Status

I. Introduction

   The current Host-Host protocol (NIC #8246) contains no provisions for
   resynchronizing the status information kept at the two ends of each
   connection.  In particular, if either host suffers a service
   interruption, or if a control message is lost or corrupted in an
   interface or in the subnet, the status information at the two ends of
   the connection will be inconsistent.

   Since the current protocol provides no way to correct this condition,
   the NCP's at the two ends stay "confused" forever.  A frequent and
   frustrating symptom of this effect is the "lost allocate" phenomenon,
   where the receiving NCP believes that it has bit and message
   allocations outstanding, while the sending NCP believes that it does
   not have any allocation.  As a result, information flow over that
   connection can never be restarted.

   Use of the Host-Host RST (reset) command is inappropriate here, as it
   destroys all connections between the two hosts.  What is needed is a
   way to reset only the affected connection without disturbing any
   others.

   A second troublesome symptom of inconsistency in status information
   is the "half-closed" connection: after a service interruption or
   network partitioning, one NCP may believe that a connection is still
   open, while the other believes that the connection is closed (does
   not exist).  When such an inconsistency is discovered, the "open" end
   of the connection should be closed.















Burchfiel                                                       [Page 1]


RFC 467                                                    February 1973


II. The RCR and RCS Commands

   To achieve resynchronization of allocation, we propose the addition
   of the following two commands to the host-host protocol.

         8           8
   +-----------+-----------+
   |    RCS    |   link    |   Reset connection by sender
   +-----------+-----------+

         8           8
   +-----------+-----------+
   |    RCR    |   link    |   Reset connection by receiver
   +-----------+-----------+
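
   As a rough illustration of the two formats above, each command can
   be packed as an 8-bit opcode followed by an 8-bit link number.  The
   opcode values OP_RCS and OP_RCR and the helper names are
   placeholders invented for this sketch; this memo does not assign
   numeric codes.

```python
import struct

# Hypothetical opcode values: this memo does not assign numeric codes
# to RCS and RCR, so these are placeholders for illustration only.
OP_RCS = 0xF0
OP_RCR = 0xF1


def encode_reset(opcode, link):
    """Pack an 8-bit opcode followed by an 8-bit link number."""
    if not 0 <= link <= 0xFF:
        raise ValueError("link must fit in 8 bits")
    return struct.pack("BB", opcode, link)


def decode_reset(data):
    """Unpack (opcode, link) from a two-byte command."""
    return struct.unpack("BB", data)
```

   For example, decode_reset(encode_reset(OP_RCR, 200)) gives back the
   pair (OP_RCR, 200).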

   The RCS command is sent from the host sending on "link" to the host
   receiving on "link".  This command may be sent whenever the sending
   host desires to re-synch the status information associated with the
   connection.  Some circumstances in which the sending Host may choose
   to do this are:

      1.) After a timeout when there is traffic to move but no
      allocation.  (This assumes that an allocation has been lost.)

      2.) When an inconsistent event occurs associated with that
      connection (e.g. an outstanding allocation in excess of 2^32 bits
      or 2^16 messages).
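
   The two circumstances above might be tested as in the following
   sketch.  The function name, its parameters, and the reading of "no
   allocation" as "either counter is zero" are assumptions made for
   illustration; the timeout value is left to the implementation.

```python
MAX_BITS = 2**32      # bit allocation bound quoted in the text
MAX_MESSAGES = 2**16  # message allocation bound quoted in the text


def sender_should_resync(bits, messages, has_traffic, waited, timeout):
    """Return True if either circumstance listed above applies."""
    # Circumstance 1: traffic is queued, nothing can be sent, and the
    # (hypothetical) timeout has expired.  Treating a zero in either
    # counter as "no allocation" is an assumption of this sketch.
    starved = has_traffic and (bits == 0 or messages == 0)
    timed_out = starved and waited >= timeout
    # Circumstance 2: the outstanding allocation exceeds the bounds.
    impossible = bits > MAX_BITS or messages > MAX_MESSAGES
    return timed_out or impossible
```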

   The mechanics of re-synchronizing the allocations are simply:

      1.) Empty all messages and allocates from the "pipeline".

      2.) Zero the variables at both ends indicating bit and message
      allocation.

      3.) Restart allocate/message exchanges in the normal way.

   This resynchronization scheme is race-free because the RCS and RCR
   commands are used as a positive acknowledgement pair.

III. Resynchronization by Sender

   To initiate resynchronization, the sending NCP should:

      1.) Put the connection in a "waiting-for-RCR-reply" state.  No
      more regular messages may be transmitted over this connection
      until the RCR reply is received.

      2.) Wait until the message pipeline is empty, i.e. until a RFNM
      has been received for each regular message sent over this
      connection.  This synchronizes the control and data activity, and
      also assures that the data stream will not be corrupted during the
      control re-synchronization exchange.

      3.) Send the RCS command.

      4.) Continue to process allocates normally, updating the variables
      which indicate outstanding bit and message allocation.

   When the receiving NCP receives the RCS, it should:

      1.) Zero the variables indicating outstanding bit and message
      allocation.

      2.) Reset the connection to the state which indicates readiness to
      accept a message.

      3.) Confirm the re-synchronization by sending the RCR reply.

      4.) Reconsider bit and message allocation, and send an ALL command
      for any allocation it cares to do.

   When the sending host receives the RCR reply, it should:

      1.) Zero the variables indicating outstanding bit and message
      allocation.

      2.) Put the connection into the "ready-to-send-message" state in
      preparation for any forthcoming ALL commands.

   At this point, the "pipeline" contains no messages and no allocates,
   and the outstanding allocation variables at both ends are in
   agreement (with value zero).
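
   The sender-initiated exchange can be sketched as two small state
   machines.  This is only a model under stated assumptions: the class
   and method names, the "ready-to-accept-message" state name, and the
   allocation figures are invented for illustration, and the network
   is reduced to a list of control commands delivered in order.

```python
class Sender:
    """Sending NCP's view of one connection (illustrative names)."""

    def __init__(self, bits=0, msgs=0, unacked=0):
        self.state = "ready-to-send-message"
        self.bits, self.msgs = bits, msgs
        self.unacked = unacked      # regular messages still awaiting a RFNM
        self.rcs_sent = False
        self.sent = []              # control commands handed to the network

    def start_resync(self):
        self.state = "waiting-for-RCR-reply"        # sender step 1
        self._send_rcs_if_drained()

    def on_rfnm(self):
        self.unacked -= 1
        self._send_rcs_if_drained()

    def _send_rcs_if_drained(self):
        # Sender steps 2 and 3: RCS goes out only after every RFNM is in.
        waiting = self.state == "waiting-for-RCR-reply"
        if waiting and self.unacked == 0 and not self.rcs_sent:
            self.sent.append("RCS")
            self.rcs_sent = True

    def on_all(self, msgs, bits):
        self.msgs += msgs                           # sender step 4
        self.bits += bits

    def on_rcr(self):
        self.bits = self.msgs = 0
        self.rcs_sent = False
        self.state = "ready-to-send-message"


class Receiver:
    """Receiving NCP's view of the same connection."""

    def __init__(self, bits=0, msgs=0):
        self.state = "ready-to-accept-message"
        self.bits, self.msgs = bits, msgs
        self.sent = []

    def on_rcs(self, new_msgs, new_bits):
        self.bits = self.msgs = 0                   # receiver step 1
        self.state = "ready-to-accept-message"      # receiver step 2
        self.sent.append("RCR")                     # receiver step 3
        self.msgs, self.bits = new_msgs, new_bits   # receiver step 4
        self.sent.append(("ALL", new_msgs, new_bits))


def demo_sender_reset():
    # "Lost allocate": the receiver thinks 8000 bits / 4 messages are
    # outstanding, while the sender thinks it has nothing.
    s = Sender(bits=0, msgs=0, unacked=1)
    r = Receiver(bits=8000, msgs=4)
    s.start_resync()
    assert s.sent == []          # one RFNM still missing, so no RCS yet
    s.on_rfnm()
    assert s.sent == ["RCS"]
    r.on_rcs(new_msgs=2, new_bits=4000)
    for cmd in r.sent:           # deliver the RCR, then the ALL, in order
        if cmd == "RCR":
            s.on_rcr()
        else:
            s.on_all(cmd[1], cmd[2])
    return s, r
```

   After demo_sender_reset() both ends hold the same allocation, namely
   the fresh one the receiver chose to grant after the reset.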

IV. Resynchronization By Receiver

   The re-synchronization sequence may be triggered by the receiving
   NCP.  Such resynchronization could be initiated manually by TIP and
   TELNET users who are expecting output but receiving none.  Again
   assuming that allocation has been lost, the appropriate action is to
   reset the connection by sending an RCR command.  This action is also
   appropriate if an inconsistent event occurs with respect to the
   connection (e.g. arrival of a message which exceeds allocation).

   To initiate re-synchronization, the receiving NCP should:

      1.) Put the connection into a "waiting-for-RCS-reply" state.  No
      more allocates may be transmitted for this connection until the
      RCS reply is received.

      2.) Send the RCR command.

      3.) Continue to process regular messages normally, updating the
      variables which indicate outstanding bit and message allocation.

   When the sending NCP receives the RCR command, it should:

      1.) Wait until the message pipeline is empty, i.e. until the RFNM
      has been received for each regular message sent over the
      connection.  This synchronizes the control and data activity, and
      also assures that the data stream will not be corrupted during the
      control re-synchronization exchange.

      2.) Zero the variables indicating outstanding bit and message
      allocation.

      3.) Put the connection into the "ready-to-send-message" state in
      preparation for any forthcoming ALL commands.

      4.) Confirm the re-synchronization by sending the RCS reply.

   When the receiving host receives the RCS reply, it should:

      1.) Zero the variables indicating outstanding bit and message
      allocation.

      2.) Reset the connection to the state which indicates readiness to
      accept a message.

      3.) Reconsider bit and message allocation, and send an ALL command
      for any allocation it cares to do.
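
   A comparable sketch of the receiver-initiated case follows.  It
   highlights the two points that differ from Section III: the
   receiver holds back allocates while waiting, and the sender delays
   its RCS reply until the last RFNM is in.  All names and figures are
   again assumptions made for illustration.

```python
class Receiver:
    def __init__(self, bits, msgs):
        self.state = "ready-to-accept-message"
        self.bits, self.msgs = bits, msgs
        self.sent = []

    def start_resync(self):
        self.state = "waiting-for-RCS-reply"        # receiver step 1
        self.sent.append("RCR")                     # receiver step 2

    def allocate(self, msgs, bits):
        if self.state == "waiting-for-RCS-reply":
            return False                            # allocates are held back
        self.msgs += msgs
        self.bits += bits
        self.sent.append(("ALL", msgs, bits))
        return True

    def on_message(self, nbits):
        self.msgs -= 1                              # receiver step 3
        self.bits -= nbits

    def on_rcs(self, new_msgs, new_bits):
        self.bits = self.msgs = 0
        self.state = "ready-to-accept-message"
        self.allocate(new_msgs, new_bits)


class Sender:
    def __init__(self, bits, msgs, unacked):
        self.state = "ready-to-send-message"
        self.bits, self.msgs = bits, msgs
        self.unacked = unacked      # regular messages still awaiting a RFNM
        self.rcr_pending = False
        self.sent = []

    def on_rcr(self):
        self.rcr_pending = True
        self._reply_if_drained()

    def on_rfnm(self):
        self.unacked -= 1
        self._reply_if_drained()

    def _reply_if_drained(self):
        if self.rcr_pending and self.unacked == 0:  # sender step 1
            self.bits = self.msgs = 0               # sender step 2
            self.state = "ready-to-send-message"    # sender step 3
            self.sent.append("RCS")                 # sender step 4
            self.rcr_pending = False

    def on_all(self, msgs, bits):
        self.msgs += msgs
        self.bits += bits


def demo_receiver_reset():
    r = Receiver(bits=8000, msgs=4)
    s = Sender(bits=0, msgs=0, unacked=1)   # the two ends disagree
    r.start_resync()
    held = r.allocate(1, 1000)      # refused while waiting for RCS
    s.on_rcr()
    early = list(s.sent)            # nothing yet: a RFNM is outstanding
    r.on_message(500)               # the in-flight message arrives
    s.on_rfnm()                     # pipeline empty, so RCS goes out
    r.on_rcs(new_msgs=2, new_bits=4000)
    _, msgs, bits = r.sent[-1]      # the fresh ALL reaches the sender
    s.on_all(msgs, bits)
    return s, r, held, early
```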

V. Simultaneous Resynchronization

   This specification for a re-synchronization exchange is guaranteed to
   restore the allocation information at the two ends to a consistent
   state.  This happens correctly whether the re-synchronization is
   triggered by the sender, the receiver, or both at the same time.
   When both ends initiate a command at the same time (so that the RCS
   and RCR commands cross in the pipeline), each interprets the other's
   command as a confirmation reply; thus, the resynchronization happens
   correctly independent of the relative timing.

   The essential factor here is that when either end receives the reset
   request, it is sure that the other end will take no further actions
   which could affect the allocation variables.  The activity which
   occurs during simultaneous resynchronization by both ends is as
   follows:

   The sending NCP:

      1. Puts the connection into a "waiting-for-RCR-reply" state.  No
      more regular messages may be transmitted over this connection
      until the RCR reply is received.

      2. Waits until the message pipeline is empty, i.e. until a RFNM
      has been received for each regular message sent over this
      connection.  This synchronizes the control and data activity, and
      also assures that the data stream will not be corrupted during the
      control re-synchronization exchange.

      3. Sends the RCS command.

      4. Continues to process allocates normally, updating the variables
      which indicate outstanding bit and message allocation.

   Concurrently with 1, 2, 3 and 4 above, the receiving NCP:

      5. Puts the connection into a "waiting-for-RCS-reply" state.  No
      more allocates may be transmitted for this connection until the
      RCS reply is received.

      6. Sends the RCR command.

      7. Continues to process regular messages normally.

   The RCS and RCR commands cross somewhere in the pipeline.  When the
   sender receives the RCR command, it interprets it as a reply to its
   own RCS command.  It then:

      8. Zeroes the variables indicating outstanding bit and message
      allocation.

      9. Puts the connection into the "ready-to-send-message" state in
      preparation for any forthcoming ALL commands.

   Concurrently with 8 and 9 above, the receiving NCP will receive the
   RCS command.  It will interpret it as a reply to its own RCR command.
   It then:

      10. Zeroes the variables indicating outstanding bit and message
      allocation.

      11. Resets the connection to the state which indicates readiness
      to accept a message.

      12. Reconsiders bit and message allocation, and sends an ALL
      command for any allocation it cares to do.
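
   The crossing case can be sketched with one handler per end that
   treats an incoming reset as a reply when the end is already
   waiting, and as a request otherwise.  The dictionary layout, the
   function names, and the allocation figures are assumptions of this
   sketch; the sender's pipeline is taken to be empty already.

```python
def sender_receive_rcr(s, to_receiver):
    """RCR arrives at the sending end (pipeline assumed already empty)."""
    is_reply = s["state"] == "waiting-for-RCR-reply"
    s.update(bits=0, msgs=0, state="ready-to-send-message")    # steps 8, 9
    if not is_reply:
        to_receiver.append("RCS")    # RCR was a request, so confirm it


def receiver_receive_rcs(r, to_sender, new_msgs, new_bits):
    """RCS arrives at the receiving end."""
    is_reply = r["state"] == "waiting-for-RCS-reply"
    r.update(bits=0, msgs=0, state="ready-to-accept-message")  # steps 10, 11
    if not is_reply:
        to_sender.append("RCR")      # RCS was a request, so confirm it
    r.update(bits=new_bits, msgs=new_msgs)                     # step 12
    to_sender.append(("ALL", new_msgs, new_bits))


def simulate_crossing():
    s = {"state": "ready-to-send-message", "bits": 0, "msgs": 0}
    r = {"state": "ready-to-accept-message", "bits": 8000, "msgs": 4}
    to_receiver, to_sender = [], []   # commands in flight, one list per way

    s["state"] = "waiting-for-RCR-reply"      # steps 1 to 3
    to_receiver.append("RCS")
    r["state"] = "waiting-for-RCS-reply"      # steps 5 and 6
    to_sender.append("RCR")

    # The two commands cross; each end sees the other's as a reply.
    assert to_sender.pop(0) == "RCR"
    sender_receive_rcr(s, to_receiver)
    assert to_receiver.pop(0) == "RCS"
    receiver_receive_rcs(r, to_sender, new_msgs=2, new_bits=4000)

    _, msgs, bits = to_sender.pop(0)          # the fresh ALL reaches the sender
    s["msgs"] += msgs
    s["bits"] += bits
    return s, r, to_sender, to_receiver
```

   In this run exactly one RCS and one RCR are exchanged: neither end
   answers the other's command with a second reset, and both finish
   with the same allocation.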

VI. The Problem Of Half-closed Connections

   The above procedures provide a way to resynchronize a connection
   after a brief lapse by a communications component, which results in
   lost messages or allocates for an open connection.

   A longer and more severe interruption of communication may result
   from a partitioning of the subnet or from a service interruption on
   one of the communicating hosts.  It is undesirable to tie up
   resources indefinitely under such circumstances, so the user is
   provided with the option of freeing up these resources (including
   himself) by unilaterally dissolving the connection.  Here
   "unilaterally" means sending the CLS command and closing the
   connection without receiving the CLS acknowledgement.  Note that this
   is legal only if the subnet indicates that the destination is dead.
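
   A minimal sketch of the unilateral close described above, assuming
   a connection table keyed by local socket and a flag supplied by the
   subnet; the function name, the table layout, and the tuple used to
   stand for the CLS command are invented for illustration.

```python
def unilateral_close(connections, local_socket, destination_dead, outbox):
    """Send CLS and free the connection without waiting for the reply.

    Returns False (and changes nothing) unless the subnet has reported
    the destination dead.
    """
    if not destination_dead:
        return False
    foreign_socket = connections[local_socket]["foreign"]
    outbox.append(("CLS", local_socket, foreign_socket))
    del connections[local_socket]     # do not wait for the matching CLS
    return True
```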

   When service is restored after such an interruption, the status
   information at the two ends of the connection is out of
   synchronization.  One end believes that the connection is open, and
   may proceed to use the connection.  The disconnecting end believes
   that the connection is closed (does not exist), and may proceed to
   re-initialize communication by opening a new connection (RTS or STR
   command) using the same local socket.

   The re-synchronization needed here is to properly close the open end
   of the connection when the inconsistency is detected.  We propose to
   accomplish this by changing the semantics of three existing host-host
   protocol commands.

VII. Redefinition of RTS, STR, ERR (link) to Handle Half-closed
   Connections

   The "missing CLS" situation described above can manifest itself in
   two ways.  The first way involves action taken by the NCP at the
   "open" end of the connection.  It may continue to send regular
   messages on the link of the half-closed connection, or control
   messages referencing its link.  The NCP at the "closed" end should
   respond with the ERR message, specifying that the link is unknown
   (error code 5: the link does not correspond to an open connection).
   On receipt of such an ERR message, the NCP at the "open" end should
   close the connection by modifying its tables (without sending any
   CLS command), thereby bringing both ends into agreement.
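
   Both halves of this first case can be sketched as follows.  The
   table representations (a set of open links at the "closed" end, a
   dictionary keyed by link at the "open" end) and the function names
   are assumptions made for illustration.

```python
ERR_LINK_NOT_OPEN = 5    # error code cited in the text


def closed_end_receive(open_links, link):
    # "Closed" end: answer traffic on an unknown link with ERR code 5.
    if link not in open_links:
        return ("ERR", ERR_LINK_NOT_OPEN, link)
    return None


def open_end_receive_err(connections, link, code):
    # "Open" end: drop the half-closed connection from the tables.
    # Nothing is transmitted here; in particular no CLS is generated.
    if code == ERR_LINK_NOT_OPEN and link in connections:
        del connections[link]
        return True
    return False
```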

   The second way this inconsistency can show up involves actions
   initiated by the NCP at the "closed" end.  It may (thinking the
   connection is closed) send an STR or RTS to reopen the connection.
   The NCP at the "open" end will detect an inconsistency when it
   receives such an RTS or STR command, because it specifies the same
   foreign socket as an existing open connection.  In this case, the NCP
   at the "open" end should close the connection (without sending any
   CLS command) to bring the two ends into agreement before responding
   to the RTS/STR.
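
   The second case might look like the sketch below at the "open"
   end.  The connection table keyed by local socket, the "pending"
   list standing in for normal RTS/STR processing, and the function
   name are all hypothetical.

```python
def open_end_receive_request(connections, pending, command, foreign_socket):
    """Handle an incoming RTS or STR at the end that may be half-open."""
    # Any open connection to the same foreign socket must be stale.
    stale = [local for local, conn in connections.items()
             if conn["foreign"] == foreign_socket]
    for local in stale:
        del connections[local]       # close silently: no CLS is sent
    pending.append((command, foreign_socket))   # then handle it as usual
    return stale
```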

VIII. Conclusions

   The scheme presented in Sections II through V to resynchronize
   allocation has one very important property: the data stream is
   preserved through the exchange.  Since no data is lost, it is safe to
   initiate re-synchronization from either end at any time.  When in
   doubt, re-synchronize.

   The changes in the semantics of RTS, STR, and ERR(code 5) commands
   provide the synchronization needed to complete the closing of "half-
   closed" connections.

   The protocol changes above will make the host-host protocol far more
   robust, in that useful work can continue in spite of lapses by the
   communications components.


         [ This RFC was put into machine readable form for entry ]
            [ into the online RFC archives by Via Genie 08/00]