Networks Lecture 11

Internet checksums

The checksum is the one's complement of the one's complement sum of the data, taken as 16-bit words (any carry out of the high-order bit wraps around into the low-order bits)

Detects all 1-bit errors

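Can a 2-bit error go undetected? YES. For example, if the same bit position flips from 0 to 1 in one word and from 1 to 0 in another word, the sum is unchanged.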

For these reasons, the Internet checksum isn't a particularly powerful means of detecting bit errors. We'll see a better way when we study CRCs (Cyclic Redundancy Checks).
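The two claims above can be checked with a small sketch. The function names and the three-word payload are made up for illustration, and the data is assumed to be already split into 16-bit words.

```python
def ones_complement_sum(words):
    """One's complement sum of 16-bit words, wrapping carries into the low bits."""
    total = 0
    for w in words:
        total += w
        total = (total & 0xFFFF) + (total >> 16)  # end-around carry
    return total


def internet_checksum(words):
    """Checksum value: the bitwise complement of the one's complement sum."""
    return ~ones_complement_sum(words) & 0xFFFF


def verify(words, checksum):
    """The receiver accepts the data if data plus checksum sums to all ones."""
    return ones_complement_sum(list(words) + [checksum]) == 0xFFFF


data = [0x1234, 0xF0F0, 0x00FF]      # hypothetical payload
csum = internet_checksum(data)

one_bit = [0x1235, 0xF0F0, 0x00FF]   # bit 0 of word 0 flipped 0 -> 1
two_bit = [0x1235, 0xF0F0, 0x00FE]   # ...and bit 0 of word 2 flipped 1 -> 0

print(verify(data, csum))     # intact data is accepted
print(verify(one_bit, csum))  # the 1-bit error is detected
print(verify(two_bit, csum))  # the 2-bit error cancels out and is accepted
```

The 2-bit case goes undetected because the two flips add 1 and subtract 1 at the same bit position, leaving the sum unchanged.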

Socket implementation

A TCP socket is a complex data structure with many state variables

Events that need to be handled by a socket implementation:

  1. send (put application data in the send buffer)
  2. receive (get data from the receive buffer, put in app buffer)
  3. choose a segment to send or retransmit to the network
  4. receive a segment from the network
  5. segment timeout

Send window and send buffer

Segments are added to the buffer, and assigned a sequence number (NextSendSeq) such that each byte of data has a unique, steadily increasing sequence number

Segments are removed from the buffer when an acknowledgment is received from the receiver

Send buffer size is finite

Send buffer should be larger than the maximum expected CWin size

The send buffer does not need to be a single contiguous chunk of memory. Could be a linked list of segments.

Receive window and receive buffer

Buffering out-of-order segments is optional, but increases efficiency by requiring fewer packets to be retransmitted.

There may be gaps, and there may be a gap at the beginning.
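As a rough illustration of gaps, here is a minimal sketch of a receive buffer that stores out-of-order segments by sequence number. The class and method names are hypothetical, and it assumes segments never partially overlap.

```python
class ReceiveBuffer:
    """Holds received (seq, payload) pairs, possibly out of order."""

    def __init__(self, next_expected_seq):
        self.next_expected_seq = next_expected_seq  # first byte not yet delivered
        self.segments = {}                          # seq -> payload bytes

    def add(self, seq, payload):
        # Simplification: ignore duplicates and data that was already delivered.
        if seq >= self.next_expected_seq and seq not in self.segments:
            self.segments[seq] = payload

    def gaps(self):
        """Return (start, end) ranges of missing bytes, end exclusive."""
        missing = []
        expected = self.next_expected_seq
        for seq in sorted(self.segments):
            if seq > expected:
                missing.append((expected, seq))
            expected = max(expected, seq + len(self.segments[seq]))
        return missing

    def deliverable_bytes(self):
        """Count the in-order bytes available before the first gap."""
        expected = self.next_expected_seq
        for seq in sorted(self.segments):
            if seq != expected:
                break
            expected += len(self.segments[seq])
        return expected - self.next_expected_seq


buf = ReceiveBuffer(next_expected_seq=0)
buf.add(100, b"x" * 50)          # arrives early: bytes 0..99 are still missing
gap_at_start = buf.gaps()        # one gap, at the beginning of the buffer
deliverable_before = buf.deliverable_bytes()
buf.add(0, b"y" * 100)           # the missing segment arrives and fills the gap
```

While the gap at the beginning exists, nothing can be handed to the application even though 50 bytes are buffered.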

The receive buffer size can grow, but should never shrink.

The application may request fewer bytes of data than are contained in the first segment in the receive buffer.  To handle this situation, segments must support being partially consumed.  One approach is to keep an offset variable in each segment.  When offset is 0, the segment is full (none of its data has been consumed).  As data is consumed from the segment, offset advances by the number of bytes consumed.  The segment is empty when offset equals the original number of bytes in the segment.
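The offset approach might look like the following sketch. The class name and methods are assumptions chosen for illustration; num_bytes_available plays the role of numBytesAvailable() in the pseudocode later in these notes.

```python
class Segment:
    """A received segment whose payload can be consumed a few bytes at a time."""

    def __init__(self, seq, payload):
        self.seq = seq          # sequence number of the first payload byte
        self.payload = payload  # bytes received from the network
        self.offset = 0         # number of payload bytes already consumed

    def num_bytes_available(self):
        return len(self.payload) - self.offset

    def consume(self, n):
        """Return up to n unconsumed bytes and advance the offset past them."""
        n = min(n, self.num_bytes_available())
        chunk = self.payload[self.offset:self.offset + n]
        self.offset += n
        return chunk

    def is_empty(self):
        return self.offset == len(self.payload)


seg = Segment(seq=1000, payload=b"hello world")
first = seg.consume(5)                 # application asks for only 5 bytes
remaining = seg.num_bytes_available()  # the rest stays in the segment
rest = seg.consume(100)                # asking for too much returns what is left
```

Keeping an offset avoids copying or reallocating the payload each time the application reads part of a segment.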

Example Pseudocode

These examples make the simplification that the application will not be blocked if either

  1. send() is called and the send buffer is completely full
  2. receive() is called and the receive buffer is completely empty

In both cases, we will return 0, and the application will try again later.

We also allow send() and receive() to copy fewer bytes than requested into or out of the send/receive buffers.

// Copy data from given application buffer into the socket's send buffer.
// Return number of bytes added to send buffer.
send(byte[] data) {
   numToSend = min(available space, data.length)

   if (numToSend > 0) {
      segment = new segment of size numToSend
      copy numToSend bytes from data to segment
      assign sequence number NextSendSeq to segment
      NextSendSeq += numToSend
      add segment to send buffer
   }

   return numToSend
}

// Copy data from socket receive buffer into given application buffer.
// Return number of bytes copied.
receive(byte[] data) {
   numCopied = 0

   for each segment in receive buffer {
      // Don't return out of order data!
      if (gap before segment) break;

      // Have we completely filled application buffer?
      if (numCopied == data.length) break;

      // Copy some data from this segment to the app buffer
      toCopy = min(data.length - numCopied, segment.numBytesAvailable())
      copy toCopy bytes from beginning of segment into data at position numCopied
      numCopied += toCopy
      remove toCopy bytes from beginning of segment
      if (segment is now empty) {
         adjust receive window and remove segment
      }
   }

   return numCopied
}

// Receive a segment from the network
receiveSegmentFromNetwork(segment) {
   if (segment contains data and
       is in receive window and 
       does not overlap previously received data) {
      add segment to receive buffer
   }

   if (segment is an acknowledgment) {
      update send window (removing acknowledged segments)
      if (ack is received in-order) {
         add sample (based on transmission time of acked segment) to RTT estimate
      }
      if (this is the third duplicate ACK) fast retransmit
   }

   if (segment requires ACK) {
      schedule ACK
   }
}

When does a segment require an ACK?

  1. it contains data
  2. the SYN bit is set
  3. the FIN bit is set

Updating the send window when an ACK is received: remove all segments whose bytes all precede the acknowledged sequence number (because acks are cumulative, one ACK can acknowledge several segments).
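A minimal sketch of this update, assuming the send buffer is a queue of segments ordered by sequence number and that the ACK carries the next byte the receiver expects. SentSegment and handle_ack are hypothetical names.

```python
from collections import deque


class SentSegment:
    def __init__(self, seq, size):
        self.seq = seq    # sequence number of the first byte
        self.size = size  # number of payload bytes

    def end(self):
        return self.seq + self.size  # sequence number just past the last byte


def handle_ack(send_buffer, ack_seq):
    """Drop every segment fully covered by a cumulative ACK.

    ack_seq is the next byte the receiver expects, so a segment is
    acknowledged once all of its bytes lie below ack_seq.
    Returns the number of bytes removed from the buffer.
    """
    freed = 0
    while send_buffer and send_buffer[0].end() <= ack_seq:
        freed += send_buffer.popleft().size
    return freed


send_buffer = deque([SentSegment(0, 100), SentSegment(100, 100), SentSegment(200, 50)])
freed = handle_ack(send_buffer, ack_seq=200)  # acknowledges the first two segments
```

In this sketch an ACK that falls in the middle of a segment leaves that segment in the buffer, since only part of it has been acknowledged.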

// Attempt to send or retransmit a segment.
// At most, a single segment will be selected.
sendOrRetransmit() {
   // Count of how many bytes we can have
   // in-flight, according to the current remote receive window advertisement
   // and the current congestion window.
   int canSend = min(remote RecvWin advertisement, CWin)

   for each segment in send window {
      if (segment not sent yet or segment timed out) {
         if (segment.size() <= canSend) {
            transmit(segment)
            segment.sent = true
            reset timeout for segment
         }
         break;
      }

      // The segment is in-flight, so it counts against
      // the maximum number of bytes we can send.
      canSend -= segment.size()
   }
}

The invariant we are trying to maintain is that the number of bytes in the pipe cannot exceed either the remote receive window (RecvWin) or the congestion window (CWin).
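To see the invariant hold, here is a runnable sketch of the sendOrRetransmit() logic. The Seg class, the timed_out flag, and the transmit callback are assumptions made for illustration, and this version lets a segment go out when it exactly fills the remaining window.

```python
class Seg:
    def __init__(self, size):
        self.size = size        # payload bytes in this segment
        self.sent = False       # has it been transmitted at least once?
        self.timed_out = False  # set by a (not shown) timer when the timeout fires


def send_or_retransmit(send_window, recv_win, cwin, transmit):
    """Transmit at most one segment without exceeding min(recv_win, cwin)."""
    can_send = min(recv_win, cwin)
    for seg in send_window:
        if not seg.sent or seg.timed_out:
            if seg.size <= can_send:
                transmit(seg)
                seg.sent = True
                seg.timed_out = False  # stands in for resetting the timer
                return seg
            return None                # first candidate does not fit; stop
        can_send -= seg.size           # in-flight bytes count against the limit
    return None


window = [Seg(400), Seg(400), Seg(400)]
sent = []
for _ in range(3):
    send_or_retransmit(window, recv_win=1000, cwin=900, transmit=sent.append)
in_flight = sum(seg.size for seg in sent)
```

With a limit of 900 bytes only the first two 400-byte segments are transmitted; the third call finds 100 bytes of room and sends nothing.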

When a segment times out, the socket's congestion window and threshold should be adjusted.
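These notes do not say how the adjustment is made. One common choice, following the standard TCP congestion control rules, is to set the threshold to half the bytes in flight (but at least two segments) and collapse the congestion window to one segment. The sketch below assumes that rule; on_timeout and the MSS value are hypothetical.

```python
MSS = 1460  # assumed maximum segment size in bytes


def on_timeout(bytes_in_flight, mss=MSS):
    """Return (new_cwin, new_threshold) after a retransmission timeout.

    Assumed rule: threshold = max(bytes_in_flight / 2, 2 * MSS),
    and the congestion window restarts at one segment.
    """
    threshold = max(bytes_in_flight // 2, 2 * mss)
    return mss, threshold


new_cwin, new_threshold = on_timeout(bytes_in_flight=16 * MSS)
```

After the timeout the sender is back in slow start and grows the window until it reaches the new threshold.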