TCP keep-alive in Wireshark: what is it?

TCP Keep-Alive packets sent after waiting about 29 sec

We are experiencing a performance issue with one of our applications. I ran Wireshark on the server and captured the traffic. I noticed that after a few packets the client sends a TCP Keep-Alive packet after waiting almost 29 seconds. Can someone elaborate on the issue, please?

keep-alive tcp

asked 11 Oct ’17, 18:38

cnladmin

Generally, a ‘keep-alive’ packet is a probe to figure out whether the other endpoint is still active on this particular TCP connection.

In your case some data exchange happens between the server and the client, then the server sends a last data packet (261194) and stops transmitting. The client ACKs this packet, but because it receives neither more data nor a connection close, it becomes uncertain about what happened to the other end. So after a timeout it sends keep-alives to ask the server: are you still alive, or have you been rebooted or gotten stuck somehow?

The server responds with a keep-alive ACK, which means: my TCP stack is still active and is maintaining this TCP connection, BUT I am not receiving any data or commands from my own application layer for this connection. Later it starts to send data again.
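
For context, a short keep-alive timer like the ~29 seconds seen here is almost never an OS default (Windows, for example, waits two hours before the first probe), so it usually points at a setting made by the application or its networking library. A minimal sketch of how a .NET client can configure this on a socket, assuming .NET Core 3.0+ for the per-socket options; the values are illustrative:

using System.Net.Sockets;

var socket = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);

// Turn on TCP keep-alive probing for this connection.
socket.SetSocketOption(SocketOptionLevel.Socket, SocketOptionName.KeepAlive, true);

// Idle seconds before the first probe, seconds between probes, and how many
// unanswered probes are tolerated before the connection is declared dead.
socket.SetSocketOption(SocketOptionLevel.Tcp, SocketOptionName.TcpKeepAliveTime, 29);
socket.SetSocketOption(SocketOptionLevel.Tcp, SocketOptionName.TcpKeepAliveInterval, 5);
socket.SetSocketOption(SocketOptionLevel.Tcp, SocketOptionName.TcpKeepAliveRetryCount, 3);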

So, the reasons could be:

  • the server app process gets stuck from time to time;
  • the server process simply has nothing to send;
  • server overload (though the pauses here look too regular for that).

The next thing we need to know is what type of application this is; maybe this is normal behavior? It would also be useful to monitor the server app process itself.

answered 11 Oct ’17, 22:25

Packet_vlad

Thank you very much for your great answer. The application name is Bid2Win; it is a construction application for job bidding. I will monitor the app process. Do you recommend any application for this task, or should we just use Windows process monitor?

(12 Oct ’17, 13:34) cnladmin

I myself prefer the Procmon and TCPView utilities from the Sysinternals package. In Procmon you can add a filter to log only the process you need and see its activity: network, filesystem and so on. That way you can spot whether Bid2Win was transmitting data at any particular point in time. At the same time you can capture traffic with Wireshark and later correlate the two.

Note also that the data transfer stops for approximately 60 seconds, which looks like some timer (hardcoded or defined somewhere in settings). Maybe you’ll be able to spot this number somewhere in the software.

Wireshark TCP Keep-Alive detection

I have a trace showing two packets, both with a TCP length of 1 byte, both with a payload of 0x00, and both with the ACK flag set. In fact they are identical except for the seq. no., ack. no. and checksum. The Info column shows TCP segment of a reassembled PDU for the first packet and TCP Keep-Alive for the second packet.

[Screenshot: hex dumps of packets 1 and 8]

The screenshot above shows the hex dumps of both packets (1 and 8). Why does Wireshark interpret these two packets differently? I believe that they are both Keep-Alives.

Thanks and regards. Paul

keep-alive tcp keepalive

asked 29 Jul ’15, 14:41

PaulOfford

This is not easy to answer because we need to see the sequence numbers of the packets from the same source before the two packets you posted. Can you upload the (sanitized?) pcap to cloudshark? It’s much easier to work with pcaps than with screenshots.

(29 Jul ’15, 14:46) Jasper ♦♦

OK — I’ve just had a bit of a lesson on TCP from a colleague and I now understand the issue.

A TCP keep-alive is sent with a sequence number one less than the sequence number the receiver is expecting. Because the receiver has already ACKed that sequence number (it was within the range of an earlier segment), it just ACKs it again and discards the segment (packet).

In my trace I haven't captured the previous packets, so Wireshark doesn't know what the next expected sequence number should be and is therefore unable to identify the first packet as a keep-alive.
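
To make that concrete with made-up numbers: if the receiver is expecting sequence number 1000, the keep-alive arrives with sequence number 999 and zero or one bytes of payload; the receiver simply answers with another ACK for 1000 and drops the data. Wireshark can only learn that expected value from earlier packets in the capture, which is why the first packet in my trace falls back to the generic interpretation.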

Best regards. Paul

answered 30 Jul ’15, 03:37

PaulOfford

Yes, that is the answer I would give you, too.

So I think you can accept your own answer, so others can learn.

(30 Jul ’15, 12:30) Christian_R

I’ll do it for Paul, no problem 😉

7.5. TCP Analysis

By default, Wireshark’s TCP dissector tracks the state of each TCP session and provides additional information when problems or potential problems are detected. Analysis is done once for each TCP packet when a capture file is first opened. Packets are processed in the order in which they appear in the packet list. You can enable or disable this feature via the “Analyze TCP sequence numbers” TCP dissector preference.
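
Packets that received any analysis flag can be isolated with the display filter tcp.analysis.flags.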

For analysis of data or protocols layered on top of TCP (such as HTTP), see Section 7.8.3, “TCP Reassembly”.

Figure 7.7. “TCP Analysis” packet detail items

TCP Analysis flags are added to the TCP protocol tree under “SEQ/ACK analysis”. Each flag is described below. Terms such as “next expected sequence number” and “next expected acknowledgment number” refer to the following:

Next expected sequence number

The last-seen sequence number plus segment length. Set when there are no analysis flags and for zero window probes. This is initially zero and calculated based on the previous packet in the same TCP flow. Note that this may not be the same as the tcp.nxtseq protocol field.

Next expected acknowledgment number

The last-seen sequence number for segments. Set when there are no analysis flags and for zero window probes.

Last-seen acknowledgment number

Always updated for each packet. Note that this is not the same as the next expected acknowledgment number.

TCP ACKed unseen segment

Set when the expected next acknowledgment number is set for the reverse direction and it’s less than the current acknowledgment number.

TCP Dup ACK <frame>#<acknowledgment number>

Set when all of the following are true:

  • The segment size is zero.
  • The window size is non-zero and hasn’t changed.
  • The next expected sequence number and last-seen acknowledgment number are non-zero (i.e., the connection has been established).
  • SYN, FIN, and RST are not set.

TCP Fast Retransmission

Set when all of the following are true:

  • This is not a keepalive packet.
  • In the forward direction, the segment size is greater than zero or the SYN or FIN is set.
  • The next expected sequence number is greater than the current sequence number.
  • We have at least two duplicate ACKs in the reverse direction.
  • The current sequence number equals the next expected acknowledgment number.
  • We saw the last acknowledgment less than 20ms ago.

Supersedes “Out-Of-Order” and “Retransmission”.

TCP Keep-Alive

Set when the segment size is zero or one, the current sequence number is one byte less than the next expected sequence number, and none of SYN, FIN, or RST are set.

Supersedes “Fast Retransmission”, “Out-Of-Order”, “Spurious Retransmission”, and “Retransmission”.
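
As a minimal sketch (not Wireshark’s actual code), the rule reduces to a simple predicate, where nextExpectedSeq is the per-flow value defined at the top of this section:

static bool IsKeepAlive(uint seq, uint nextExpectedSeq, uint segmentLength, bool syn, bool fin, bool rst)
{
    // At most one garbage byte, sitting one byte to the left of the next
    // expected sequence number, on a non-SYN/FIN/RST segment.
    return segmentLength <= 1
        && seq == nextExpectedSeq - 1
        && !syn && !fin && !rst;
}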

TCP Keep-Alive ACK

Set when all of the following are true:

  • The segment size is zero.
  • The window size is non-zero and hasn’t changed.
  • The current sequence number is the same as the next expected sequence number.
  • The current acknowledgment number is the same as the last-seen acknowledgment number.
  • The most recently seen packet in the reverse direction was a keepalive.
  • The packet is not a SYN, FIN, or RST.

Supersedes “Dup ACK” and “ZeroWindowProbeAck”.

TCP Out-Of-Order

Set when all of the following are true:

  • This is not a keepalive packet.
  • In the forward direction, the segment length is greater than zero or the SYN or FIN is set.
  • The next expected sequence number is greater than the current sequence number.
  • The next expected sequence number and the next sequence number differ.
  • The last segment arrived within the Out-Of-Order RTT threshold. The threshold is either the value shown in the “iRTT” (tcp.analysis.initial_rtt) field under “SEQ/ACK analysis” if it is present, or the default value of 3ms if it is not.

TCP Port numbers reused

Set when the SYN flag is set (not SYN+ACK), we have an existing conversation using the same addresses and ports, and the sequence number is different than the existing conversation’s initial sequence number.

TCP Previous segment not captured

Set when the current sequence number is greater than the next expected sequence number.

TCP Spurious Retransmission

Checks for a retransmission based on analysis data in the reverse direction. Set when all of the following are true:

  • The SYN or FIN flag is set.
  • This is not a keepalive packet.
  • The segment length is greater than zero.
  • Data for this flow has been acknowledged. That is, the last-seen acknowledgment number has been set.
  • The next sequence number is less than or equal to the last-seen acknowledgment number.

Supersedes “Fast Retransmission”, “Out-Of-Order”, and “Retransmission”.

TCP Retransmission

Set when all of the following are true:

  • This is not a keepalive packet.
  • In the forward direction, the segment length is greater than zero or the SYN or FIN flag is set.
  • The next expected sequence number is greater than the current sequence number.

TCP Window Full

Set when the segment size is non-zero, we know the window size in the reverse direction, and our segment size exceeds the window size in the reverse direction.

TCP Window Update

Set when all of the following are true:

  • The segment size is zero.
  • The window size is non-zero and not equal to the last-seen window size.
  • The sequence number is equal to the next expected sequence number.
  • The acknowledgment number is equal to the last-seen acknowledgment number, or to the next expected sequence number when answering to a ZeroWindowProbe.
  • None of SYN, FIN, or RST are set.

TCP ZeroWindow

Set when the receive window size is zero and none of SYN, FIN, or RST are set.

The window field in each TCP header advertises the amount of data a receiver can accept. If the receiver can’t accept any more data it will set the window value to zero, which tells the sender to pause its transmission. In some specific cases this is normal — for example, a printer might use a zero window to pause the transmission of a print job while it loads or reverses a sheet of paper. However, in most cases this indicates a performance or capacity problem on the receiving end. It might take a long time (sometimes several minutes) to resume a paused connection, even if the underlying condition that caused the zero window clears up quickly.

TCP ZeroWindowProbe

Set when the sequence number is equal to the next expected sequence number, the segment size is one, and the last-seen window size in the reverse direction was zero.

If the single data byte from a zero window probe is dropped by the receiver (not ACKed), then a subsequent segment should not be flagged as a retransmission if all of the following conditions are true for that segment:

  • The segment size is larger than one.
  • The next expected sequence number is one less than the current sequence number.

This affects “Fast Retransmission”, “Out-Of-Order”, or “Retransmission”.

TCP ZeroWindowProbeAck

Set when all of the following are true:

  • The segment size is zero.
  • The window size is zero.
  • The sequence number is equal to the next expected sequence number.
  • The acknowledgment number is equal to the last-seen acknowledgment number.
  • The last-seen packet in the reverse direction was a zero window probe.

Supersedes “TCP Dup ACK”.

TCP Ambiguous Interpretations

Some captures are quite difficult to analyze automatically, particularly when the time frame may cover both Fast Retransmission and Out-Of-Order packets. A TCP preference allows switching the precedence of these two interpretations at the protocol level.

TCP Conversation Completeness

TCP conversations are considered complete when they have both opening and closing handshakes, independently of any data transfer. However, we might be interested in identifying complete conversations that also carried some data, so the following bit values are used to build a filter value on the tcp.completeness field:

  • 1 : SYN
  • 2 : SYN-ACK
  • 4 : ACK
  • 8 : DATA
  • 16 : FIN
  • 32 : RST

For example, a conversation containing only a three-way handshake will be found with the filter ‘tcp.completeness==7’ (1+2+4), while a complete conversation with data transfer needs a longer filter, since closing a connection can be associated with FIN or RST packets, or even both: ‘tcp.completeness==31 or tcp.completeness==47 or tcp.completeness==63’.
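
The same arithmetic as a flags enum, a sketch in C# (the type and names are ours; only the bit values come from the list above):

using System;

[Flags]
enum TcpCompleteness { Syn = 1, SynAck = 2, Ack = 4, Data = 8, Fin = 16, Rst = 32 }

class CompletenessDemo
{
    static void Main()
    {
        // Three-way handshake only: 1 + 2 + 4 == 7
        var handshake = TcpCompleteness.Syn | TcpCompleteness.SynAck | TcpCompleteness.Ack;

        // Complete conversations with data: closed by FIN (31), by RST (47), or by both (63)
        var finClosed = handshake | TcpCompleteness.Data | TcpCompleteness.Fin;
        var rstClosed = handshake | TcpCompleteness.Data | TcpCompleteness.Rst;
        var bothClosed = finClosed | TcpCompleteness.Rst;

        Console.WriteLine($"{(int)handshake} {(int)finClosed} {(int)rstClosed} {(int)bothClosed}"); // 7 31 47 63
    }
}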

Another way to select specific conversation values is to filter on the tcp.completeness.str field. Thus, ‘tcp.completeness.str matches “(R.|F)[^D]ASS”’ will find all ‘Complete, NO_DATA’ conversations, while the ‘Complete, WITH_DATA’ ones will be found with ‘tcp.completeness.str matches “(R.|F)DASS”’.


Pitfalls on the way to keep-alive

Growing data exchange between microservices is often a problem in the architecture of modern IT solutions. Squeezing out the maximum and surviving at any cost is a serious challenge for any development effort, so the search for optimal solutions is a never-ending process. This article briefly describes the problems that can arise when HTTP requests are used under high load, and ways around them.

This story begins with an error. Once, we were running a load test whose main element was issuing a large number of short HTTP requests. The client, written for .NET Core 2.2, started throwing System.Net.Sockets.SocketException: Address already in use at some point. It quickly turned out that ports on the client were not being released in time, and eventually the system refused to open a new one. Turning to the code, the problem was the use of the old approach with the HttpWebRequest class and this construct:

var request = WebRequest.CreateHttp(uri);
using (var resp = request.GetResponse())
{
    // work with the response
}

It would seem that we release the resource, and the port should be freed in good time. However, netstat signaled rapid growth in the number of ports in the TIME_WAIT state. This state means waiting for the connection to close (and possibly for stray data to arrive); a port can stay in it for 1-2 minutes. The problem is covered in detail in many articles (problems with the TIME_WAIT queue, a story about TIME_WAIT). Still, this means that dotnet "honestly" tries to close the connection, and what happens next is down to the system's timeout settings.

Why this happens and how to fight it

I won't retell what keep-alive is; you can read up on that yourself. The goal of this article is to step around the rakes carefully laid out along the developer's path. According to MSDN, the KeepAlive property of the HttpWebRequest class defaults to true. In other words, all this time HttpWebRequest was "deceiving" the server, proposing to keep the connection alive and then tearing it down itself. To be precise, HttpWebRequest with default settings does not send a "Connection: keep-alive" header; this mode is simply implied by the HTTP/1.1 standard. The first thing to try was to forcibly disable keep-alive: if you set HttpWebRequest.KeepAlive = false, the request carries a "Connection: close" header. Admittedly, on the test bench this completely solved the problem. The server was nginx serving a static page.

The following code was tested:

while (true)
{
    var request = WebRequest.CreateHttp(uri);
    request.KeepAlive = false;
    var resp = await request.GetResponseAsync();
    using (var sr = new StreamReader(resp.GetResponseStream()))
    {
        var content = sr.ReadToEnd();
    }
}

However, when launched on server hardware under heavy load (over 1000 requests per second), this code started producing the same errors again. Only now the ports were in the CLOSE_WAIT and LAST_ACK states. These are the pre-final states of closing a connection, when the client is waiting for confirmation from the initiator of the close. Such behavior signals that the client is starting to "choke" on newly opened connections.

Don't close: reuse

Indeed, to achieve maximum performance, connections need to be reused. To do that, you need to enable keep-alive mode and switch to the HttpClient class. How exactly it works and how best to use it is worth reading here and here.
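
A minimal sketch of the reuse pattern, assuming one long-lived client per process (the helper type is ours):

using System;
using System.Net.Http;
using System.Threading.Tasks;

static class Http
{
    // One HttpClient for the whole process: the underlying handler keeps a pool
    // of keep-alive connections and reuses them across requests, instead of
    // opening a new socket each time and leaving the old one in TIME_WAIT.
    static readonly HttpClient Client = new HttpClient();

    public static async Task<string> GetStringAsync(Uri uri)
    {
        using (var response = await Client.GetAsync(uri))
        {
            response.EnsureSuccessStatusCode();
            return await response.Content.ReadAsStringAsync();
        }
    }
}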

Another question is how to make sure that connections are actually being reused. The lifetime of a single keep-alive connection is governed by two main parameters on the nginx server:

  • keepalive_timeout: the connection lifetime (typically around 15 s)
  • keepalive_requests: the maximum number of requests served over one connection (100 by default)

If you watch the connections in netstat or Wireshark, then under heavy load the open ports on the client will still churn rapidly. Only after raising keepalive_requests to a large value (> 1000) can you see everything working as intended.

Conclusion

If you don't use HTTP requests under high load, any option will do; you are unlikely to exhaust all the ports. If reusing connections makes no sense in your application, for example because you rarely talk to the same server twice, it is worth deliberately disabling keep-alive. And under a heavy stream of requests, keep-alive should be used correctly and with care, tuning the connection lifetime to how often the server is re-contacted.

And finally, a few comparative performance measurements:

  • RunHttpClient uses the HttpClient class with "Connection: keep-alive"
  • RunHttpClientClosed uses the HttpClient class with "Connection: close"
  • RunWebRequestClosed uses the HttpWebRequest class with "Connection: close"
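
The benchmark source is not shown in the article; here is a sketch of how such a comparison could be driven, assuming BenchmarkDotNet (suggested by the Mean column below) and an illustrative local endpoint:

using System;
using System.IO;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;
using BenchmarkDotNet.Attributes;

public class KeepAliveBenchmark
{
    static readonly Uri Target = new Uri("http://localhost/"); // hypothetical test endpoint

    static readonly HttpClient KeepAliveClient = new HttpClient();
    static readonly HttpClient ClosedClient = new HttpClient();

    static KeepAliveBenchmark()
    {
        // Ask the server to close the connection after every response.
        ClosedClient.DefaultRequestHeaders.ConnectionClose = true;
    }

    [Params(1000, 10000)]
    public int N;

    [Benchmark]
    public async Task RunHttpClient()
    {
        for (var i = 0; i < N; i++)
            _ = await KeepAliveClient.GetStringAsync(Target);
    }

    [Benchmark]
    public async Task RunHttpClientClosed()
    {
        for (var i = 0; i < N; i++)
            _ = await ClosedClient.GetStringAsync(Target);
    }

    [Benchmark]
    public async Task RunWebRequestClosed()
    {
        for (var i = 0; i < N; i++)
        {
            var request = WebRequest.CreateHttp(Target);
            request.KeepAlive = false; // sends "Connection: close"
            using (var resp = await request.GetResponseAsync())
            using (var sr = new StreamReader(resp.GetResponseStream()))
                _ = sr.ReadToEnd();
        }
    }
}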

The nginx server is configured with the following parameters:

  • keepalive_timeout 60s;
  • keepalive_requests 100000;

Method               N      Threads  Mean
RunHttpClient        1000   1           963.3 ms
RunWebRequestClosed  1000   1         3,857.4 ms
RunHttpClientClosed  1000   1         1,612.4 ms
RunHttpClient        10000  1         9,573.9 ms
RunWebRequestClosed  10000  1        37,947.4 ms
RunHttpClientClosed  10000  1        16,112.9 ms
