Opened on 09/10/2015 at 04:36:02 PM

Closed on 09/16/2015 at 06:32:11 AM

Last modified on 09/16/2015 at 09:34:16 AM

#3056 closed defect (fixed)

[CMS] cms.bin.test_server intermittently pauses for several seconds when serving static files

Reported by: kzar Assignee: sebastian
Priority: P4 Milestone:
Module: Sitescripts Keywords: cms
Cc: trev, Ross, sebastian, saroyanm Blocked By:
Blocking: Platform: Unknown / Cross platform
Ready: yes Confidential: no
Tester: Unknown Verified working: no
Review URL(s):

https://codereview.adblockplus.org/29327611

Description (last modified by kzar)

Environment

  • Latest revision of CMS (f9fd64fbc629)
  • Ubuntu Linux 15.04 running on a Thinkpad T450
  • Serving the contents of the cms-problem-example repository.
  • Python version 2.7.9
  • Werkzeug 0.10.4 (But the problem still occurs when using wsgiref.simple_server.make_server instead of Werkzeug.)

How to reproduce

  1. Clone https://github.com/kzar/cms-problem-example and the cms.
  2. In the cms-problem-example directory launch the test server like ../cms/runserver.py.
  3. Browse to this exact URL http://localhost:5000/
  4. Keep refreshing the page until the delay problem occurs.

Observed behaviour

There are seemingly random large pauses of several seconds when the server is serving static files like images and scripts. With the example above it happens more than 50% of the time for me.

Expected behaviour

The files should be consistently served quickly.

Notes

I have also noticed a very similar / the same problem with the run_tests.py script in the adblockpluscore repository. The script acts as web server serving the pages and static files required to run the unit tests. The cause of this issue could be related.

The problem appears to happen when a connection is opened and then closed with no data sent. The request following will then be blocked for 10 seconds which is presumably a time-out somewhere in the web-server.

Attachments (0)

Change History (18)

comment:1 Changed on 09/10/2015 at 04:37:55 PM by kzar

  • Cc saroyanm added

@saroyanm Have you observed this behaviour as well? So far me and Ross have.

comment:2 Changed on 09/10/2015 at 05:11:15 PM by kzar

  • Description modified (diff)

comment:3 Changed on 09/10/2015 at 05:32:44 PM by sebastian

  • Cc sebastian added; snoack removed

comment:4 Changed on 09/10/2015 at 05:53:10 PM by kzar

  • Description modified (diff)

comment:5 Changed on 09/10/2015 at 05:54:32 PM by kzar

  • Description modified (diff)

comment:6 Changed on 09/10/2015 at 08:13:36 PM by Ross

This also occurred before the generic hide tests were added, but they do seem to cause it more.

comment:7 Changed on 09/10/2015 at 10:03:15 PM by kzar

Passing threaded=True through to werkzeug.serving.run_simple in cms/bin/test_server.py seems to work around the problem. I still don't understand what's happening however.

Last edited on 09/10/2015 at 10:08:37 PM by kzar

comment:8 Changed on 09/11/2015 at 10:53:52 AM by kzar

  • Description modified (diff)

comment:9 Changed on 09/11/2015 at 11:12:29 AM by sebastian

  • Priority changed from Unknown to P4
  • Ready set

Reproduced.

It seems that this happening when the browser closes the connection. Here is the strace output:

20:37:47 recvfrom(4, "", 8192, 0, NULL, NULL) = 0
20:37:57 shutdown(4, 1 /* send */)      = 0
20:37:57 close(4)                       = 0
20:37:57 select(4, [3], [], [], {0, 500000}) = 1 (in [3], left {0, 499995})
20:37:57 accept(3, {sa_family=AF_INET, sin_port=htons(36510), sin_addr=inet_addr("127.0.0.1")}, [16]) = 4

Always when recvfrom returns zero bytes, there is a delay of 10 seconds until the following shutdown call returns. When following the steps from this issue's description, this happens quite a lot. But when serving web.eyeo.com for example with cms/runserver.py, recvfrom never returns zero bytes.

comment:10 Changed on 09/11/2015 at 11:43:47 AM by kzar

I can also reproduce this when using wsgiref.simple_server.make_server instead of Werkzeug. It's possibly related to this issue http://bugs.python.org/issue21878 ?

comment:11 Changed on 09/11/2015 at 12:21:49 PM by kzar

  • Description modified (diff)

I've updated the description with a much simpler way to reproduce this problem.

comment:12 Changed on 09/11/2015 at 12:24:23 PM by kzar

  • Description modified (diff)

comment:13 Changed on 09/14/2015 at 03:31:03 PM by sebastian

I got the timestamp in the strace log wrong. It's the recvfrom, not the shutdown call, that blocks for 10 seconds. That happens when the browser opens a second connection to the server. The server then waits for another request on the first connection. After 10 seconds the browser closes the first connection. And the server responds to the request on the second connection. This is reproducible in the scenario here, because content is served from two different domains, forcing the browser to open one connection per domain.

Enabling multi-threading seems to be the appropriate solution here. That way the server can handle multiple connections at the same time.

Alternatively, we could add the Connection: close header to force the browser to close the connection after every request. But Chrome seems to ignore the Connection: close header, leaving the connection open until its closed by the server or until there are no more requests to send. However, disabling keep-alive would be another difference to the production server that we should avoid anyway.

Last edited on 09/16/2015 at 09:34:16 AM by sebastian

comment:14 Changed on 09/15/2015 at 06:30:57 AM by sebastian

  • Owner set to sebastian

comment:15 Changed on 09/15/2015 at 06:32:34 AM by sebastian

  • Review URL(s) modified (diff)
  • Status changed from new to reviewing

comment:16 Changed on 09/16/2015 at 06:32:11 AM by sebastian

  • Resolution set to fixed
  • Status changed from reviewing to closed

comment:17 Changed on 09/16/2015 at 06:35:26 AM by sebastian

@kzar: As you experienced the issue as well with run_tests.py from adblockpluscore, mind filing a follow up issue to fix it there as well?

comment:18 Changed on 09/16/2015 at 09:19:04 AM by kzar

Sure, I've done that - #3081.

Add Comment

Modify Ticket

Change Properties
Action
as closed .
The resolution will be deleted. Next status will be 'reopened'.
to The owner will be changed from sebastian.
 
Note: See TracTickets for help on using tickets.