#158 open
James

Integration test suite failures - too many open files

Reported by James | January 23rd, 2009 @ 12:15 PM | in 1.4

Right now there are numerous issues when running the test suite, mostly to do with "too many open files" - although there are a number of other failures that appear to be to do with outdated tests (vs. new changes in master).

One way I fixed the "too many open files" problem in master was to set the File instances on class variables that were only set once for the entire suite. I had a go at doing this for integration tests, but without luck.
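
A minimal sketch of the "open once, reuse for the whole suite" idea described above. This is not the actual Kete change; the class name `SharedFixtureFiles` and its methods are hypothetical and only illustrate keeping File instances in a class variable.

```ruby
# Hypothetical helper (not from the Kete codebase): opens each fixture
# file once and hands back the same File object on later requests.
class SharedFixtureFiles
  @@handles = {}

  # Returns the cached File for +path+, opening it on first use only.
  def self.fetch(path)
    handle = (@@handles[path] ||= File.open(path, 'rb'))
    handle.rewind # every caller starts reading from the beginning
    handle
  end

  # Closes every cached handle; meant to be called once after the suite.
  def self.close_all
    @@handles.each_value { |f| f.close unless f.closed? }
    @@handles.clear
  end
end
```

Because every test asks the class for the handle instead of calling File.open itself, the number of descriptors held stays at one per distinct file however many tests run.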

You can see an example of the test/unit changes here for reference: http://github.com/kete/kete/blob... (line 14)

The problem only arises when running the whole suite, not individual tests (i.e. when running rake test:integration).

James

Comments and changes to this ticket

  • Walter McGinnis

    Walter McGinnis January 23rd, 2009 @ 12:27 PM

    To clarify, the "numerous issues" seem to be tests erroring out or
    failing after a certain point of running either the entire test suite
    (i.e. rake test) or running the integration test (i.e. rake
    test:integration). These tests are getting computationally intensive
    and seem to be eating up resources ("too many open files") and then
    the tests fail.

    As James has mentioned, there is likely a configuration change to make
    sure that our machines don't get swamped and cause all tests to error
    out or fail.


  • Kieran P

    Kieran P February 16th, 2009 @ 01:08 PM

    To clarify, the "too many open files" issue is not the same issue that occurred in unit/functionals. This one happens because too many errors occurred while writing/reading files (webrat dumps info to a file in tmp/).

    So I looked for errors. I went through each test and ran them individually. They all pass fine. So I ran the full suite, and this error came up again.

    My best guess is that memcache is running out of memory quickly, and gives the "no connection to server" error I get occasionally in development mode. Each test after that keeps failing for the same reason and eventually writes enough error files...

    The easiest solution may be to increase your memcache memory limit to 256mb or so while running the tests, or even 512mb if you can. A CI server dedicated to this would be very helpful.

    Might be helpful to find a way to clear/clean up memcache from within the tests.
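
A rough sketch of what clearing memcache from within the tests could look like. The module `CacheCleanup` is hypothetical; the only thing it assumes about the cache object is that it responds to flush_all, as memcache-client's MemCache does.

```ruby
# Hypothetical test helper: empties a cache between tests.
# Assumption: +cache+ responds to #flush_all (memcache-client's MemCache does).
module CacheCleanup
  # Returns true if the flush succeeded, false if the cache raised
  # (for example when the server cannot be reached).
  def self.flush(cache)
    cache.flush_all
    true
  rescue StandardError => e
    warn "cache flush skipped: #{e.message}"
    false
  end
end
```

Calling this from a teardown hook would keep one test's cached data from accumulating into the next. Rescuing the error means an unreachable server produces a warning rather than one more failing test.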

  • Andy Robertson

    Andy Robertson February 24th, 2009 @ 11:11 AM

    This might help. I needed to do this on the kete server here, as all the kete share the same memcache server. They were spawning far too many connections to the memcache server; at one point I saw about 2048 connections (a nice power of 2) and the kete stopped serving pages. It occurred to me that they could all be stepping on each other. They were. The current config we use defaults ActionController::Base.session_options[:cache] to 127.0.0.1:11211 with no memcache namespace. It is used only for the sessions, right? Session info is also in cookies.

    A cleaner fix would be to use the cache_fu plugin, but here's a little hack that I put together from what I could find on the web.

    What was required was a separate memcache namespace for each of the kete. I found an example on the internet and I've put this into /config/initializers for each of my kete so they each get a different memcache namespace:

    
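    # config/initializers/memcache.rb

    # Use a distinct namespace per kete so that instances sharing one
    # memcache server do not read or overwrite each other's keys.
    memcache_options = { :namespace => 'keteham' }
    memcache_servers = ['127.0.0.1:11211']

    # Flatten into [server, ..., options] and splat into MemCache.new.
    cache_params = [memcache_servers, memcache_options].flatten
    CACHE = MemCache.new(*cache_params)
    ActionController::Base.session_options[:cache] = CACHE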
    
    

    Where keteham is replaced with a short name I've given for each of my kete, like ketenp, ketewaimak, ketetest...

    Seems to have kept the number of connections down nicely. I couldn't find anything equivalent for making backgroundrb use namespaces, however.

  • Kieran P

    Kieran P March 4th, 2009 @ 05:24 PM

    • Assigned user changed from “Kete” to “Kieran P”
    • Milestone cleared.
  • Kieran P

    Kieran P June 8th, 2009 @ 05:44 PM

    • State changed from “new” to “open”

    The solution Andy detailed in the post above doesn't appear to work in the test suite because the tests run on the same instance of kete, so namespacing doesn't appear to do anything. Increasing memcache memory size from 62mb to 124mb and even 256mb has no effect either.

    The error (one of many I got during a full test suite test) can be seen here: http://gist.github.com/125626

    A few tests before and after this error worked fine, so Rails connection details must be correct.

    The error appears to originate from vendor/rails/activesupport/lib/active_support/vendor/memcache-client-1.5.0/memcache.rb:663:in `request_setup', which is triggered because server.socket.nil? is true. It has already passed a check for active? (testing if memcache is online).

    The server.socket method appears to use multi threading so (the following is an unconfirmed hypothesis) it may be possible that Rails/test suite is trying to request a page while a memcache connection is being initialized, thus can't establish its own connections, and triggers the error.

    After consulting with members on the Ruby on Rails IRC channel, the general consensus there is that memcache shouldn't be used for session storage and that either the current method (cookie store) or previous method (file store) would be more appropriate and less likely to error.

    The memcache connection errors themselves do not cause the test suite to fail, and they are not caused by the writing of webrat error files I mentioned in my last comment. In fact, it appears no webrat files were written to disk. A possible cause may be the continuous opening of public/500.html, given that it is Ruby that is opening these, and not Apache serving them, so some file restrictions may apply.

    Looking at details from Google, a similar issue looks to come from the Kernel, when it has no more 'file handles' (http://www.patoche.org/LTT/kernel/00000128.html). That link details a possible fix (untested). However, given that we can't expect all users to edit their system, I'm still looking round for a better fix.

    Based on that and similar articles about the "too many files open" error, Ruby on Rails may not be releasing its file pointer when it opens and reads from 500.html. I'll dig into the Rails source code tomorrow and see if anything can be done.
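
A small sketch of how the "file pointer not released" hypothesis could be checked from Ruby. The method names here are hypothetical and this is not how Rails reads 500.html; it only shows the block form of File.open, which releases the descriptor, together with ways to count open File objects and read the per-process limit.

```ruby
# Counts File objects in this process that have not been closed.
def open_file_count
  ObjectSpace.each_object(File).count { |f| !f.closed? }
end

# Reads a file using the block form of File.open, which closes the
# descriptor when the block exits, even if an exception is raised.
def read_and_release(path)
  File.open(path, 'rb') { |f| f.read }
end

# Soft limit on open descriptors for this process
# (the value `ulimit -n` reports on Unix-like systems).
def descriptor_soft_limit
  Process.getrlimit(Process::RLIMIT_NOFILE).first
end
```

Comparing open_file_count before and after a batch of tests, against descriptor_soft_limit, would show whether handles are accumulating. A File.open call without a block or an explicit close keeps its descriptor until the object is garbage collected.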

  • Kieran P

    Kieran P November 2nd, 2009 @ 11:12 AM

    • State changed from “open” to “hold”
  • Walter McGinnis

    Walter McGinnis October 1st, 2010 @ 07:02 PM

    • State changed from “hold” to “open”
    • Milestone set to 1.3
    • Assigned user changed from “Kieran P” to “Walter McGinnis”
    • Milestone order changed from “0” to “0”

    The root of the problem is that we are repeatedly testing the login procedure during almost all of our integration tests.
    This process only really needs to be tested once and not every time another set of functionality that we are testing needs a logged in user to carry out its functionality.

    This should have two benefits:

    • we won't get this "too many files open" error that prevents us from running "rake test" and "rake test:integration" (i.e. running all of our integration in one fell swoop)
    • we should dramatically speed up our integration tests

    I've had my eye out for months for a way to make test/integration/integration_test_helper.rb#login_as convenience method skip right to putting the user in a logged in state (i.e. having the right cookie) rather than literally walking the webrat process through navigating to the login form, filling it out with the credentials, and being redirected back to the original page and testing all of that along the way.

    I think I have finally found it in a blog post (which I can't find at the moment, but will look up later): the account action for logging in checks the site's mode to see if it is test and, if so, simply logs the user in.

    This strategy will need to be modified slightly so that we can still test that logging in works, but the general idea seems sound.

    Steps:
    • modify the authentication mechanism to check the server's mode and situation (i.e. we aren't testing the login process itself, but are in test mode)
    • modify the login_as method so that the default is NOT to test the login process
    • modify the login_as method so that an option triggers testing of the login process
    • add a file of login tests under test/integration to make sure the login process is tested
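
The steps above could be sketched roughly as follows. This is an illustration, not Kete's integration_test_helper.rb: the module name, the :test_login_process option, the '/login' and '/account/login' paths, and the field and button labels are all assumptions, and the session is any object that responds to the webrat-style calls used.

```ruby
# Hypothetical sketch of a login_as helper with a fast default path.
module LoginHelper
  module_function

  # session: assumed to respond to #post, #visit, #fill_in and #click_button.
  # options[:test_login_process]: when true, walk the real login form.
  def login_as(session, login, password, options = {})
    if options[:test_login_process]
      # Slow path: exercise the login form the way a user would.
      session.visit '/login'
      session.fill_in 'login', :with => login
      session.fill_in 'password', :with => password
      session.click_button 'Log in'
    else
      # Fast path (default): a single request straight to the login action.
      session.post '/account/login', :login => login, :password => password
    end
  end
end
```

With the fast path as the default, only the dedicated login tests would pass :test_login_process => true and pay for the full series of page requests.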

  • Walter McGinnis

    Walter McGinnis November 16th, 2010 @ 12:29 PM

    Ok, the issue with too many files open for integration tests has been dealt with in two ways:

    • updating all tests (mostly integration) to pass by either fixing broken functionality (actually rare), fixing the test's hardcoded text that is checked for (common), or fixing other logic errors in the test
    • making login_as method by default simply post to the login action rather than do a full series of web page requests, etc. (still optionally works the old way for tests that require the old behavior)

    The tests are much faster, but still take a ton of time. However, they don't error out anymore when running the full test suite (rake test).

    Note that some optional tests haven't been run recently (selenium for javascript, backgroundrb ones) and these still need to be checked.

    So people should be using the full test suite regularly now (and adding to it, too) and not allow test rot.

    I'll close this once the selenium and backgroundrb tests are checked and in good shape.

  • Walter McGinnis

    Walter McGinnis September 25th, 2011 @ 04:24 PM

    • Milestone changed from 1.3 to 1.4
    • Milestone order changed from “62” to “0”

    Unfortunately, integration tests are taking a bit longer again and about halfway through begin erroring out again (they pass fine individually).


Kete was developed by Horowhenua Library Trust and Katipo Communications Ltd. to build a digital library of Horowhenua material.
