Too Many Open Files with pas.plugins.ldap

Hi folks. I've got a Plone 5.1.5 site with 2 or 3 simultaneous users who routinely get "kicked out" of their login sessions and then can't log back in for a while. These users are backed by an LDAP server (ApacheDS) using pas.plugins.ldap :slightly_frowning_face:

The error log shows either INVALID_CREDENTIALS: {'info': u'INVALID_CREDENTIALS: Bind failed: null', 'desc': u'Invalid credentials'} or SERVER_DOWN: {u'info': 'Too many open files', 'errno': 24, 'desc': u"Can't contact LDAP server"}. I'm thinking the former is a red herring, and the "Too many open files" is the actual problem. :thinking:

If I run Plone in isolation and keep lsof -p PID running on it with a constant curl --cookie __ac="…" http://localhost/…/folder_contents on the site, sure enough I see the file count rise up and up and up, especially TCP connections to my LDAP server hanging around in the ESTABLISHED state.
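For anyone who wants to watch the same thing without eyeballing the lsof output, here's a rough sketch that polls lsof for a PID and prints the total number of open files plus the number of TCP connections in the ESTABLISHED state. The function names (summarize_lsof, watch) are made up for illustration, and the parsing assumes the usual lsof text layout with one header row.

```python
import subprocess
import time


def summarize_lsof(lsof_output):
    """Return (total_open_files, established_tcp) from lsof text output.

    Assumes the default lsof layout: one header row, then one row per
    open file, with TCP rows ending in a state such as "(ESTABLISHED)".
    """
    rows = [line for line in lsof_output.splitlines()[1:] if line.strip()]
    established = sum(
        1 for line in rows if "TCP" in line and "(ESTABLISHED)" in line
    )
    return len(rows), established


def watch(pid, interval=5.0):
    """Print the counts for `pid` every `interval` seconds until interrupted."""
    while True:
        # -n and -P skip host and port name lookups so lsof returns quickly.
        result = subprocess.run(
            ["lsof", "-n", "-P", "-p", str(pid)],
            capture_output=True, text=True, check=False,
        )
        total, established = summarize_lsof(result.stdout)
        stamp = time.strftime("%H:%M:%S")
        print(f"{stamp} open={total} established={established}")
        time.sleep(interval)
```

Calling watch(PID) while the curl loop runs should show whether the ESTABLISHED count keeps climbing or levels off.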

Why isn't pas.plugins.ldap re-using or closing these connections? :angry:

Note that there is a memcached server in use, and indeed if I telnet into it and check stats I see a nice constant number of objects in it. But Plone+Zope+pas.plugins.ldap insists on opening new LDAP connections :face_with_raised_eyebrow:

Versions:

  • Products.CMFPlone-5.1.5
  • pas.plugins.ldap-1.7.2
  • python_ldap-3.2.0
  • node.ext.ldap-1.0b10

I don’t see this happening in an environment where we have multiple Plone sites using pas.plugins.ldap 1.5.4 in Plone 4 and Plone 5.2, with quite a few users logged in per site. As in: I have not received any complaints about user logins for the last 2-3 years.

But the LDAP backend is MS AD (1500-2000 users).

I plan to upgrade the plugin in the 5.2 site shortly.

Maybe ApacheDS is doing something funny, such as not closing the connection or not allowing it to be re-used? I guess if it was an obvious bug in pas.plugins.ldap, other setups would have experienced similar issues earlier.

I’ll try to redo your tests with this setup and see what happens with the open files.

In general we increase the max open files per process in our Plone stacks anyway, for the users running the services: nginx, Varnish and the load balancer use a lot of FDs, and the default open files limit of 1024 on some Linux distributions is too low.
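As an illustration of what that limit looks like from inside a process, here is a minimal sketch using Python's standard resource module to read the open files limit and raise the soft limit towards a target. This is not how the stacks above are configured (that is normally done with ulimit or the service manager); raise_nofile_limit is a hypothetical helper name. If connections really are leaking, a higher limit only postpones errno 24.

```python
import resource


def raise_nofile_limit(target):
    """Try to raise the soft RLIMIT_NOFILE to `target`, capped at the hard limit.

    Returns the (soft, hard) pair in effect afterwards. Only the current
    process and the children it starts later are affected.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)
    if soft != resource.RLIM_INFINITY and target > soft:
        try:
            resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
        except (ValueError, OSError):
            # Some platforms reject values above a kernel maximum;
            # in that case the existing limit is left untouched.
            pass
    return resource.getrlimit(resource.RLIMIT_NOFILE)
```

For example, print(raise_nofile_limit(4096)) shows the limits that actually took effect.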

Thanks @fredvd.

In the intervening time I've tried:

  • Setting up an OpenLDAP 2.4.50 installation, migrating all of the ApacheDS data into it (2000 users, 500 groups), installing identical indexes, and switching Plone to use it :x:
  • Migrating the site from Plone 5.1.5 to 5.2.1 :x:
  • Testing on an empty Plone 5.2.1 site with no local code at all :x:

ApacheDS doesn't support the memberOf attribute, so on the off chance that was the issue I enabled the memberOf-overlay on OpenLDAP and checked the corresponding box on pas.plugins.ldap and it still failed :x:

All it takes to hit the errno 24 "Too many open files" limit is to have LDAP on and hit /Plone/folder_contents in three concurrent curl loops (with --cookie __ac="…"), and it doesn't take long either: about a minute on my Mac.
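If it helps anyone reproduce this without three terminals, the concurrent curl loops can be approximated in Python roughly like this. It's only a sketch: hammer and fetch_once are names invented here, the URL and cookie value are whatever you'd pass to curl, and the request counts are arbitrary.

```python
import threading
import urllib.request


def fetch_once(url, cookie):
    """GET `url` with the given Cookie header and return the HTTP status."""
    request = urllib.request.Request(url, headers={"Cookie": cookie})
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.status


def hammer(url, cookie, workers=3, requests_per_worker=100, fetcher=fetch_once):
    """Run `workers` concurrent request loops and collect every outcome."""
    outcomes = []
    lock = threading.Lock()

    def loop():
        for _ in range(requests_per_worker):
            try:
                outcome = fetcher(url, cookie)
            except Exception as exc:
                # Record failures (HTTP errors, refused connections, ...)
                # instead of stopping the loop.
                outcome = repr(exc)
            with lock:
                outcomes.append(outcome)

    threads = [threading.Thread(target=loop) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return outcomes
```

Pass the folder_contents URL and the full cookie string (the `__ac="…"` value, as with curl) to hammer, and keep lsof running on the Plone process in another terminal while it runs.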

I'm becoming more convinced this is an issue in pas.plugins.ldap (or its dependencies, perhaps node.ext.ldap), because this is the first time the problem has shown up for us: we were using Plone 4 with Products.LDAPUserFolder and friends against the same LDAP server just fine.

(I don't have access to Microsoft Active Directory so I can't compare there.)

I'll see if I can narrow it down, but in the meantime I've got angry bosses to placate :sweat_smile:
