LDAP status quo and where to go from here

Hi,

Currently, there are two LDAP stacks available for Plone: plone.app.ldap and
pas.plugins.ldap. Both stacks use the de facto standard python-ldap library.

Issues with the current LDAP stacks

It is unclear under which license python-ldap is published.

The legal department of one of our customers has forbidden the use
of python-ldap in any of its companies' projects.

Apart from the legal situation of the underlying library, both Plone LDAP
stacks have grown unnecessarily big and complex. Partly, this is due to
the lack of support of commonly needed features in python-ldap.

python-ldap is closely modelled after the C-API of OpenLDAP's libldap
and leaves handling of many common tasks to higher level libraries:
asynchronous calls, attribute name and type mapping from/to UTF-8,
connection pooling - to name just a few.
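To illustrate what attribute name and type mapping means in practice, here is a minimal sketch. It assumes a raw entry arrives as a dict that maps attribute names to lists of UTF-8 encoded byte strings. The function `map_entry` and both mapping tables are hypothetical names made up for this example; they are not part of python-ldap or of either Plone stack.

```python
def decode_utf8(value):
    """Default conversion: raw bytes to text."""
    return value.decode("utf-8")


def map_entry(raw_entry, name_map, type_map):
    """Rename attributes and convert their raw byte values.

    raw_entry: dict of LDAP attribute name -> list of bytes (assumed shape)
    name_map:  dict of LDAP attribute name -> application-side name
    type_map:  dict of LDAP attribute name -> conversion callable
    """
    mapped = {}
    for ldap_name, raw_values in raw_entry.items():
        convert = type_map.get(ldap_name, decode_utf8)
        app_name = name_map.get(ldap_name, ldap_name)
        mapped[app_name] = [convert(value) for value in raw_values]
    return mapped


# Hypothetical mapping tables for illustration only.
NAME_MAP = {"cn": "fullname", "uidNumber": "numeric_id"}
TYPE_MAP = {"uidNumber": lambda value: int(value.decode("ascii"))}

raw = {
    "cn": ["Zo\u00eb Example".encode("utf-8")],
    "uidNumber": [b"1042"],
    "mail": [b"zoe@example.org"],
}
entry = map_entry(raw, NAME_MAP, TYPE_MAP)
```

Every higher-level library ends up writing some variant of this, which is the duplication described here.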

The lack of support in the de facto standard library leads to every
higher-level LDAP library either not supporting or reinventing those
common tasks more or less successfully.

Both Plone LDAP stacks have their own or at least different libraries
handling attribute name and type mapping and rely on a multitude of
other libraries to work. Both do not support asynchronous calls.

Searching for a user in Plone's sharing tab results in 1001 LDAP
requests for 1000 matches.
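The request count can be reproduced with a small simulation. This is only a sketch: `CountingDirectory` is a hypothetical in-memory stand-in for an LDAP server, and the breakdown into one search for the ids plus one lookup per match is our assumption about where the 1001 requests come from.

```python
class CountingDirectory:
    """Hypothetical in-memory stand-in for an LDAP server that counts requests."""

    def __init__(self, entries):
        self.entries = entries  # maps uid -> attribute dict
        self.requests = 0

    def search(self, prefix, attrs):
        """One request: return (uid, attributes) pairs limited to attrs."""
        self.requests += 1
        result = []
        for uid, entry in sorted(self.entries.items()):
            if uid.startswith(prefix):
                result.append((uid, {name: entry[name] for name in attrs}))
        return result

    def fetch(self, uid):
        """One request: return all attributes of a single entry."""
        self.requests += 1
        return self.entries[uid]


def search_one_request_per_match(directory, prefix):
    # Assumed pattern: one search for the ids, then one lookup per match.
    matches = directory.search(prefix, attrs=[])
    return [(uid, directory.fetch(uid)["cn"]) for uid, _ in matches]


def search_single_request(directory, prefix):
    # Ask for the needed attribute in the search itself.
    matches = directory.search(prefix, attrs=["cn"])
    return [(uid, entry["cn"]) for uid, entry in matches]


ENTRIES = {
    "user%04d" % i: {"cn": "User %d" % i, "mail": "user%d@example.org" % i}
    for i in range(1000)
}
```

Both functions return the same list of ids and names; only the number of round trips to the directory differs.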

On top of that, pas.plugins.ldap needs to fetch all valid user ids for
each Plone request.

Long-lived caches mitigate the situation, but create problems of their
own: users not appearing (depending on which frontend handles the
request) or users still being able to log in hours after they were deleted.

Alternatives?

There is python3-ldap, an LDAP library written in pure Python, which
is also compatible with Python 2.

Given the heavy ASN.1 parsing involved in talking LDAP, we expect it to
be clearly slower than a C-based library.

Its API is confusing.

Recently, @datakurre revealed "Asynchronous stream iterators and
experimental promises for Plone", which might help to greatly speed up
views involving LDAP searches. In order to follow this idea we need
support for asynchronous LDAP calls.

Goal

Our goal is one LDAP request per sharing tab search and transparent
support of asynchronous calls where sensible, released under an MIT or
BSD license.

Solution

We started development on a three-tier solution:

  1. ldapy, a low-level Pythonic library using libldap via cffi, with
    transparent support for asynchronous calls via generators.

  2. ldapalchemy, modelled after SQLAlchemy to support session
    management with connection pools and querying of LDAP entries with
    attribute name and type mapping.

  3. pas.plugins.ldapalchemy, PAS plugin using ldapalchemy to
    communicate with LDAP.
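To give an idea of what asynchronous calls via generators can look like, here is a minimal sketch for Python 3. It does not show ldapy's actual API: `Search`, `fake_backend`, `lookup`, `run_sync` and `run_interleaved` are hypothetical names, and the backend is a fake that answers immediately. The point is that the same generator-based operation can be driven by a blocking driver or by a scheduler that interleaves several operations.

```python
from collections import deque

REQUEST_LOG = []  # records the order in which requests reach the backend


class Search:
    """Hypothetical request object yielded by a generator-based operation."""

    def __init__(self, query):
        self.query = query


def fake_backend(request):
    """Stand-in for the real LDAP call; answers immediately."""
    REQUEST_LOG.append(request.query)
    return ["%s-result" % request.query]


def lookup(uid):
    """An operation written once, without knowing how it will be driven."""
    user = yield Search("user:" + uid)
    groups = yield Search("groups:" + uid)
    return user + groups


def run_sync(operation):
    """Blocking driver: answer each request before asking for the next."""
    reply = None
    try:
        while True:
            request = operation.send(reply)
            reply = fake_backend(request)
    except StopIteration as stop:
        return stop.value


def run_interleaved(operations):
    """Round-robin driver: advance every operation by one request in turn."""
    results = [None] * len(operations)
    queue = deque((index, op, None) for index, op in enumerate(operations))
    while queue:
        index, operation, reply = queue.popleft()
        try:
            request = operation.send(reply)
        except StopIteration as stop:
            results[index] = stop.value
            continue
        queue.append((index, operation, fake_backend(request)))
    return results
```

A real driver would hand the yielded requests to libldap's asynchronous interface and resume each generator when its result arrives, instead of calling a fake backend inline.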

For use cases where changing the full stack is not feasible, we are
working on a drop-in replacement for python-ldap.

We would like to know whether you share these concerns and what
you would like to see in such a new, slim stack with a focus on
performance.

Initial funding was provided by a customer, the results of which are
published in ldapalchemy.

Currently, we are continuing without funding in our free time, but would
love to work on it full time, given the opportunity. If you have the means
to support this effort, please let us know.

regards
@durko and @chaoflow

Hello everybody,
I'm the author of python3-ldap. I read in your message that you think that the python3-ldap API is "confusing". I'd like to know what you mean by that: are you referring to the documentation (still not complete) or to the API itself? I started developing python3-ldap from scratch, strictly following the official LDAP v3 protocol as specified in the RFCs (4511 and others), so the library reflects what the protocol is. I've been using python-ldap for years in many projects, and I was never fully satisfied with it. Python is not C, so you have to make compromises to use a "wrapper" around the C OpenLDAP client library.

Building a library from the ground up in pure Python leads, in my opinion, to a more Pythonic approach to writing the code that uses it. Furthermore, you have full control of the code down to the socket level. This was one of the goals of my project, together with eliminating the hassle of installing the C components at system level (on Windows or Linux).

Regarding the speed of the library, pure Python is obviously slower than C, but I've built python3-ldap with the idea of pluggable "communication strategies". There are already a few of those (sync, async, threaded, ...), so if speed is a concern you can always develop a "strategy" that uses a C ASN.1 library to achieve the speed you need.

Let me know if you need more information on this topic.

Bye,
Giovanni

Hi Giovanni,

We posted our quick feedback about python3-ldap's API to the python3-ldap list.

We understand the goal of python3-ldap and think it is great to have an LDAP implementation in pure Python. It might also come in handy for us some time. Thank you very much for that!

We are targeting server environments with an emphasis on reliability and performance. To achieve this we are ready to make trade-offs regarding installation convenience. That said, python-ldap provides Windows installers that bundle all needed libraries, including the C dependencies. Therefore, having C dependencies does not necessarily have a negative impact on installation convenience.

OpenLDAP is a mature library written in C that handles ASN.1 parsing, TLS, and SASL, among other things. For our use cases we consider it a better starting point than a new library written in pure Python.

best regards
Marko and Florian

BTW, there's also https://github.com/yykamei/python-libldap, Python 3.4+ only but maybe it could be backported...?

Where does this project stand today? I was steered to pas.plugins.ldap recently, but this approach sounds better.

Which one?

The ldapalchemy stack sounds better. However, I did some more web sleuthing last night; these projects seem to have been untouched for a couple of years. At this point, I'm guessing that this effort has been abandoned, but I could be wrong.

pas.plugins.ldap is what we have settled on.

AFAIK ldapalchemy did not fly. We funded a single sprint for it, but that showed it would have become too expensive. Instead, we ended up optimizing pas.plugins.ldap for our purposes.