Hi,
Today my Sentry reported an error: "A storage error occurred during the second phase of the two-phase commit. Resources may be in an inconsistent state."
I vaguely remembered that I had looked into collective.solr and concluded that it does not handle two-phase commits properly.
(https://en.wikipedia.org/wiki/Two-phase_commit_protocol)
The error made me take a second look at this, and I stumbled upon this code snippet:
This is the management summary:
collective.indexing wraps the TPC functionality while ignoring the TPC requirements, so that it can provide a simpler commit interface. As a result, collective.solr, which uses collective.indexing, has no way to implement 2PC properly.
Longer explanation:
In the first TPC phase, every storage is asked whether it can "guarantee" that its part of the transaction can be committed.
After everybody has confirmed this, in the second TPC phase everybody is told to actually commit. The implementation above never asks its storages whether a commit is possible; it first tries to commit in the second phase, where a successful commit should already be "guaranteed".
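To make the two phases concrete, here is a minimal, self-contained Python sketch. It is only a toy model, not the API of the transaction package or of collective.indexing; the names Participant, VoteFailed and two_phase_commit are made up for illustration.

```python
class VoteFailed(Exception):
    """Raised by a participant that cannot guarantee its commit."""


class Participant:
    """Toy resource taking part in a two-phase commit (hypothetical)."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.committed = False

    def vote(self):
        # Phase 1: promise that finish() will succeed, or refuse now.
        if not self.can_commit:
            raise VoteFailed(self.name)

    def finish(self):
        # Phase 2: must not fail, because vote() already promised success.
        self.committed = True

    def abort(self):
        # Discard whatever was prepared for this transaction.
        self.committed = False


def two_phase_commit(participants):
    """Return True if everybody committed, False if everybody aborted."""
    try:
        for participant in participants:
            participant.vote()
    except VoteFailed:
        for participant in participants:
            participant.abort()
        return False
    for participant in participants:
        participant.finish()
    return True
```

If one participant refuses during the vote, for example two_phase_commit([Participant("solr"), Participant("zodb", can_commit=False)]), the function returns False and nothing is committed anywhere. That is the guarantee that gets lost when a participant skips the vote.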
Example: Add an item. During the second phase of TPC, I first ask Solr to commit, and Solr finishes; then I ask ZODB to commit and get a WriteConflictError. ZODB does not store the document, but Solr has the information.
On the next search I get a result for a document that does not exist, and I trigger a site error when trying to open the page.
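The sequence in this example can be reproduced with a small toy simulation. Again, this is only a sketch under assumptions: Resource, ConflictError, second_phase_only and run_example are hypothetical names, and the code does not use the real Solr or ZODB APIs.

```python
class ConflictError(Exception):
    """Stand-in for a write conflict raised by the primary store."""


class Resource:
    """Toy resource that buffers additions until it is told to commit."""

    def __init__(self, fail_on_commit=False):
        self.fail_on_commit = fail_on_commit
        self.pending = set()
        self.documents = set()

    def add(self, doc_id):
        self.pending.add(doc_id)

    def commit(self):
        if self.fail_on_commit:
            self.pending.clear()
            raise ConflictError("write conflict")
        self.documents |= self.pending
        self.pending.clear()


def second_phase_only(index, store):
    """Commit both resources in order, with no prior vote (the flaw)."""
    index.commit()  # the index is now permanently updated
    store.commit()  # may still fail, too late to undo the index


def run_example():
    index = Resource()                     # plays the role of the index
    store = Resource(fail_on_commit=True)  # plays the role of the store
    index.add("doc-1")
    store.add("doc-1")
    try:
        second_phase_only(index, store)
    except ConflictError:
        pass
    # Documents the index knows about but the store does not.
    return index.documents - store.documents
```

Calling run_example() returns the set of documents that exist only in the index, which is exactly the kind of dangling search result described above.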
This is a problem, considering that we are planning to add, are adding, or have already added collective.indexing to core.
The Solr example might not look so bad, considering that everything can be repaired with a full reindex. But if you rely on TPC for another database that is not just an index, there is no working TPC implementation when using collective.indexing.
I am mentioning this here instead of just filing a ticket because I assume there is general interest in how to handle this. After all, a full fix would mean changing interfaces, and the alternative, giving up on TPC, might have big consequences for users who need it.
A bit of history: I was told that Nexedi originally contributed the TPC code to ZODB because they needed it to access multiple databases in a transactionally safe way for ERP5. ERP5 does not use Plone, just Zope, so it does not matter to them what we do with collective.indexing, but there might be Plone users who rely on this.