Problems with zodb transactions: Client has seen newer transactions than server

Hi,
i have the following setup:

  • Plone 5.1.0
  • zeoserver with two zeoclient on the same server

Sometimes we starting have these errors in the log:

2018-10-11T12:31:00 INFO ZEO.asyncio.base Connected Protocol(('127.0.0.1', 6069), '1', False)
------
2018-10-11T12:31:00 CRITICAL ZEO.asyncio.client Client has seen newer transactions than server!
------
2018-10-11T12:31:00 ERROR ZEO.asyncio.client Registration or cache validation failed, ('Server behind client, %r < %r, %s', '\x03\xca\xdf\x13Nz\xda\x00', '\x03
\xca\xdf\x14\xca\x05G\xaa', Protocol(('127.0.0.1', 6069), '1', False))
Traceback (most recent call last):
File "/opt/my-project/eggs/ZEO-5.1.1-py2.7.egg/ZEO/asyncio/client.py", line 493, in verify
server_tid, cache_tid, protocol)
AssertionError: ('Server behind client, %r < %r, %s', '\x03\xca\xdf\x13Nz\xda\x00', '\x03\xca\xdf\x14\xca\x05G\xaa', 
Protocol(('127.0.0.1', 6069), '1', False))
------

After that, we can't do anything until we restart the instances.
I don't understand how is it possible that transactions are not synced between server and client.

The only resource that i found, is an old section in Plone docs for Plone 3..but i don't have that file.

Any ideas? Is it a known bug?

zec are an option, but that should happen at startup right?
Reminds me about caching issues. May it be possible?

that looks like a bug, indeed; in all my Plone life I have only seen this once: Plone 4.3 running ZEO, the ZEO server had a failure and some content was lost after we restarted the instances.

seem the editors of the site were able to continue working creating content even with the ZEO server down.

your issue may be related with this:

going a bit deeply in the logs, i've found some other informations.
In zeoserver.log i have this strange thing:

2018-10-11T12:42:21 Unexpected error while trying to resolve conflict on <ZODB.ConflictResolution.BadClass object at 0x7f1d3e80d750>
Traceback (most recent call last):
  File "/opt/my_project/eggs/ZODB-5.3.0-py2.7.egg/ZODB/ConflictResolution.py", line 259, in tryToResolveConflict
    inst = klass.__new__(klass, *newargs)
TypeError: object.__new__(X): X is not a type object (BadClass)
2018-10-11T12:42:21 Bad request, 'vote'
Traceback (most recent call last):
  File "/opt/my_project/eggs/ZEO-5.1.1-py2.7.egg/ZEO/asyncio/server.py", line 102, in message_received
    result = getattr(self.zeo_storage, name)(*args)
  File "/opt/my_project/eggs/ZEO-5.1.1-py2.7.egg/ZEO/StorageServer.py", line 371, in vote
    return self.lock_manager.lock(self, self._vote)
  File "/opt/my_project/eggs/ZEO-5.1.1-py2.7.egg/ZEO/StorageServer.py", line 1166, in lock
    result = func()
  File "/opt/my_project/eggs/ZEO-5.1.1-py2.7.egg/ZEO/StorageServer.py", line 392, in _vote
    getattr(self, op)(*args)
  File "/opt/my_project/eggs/ZEO-5.1.1-py2.7.egg/ZEO/StorageServer.py", line 526, in _store
    self.storage.store(oid, serial, data, '', self.transaction)
  File "/opt/my_project/eggs/ZODB-5.3.0-py2.7.egg/ZODB/FileStorage/FileStorage.py", line 603, in store
    oldserial, data)
  File "/opt/my_project/eggs/ZODB-5.3.0-py2.7.egg/ZODB/ConflictResolution.py", line 297, in tryToResolveConflict
    data=newpickle)
ConflictError: database conflict error (oid 0x1694, class Products.PlonePAS.tools.memberdata.MemberData, serial this txn started with 0x03cad963732af699 2018-10-10 10:11:26.992459, serial currently committed 0x03cadf22435a5377 2018-10-11 10:42:15.785821)
2018-10-11T12:42:21 (:6069) no current transaction: tpc_abort()
2018-10-11T13:46:03 Unexpected error while trying to resolve conflict on <ZODB.ConflictResolution.BadClass object at 0x7f1d3df28390>
Traceback (most recent call last):
  File "/opt/my_project/eggs/ZODB-5.3.0-py2.7.egg/ZODB/ConflictResolution.py", line 259, in tryToResolveConflict
    inst = klass.__new__(klass, *newargs)
TypeError: object.__new__(X): X is not a type object (BadClass)
2018-10-11T13:46:03 Bad request, 'vote'
Traceback (most recent call last):
  File "/opt/my_project/eggs/ZEO-5.1.1-py2.7.egg/ZEO/asyncio/server.py", line 102, in message_received
    result = getattr(self.zeo_storage, name)(*args)
  File "/opt/my_project/eggs/ZEO-5.1.1-py2.7.egg/ZEO/StorageServer.py", line 371, in vote
    return self.lock_manager.lock(self, self._vote)
  File "/opt/my_project/eggs/ZEO-5.1.1-py2.7.egg/ZEO/StorageServer.py", line 1166, in lock
    result = func()
  File "/opt/my_project/eggs/ZEO-5.1.1-py2.7.egg/ZEO/StorageServer.py", line 392, in _vote
    getattr(self, op)(*args)
  File "/opt/my_project/eggs/ZEO-5.1.1-py2.7.egg/ZEO/StorageServer.py", line 526, in _store
    self.storage.store(oid, serial, data, '', self.transaction)
  File "/opt/my_project/eggs/ZODB-5.3.0-py2.7.egg/ZODB/FileStorage/FileStorage.py", line 603, in store
    oldserial, data)
  File "/opt/my_project/eggs/ZODB-5.3.0-py2.7.egg/ZODB/ConflictResolution.py", line 297, in tryToResolveConflict
    data=newpickle)
ConflictError: database conflict error (oid 0x128d7e, class plone.scale.storage.ScalesDict, serial this txn started with 0x03cadf61faeda277 2018-10-11 11:45:58.811311, serial currently committed 0x03cadf620edb81dd 2018-10-11 11:46:03.482215)
2018-10-11T13:46:03 (:6069) no current transaction: tpc_abort()

I see several strange things:

  • the conflict is on MemberData tool. Maybe related to a login?
  • there is a gap of 1 day in a commit: started on 2018-10-10 and committed on 2018-10-11

and in instance.log:

2018-10-11T12:29:07 INFO ZEO.ClientStorage zeostorage Disconnected from storage: "('localhost', 6069)"

2018-10-11T12:29:07 INFO ZEO.asyncio.base Connected Protocol(('127.0.0.1', 6069), '1', False)

2018-10-11T12:29:07 CRITICAL txn.140251473311488 A storage error occurred during the second phase of the two-phase commit.  Resources may be in an inconsistent state.

2018-10-11T12:29:07 CRITICAL ZEO.asyncio.client Client has seen newer transactions than server!

2018-10-11T12:29:07 ERROR ZEO.asyncio.client Registration or cache validation failed, ('Server behind client, %r < %r, %s', '\x03\xca\xdf\x13Nz\xda\x00', '\x03\xca\xdf\x14\xca\x05G\xaa', Protocol(('127.0.0.1', 6069), '1', False))
Traceback (most recent call last):
  File "/opt/my_project/eggs/ZEO-5.1.1-py2.7.egg/ZEO/asyncio/client.py", line 493, in verify
    server_tid, cache_tid, protocol)
AssertionError: ('Server behind client, %r < %r, %s', '\x03\xca\xdf\x13Nz\xda\x00', '\x03\xca\xdf\x14\xca\x05G\xaa', Protocol(('127.0.0.1', 6069), '1', False))

2018-10-11T12:29:16 INFO ZEO.asyncio.base Connected Protocol(('127.0.0.1', 6069), '1', False)

2018-10-11T12:29:16 CRITICAL ZEO.asyncio.client Client has seen newer transactions than server!

2018-10-11T12:29:16 ERROR ZEO.asyncio.client Registration or cache validation failed, ('Server behind client, %r < %r, %s', '\x03\xca\xdf\x13Nz\xda\x00', '\x03\xca\xdf\x14\xca\x05G\xaa', Protocol(('127.0.0.1', 6069), '1', False))
Traceback (most recent call last):
  File "/opt/my_project/eggs/ZEO-5.1.1-py2.7.egg/ZEO/asyncio/client.py", line 493, in verify
    server_tid, cache_tid, protocol)
AssertionError: ('Server behind client, %r < %r, %s', '\x03\xca\xdf\x13Nz\xda\x00', '\x03\xca\xdf\x14\xca\x05G\xaa', Protocol(('127.0.0.1', 6069), '1', False))

2018-10-11T12:29:17 WARNING RequestMonitor.DumpTrace Long running request
Request 4190 "/VirtualHostBase/http/www.mysite.com:80/Plone/VirtualHostRoot/some-url" running in thread 140251473311488 since 12.6127619743s
Python call stack (innermost first)
  Module threading, line 359, in wait
  Module concurrent.futures._base, line 424, in result
  Module ZEO.asyncio.client, line 756, in wait_for_result
  Module ZEO.asyncio.client, line 748, in call
  Module ZEO.asyncio.client, line 764, in call
  Module ZEO.ClientStorage, line 864, in tpc_abort
  Module ZODB.Connection, line 639, in tpc_abort
  Module transaction._transaction, line 461, in _cleanup
  Module transaction._transaction, line 443, in _commitResources
  Module transaction._transaction, line 301, in commit
  Module transaction._manager, line 131, in commit
  Module Zope2.App.startup, line 303, in commit
  Module ZPublisher.Publish, line 146, in publish
  Module ZPublisher.Publish, line 249, in publish_module_standard
  Module ZPublisher.Publish, line 455, in publish_module
  Module ZServer.PubCore.ZServerPublisher, line 31, in __init__


2018-10-11T12:29:26 INFO ZEO.asyncio.base Connected Protocol(('127.0.0.1', 6069), '1', False)

2018-10-11T12:29:26 CRITICAL ZEO.asyncio.client Client has seen newer transactions than server!

2018-10-11T12:29:26 ERROR ZEO.asyncio.client Registration or cache validation failed, ('Server behind client, %r < %r, %s', '\x03\xca\xdf\x13Nz\xda\x00', '\x03\xca\xdf\x14\xca\x05G\xaa', Protocol(('127.0.0.1', 6069), '1', False))
Traceback (most recent call last):
  File "/opt/my_project/eggs/ZEO-5.1.1-py2.7.egg/ZEO/asyncio/client.py", line 493, in verify
    server_tid, cache_tid, protocol)
AssertionError: ('Server behind client, %r < %r, %s', '\x03\xca\xdf\x13Nz\xda\x00', '\x03\xca\xdf\x14\xca\x05G\xaa', Protocol(('127.0.0.1', 6069), '1', False))

2018-10-11T12:29:27 WARNING RequestMonitor.DumpTrace Long running request
Request 4190 "/VirtualHostBase/http/www.mysite.com:80/Plone/VirtualHostRoot/some-url" running in thread 140251473311488 since 22.6194620132s
Python call stack (innermost first)
  Module threading, line 359, in wait
  Module concurrent.futures._base, line 424, in result
  Module ZEO.asyncio.client, line 756, in wait_for_result
  Module ZEO.asyncio.client, line 748, in call
  Module ZEO.asyncio.client, line 764, in call
  Module ZEO.ClientStorage, line 864, in tpc_abort
  Module ZODB.Connection, line 639, in tpc_abort
  Module transaction._transaction, line 461, in _cleanup
  Module transaction._transaction, line 443, in _commitResources
  Module transaction._transaction, line 301, in commit
  Module transaction._manager, line 131, in commit
  Module Zope2.App.startup, line 303, in commit
  Module ZPublisher.Publish, line 146, in publish
  Module ZPublisher.Publish, line 249, in publish_module_standard
  Module ZPublisher.Publish, line 455, in publish_module
  Module ZServer.PubCore.ZServerPublisher, line 31, in __init__


2018-10-11T12:29:35 INFO ZEO.asyncio.base Connected Protocol(('127.0.0.1', 6069), '1', False)

2018-10-11T12:29:35 CRITICAL ZEO.asyncio.client Client has seen newer transactions than server!

2018-10-11T12:29:35 ERROR ZEO.asyncio.client Registration or cache validation failed, ('Server behind client, %r < %r, %s', '\x03\xca\xdf\x13Nz\xda\x00', '\x03\xca\xdf\x14\xca\x05G\xaa', Protocol(('127.0.0.1', 6069), '1', False))
Traceback (most recent call last):
  File "/opt/my_project/eggs/ZEO-5.1.1-py2.7.egg/ZEO/asyncio/client.py", line 493, in verify
    server_tid, cache_tid, protocol)
AssertionError: ('Server behind client, %r < %r, %s', '\x03\xca\xdf\x13Nz\xda\x00', '\x03\xca\xdf\x14\xca\x05G\xaa', Protocol(('127.0.0.1', 6069), '1', False))

2018-10-11T12:29:37 ERROR Zope.SiteErrorLog 1539253777.30.975442753589 http://www.mysite.com/some-other-url/versions_history_form
Traceback (innermost last):
  Module ZPublisher.Publish, line 146, in publish
  Module Zope2.App.startup, line 303, in commit
  Module transaction._manager, line 131, in commit
  Module transaction._transaction, line 310, in commit
  Module transaction._transaction, line 301, in commit
  Module transaction._transaction, line 446, in _commitResources
  Module transaction._transaction, line 428, in _commitResources
  Module ZODB.Connection, line 710, in tpc_finish
  Module ZODB.mvccadapter, line 181, in tpc_finish
  Module ZEO.ClientStorage, line 877, in tpc_finish
  Module ZEO.asyncio.client, line 786, in tpc_finish
  Module ZEO.asyncio.client, line 743, in call
  Module ZEO.asyncio.client, line 756, in wait_for_result
  Module concurrent.futures._base, line 429, in result
  Module concurrent.futures._base, line 381, in __get_result
ValueError: new last tid (273276001423383040) must be greater than previous one (273276007791019946)

It really seems that the client lost connection to the server like Hector said.

Some additional informations about this environment:

  • Data.fs previously was a Plone 5.0.x with a RelStorage db that has been exported with his tool. I exported this Data.fs with a similar Plone installation, and then i migrated it to 5.1.0. Locally i didn't see any db problem.
  • the site has 2 instances with a varnish that handles the cache and load balancing (we're trying this setup to see if we can avoid using haproxy). Does this could be the cause of the problem? Maybe it doesn't balance properly and creates some inconsistence?

We are having a very similar issue here with Plone 5.1. The only thing that seems to stop the error and prevent the disconnection of the zeoclients is to turn off plone.app.caching via the control panel. We especially notice, or able to reproduce the issue, if we view the history of an object on the site.

We've shut off all caching and disabled versioning to stop the errors for now. We have also rebuilt this server migrated database/blobs, cleared and rebuilt all site catalogs and packed database. Still the error continues.

have you seen the same issue in Plone 5.0?

I don't think so; I was checking and Plone 5.1 is using ZEO = 5.1.1; this new version states the following:

This is a major ZEO revision, which replaces the ZEO network protocol implementation.

I think you must read the changelog to try to find out more information on this; there are some new parameters also.

you could also post a question in the ZODB list.

that looks like a very bad idea.

Let me re-phrase this. We disabled plone.app.caching tool and hid the history button from editors. Turning off Plone.app.caching, for whatever reason, seems to alleviate if not eliminate the problem. At least for now. We're grabbing at straws here. Our issue seems to come up if we view version history of objects and do any reverts to older versions of content.

update: i've found some errors in the catalog, and i made a clear and rebuild.
After that, i never had that error anymore..i don't know if these two things could be related

1 Like

On a Plone 5.1.5 site which was migrated from 4.x I was experiencing the same problems. When a @@history page was it hit, after a couple times it would result in a Client has seen newer transactions than server! exception. Shortly after the initial exception multiple instances would return a ConnectionStateError, which caused an outage on the site.

The site was not running from RelStorage, but in our migration notes we had noted to reindex the catalog. This was not done on the live site. The fix recommended by @cekk fixed the exception we were experiencing. :+1:

I'm having this issue today on a Plone 5.1.4 site.
It wasn't migrated from any earlier version.

My suspicions are:

  • importing CSV data for membrane users via transmogrify.dexterity
  • enabling (in the last few days) of the history view (particularly for the above users)

I've tried a catalog rebuild, but the issue hasn't cleared up

Further info - I don't actually have plone.app.caching enabled.
I disabled the portal_diff tool for the apparently troublesome membrane and did another restart and catalog rebuild, but the error still seems to occur on versions_history_form

TL-DR - I think the issue is with Zeo 5

While not exactly a similar scenario, (my setup required encrypting ZODB per client need),
I logged a situation](TypeError/ReadError with ZEO.asyncio.client in verify) where a single Zeo and three Zope clients performance disintegrated to the point of just refusing to allow connections.

I too was using Plone 5.1 series here (pre 5.2 times) and cipher.encryptingstorage which worked fine previously. However, somewhere along the way, a change was introduced requiring Zeo 5. I paid zero attention to that at first, because initially empty of objects ZODB performed fine.

However, as my userrs added objects things went downhill rather fast. Frontline, the server appeared to be frozen, backend showed clients dead with:

TypeError: invalidate() takes exactly 3 arguments (4 given)

No amount of catalog rebuilds solved anything, I am not using plone.app.caching to cache objects, just styles, etc.

I ended up switching back to a single Zope, eliminated ZEO and performance was good enough to get the job done.

Plone Foundation Code of Conduct